org.apache.spark.sql.functions

public class functions extends Object

Commonly used functions available for DataFrame operations. Using functions defined here provides a little bit more compile-time safety to make sure the function exists.

You can call the functions defined here in two ways: _FUNC_(...) and functions.expr("_FUNC_(...)").

As an example, regr_count is a function that is defined here. You can use regr_count(col("yCol"), col("xCol")) to invoke it; this way, the programming language's compiler ensures that regr_count exists and is called in the proper form. You can also use expr("regr_count(yCol, xCol)") to invoke the same function; in this case, Spark itself ensures that regr_count exists when it analyzes the query.
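The two invocation styles can be sketched in Java as follows. This is a minimal illustration that assumes Spark SQL is on the classpath; the class name InvocationStyles is hypothetical, and the column names yCol and xCol are taken from the example above.

```java
import org.apache.spark.sql.Column;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.expr;
import static org.apache.spark.sql.functions.regr_count;

// Hypothetical holder class, used only for illustration.
public class InvocationStyles {

    // Typed call: the Java compiler checks that regr_count exists
    // and that it is given two Column arguments.
    static Column typed() {
        return regr_count(col("yCol"), col("xCol"));
    }

    // String call: the text is handed to Spark as-is, so a misspelled
    // function name is only reported when Spark analyzes the query.
    static Column parsed() {
        return expr("regr_count(yCol, xCol)");
    }
}
```

Either Column can then be passed to an aggregation such as Dataset.agg on a dataset that has yCol and xCol columns.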

You can find the entire list of functions in the SQL API documentation of your Spark version; see also the latest list.

These function APIs usually have methods with a Column signature only, because a Column can represent not only a column but also other types such as a native string. The other variants currently exist for historical reasons.
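As a rough illustration of why the Column signature is the general one, the sketch below uses avg, which this class offers in both a String and a Column variant. It assumes Spark SQL is on the classpath; the class name SignatureVariants and the column names salary and bonus are hypothetical.

```java
import org.apache.spark.sql.Column;

import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.col;

// Hypothetical holder class, used only for illustration.
public class SignatureVariants {

    // String variant: can only name an existing column.
    static Column byName() {
        return avg("salary");
    }

    // Column variant: refers to the same column as byName().
    static Column byColumn() {
        return avg(col("salary"));
    }

    // Column variant again: accepts an arbitrary expression,
    // which the String variant cannot express.
    static Column overExpression() {
        return avg(col("salary").plus(col("bonus")));
    }
}
```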

Since: 1.3.0

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

static class

functions.partitioning$

Constructor Summary

Constructors

Constructor

Description

functions()

Method Summary

Modifier and Type

Method

Description

static Column

abs(Column e)

Computes the absolute value of a numeric value.

static Column

acos(String columnName)

static Column

acos(Column e)

static Column

acosh(String columnName)

static Column

acosh(Column e)

static Column

add_months(Column startDate, int numMonths)

Returns the date that is numMonths after startDate.

static Column

add_months(Column startDate, Column numMonths)

Returns the date that is numMonths after startDate.

static Column

aes_decrypt(Column input, Column key)

Returns a decrypted value of input.

static Column

aes_decrypt(Column input, Column key, Column mode)

Returns a decrypted value of input.

static Column

aes_decrypt(Column input, Column key, Column mode, Column padding)

Returns a decrypted value of input.

static Column

aes_decrypt(Column input, Column key, Column mode, Column padding, Column aad)

Returns a decrypted value of input using AES in mode with padding.

static Column

aes_encrypt(Column input, Column key)

Returns an encrypted value of input.

static Column

aes_encrypt(Column input, Column key, Column mode)

Returns an encrypted value of input.

static Column

aes_encrypt(Column input, Column key, Column mode, Column padding)

Returns an encrypted value of input.

static Column

aes_encrypt(Column input, Column key, Column mode, Column padding, Column iv)

Returns an encrypted value of input.

static Column

aes_encrypt(Column input, Column key, Column mode, Column padding, Column iv, Column aad)

Returns an encrypted value of input using AES in given mode with the specified padding.

static Column

aggregate(Column expr, Column initialValue, scala.Function2<Column,Column,Column> merge)

Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state.

static Column

aggregate(Column expr, Column initialValue, scala.Function2<Column,Column,Column> merge, scala.Function1<Column,Column> finish)

Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state.

static Column

any(Column e)

Aggregate function: returns true if at least one value of e is true.

static Column

any_value(Column e)

Aggregate function: returns some value of e for a group of rows.

static Column

any_value(Column e, Column ignoreNulls)

Aggregate function: returns some value of e for a group of rows.

static Column

approx_count_distinct(String columnName)

Aggregate function: returns the approximate number of distinct items in a group.

static Column

approx_count_distinct(String columnName, double rsd)

Aggregate function: returns the approximate number of distinct items in a group.

static Column

approx_count_distinct(Column e)

Aggregate function: returns the approximate number of distinct items in a group.

static Column

approx_count_distinct(Column e, double rsd)

Aggregate function: returns the approximate number of distinct items in a group.

static Column

approx_percentile(Column e, Column percentage, Column accuracy)

Aggregate function: returns the approximate percentile of the numeric column e, that is, the smallest value in the ordered values of e (sorted from least to greatest) such that no more than percentage of the values are less than or equal to it.

static Column

approxCountDistinct(String columnName)

Deprecated.
Use approx_count_distinct.

static Column

approxCountDistinct(String columnName, double rsd)

Deprecated.
Use approx_count_distinct.

static Column

approxCountDistinct(Column e)

Deprecated.
Use approx_count_distinct.

static Column

approxCountDistinct(Column e, double rsd)

Deprecated.
Use approx_count_distinct.

static Column

array(String colName, String... colNames)

Creates a new array column.

static Column

array(String colName, scala.collection.immutable.Seq<String> colNames)

Creates a new array column.

static Column

array(Column... cols)

Creates a new array column.

static Column

array(scala.collection.immutable.Seq<Column> cols)

Creates a new array column.

static Column

array_agg(Column e)

Aggregate function: returns a list of objects with duplicates.

static Column

array_append(Column column, Object element)

Returns an ARRAY containing all elements from the source ARRAY as well as the new element.

static Column

array_compact(Column column)

Removes all null elements from the given array.

static Column

array_contains(Column column, Object value)

Returns null if the array is null, true if the array contains value, and false otherwise.

static Column

array_distinct(Column e)

Removes duplicate values from the array.

static Column

array_except(Column col1, Column col2)

Returns an array of the elements in the first array but not in the second array, without duplicates.

static Column

array_insert(Column arr, Column pos, Column value)

Adds an item into a given array at a specified position.

static Column

array_intersect(Column col1, Column col2)

Returns an array of the elements in the intersection of the given two arrays, without duplicates.

static Column

array_join(Column column, String delimiter)

Concatenates the elements of column using the delimiter.

static Column

array_join(Column column, String delimiter, String nullReplacement)

Concatenates the elements of column using the delimiter.

static Column

array_max(Column e)

Returns the maximum value in the array.

static Column

array_min(Column e)

Returns the minimum value in the array.

static Column

array_position(Column column, Object value)

Locates the position of the first occurrence of the value in the given array and returns it as a long.

static Column

array_prepend(Column column, Object element)

Returns an array containing value as well as all elements from array.

static Column

array_remove(Column column, Object element)

Removes all elements equal to element from the given array.

static Column

array_repeat(Column e, int count)

Creates an array containing the left argument repeated the number of times given by the right argument.

static Column

array_repeat(Column left, Column right)

Creates an array containing the left argument repeated the number of times given by the right argument.

static Column

array_size(Column e)

Returns the total number of elements in the array.

static Column

array_sort(Column e)

Sorts the input array in ascending order.

static Column

array_sort(Column e, scala.Function2<Column,Column,Column> comparator)

Sorts the input array based on the given comparator function.

static Column

array_union(Column col1, Column col2)

Returns an array of the elements in the union of the given two arrays, without duplicates.

static Column

arrays_overlap(Column a1, Column a2)

Returns true if a1 and a2 have at least one non-null element in common.

static Column

arrays_zip(Column... e)

Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.

static Column

arrays_zip(scala.collection.immutable.Seq<Column> e)

Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.

static Column

asc(String columnName)

Returns a sort expression based on ascending order of the column.

static Column

asc_nulls_first(String columnName)

Returns a sort expression based on ascending order of the column, and null values appear before non-null values.

static Column

asc_nulls_last(String columnName)

Returns a sort expression based on ascending order of the column, and null values appear after non-null values.

static Column

ascii(Column e)

Computes the numeric value of the first character of the string column, and returns the result as an int column.

static Column

asin(String columnName)

static Column

asin(Column e)

static Column

asinh(String columnName)

static Column

asinh(Column e)

static Column

assert_true(Column c)

Returns null if the condition is true, and throws an exception otherwise.

static Column

assert_true(Column c, Column e)

Returns null if the condition is true; throws an exception with the error message otherwise.

static Column

atan(String columnName)

static Column

atan(Column e)

static Column

atan2(double yValue, String xName)

static Column

atan2(double yValue, Column x)

static Column

atan2(String yName, double xValue)

static Column

atan2(String yName, String xName)

static Column

atan2(String yName, Column x)

static Column

atan2(Column y, double xValue)

static Column

atan2(Column y, String xName)

static Column

atan2(Column y, Column x)

static Column

atanh(String columnName)

static Column

atanh(Column e)

static Column

avg(String columnName)

Aggregate function: returns the average of the values in a group.

static Column

avg(Column e)

Aggregate function: returns the average of the values in a group.

static Column

base64(Column e)

Computes the BASE64 encoding of a binary column and returns it as a string column.

static Column

bin(String columnName)

An expression that returns the string representation of the binary value of the given long column.

static Column

bin(Column e)

An expression that returns the string representation of the binary value of the given long column.

static Column

bit_and(Column e)

Aggregate function: returns the bitwise AND of all non-null input values, or null if none.

static Column

bit_count(Column e)

Returns the number of bits that are set in the argument expr as an unsigned 64-bit integer, or NULL if the argument is NULL.

static Column

bit_get(Column e, Column pos)

Returns the value of the bit (0 or 1) at the specified position.

static Column

bit_length(Column e)

Calculates the bit length for the specified string column.

static Column

bit_or(Column e)

Aggregate function: returns the bitwise OR of all non-null input values, or null if none.

static Column

bit_xor(Column e)

Aggregate function: returns the bitwise XOR of all non-null input values, or null if none.

static Column

bitmap_bit_position(Column col)

Returns the bit position for the given input column.

static Column

bitmap_bucket_number(Column col)

Returns the bucket number for the given input column.

static Column

bitmap_construct_agg(Column col)

Returns a bitmap with the positions of the bits set from all the values from the input column.

static Column

bitmap_count(Column col)

Returns the number of set bits in the input bitmap.

static Column

bitmap_or_agg(Column col)

Returns a bitmap that is the bitwise OR of all of the bitmaps from the input column.

static Column

bitwise_not(Column e)

Computes bitwise NOT (~) of a number.

static Column

bitwiseNOT(Column e)

Deprecated.
Use bitwise_not.

static Column

bool_and(Column e)

Aggregate function: returns true if all values of e are true.

static Column

bool_or(Column e)

Aggregate function: returns true if at least one value of e is true.

static <T> Dataset<T>

broadcast(Dataset<T> df)

Marks a DataFrame as small enough for use in broadcast joins.

static Column

bround(Column e)

Returns the value of the column e rounded to 0 decimal places with HALF_EVEN round mode.

static Column

bround(Column e, int scale)

Rounds the value of e to scale decimal places with HALF_EVEN round mode if scale is greater than or equal to 0, or at the integral part when scale is less than 0.

static Column

bround(Column e, Column scale)

Rounds the value of e to scale decimal places with HALF_EVEN round mode if scale is greater than or equal to 0, or at the integral part when scale is less than 0.

static Column

btrim(Column str)

Removes the leading and trailing space characters from str.

static Column

btrim(Column str, Column trim)

Removes the leading and trailing trim characters from str.

static Column

bucket(int numBuckets, Column e)

(Java-specific) A transform for any type that partitions by a hash of the input column.

static Column

bucket(Column numBuckets, Column e)

(Java-specific) A transform for any type that partitions by a hash of the input column.

static Column

call_function(String funcName, Column... cols)

Call a SQL function.

static Column

call_function(String funcName, scala.collection.immutable.Seq<Column> cols)

Call a SQL function.

static Column

call_udf(String udfName, Column... cols)

Call a user-defined function.

static Column

call_udf(String udfName, scala.collection.immutable.Seq<Column> cols)

Call a user-defined function.

static Column

callUDF(String udfName, Column... cols)

Call a user-defined function.

static Column

callUDF(String udfName, scala.collection.immutable.Seq<Column> cols)

Deprecated.
Use call_udf.

static Column

cardinality(Column e)

Returns the length of the array or map.

static Column

cbrt(String columnName)

Computes the cube-root of the given column.

static Column

cbrt(Column e)

Computes the cube-root of the given value.

static Column

ceil(String columnName)

Computes the ceiling of the given value of e to 0 decimal places.

static Column

ceil(Column e)

Computes the ceiling of the given value of e to 0 decimal places.

static Column

ceil(Column e, Column scale)

Computes the ceiling of the given value of e to scale decimal places.

static Column

ceiling(Column e)

Computes the ceiling of the given value of e to 0 decimal places.

static Column

ceiling(Column e, Column scale)

Computes the ceiling of the given value of e to scale decimal places.

static Column

char_length(Column str)

Returns the character length of string data or number of bytes of binary data.

static Column

character_length(Column str)

Returns the character length of string data or number of bytes of binary data.

static Column

chr(Column n)

Returns the ASCII character having the binary equivalent to n.

static Column

coalesce(Column... e)

Returns the first column that is not null, or null if all inputs are null.

static Column

coalesce(scala.collection.immutable.Seq<Column> e)

Returns the first column that is not null, or null if all inputs are null.

static Column

col(String colName)

Returns a Column based on the given column name.

static Column

collate(Column e, String collation)

Marks a given column with specified collation.

static Column

collation(Column e)

Returns the collation name of a given column.

static Column

collect_list(String columnName)

Aggregate function: returns a list of objects with duplicates.

static Column

collect_list(Column e)

Aggregate function: returns a list of objects with duplicates.

static Column

collect_set(String columnName)

Aggregate function: returns a set of objects with duplicate elements eliminated.

static Column

collect_set(Column e)

Aggregate function: returns a set of objects with duplicate elements eliminated.

static Column

column(String colName)

Returns a Column based on the given column name.

static Column

concat(Column... exprs)

Concatenates multiple input columns together into a single column.

static Column

concat(scala.collection.immutable.Seq<Column> exprs)

Concatenates multiple input columns together into a single column.

static Column

concat_ws(String sep, Column... exprs)

Concatenates multiple input string columns together into a single string column, using the given separator.

static Column

concat_ws(String sep, scala.collection.immutable.Seq<Column> exprs)

Concatenates multiple input string columns together into a single string column, using the given separator.

static Column

contains(Column left, Column right)

Returns a boolean: true if right is found inside left.

static Column

conv(Column num, int fromBase, int toBase)

Convert a number in a string column from one base to another.

static Column

convert_timezone(Column targetTz, Column sourceTs)

Converts the timestamp without time zone sourceTs from the current time zone to targetTz.

static Column

convert_timezone(Column sourceTz, Column targetTz, Column sourceTs)

Converts the timestamp without time zone sourceTs from the sourceTz time zone to targetTz.

static Column

corr(String columnName1, String columnName2)

Aggregate function: returns the Pearson Correlation Coefficient for two columns.

static Column

corr(Column column1, Column column2)

Aggregate function: returns the Pearson Correlation Coefficient for two columns.

static Column

cos(String columnName)

static Column

cos(Column e)

static Column

cosh(String columnName)

static Column

cosh(Column e)

static Column

cot(Column e)

static TypedColumn<Object,Object>

count(String columnName)

Aggregate function: returns the number of items in a group.

static Column

count(Column e)

Aggregate function: returns the number of items in a group.

static Column

count_distinct(Column expr, Column... exprs)

Aggregate function: returns the number of distinct items in a group.

static Column

count_distinct(Column expr, scala.collection.immutable.Seq<Column> exprs)

Aggregate function: returns the number of distinct items in a group.

static Column

count_if(Column e)

Aggregate function: returns the number of TRUE values for the expression.

static Column

count_min_sketch(Column e, Column eps, Column confidence, Column seed)

Returns a count-min sketch of a column with the given eps, confidence and seed.

static Column

countDistinct(String columnName, String... columnNames)

Aggregate function: returns the number of distinct items in a group.

static Column

countDistinct(String columnName, scala.collection.immutable.Seq<String> columnNames)

Aggregate function: returns the number of distinct items in a group.

static Column

countDistinct(Column expr, Column... exprs)

Aggregate function: returns the number of distinct items in a group.

static Column

countDistinct(Column expr, scala.collection.immutable.Seq<Column> exprs)

Aggregate function: returns the number of distinct items in a group.

static Column

covar_pop(String columnName1, String columnName2)

Aggregate function: returns the population covariance for two columns.

static Column

covar_pop(Column column1, Column column2)

Aggregate function: returns the population covariance for two columns.

static Column

covar_samp(String columnName1, String columnName2)

Aggregate function: returns the sample covariance for two columns.

static Column

covar_samp(Column column1, Column column2)

Aggregate function: returns the sample covariance for two columns.

static Column

crc32(Column e)

Calculates the cyclic redundancy check value (CRC32) of a binary column and returns the value as a bigint.

static Column

csc(Column e)

static Column

cume_dist()

Window function: returns the cumulative distribution of values within a window partition.

static Column

curdate()

Returns the current date at the start of query evaluation as a date column.

static Column

current_catalog()

Returns the current catalog.

static Column

current_database()

Returns the current database.

static Column

current_date()

Returns the current date at the start of query evaluation as a date column.

static Column

current_schema()

Returns the current schema.

static Column

current_timestamp()

Returns the current timestamp at the start of query evaluation as a timestamp column.

static Column

current_timezone()

Returns the current session local timezone.

static Column

current_user()

Returns the user name of current execution context.

static Column

date_add(Column start, int days)

Returns the date that is the given number of days after start.

static Column

date_add(Column start, Column days)

Returns the date that is the given number of days after start.

static Column

date_diff(Column end, Column start)

Returns the number of days from start to end.

static Column

date_format(Column dateExpr, String format)

Converts a date/timestamp/string to a string in the date format given by the second argument.

static Column

date_from_unix_date(Column days)

Create date from the number of days since 1970-01-01.

static Column

date_part(Column field, Column source)

Extracts a part of the date/timestamp or interval source.

static Column

date_sub(Column start, int days)

Returns the date that is the given number of days before start.

static Column

date_sub(Column start, Column days)

Returns the date that is the given number of days before start.

static Column

date_trunc(String format, Column timestamp)

Returns timestamp truncated to the unit specified by the format.

static Column

dateadd(Column start, Column days)

Returns the date that is the given number of days after start.

static Column

datediff(Column end, Column start)

Returns the number of days from start to end.

static Column

datepart(Column field, Column source)

Extracts a part of the date/timestamp or interval source.

static Column

day(Column e)

Extracts the day of the month as an integer from a given date/timestamp/string.

static Column

dayname(Column timeExp)

Extracts the three-letter abbreviated day name from a given date/timestamp/string.

static Column

dayofmonth(Column e)

Extracts the day of the month as an integer from a given date/timestamp/string.

static Column

dayofweek(Column e)

Extracts the day of the week as an integer from a given date/timestamp/string.

static Column

dayofyear(Column e)

Extracts the day of the year as an integer from a given date/timestamp/string.

static Column

days(Column e)

(Java-specific) A transform for timestamps and dates to partition data into days.

static Column

decode(Column value, String charset)

Decodes the first argument from binary into a string using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').

static Column

degrees(String columnName)

Converts an angle measured in radians to an approximately equivalent angle measured in degrees.

static Column

degrees(Column e)

Converts an angle measured in radians to an approximately equivalent angle measured in degrees.

static Column

dense_rank()

Window function: returns the rank of rows within a window partition, without any gaps.

static Column

desc(String columnName)

Returns a sort expression based on the descending order of the column.

static Column

desc_nulls_first(String columnName)

Returns a sort expression based on the descending order of the column, and null values appear before non-null values.

static Column

desc_nulls_last(String columnName)

Returns a sort expression based on the descending order of the column, and null values appear after non-null values.

static Column

e()

Returns Euler's number.

static Column

element_at(Column column, Object value)

Returns the element of the array at the given index in value if column is an array.

static Column

elt(Column... inputs)

Returns the n-th input, e.g., returns input2 when n is 2.

static Column

elt(scala.collection.immutable.Seq<Column> inputs)

Returns the n-th input, e.g., returns input2 when n is 2.

static Column

encode(Column value, String charset)

Encodes the first argument from a string into binary using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').

static Column

endswith(Column str, Column suffix)

Returns a boolean: true if str ends with suffix.

static Column

equal_null(Column col1, Column col2)

Returns the same result as the EQUAL(=) operator for non-null operands, but returns true if both are null and false if one of them is null.

static Column

every(Column e)

Aggregate function: returns true if all values of e are true.

static Column

exists(Column column, scala.Function1<Column,Column> f)

Returns whether a predicate holds for one or more elements in the array.

static Column

exp(String columnName)

Computes the exponential of the given column.

static Column

exp(Column e)

Computes the exponential of the given value.

static Column

explode(Column e)

Creates a new row for each element in the given array or map column.

static Column

explode_outer(Column e)

Creates a new row for each element in the given array or map column.

static Column

expm1(String columnName)

Computes the exponential of the given column minus one.

static Column

expm1(Column e)

Computes the exponential of the given value minus one.

static Column

expr(String expr)

Parses the expression string into the column that it represents, similar to Dataset.selectExpr(java.lang.String...).

static Column

extract(Column field, Column source)

Extracts a part of the date/timestamp or interval source.

static Column

factorial(Column e)

Computes the factorial of the given value.

static Column

filter(Column column, scala.Function1<Column,Column> f)

Returns an array of elements for which a predicate holds in a given array.

static Column

filter(Column column, scala.Function2<Column,Column,Column> f)

Returns an array of elements for which a predicate holds in a given array.

static Column

find_in_set(Column str, Column strArray)

Returns the index (1-based) of the given string (str) in the comma-delimited list (strArray).

static Column

first(String columnName)

Aggregate function: returns the first value of a column in a group.

static Column

first(String columnName, boolean ignoreNulls)

Aggregate function: returns the first value of a column in a group.

static Column

first(Column e)

Aggregate function: returns the first value in a group.

static Column

first(Column e, boolean ignoreNulls)

Aggregate function: returns the first value in a group.

static Column

first_value(Column e)

Aggregate function: returns the first value in a group.

static Column

first_value(Column e, Column ignoreNulls)

Aggregate function: returns the first value in a group.

static Column

flatten(Column e)

Creates a single array from an array of arrays.

static Column

floor(String columnName)

Computes the floor of the given column value to 0 decimal places.

static Column

floor(Column e)

Computes the floor of the given value of e to 0 decimal places.

static Column

floor(Column e, Column scale)

Computes the floor of the given value of e to scale decimal places.

static Column

forall(Column column, scala.Function1<Column,Column> f)

Returns whether a predicate holds for every element in the array.

static Column

format_number(Column x, int d)

Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places with HALF_EVEN round mode, and returns the result as a string column.

static Column

format_string(String format, Column... arguments)

Formats the arguments in printf-style and returns the result as a string column.

static Column

format_string(String format, scala.collection.immutable.Seq<Column> arguments)

Formats the arguments in printf-style and returns the result as a string column.

static Column

from_csv(Column e, Column schema, Map<String,String> options)

(Java-specific) Parses a column containing a CSV string into a StructType with the specified schema.

static Column

from_csv(Column e, StructType schema, scala.collection.immutable.Map<String,String> options)

Parses a column containing a CSV string into a StructType with the specified schema.

static Column

from_json(Column e, String schema, Map<String,String> options)

(Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema.

static Column

from_json(Column e, String schema, scala.collection.immutable.Map<String,String> options)

(Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema.

static Column

from_json(Column e, Column schema)

(Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType of StructTypes with the specified schema.

static Column

from_json(Column e, Column schema, Map<String,String> options)

(Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType of StructTypes with the specified schema.

static Column

from_json(Column e, DataType schema)

Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema.

static Column

from_json(Column e, DataType schema, Map<String,String> options)

(Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema.

static Column

from_json(Column e, DataType schema, scala.collection.immutable.Map<String,String> options)

(Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema.

static Column

from_json(Column e, StructType schema)

Parses a column containing a JSON string into a StructType with the specified schema.

static Column

from_json(Column e, StructType schema, Map<String,String> options)

(Java-specific) Parses a column containing a JSON string into a StructType with the specified schema.

static Column

from_json(Column e, StructType schema, scala.collection.immutable.Map<String,String> options)

(Scala-specific) Parses a column containing a JSON string into a StructType with the specified schema.

static Column

from_unixtime(Column ut)

Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the yyyy-MM-dd HH:mm:ss format.

static Column

from_unixtime(Column ut, String f)

Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the given format.

static Column

from_utc_timestamp(Column ts, String tz)

Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone.

static Column

from_utc_timestamp(Column ts, Column tz)

Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone.

static Column

from_xml(Column e, String schema, Map<String,String> options)

(Java-specific) Parses a column containing an XML string into a StructType with the specified schema.

static Column

from_xml(Column e, Column schema)

(Java-specific) Parses a column containing an XML string into a StructType with the specified schema.

static Column

from_xml(Column e, Column schema, Map<String,String> options)

(Java-specific) Parses a column containing an XML string into a StructType with the specified schema.

static Column

from_xml(Column e, StructType schema)

Parses a column containing an XML string into the data type corresponding to the specified schema.

static Column

from_xml(Column e, StructType schema, Map<String,String> options)

Parses a column containing a XML string into the data type corresponding to the specified schema.

static Column

get(Column column, Column index)

Returns the element of the array at the given (0-based) index.

static Column

get_json_object(Column e, String path)

Extracts a JSON object from a JSON string based on the specified JSON path, and returns the extracted object as a JSON string.

static Column

getbit(Column e, Column pos)

Returns the value of the bit (0 or 1) at the specified position.
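As a sketch of the bit extraction, assuming positions are counted from the least significant bit starting at 0, the same result can be computed in plain Java with a shift and a mask. GetBitSketch is a hypothetical name.

```java
class GetBitSketch {
    // Hypothetical helper: shifts the requested bit to position 0 and masks it out.
    static int getBit(long value, int pos) {
        return (int) ((value >>> pos) & 1L);
    }

    public static void main(String[] args) {
        // 11 is 1011 in binary.
        for (int pos = 0; pos < 4; pos++) {
            System.out.println("bit " + pos + " of 11 = " + getBit(11L, pos));
        }
    }
}
```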

static Column

greatest(String columnName, String... columnNames)

Returns the greatest value of the list of column names, skipping null values.

static Column

greatest(String columnName, scala.collection.immutable.Seq<String> columnNames)

Returns the greatest value of the list of column names, skipping null values.

static Column

greatest(Column... exprs)

Returns the greatest value of the list of values, skipping null values.

static Column

greatest(scala.collection.immutable.Seq<Column> exprs)

Returns the greatest value of the list of values, skipping null values.
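The null-skipping behavior can be sketched in plain Java as follows. GreatestSketch is a hypothetical class, restricted to Integer for brevity, and it assumes the result is null only when every input is null. Reversing the comparison gives the corresponding sketch for least.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Objects;

class GreatestSketch {
    // Hypothetical helper: largest non-null value, or null if all values are null.
    static Integer greatest(Integer... values) {
        return Arrays.stream(values)
                .filter(Objects::nonNull)
                .max(Comparator.naturalOrder())
                .orElse(null);
    }

    public static void main(String[] args) {
        System.out.println(greatest(1, null, 3));
    }
}
```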

static Column

grouping(String columnName)

Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.

static Column

grouping(Column e)

Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.

static Column

grouping_id(String colName, scala.collection.immutable.Seq<String> colNames)

Aggregate function: returns the level of grouping.

static Column

grouping_id(scala.collection.immutable.Seq<Column> cols)

Aggregate function: returns the level of grouping.

static Column

hash(Column... cols)

Calculates the hash code of given columns, and returns the result as an int column.

static Column

hash(scala.collection.immutable.Seq<Column> cols)

Calculates the hash code of given columns, and returns the result as an int column.

static Column

hex(Column column)

Computes hex value of the given column.

static Column

histogram_numeric(Column e, Column nBins)

Aggregate function: computes a histogram on the numeric column e using nBins bins.

static Column

hll_sketch_agg(String columnName)

Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with default lgConfigK value.

static Column

hll_sketch_agg(String columnName, int lgConfigK)

Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.

static Column

hll_sketch_agg(Column e)

Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with default lgConfigK value.

static Column

hll_sketch_agg(Column e, int lgConfigK)

Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.

static Column

hll_sketch_agg(Column e, Column lgConfigK)

Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.

static Column

hll_sketch_estimate(String columnName)

Returns the estimated number of unique values given the binary representation of a Datasketches HllSketch.

static Column

hll_sketch_estimate(Column c)

Returns the estimated number of unique values given the binary representation of a Datasketches HllSketch.

static Column

hll_union(String columnName1, String columnName2)

Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object.

static Column

hll_union(String columnName1, String columnName2, boolean allowDifferentLgConfigK)

Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object.

static Column

hll_union(Column c1, Column c2)

Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object.

static Column

hll_union(Column c1, Column c2, boolean allowDifferentLgConfigK)

Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object.

static Column

hll_union_agg(String columnName)

Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance.

static Column

hll_union_agg(String columnName, boolean allowDifferentLgConfigK)

Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance.

static Column

hll_union_agg(Column e)

Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance.

static Column

hll_union_agg(Column e, boolean allowDifferentLgConfigK)

Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance.

static Column

hll_union_agg(Column e, Column allowDifferentLgConfigK)

Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance.

static Column

hour(Column e)

Extracts the hours as an integer from a given date/timestamp/string.

static Column

hours(Column e)

(Java-specific) A transform for timestamps to partition data into hours.

static Column

hypot(double l, String rightName)

Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.

static Column

hypot(double l, Column r)

Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.

static Column

hypot(String leftName, double r)

Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.

static Column

hypot(String leftName, String rightName)

Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.

static Column

hypot(String leftName, Column r)

Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.

static Column

hypot(Column l, double r)

Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.

static Column

hypot(Column l, String rightName)

Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.

static Column

hypot(Column l, Column r)

Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
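To see why "without intermediate overflow" matters, the sketch below compares a naive formula with java.lang.Math.hypot in plain Java. HypotSketch and naiveHypot are hypothetical names used only for illustration; nothing here claims to be how Spark evaluates hypot internally.

```java
class HypotSketch {
    // Naive formula: squaring large inputs overflows to Infinity before the square root.
    static double naiveHypot(double a, double b) {
        return Math.sqrt(a * a + b * b);
    }

    public static void main(String[] args) {
        double a = 3e200;
        double b = 4e200;
        System.out.println("naive:      " + naiveHypot(a, b));
        System.out.println("Math.hypot: " + Math.hypot(a, b));
    }
}
```

With these inputs the naive version yields Infinity, while Math.hypot returns a finite value close to 5e200.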

static Column

ifnull(Column col1, Column col2)

Returns col2 if col1 is null, or col1 otherwise.

static Column

ilike(Column str, Column pattern)

Returns true if str matches pattern with escapeChar('\') case-insensitively, null if any arguments are null, false otherwise.

static Column

ilike(Column str, Column pattern, Column escapeChar)

Returns true if str matches pattern with escapeChar case-insensitively, null if any arguments are null, false otherwise.

static Column

initcap(Column e)

Returns a new string column by converting the first letter of each word to uppercase.

static Column

inline(Column e)

Creates a new row for each element in the given array of structs.

static Column

inline_outer(Column e)

Creates a new row for each element in the given array of structs. Unlike inline, if the array is null or empty, null is produced for each nested column.

static Column

input_file_block_length()

Returns the length of the block being read, or -1 if not available.

static Column

input_file_block_start()

Returns the start offset of the block being read, or -1 if not available.

static Column

input_file_name()

Creates a string column for the file name of the current Spark task.

static Column

instr(Column str, String substring)

Locates the position of the first occurrence of substring in the given string column.

static Column

is_variant_null(Column v)

Check if a variant value is a variant null.

static Column

isnan(Column e)

Return true iff the column is NaN.

static Column

isnotnull(Column col)

Returns true if col is not null, or false otherwise.

static Column

isnull(Column e)

Return true iff the column is null.

static Column

java_method(scala.collection.immutable.Seq<Column> cols)

Calls a method with reflection.

static Column

json_array_length(Column e)

Returns the number of elements in the outermost JSON array.

static Column

json_object_keys(Column e)

Returns all the keys of the outermost JSON object as an array.

static Column

json_tuple(Column json, String... fields)

Creates a new row for a json column according to the given field names.

static Column

json_tuple(Column json, scala.collection.immutable.Seq<String> fields)

Creates a new row for a json column according to the given field names.

static Column

kurtosis(String columnName)

Aggregate function: returns the kurtosis of the values in a group.

static Column

kurtosis(Column e)

Aggregate function: returns the kurtosis of the values in a group.

static Column

lag(String columnName, int offset)

Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row.

static Column

lag(String columnName, int offset, Object defaultValue)

Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.

static Column

lag(Column e, int offset)

Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row.

static Column

lag(Column e, int offset, Object defaultValue)

Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.

static Column

lag(Column e, int offset, Object defaultValue, boolean ignoreNulls)

Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.
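The row-offset semantics can be sketched over an in-memory list, treating the list as a single ordered window partition. LagSketch is a hypothetical helper and ignores the ignoreNulls option. Passing a negative offset looks ahead instead of back, which mirrors what lead does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LagSketch {
    // Hypothetical helper: for each row i, takes the value at i - offset,
    // or defaultValue when that index falls outside the partition.
    static <T> List<T> lag(List<T> rows, int offset, T defaultValue) {
        List<T> out = new ArrayList<>(rows.size());
        for (int i = 0; i < rows.size(); i++) {
            int j = i - offset;
            out.add(j >= 0 && j < rows.size() ? rows.get(j) : defaultValue);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(10, 20, 30, 40);
        System.out.println(lag(rows, 1, 0));   // one row back
        System.out.println(lag(rows, -1, 0));  // one row ahead, like lead
    }
}
```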

static Column

last(String columnName)

Aggregate function: returns the last value of the column in a group.

static Column

last(String columnName, boolean ignoreNulls)

Aggregate function: returns the last value of the column in a group.

static Column

last(Column e)

Aggregate function: returns the last value in a group.

static Column

last(Column e, boolean ignoreNulls)

Aggregate function: returns the last value in a group.

static Column

last_day(Column e)

Returns the last day of the month which the given date belongs to.

static Column

last_value(Column e)

Aggregate function: returns the last value in a group.

static Column

last_value(Column e, Column ignoreNulls)

Aggregate function: returns the last value in a group.

static Column

lcase(Column str)

Returns str with all characters changed to lowercase.

static Column

lead(String columnName, int offset)

Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row.

static Column

lead(String columnName, int offset, Object defaultValue)

Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.

static Column

lead(Column e, int offset)

Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row.

static Column

lead(Column e, int offset, Object defaultValue)

Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.

static Column

lead(Column e, int offset, Object defaultValue, boolean ignoreNulls)

Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.

static Column

least(String columnName, String... columnNames)

Returns the least value of the list of column names, skipping null values.

static Column

least(String columnName, scala.collection.immutable.Seq<String> columnNames)

Returns the least value of the list of column names, skipping null values.

static Column

least(Column... exprs)

Returns the least value of the list of values, skipping null values.

static Column

least(scala.collection.immutable.Seq<Column> exprs)

Returns the least value of the list of values, skipping null values.

static Column

left(Column str, Column len)

Returns the leftmost len characters from the string str (len can be of string type); if len is less than or equal to 0, the result is an empty string.

static Column

len(Column e)

Computes the character length of a given string or number of bytes of a binary string.

static Column

length(Column e)

Computes the character length of a given string or number of bytes of a binary string.

static Column

levenshtein(Column l, Column r)

Computes the Levenshtein distance of the two given string columns.

static Column

levenshtein(Column l, Column r, int threshold)

Computes the Levenshtein distance of the two given string columns if it's less than or equal to a given threshold.
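For reference, a compact dynamic-programming sketch of the Levenshtein distance in plain Java. LevenshteinSketch is a hypothetical class, and the threshold variant assumes that -1 is returned when the distance exceeds the threshold.

```java
class LevenshteinSketch {
    // Classic two-row dynamic programming over insertions, deletions and substitutions.
    static int levenshtein(String l, String r) {
        int[] prev = new int[r.length() + 1];
        int[] curr = new int[r.length() + 1];
        for (int j = 0; j <= r.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= l.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= r.length(); j++) {
                int cost = l.charAt(i - 1) == r.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[r.length()];
    }

    // Assumption: distances above the threshold are reported as -1.
    static int levenshtein(String l, String r, int threshold) {
        int d = levenshtein(l, r);
        return d <= threshold ? d : -1;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));
        System.out.println(levenshtein("kitten", "sitting", 2));
    }
}
```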

static Column

like(Column str, Column pattern)

Returns true if str matches pattern with escapeChar('\'), null if any arguments are null, false otherwise.

static Column

like(Column str, Column pattern, Column escapeChar)

Returns true if str matches pattern with escapeChar, null if any arguments are null, false otherwise.

static Column

lit(Object literal)

Creates a Column of literal value.

static Column

ln(Column e)

Computes the natural logarithm of the given value.

static Column

localtimestamp()

Returns the current timestamp without time zone at the start of query evaluation as a timestamp without time zone column.

static Column

locate(String substr, Column str)

Locate the position of the first occurrence of substr.

static Column

locate(String substr, Column str, int pos)

Locate the position of the first occurrence of substr in a string column, after position pos.
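A plain-Java sketch of the 1-based position convention used by locate (and instr), assuming that 0 means "not found" and that a start position below 1 also yields 0. LocateSketch is a hypothetical name.

```java
class LocateSketch {
    // Hypothetical helper: 1-based position of substr in str, searching from 1-based pos.
    static int locate(String substr, String str, int pos) {
        if (pos < 1) {
            return 0; // assumption: invalid start positions are treated as "not found"
        }
        return str.indexOf(substr, pos - 1) + 1; // indexOf returns -1 when absent, giving 0
    }

    static int locate(String substr, String str) {
        return locate(substr, str, 1);
    }

    public static void main(String[] args) {
        System.out.println(locate("b", "abcb"));
        System.out.println(locate("b", "abcb", 3));
    }
}
```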

static Column

log(double base, String columnName)

Returns the first argument-base logarithm of the second argument.

static Column

log(double base, Column a)

Returns the first argument-base logarithm of the second argument.

static Column

log(String columnName)

Computes the natural logarithm of the given column.

static Column

log(Column e)

Computes the natural logarithm of the given value.

static Column

log10(String columnName)

Computes the logarithm of the given value in base 10.

static Column

log10(Column e)

Computes the logarithm of the given value in base 10.

static Column

log1p(String columnName)

Computes the natural logarithm of the given column plus one.

static Column

log1p(Column e)

Computes the natural logarithm of the given value plus one.

static Column

log2(String columnName)

Computes the logarithm of the given value in base 2.

static Column

log2(Column expr)

Computes the logarithm of the given column in base 2.

static Column

lower(Column e)

Converts a string column to lower case.

static Column

lpad(Column str, int len, byte[] pad)

Left-pad the binary column with pad to a byte length of len.

static Column

lpad(Column str, int len, String pad)

Left-pad the string column with pad to a length of len.
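A minimal sketch of left-padding on plain Java strings. It assumes that a value longer than len is shortened to len characters and that an empty pad leaves the value unpadded. LpadSketch is a hypothetical helper.

```java
class LpadSketch {
    // Hypothetical helper: repeats pad on the left until the result has len characters.
    static String lpad(String str, int len, String pad) {
        if (len <= 0) {
            return "";
        }
        if (str.length() >= len) {
            return str.substring(0, len); // assumption: longer values are truncated
        }
        if (pad.isEmpty()) {
            return str; // assumption: nothing to pad with
        }
        int missing = len - str.length();
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < missing; i++) {
            sb.append(pad.charAt(i % pad.length()));
        }
        return sb.append(str).toString();
    }

    public static void main(String[] args) {
        System.out.println(lpad("hi", 5, "ab"));
        System.out.println(lpad("hello", 3, "x"));
    }
}
```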

static Column

ltrim(Column e)

Trim the spaces from left end for the specified string value.

static Column

ltrim(Column e, String trimString)

Trim the specified character string from left end for the specified string column.

static Column

make_date(Column year, Column month, Column day)

static Column

make_dt_interval()

Make DayTimeIntervalType duration.

static Column

make_dt_interval(Column days)

Make DayTimeIntervalType duration from days.

static Column

make_dt_interval(Column days, Column hours)

Make DayTimeIntervalType duration from days and hours.

static Column

make_dt_interval(Column days, Column hours, Column mins)

Make DayTimeIntervalType duration from days, hours and mins.

static Column

make_dt_interval(Column days, Column hours, Column mins, Column secs)

Make DayTimeIntervalType duration from days, hours, mins and secs.

static Column

make_interval()

Make interval.

static Column

make_interval(Column years)

Make interval from years.

static Column

make_interval(Column years, Column months)

Make interval from years and months.

static Column

make_interval(Column years, Column months, Column weeks)

Make interval from years, months and weeks.

static Column

make_interval(Column years, Column months, Column weeks, Column days)

Make interval from years, months, weeks and days.

static Column

make_interval(Column years, Column months, Column weeks, Column days, Column hours)

Make interval from years, months, weeks, days and hours.

static Column

make_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins)

Make interval from years, months, weeks, days, hours and mins.

static Column

make_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins, Column secs)

Make interval from years, months, weeks, days, hours, mins and secs.

static Column

make_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs)

Create timestamp from years, months, days, hours, mins and secs fields.

static Column

make_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone)

Create timestamp from years, months, days, hours, mins, secs and timezone fields.

static Column

make_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs)

Create a timestamp with local time zone from years, months, days, hours, mins and secs fields.

static Column

make_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone)

Create a timestamp with local time zone from years, months, days, hours, mins, secs and timezone fields.

static Column

make_timestamp_ntz(Column years, Column months, Column days, Column hours, Column mins, Column secs)

Create local date-time from years, months, days, hours, mins, secs fields.

static Column

make_ym_interval()

Make year-month interval.

static Column

make_ym_interval(Column years)

Make year-month interval from years.

static Column

make_ym_interval(Column years, Column months)

Make year-month interval from years, months.

static Column

map(Column... cols)

Creates a new map column.

static Column

map(scala.collection.immutable.Seq<Column> cols)

Creates a new map column.

static Column

map_concat(Column... cols)

Returns the union of all the given maps.

static Column

map_concat(scala.collection.immutable.Seq<Column> cols)

Returns the union of all the given maps.

static Column

map_contains_key(Column column, Object key)

Returns true if the map contains the key.

static Column

map_entries(Column e)

Returns an unordered array of all entries in the given map.

static Column

map_filter(Column expr, scala.Function2<Column,Column,Column> f)

Returns a map whose key-value pairs satisfy a predicate.

static Column

map_from_arrays(Column keys, Column values)

Creates a new map column.

static Column

map_from_entries(Column e)

Returns a map created from the given array of entries.

static Column

map_keys(Column e)

Returns an unordered array containing the keys of the map.

static Column

map_values(Column e)

Returns an unordered array containing the values of the map.

static Column

map_zip_with(Column left, Column right, scala.Function3<Column,Column,Column,Column> f)

Merges two given maps key-wise into a single map using a function.

static Column

mask(Column input)

Masks the given string value.

static Column

mask(Column input, Column upperChar)

Masks the given string value.

static Column

mask(Column input, Column upperChar, Column lowerChar)

Masks the given string value.

static Column

mask(Column input, Column upperChar, Column lowerChar, Column digitChar)

Masks the given string value.

static Column

mask(Column input, Column upperChar, Column lowerChar, Column digitChar, Column otherChar)

Masks the given string value.
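The single-argument form can be sketched in plain Java under the assumption that the default replacements are 'X' for upper-case letters, 'x' for lower-case letters and 'n' for digits, with all other characters left unchanged. MaskSketch is a hypothetical name.

```java
class MaskSketch {
    // Hypothetical helper: replaces each character by its class marker (assumed defaults).
    static String mask(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            if (Character.isUpperCase(c)) {
                sb.append('X');
            } else if (Character.isLowerCase(c)) {
                sb.append('x');
            } else if (Character.isDigit(c)) {
                sb.append('n');
            } else {
                sb.append(c); // other characters are kept as-is
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(mask("AbCD123-@$#"));
    }
}
```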

static Column

max(String columnName)

Aggregate function: returns the maximum value of the column in a group.

static Column

max(Column e)

Aggregate function: returns the maximum value of the expression in a group.

static Column

max_by(Column e, Column ord)

Aggregate function: returns the value associated with the maximum value of ord.

static Column

md5(Column e)

Calculates the MD5 digest of a binary column and returns the value as a 32 character hex string.

static Column

mean(String columnName)

Aggregate function: returns the average of the values in a group.

static Column

mean(Column e)

Aggregate function: returns the average of the values in a group.

static Column

median(Column e)

Aggregate function: returns the median of the values in a group.

static Column

min(String columnName)

Aggregate function: returns the minimum value of the column in a group.

static Column

min(Column e)

Aggregate function: returns the minimum value of the expression in a group.

static Column

min_by(Column e, Column ord)

Aggregate function: returns the value associated with the minimum value of ord.

static Column

minute(Column e)

Extracts the minutes as an integer from a given date/timestamp/string.

static Column

mode(Column e)

Aggregate function: returns the most frequent value in a group.

static Column

mode(Column e, boolean deterministic)

Aggregate function: returns the most frequent value in a group.

static Column

monotonically_increasing_id()

A column expression that generates monotonically increasing 64-bit integers.
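One way such IDs can be unique and increasing without being consecutive is to pack a partition ID and a per-partition record number into a single long. The sketch below assumes a layout with the partition ID in the upper 31 bits and the record number in the lower 33 bits. MonotonicIdSketch is a hypothetical name, and the layout should be checked against the detailed documentation of the Spark version in use.

```java
class MonotonicIdSketch {
    // Assumed layout: partition ID in the upper 31 bits, record number in the lower 33 bits.
    static long id(int partitionId, long recordNumber) {
        return ((long) partitionId << 33) + recordNumber;
    }

    public static void main(String[] args) {
        System.out.println(id(0, 0));
        System.out.println(id(0, 1));
        System.out.println(id(1, 0));
    }
}
```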

static Column

monotonicallyIncreasingId()

Deprecated.
Use monotonically_increasing_id().

static Column

month(Column e)

Extracts the month as an integer from a given date/timestamp/string.

static Column

monthname(Column timeExp)

Extracts the three-letter abbreviated month name from a given date/timestamp/string.

static Column

months(Column e)

(Java-specific) A transform for timestamps and dates to partition data into months.

static Column

months_between(Column end, Column start)

Returns the number of months between dates end and start.

static Column

months_between(Column end, Column start, boolean roundOff)

Returns the number of months between dates end and start.

static Column

named_struct(scala.collection.immutable.Seq<Column> cols)

Creates a struct with the given field names and values.

static Column

nanvl(Column col1, Column col2)

Returns col1 if it is not NaN, or col2 if col1 is NaN.

static Column

negate(Column e)

Unary minus, i.e. negates the expression.

static Column

negative(Column e)

Returns the negated value.

static Column

next_day(Column date, String dayOfWeek)

Returns the first date which is later than the value of the date column that is on the specified day of the week.

static Column

next_day(Column date, Column dayOfWeek)

Returns the first date which is later than the value of the date column that is on the specified day of the week.
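The "strictly later" behavior matches java.time's TemporalAdjusters.next, which can serve as a plain-Java sketch. NextDaySketch is a hypothetical name, and the day of week is passed as a DayOfWeek enum value, not as a string.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

class NextDaySketch {
    // Hypothetical helper: first date strictly after 'date' that falls on 'dayOfWeek'.
    static LocalDate nextDay(LocalDate date, DayOfWeek dayOfWeek) {
        return date.with(TemporalAdjusters.next(dayOfWeek));
    }

    public static void main(String[] args) {
        LocalDate monday = LocalDate.of(2015, 7, 27); // a Monday
        System.out.println(nextDay(monday, DayOfWeek.SUNDAY));
        System.out.println(nextDay(monday, DayOfWeek.MONDAY)); // skips to the following week
    }
}
```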

static Column

not(Column e)

Inversion of a boolean expression, i.e. NOT.

static Column

now()

Returns the current timestamp at the start of query evaluation.

static Column

nth_value(Column e, int offset)

Window function: returns the value at the offset-th row of the window frame (counting from 1), and null if the window frame contains fewer than offset rows.

static Column

nth_value(Column e, int offset, boolean ignoreNulls)

Window function: returns the value at the offset-th row of the window frame (counting from 1), and null if the window frame contains fewer than offset rows.

static Column

ntile(int n)

Window function: returns the ntile group id (from 1 to n inclusive) in an ordered window partition.
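A sketch of how rows in an ordered partition might be assigned to n groups, assuming the usual SQL NTILE convention that group sizes differ by at most one and the larger groups come first. NtileSketch is a hypothetical helper that works only on a row count.

```java
import java.util.Arrays;

class NtileSketch {
    // Hypothetical helper: returns the 1-based group id for each of rowCount ordered rows.
    static int[] ntile(int rowCount, int n) {
        int[] groups = new int[rowCount];
        int base = rowCount / n;   // minimum rows per group
        int extra = rowCount % n;  // the first 'extra' groups get one more row
        int row = 0;
        for (int g = 1; g <= n && row < rowCount; g++) {
            int size = base + (g <= extra ? 1 : 0);
            for (int k = 0; k < size; k++) {
                groups[row++] = g;
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(ntile(5, 2)));
    }
}
```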

static Column

nullif(Column col1, Column col2)

Returns null if col1 equals col2, or col1 otherwise.

static Column

nvl(Column col1, Column col2)

Returns col2 if col1 is null, or col1 otherwise.

static Column

nvl2(Column col1, Column col2, Column col3)

Returns col2 if col1 is not null, or col3 otherwise.
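The null-handling helpers nullif, nvl (same behavior as ifnull) and nvl2 are easy to confuse. The plain-Java sketch below restates their rules on ordinary object references. NullHandlingSketch is a hypothetical class and does not model SQL three-valued logic.

```java
class NullHandlingSketch {
    // col2 if col1 is null, otherwise col1 (same rule as ifnull).
    static <T> T nvl(T col1, T col2) {
        return col1 == null ? col2 : col1;
    }

    // col2 if col1 is not null, otherwise col3.
    static <T> T nvl2(Object col1, T col2, T col3) {
        return col1 != null ? col2 : col3;
    }

    // null if col1 equals col2, otherwise col1.
    static <T> T nullif(T col1, T col2) {
        return col1 != null && col1.equals(col2) ? null : col1;
    }

    public static void main(String[] args) {
        System.out.println(nvl(null, "fallback"));
        System.out.println(nvl2("present", "yes", "no"));
        System.out.println(nullif("a", "a"));
    }
}
```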

static Column

octet_length(Column e)

Calculates the byte length for the specified string column.

static Column

overlay(Column src, Column replace, Column pos)

Overlay the specified portion of src with replace, starting from byte position pos of src.

static Column

overlay(Column src, Column replace, Column pos, Column len)

Overlay the specified portion of src with replace, starting from byte position pos of src and proceeding for len bytes.

static Column

parse_json(Column json)

Parses a JSON string and constructs a Variant value.

static Column

parse_url(Column url, Column partToExtract)

Extracts a part from a URL.

static Column

parse_url(Column url, Column partToExtract, Column key)

Extracts a part from a URL.

static Column

percent_rank()

Window function: returns the relative rank (i.e. percentile) of rows within a window partition.

static Column

percentile(Column e, Column percentage)

Aggregate function: returns the exact percentile(s) of the numeric column e at the given percentage(s), each in the range [0.0, 1.0].

static Column

percentile(Column e, Column percentage, Column frequency)

Aggregate function: returns the exact percentile(s) of the numeric column e at the given percentage(s), each in the range [0.0, 1.0].

static Column

percentile_approx(Column e, Column percentage, Column accuracy)

Aggregate function: returns the approximate percentile of the numeric column e, which is the smallest value in the ordered e values (sorted from least to greatest) such that no more than percentage of e values is less than or equal to that value.

static Column

pi()

Returns Pi.

static Column

pmod(Column dividend, Column divisor)

Returns the positive value of dividend mod divisor.
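Java's % operator keeps the sign of the dividend, so a positive modulus needs one extra step. The sketch below shows one common way to do this, under the assumption that negative remainders are shifted by the divisor. PmodSketch is a hypothetical name, and the behavior for negative divisors is not covered.

```java
class PmodSketch {
    // Hypothetical helper: remainder adjusted to be non-negative for a positive divisor.
    static int pmod(int dividend, int divisor) {
        int r = dividend % divisor;
        return r < 0 ? (r + divisor) % divisor : r;
    }

    public static void main(String[] args) {
        System.out.println(-7 % 3);       // plain remainder keeps the sign of the dividend
        System.out.println(pmod(-7, 3));  // adjusted to the range [0, divisor)
    }
}
```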

static Column

posexplode(Column e)

Creates a new row for each element with position in the given array or map column.

static Column

posexplode_outer(Column e)

Creates a new row for each element with position in the given array or map column. Unlike posexplode, if the array or map is null or empty, the row (null, null) is produced.

static Column

position(Column substr, Column str)

Returns the position of the first occurrence of substr in str after position 1.

static Column

position(Column substr, Column str, Column start)

Returns the position of the first occurrence of substr in str after position start.

static Column

positive(Column e)

Returns the value.

static Column

pow(double l, String rightName)

Returns the value of the first argument raised to the power of the second argument.

static Column

pow(double l, Column r)

Returns the value of the first argument raised to the power of the second argument.

static Column

pow(String leftName, double r)

Returns the value of the first argument raised to the power of the second argument.

static Column

pow(String leftName, String rightName)

Returns the value of the first argument raised to the power of the second argument.

static Column

pow(String leftName, Column r)

Returns the value of the first argument raised to the power of the second argument.

static Column

pow(Column l, double r)

Returns the value of the first argument raised to the power of the second argument.

static Column

pow(Column l, String rightName)

Returns the value of the first argument raised to the power of the second argument.

static Column

pow(Column l, Column r)

Returns the value of the first argument raised to the power of the second argument.

static Column

power(Column l, Column r)

Returns the value of the first argument raised to the power of the second argument.

static Column

printf(Column format, scala.collection.immutable.Seq<Column> arguments)

Formats the arguments in printf-style and returns the result as a string column.

static Column

product(Column e)

Aggregate function: returns the product of all numerical elements in a group.

static Column

quarter(Column e)

Extracts the quarter as an integer from a given date/timestamp/string.

static Column

radians(String columnName)

Converts an angle measured in degrees to an approximately equivalent angle measured in radians.

static Column

radians(Column e)

Converts an angle measured in degrees to an approximately equivalent angle measured in radians.

static Column

raise_error(Column c)

Throws an exception with the provided error message.

static Column

rand()

Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).

static Column

rand(long seed)

Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).

static Column

randn()

Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.

static Column

randn(long seed)

Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.

static Column

random()

Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1).

static Column

random(Column seed)

Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1).

static Column

rank()

Window function: returns the rank of rows within a window partition.

static Column

reduce(Column expr, Column initialValue, scala.Function2<Column,Column,Column> merge)

Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state.

static Column

reduce(Column expr, Column initialValue, scala.Function2<Column,Column,Column> merge, scala.Function1<Column,Column> finish)

Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state.

static Column

reflect(scala.collection.immutable.Seq<Column> cols)

Calls a method with reflection.

static Column

regexp(Column str, Column regexp)

Returns true if str matches regexp, or false otherwise.

static Column

regexp_count(Column str, Column regexp)

Returns a count of the number of times that the regular expression pattern regexp is matched in the string str.

static Column

regexp_extract(Column e, String exp, int groupIdx)

Extract a specific group matched by a Java regex, from the specified string column.

static Column

regexp_extract_all(Column str, Column regexp)

Extracts all strings in str that match the regexp expression and correspond to the first regex group index.

static Column

regexp_extract_all(Column str, Column regexp, Column idx)

Extracts all strings in str that match the regexp expression and correspond to the regex group index idx.

static Column

regexp_instr(Column str, Column regexp)

Searches a string for a regular expression and returns an integer that indicates the beginning position of the matched substring.

static Column

regexp_instr(Column str, Column regexp, Column idx)

Searches a string for a regular expression and returns an integer that indicates the beginning position of the matched substring.

static Column

regexp_like(Column str, Column regexp)

Returns true if str matches regexp, or false otherwise.

static Column

regexp_replace(Column e, String pattern, String replacement)

Replaces all substrings of the specified string value that match pattern with replacement.

static Column

regexp_replace(Column e, Column pattern, Column replacement)

Replaces all substrings of the specified string value that match pattern with replacement.

static Column

regexp_substr(Column str, Column regexp)

Returns the substring that matches the regular expression regexp within the string str.

static Column

regr_avgx(Column y, Column x)

Aggregate function: returns the average of the independent variable for non-null pairs in a group, where y is the dependent variable and x is the independent variable.

static Column

regr_avgy(Column y, Column x)

Aggregate function: returns the average of the dependent variable for non-null pairs in a group, where y is the dependent variable and x is the independent variable.

static Column

regr_count(Column y, Column x)

Aggregate function: returns the number of non-null number pairs in a group, where y is the dependent variable and x is the independent variable.

static Column

regr_intercept(Column y, Column x)

Aggregate function: returns the intercept of the univariate linear regression line for non-null pairs in a group, where y is the dependent variable and x is the independent variable.

static Column

regr_r2(Column y, Column x)

Aggregate function: returns the coefficient of determination for non-null pairs in a group, where y is the dependent variable and x is the independent variable.

static Column

regr_slope(Column y, Column x)

Aggregate function: returns the slope of the linear regression line for non-null pairs in a group, where y is the dependent variable and x is the independent variable.

static Column

regr_sxx(Column y, Column x)

Aggregate function: returns REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs in a group, where y is the dependent variable and x is the independent variable.

static Column

regr_sxy(Column y, Column x)

Aggregate function: returns REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs in a group, where y is the dependent variable and x is the independent variable.

static Column

regr_syy(Column y, Column x)

Aggregate function: returns REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs in a group, where y is the dependent variable and x is the independent variable.
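The regr_* quantities above are related by the standard least-squares formulas. The following plain-Java sketch shows one way to compute them over nullable pairs. It is not Spark's implementation; the class name and the two-pass approach are assumptions chosen for readability.

```java
// Plain-Java illustration of the regr_* quantities; not Spark code.
class RegrSketch {
    final long count;           // regr_count: pairs where both y and x are non-null
    final double avgX, avgY;    // regr_avgx, regr_avgy
    final double sxx, sxy, syy; // regr_sxx, regr_sxy, regr_syy

    RegrSketch(Double[] ys, Double[] xs) {
        long n = 0;
        double sumX = 0, sumY = 0;
        // First pass: count and sum the non-null pairs.
        for (int i = 0; i < xs.length; i++) {
            if (xs[i] == null || ys[i] == null) continue;
            n++;
            sumX += xs[i];
            sumY += ys[i];
        }
        count = n;
        avgX = sumX / n; // NaN when there are no non-null pairs
        avgY = sumY / n;
        // Second pass: sums of squared deviations and cross products.
        double dxx = 0, dxy = 0, dyy = 0;
        for (int i = 0; i < xs.length; i++) {
            if (xs[i] == null || ys[i] == null) continue;
            double dx = xs[i] - avgX, dy = ys[i] - avgY;
            dxx += dx * dx; // equals count * VAR_POP(x)
            dxy += dx * dy; // equals count * COVAR_POP(y, x)
            dyy += dy * dy; // equals count * VAR_POP(y)
        }
        sxx = dxx;
        sxy = dxy;
        syy = dyy;
    }

    double slope()     { return sxy / sxx; }
    double intercept() { return avgY - slope() * avgX; }
    double r2()        { return (sxy * sxy) / (sxx * syy); }

    public static void main(String[] args) {
        Double[] xs = {1.0, 2.0, null, 3.0};
        Double[] ys = {3.0, 5.0, 9.0, 7.0}; // y = 2x + 1 on the non-null pairs
        RegrSketch r = new RegrSketch(ys, xs);
        System.out.println(r.count + " " + r.slope() + " " + r.intercept() + " " + r.r2());
    }
}
```

The pair with a null x is skipped entirely, which is why its y value does not affect any of the results.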

static Column

repeat(Column str, int n)

Repeats a string column n times, and returns it as a new string column.

static Column

repeat(Column str, Column n)

Repeats a string column n times, and returns it as a new string column.

static Column

replace(Column src, Column search)

Removes all occurrences of search from src.

static Column

replace(Column src, Column search, Column replace)

Replaces all occurrences of search in src with replace.

static Column

reverse(Column e)

Returns a reversed string or an array with reverse order of elements.

static Column

right(Column str, Column len)

Returns the rightmost len characters from the string str (len can be of string type); if len is less than or equal to 0, the result is an empty string.

static Column

rint(String columnName)

Returns the double value that is closest in value to the argument and is equal to a mathematical integer.

static Column

rint(Column e)

Returns the double value that is closest in value to the argument and is equal to a mathematical integer.

static Column

rlike(Column str, Column regexp)

Returns true if str matches regexp, or false otherwise.

static Column

round(Column e)

Returns the value of the column e rounded to 0 decimal places with HALF_UP round mode.

static Column

round(Column e, int scale)

Round the value of e to scale decimal places with HALF_UP round mode if scale is greater than or equal to 0 or at integral part when scale is less than 0.

static Column

round(Column e, Column scale)

Round the value of e to scale decimal places with HALF_UP round mode if scale is greater than or equal to 0 or at integral part when scale is less than 0.
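The HALF_UP mode and the meaning of a negative scale can be illustrated with java.math.BigDecimal. This is a plain-Java sketch of the rounding rule only, with a hypothetical helper name; it does not claim to mirror how Spark evaluates round internally.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Plain-Java illustration of HALF_UP rounding; not Spark code.
class HalfUpSketch {

    // Rounds a decimal literal to `scale` places; a negative scale rounds
    // within the integral part (for example, -2 rounds to hundreds).
    static BigDecimal roundHalfUp(String value, int scale) {
        return new BigDecimal(value).setScale(scale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(roundHalfUp("2.5", 0));                     // 3
        System.out.println(roundHalfUp("-2.5", 0));                    // -3
        System.out.println(roundHalfUp("1234.567", 2));                // 1234.57
        System.out.println(roundHalfUp("1234.567", -2).toPlainString()); // 1200
        // For contrast, HALF_EVEN sends ties to the even neighbour.
        System.out.println(new BigDecimal("2.5").setScale(0, RoundingMode.HALF_EVEN)); // 2
    }
}
```

HALF_UP rounds ties away from zero, so -2.5 becomes -3 rather than -2.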

static Column

row_number()

Window function: returns a sequential number starting at 1 within a window partition.

static Column

rpad(Column str, int len, byte[] pad)

Right-pad the binary column with pad to a byte length of len.

static Column

rpad(Column str, int len, String pad)

Right-pad the string column with pad to a length of len.

static Column

rtrim(Column e)

Trim the spaces from right end for the specified string value.

static Column

rtrim(Column e, String trimString)

Trim the specified character string from right end for the specified string column.

static Column

schema_of_csv(String csv)

Parses a CSV string and infers its schema in DDL format.

static Column

schema_of_csv(Column csv)

Parses a CSV string and infers its schema in DDL format.

static Column

schema_of_csv(Column csv, Map<String,String> options)

Parses a CSV string and infers its schema in DDL format using options.

static Column

schema_of_json(String json)

Parses a JSON string and infers its schema in DDL format.

static Column

schema_of_json(Column json)

Parses a JSON string and infers its schema in DDL format.

static Column

schema_of_json(Column json, Map<String,String> options)

Parses a JSON string and infers its schema in DDL format using options.

static Column

schema_of_variant(Column v)

Returns schema in the SQL format of a variant.

static Column

schema_of_variant_agg(Column v)

Returns the merged schema in the SQL format of a variant column.

static Column

schema_of_xml(String xml)

Parses an XML string and infers its schema in DDL format.

static Column

schema_of_xml(Column xml)

Parses an XML string and infers its schema in DDL format.

static Column

schema_of_xml(Column xml, Map<String,String> options)

Parses an XML string and infers its schema in DDL format using options.

static Column

sec(Column e)

static Column

second(Column e)

Extracts the seconds as an integer from a given date/timestamp/string.

static Column

sentences(Column string)

Splits a string into arrays of sentences, where each sentence is an array of words.

static Column

sentences(Column string, Column language, Column country)

Splits a string into arrays of sentences, where each sentence is an array of words.

static Column

sequence(Column start, Column stop)

Generate a sequence of integers from start to stop, incrementing by 1 if start is less than or equal to stop, otherwise -1.

static Column

sequence(Column start, Column stop, Column step)

Generate a sequence of integers from start to stop, incrementing by step.
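The two sequence variants can be sketched in plain Java as follows. The stop value is inclusive and the default step is chosen from the direction of the range, as described above. The class name is hypothetical, and the handling of a step that points away from stop (an empty result) is an assumption of this sketch rather than documented Spark behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of inclusive integer sequences; not Spark code.
class SequenceSketch {

    // Default step: +1 when start <= stop, otherwise -1.
    static List<Long> sequence(long start, long stop) {
        return sequence(start, stop, start <= stop ? 1 : -1);
    }

    // Generates start, start + step, ... up to and including stop when it is reachable.
    static List<Long> sequence(long start, long stop, long step) {
        if (step == 0) {
            throw new IllegalArgumentException("step must be non-zero");
        }
        List<Long> out = new ArrayList<>();
        for (long v = start; step > 0 ? v <= stop : v >= stop; v += step) {
            out.add(v);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sequence(1, 5));     // [1, 2, 3, 4, 5]
        System.out.println(sequence(5, 1));     // [5, 4, 3, 2, 1]
        System.out.println(sequence(1, 10, 3)); // [1, 4, 7, 10]
    }
}
```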

static Column

session_user()

Returns the user name of current execution context.

static Column

session_window(Column timeColumn, String gapDuration)

Generates session window given a timestamp specifying column.

static Column

session_window(Column timeColumn, Column gapDuration)

Generates session window given a timestamp specifying column.

static Column

sha(Column col)

Returns a sha1 hash value as a hex string of the col.

static Column

sha1(Column e)

Calculates the SHA-1 digest of a binary column and returns the value as a 40 character hex string.
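A SHA-1 digest is 20 bytes, which is why the hex string has 40 characters. The plain-Java sketch below reproduces that encoding with java.security.MessageDigest; the class and method names are hypothetical and it is not taken from Spark.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Plain-Java illustration of a SHA-1 hex digest; not Spark code.
class Sha1HexSketch {

    // Hashes the bytes with SHA-1 and renders each digest byte as two lowercase hex digits.
    static String sha1Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        String hex = sha1Hex("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(hex);          // a9993e364706816aba3e25717850c26c9cd0d89d
        System.out.println(hex.length()); // 40
    }
}
```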

static Column

sha2(Column e, int numBits)

Calculates the SHA-2 family of hash functions of a binary column and returns the value as a hex string.

static Column

shiftleft(Column e, int numBits)

Shift the given value numBits left.

static Column

shiftLeft(Column e, int numBits)

Deprecated.
Use shiftleft.

static Column

shiftright(Column e, int numBits)

(Signed) shift the given value numBits right.

static Column

shiftRight(Column e, int numBits)

Deprecated.
Use shiftright.

static Column

shiftrightunsigned(Column e, int numBits)

Unsigned shift the given value numBits right.
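The difference between the signed and unsigned right shifts shows up only for negative inputs. Java's own operators make this concrete; the sketch below uses 32-bit int values and hypothetical method names, and is an illustration rather than Spark code.

```java
// Plain-Java illustration of the three shift flavours on 32-bit ints; not Spark code.
class ShiftSketch {

    static int shiftLeft(int value, int numBits) {
        return value << numBits;
    }

    // Signed shift: the sign bit is copied into the vacated high bits.
    static int shiftRight(int value, int numBits) {
        return value >> numBits;
    }

    // Unsigned shift: zeros are shifted into the vacated high bits.
    static int shiftRightUnsigned(int value, int numBits) {
        return value >>> numBits;
    }

    public static void main(String[] args) {
        System.out.println(shiftLeft(1, 3));           // 8
        System.out.println(shiftRight(-8, 1));         // -4
        System.out.println(shiftRightUnsigned(-8, 1)); // 2147483644
    }
}
```

For non-negative inputs the signed and unsigned right shifts give the same result.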

static Column

shiftRightUnsigned(Column e, int numBits)

Deprecated.
Use shiftrightunsigned.

static Column

shuffle(Column e)

Returns a random permutation of the given array.

static Column

sign(Column e)

Computes the signum of the given value.

static Column

signum(String columnName)

Computes the signum of the given column.

static Column

signum(Column e)

Computes the signum of the given value.

static Column

sin(String columnName)

static Column

sin(Column e)

static Column

sinh(String columnName)

static Column

sinh(Column e)

static Column

size(Column e)

Returns length of array or map.

static Column

skewness(String columnName)

Aggregate function: returns the skewness of the values in a group.

static Column

skewness(Column e)

Aggregate function: returns the skewness of the values in a group.

static Column

slice(Column x, int start, int length)

Returns an array containing all the elements in x from index start (or starting from the end if start is negative) with the specified length.

static Column

slice(Column x, Column start, Column length)

Returns an array containing all the elements in x from index start (or starting from the end if start is negative) with the specified length.

static Column

some(Column e)

Aggregate function: returns true if at least one value of e is true.

static Column

sort_array(Column e)

Sorts the input array for the given column in ascending order, according to the natural ordering of the array elements.

static Column

sort_array(Column e, boolean asc)

Sorts the input array for the given column in ascending or descending order, according to the natural ordering of the array elements.

static Column

soundex(Column e)

Returns the soundex code for the specified expression.

static Column

spark_partition_id()

Partition ID.

static Column

split(Column str, String pattern)

Splits str around matches of the given pattern.

static Column

split(Column str, String pattern, int limit)

Splits str around matches of the given pattern.

static Column

split(Column str, Column pattern)

Splits str around matches of the given pattern.

static Column

split(Column str, Column pattern, Column limit)

Splits str around matches of the given pattern.

static Column

split_part(Column str, Column delimiter, Column partNum)

Splits str by delimiter and returns the requested part of the split (1-based).

static Column

sqrt(String colName)

Computes the square root of the specified float value.

static Column

sqrt(Column e)

Computes the square root of the specified float value.

static Column

stack(scala.collection.immutable.Seq<Column> cols)

Separates col1, ..., colk into n rows.

static Column

startswith(Column str, Column prefix)

Returns a boolean indicating whether str starts with prefix.

static Column

std(Column e)

Aggregate function: alias for stddev_samp.

static Column

stddev(String columnName)

Aggregate function: alias for stddev_samp.

static Column

stddev(Column e)

Aggregate function: alias for stddev_samp.

static Column

stddev_pop(String columnName)

Aggregate function: returns the population standard deviation of the expression in a group.

static Column

stddev_pop(Column e)

Aggregate function: returns the population standard deviation of the expression in a group.

static Column

stddev_samp(String columnName)

Aggregate function: returns the sample standard deviation of the expression in a group.

static Column

stddev_samp(Column e)

Aggregate function: returns the sample standard deviation of the expression in a group.

static Column

str_to_map(Column text)

Creates a map after splitting the text into key/value pairs using delimiters.

static Column

str_to_map(Column text, Column pairDelim)

Creates a map after splitting the text into key/value pairs using delimiters.

static Column

str_to_map(Column text, Column pairDelim, Column keyValueDelim)

Creates a map after splitting the text into key/value pairs using delimiters.

static Column

struct(String colName, String... colNames)

Creates a new struct column that composes multiple input columns.

static Column

struct(String colName, scala.collection.immutable.Seq<String> colNames)

Creates a new struct column that composes multiple input columns.

static Column

struct(Column... cols)

Creates a new struct column.

static Column

struct(scala.collection.immutable.Seq<Column> cols)

Creates a new struct column.

static Column

substr(Column str, Column pos)

Returns the substring of str that starts at pos, or the slice of byte array that starts at pos.

static Column

substr(Column str, Column pos, Column len)

Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.

static Column

substring(Column str, int pos, int len)

Returns the substring that starts at pos and is of length len when str is String type, or the slice of the byte array that starts at pos (in bytes) and is of length len when str is Binary type.

static Column

substring_index(Column str, String delim, int count)

Returns the substring from string str before count occurrences of the delimiter delim.
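A plain-Java sketch of the "before count occurrences" rule follows. The positive-count branch matches the description above. The negative-count branch (counting delimiters from the right and returning what follows) and the results for a zero count or too few delimiters are assumptions of this sketch, modelled on the usual SQL SUBSTRING_INDEX convention. The names are hypothetical.

```java
// Plain-Java illustration of substring_index semantics; not Spark code.
class SubstringIndexSketch {

    static String substringIndex(String str, String delim, int count) {
        if (count == 0 || delim.isEmpty()) {
            return ""; // assumed convention for degenerate input
        }
        if (count > 0) {
            // Walk forward to the count-th delimiter and keep everything before it.
            int idx = -1;
            for (int i = 0; i < count; i++) {
                idx = str.indexOf(delim, idx + 1);
                if (idx < 0) {
                    return str; // fewer delimiters than requested
                }
            }
            return str.substring(0, idx);
        }
        // Negative count: walk backward and keep everything after the delimiter.
        int idx = str.length();
        for (int i = 0; i < -count; i++) {
            idx = str.lastIndexOf(delim, idx - 1);
            if (idx < 0) {
                return str;
            }
        }
        return str.substring(idx + delim.length());
    }

    public static void main(String[] args) {
        System.out.println(substringIndex("www.apache.org", ".", 2));  // www.apache
        System.out.println(substringIndex("www.apache.org", ".", -1)); // org
    }
}
```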

static Column

sum(String columnName)

Aggregate function: returns the sum of all values in the given column.

static Column

sum(Column e)

Aggregate function: returns the sum of all values in the expression.

static Column

sum_distinct(Column e)

Aggregate function: returns the sum of distinct values in the expression.

static Column

sumDistinct(String columnName)

Deprecated.
Use sum_distinct.

static Column

sumDistinct(Column e)

Deprecated.
Use sum_distinct.

static Column

tan(String columnName)

static Column

tan(Column e)

static Column

tanh(String columnName)

static Column

tanh(Column e)

static Column

timestamp_add(String unit, Column quantity, Column ts)

Adds the specified number of units to the given timestamp.

static Column

timestamp_diff(String unit, Column start, Column end)

Gets the difference between the timestamps in the specified units by truncating the fraction part.

static Column

timestamp_micros(Column e)

Creates timestamp from the number of microseconds since UTC epoch.

static Column

timestamp_millis(Column e)

Creates timestamp from the number of milliseconds since UTC epoch.

static Column

timestamp_seconds(Column e)

Converts the number of seconds from the Unix epoch (1970-01-01T00:00:00Z) to a timestamp.

static Column

to_binary(Column e)

Converts the input e to a binary value based on the default format "hex".

static Column

to_binary(Column e, Column f)

Converts the input e to a binary value based on the supplied format.

static Column

to_char(Column e, Column format)

Convert e to a string based on the format.

static Column

to_csv(Column e)

Converts a column containing a StructType into a CSV string with the specified schema.

static Column

to_csv(Column e, Map<String,String> options)

(Java-specific) Converts a column containing a StructType into a CSV string with the specified schema.

static Column

to_date(Column e)

Converts the column into DateType by casting rules to DateType.

static Column

to_date(Column e, String fmt)

Converts the column into a DateType with a specified format.

static Column

to_json(Column e)

Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema.

static Column

to_json(Column e, Map<String,String> options)

(Java-specific) Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema.

static Column

to_json(Column e, scala.collection.immutable.Map<String,String> options)

(Scala-specific) Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema.

static Column

to_number(Column e, Column format)

Convert string 'e' to a number based on the string format 'format'.

static Column

to_timestamp(Column s)

Converts to a timestamp by casting rules to TimestampType.

static Column

to_timestamp(Column s, String fmt)

Converts time string with the given pattern to timestamp.

static Column

to_timestamp_ltz(Column timestamp)

Parses the timestamp expression with the default format to a timestamp with local time zone.

static Column

to_timestamp_ltz(Column timestamp, Column format)

Parses the timestamp expression with the format expression to a timestamp with local time zone.

static Column

to_timestamp_ntz(Column timestamp)

Parses the timestamp expression with the default format to a timestamp without time zone.

static Column

to_timestamp_ntz(Column timestamp, Column format)

Parses the timestamp expression with the format expression to a timestamp without time zone.

static Column

to_unix_timestamp(Column timeExp)

Returns the UNIX timestamp of the given time.

static Column

to_unix_timestamp(Column timeExp, Column format)

Returns the UNIX timestamp of the given time.

static Column

to_utc_timestamp(Column ts, String tz)

Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC.

static Column

to_utc_timestamp(Column ts, Column tz)

Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC.

static Column

to_varchar(Column e, Column format)

Convert e to a string based on the format.

static Column

to_xml(Column e)

Converts a column containing a StructType into an XML string with the specified schema.

static Column

to_xml(Column e, Map<String,String> options)

(Java-specific) Converts a column containing a StructType into an XML string with the specified schema.

static Column

toDegrees(String columnName)

Deprecated.
Use degrees.

static Column

toDegrees(Column e)

Deprecated.
Use degrees.

static Column

toRadians(String columnName)

Deprecated.
Use radians.

static Column

toRadians(Column e)

Deprecated.
Use radians.

static Column

transform(Column column, scala.Function1<Column,Column> f)

Returns an array of elements after applying a transformation to each element in the input array.

static Column

transform(Column column, scala.Function2<Column,Column,Column> f)

Returns an array of elements after applying a transformation to each element in the input array.

static Column

transform_keys(Column expr, scala.Function2<Column,Column,Column> f)

Applies a function to every key-value pair in a map and returns a map with the results of those applications as the new keys for the pairs.

static Column

transform_values(Column expr, scala.Function2<Column,Column,Column> f)

Applies a function to every key-value pair in a map and returns a map with the results of those applications as the new values for the pairs.

static Column

translate(Column src, String matchingString, String replaceString)

Translate any character in src that matches a character in matchingString to the corresponding character in replaceString.

static Column

trim(Column e)

Trim the spaces from both ends for the specified string column.

static Column

trim(Column e, String trimString)

Trim the specified character from both ends for the specified string column.

static Column

trunc(Column date, String format)

Returns date truncated to the unit specified by the format.

static Column

try_add(Column left, Column right)

Returns the sum of left and right and the result is null on overflow.
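The "null on overflow" contract of the try_* arithmetic functions can be pictured with Java's exact-arithmetic helpers. This is a plain-Java sketch on 32-bit int values with a hypothetical method name, not Spark's implementation.

```java
// Plain-Java illustration of "null instead of an overflow error"; not Spark code.
class TryAddSketch {

    // Returns the sum, or null when the exact result does not fit in an int.
    static Integer tryAdd(int left, int right) {
        try {
            return Math.addExact(left, right);
        } catch (ArithmeticException overflow) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAdd(1, 2));                 // 3
        System.out.println(tryAdd(Integer.MAX_VALUE, 1)); // null
    }
}
```

Math.subtractExact and Math.multiplyExact could be wrapped the same way to picture try_subtract and try_multiply.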

static Column

try_aes_decrypt(Column input, Column key)

Returns a decrypted value of input.

static Column

try_aes_decrypt(Column input, Column key, Column mode)

Returns a decrypted value of input.

static Column

try_aes_decrypt(Column input, Column key, Column mode, Column padding)

Returns a decrypted value of input.

static Column

try_aes_decrypt(Column input, Column key, Column mode, Column padding, Column aad)

This is a special version of aes_decrypt that performs the same operation, but returns a NULL value instead of raising an error if the decryption cannot be performed.

static Column

try_avg(Column e)

Returns the mean calculated from values of a group and the result is null on overflow.

static Column

try_divide(Column left, Column right)

Returns dividend/divisor.

static Column

try_element_at(Column column, Column value)

(array, index) - Returns element of array at given (1-based) index.

static Column

try_multiply(Column left, Column right)

Returns left*right and the result is null on overflow.

static Column

try_parse_json(Column json)

Parses a JSON string and constructs a Variant value.

static Column

try_reflect(scala.collection.immutable.Seq<Column> cols)

This is a special version of reflect that performs the same operation, but returns a NULL value instead of raising an error if the invoked method throws an exception.

static Column

try_remainder(Column left, Column right)

Returns the remainder of dividend/divisor.

static Column

try_subtract(Column left, Column right)

Returns left-right and the result is null on overflow.

static Column

try_sum(Column e)

Returns the sum calculated from values of a group and the result is null on overflow.

static Column

try_to_binary(Column e)

This is a special version of to_binary that performs the same operation, but returns a NULL value instead of raising an error if the conversion cannot be performed.

static Column

try_to_binary(Column e, Column f)

This is a special version of to_binary that performs the same operation, but returns a NULL value instead of raising an error if the conversion cannot be performed.

static Column

try_to_number(Column e, Column format)

Convert string e to a number based on the string format format.

static Column

try_to_timestamp(Column s)

Parses the s to a timestamp.

static Column

try_to_timestamp(Column s, Column format)

Parses the s with the format to a timestamp.

static Column

try_variant_get(Column v, String path, String targetType)

Extracts a sub-variant from v according to path, and then cast the sub-variant to targetType.

static <T> Column

typedlit(T literal, scala.reflect.api.TypeTags.TypeTag<T> evidence$2)

Creates a Column of literal value.

static <T> Column

typedLit(T literal, scala.reflect.api.TypeTags.TypeTag<T> evidence$1)

Creates a Column of literal value.

static Column

typeof(Column col)

Return DDL-formatted type string for the data type of the input.

static Column

ucase(Column str)

Returns str with all characters changed to uppercase.

static <IN, BUF, OUT> UserDefinedFunction

udaf(Aggregator<IN,BUF,OUT> agg, Encoder<IN> inputEncoder)

Obtains a UserDefinedFunction that wraps the given Aggregator so that it may be used with untyped Data Frames.

static <IN, BUF, OUT> UserDefinedFunction

udaf(Aggregator<IN,BUF,OUT> agg, scala.reflect.api.TypeTags.TypeTag<IN> evidence$3)

Obtains a UserDefinedFunction that wraps the given Aggregator so that it may be used with untyped Data Frames.

static UserDefinedFunction

udf(Object f, DataType dataType)

Deprecated.
Scala `udf` method with return type parameter is deprecated.

static UserDefinedFunction

udf(UDF0<?> f, DataType returnType)

Defines a Java UDF0 instance as user-defined function (UDF).

static UserDefinedFunction

udf(UDF1<?,?> f, DataType returnType)

Defines a Java UDF1 instance as user-defined function (UDF).

static UserDefinedFunction

udf(UDF10<?,?,?,?,?,?,?,?,?,?,?> f, DataType returnType)

Defines a Java UDF10 instance as user-defined function (UDF).

static UserDefinedFunction

udf(UDF2<?,?,?> f, DataType returnType)

Defines a Java UDF2 instance as user-defined function (UDF).

static UserDefinedFunction

udf(UDF3<?,?,?,?> f, DataType returnType)

Defines a Java UDF3 instance as user-defined function (UDF).

static UserDefinedFunction

udf(UDF4<?,?,?,?,?> f, DataType returnType)

Defines a Java UDF4 instance as user-defined function (UDF).

static UserDefinedFunction

udf(UDF5<?,?,?,?,?,?> f, DataType returnType)

Defines a Java UDF5 instance as user-defined function (UDF).

static UserDefinedFunction

udf(UDF6<?,?,?,?,?,?,?> f, DataType returnType)

Defines a Java UDF6 instance as user-defined function (UDF).

static UserDefinedFunction

udf(UDF7<?,?,?,?,?,?,?,?> f, DataType returnType)

Defines a Java UDF7 instance as user-defined function (UDF).

static UserDefinedFunction

udf(UDF8<?,?,?,?,?,?,?,?,?> f, DataType returnType)

Defines a Java UDF8 instance as user-defined function (UDF).

static UserDefinedFunction

udf(UDF9<?,?,?,?,?,?,?,?,?,?> f, DataType returnType)

Defines a Java UDF9 instance as user-defined function (UDF).

static <RT> UserDefinedFunction

udf(scala.Function0<RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$4)

Defines a Scala closure of 0 arguments as user-defined function (UDF).

static <RT, A1> UserDefinedFunction

udf(scala.Function1<A1,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$5, scala.reflect.api.TypeTags.TypeTag<A1> evidence$6)

Defines a Scala closure of 1 argument as user-defined function (UDF).

static <RT, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10> UserDefinedFunction

udf(scala.Function10<A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$59, scala.reflect.api.TypeTags.TypeTag<A1> evidence$60, scala.reflect.api.TypeTags.TypeTag<A2> evidence$61, scala.reflect.api.TypeTags.TypeTag<A3> evidence$62, scala.reflect.api.TypeTags.TypeTag<A4> evidence$63, scala.reflect.api.TypeTags.TypeTag<A5> evidence$64, scala.reflect.api.TypeTags.TypeTag<A6> evidence$65, scala.reflect.api.TypeTags.TypeTag<A7> evidence$66, scala.reflect.api.TypeTags.TypeTag<A8> evidence$67, scala.reflect.api.TypeTags.TypeTag<A9> evidence$68, scala.reflect.api.TypeTags.TypeTag<A10> evidence$69)

Defines a Scala closure of 10 arguments as user-defined function (UDF).

static <RT, A1, A2> UserDefinedFunction

udf(scala.Function2<A1,A2,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$7, scala.reflect.api.TypeTags.TypeTag<A1> evidence$8, scala.reflect.api.TypeTags.TypeTag<A2> evidence$9)

Defines a Scala closure of 2 arguments as user-defined function (UDF).

static <RT, A1, A2, A3> UserDefinedFunction

udf(scala.Function3<A1,A2,A3,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$10, scala.reflect.api.TypeTags.TypeTag<A1> evidence$11, scala.reflect.api.TypeTags.TypeTag<A2> evidence$12, scala.reflect.api.TypeTags.TypeTag<A3> evidence$13)

Defines a Scala closure of 3 arguments as user-defined function (UDF).

static <RT, A1, A2, A3, A4> UserDefinedFunction

udf(scala.Function4<A1,A2,A3,A4,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$14, scala.reflect.api.TypeTags.TypeTag<A1> evidence$15, scala.reflect.api.TypeTags.TypeTag<A2> evidence$16, scala.reflect.api.TypeTags.TypeTag<A3> evidence$17, scala.reflect.api.TypeTags.TypeTag<A4> evidence$18)

Defines a Scala closure of 4 arguments as user-defined function (UDF).

static <RT, A1, A2, A3, A4, A5> UserDefinedFunction

udf(scala.Function5<A1,A2,A3,A4,A5,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$19, scala.reflect.api.TypeTags.TypeTag<A1> evidence$20, scala.reflect.api.TypeTags.TypeTag<A2> evidence$21, scala.reflect.api.TypeTags.TypeTag<A3> evidence$22, scala.reflect.api.TypeTags.TypeTag<A4> evidence$23, scala.reflect.api.TypeTags.TypeTag<A5> evidence$24)

Defines a Scala closure of 5 arguments as user-defined function (UDF).

static <RT, A1, A2, A3, A4, A5, A6> UserDefinedFunction

udf(scala.Function6<A1,A2,A3,A4,A5,A6,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$25, scala.reflect.api.TypeTags.TypeTag<A1> evidence$26, scala.reflect.api.TypeTags.TypeTag<A2> evidence$27, scala.reflect.api.TypeTags.TypeTag<A3> evidence$28, scala.reflect.api.TypeTags.TypeTag<A4> evidence$29, scala.reflect.api.TypeTags.TypeTag<A5> evidence$30, scala.reflect.api.TypeTags.TypeTag<A6> evidence$31)

Defines a Scala closure of 6 arguments as user-defined function (UDF).

static <RT, A1, A2, A3, A4, A5, A6, A7> UserDefinedFunction

udf(scala.Function7<A1,A2,A3,A4,A5,A6,A7,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$32, scala.reflect.api.TypeTags.TypeTag<A1> evidence$33, scala.reflect.api.TypeTags.TypeTag<A2> evidence$34, scala.reflect.api.TypeTags.TypeTag<A3> evidence$35, scala.reflect.api.TypeTags.TypeTag<A4> evidence$36, scala.reflect.api.TypeTags.TypeTag<A5> evidence$37, scala.reflect.api.TypeTags.TypeTag<A6> evidence$38, scala.reflect.api.TypeTags.TypeTag<A7> evidence$39)

Defines a Scala closure of 7 arguments as user-defined function (UDF).

static <RT, A1, A2, A3, A4, A5, A6, A7, A8> UserDefinedFunction

udf(scala.Function8<A1,A2,A3,A4,A5,A6,A7,A8,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$40, scala.reflect.api.TypeTags.TypeTag<A1> evidence$41, scala.reflect.api.TypeTags.TypeTag<A2> evidence$42, scala.reflect.api.TypeTags.TypeTag<A3> evidence$43, scala.reflect.api.TypeTags.TypeTag<A4> evidence$44, scala.reflect.api.TypeTags.TypeTag<A5> evidence$45, scala.reflect.api.TypeTags.TypeTag<A6> evidence$46, scala.reflect.api.TypeTags.TypeTag<A7> evidence$47, scala.reflect.api.TypeTags.TypeTag<A8> evidence$48)

Defines a Scala closure of 8 arguments as user-defined function (UDF).

static <RT, A1, A2, A3, A4, A5, A6, A7, A8, A9> UserDefinedFunction

udf(scala.Function9<A1,A2,A3,A4,A5,A6,A7,A8,A9,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$49, scala.reflect.api.TypeTags.TypeTag<A1> evidence$50, scala.reflect.api.TypeTags.TypeTag<A2> evidence$51, scala.reflect.api.TypeTags.TypeTag<A3> evidence$52, scala.reflect.api.TypeTags.TypeTag<A4> evidence$53, scala.reflect.api.TypeTags.TypeTag<A5> evidence$54, scala.reflect.api.TypeTags.TypeTag<A6> evidence$55, scala.reflect.api.TypeTags.TypeTag<A7> evidence$56, scala.reflect.api.TypeTags.TypeTag<A8> evidence$57, scala.reflect.api.TypeTags.TypeTag<A9> evidence$58)

Defines a Scala closure of 9 arguments as user-defined function (UDF).

static Column

unbase64(Column e)

Decodes a BASE64 encoded string column and returns it as a binary column.

static Column

unhex(Column column)

Inverse of hex.

static Column

unix_date(Column e)

Returns the number of days since 1970-01-01.

static Column

unix_micros(Column e)

Returns the number of microseconds since 1970-01-01 00:00:00 UTC.

static Column

unix_millis(Column e)

Returns the number of milliseconds since 1970-01-01 00:00:00 UTC.

static Column

unix_seconds(Column e)

Returns the number of seconds since 1970-01-01 00:00:00 UTC.

static Column

unix_timestamp()

Returns the current Unix timestamp (in seconds) as a long.

static Column

unix_timestamp(Column s)

Converts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp (in seconds), using the default timezone and the default locale.

static Column

unix_timestamp(Column s, String p)

Converts time string with given pattern to Unix timestamp (in seconds).

static Column

unwrap_udt(Column column)

Unwrap UDT data type column into its underlying type.

static Column

upper(Column e)

Converts a string column to upper case.

static Column

url_decode(Column str)

Decodes a str in 'application/x-www-form-urlencoded' format using a specific encoding scheme.

static Column

url_encode(Column str)

Translates a string into 'application/x-www-form-urlencoded' format using a specific encoding scheme.

static Column

user()

Returns the user name of current execution context.

static Column

uuid()

Returns a universally unique identifier (UUID) string.

static Column

var_pop(String columnName)

Aggregate function: returns the population variance of the values in a group.

static Column

var_pop(Column e)

Aggregate function: returns the population variance of the values in a group.

static Column

var_samp(String columnName)

Aggregate function: returns the unbiased variance of the values in a group.

static Column

var_samp(Column e)

Aggregate function: returns the unbiased variance of the values in a group.

static Column

variance(String columnName)

Aggregate function: alias for var_samp.

static Column

variance(Column e)

Aggregate function: alias for var_samp.

static Column

variant_get(Column v, String path, String targetType)

Extracts a sub-variant from v according to path, and then cast the sub-variant to targetType.

static Column

version()

Returns the Spark version.

static Column

weekday(Column e)

Returns the day of the week for date/timestamp (0 = Monday, 1 = Tuesday, ..., 6 = Sunday).
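The 0 = Monday numbering differs from java.time, where DayOfWeek.getValue() runs from 1 (Monday) to 7 (Sunday). The plain-Java sketch below shows the mapping with a hypothetical helper; it illustrates the numbering only and is not Spark code.

```java
import java.time.LocalDate;

// Plain-Java illustration of the 0 = Monday ... 6 = Sunday numbering; not Spark code.
class WeekdaySketch {

    // java.time numbers Monday as 1, so subtracting 1 yields the 0-based weekday.
    static int weekday(LocalDate date) {
        return date.getDayOfWeek().getValue() - 1;
    }

    public static void main(String[] args) {
        System.out.println(weekday(LocalDate.of(2024, 1, 1))); // 0 (a Monday)
        System.out.println(weekday(LocalDate.of(2024, 1, 7))); // 6 (a Sunday)
    }
}
```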

static Column

weekofyear(Column e)

Extracts the week number as an integer from a given date/timestamp/string.

static Column

when(Column condition, Object value)

Evaluates a list of conditions and returns one of multiple possible result expressions.

static Column

width_bucket(Column v, Column min, Column max, Column numBucket)

Returns the bucket number into which the value of this expression would fall after being evaluated.

static Column

window(Column timeColumn, String windowDuration)

Generates tumbling time windows given a timestamp specifying column.

static Column

window(Column timeColumn, String windowDuration, String slideDuration)

Bucketize rows into one or more time windows given a timestamp specifying column.

static Column

window(Column timeColumn, String windowDuration, String slideDuration, String startTime)

Bucketize rows into one or more time windows given a timestamp specifying column.

static Column

window_time(Column windowColumn)

Extracts the event time from the window column.

static Column

xpath(Column xml, Column path)

Returns a string array of values within the nodes of xml that match the XPath expression.

static Column

xpath_boolean(Column xml, Column path)

Returns true if the XPath expression evaluates to true, or if a matching node is found.

static Column

xpath_double(Column xml, Column path)

Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.

static Column

xpath_float(Column xml, Column path)

Returns a float value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.

static Column

xpath_int(Column xml, Column path)

Returns an integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.

static Column

xpath_long(Column xml, Column path)

Returns a long integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.

static Column

xpath_number(Column xml, Column path)

Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.

static Column

xpath_short(Column xml, Column path)

Returns a short integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.

static Column

xpath_string(Column xml, Column path)

Returns the text contents of the first xml node that matches the XPath expression.

static Column

xxhash64(Column... cols)

Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column.

static Column

xxhash64(scala.collection.immutable.Seq<Column> cols)

Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column.

static Column

year(Column e)

Extracts the year as an integer from a given date/timestamp/string.

static Column

years(Column e)

(Java-specific) A transform for timestamps and dates to partition data into years.

static Column

zip_with(Column left, Column right, scala.Function2<Column,Column,Column> f)

Merge two given arrays, element-wise, into a single array using a function.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- functions
  
  public functions()
Method Details
- countDistinct
  
  public static Column countDistinct(Column expr, Column... exprs)
  
  Aggregate function: returns the number of distinct items in a group.
  An alias of count_distinct, and it is encouraged to use count_distinct directly.
  
  Parameters:
  
  expr - (undocumented)
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- countDistinct
  
  public static Column countDistinct(String columnName, String... columnNames)
  
  Aggregate function: returns the number of distinct items in a group.
  An alias of count_distinct, and it is encouraged to use count_distinct directly.
  
  Parameters:
  
  columnName - (undocumented)
  
  columnNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- count_distinct
  
  public static Column count_distinct(Column expr, Column... exprs)
  
  Aggregate function: returns the number of distinct items in a group.
  
  Parameters:
  
  expr - (undocumented)
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- array
  
  public static Column array(Column... cols)
  
  Creates a new array column. The input columns must all have the same data type.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- array
  
  public static Column array(String colName, String... colNames)
  
  Creates a new array column. The input columns must all have the same data type.
  
  Parameters:
  
  colName - (undocumented)
  
  colNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- map
  
  public static Column map(Column... cols)
  
  Creates a new map column. The input columns must be grouped as key-value pairs, e.g. (key1, value1, key2, value2, ...). The key columns must all have the same data type, and can't be null. The value columns must all have the same data type.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0
- coalesce
  
  public static Column coalesce(Column... e)
  
  Returns the first column that is not null, or null if all inputs are null.
  For example, coalesce(a, b, c) will return a if a is not null, or b if a is null and b is not null, or c if both a and b are null but c is not null.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
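The rule described for coalesce(a, b, c) can be modelled in plain Java. This is only a sketch of the semantics on ordinary values, not Spark code; the class name CoalesceSketch is hypothetical.

```java
// Plain-Java model of the coalesce rule; it does not use Spark.
final class CoalesceSketch {
    // Returns the first non-null argument, or null if every argument is null.
    @SafeVarargs
    static <T> T coalesce(T... values) {
        for (T v : values) {
            if (v != null) {
                return v;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(coalesce(null, "b", "c")); // prints b
    }
}
```

In Spark itself the same rule is applied per row to Column expressions rather than to Java values.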
- struct
  
  public static Column struct(Column... cols)
  
  Creates a new struct column. If the input column is a column in a DataFrame, or a derived column expression that is named (i.e. aliased), its name would be retained as the StructField's name, otherwise, the newly generated StructField's name would be auto generated as col with a suffix index + 1, i.e. col1, col2, col3, ...
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- struct
  
  public static Column struct(String colName, String... colNames)
  
  Creates a new struct column that composes multiple input columns.
  
  Parameters:
  
  colName - (undocumented)
  
  colNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- greatest
  
  public static Column greatest(Column... exprs)
  
  Returns the greatest value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
  
  Parameters:
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- greatest
  
  public static Column greatest(String columnName, String... columnNames)
  
  Returns the greatest value of the list of column names, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
  
  Parameters:
  
  columnName - (undocumented)
  
  columnNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- least
  
  public static Column least(Column... exprs)
  
  Returns the least value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
  
  Parameters:
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- least
  
  public static Column least(String columnName, String... columnNames)
  
  Returns the least value of the list of column names, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
  
  Parameters:
  
  columnName - (undocumented)
  
  columnNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
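The null handling shared by greatest and least can be illustrated on plain Java integers. This is a sketch of the documented behaviour only; GreatestLeastSketch is a hypothetical name, and unlike Spark it does not enforce the minimum of two parameters.

```java
import java.util.Arrays;
import java.util.Objects;

// Plain-Java model of greatest/least: nulls are skipped, all-null input gives null.
final class GreatestLeastSketch {
    static Integer greatest(Integer... values) {
        return Arrays.stream(values)
                .filter(Objects::nonNull)   // skip null values
                .max(Integer::compare)
                .orElse(null);              // every value was null
    }

    static Integer least(Integer... values) {
        return Arrays.stream(values)
                .filter(Objects::nonNull)
                .min(Integer::compare)
                .orElse(null);
    }

    public static void main(String[] args) {
        System.out.println(greatest(3, null, 7)); // prints 7
        System.out.println(least(3, null, 7));    // prints 3
    }
}
```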
- hash
  
  public static Column hash(Column... cols)
  
  Calculates the hash code of given columns, and returns the result as an int column.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- xxhash64
  
  public static Column xxhash64(Column... cols)
  
  Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column. The hash computation uses an initial seed of 42.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- concat_ws
  
  public static Column concat_ws(String sep, Column... exprs)
  
  Concatenates multiple input string columns together into a single string column, using the given separator.
  
  Parameters:
  
  sep - (undocumented)
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
  
  Note:
  
  Input strings which are null are skipped.
- format_string
  
  public static Column format_string(String format, Column... arguments)
  
  Formats the arguments in printf-style and returns the result as a string column.
  
  Parameters:
  
  format - (undocumented)
  
  arguments - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- elt
  
  public static Column elt(Column... inputs)
  
  Returns the n-th input, e.g., returns input2 when n is 2. The function returns NULL if the index exceeds the length of the array and spark.sql.ansi.enabled is set to false. If spark.sql.ansi.enabled is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices.
  
  Parameters:
  
  inputs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
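A minimal sketch of the indexing rule, assuming a 1-based index n as in "returns input2 when n is 2" and assuming that n below 1 also counts as an invalid index. EltSketch and its ansiEnabled flag are hypothetical stand-ins for the function and for spark.sql.ansi.enabled.

```java
// Plain-Java model of elt; it does not use Spark.
final class EltSketch {
    // n is 1-based. Out-of-range n yields null, or an exception when ansiEnabled is true.
    static String elt(int n, boolean ansiEnabled, String... inputs) {
        if (n < 1 || n > inputs.length) {
            if (ansiEnabled) {
                throw new ArrayIndexOutOfBoundsException("invalid index: " + n);
            }
            return null;
        }
        return inputs[n - 1];
    }

    public static void main(String[] args) {
        System.out.println(elt(2, false, "a", "b", "c")); // prints b
        System.out.println(elt(5, false, "a", "b", "c")); // prints null
    }
}
```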
- concat
  
  public static Column concat(Column... exprs)
  
  Concatenates multiple input columns together into a single column. The function works with strings, binary and compatible array columns.
  
  Parameters:
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
  
  Note:
  
  Returns null if any of the input columns are null.
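The two notes above describe different null handling: concat_ws skips null inputs, while concat returns null if any input is null. The following plain-Java sketch contrasts the two on strings only; ConcatSketch is a hypothetical name and no Spark API is used.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.stream.Collectors;

// Plain-Java model of the null rules for concat_ws and concat.
final class ConcatSketch {
    // concat_ws: null inputs are skipped before joining with the separator.
    static String concatWs(String sep, String... parts) {
        return Arrays.stream(parts)
                .filter(Objects::nonNull)
                .collect(Collectors.joining(sep));
    }

    // concat: a single null input makes the whole result null.
    static String concat(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            if (p == null) {
                return null;
            }
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatWs("-", "a", null, "c")); // prints a-c
        System.out.println(concat("a", null, "c"));        // prints null
    }
}
```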
- json_tuple
  
  public static Column json_tuple(Column json, String... fields)
  
  Creates a new row for a json column according to the given field names.
  
  Parameters:
  
  json - (undocumented)
  
  fields - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- arrays_zip
  
  public static Column arrays_zip(Column... e)
  
  Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
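The "N-th struct contains all N-th values" rule can be sketched with Java lists standing in for arrays and structs. Padding shorter inputs with null is an assumption of this sketch, not something stated above; ArraysZipSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of arrays_zip; each inner list plays the role of one struct.
final class ArraysZipSketch {
    static List<List<Object>> arraysZip(List<?>... arrays) {
        int longest = 0;
        for (List<?> a : arrays) {
            longest = Math.max(longest, a.size());
        }
        List<List<Object>> zipped = new ArrayList<>();
        for (int n = 0; n < longest; n++) {
            List<Object> row = new ArrayList<>();
            for (List<?> a : arrays) {
                // Assumption: inputs shorter than the longest one contribute null.
                row.add(n < a.size() ? a.get(n) : null);
            }
            zipped.add(row);
        }
        return zipped;
    }

    public static void main(String[] args) {
        // prints [[1, a], [2, b], [3, null]]
        System.out.println(arraysZip(Arrays.asList(1, 2, 3), Arrays.asList("a", "b")));
    }
}
```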
- map_concat
  
  public static Column map_concat(Column... cols)
  
  Returns the union of all the given maps.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- callUDF
  
  public static Column callUDF(String udfName, Column... cols)
  
  Call a user-defined function.
  
  Parameters:
  
  udfName - (undocumented)
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- call_udf
  
  public static Column call_udf(String udfName, Column... cols)
  Call a user-defined function. Example:
  
  import org.apache.spark.sql._
  
  val df = Seq(("id1", 1), ("id2", 4), ("id3", 5)).toDF("id", "value")
  val spark = df.sparkSession
  spark.udf.register("simpleUDF", (v: Int) => v * v)
  df.select($"id", call_udf("simpleUDF", $"value"))
  Parameters:
  
  udfName - (undocumented)
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- call_function
  
  public static Column call_function(String funcName, Column... cols)
  
  Call a SQL function.
  
  Parameters:
  
  funcName - function name that follows the SQL identifier syntax (can be quoted, can be qualified)
  
  cols - the expression parameters of function
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- col
  
  public static Column col(String colName)
  
  Returns a Column based on the given column name.
  
  Parameters:
  
  colName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- column
  
  public static Column column(String colName)
  
  Returns a Column based on the given column name. Alias of col(java.lang.String).
  
  Parameters:
  
  colName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- lit
  
  public static Column lit(Object literal)
  
  Creates a Column of literal value.
  The passed in object is returned directly if it is already a Column. If the object is a Scala Symbol, it is converted into a Column also. Otherwise, a new Column is created to represent the literal value.
  
  Parameters:
  
  literal - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- typedLit
  
  public static <T> Column typedLit(T literal, scala.reflect.api.TypeTags.TypeTag<T> evidence$1)
  
  Creates a Column of literal value.
  An alias of typedlit, and it is encouraged to use typedlit directly.
  
  Parameters:
  
  literal - (undocumented)
  
  evidence$1 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.2.0
- typedlit
  
  public static <T> Column typedlit(T literal, scala.reflect.api.TypeTags.TypeTag<T> evidence$2)
  
  Creates a Column of literal value.
  The passed in object is returned directly if it is already a Column. If the object is a Scala Symbol, it is converted into a Column also. Otherwise, a new Column is created to represent the literal value. The difference between this function and lit(java.lang.Object) is that this function can handle parameterized Scala types, e.g. List, Seq and Map.
  
  Parameters:
  
  literal - (undocumented)
  
  evidence$2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
  
  Note:
  
  typedlit will call expensive Scala reflection APIs. lit is preferred if parameterized Scala types are not used.
- asc
  
  public static Column asc(String columnName)
  Returns a sort expression based on ascending order of the column.
  df.sort(asc("dept"), desc("age"))
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- asc_nulls_first
  
  public static Column asc_nulls_first(String columnName)
  Returns a sort expression based on ascending order of the column, and null values appear before non-null values.
  df.sort(asc_nulls_first("dept"), desc("age"))
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- asc_nulls_last
  
  public static Column asc_nulls_last(String columnName)
  Returns a sort expression based on ascending order of the column, and null values appear after non-null values.
  df.sort(asc_nulls_last("dept"), desc("age"))
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- desc
  
  public static Column desc(String columnName)
  Returns a sort expression based on the descending order of the column.
  df.sort(asc("dept"), desc("age"))
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- desc_nulls_first
  
  public static Column desc_nulls_first(String columnName)
  Returns a sort expression based on the descending order of the column, and null values appear before non-null values.
  df.sort(asc("dept"), desc_nulls_first("age"))
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- desc_nulls_last
  
  public static Column desc_nulls_last(String columnName)
  Returns a sort expression based on the descending order of the column, and null values appear after non-null values.
  df.sort(asc("dept"), desc_nulls_last("age"))
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
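The four null-placement variants above (asc_nulls_first, asc_nulls_last, desc_nulls_first, desc_nulls_last) have a close analogue in java.util.Comparator. The sketch below sorts a plain list to show where nulls land in each case. It only illustrates the ordering and is not how Spark evaluates sort expressions; NullOrderingSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java analogue of the *_nulls_first / *_nulls_last sort expressions.
final class NullOrderingSketch {
    static final Comparator<Integer> ASC_NULLS_FIRST =
            Comparator.nullsFirst(Comparator.<Integer>naturalOrder());
    static final Comparator<Integer> ASC_NULLS_LAST =
            Comparator.nullsLast(Comparator.<Integer>naturalOrder());
    static final Comparator<Integer> DESC_NULLS_FIRST =
            Comparator.nullsFirst(Comparator.<Integer>reverseOrder());
    static final Comparator<Integer> DESC_NULLS_LAST =
            Comparator.nullsLast(Comparator.<Integer>reverseOrder());

    // Returns a sorted copy so the input list is left untouched.
    static List<Integer> sorted(List<Integer> values, Comparator<Integer> order) {
        List<Integer> copy = new ArrayList<>(values);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(3, null, 1);
        System.out.println(sorted(ages, ASC_NULLS_LAST));   // prints [1, 3, null]
        System.out.println(sorted(ages, DESC_NULLS_FIRST)); // prints [null, 3, 1]
    }
}
```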
- approxCountDistinct
  
  public static Column approxCountDistinct(Column e)
  
  Deprecated.
  Use approx_count_distinct. Since 2.1.0.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- approxCountDistinct
  
  public static Column approxCountDistinct(String columnName)
  
  Deprecated.
  Use approx_count_distinct. Since 2.1.0.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- approxCountDistinct
  
  public static Column approxCountDistinct(Column e, double rsd)
  
  Deprecated.
  Use approx_count_distinct. Since 2.1.0.
  
  Parameters:
  
  e - (undocumented)
  
  rsd - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- approxCountDistinct
  
  public static Column approxCountDistinct(String columnName, double rsd)
  
  Deprecated.
  Use approx_count_distinct. Since 2.1.0.
  
  Parameters:
  
  columnName - (undocumented)
  
  rsd - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- approx_count_distinct
  
  public static Column approx_count_distinct(Column e)
  
  Aggregate function: returns the approximate number of distinct items in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- approx_count_distinct
  
  public static Column approx_count_distinct(String columnName)
  
  Aggregate function: returns the approximate number of distinct items in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- approx_count_distinct
  
  public static Column approx_count_distinct(Column e, double rsd)
  
  Aggregate function: returns the approximate number of distinct items in a group.
  
  Parameters:
  
  rsd - maximum relative standard deviation allowed (default = 0.05)
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- approx_count_distinct
  
  public static Column approx_count_distinct(String columnName, double rsd)
  
  Aggregate function: returns the approximate number of distinct items in a group.
  
  Parameters:
  
  rsd - maximum relative standard deviation allowed (default = 0.05)
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- avg
  
  public static Column avg(Column e)
  
  Aggregate function: returns the average of the values in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- avg
  
  public static Column avg(String columnName)
  
  Aggregate function: returns the average of the values in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- collect_list
  
  public static Column collect_list(Column e)
  
  Aggregate function: returns a list of objects with duplicates.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
  
  Note:
  
  The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
- collect_list
  
  public static Column collect_list(String columnName)
  
  Aggregate function: returns a list of objects with duplicates.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
  
  Note:
  
  The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
- collect_set
  
  public static Column collect_set(Column e)
  
  Aggregate function: returns a set of objects with duplicate elements eliminated.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
  
  Note:
  
  The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
- collect_set
  
  public static Column collect_set(String columnName)
  
  Aggregate function: returns a set of objects with duplicate elements eliminated.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
  
  Note:
  
  The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
- count_min_sketch
  
  public static Column count_min_sketch(Column e, Column eps, Column confidence, Column seed)
  
  Returns a count-min sketch of a column with the given eps, confidence and seed. The result is an array of bytes, which can be deserialized to a CountMinSketch before usage. Count-min sketch is a probabilistic data structure used for estimating item frequencies using sub-linear space.
  
  Parameters:
  
  e - (undocumented)
  
  eps - (undocumented)
  
  confidence - (undocumented)
  
  seed - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
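To show how eps, confidence and seed shape such a structure, here is a small self-contained count-min sketch over long values. It is an illustrative sketch only: the sizing formulas are a common parameterisation assumed here, the multiplicative hash is a simple stand-in, and CountMinSketchDemo is a hypothetical class unrelated to Spark's own CountMinSketch serialization format.

```java
import java.util.Random;

// Toy count-min sketch: depth rows of width counters, one hash per row.
final class CountMinSketchDemo {
    private final int depth;
    private final int width;
    private final long[][] table;
    private final long[] multipliers;

    CountMinSketchDemo(double eps, double confidence, int seed) {
        if (eps <= 0 || eps >= 1 || confidence <= 0 || confidence >= 1) {
            throw new IllegalArgumentException("eps and confidence must be in (0, 1)");
        }
        // Assumed sizing: smaller eps widens rows, higher confidence adds rows.
        this.width = (int) Math.ceil(2.0 / eps);
        this.depth = (int) Math.ceil(-Math.log(1.0 - confidence) / Math.log(2.0));
        this.table = new long[depth][width];
        this.multipliers = new long[depth];
        Random random = new Random(seed); // the seed fixes the hash family
        for (int row = 0; row < depth; row++) {
            multipliers[row] = random.nextLong() | 1L; // odd multiplier per row
        }
    }

    private int bucket(long item, int row) {
        long h = item * multipliers[row];
        h ^= (h >>> 32);
        return (int) Math.floorMod(h, (long) width);
    }

    void add(long item) {
        for (int row = 0; row < depth; row++) {
            table[row][bucket(item, row)]++;
        }
    }

    // Collisions can only inflate counters, so the minimum never undercounts.
    long estimate(long item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, table[row][bucket(item, row)]);
        }
        return min;
    }

    public static void main(String[] args) {
        CountMinSketchDemo sketch = new CountMinSketchDemo(0.01, 0.95, 42);
        for (int i = 0; i < 3; i++) {
            sketch.add(7L);
        }
        System.out.println(sketch.estimate(7L)); // at least 3, the true count
    }
}
```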
- corr
  
  public static Column corr(Column column1, Column column2)
  
  Aggregate function: returns the Pearson Correlation Coefficient for two columns.
  
  Parameters:
  
  column1 - (undocumented)
  
  column2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- corr
  
  public static Column corr(String columnName1, String columnName2)
  
  Aggregate function: returns the Pearson Correlation Coefficient for two columns.
  
  Parameters:
  
  columnName1 - (undocumented)
  
  columnName2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
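For reference, the Pearson correlation coefficient is the sum of co-deviations divided by the square root of the product of the two sums of squared deviations. The sketch below computes it for two in-memory arrays. It models the statistic only and makes no claim about how Spark computes it; PearsonSketch is a hypothetical name.

```java
// Plain-Java computation of the Pearson correlation coefficient.
final class PearsonSketch {
    static double corr(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("inputs must be non-empty and of equal length");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0; // sum of co-deviations
        double sxx = 0.0; // sum of squared deviations of x
        double syy = 0.0; // sum of squared deviations of y
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 6, 8};
        System.out.println(corr(x, y)); // prints 1.0
    }
}
```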
- count
  
  public static Column count(Column e)
  
  Aggregate function: returns the number of items in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- count
  
  public static TypedColumn<Object,Object> count(String columnName)
  
  Aggregate function: returns the number of items in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- countDistinct
  
  public static Column countDistinct(Column expr, scala.collection.immutable.Seq<Column> exprs)
  
  Aggregate function: returns the number of distinct items in a group.
  An alias of count_distinct, and it is encouraged to use count_distinct directly.
  
  Parameters:
  
  expr - (undocumented)
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- countDistinct
  
  public static Column countDistinct(String columnName, scala.collection.immutable.Seq<String> columnNames)
  
  Aggregate function: returns the number of distinct items in a group.
  An alias of count_distinct, and it is encouraged to use count_distinct directly.
  
  Parameters:
  
  columnName - (undocumented)
  
  columnNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- count_distinct
  
  public static Column count_distinct(Column expr, scala.collection.immutable.Seq<Column> exprs)
  
  Aggregate function: returns the number of distinct items in a group.
  
  Parameters:
  
  expr - (undocumented)
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- covar_pop
  
  public static Column covar_pop(Column column1, Column column2)
  
  Aggregate function: returns the population covariance for two columns.
  
  Parameters:
  
  column1 - (undocumented)
  
  column2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- covar_pop
  
  public static Column covar_pop(String columnName1, String columnName2)
  
  Aggregate function: returns the population covariance for two columns.
  
  Parameters:
  
  columnName1 - (undocumented)
  
  columnName2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- covar_samp
  
  public static Column covar_samp(Column column1, Column column2)
  
  Aggregate function: returns the sample covariance for two columns.
  
  Parameters:
  
  column1 - (undocumented)
  
  column2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- covar_samp
  
  public static Column covar_samp(String columnName1, String columnName2)
  
  Aggregate function: returns the sample covariance for two columns.
  
  Parameters:
  
  columnName1 - (undocumented)
  
  columnName2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
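The difference between covar_pop and covar_samp is the divisor: the population form divides the sum of co-deviations by n, the sample form by n - 1. A minimal plain-Java sketch of the two textbook definitions follows; CovarianceSketch is a hypothetical name and no Spark API is involved.

```java
// Plain-Java computation of population and sample covariance.
final class CovarianceSketch {
    // Sum over i of (x[i] - mean(x)) * (y[i] - mean(y)).
    private static double coDeviationSum(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("inputs must be non-empty and of equal length");
        }
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += (x[i] - meanX) * (y[i] - meanY);
        }
        return sum;
    }

    static double covarPop(double[] x, double[] y) {
        return coDeviationSum(x, y) / x.length;
    }

    static double covarSamp(double[] x, double[] y) {
        if (x.length < 2) {
            throw new IllegalArgumentException("sample covariance needs at least two rows");
        }
        return coDeviationSum(x, y) / (x.length - 1);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 6, 8};
        System.out.println(covarPop(x, y)); // prints 2.5
        System.out.println(covarSamp(x, y));
    }
}
```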
- first
  
  public static Column first(Column e, boolean ignoreNulls)
  
  Aggregate function: returns the first value in a group.
  The function by default returns the first value it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
  
  Parameters:
  
  e - (undocumented)
  
  ignoreNulls - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
- first
  
  public static Column first(String columnName, boolean ignoreNulls)
  
  Aggregate function: returns the first value of a column in a group.
  The function by default returns the first value it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
  
  Parameters:
  
  columnName - (undocumented)
  
  ignoreNulls - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
- first
  
  public static Column first(Column e)
  
  Aggregate function: returns the first value in a group.
  The function by default returns the first value it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
- first
  
  public static Column first(String columnName)
  
  Aggregate function: returns the first value of a column in a group.
  The function by default returns the first value it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
- first_value
  
  public static Column first_value(Column e)
  
  Aggregate function: returns the first value in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
- first_value
  
  public static Column first_value(Column e, Column ignoreNulls)
  
  Aggregate function: returns the first value in a group.
  The function by default returns the first value it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
  
  Parameters:
  
  e - (undocumented)
  
  ignoreNulls - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
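The effect of ignoreNulls on first and first_value can be modelled over an ordered Java list, where the list order stands in for whatever row order the group happens to have. This is a sketch of the documented rule only; FirstValueSketch is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java model of first(e, ignoreNulls) over one group of rows.
final class FirstValueSketch {
    static <T> T first(List<T> rows, boolean ignoreNulls) {
        for (T value : rows) {
            // Without ignoreNulls the very first row wins, even if it is null.
            if (value != null || !ignoreNulls) {
                return value;
            }
        }
        return null; // empty group, or all values null
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(null, 5, 7);
        System.out.println(first(rows, false)); // prints null
        System.out.println(first(rows, true));  // prints 5
    }
}
```

Because the answer depends on which row comes first, reordering the list changes the result, which is the non-determinism the notes above refer to.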
- grouping
  
  public static Column grouping(Column e)
  
  Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- grouping
  
  public static Column grouping(String columnName)
  
  Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- grouping_id
  
  public static Column grouping_id(scala.collection.immutable.Seq<Column> cols)
  Aggregate function: returns the level of grouping, equal to
  
  (grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn)
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
  
  Note:
  
  The list of columns should match with grouping columns exactly, or empty (means all the grouping columns).
- grouping_id
  
  public static Column grouping_id(String colName, scala.collection.immutable.Seq<String> colNames)
  Aggregate function: returns the level of grouping, equal to
  
  (grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn)
  Parameters:
  
  colName - (undocumented)
  
  colNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
  
  Note:
  
  The list of columns should match with grouping columns exactly.
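The grouping_id formula packs the per-column grouping flags into one integer, with the first grouping column as the most significant bit. The sketch below evaluates that arithmetic for a given list of flags, each flag being the 0 or 1 that grouping would return. GroupingIdSketch is a hypothetical name and the code does not call Spark.

```java
// Plain-Java evaluation of the grouping_id bit formula.
final class GroupingIdSketch {
    // flags[i] is grouping(c_{i+1}): 1 if the column is aggregated, 0 if not.
    static long groupingId(int... flags) {
        long id = 0;
        for (int flag : flags) {
            if (flag != 0 && flag != 1) {
                throw new IllegalArgumentException("each flag must be 0 or 1");
            }
            // Shifting the running value left once per column gives the
            // first flag a total shift of n-1, the second n-2, and so on.
            id = (id << 1) + flag;
        }
        return id;
    }

    public static void main(String[] args) {
        // Three grouping columns, first and third aggregated: (1 << 2) + (0 << 1) + 1
        System.out.println(groupingId(1, 0, 1)); // prints 5
    }
}
```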
- hll_sketch_agg
  
  public static Column hll_sketch_agg(Column e, Column lgConfigK)
  
  Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.
  
  Parameters:
  
  e - (undocumented)
  
  lgConfigK - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_sketch_agg
  
  public static Column hll_sketch_agg(Column e, int lgConfigK)
  
  Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.
  
  Parameters:
  
  e - (undocumented)
  
  lgConfigK - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_sketch_agg
  
  public static Column hll_sketch_agg(String columnName, int lgConfigK)
  
  Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.
  
  Parameters:
  
  columnName - (undocumented)
  
  lgConfigK - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_sketch_agg
  
  public static Column hll_sketch_agg(Column e)
  
  Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with default lgConfigK value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_sketch_agg
  
  public static Column hll_sketch_agg(String columnName)
  
  Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with default lgConfigK value.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_union_agg
  
  public static Column hll_union_agg(Column e, Column allowDifferentLgConfigK)
  
  Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance. Throws an exception if sketches have different lgConfigK values and allowDifferentLgConfigK is set to false.
  
  Parameters:
  
  e - (undocumented)
  
  allowDifferentLgConfigK - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_union_agg
  
  public static Column hll_union_agg(Column e, boolean allowDifferentLgConfigK)
  
  Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance. Throws an exception if sketches have different lgConfigK values and allowDifferentLgConfigK is set to false.
  
  Parameters:
  
  e - (undocumented)
  
  allowDifferentLgConfigK - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_union_agg
  
  public static Column hll_union_agg(String columnName, boolean allowDifferentLgConfigK)
  
  Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance. Throws an exception if sketches have different lgConfigK values and allowDifferentLgConfigK is set to false.
  
  Parameters:
  
  columnName - (undocumented)
  
  allowDifferentLgConfigK - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_union_agg
  
  public static Column hll_union_agg(Column e)
  
  Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance. Throws an exception if sketches have different lgConfigK values.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_union_agg
  
  public static Column hll_union_agg(String columnName)
  
  Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance. Throws an exception if sketches have different lgConfigK values.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- kurtosis
  
  public static Column kurtosis(Column e)
  
  Aggregate function: returns the kurtosis of the values in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- kurtosis
  
  public static Column kurtosis(String columnName)
  
  Aggregate function: returns the kurtosis of the values in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- last
  
  public static Column last(Column e, boolean ignoreNulls)
  
  Aggregate function: returns the last value in a group.
  By default the function returns the last value it sees. It returns the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
  
  Parameters:
  
  e - (undocumented)
  
  ignoreNulls - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
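
The ignoreNulls behavior described above can be illustrated without Spark. The following is a minimal plain-Java sketch, not Spark's implementation; the class name LastValueSketch and the list standing in for "rows in the order the aggregate sees them" are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class LastValueSketch {
    /**
     * Returns the last element of the group, or the last non-null element
     * when ignoreNulls is true. Returns null for an empty or all-null group.
     */
    static <T> T last(List<T> rowsInArrivalOrder, boolean ignoreNulls) {
        T result = null;
        for (T value : rowsInArrivalOrder) {
            // Without ignoreNulls every row overwrites the result;
            // with ignoreNulls only non-null rows do.
            if (value != null || !ignoreNulls) {
                result = value;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> group = Arrays.asList(1, null, 3, null);
        System.out.println(last(group, false)); // prints null
        System.out.println(last(group, true));  // prints 3
    }
}
```

Because the answer depends only on arrival order, reordering the list changes the result, which is the source of the non-determinism noted above.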
- last
  
  public static Column last(String columnName, boolean ignoreNulls)
  
  Aggregate function: returns the last value of the column in a group.
  By default the function returns the last value it sees. It returns the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
  
  Parameters:
  
  columnName - (undocumented)
  
  ignoreNulls - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
- last
  
  public static Column last(Column e)
  
  Aggregate function: returns the last value in a group.
  By default the function returns the last value it sees. To return the last non-null value instead, use the overload that takes ignoreNulls and set it to true. If all values are null, then null is returned.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
- last
  
  public static Column last(String columnName)
  
  Aggregate function: returns the last value of the column in a group.
  By default the function returns the last value it sees. To return the last non-null value instead, use the overload that takes ignoreNulls and set it to true. If all values are null, then null is returned.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
- last_value
  
  public static Column last_value(Column e)
  
  Aggregate function: returns the last value in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
- last_value
  
  public static Column last_value(Column e, Column ignoreNulls)
  
  Aggregate function: returns the last value in a group.
  By default the function returns the last value it sees. It returns the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
  
  Parameters:
  
  e - (undocumented)
  
  ignoreNulls - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  Note:
  
  The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
- mode
  
  public static Column mode(Column e)
  
  Aggregate function: returns the most frequent value in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
- mode
  
  public static Column mode(Column e, boolean deterministic)
  
  Aggregate function: returns the most frequent value in a group.
  When multiple values share the greatest frequency, any one of them may be returned if deterministic is false or is not defined; the lowest value is returned if deterministic is true.
  
  Parameters:
  
  e - (undocumented)
  
  deterministic - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
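
The tie-breaking rule for deterministic = true can be sketched in plain Java. This is an illustration only, not Spark's implementation; the class name ModeSketch is hypothetical, and skipping null inputs is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ModeSketch {
    /** Most frequent value; ties are broken by taking the lowest value. */
    static Integer deterministicMode(List<Integer> values) {
        // TreeMap iterates keys in ascending order, so the first key that
        // reaches the highest count is also the lowest such key.
        Map<Integer, Integer> counts = new TreeMap<>();
        for (Integer v : values) {
            if (v != null) { // assumption: nulls do not take part in the count
                counts.merge(v, 1, Integer::sum);
            }
        }
        Integer best = null;
        int bestCount = 0;
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            // Strictly greater: a later (larger) key with an equal count never wins.
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // 1 and 3 both occur twice; the lower value is chosen.
        System.out.println(deterministicMode(Arrays.asList(3, 1, 3, 1, 2))); // prints 1
    }
}
```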
- max
  
  public static Column max(Column e)
  
  Aggregate function: returns the maximum value of the expression in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- max
  
  public static Column max(String columnName)
  
  Aggregate function: returns the maximum value of the column in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- max_by
  
  public static Column max_by(Column e, Column ord)
  
  Aggregate function: returns the value associated with the maximum value of ord.
  
  Parameters:
  
  e - (undocumented)
  
  ord - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.3.0
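
As a rough illustration of "the value associated with the maximum value of ord", the following plain-Java sketch pairs each value with an ordering key. The class name MaxBySketch and the parallel arrays are assumptions for illustration; how ties and null keys are handled in Spark is not modeled here (the sketch simply keeps the first maximum).

```java
final class MaxBySketch {
    /** Returns values[i] for the i whose ord[i] is largest; null for an empty group. */
    static String maxBy(String[] values, int[] ord) {
        int best = -1;
        for (int i = 0; i < values.length; i++) {
            // Keep the first index holding the largest ordering key seen so far.
            if (best < 0 || ord[i] > ord[best]) {
                best = i;
            }
        }
        return best < 0 ? null : values[best];
    }

    public static void main(String[] args) {
        String[] names = {"a", "b", "c"};
        int[] scores = {3, 9, 5};
        System.out.println(maxBy(names, scores)); // prints b
    }
}
```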
- mean
  
  public static Column mean(Column e)
  
  Aggregate function: returns the average of the values in a group. Alias for avg.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- mean
  
  public static Column mean(String columnName)
  
  Aggregate function: returns the average of the values in a group. Alias for avg.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- median
  
  public static Column median(Column e)
  
  Aggregate function: returns the median of the values in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
- min
  
  public static Column min(Column e)
  
  Aggregate function: returns the minimum value of the expression in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- min
  
  public static Column min(String columnName)
  
  Aggregate function: returns the minimum value of the column in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- min_by
  
  public static Column min_by(Column e, Column ord)
  
  Aggregate function: returns the value associated with the minimum value of ord.
  
  Parameters:
  
  e - (undocumented)
  
  ord - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.3.0
- percentile
  
  public static Column percentile(Column e, Column percentage)
  
  Aggregate function: returns the exact percentile(s) of the numeric column e at the given percentage(s). Each percentage must be in the range [0.0, 1.0].
  
  Parameters:
  
  e - (undocumented)
  
  percentage - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- percentile
  
  public static Column percentile(Column e, Column percentage, Column frequency)
  
  Aggregate function: returns the exact percentile(s) of the numeric column e at the given percentage(s). Each percentage must be in the range [0.0, 1.0].
  
  Parameters:
  
  e - (undocumented)
  
  percentage - (undocumented)
  
  frequency - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
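
To make "exact percentile" concrete, here is a minimal plain-Java sketch. It assumes linear interpolation between the two closest ranks, which is a common definition of an exact percentile; whether it matches Spark in every edge case is not confirmed by this page. The class name PercentileSketch is hypothetical, and the frequency argument is not modeled.

```java
import java.util.Arrays;

final class PercentileSketch {
    /** Exact percentile with linear interpolation; percentage must be in [0.0, 1.0]. */
    static double percentile(double[] values, double percentage) {
        if (percentage < 0.0 || percentage > 1.0) {
            throw new IllegalArgumentException("percentage must be in [0.0, 1.0]");
        }
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        // Fractional index into the sorted values.
        double position = percentage * (sorted.length - 1);
        int lower = (int) Math.floor(position);
        int upper = (int) Math.ceil(position);
        if (lower == upper) {
            return sorted[lower];
        }
        // Weight the two neighbours by their distance from the position.
        return (upper - position) * sorted[lower] + (position - lower) * sorted[upper];
    }

    public static void main(String[] args) {
        System.out.println(percentile(new double[]{4, 1, 3, 2}, 0.5)); // prints 2.5
    }
}
```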
- percentile_approx
  
  public static Column percentile_approx(Column e, Column percentage, Column accuracy)
  
  Aggregate function: returns the approximate percentile of the numeric column e, which is the smallest value in the ordered values of e (sorted from least to greatest) such that no more than percentage of the values are less than or equal to that value.
  The percentage must be between 0.0 and 1.0, whether it is a single floating point value or each element of an array.
  The accuracy parameter is a positive numeric literal which controls approximation accuracy at the cost of memory. A higher value of accuracy yields better accuracy; 1.0/accuracy is the relative error of the approximation.
  
  Parameters:
  
  e - (undocumented)
  
  percentage - (undocumented)
  
  accuracy - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.1.0
- approx_percentile
  
  public static Column approx_percentile(Column e, Column percentage, Column accuracy)
  
  Aggregate function: returns the approximate percentile of the numeric column e, which is the smallest value in the ordered values of e (sorted from least to greatest) such that no more than percentage of the values are less than or equal to that value.
  The percentage must be between 0.0 and 1.0, whether it is a single floating point value or each element of an array.
  The accuracy parameter is a positive numeric literal which controls approximation accuracy at the cost of memory. A higher value of accuracy yields better accuracy; 1.0/accuracy is the relative error of the approximation.
  
  Parameters:
  
  e - (undocumented)
  
  percentage - (undocumented)
  
  accuracy - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- product
  
  public static Column product(Column e)
  
  Aggregate function: returns the product of all numerical elements in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- skewness
  
  public static Column skewness(Column e)
  
  Aggregate function: returns the skewness of the values in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- skewness
  
  public static Column skewness(String columnName)
  
  Aggregate function: returns the skewness of the values in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- std
  
  public static Column std(Column e)
  
  Aggregate function: alias for stddev_samp.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- stddev
  
  public static Column stddev(Column e)
  
  Aggregate function: alias for stddev_samp.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- stddev
  
  public static Column stddev(String columnName)
  
  Aggregate function: alias for stddev_samp.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- stddev_samp
  
  public static Column stddev_samp(Column e)
  
  Aggregate function: returns the sample standard deviation of the expression in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- stddev_samp
  
  public static Column stddev_samp(String columnName)
  
  Aggregate function: returns the sample standard deviation of the expression in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- stddev_pop
  
  public static Column stddev_pop(Column e)
  
  Aggregate function: returns the population standard deviation of the expression in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- stddev_pop
  
  public static Column stddev_pop(String columnName)
  
  Aggregate function: returns the population standard deviation of the expression in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
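
The difference between the sample form (stddev_samp, also reachable as std and stddev) and the population form (stddev_pop) is only the divisor. The plain-Java sketch below illustrates this with the textbook formulas; the class name StddevSketch is hypothetical, and Spark's handling of nulls and of groups with fewer than two rows is not modeled.

```java
final class StddevSketch {
    /** Sum of squared deviations from the mean, divided by n - 1 (sample) or n (population). */
    static double variance(double[] xs, boolean sample) {
        int n = xs.length;
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        double mean = sum / n;
        double squares = 0.0;
        for (double x : xs) {
            squares += (x - mean) * (x - mean);
        }
        // The sample form needs at least two values to be meaningful.
        return squares / (sample ? n - 1 : n);
    }

    static double stddevSamp(double[] xs) {
        return Math.sqrt(variance(xs, true));
    }

    static double stddevPop(double[] xs) {
        return Math.sqrt(variance(xs, false));
    }

    public static void main(String[] args) {
        double[] xs = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println(stddevPop(xs));  // population form: divides by 8
        System.out.println(stddevSamp(xs)); // sample form: divides by 7, so it is larger
    }
}
```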
- sum
  
  public static Column sum(Column e)
  
  Aggregate function: returns the sum of all values in the expression.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- sum
  
  public static Column sum(String columnName)
  
  Aggregate function: returns the sum of all values in the given column.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- sumDistinct
  
  public static Column sumDistinct(Column e)
  
  Deprecated.
  Use sum_distinct. Since 3.2.0.
  
  Aggregate function: returns the sum of distinct values in the expression.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- sumDistinct
  
  public static Column sumDistinct(String columnName)
  
  Deprecated.
  Use sum_distinct. Since 3.2.0.
  
  Aggregate function: returns the sum of distinct values in the expression.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- sum_distinct
  
  public static Column sum_distinct(Column e)
  
  Aggregate function: returns the sum of distinct values in the expression.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- variance
  
  public static Column variance(Column e)
  
  Aggregate function: alias for var_samp.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- variance
  
  public static Column variance(String columnName)
  
  Aggregate function: alias for var_samp.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- var_samp
  
  public static Column var_samp(Column e)
  
  Aggregate function: returns the unbiased variance of the values in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- var_samp
  
  public static Column var_samp(String columnName)
  
  Aggregate function: returns the unbiased variance of the values in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- var_pop
  
  public static Column var_pop(Column e)
  
  Aggregate function: returns the population variance of the values in a group.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- var_pop
  
  public static Column var_pop(String columnName)
  
  Aggregate function: returns the population variance of the values in a group.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- regr_avgx
  
  public static Column regr_avgx(Column y, Column x)
  
  Aggregate function: returns the average of the independent variable for non-null pairs in a group, where y is the dependent variable and x is the independent variable.
  
  Parameters:
  
  y - (undocumented)
  
  x - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regr_avgy
  
  public static Column regr_avgy(Column y, Column x)
  
  Aggregate function: returns the average of the dependent variable for non-null pairs in a group, where y is the dependent variable and x is the independent variable.
  
  Parameters:
  
  y - (undocumented)
  
  x - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regr_count
  
  public static Column regr_count(Column y, Column x)
  
  Aggregate function: returns the number of non-null number pairs in a group, where y is the dependent variable and x is the independent variable.
  
  Parameters:
  
  y - (undocumented)
  
  x - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regr_intercept
  
  public static Column regr_intercept(Column y, Column x)
  
  Aggregate function: returns the intercept of the univariate linear regression line for non-null pairs in a group, where y is the dependent variable and x is the independent variable.
  
  Parameters:
  
  y - (undocumented)
  
  x - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regr_r2
  
  public static Column regr_r2(Column y, Column x)
  
  Aggregate function: returns the coefficient of determination for non-null pairs in a group, where y is the dependent variable and x is the independent variable.
  
  Parameters:
  
  y - (undocumented)
  
  x - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regr_slope
  
  public static Column regr_slope(Column y, Column x)
  
  Aggregate function: returns the slope of the linear regression line for non-null pairs in a group, where y is the dependent variable and x is the independent variable.
  
  Parameters:
  
  y - (undocumented)
  
  x - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regr_sxx
  
  public static Column regr_sxx(Column y, Column x)
  
  Aggregate function: returns REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs in a group, where y is the dependent variable and x is the independent variable.
  
  Parameters:
  
  y - (undocumented)
  
  x - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regr_sxy
  
  public static Column regr_sxy(Column y, Column x)
  
  Aggregate function: returns REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs in a group, where y is the dependent variable and x is the independent variable.
  
  Parameters:
  
  y - (undocumented)
  
  x - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regr_syy
  
  public static Column regr_syy(Column y, Column x)
  
  Aggregate function: returns REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs in a group, where y is the dependent variable and x is the independent variable.
  
  Parameters:
  
  y - (undocumented)
  
  x - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
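
The regr_* aggregates above are related by standard least-squares identities. The plain-Java sketch below computes them over non-null (y, x) pairs; it is an illustration, not Spark's implementation. The class name RegrSketch is hypothetical, and the slope, intercept and r2 formulas are the textbook ones; special cases such as a zero regr_sxx are not modeled.

```java
final class RegrSketch {
    final long count;      // regr_count
    final double avgX;     // regr_avgx
    final double avgY;     // regr_avgy
    final double sxx;      // regr_sxx = count * var_pop(x)
    final double sxy;      // regr_sxy = count * covar_pop(y, x)
    final double syy;      // regr_syy = count * var_pop(y)

    RegrSketch(Double[] y, Double[] x) {
        long n = 0;
        double sumX = 0.0;
        double sumY = 0.0;
        for (int i = 0; i < y.length; i++) {
            // A pair takes part only when both members are non-null.
            if (y[i] != null && x[i] != null) {
                n++;
                sumX += x[i];
                sumY += y[i];
            }
        }
        count = n;
        avgX = sumX / n;
        avgY = sumY / n;
        double xx = 0.0;
        double xy = 0.0;
        double yy = 0.0;
        for (int i = 0; i < y.length; i++) {
            if (y[i] != null && x[i] != null) {
                double dx = x[i] - avgX;
                double dy = y[i] - avgY;
                xx += dx * dx;
                xy += dx * dy;
                yy += dy * dy;
            }
        }
        sxx = xx;
        sxy = xy;
        syy = yy;
    }

    double slope() { return sxy / sxx; }
    double intercept() { return avgY - slope() * avgX; }
    double r2() { return (sxy * sxy) / (sxx * syy); }

    public static void main(String[] args) {
        // y = 2x + 1 on three complete pairs; the pair with a null y is skipped.
        RegrSketch r = new RegrSketch(new Double[]{3.0, 5.0, null, 7.0},
                                      new Double[]{1.0, 2.0, 9.0, 3.0});
        System.out.println(r.count + " " + r.slope() + " " + r.intercept());
    }
}
```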
- any_value
  
  public static Column any_value(Column e)
  
  Aggregate function: returns some value of e for a group of rows.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- any_value
  
  public static Column any_value(Column e, Column ignoreNulls)
  
  Aggregate function: returns some value of e for a group of rows. If ignoreNulls is true, returns only non-null values.
  
  Parameters:
  
  e - (undocumented)
  
  ignoreNulls - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- count_if
  
  public static Column count_if(Column e)
  
  Aggregate function: returns the number of TRUE values for the expression.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- histogram_numeric
  
  public static Column histogram_numeric(Column e, Column nBins)
  
  Aggregate function: computes a histogram on the numeric column e using nBins bins. The return value is an array of (x,y) pairs representing the centers of the histogram's bins. As the value of nBins is increased, the histogram approximation gets finer-grained, but may yield artifacts around outliers. In practice, 20-40 histogram bins appear to work well, with more bins being required for skewed or smaller datasets. Note that this function creates a histogram with non-uniform bin widths. It offers no guarantees in terms of the mean-squared-error of the histogram, but in practice is comparable to the histograms produced by the R/S-Plus statistical computing packages. Note: the output type of the 'x' field in the return value is propagated from the input value consumed in the aggregate function.
  
  Parameters:
  
  e - (undocumented)
  
  nBins - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- every
  
  public static Column every(Column e)
  
  Aggregate function: returns true if all values of e are true.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bool_and
  
  public static Column bool_and(Column e)
  
  Aggregate function: returns true if all values of e are true.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- some
  
  public static Column some(Column e)
  
  Aggregate function: returns true if at least one value of e is true.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- any
  
  public static Column any(Column e)
  
  Aggregate function: returns true if at least one value of e is true.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bool_or
  
  public static Column bool_or(Column e)
  
  Aggregate function: returns true if at least one value of e is true.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bit_and
  
  public static Column bit_and(Column e)
  
  Aggregate function: returns the bitwise AND of all non-null input values, or null if none.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bit_or
  
  public static Column bit_or(Column e)
  
  Aggregate function: returns the bitwise OR of all non-null input values, or null if none.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bit_xor
  
  public static Column bit_xor(Column e)
  
  Aggregate function: returns the bitwise XOR of all non-null input values, or null if none.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
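
The "all non-null input values, or null if none" rule shared by bit_and, bit_or and bit_xor can be sketched in plain Java as a fold that skips nulls. This is an illustration only; the class name BitAggSketch and the use of Long inputs are assumptions.

```java
import java.util.function.LongBinaryOperator;

final class BitAggSketch {
    /** Folds the non-null inputs with op; returns null when there are none. */
    static Long fold(Long[] values, LongBinaryOperator op) {
        Long acc = null;
        for (Long v : values) {
            if (v == null) {
                continue; // null inputs do not take part in the aggregate
            }
            acc = (acc == null) ? v : Long.valueOf(op.applyAsLong(acc, v));
        }
        return acc;
    }

    static Long bitAnd(Long[] values) { return fold(values, (a, b) -> a & b); }
    static Long bitOr(Long[] values)  { return fold(values, (a, b) -> a | b); }
    static Long bitXor(Long[] values) { return fold(values, (a, b) -> a ^ b); }

    public static void main(String[] args) {
        Long[] xs = {12L, 10L, null}; // 1100 and 1010 in binary
        System.out.println(bitAnd(xs)); // prints 8
        System.out.println(bitOr(xs));  // prints 14
        System.out.println(bitXor(xs)); // prints 6
    }
}
```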
- cume_dist
  
  public static Column cume_dist()
  
  Window function: returns the cumulative distribution of values within a window partition, i.e. the fraction of rows that are at or below the current row.
  
  N = total number of rows in the partition
  cumeDist(x) = number of values before (and including) x / N
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
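
The cumeDist formula above can be worked through in plain Java. This is a sketch of the formula only, not Spark's implementation; the class name CumeDistSketch is hypothetical, and the int array stands in for the ordering values of one partition, already sorted.

```java
import java.util.Arrays;

final class CumeDistSketch {
    /** For each row: (number of values before and including it) / N. */
    static double[] cumeDist(int[] sortedValues) {
        int n = sortedValues.length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            int atOrBefore = 0;
            for (int v : sortedValues) {
                // Tied rows count for each other, so ties share one result.
                if (v <= sortedValues[i]) {
                    atOrBefore++;
                }
            }
            out[i] = (double) atOrBefore / n;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(cumeDist(new int[]{10, 20, 20, 30})));
        // prints [0.25, 0.75, 0.75, 1.0]
    }
}
```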
- dense_rank
  
  public static Column dense_rank()
  
  Window function: returns the rank of rows within a window partition, without any gaps.
  The difference between rank and dense_rank is that dense_rank leaves no gaps in the ranking sequence when there are ties. That is, if you were ranking a competition using dense_rank and had three people tie for second place, you would say that all three were in second place and that the next person came in third. rank, by contrast, assigns numbers by position, so the person who came next after the three-way tie would register as coming in fifth.
  This is equivalent to the DENSE_RANK function in SQL.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
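
The competition example above can be reproduced in plain Java. The sketch below is an illustration, not Spark's implementation; the class name RankSketch is hypothetical, and the int array stands in for the ordering values of one partition in window order.

```java
import java.util.Arrays;

final class RankSketch {
    /** Ties share a rank; the next distinct value takes its 1-based position (gaps). */
    static int[] rank(int[] ordered) {
        int[] out = new int[ordered.length];
        for (int i = 0; i < ordered.length; i++) {
            out[i] = (i > 0 && ordered[i] == ordered[i - 1]) ? out[i - 1] : i + 1;
        }
        return out;
    }

    /** Ties share a rank; the next distinct value takes the previous rank + 1 (no gaps). */
    static int[] denseRank(int[] ordered) {
        int[] out = new int[ordered.length];
        for (int i = 0; i < ordered.length; i++) {
            if (i == 0) {
                out[i] = 1;
            } else {
                out[i] = (ordered[i] == ordered[i - 1]) ? out[i - 1] : out[i - 1] + 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] scores = {100, 90, 90, 90, 80}; // three competitors tie for second place
        System.out.println(Arrays.toString(rank(scores)));      // prints [1, 2, 2, 2, 5]
        System.out.println(Arrays.toString(denseRank(scores))); // prints [1, 2, 2, 2, 3]
    }
}
```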
- lag
  
  public static Column lag(Column e, int offset)
  
  Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row. For example, an offset of one will return the value from the previous row at any given point in the window partition.
  This is equivalent to the LAG function in SQL.
  
  Parameters:
  
  e - (undocumented)
  
  offset - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- lag
  
  public static Column lag(String columnName, int offset)
  
  Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row. For example, an offset of one will return the value from the previous row at any given point in the window partition.
  This is equivalent to the LAG function in SQL.
  
  Parameters:
  
  columnName - (undocumented)
  
  offset - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- lag
  
  public static Column lag(String columnName, int offset, Object defaultValue)
  
  Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row. For example, an offset of one will return the value from the previous row at any given point in the window partition.
  This is equivalent to the LAG function in SQL.
  
  Parameters:
  
  columnName - (undocumented)
  
  offset - (undocumented)
  
  defaultValue - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- lag
  
  public static Column lag(Column e, int offset, Object defaultValue)
  
  Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row. For example, an offset of one will return the value from the previous row at any given point in the window partition.
  This is equivalent to the LAG function in SQL.
  
  Parameters:
  
  e - (undocumented)
  
  offset - (undocumented)
  
  defaultValue - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- lag
  
  public static Column lag(Column e, int offset, Object defaultValue, boolean ignoreNulls)
  
  Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row. ignoreNulls determines whether null values are included in or eliminated from the calculation. For example, an offset of one will return the value from the previous row at any given point in the window partition.
  This is equivalent to the LAG function in SQL.
  
  Parameters:
  
  e - (undocumented)
  
  offset - (undocumented)
  
  defaultValue - (undocumented)
  
  ignoreNulls - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- lead
  
  public static Column lead(String columnName, int offset)
  
  Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row. For example, an offset of one will return the value from the next row at any given point in the window partition.
  This is equivalent to the LEAD function in SQL.
  
  Parameters:
  
  columnName - (undocumented)
  
  offset - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- lead
  
  public static Column lead(Column e, int offset)
  
  Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row. For example, an offset of one will return the value from the next row at any given point in the window partition.
  This is equivalent to the LEAD function in SQL.
  
  Parameters:
  
  e - (undocumented)
  
  offset - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- lead
  
  public static Column lead(String columnName, int offset, Object defaultValue)
  
  Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row. For example, an offset of one will return the value from the next row at any given point in the window partition.
  This is equivalent to the LEAD function in SQL.
  
  Parameters:
  
  columnName - (undocumented)
  
  offset - (undocumented)
  
  defaultValue - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- lead
  
  public static Column lead(Column e, int offset, Object defaultValue)
  
  Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row. For example, an offset of one will return the value from the next row at any given point in the window partition.
  This is equivalent to the LEAD function in SQL.
  
  Parameters:
  
  e - (undocumented)
  
  offset - (undocumented)
  
  defaultValue - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- lead
  
  public static Column lead(Column e, int offset, Object defaultValue, boolean ignoreNulls)
  
  Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row. ignoreNulls determines whether null values are included in or eliminated from the calculation. The default value of ignoreNulls is false. For example, an offset of one will return the value from the next row at any given point in the window partition.
  This is equivalent to the LEAD function in SQL.
  
  Parameters:
  
  e - (undocumented)
  
  offset - (undocumented)
  
  defaultValue - (undocumented)
  
  ignoreNulls - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
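
The lag and lead descriptions above amount to reading a row at a fixed distance before or after the current one and falling back to defaultValue at the edges of the partition. The plain-Java sketch below illustrates that reading; it is not Spark's implementation. The class name LagLeadSketch is hypothetical, the list stands in for one ordered window partition, and ignoreNulls is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LagLeadSketch {
    /** For each row, the value offset rows before it, or defaultValue if no such row exists. */
    static <T> List<T> lag(List<T> partition, int offset, T defaultValue) {
        List<T> out = new ArrayList<>(partition.size());
        for (int i = 0; i < partition.size(); i++) {
            int source = i - offset;
            boolean inPartition = source >= 0 && source < partition.size();
            out.add(inPartition ? partition.get(source) : defaultValue);
        }
        return out;
    }

    /** For each row, the value offset rows after it, or defaultValue if no such row exists. */
    static <T> List<T> lead(List<T> partition, int offset, T defaultValue) {
        // Looking ahead by offset is the same as looking back by -offset.
        return lag(partition, -offset, defaultValue);
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c", "d");
        System.out.println(lag(rows, 1, null)); // prints [null, a, b, c]
        System.out.println(lead(rows, 2, "z")); // prints [c, d, z, z]
    }
}
```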
- nth_value
  
  public static Column nth_value(Column e, int offset, boolean ignoreNulls)
  
  Window function: returns the value in the offset-th row of the window frame (counting from 1), and null if the window frame has fewer than offset rows.
  It returns the offset-th non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
  This is equivalent to the nth_value function in SQL.
  
  Parameters:
  
  e - (undocumented)
  
  offset - (undocumented)
  
  ignoreNulls - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.1.0
- nth_value
  
  public static Column nth_value(Column e, int offset)
  
  Window function: returns the value in the offset-th row of the window frame (counting from 1), and null if the window frame has fewer than offset rows.
  This is equivalent to the nth_value function in SQL.
  
  Parameters:
  
  e - (undocumented)
  
  offset - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.1.0
- ntile
  
  public static Column ntile(int n)
  
  Window function: returns the ntile group id (from 1 to n inclusive) in an ordered window partition. For example, if n is 4, the first quarter of the rows will get value 1, the second quarter will get 2, the third quarter will get 3, and the last quarter will get 4.
  This is equivalent to the NTILE function in SQL.
  
  Parameters:
  
  n - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
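
The quartile example above divides evenly. The plain-Java sketch below also covers the uneven case, assuming the usual SQL NTILE convention that the first (rows mod n) groups each receive one extra row; that convention is an assumption of the sketch and is not stated on this page. The class name NtileSketch is hypothetical.

```java
import java.util.Arrays;

final class NtileSketch {
    /** Assigns group ids 1..n to rowCount ordered rows, larger groups first. */
    static int[] ntile(int rowCount, int n) {
        int[] out = new int[rowCount];
        int base = rowCount / n;      // minimum number of rows per group
        int remainder = rowCount % n; // assumption: the first `remainder` groups get one extra row
        int row = 0;
        for (int group = 1; group <= n && row < rowCount; group++) {
            int size = base + (group <= remainder ? 1 : 0);
            for (int k = 0; k < size; k++) {
                out[row++] = group;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(ntile(8, 4)));  // prints [1, 1, 2, 2, 3, 3, 4, 4]
        System.out.println(Arrays.toString(ntile(10, 4))); // prints [1, 1, 1, 2, 2, 2, 3, 3, 4, 4]
    }
}
```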
- percent_rank
  
  public static Column percent_rank()
  
  Window function: returns the relative rank (i.e. percentile) of rows within a window partition.
  This is computed by:
  
  (rank of row in its partition - 1) / (number of rows in the partition - 1)
  
  This is equivalent to the PERCENT_RANK function in SQL.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
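
The percent_rank formula above can be worked through in plain Java. This is a sketch of the formula, not Spark's implementation; the class name PercentRankSketch is hypothetical, the int array stands in for the ordering values of one partition in window order, and returning 0.0 for a single-row partition is an assumption made to avoid dividing by zero.

```java
import java.util.Arrays;

final class PercentRankSketch {
    /** (rank - 1) / (rows in partition - 1) for every row. */
    static double[] percentRank(int[] ordered) {
        int n = ordered.length;
        double[] out = new double[n];
        int rank = 1;
        for (int i = 0; i < n; i++) {
            // A new distinct value takes its 1-based position as its rank; ties keep the old rank.
            if (i > 0 && ordered[i] != ordered[i - 1]) {
                rank = i + 1;
            }
            out[i] = n > 1 ? (rank - 1) / (double) (n - 1) : 0.0;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(percentRank(new int[]{10, 20, 20, 30, 40})));
        // prints [0.0, 0.25, 0.25, 0.75, 1.0]
    }
}
```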
- rank
  
  public static Column rank()
  
  Window function: returns the rank of rows within a window partition.
  The difference between rank and dense_rank is that dense_rank leaves no gaps in the ranking sequence when there are ties. That is, if you were ranking a competition using dense_rank and had three people tie for second place, you would say that all three were in second place and that the next person came in third. rank, by contrast, assigns numbers by position, so the person who came next after the three-way tie would register as coming in fifth.
  This is equivalent to the RANK function in SQL.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- row_number
  
  public static Column row_number()
  
  Window function: returns a sequential number starting at 1 within a window partition.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- array
  
  public static Column array(scala.collection.immutable.Seq<Column> cols)
  
  Creates a new array column. The input columns must all have the same data type.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- array
  
  public static Column array(String colName, scala.collection.immutable.Seq<String> colNames)
  
  Creates a new array column. The input columns must all have the same data type.
  
  Parameters:
  
  colName - (undocumented)
  
  colNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- map
  
  public static Column map(scala.collection.immutable.Seq<Column> cols)
  
  Creates a new map column. The input columns must be grouped as key-value pairs, e.g. (key1, value1, key2, value2, ...). The key columns must all have the same data type, and can't be null. The value columns must all have the same data type.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0
- named_struct
  
  public static Column named_struct(scala.collection.immutable.Seq<Column> cols)
  
  Creates a struct with the given field names and values.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- map_from_arrays
  
  public static Column map_from_arrays(Column keys, Column values)
  
  Creates a new map column. The array in the first column is used for keys. The array in the second column is used for values. No element of the key array may be null.
  
  Parameters:
  
  keys - (undocumented)
  
  values - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4
- str_to_map
  
  public static Column str_to_map(Column text, Column pairDelim, Column keyValueDelim)
  
  Creates a map after splitting the text into key/value pairs using delimiters. Both pairDelim and keyValueDelim are treated as regular expressions.
  
  Parameters:
  
  text - (undocumented)
  
  pairDelim - (undocumented)
  
  keyValueDelim - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
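A rough plain-Java sketch of the splitting described above, with both delimiters treated as regular expressions. The class name is hypothetical. Splitting each pair into at most two parts and storing null when a pair has no value are assumptions of this sketch, not documented behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of str_to_map-style splitting (hypothetical helper, not Spark code).
final class StrToMapSketch {
    static Map<String, String> strToMap(String text, String pairDelim, String keyValueDelim) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String pair : text.split(pairDelim)) {        // pairDelim is a regex
            String[] kv = pair.split(keyValueDelim, 2);    // keyValueDelim is a regex; split once
            out.put(kv[0], kv.length > 1 ? kv[1] : null);  // assumption: missing value becomes null
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(strToMap("a:1,b:2,c:3", ",", ":"));
    }
}
```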
- str_to_map
  
  public static Column str_to_map(Column text, Column pairDelim)
  
  Creates a map after splitting the text into key/value pairs using delimiters. The pairDelim is treated as a regular expression.
  
  Parameters:
  
  text - (undocumented)
  
  pairDelim - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- str_to_map
  
  public static Column str_to_map(Column text)
  
  Creates a map after splitting the text into key/value pairs using the default delimiters: ',' between pairs and ':' between a key and its value.
  
  Parameters:
  
  text - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- broadcast
  
  public static <T> Dataset<T> broadcast(Dataset<T> df)
  Marks a DataFrame as small enough for use in broadcast joins.
  The following example marks the right DataFrame for broadcast hash join using joinKey.
  
      // left and right are DataFrames
      left.join(broadcast(right), "joinKey")
  
  Parameters:
  
  df - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- coalesce
  
  public static Column coalesce(scala.collection.immutable.Seq<Column> e)
  
  Returns the first column that is not null, or null if all inputs are null.
  For example, coalesce(a, b, c) will return a if a is not null, or b if a is null and b is not null, or c if both a and b are null but c is not null.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
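The first-non-null rule can be illustrated on plain Java values. The class name is hypothetical, and the sketch works on ordinary objects rather than Column expressions.

```java
// Plain-Java sketch of coalesce semantics (hypothetical helper, not Spark code).
final class CoalesceSketch {
    // Returns the first non-null argument, or null if every argument is null.
    @SafeVarargs
    static <T> T coalesce(T... values) {
        for (T v : values) {
            if (v != null) {
                return v;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(coalesce(null, "b", "c"));
    }
}
```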
- input_file_name
  
  public static Column input_file_name()
  
  Creates a string column for the file name of the current Spark task.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- isnan
  
  public static Column isnan(Column e)
  
  Returns true if and only if the column is NaN.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- isnull
  
  public static Column isnull(Column e)
  
  Returns true if and only if the column is null.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- monotonicallyIncreasingId
  
  public static Column monotonicallyIncreasingId()
  
  Deprecated.
  Use monotonically_increasing_id(). Since 2.0.0.
  A column expression that generates monotonically increasing 64-bit integers.
  The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the record number within each partition in the lower 33 bits. The assumption is that the DataFrame has fewer than 1 billion partitions, and each partition has fewer than 8 billion records.
  As an example, consider a DataFrame with two partitions, each with 3 records. This expression would return the following IDs:
  
  0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594.
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- monotonically_increasing_id
  
  public static Column monotonically_increasing_id()
  A column expression that generates monotonically increasing 64-bit integers.
  The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the record number within each partition in the lower 33 bits. The assumption is that the DataFrame has fewer than 1 billion partitions, and each partition has fewer than 8 billion records.
  As an example, consider a DataFrame with two partitions, each with 3 records. This expression would return the following IDs:
  
  0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594.
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
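The bit layout described above (partition ID in the upper bits, record number in the lower 33 bits) can be reproduced with plain Java arithmetic. The class and method names are hypothetical; the sketch only mirrors the documented layout and the two-partition example.

```java
// Plain-Java sketch of the documented ID layout (hypothetical helper, not Spark code).
final class MonotonicIdSketch {
    // Partition ID goes above bit 33; the record number within the partition fills the lower 33 bits.
    static long idFor(int partitionId, long recordNumber) {
        return ((long) partitionId << 33) + recordNumber;
    }

    public static void main(String[] args) {
        // Two partitions with three records each, as in the example above.
        for (int partition = 0; partition < 2; partition++) {
            for (long record = 0; record < 3; record++) {
                System.out.println(idFor(partition, record));
            }
        }
    }
}
```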
- nanvl
  
  public static Column nanvl(Column col1, Column col2)
  
  Returns col1 if it is not NaN, or col2 if col1 is NaN.
  Both inputs should be floating point columns (DoubleType or FloatType).
  
  Parameters:
  
  col1 - (undocumented)
  
  col2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
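On plain doubles, the nanvl rule amounts to a single NaN check. A minimal sketch with a hypothetical class name:

```java
// Plain-Java sketch of nanvl semantics on doubles (hypothetical helper, not Spark code).
final class NanvlSketch {
    // Returns col1 unless it is NaN, in which case col2 is returned.
    static double nanvl(double col1, double col2) {
        return Double.isNaN(col1) ? col2 : col1;
    }

    public static void main(String[] args) {
        System.out.println(nanvl(Double.NaN, 2.0));
        System.out.println(nanvl(1.0, 2.0));
    }
}
```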
- negate
  
  public static Column negate(Column e)
  Unary minus, i.e. negate the expression.
  
      // Select the amount column and negate all values.
      // Scala:
      df.select( -df("amount") )
  
      // Java:
      df.select( negate(df.col("amount")) );
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- not
  
  public static Column not(Column e)
  Inversion of boolean expression, i.e. NOT.
  
      // Scala: select rows that are not active (isActive === false)
      df.filter( !df("isActive") )
  
      // Java:
      df.filter( not(df.col("isActive")) );
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- rand
  
  public static Column rand(long seed)
  
  Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).
  
  Parameters:
  
  seed - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
  
  Note:
  
  The function is non-deterministic in the general case.
- rand
  
  public static Column rand()
  
  Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
  
  Note:
  
  The function is non-deterministic in the general case.
- randn
  
  public static Column randn(long seed)
  
  Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
  
  Parameters:
  
  seed - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
  
  Note:
  
  The function is non-deterministic in the general case.
- randn
  
  public static Column randn()
  
  Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
  
  Note:
  
  The function is non-deterministic in the general case.
- spark_partition_id
  
  public static Column spark_partition_id()
  
  Returns the ID of the partition that the current row belongs to.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
  
  Note:
  
  This is non-deterministic because it depends on data partitioning and task scheduling.
- sqrt
  
  public static Column sqrt(Column e)
  
  Computes the square root of the specified float value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- sqrt
  
  public static Column sqrt(String colName)
  
  Computes the square root of the specified float value.
  
  Parameters:
  
  colName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- try_add
  
  public static Column try_add(Column left, Column right)
  
  Returns the sum of left and right; the result is null on overflow. The acceptable input types are the same as for the + operator.
  
  Parameters:
  
  left - (undocumented)
  
  right - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
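The "null on overflow" behavior can be sketched for 64-bit integers with Math.addExact. The class name is hypothetical, and restricting the sketch to long values is an assumption made for brevity; the function itself accepts every type that + accepts.

```java
// Plain-Java sketch of null-on-overflow addition for longs (hypothetical helper, not Spark code).
final class TryAddSketch {
    // Returns left + right, or null if the sum does not fit in a long.
    static Long tryAdd(long left, long right) {
        try {
            return Math.addExact(left, right);
        } catch (ArithmeticException overflow) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAdd(1L, 2L));
        System.out.println(tryAdd(Long.MAX_VALUE, 1L));
    }
}
```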
- try_avg
  
  public static Column try_avg(Column e)
  
  Returns the mean calculated from values of a group and the result is null on overflow.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- try_divide
  
  public static Column try_divide(Column left, Column right)
  
  Returns left / right (dividend / divisor). It always performs floating-point division, and the result is always null if the divisor is 0.
  
  Parameters:
  
  left - (undocumented)
  
  right - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
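A minimal plain-Java sketch of the division rule above, assuming double inputs. The class name is hypothetical.

```java
// Plain-Java sketch of try_divide semantics on doubles (hypothetical helper, not Spark code).
final class TryDivideSketch {
    // Always divides in floating point; a zero divisor yields null instead of Infinity or NaN.
    static Double tryDivide(double dividend, double divisor) {
        if (divisor == 0.0) {
            return null;
        }
        return dividend / divisor;
    }

    public static void main(String[] args) {
        System.out.println(tryDivide(3.0, 2.0));
        System.out.println(tryDivide(1.0, 0.0));
    }
}
```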
- try_remainder
  
  public static Column try_remainder(Column left, Column right)
  
  Returns the remainder of left / right (dividend / divisor). The result is always null if the divisor is 0.
  
  Parameters:
  
  left - (undocumented)
  
  right - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- try_multiply
  
  public static Column try_multiply(Column left, Column right)
  
  Returns left * right; the result is null on overflow. The acceptable input types are the same as for the * operator.
  
  Parameters:
  
  left - (undocumented)
  
  right - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- try_subtract
  
  public static Column try_subtract(Column left, Column right)
  
  Returns left - right; the result is null on overflow. The acceptable input types are the same as for the - operator.
  
  Parameters:
  
  left - (undocumented)
  
  right - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- try_sum
  
  public static Column try_sum(Column e)
  
  Returns the sum calculated from values of a group and the result is null on overflow.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- struct
  
  public static Column struct(scala.collection.immutable.Seq<Column> cols)
  
  Creates a new struct column. If an input column is a column in a DataFrame, or a derived column expression that is named (i.e. aliased), its name is retained as the StructField's name. Otherwise, the StructField's name is auto-generated as col followed by the column's index + 1, i.e. col1, col2, col3, ...
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- struct
  
  public static Column struct(String colName, scala.collection.immutable.Seq<String> colNames)
  
  Creates a new struct column that composes multiple input columns.
  
  Parameters:
  
  colName - (undocumented)
  
  colNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- when
  
  public static Column when(Column condition, Object value)
  Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.
  
      // Example: encoding gender string column into integer.
  
      // Scala:
      people.select(when(people("gender") === "male", 0)
        .when(people("gender") === "female", 1)
        .otherwise(2))
  
      // Java:
      people.select(when(col("gender").equalTo("male"), 0)
        .when(col("gender").equalTo("female"), 1)
        .otherwise(2))
  
  Parameters:
  
  condition - (undocumented)
  
  value - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- bitwiseNOT
  
  public static Column bitwiseNOT(Column e)
  
  Deprecated.
  Use bitwise_not. Since 3.2.0.
  
  Computes bitwise NOT (~) of a number.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- bitwise_not
  
  public static Column bitwise_not(Column e)
  
  Computes bitwise NOT (~) of a number.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- bit_count
  
  public static Column bit_count(Column e)
  
  Returns the number of bits that are set in the argument expr as an unsigned 64-bit integer, or NULL if the argument is NULL.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bit_get
  
  public static Column bit_get(Column e, Column pos)
  
  Returns the value of the bit (0 or 1) at the specified position. The positions are numbered from right to left, starting at zero. The position argument cannot be negative.
  
  Parameters:
  
  e - (undocumented)
  
  pos - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
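The right-to-left, zero-based bit numbering can be illustrated with plain Java shifts. The class name is hypothetical, and the sketch assumes a 64-bit long input; it also shows Long.bitCount as the plain-Java counterpart of counting set bits.

```java
// Plain-Java sketch of bit numbering for a long value (hypothetical helper, not Spark code).
final class BitGetSketch {
    // Returns the bit (0 or 1) at position pos, where position 0 is the least significant bit.
    static int bitGet(long value, int pos) {
        if (pos < 0 || pos > 63) {
            throw new IllegalArgumentException("pos must be between 0 and 63");
        }
        return (int) ((value >> pos) & 1L);
    }

    public static void main(String[] args) {
        long eleven = 11L; // binary 1011
        System.out.println(bitGet(eleven, 0));
        System.out.println(bitGet(eleven, 2));
        System.out.println(Long.bitCount(eleven)); // number of set bits
    }
}
```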
- getbit
  
  public static Column getbit(Column e, Column pos)
  
  Returns the value of the bit (0 or 1) at the specified position. The positions are numbered from right to left, starting at zero. The position argument cannot be negative.
  
  Parameters:
  
  e - (undocumented)
  
  pos - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- expr
  
  public static Column expr(String expr)
  Parses the expression string into the column that it represents, similar to Dataset.selectExpr(java.lang.String...).
  
      // get the number of words of each length
      df.groupBy(expr("length(word)")).count()
  
  Parameters:
  
  expr - (undocumented)
  
  Returns:
  
  (undocumented)
- abs
  
  public static Column abs(Column e)
  
  Computes the absolute value of a numeric value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- acos
  
  public static Column acos(Column e)
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  inverse cosine of e in radians, as if computed by java.lang.Math.acos
  
  Since:
  
  1.4.0
- acos
  
  public static Column acos(String columnName)
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  inverse cosine of columnName, as if computed by java.lang.Math.acos
  
  Since:
  
  1.4.0
- acosh
  
  public static Column acosh(Column e)
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  inverse hyperbolic cosine of e
  
  Since:
  
  3.1.0
- acosh
  
  public static Column acosh(String columnName)
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  inverse hyperbolic cosine of columnName
  
  Since:
  
  3.1.0
- asin
  
  public static Column asin(Column e)
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  inverse sine of e in radians, as if computed by java.lang.Math.asin
  
  Since:
  
  1.4.0
- asin
  
  public static Column asin(String columnName)
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  inverse sine of columnName, as if computed by java.lang.Math.asin
  
  Since:
  
  1.4.0
- asinh
  
  public static Column asinh(Column e)
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  inverse hyperbolic sine of e
  
  Since:
  
  3.1.0
- asinh
  
  public static Column asinh(String columnName)
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  inverse hyperbolic sine of columnName
  
  Since:
  
  3.1.0
- atan
  
  public static Column atan(Column e)
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  inverse tangent of e, as if computed by java.lang.Math.atan
  
  Since:
  
  1.4.0
- atan
  
  public static Column atan(String columnName)
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  inverse tangent of columnName, as if computed by java.lang.Math.atan
  
  Since:
  
  1.4.0
- atan2
  
  public static Column atan2(Column y, Column x)
  
  Parameters:
  
  y - coordinate on y-axis
  
  x - coordinate on x-axis
  
  Returns:
  
  the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates, as if computed by java.lang.Math.atan2
  
  Since:
  
  1.4.0
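Since the result is described as matching java.lang.Math.atan2, the argument order (y first, then x) can be checked directly in plain Java. The class and method names are hypothetical.

```java
// Plain-Java sketch of the Cartesian-to-polar angle via Math.atan2 (hypothetical helper).
final class Atan2Sketch {
    // Returns theta for the point (x, y). Note the argument order: y comes first.
    static double theta(double y, double x) {
        return Math.atan2(y, x);
    }

    public static void main(String[] args) {
        System.out.println(theta(1.0, 1.0));  // point (1, 1), first quadrant
        System.out.println(theta(1.0, 0.0));  // point (0, 1), on the positive y-axis
        System.out.println(theta(0.0, -1.0)); // point (-1, 0), on the negative x-axis
    }
}
```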
- atan2
  
  public static Column atan2(Column y, String xName)
  
  Parameters:
  
  y - coordinate on y-axis
  
  xName - coordinate on x-axis
  
  Returns:
  
  the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates, as if computed by java.lang.Math.atan2
  
  Since:
  
  1.4.0
- atan2
  
  public static Column atan2(String yName, Column x)
  
  Parameters:
  
  yName - coordinate on y-axis
  
  x - coordinate on x-axis
  
  Returns:
  
  the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates, as if computed by java.lang.Math.atan2
  
  Since:
  
  1.4.0
- atan2
  
  public static Column atan2(String yName, String xName)
  
  Parameters:
  
  yName - coordinate on y-axis
  
  xName - coordinate on x-axis
  
  Returns:
  
  the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates, as if computed by java.lang.Math.atan2
  
  Since:
  
  1.4.0
- atan2
  
  public static Column atan2(Column y, double xValue)
  
  Parameters:
  
  y - coordinate on y-axis
  
  xValue - coordinate on x-axis
  
  Returns:
  
  the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates, as if computed by java.lang.Math.atan2
  
  Since:
  
  1.4.0
- atan2
  
  public static Column atan2(String yName, double xValue)
  
  Parameters:
  
  yName - coordinate on y-axis
  
  xValue - coordinate on x-axis
  
  Returns:
  
  the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates, as if computed by java.lang.Math.atan2
  
  Since:
  
  1.4.0
- atan2
  
  public static Column atan2(double yValue, Column x)
  
  Parameters:
  
  yValue - coordinate on y-axis
  
  x - coordinate on x-axis
  
  Returns:
  
  the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates, as if computed by java.lang.Math.atan2
  
  Since:
  
  1.4.0
- atan2
  
  public static Column atan2(double yValue, String xName)
  
  Parameters:
  
  yValue - coordinate on y-axis
  
  xName - coordinate on x-axis
  
  Returns:
  
  the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates, as if computed by java.lang.Math.atan2
  
  Since:
  
  1.4.0
- atanh
  
  public static Column atanh(Column e)
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  inverse hyperbolic tangent of e
  
  Since:
  
  3.1.0
- atanh
  
  public static Column atanh(String columnName)
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  inverse hyperbolic tangent of columnName
  
  Since:
  
  3.1.0
- bin
  
  public static Column bin(Column e)
  
  An expression that returns the string representation of the binary value of the given long column. For example, bin("12") returns "1100".
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
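The bin("12") example above corresponds to Long.toBinaryString in plain Java. A minimal sketch with a hypothetical class name, assuming a long input:

```java
// Plain-Java sketch of the binary string representation of a long (hypothetical helper).
final class BinSketch {
    static String bin(long value) {
        return Long.toBinaryString(value);
    }

    public static void main(String[] args) {
        System.out.println(bin(12L));
    }
}
```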
- bin
  
  public static Column bin(String columnName)
  
  An expression that returns the string representation of the binary value of the given long column. For example, bin("12") returns "1100".
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- cbrt
  
  public static Column cbrt(Column e)
  
  Computes the cube-root of the given value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- cbrt
  
  public static Column cbrt(String columnName)
  
  Computes the cube-root of the given column.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- ceil
  
  public static Column ceil(Column e, Column scale)
  
  Computes the ceiling of the given value of e to scale decimal places.
  
  Parameters:
  
  e - (undocumented)
  
  scale - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.3.0
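Rounding up to a given number of decimal places can be sketched with java.math.BigDecimal and RoundingMode.CEILING. The class name is hypothetical, and modeling the input as a decimal string is an assumption of this sketch rather than a statement about Spark's result type.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Plain-Java sketch of ceiling to `scale` decimal places (hypothetical helper, not Spark code).
final class CeilScaleSketch {
    // Rounds toward positive infinity, keeping `scale` digits after the decimal point.
    static BigDecimal ceil(String value, int scale) {
        return new BigDecimal(value).setScale(scale, RoundingMode.CEILING);
    }

    public static void main(String[] args) {
        System.out.println(ceil("3.1411", 3).toPlainString());
        System.out.println(ceil("-3.1411", 3).toPlainString());
        System.out.println(ceil("3.1411", 0).toPlainString());
    }
}
```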
- ceil
  
  public static Column ceil(Column e)
  
  Computes the ceiling of the given value of e to 0 decimal places.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- ceil
  
  public static Column ceil(String columnName)
  
  Computes the ceiling of the given column value to 0 decimal places.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- ceiling
  
  public static Column ceiling(Column e, Column scale)
  
  Computes the ceiling of the given value of e to scale decimal places.
  
  Parameters:
  
  e - (undocumented)
  
  scale - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- ceiling
  
  public static Column ceiling(Column e)
  
  Computes the ceiling of the given value of e to 0 decimal places.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- conv
  
  public static Column conv(Column num, int fromBase, int toBase)
  
  Convert a number in a string column from one base to another.
  
  Parameters:
  
  num - (undocumented)
  
  fromBase - (undocumented)
  
  toBase - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
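Base conversion of a numeric string can be sketched with Long.parseLong and Long.toString. The class name is hypothetical, and the sketch makes several assumptions: it handles only non-negative values that fit in a long, accepts only bases 2 to 36, and upper-cases the result.

```java
// Plain-Java sketch of string base conversion (hypothetical helper, not Spark code).
final class ConvSketch {
    // Parses num in fromBase and renders it in toBase.
    // Assumptions: non-negative input that fits in a long; bases between 2 and 36.
    static String conv(String num, int fromBase, int toBase) {
        long parsed = Long.parseLong(num, fromBase);
        return Long.toString(parsed, toBase).toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(conv("100", 2, 10));
        System.out.println(conv("255", 10, 16));
    }
}
```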
- cos
  
  public static Column cos(Column e)
  
  Parameters:
  
  e - angle in radians
  
  Returns:
  
  cosine of the angle, as if computed by java.lang.Math.cos
  
  Since:
  
  1.4.0
- cos
  
  public static Column cos(String columnName)
  
  Parameters:
  
  columnName - angle in radians
  
  Returns:
  
  cosine of the angle, as if computed by java.lang.Math.cos
  
  Since:
  
  1.4.0
- cosh
  
  public static Column cosh(Column e)
  
  Parameters:
  
  e - hyperbolic angle
  
  Returns:
  
  hyperbolic cosine of the angle, as if computed by java.lang.Math.cosh
  
  Since:
  
  1.4.0
- cosh
  
  public static Column cosh(String columnName)
  
  Parameters:
  
  columnName - hyperbolic angle
  
  Returns:
  
  hyperbolic cosine of the angle, as if computed by java.lang.Math.cosh
  
  Since:
  
  1.4.0
- cot
  
  public static Column cot(Column e)
  
  Parameters:
  
  e - angle in radians
  
  Returns:
  
  cotangent of the angle
  
  Since:
  
  3.3.0
- csc
  
  public static Column csc(Column e)
  
  Parameters:
  
  e - angle in radians
  
  Returns:
  
  cosecant of the angle
  
  Since:
  
  3.3.0
- e
  
  public static Column e()
  
  Returns Euler's number.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- exp
  
  public static Column exp(Column e)
  
  Computes the exponential of the given value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- exp
  
  public static Column exp(String columnName)
  
  Computes the exponential of the given column.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- expm1
  
  public static Column expm1(Column e)
  
  Computes the exponential of the given value minus one.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- expm1
  
  public static Column expm1(String columnName)
  
  Computes the exponential of the given column minus one.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- factorial
  
  public static Column factorial(Column e)
  
  Computes the factorial of the given value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- floor
  
  public static Column floor(Column e, Column scale)
  
  Computes the floor of the given value of e to scale decimal places.
  
  Parameters:
  
  e - (undocumented)
  
  scale - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.3.0
- floor
  
  public static Column floor(Column e)
  
  Computes the floor of the given value of e to 0 decimal places.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- floor
  
  public static Column floor(String columnName)
  
  Computes the floor of the given column value to 0 decimal places.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- greatest
  
  public static Column greatest(scala.collection.immutable.Seq<Column> exprs)
  
  Returns the greatest value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
  
  Parameters:
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
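The null-skipping rule for greatest can be illustrated on plain Java values. The class name is hypothetical; least follows the same pattern with the comparison reversed.

```java
// Plain-Java sketch of greatest with null skipping (hypothetical helper, not Spark code).
final class GreatestSketch {
    // Returns the largest non-null argument, or null if every argument is null.
    @SafeVarargs
    static <T extends Comparable<T>> T greatest(T... values) {
        T best = null;
        for (T v : values) {
            if (v != null && (best == null || v.compareTo(best) > 0)) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(greatest(1, null, 3));
    }
}
```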
- greatest
  
  public static Column greatest(String columnName, scala.collection.immutable.Seq<String> columnNames)
  
  Returns the greatest value of the list of column names, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
  
  Parameters:
  
  columnName - (undocumented)
  
  columnNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- hex
  
  public static Column hex(Column column)
  
  Computes hex value of the given column.
  
  Parameters:
  
  column - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- unhex
  
  public static Column unhex(Column column)
  
  Inverse of hex. Interprets each pair of characters as a hexadecimal number and converts to the byte representation of number.
  
  Parameters:
  
  column - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- hypot
  
  public static Column hypot(Column l, Column r)
  
  Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
  
  Parameters:
  
  l - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
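To see why "without intermediate overflow" matters, the plain-Java sketch below compares a naive formula with java.lang.Math.hypot for large inputs. The class and method names are hypothetical, and using Math.hypot as the reference is an assumption for illustration only.

```java
// Plain-Java sketch contrasting a naive hypotenuse with Math.hypot (hypothetical helper).
final class HypotSketch {
    // Squares each input first, so the intermediate sum can overflow to Infinity.
    static double naive(double a, double b) {
        return Math.sqrt(a * a + b * b);
    }

    // Delegates to Math.hypot, which avoids intermediate overflow and underflow.
    static double safe(double a, double b) {
        return Math.hypot(a, b);
    }

    public static void main(String[] args) {
        System.out.println(naive(3e200, 4e200)); // the squares exceed the double range
        System.out.println(safe(3e200, 4e200));
    }
}
```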
- hypot
  
  public static Column hypot(Column l, String rightName)
  
  Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
  
  Parameters:
  
  l - (undocumented)
  
  rightName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- hypot
  
  public static Column hypot(String leftName, Column r)
  
  Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
  
  Parameters:
  
  leftName - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- hypot
  
  public static Column hypot(String leftName, String rightName)
  
  Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
  
  Parameters:
  
  leftName - (undocumented)
  
  rightName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- hypot
  
  public static Column hypot(Column l, double r)
  
  Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
  
  Parameters:
  
  l - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- hypot
  
  public static Column hypot(String leftName, double r)
  
  Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
  
  Parameters:
  
  leftName - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- hypot
  
  public static Column hypot(double l, Column r)
  
  Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
  
  Parameters:
  
  l - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- hypot
  
  public static Column hypot(double l, String rightName)
  
  Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
  
  Parameters:
  
  l - (undocumented)
  
  rightName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- least
  
  public static Column least(scala.collection.immutable.Seq<Column> exprs)
  
  Returns the least value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
  
  Parameters:
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- least
  
  public static Column least(String columnName, scala.collection.immutable.Seq<String> columnNames)
  
  Returns the least value of the list of column names, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
  
  Parameters:
  
  columnName - (undocumented)
  
  columnNames - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- ln
  
  public static Column ln(Column e)
  
  Computes the natural logarithm of the given value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- log
  
  public static Column log(Column e)
  
  Computes the natural logarithm of the given value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- log
  
  public static Column log(String columnName)
  
  Computes the natural logarithm of the given column.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- log
  
  public static Column log(double base, Column a)
  
  Returns the first argument-base logarithm of the second argument.
  
  Parameters:
  
  base - (undocumented)
  
  a - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
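The "first argument-base logarithm of the second argument" is the change-of-base identity log_base(a) = ln(a) / ln(base). A plain-Java sketch with a hypothetical class name:

```java
// Plain-Java sketch of a logarithm with an explicit base (hypothetical helper, not Spark code).
final class LogBaseSketch {
    // The base comes first and the value second, matching the argument order documented above.
    static double log(double base, double a) {
        return Math.log(a) / Math.log(base);
    }

    public static void main(String[] args) {
        System.out.println(log(2.0, 8.0));
        System.out.println(log(10.0, 1000.0));
    }
}
```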
- log
  
  public static Column log(double base, String columnName)
  
  Returns the first argument-base logarithm of the second argument.
  
  Parameters:
  
  base - (undocumented)
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- log10
  
  public static Column log10(Column e)
  
  Computes the logarithm of the given value in base 10.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- log10
  
  public static Column log10(String columnName)
  
  Computes the logarithm of the given value in base 10.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- log1p
  
  public static Column log1p(Column e)
  
  Computes the natural logarithm of the given value plus one.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- log1p
  
  public static Column log1p(String columnName)
  
  Computes the natural logarithm of the given column plus one.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- log2
  
  public static Column log2(Column expr)
  
  Computes the logarithm of the given column in base 2.
  
  Parameters:
  
  expr - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- log2
  
  public static Column log2(String columnName)
  
  Computes the logarithm of the given value in base 2.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- negative
  
  public static Column negative(Column e)
  
  Returns the negated value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- pi
  
  public static Column pi()
  
  Returns Pi.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- positive
  
  public static Column positive(Column e)
  
  Returns the value unchanged (unary plus).
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- pow
  
  public static Column pow(Column l, Column r)
  
  Returns the value of the first argument raised to the power of the second argument.
  
  Parameters:
  
  l - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- pow
  
  public static Column pow(Column l, String rightName)
  
  Returns the value of the first argument raised to the power of the second argument.
  
  Parameters:
  
  l - (undocumented)
  
  rightName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- pow
  
  public static Column pow(String leftName, Column r)
  
  Returns the value of the first argument raised to the power of the second argument.
  
  Parameters:
  
  leftName - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- pow
  
  public static Column pow(String leftName, String rightName)
  
  Returns the value of the first argument raised to the power of the second argument.
  
  Parameters:
  
  leftName - (undocumented)
  
  rightName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- pow
  
  public static Column pow(Column l, double r)
  
  Returns the value of the first argument raised to the power of the second argument.
  
  Parameters:
  
  l - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- pow
  
  public static Column pow(String leftName, double r)
  
  Returns the value of the first argument raised to the power of the second argument.
  
  Parameters:
  
  leftName - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- pow
  
  public static Column pow(double l, Column r)
  
  Returns the value of the first argument raised to the power of the second argument.
  
  Parameters:
  
  l - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- pow
  
  public static Column pow(double l, String rightName)
  
  Returns the value of the first argument raised to the power of the second argument.
  
  Parameters:
  
  l - (undocumented)
  
  rightName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- power
  
  public static Column power(Column l, Column r)
  
  Returns the value of the first argument raised to the power of the second argument.
  
  Parameters:
  
  l - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- pmod
  
  public static Column pmod(Column dividend, Column divisor)
  
  Returns the positive value of dividend mod divisor.
  
  Parameters:
  
  dividend - (undocumented)
  
  divisor - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
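
The difference between a positive modulus and Java's `%` operator can be sketched in plain Java as follows. This is only an illustration for a positive divisor; the class name `PmodSketch` is hypothetical, and behavior for negative divisors or overflow is not covered.

```java
class PmodSketch {
    // For a positive divisor, shifts Java's remainder (which keeps the dividend's sign) into [0, divisor).
    static int pmod(int dividend, int divisor) {
        return ((dividend % divisor) + divisor) % divisor;
    }

    public static void main(String[] args) {
        System.out.println("-7 % 3      = " + (-7 % 3));
        System.out.println("pmod(-7, 3) = " + pmod(-7, 3));
    }
}
```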
- rint
  
  public static Column rint(Column e)
  
  Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- rint
  
  public static Column rint(String columnName)
  
  Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
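
The description above matches the wording of `java.lang.Math.rint`, so a plain-Java sketch may help show what "closest mathematical integer" means at ties. It is an assumption that the column function behaves like `Math.rint`; the class name `RintSketch` is hypothetical.

```java
class RintSketch {
    // Math.rint returns a double and resolves ties toward the even integer.
    static double rint(double value) {
        return Math.rint(value);
    }

    public static void main(String[] args) {
        for (double v : new double[] {2.4, 2.5, 3.5, -2.5}) {
            System.out.println("rint(" + v + ") = " + rint(v));
        }
    }
}
```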
- round
  
  public static Column round(Column e)
  
  Returns the value of the column e rounded to 0 decimal places with HALF_UP round mode.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- round
  
  public static Column round(Column e, int scale)
  
  Rounds the value of e to scale decimal places with HALF_UP round mode if scale is greater than or equal to 0, or at the integral part when scale is less than 0.
  
  Parameters:
  
  e - (undocumented)
  
  scale - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- round
  
  public static Column round(Column e, Column scale)
  
  Rounds the value of e to scale decimal places with HALF_UP round mode if scale is greater than or equal to 0, or at the integral part when scale is less than 0.
  
  Parameters:
  
  e - (undocumented)
  
  scale - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
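
HALF_UP rounding with a positive, zero, or negative scale can be sketched with `java.math.BigDecimal`. This is a plain-Java illustration of the rounding mode, not Spark's implementation; the class name `RoundHalfUpSketch` is hypothetical and values are passed as decimal strings to avoid binary floating-point noise.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundHalfUpSketch {
    // scale >= 0 keeps that many decimal places; scale < 0 rounds within the integral part.
    static String round(String value, int scale) {
        return new BigDecimal(value).setScale(scale, RoundingMode.HALF_UP).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(round("2.5", 0));    // tie goes away from zero
        System.out.println(round("1.245", 2));
        System.out.println(round("125", -1));   // rounds to the nearest ten
    }
}
```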
- bround
  
  public static Column bround(Column e)
  
  Returns the value of the column e rounded to 0 decimal places with HALF_EVEN round mode.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- bround
  
  public static Column bround(Column e, int scale)
  
  Rounds the value of e to scale decimal places with HALF_EVEN round mode if scale is greater than or equal to 0, or at the integral part when scale is less than 0.
  
  Parameters:
  
  e - (undocumented)
  
  scale - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- bround
  
  public static Column bround(Column e, Column scale)
  
  Rounds the value of e to scale decimal places with HALF_EVEN round mode if scale is greater than or equal to 0, or at the integral part when scale is less than 0.
  
  Parameters:
  
  e - (undocumented)
  
  scale - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
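
HALF_EVEN ("banker's") rounding differs from HALF_UP only at exact ties. The plain-Java sketch below shows the two modes side by side using `java.math.BigDecimal`; it does not call Spark, and the class name `BroundSketch` is hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class BroundSketch {
    // Ties go to the neighbor whose last kept digit is even.
    static String halfEven(String value, int scale) {
        return new BigDecimal(value).setScale(scale, RoundingMode.HALF_EVEN).toPlainString();
    }

    // Ties go away from zero; shown only for comparison.
    static String halfUp(String value, int scale) {
        return new BigDecimal(value).setScale(scale, RoundingMode.HALF_UP).toPlainString();
    }

    public static void main(String[] args) {
        for (String v : new String[] {"2.5", "3.5", "0.125"}) {
            int scale = v.equals("0.125") ? 2 : 0;
            System.out.println(v + " -> HALF_EVEN " + halfEven(v, scale) + ", HALF_UP " + halfUp(v, scale));
        }
    }
}
```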
- sec
  
  public static Column sec(Column e)
  
  Parameters:
  
  e - angle in radians
  
  Returns:
  
  secant of the angle
  
  Since:
  
  3.3.0
- shiftLeft
  
  public static Column shiftLeft(Column e, int numBits)
  
  Deprecated.
  Use shiftleft. Since 3.2.0.
  
  Shift the given value numBits left. If the given value is a long value, this function returns a long value; otherwise it returns an integer value.
  
  Parameters:
  
  e - (undocumented)
  
  numBits - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- shiftleft
  
  public static Column shiftleft(Column e, int numBits)
  
  Shift the given value numBits left. If the given value is a long value, this function returns a long value; otherwise it returns an integer value.
  
  Parameters:
  
  e - (undocumented)
  
  numBits - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- shiftRight
  
  public static Column shiftRight(Column e, int numBits)
  
  Deprecated.
  Use shiftright. Since 3.2.0.
  
  (Signed) shift the given value numBits right. If the given value is a long value, this function returns a long value; otherwise it returns an integer value.
  
  Parameters:
  
  e - (undocumented)
  
  numBits - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- shiftright
  
  public static Column shiftright(Column e, int numBits)
  
  (Signed) shift the given value numBits right. If the given value is a long value, this function returns a long value; otherwise it returns an integer value.
  
  Parameters:
  
  e - (undocumented)
  
  numBits - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- shiftRightUnsigned
  
  public static Column shiftRightUnsigned(Column e, int numBits)
  
  Deprecated.
  Use shiftrightunsigned. Since 3.2.0.
  
  Unsigned shift the given value numBits right. If the given value is a long value, this function returns a long value; otherwise it returns an integer value.
  
  Parameters:
  
  e - (undocumented)
  
  numBits - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- shiftrightunsigned
  
  public static Column shiftrightunsigned(Column e, int numBits)
  
  Unsigned shift the given value numBits right. If the given value is a long value, this function returns a long value; otherwise it returns an integer value.
  
  Parameters:
  
  e - (undocumented)
  
  numBits - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
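
The three shift flavors above correspond to Java's `<<`, `>>` and `>>>` operators. The sketch below uses plain Java with int and long overloads to show how the result width follows the input width; the class name `ShiftSketch` is hypothetical and it is an assumption that the column functions mirror these operators.

```java
class ShiftSketch {
    static int shiftLeft(int v, int numBits) { return v << numBits; }
    static long shiftLeft(long v, int numBits) { return v << numBits; }

    // Signed shift: the sign bit is copied into the vacated high bits.
    static int shiftRight(int v, int numBits) { return v >> numBits; }
    static long shiftRight(long v, int numBits) { return v >> numBits; }

    // Unsigned shift: zeros are shifted into the vacated high bits.
    static int shiftRightUnsigned(int v, int numBits) { return v >>> numBits; }
    static long shiftRightUnsigned(long v, int numBits) { return v >>> numBits; }

    public static void main(String[] args) {
        System.out.println(shiftLeft(21, 1));
        System.out.println(shiftRight(-8, 1));
        System.out.println(shiftRightUnsigned(-8, 1));
        System.out.println(shiftRightUnsigned(-8L, 1));
    }
}
```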
- sign
  
  public static Column sign(Column e)
  
  Computes the signum of the given value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- signum
  
  public static Column signum(Column e)
  
  Computes the signum of the given value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- signum
  
  public static Column signum(String columnName)
  
  Computes the signum of the given column.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- sin
  
  public static Column sin(Column e)
  
  Parameters:
  
  e - angle in radians
  
  Returns:
  
  sine of the angle, as if computed by java.lang.Math.sin
  
  Since:
  
  1.4.0
- sin
  
  public static Column sin(String columnName)
  
  Parameters:
  
  columnName - angle in radians
  
  Returns:
  
  sine of the angle, as if computed by java.lang.Math.sin
  
  Since:
  
  1.4.0
- sinh
  
  public static Column sinh(Column e)
  
  Parameters:
  
  e - hyperbolic angle
  
  Returns:
  
  hyperbolic sine of the given value, as if computed by java.lang.Math.sinh
  
  Since:
  
  1.4.0
- sinh
  
  public static Column sinh(String columnName)
  
  Parameters:
  
  columnName - hyperbolic angle
  
  Returns:
  
  hyperbolic sine of the given value, as if computed by java.lang.Math.sinh
  
  Since:
  
  1.4.0
- tan
  
  public static Column tan(Column e)
  
  Parameters:
  
  e - angle in radians
  
  Returns:
  
  tangent of the given value, as if computed by java.lang.Math.tan
  
  Since:
  
  1.4.0
- tan
  
  public static Column tan(String columnName)
  
  Parameters:
  
  columnName - angle in radians
  
  Returns:
  
  tangent of the given value, as if computed by java.lang.Math.tan
  
  Since:
  
  1.4.0
- tanh
  
  public static Column tanh(Column e)
  
  Parameters:
  
  e - hyperbolic angle
  
  Returns:
  
  hyperbolic tangent of the given value, as if computed by java.lang.Math.tanh
  
  Since:
  
  1.4.0
- tanh
  
  public static Column tanh(String columnName)
  
  Parameters:
  
  columnName - hyperbolic angle
  
  Returns:
  
  hyperbolic tangent of the given value, as if computed by java.lang.Math.tanh
  
  Since:
  
  1.4.0
- toDegrees
  
  public static Column toDegrees(Column e)
  
  Deprecated.
  Use degrees. Since 2.1.0.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- toDegrees
  
  public static Column toDegrees(String columnName)
  
  Deprecated.
  Use degrees. Since 2.1.0.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- degrees
  
  public static Column degrees(Column e)
  
  Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
  
  Parameters:
  
  e - angle in radians
  
  Returns:
  
  angle in degrees, as if computed by java.lang.Math.toDegrees
  
  Since:
  
  2.1.0
- degrees
  
  public static Column degrees(String columnName)
  
  Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
  
  Parameters:
  
  columnName - angle in radians
  
  Returns:
  
  angle in degrees, as if computed by java.lang.Math.toDegrees
  
  Since:
  
  2.1.0
- toRadians
  
  public static Column toRadians(Column e)
  
  Deprecated.
  Use radians. Since 2.1.0.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- toRadians
  
  public static Column toRadians(String columnName)
  
  Deprecated.
  Use radians. Since 2.1.0.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.4.0
- radians
  
  public static Column radians(Column e)
  
  Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
  
  Parameters:
  
  e - angle in degrees
  
  Returns:
  
  angle in radians, as if computed by java.lang.Math.toRadians
  
  Since:
  
  2.1.0
- radians
  
  public static Column radians(String columnName)
  
  Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
  
  Parameters:
  
  columnName - angle in degrees
  
  Returns:
  
  angle in radians, as if computed by java.lang.Math.toRadians
  
  Since:
  
  2.1.0
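
Since degrees and radians are documented above as behaving "as if computed by" `java.lang.Math.toDegrees` and `java.lang.Math.toRadians`, a plain-Java sketch of those two calls may be useful. The class name `AngleSketch` is hypothetical; results are compared with a tolerance because the conversion is only approximately exact.

```java
class AngleSketch {
    static double degrees(double radians) {
        return Math.toDegrees(radians);
    }

    static double radians(double degrees) {
        return Math.toRadians(degrees);
    }

    public static void main(String[] args) {
        System.out.println("degrees(pi) = " + degrees(Math.PI));
        System.out.println("radians(90) = " + radians(90.0));
    }
}
```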
- width_bucket
  
  public static Column width_bucket(Column v, Column min, Column max, Column numBucket)
  
  Returns the bucket number into which the value of this expression would fall after being evaluated. Note that input arguments must follow conditions listed below; otherwise, the method will return null.
  
  Parameters:
  
  v - value to compute a bucket number in the histogram
  
  min - minimum value of the histogram
  
  max - maximum value of the histogram
  
  numBucket - the number of buckets
  
  Returns:
  
  the bucket number into which the value would fall after being evaluated
  
  Since:
  
  3.5.0
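
One way to picture the bucket computation is an equi-width histogram over [min, max) with an underflow bucket 0 and an overflow bucket numBucket + 1. The plain-Java sketch below follows that convention for min < max only; the class name `WidthBucketSketch`, the exact validity checks, and the omission of descending ranges are assumptions made for illustration.

```java
class WidthBucketSketch {
    // Returns null for inputs this sketch treats as invalid (assumed conditions, not taken from the text above).
    static Long widthBucket(double v, double min, double max, long numBucket) {
        if (numBucket <= 0 || !(min < max) || Double.isNaN(v)) {
            return null;
        }
        if (v < min) {
            return 0L;                 // underflow bucket
        }
        if (v >= max) {
            return numBucket + 1;      // overflow bucket
        }
        // Buckets 1..numBucket split [min, max) into equal widths.
        return (long) (numBucket * (v - min) / (max - min)) + 1;
    }

    public static void main(String[] args) {
        System.out.println(widthBucket(5.3, 0.2, 10.6, 5));
        System.out.println(widthBucket(-1.0, 0.0, 10.0, 5));
        System.out.println(widthBucket(10.0, 0.0, 10.0, 5));
    }
}
```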
- current_catalog
  
  public static Column current_catalog()
  
  Returns the current catalog.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- current_database
  
  public static Column current_database()
  
  Returns the current database.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- current_schema
  
  public static Column current_schema()
  
  Returns the current schema.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- current_user
  
  public static Column current_user()
  
  Returns the user name of the current execution context.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- md5
  
  public static Column md5(Column e)
  
  Calculates the MD5 digest of a binary column and returns the value as a 32 character hex string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- sha1
  
  public static Column sha1(Column e)
  
  Calculates the SHA-1 digest of a binary column and returns the value as a 40 character hex string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- sha2
  
  public static Column sha2(Column e, int numBits)
  
  Calculates the SHA-2 family of hash functions of a binary column and returns the value as a hex string.
  
  Parameters:
  
  e - column to compute SHA-2 on.
  
  numBits - one of 224, 256, 384, or 512.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- crc32
  
  public static Column crc32(Column e)
  
  Calculates the cyclic redundancy check value (CRC32) of a binary column and returns the value as a bigint.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
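
The output shapes described for md5, sha1, sha2 and crc32 (hex strings of fixed length, and a non-negative integer for CRC32) can be reproduced in plain Java with `java.security.MessageDigest` and `java.util.zip.CRC32`. This sketch does not call Spark; the class name `DigestSketch` and its helper methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

class DigestSketch {
    // Hashes the bytes with the named JCA algorithm ("MD5", "SHA-1", "SHA-256", ...) and hex-encodes the digest.
    static String hex(String algorithm, byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm).digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // CRC32 fits in 32 bits but is returned as a non-negative long.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "Spark".getBytes(StandardCharsets.UTF_8);
        System.out.println("md5    " + hex("MD5", data));
        System.out.println("sha1   " + hex("SHA-1", data));
        System.out.println("sha256 " + hex("SHA-256", data));
        System.out.println("crc32  " + crc32(data));
    }
}
```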
- hash
  
  public static Column hash(scala.collection.immutable.Seq<Column> cols)
  
  Calculates the hash code of given columns, and returns the result as an int column.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- xxhash64
  
  public static Column xxhash64(scala.collection.immutable.Seq<Column> cols)
  
  Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column. The hash computation uses an initial seed of 42.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- assert_true
  
  public static Column assert_true(Column c)
  
  Returns null if the condition is true, and throws an exception otherwise.
  
  Parameters:
  
  c - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.1.0
- assert_true
  
  public static Column assert_true(Column c, Column e)
  
  Returns null if the condition is true; throws an exception with the error message otherwise.
  
  Parameters:
  
  c - (undocumented)
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.1.0
- raise_error
  
  public static Column raise_error(Column c)
  
  Throws an exception with the provided error message.
  
  Parameters:
  
  c - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.1.0
- hll_sketch_estimate
  
  public static Column hll_sketch_estimate(Column c)
  
  Returns the estimated number of unique values given the binary representation of a Datasketches HllSketch.
  
  Parameters:
  
  c - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_sketch_estimate
  
  public static Column hll_sketch_estimate(String columnName)
  
  Returns the estimated number of unique values given the binary representation of a Datasketches HllSketch.
  
  Parameters:
  
  columnName - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_union
  
  public static Column hll_union(Column c1, Column c2)
  
  Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Throws an exception if sketches have different lgConfigK values.
  
  Parameters:
  
  c1 - (undocumented)
  
  c2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_union
  
  public static Column hll_union(String columnName1, String columnName2)
  
  Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Throws an exception if sketches have different lgConfigK values.
  
  Parameters:
  
  columnName1 - (undocumented)
  
  columnName2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_union
  
  public static Column hll_union(Column c1, Column c2, boolean allowDifferentLgConfigK)
  
  Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Throws an exception if sketches have different lgConfigK values and allowDifferentLgConfigK is set to false.
  
  Parameters:
  
  c1 - (undocumented)
  
  c2 - (undocumented)
  
  allowDifferentLgConfigK - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- hll_union
  
  public static Column hll_union(String columnName1, String columnName2, boolean allowDifferentLgConfigK)
  
  Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Throws an exception if sketches have different lgConfigK values and allowDifferentLgConfigK is set to false.
  
  Parameters:
  
  columnName1 - (undocumented)
  
  columnName2 - (undocumented)
  
  allowDifferentLgConfigK - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- user
  
  public static Column user()
  
  Returns the user name of the current execution context.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- session_user
  
  public static Column session_user()
  
  Returns the user name of the current execution context.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- uuid
  
  public static Column uuid()
  
  Returns a universally unique identifier (UUID) string. The value is returned as a canonical UUID 36-character string.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- aes_encrypt
  
  public static Column aes_encrypt(Column input, Column key, Column mode, Column padding, Column iv, Column aad)
  
  Returns an encrypted value of input using AES in the given mode with the specified padding. Key lengths of 16, 24 and 32 bytes are supported. Supported combinations of (mode, padding) are ('ECB', 'PKCS'), ('GCM', 'NONE') and ('CBC', 'PKCS'). Optional initialization vectors (IVs) are only supported for CBC and GCM modes. These must be 16 bytes for CBC and 12 bytes for GCM. If not provided, a random vector will be generated and prepended to the output. Optional additional authenticated data (AAD) is only supported for GCM. If provided for encryption, the identical AAD value must be provided for decryption. The default mode is GCM.
  
  Parameters:
  
  input - The binary value to encrypt.
  
  key - The passphrase to use to encrypt the data.
  
  mode - Specifies which block cipher mode should be used to encrypt messages. Valid modes: ECB, GCM, CBC.
  
  padding - Specifies how to pad messages whose length is not a multiple of the block size. Valid values: PKCS, NONE, DEFAULT. The DEFAULT padding means PKCS for ECB, NONE for GCM and PKCS for CBC.
  
  iv - Optional initialization vector. Only supported for CBC and GCM modes. Valid values: None or "". 16-byte array for CBC mode. 12-byte array for GCM mode.
  
  aad - Optional additional authenticated data. Only supported for GCM mode. This can be any free-form input and must be provided for both encryption and decryption.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- aes_encrypt
  
  public static Column aes_encrypt(Column input, Column key, Column mode, Column padding, Column iv)
  
  Returns an encrypted value of input.
  
  Parameters:
  
  input - (undocumented)
  
  key - (undocumented)
  
  mode - (undocumented)
  
  padding - (undocumented)
  
  iv - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  See Also:
  
  org.apache.spark.sql.functions.aes_encrypt(Column, Column, Column, Column, Column, Column)
- aes_encrypt
  
  public static Column aes_encrypt(Column input, Column key, Column mode, Column padding)
  
  Returns an encrypted value of input.
  
  Parameters:
  
  input - (undocumented)
  
  key - (undocumented)
  
  mode - (undocumented)
  
  padding - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  See Also:
  
  org.apache.spark.sql.functions.aes_encrypt(Column, Column, Column, Column, Column, Column)
- aes_encrypt
  
  public static Column aes_encrypt(Column input, Column key, Column mode)
  
  Returns an encrypted value of input.
  
  Parameters:
  
  input - (undocumented)
  
  key - (undocumented)
  
  mode - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  See Also:
  
  org.apache.spark.sql.functions.aes_encrypt(Column, Column, Column, Column, Column, Column)
- aes_encrypt
  
  public static Column aes_encrypt(Column input, Column key)
  
  Returns an encrypted value of input.
  
  Parameters:
  
  input - (undocumented)
  
  key - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  See Also:
  
  org.apache.spark.sql.functions.aes_encrypt(Column, Column, Column, Column, Column, Column)
- aes_decrypt
  
  public static Column aes_decrypt(Column input, Column key, Column mode, Column padding, Column aad)
  
  Returns a decrypted value of input using AES in the given mode with the specified padding. Key lengths of 16, 24 and 32 bytes are supported. Supported combinations of (mode, padding) are ('ECB', 'PKCS'), ('GCM', 'NONE') and ('CBC', 'PKCS'). Optional additional authenticated data (AAD) is only supported for GCM. If provided for encryption, the identical AAD value must be provided for decryption. The default mode is GCM.
  
  Parameters:
  
  input - The binary value to decrypt.
  
  key - The passphrase to use to decrypt the data.
  
  mode - Specifies which block cipher mode should be used to decrypt messages. Valid modes: ECB, GCM, CBC.
  
  padding - Specifies how to pad messages whose length is not a multiple of the block size. Valid values: PKCS, NONE, DEFAULT. The DEFAULT padding means PKCS for ECB, NONE for GCM and PKCS for CBC.
  
  aad - Optional additional authenticated data. Only supported for GCM mode. This can be any free-form input and must be provided for both encryption and decryption.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- aes_decrypt
  
  public static Column aes_decrypt(Column input, Column key, Column mode, Column padding)
  
  Returns a decrypted value of input.
  
  Parameters:
  
  input - (undocumented)
  
  key - (undocumented)
  
  mode - (undocumented)
  
  padding - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  See Also:
  
  org.apache.spark.sql.functions.aes_decrypt(Column, Column, Column, Column, Column)
- aes_decrypt
  
  public static Column aes_decrypt(Column input, Column key, Column mode)
  
  Returns a decrypted value of input.
  
  Parameters:
  
  input - (undocumented)
  
  key - (undocumented)
  
  mode - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  See Also:
  
  org.apache.spark.sql.functions.aes_decrypt(Column, Column, Column, Column, Column)
- aes_decrypt
  
  public static Column aes_decrypt(Column input, Column key)
  
  Returns a decrypted value of input.
  
  Parameters:
  
  input - (undocumented)
  
  key - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  See Also:
  
  org.apache.spark.sql.functions.aes_decrypt(Column, Column, Column, Column, Column)
- try_aes_decrypt
  
  public static Column try_aes_decrypt(Column input, Column key, Column mode, Column padding, Column aad)
  
  This is a special version of aes_decrypt that performs the same operation, but returns a NULL value instead of raising an error if the decryption cannot be performed.
  
  Parameters:
  
  input - The binary value to decrypt.
  
  key - The passphrase to use to decrypt the data.
  
  mode - Specifies which block cipher mode should be used to decrypt messages. Valid modes: ECB, GCM, CBC.
  
  padding - Specifies how to pad messages whose length is not a multiple of the block size. Valid values: PKCS, NONE, DEFAULT. The DEFAULT padding means PKCS for ECB, NONE for GCM and PKCS for CBC.
  
  aad - Optional additional authenticated data. Only supported for GCM mode. This can be any free-form input and must be provided for both encryption and decryption.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- try_aes_decrypt
  
  public static Column try_aes_decrypt(Column input, Column key, Column mode, Column padding)
  
  Returns a decrypted value of input, or NULL if the decryption cannot be performed.
  
  Parameters:
  
  input - (undocumented)
  
  key - (undocumented)
  
  mode - (undocumented)
  
  padding - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  See Also:
  
  org.apache.spark.sql.functions.try_aes_decrypt(Column, Column, Column, Column, Column)
- try_aes_decrypt
  
  public static Column try_aes_decrypt(Column input, Column key, Column mode)
  
  Returns a decrypted value of input, or NULL if the decryption cannot be performed.
  
  Parameters:
  
  input - (undocumented)
  
  key - (undocumented)
  
  mode - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  See Also:
  
  org.apache.spark.sql.functions.try_aes_decrypt(Column, Column, Column, Column, Column)
- try_aes_decrypt
  
  public static Column try_aes_decrypt(Column input, Column key)
  
  Returns a decrypted value of input, or NULL if the decryption cannot be performed.
  
  Parameters:
  
  input - (undocumented)
  
  key - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  See Also:
  
  org.apache.spark.sql.functions.try_aes_decrypt(Column, Column, Column, Column, Column)
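
To make the GCM case of aes_encrypt, aes_decrypt and try_aes_decrypt more concrete, the sketch below performs an AES/GCM round trip with the standard `javax.crypto` API: a random 12-byte IV is generated and prepended to the output, the same AAD must be supplied on both sides, and a "try" variant returns null instead of failing. It is a plain-Java illustration, not Spark's implementation; the class name `AesGcmSketch`, the 128-bit tag length, and the exact byte layout are assumptions.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class AesGcmSketch {
    private static final int IV_BYTES = 12;   // GCM IV length named in the text above
    private static final int TAG_BITS = 128;  // assumption: full-length authentication tag

    // Output layout in this sketch: IV || ciphertext || tag.
    static byte[] encrypt(byte[] input, byte[] key, byte[] aad) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, iv));
            cipher.updateAAD(aad);
            byte[] sealed = cipher.doFinal(input);
            byte[] out = new byte[IV_BYTES + sealed.length];
            System.arraycopy(iv, 0, out, 0, IV_BYTES);
            System.arraycopy(sealed, 0, out, IV_BYTES, sealed.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the first 12 bytes; fails if the key, AAD, or ciphertext do not match.
    static byte[] decrypt(byte[] input, byte[] key, byte[] aad) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, input, 0, IV_BYTES));
        cipher.updateAAD(aad);
        return cipher.doFinal(input, IV_BYTES, input.length - IV_BYTES);
    }

    // "try" variant: null instead of an exception when decryption cannot be performed.
    static byte[] tryDecrypt(byte[] input, byte[] key, byte[] aad) {
        try {
            return decrypt(input, key, aad);
        } catch (GeneralSecurityException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] key = "0123456789abcdef".getBytes();  // 16 bytes = 128-bit key
        byte[] aad = "context".getBytes();
        byte[] sealed = encrypt("Spark".getBytes(), key, aad);
        System.out.println(new String(tryDecrypt(sealed, key, aad)));
        System.out.println(tryDecrypt(sealed, key, "other".getBytes()));
    }
}
```

The key here is 16 bytes long, that is, a 128-bit AES key; 24-byte and 32-byte keys select AES-192 and AES-256 in the same API.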
- sha
  
  public static Column sha(Column col)
  
  Returns a sha1 hash value as a hex string of the col.
  
  Parameters:
  
  col - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- input_file_block_length
  
  public static Column input_file_block_length()
  
  Returns the length of the block being read, or -1 if not available.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- input_file_block_start
  
  public static Column input_file_block_start()
  
  Returns the start offset of the block being read, or -1 if not available.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- reflect
  
  public static Column reflect(scala.collection.immutable.Seq<Column> cols)
  
  Calls a method with reflection.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- java_method
  
  public static Column java_method(scala.collection.immutable.Seq<Column> cols)
  
  Calls a method with reflection.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- try_reflect
  
  public static Column try_reflect(scala.collection.immutable.Seq<Column> cols)
  
  This is a special version of reflect that performs the same operation, but returns a NULL value instead of raising an error if the invoked method throws an exception.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
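
The difference between reflect and try_reflect can be pictured with ordinary Java reflection: look up a static method by class and method name, invoke it, and in the "try" variant turn a failure into null. The sketch below is plain Java and handles only public static methods with String parameters; the class name `ReflectSketch` and that restriction are assumptions for illustration.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

class ReflectSketch {
    // Invokes a public static method whose parameters are all Strings.
    static Object reflect(String className, String methodName, String... args)
            throws ReflectiveOperationException {
        Class<?>[] types = new Class<?>[args.length];
        Arrays.fill(types, String.class);
        Method method = Class.forName(className).getMethod(methodName, types);
        return method.invoke(null, (Object[]) args);
    }

    // Same call, but lookup failures and exceptions thrown by the target method become null.
    static Object tryReflect(String className, String methodName, String... args) {
        try {
            return reflect(className, methodName, args);
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryReflect("java.lang.Integer", "parseInt", "42"));
        System.out.println(tryReflect("java.lang.Integer", "parseInt", "abc"));
    }
}
```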
- version
  
  public static Column version()
  
  Returns the Spark version. The string contains 2 fields, the first being a release version and the second being a git revision.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- typeof
  
  public static Column typeof(Column col)
  
  Returns a DDL-formatted type string for the data type of the input.
  
  Parameters:
  
  col - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- stack
  
  public static Column stack(scala.collection.immutable.Seq<Column> cols)
  
  Separates col1, ..., colk into n rows. Uses column names col0, col1, etc. by default unless specified otherwise.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
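
The reshaping that stack performs can be sketched in plain Java as splitting k values row by row into n rows. The sketch assumes the row count n is given first, that each row has ceil(k / n) columns, and that missing trailing cells are filled with null; the class name `StackSketch` is hypothetical and none of this is Spark code.

```java
import java.util.ArrayList;
import java.util.List;

class StackSketch {
    // Splits the values into n rows, filling each row left to right and padding the last row with nulls.
    static List<List<Object>> stack(int n, Object... values) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        int width = (values.length + n - 1) / n;  // ceil(k / n)
        List<List<Object>> rows = new ArrayList<>();
        for (int r = 0; r < n; r++) {
            List<Object> row = new ArrayList<>();
            for (int c = 0; c < width; c++) {
                int i = r * width + c;
                row.add(i < values.length ? values[i] : null);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(stack(2, 1, 2, 3));
    }
}
```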
- random
  
  public static Column random(Column seed)
  
  Returns a random value drawn independently from the uniform distribution on [0, 1), so that values are independent and identically distributed (i.i.d.).
  
  Parameters:
  
  seed - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- random
  
  public static Column random()
  
  Returns a random value drawn independently from the uniform distribution on [0, 1), so that values are independent and identically distributed (i.i.d.).
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bitmap_bit_position
  
  public static Column bitmap_bit_position(Column col)
  
  Returns the bit position for the given input column.
  
  Parameters:
  
  col - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bitmap_bucket_number
  
  public static Column bitmap_bucket_number(Column col)
  
  Returns the bucket number for the given input column.
  
  Parameters:
  
  col - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bitmap_construct_agg
  
  public static Column bitmap_construct_agg(Column col)
  
  Returns a bitmap with the positions of the bits set from all the values from the input column. The input column will most likely be bitmap_bit_position().
  
  Parameters:
  
  col - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bitmap_count
  
  public static Column bitmap_count(Column col)
  
  Returns the number of set bits in the input bitmap.
  
  Parameters:
  
  col - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bitmap_or_agg
  
  public static Column bitmap_or_agg(Column col)
  
  Returns a bitmap that is the bitwise OR of all of the bitmaps from the input column. The input column should be bitmaps created from bitmap_construct_agg().
  
  Parameters:
  
  col - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
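
The three bitmap aggregates above (build a bitmap from bit positions, count set bits, OR bitmaps together) have direct analogues in `java.util.BitSet`. The sketch below is plain Java and only illustrates the set semantics; the class name `BitmapSketch` is hypothetical and the binary layout Spark uses for its bitmaps is not modeled.

```java
import java.util.BitSet;

class BitmapSketch {
    // Sets one bit per position; repeated positions have no extra effect.
    static BitSet construct(int... positions) {
        BitSet bitmap = new BitSet();
        for (int p : positions) {
            bitmap.set(p);
        }
        return bitmap;
    }

    // Number of set bits.
    static int count(BitSet bitmap) {
        return bitmap.cardinality();
    }

    // Bitwise OR of all bitmaps.
    static BitSet orAgg(BitSet... bitmaps) {
        BitSet result = new BitSet();
        for (BitSet b : bitmaps) {
            result.or(b);
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet a = construct(1, 3, 3, 7);
        BitSet b = construct(3, 5);
        System.out.println(count(a));
        System.out.println(orAgg(a, b));
    }
}
```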
- ascii
  
  public static Column ascii(Column e)
  
  Computes the numeric value of the first character of the string column, and returns the result as an int column.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- base64
  
  public static Column base64(Column e)
  
  Computes the BASE64 encoding of a binary column and returns it as a string column. This is the reverse of unbase64.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
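
The binary-to-string direction of base64 and its reverse can be sketched with `java.util.Base64`. This is a plain-Java illustration using the standard alphabet; the class name `Base64Sketch` is hypothetical, and details such as line wrapping in Spark's output are not covered.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64Sketch {
    // Binary in, text out.
    static String base64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Text in, binary out: the reverse direction.
    static byte[] unbase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        String encoded = base64("Spark SQL".getBytes(StandardCharsets.UTF_8));
        System.out.println(encoded);
        System.out.println(new String(unbase64(encoded), StandardCharsets.UTF_8));
    }
}
```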
- bit_length
  
  public static Column bit_length(Column e)
  
  Calculates the bit length for the specified string column.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.3.0
- concat_ws
  
  public static Column concat_ws(String sep, scala.collection.immutable.Seq<Column> exprs)
  
  Concatenates multiple input string columns together into a single string column, using the given separator.
  
  Parameters:
  
  sep - (undocumented)
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
  
  Note:
  
  Input strings which are null are skipped.
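
The null-skipping behavior noted above can be sketched in plain Java with a stream that filters out nulls before joining. The class name `ConcatWsSketch` is hypothetical and the sketch works on Java strings rather than columns.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.stream.Collectors;

class ConcatWsSketch {
    // Joins the non-null parts with the separator; null parts are skipped entirely.
    static String concatWs(String sep, String... parts) {
        return Arrays.stream(parts)
                .filter(Objects::nonNull)
                .collect(Collectors.joining(sep));
    }

    public static void main(String[] args) {
        System.out.println(concatWs("-", "a", null, "b"));
    }
}
```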
- decode
  
  public static Column decode(Column value, String charset)
  
  Decodes the first argument from binary into a string using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'). If either argument is null, the result will also be null.
  
  Parameters:
  
  value - (undocumented)
  
  charset - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- encode
  
  public static Column encode(Column value, String charset)
  
  Encodes the first argument from a string into binary using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'). If either argument is null, the result will also be null.
  
  Parameters:
  
  value - (undocumented)
  
  charset - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
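
The relationship between encode and decode can be sketched with the JDK's own charset support, which accepts the same six charset names. The class `CharsetRoundTripSketch` and its methods are hypothetical stand-ins operating on plain Java values, not the Spark column functions themselves.

```java
import java.nio.charset.Charset;

// Hypothetical plain-Java analogue of encode/decode; not Spark's implementation.
final class CharsetRoundTripSketch {
    // String -> bytes in the named charset; a null argument yields null.
    static byte[] encode(String value, String charset) {
        if (value == null || charset == null) return null;
        return value.getBytes(Charset.forName(charset));
    }

    // Bytes -> string in the named charset; a null argument yields null.
    static String decode(byte[] value, String charset) {
        if (value == null || charset == null) return null;
        return new String(value, Charset.forName(charset));
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // contains one non-ASCII character
        System.out.println(encode(text, "UTF-8").length);      // 6 bytes
        System.out.println(encode(text, "ISO-8859-1").length); // 5 bytes
        System.out.println(decode(encode(text, "UTF-16BE"), "UTF-16BE").equals(text));
    }
}
```

The same string occupies a different number of bytes depending on the charset, which is why the charset passed to decode should match the one used to encode.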
- format_number
  
  public static Column format_number(Column x, int d)
  
  Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places with HALF_EVEN round mode, and returns the result as a string column.
  If d is 0, the result has no decimal point or fractional part. If d is less than 0, the result will be null.
  
  Parameters:
  
  x - (undocumented)
  
  d - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
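
The rounding and grouping rules can be reproduced approximately with `java.text.DecimalFormat`. This is a sketch of the documented behavior on plain `BigDecimal` values, not Spark's code; `FormatNumberSketch` and `formatNumber` are hypothetical names, the US locale is an assumption made to get `,` and `.` as separators, and `String.repeat` requires Java 11 or later.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical plain-Java analogue of format_number.
final class FormatNumberSketch {
    static String formatNumber(BigDecimal x, int d) {
        // A negative d yields null, as documented above.
        if (x == null || d < 0) return null;
        // d == 0 produces no decimal point or fractional part.
        String pattern = d == 0 ? "#,##0" : "#,##0." + "0".repeat(d);
        DecimalFormat fmt = new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US));
        fmt.setRoundingMode(RoundingMode.HALF_EVEN);
        return fmt.format(x);
    }

    public static void main(String[] args) {
        System.out.println(formatNumber(new BigDecimal("1234567.891"), 2)); // 1,234,567.89
        System.out.println(formatNumber(new BigDecimal("2.5"), 0));         // 2 (ties go to the even neighbor)
        System.out.println(formatNumber(new BigDecimal("3.5"), 0));         // 4
    }
}
```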
- format_string
  
  public static Column format_string(String format, scala.collection.immutable.Seq<Column> arguments)
  
  Formats the arguments in printf-style and returns the result as a string column.
  
  Parameters:
  
  format - (undocumented)
  
  arguments - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- initcap
  
  public static Column initcap(Column e)
  
  Returns a new string column by converting the first letter of each word to uppercase. Words are delimited by whitespace.
  For example, "hello world" will become "Hello World".
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- instr
  
  public static Column instr(Column str, String substring)
  
  Locates the position of the first occurrence of substring in the given string column. Returns null if either argument is null.
  
  Parameters:
  
  str - (undocumented)
  
  substring - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
  
  Note:
  
  The position is 1-based, not 0-based. Returns 0 if substring could not be found in str.
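
The 1-based convention differs from Java's own `String.indexOf`, which is 0-based and returns -1 when nothing is found. A minimal sketch of the mapping follows; `InstrSketch` is a hypothetical helper on plain strings, and it counts UTF-16 code units, which may differ from Spark's character counting for supplementary characters.

```java
// Hypothetical plain-Java analogue of instr.
final class InstrSketch {
    // Returns null for a null argument, 0 when not found, else the 1-based position.
    static Integer instr(String str, String substring) {
        if (str == null || substring == null) return null;
        // indexOf yields -1 when absent, so adding 1 gives the documented 0.
        return str.indexOf(substring) + 1;
    }

    public static void main(String[] args) {
        System.out.println(instr("SparkSQL", "SQL")); // 6
        System.out.println(instr("SparkSQL", "xyz")); // 0
        System.out.println(instr(null, "SQL"));       // null
    }
}
```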
- length
  
  public static Column length(Column e)
  
  Computes the character length of a given string or number of bytes of a binary string. The length of character strings includes the trailing spaces. The length of binary strings includes binary zeros.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- len
  
  public static Column len(Column e)
  
  Computes the character length of a given string or number of bytes of a binary string. The length of character strings includes the trailing spaces. The length of binary strings includes binary zeros.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- lower
  
  public static Column lower(Column e)
  
  Converts a string column to lower case.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- levenshtein
  
  public static Column levenshtein(Column l, Column r, int threshold)
  
  Computes the Levenshtein distance of the two given string columns if it's less than or equal to a given threshold.
  
  Parameters:
  
  l - (undocumented)
  
  r - (undocumented)
  
  threshold - (undocumented)
  
  Returns:
  
  result distance, or -1
  
  Since:
  
  3.5.0
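
The threshold variant can be understood as the ordinary edit distance followed by a cutoff. The sketch below uses the classic two-row dynamic-programming algorithm on plain strings; `LevenshteinSketch` is a hypothetical name, and Spark may well use a different, threshold-aware algorithm internally.

```java
// Hypothetical plain-Java analogue of levenshtein, with and without a threshold.
final class LevenshteinSketch {
    // Classic dynamic-programming edit distance, keeping only two rows.
    static int distance(String l, String r) {
        int[] prev = new int[r.length() + 1];
        int[] curr = new int[r.length() + 1];
        for (int j = 0; j <= r.length(); j++) prev[j] = j;
        for (int i = 1; i <= l.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= r.length(); j++) {
                int cost = l.charAt(i - 1) == r.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[r.length()];
    }

    // Returns the distance if it is at most threshold, otherwise -1.
    static int distance(String l, String r, int threshold) {
        int d = distance(l, r);
        return d <= threshold ? d : -1;
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting"));    // 3
        System.out.println(distance("kitten", "sitting", 2)); // -1
    }
}
```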
- levenshtein
  
  public static Column levenshtein(Column l, Column r)
  
  Computes the Levenshtein distance of the two given string columns.
  
  Parameters:
  
  l - (undocumented)
  
  r - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- locate
  
  public static Column locate(String substr, Column str)
  
  Locates the position of the first occurrence of substr in the string column str.
  
  Parameters:
  
  substr - (undocumented)
  
  str - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
  
  Note:
  
  The position is 1-based, not 0-based. Returns 0 if substr could not be found in str.
- locate
  
  public static Column locate(String substr, Column str, int pos)
  
  Locate the position of the first occurrence of substr in a string column, after position pos.
  
  Parameters:
  
  substr - (undocumented)
  
  str - (undocumented)
  
  pos - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
  
  Note:
  
  The position is 1-based, not 0-based. Returns 0 if substr could not be found in str.
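
A rough plain-Java picture of the three-argument form is given below. It assumes that pos is at least 1 and that the search includes position pos itself; both are assumptions of this sketch, and `LocateSketch` is a hypothetical name rather than part of the API.

```java
// Hypothetical plain-Java analogue of locate(substr, str, pos).
final class LocateSketch {
    // pos is 1-based; the result is 1-based, or 0 when substr is not found.
    // Assumes pos >= 1 and that the search starts at pos inclusive.
    static int locate(String substr, String str, int pos) {
        return str.indexOf(substr, pos - 1) + 1;
    }

    public static void main(String[] args) {
        System.out.println(locate("bar", "foobarbar", 1)); // 4
        System.out.println(locate("bar", "foobarbar", 5)); // 7
        System.out.println(locate("baz", "foobarbar", 1)); // 0
    }
}
```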
- lpad
  
  public static Column lpad(Column str, int len, String pad)
  
  Left-pad the string column with pad to a length of len. If the string column is longer than len, the return value is shortened to len characters.
  
  Parameters:
  
  str - (undocumented)
  
  len - (undocumented)
  
  pad - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
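
Padding and truncation can be pictured with the following plain-Java sketch, which also includes the mirrored right-pad for comparison. `PadSketch` is a hypothetical helper; returning the input unchanged for an empty pad and an empty string for a non-positive len are assumptions of the sketch, not documented behavior.

```java
// Hypothetical plain-Java analogue of lpad/rpad on strings.
final class PadSketch {
    // Builds exactly `needed` characters by cycling through pad.
    private static String fill(String pad, int needed) {
        StringBuilder sb = new StringBuilder();
        while (sb.length() < needed) sb.append(pad);
        return sb.substring(0, needed);
    }

    static String lpad(String str, int len, String pad) {
        if (len <= 0) return "";                                // assumption
        if (str.length() >= len) return str.substring(0, len);  // shortened to len characters
        if (pad.isEmpty()) return str;                          // assumption
        return fill(pad, len - str.length()) + str;
    }

    static String rpad(String str, int len, String pad) {
        if (len <= 0) return "";
        if (str.length() >= len) return str.substring(0, len);
        if (pad.isEmpty()) return str;
        return str + fill(pad, len - str.length());
    }

    public static void main(String[] args) {
        System.out.println(lpad("hi", 5, "ab"));   // abahi
        System.out.println(rpad("hi", 5, "ab"));   // hiaba
        System.out.println(lpad("hello", 3, "x")); // hel
    }
}
```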
- lpad
  
  public static Column lpad(Column str, int len, byte[] pad)
  
  Left-pad the binary column with pad to a byte length of len. If the binary column is longer than len, the return value is shortened to len bytes.
  
  Parameters:
  
  str - (undocumented)
  
  len - (undocumented)
  
  pad - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.3.0
- ltrim
  
  public static Column ltrim(Column e)
  
  Trim the spaces from left end for the specified string value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- ltrim
  
  public static Column ltrim(Column e, String trimString)
  
  Trim the specified character string from left end for the specified string column.
  
  Parameters:
  
  e - (undocumented)
  
  trimString - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- octet_length
  
  public static Column octet_length(Column e)
  
  Calculates the byte length for the specified string column.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.3.0
- collate
  
  public static Column collate(Column e, String collation)
  
  Marks a given column with specified collation.
  
  Parameters:
  
  e - (undocumented)
  
  collation - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- collation
  
  public static Column collation(Column e)
  
  Returns the collation name of a given column.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- rlike
  
  public static Column rlike(Column str, Column regexp)
  
  Returns true if str matches regexp, or false otherwise.
  
  Parameters:
  
  str - (undocumented)
  
  regexp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regexp
  
  public static Column regexp(Column str, Column regexp)
  
  Returns true if str matches regexp, or false otherwise.
  
  Parameters:
  
  str - (undocumented)
  
  regexp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regexp_like
  
  public static Column regexp_like(Column str, Column regexp)
  
  Returns true if str matches regexp, or false otherwise.
  
  Parameters:
  
  str - (undocumented)
  
  regexp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regexp_count
  
  public static Column regexp_count(Column str, Column regexp)
  
  Returns a count of the number of times that the regular expression pattern regexp is matched in the string str.
  
  Parameters:
  
  str - (undocumented)
  
  regexp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regexp_extract
  
  public static Column regexp_extract(Column e, String exp, int groupIdx)
  
  Extracts a specific group matched by a Java regex from the specified string column. If the regex did not match, or the specified group did not match, an empty string is returned. If the specified group index exceeds the group count of the regex, an IllegalArgumentException will be thrown.
  
  Parameters:
  
  e - (undocumented)
  
  exp - (undocumented)
  
  groupIdx - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
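
Because the pattern is a Java regex, the three documented outcomes (matched group, empty string, exception) can be sketched directly with `java.util.regex`. `RegexpExtractSketch` is a hypothetical helper on plain strings and is not how Spark evaluates the expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical plain-Java analogue of regexp_extract.
final class RegexpExtractSketch {
    static String regexpExtract(String input, String regex, int groupIdx) {
        Matcher m = Pattern.compile(regex).matcher(input);
        // groupCount() depends only on the pattern, so it can be checked before matching.
        if (groupIdx > m.groupCount()) {
            throw new IllegalArgumentException("group index " + groupIdx + " exceeds group count " + m.groupCount());
        }
        if (!m.find()) return "";            // the regex did not match
        String group = m.group(groupIdx);
        return group == null ? "" : group;   // the group did not take part in the match
    }

    public static void main(String[] args) {
        System.out.println(regexpExtract("100-200", "(\\d+)-(\\d+)", 1)); // 100
        System.out.println(regexpExtract("abc", "(\\d+)", 1).isEmpty());  // true
    }
}
```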
- regexp_extract_all
  
  public static Column regexp_extract_all(Column str, Column regexp)
  
  Extracts all strings in str that match the regexp expression and correspond to the first regex group index.
  
  Parameters:
  
  str - (undocumented)
  
  regexp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regexp_extract_all
  
  public static Column regexp_extract_all(Column str, Column regexp, Column idx)
  
  Extracts all strings in str that match the regexp expression and correspond to the regex group index idx.
  
  Parameters:
  
  str - (undocumented)
  
  regexp - (undocumented)
  
  idx - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regexp_replace
  
  public static Column regexp_replace(Column e, String pattern, String replacement)
  
  Replaces all substrings of the specified string value that match pattern with replacement.
  
  Parameters:
  
  e - (undocumented)
  
  pattern - (undocumented)
  
  replacement - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- regexp_replace
  
  public static Column regexp_replace(Column e, Column pattern, Column replacement)
  
  Replaces all substrings of the specified string value that match pattern with replacement.
  
  Parameters:
  
  e - (undocumented)
  
  pattern - (undocumented)
  
  replacement - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- regexp_substr
  
  public static Column regexp_substr(Column str, Column regexp)
  
  Returns the substring that matches the regular expression regexp within the string str. If the regular expression is not found, the result is null.
  
  Parameters:
  
  str - (undocumented)
  
  regexp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regexp_instr
  
  public static Column regexp_instr(Column str, Column regexp)
  
  Searches a string for a regular expression and returns an integer that indicates the beginning position of the matched substring. Positions are 1-based, not 0-based. If no match is found, returns 0.
  
  Parameters:
  
  str - (undocumented)
  
  regexp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- regexp_instr
  
  public static Column regexp_instr(Column str, Column regexp, Column idx)
  
  Searches a string for a regular expression and returns an integer that indicates the beginning position of the matched substring. Positions are 1-based, not 0-based. If no match is found, returns 0.
  
  Parameters:
  
  str - (undocumented)
  
  regexp - (undocumented)
  
  idx - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- unbase64
  
  public static Column unbase64(Column e)
  
  Decodes a BASE64 encoded string column and returns it as a binary column. This is the reverse of base64.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- rpad
  
  public static Column rpad(Column str, int len, String pad)
  
  Right-pad the string column with pad to a length of len. If the string column is longer than len, the return value is shortened to len characters.
  
  Parameters:
  
  str - (undocumented)
  
  len - (undocumented)
  
  pad - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- rpad
  
  public static Column rpad(Column str, int len, byte[] pad)
  
  Right-pad the binary column with pad to a byte length of len. If the binary column is longer than len, the return value is shortened to len bytes.
  
  Parameters:
  
  str - (undocumented)
  
  len - (undocumented)
  
  pad - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.3.0
- repeat
  
  public static Column repeat(Column str, int n)
  
  Repeats a string column n times, and returns it as a new string column.
  
  Parameters:
  
  str - (undocumented)
  
  n - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- repeat
  
  public static Column repeat(Column str, Column n)
  
  Repeats a string column n times, and returns it as a new string column.
  
  Parameters:
  
  str - (undocumented)
  
  n - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- rtrim
  
  public static Column rtrim(Column e)
  
  Trim the spaces from right end for the specified string value.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- rtrim
  
  public static Column rtrim(Column e, String trimString)
  
  Trim the specified character string from right end for the specified string column.
  
  Parameters:
  
  e - (undocumented)
  
  trimString - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- soundex
  
  public static Column soundex(Column e)
  
  Returns the soundex code for the specified expression.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- split
  
  public static Column split(Column str, String pattern)
  
  Splits str around matches of the given pattern.
  
  Parameters:
  
  str - a string expression to split
  
  pattern - a string representing a regular expression. The regex string should be a Java regular expression.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- split
  
  public static Column split(Column str, Column pattern)
  
  Splits str around matches of the given pattern.
  
  Parameters:
  
  str - a string expression to split
  
  pattern - a column of string representing a regular expression. The regex string should be a Java regular expression.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- split
  
  public static Column split(Column str, String pattern, int limit)
  
  Splits str around matches of the given pattern.
  
  Parameters:
  
  str - a string expression to split
  
  pattern - a string representing a regular expression. The regex string should be a Java regular expression.
  
  limit - an integer expression which controls the number of times the regex is applied.
  
  - limit greater than 0: the resulting array's length will not be more than limit, and the resulting array's last entry will contain all input beyond the last matched regex.
  - limit less than or equal to 0: the regex will be applied as many times as possible, and the resulting array can be of any size.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
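
The effect of limit closely resembles Java's `String.split(regex, limit)`, with one difference: Java treats a limit of 0 specially by dropping trailing empty strings, whereas the description above treats 0 like any negative limit. The sketch below maps 0 to -1 to reflect that; this mapping is an assumption of the sketch, and `SplitLimitSketch` is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical plain-Java analogue of split(str, pattern, limit).
final class SplitLimitSketch {
    static String[] split(String str, String pattern, int limit) {
        // Assumption: limit 0 behaves like a negative limit, so trailing empty strings are kept.
        return str.split(pattern, limit == 0 ? -1 : limit);
    }

    public static void main(String[] args) {
        // limit > 0: at most 2 entries, the last one holds the unsplit remainder.
        System.out.println(Arrays.toString(split("oneAtwoBthreeC", "[ABC]", 2)));  // [one, twoBthreeC]
        // limit <= 0: the regex is applied as many times as possible.
        System.out.println(Arrays.toString(split("oneAtwoBthreeC", "[ABC]", -1))); // [one, two, three, ]
    }
}
```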
- split
  
  public static Column split(Column str, Column pattern, Column limit)
  
  Splits str around matches of the given pattern.
  
  Parameters:
  
  str - a string expression to split
  
  pattern - a column of string representing a regular expression. The regex string should be a Java regular expression.
  
  limit - a column of integer expression which controls the number of times the regex is applied.
  
  - limit greater than 0: the resulting array's length will not be more than limit, and the resulting array's last entry will contain all input beyond the last matched regex.
  - limit less than or equal to 0: the regex will be applied as many times as possible, and the resulting array can be of any size.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- substring
  
  public static Column substring(Column str, int pos, int len)
  
  Returns the substring that starts at pos and is of length len when str is String type, or the slice of the byte array that starts at pos (in bytes) and is of length len when str is Binary type.
  
  Parameters:
  
  str - (undocumented)
  
  pos - (undocumented)
  
  len - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
  
  Note:
  
  The position is 1-based, not 0-based.
- substring_index
  
  public static Column substring_index(Column str, String delim, int count)
  
  Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. substring_index performs a case-sensitive match when searching for delim.
  
  Parameters:
  
  str - (undocumented)
  
  delim - (undocumented)
  
  count - (undocumented)
  
  Returns:
  
  (undocumented)
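
The positive and negative count cases can be traced with a small plain-Java sketch. `SubstringIndexSketch` is a hypothetical helper; returning an empty string for a count of 0 or an empty delimiter, and the whole string when there are fewer delimiters than requested, are assumptions of this sketch rather than documented guarantees.

```java
// Hypothetical plain-Java analogue of substring_index.
final class SubstringIndexSketch {
    static String substringIndex(String str, String delim, int count) {
        if (count == 0 || delim.isEmpty()) return ""; // assumption
        if (count > 0) {
            int idx = -1;
            // Walk forward to the count-th delimiter from the left.
            for (int i = 0; i < count; i++) {
                idx = str.indexOf(delim, idx + 1);
                if (idx < 0) return str; // fewer than count delimiters (assumption)
            }
            return str.substring(0, idx);
        }
        int idx = str.length();
        // Walk backward to the |count|-th delimiter from the right.
        for (int i = 0; i < -count; i++) {
            idx = str.lastIndexOf(delim, idx - 1);
            if (idx < 0) return str;
        }
        return str.substring(idx + delim.length());
    }

    public static void main(String[] args) {
        System.out.println(substringIndex("www.apache.org", ".", 2));  // www.apache
        System.out.println(substringIndex("www.apache.org", ".", -2)); // apache.org
    }
}
```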
- overlay
  
  public static Column overlay(Column src, Column replace, Column pos, Column len)
  
  Overlay the specified portion of src with replace, starting from byte position pos of src and proceeding for len bytes.
  
  Parameters:
  
  src - (undocumented)
  
  replace - (undocumented)
  
  pos - (undocumented)
  
  len - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- overlay
  
  public static Column overlay(Column src, Column replace, Column pos)
  
  Overlay the specified portion of src with replace, starting from byte position pos of src.
  
  Parameters:
  
  src - (undocumented)
  
  replace - (undocumented)
  
  pos - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- sentences
  
  public static Column sentences(Column string, Column language, Column country)
  
  Splits a string into arrays of sentences, where each sentence is an array of words.
  
  Parameters:
  
  string - (undocumented)
  
  language - (undocumented)
  
  country - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- sentences
  
  public static Column sentences(Column string)
  
  Splits a string into arrays of sentences, where each sentence is an array of words. The default locale is used.
  
  Parameters:
  
  string - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- translate
  
  public static Column translate(Column src, String matchingString, String replaceString)
  
  Replaces each character in src that appears in matchingString with the corresponding character in replaceString. Characters correspond by position: a character at a given position in matchingString is replaced by the character at the same position in replaceString.
  
  Parameters:
  
  src - (undocumented)
  
  matchingString - (undocumented)
  
  replaceString - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- trim
  
  public static Column trim(Column e)
  
  Trim the spaces from both ends for the specified string column.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- trim
  
  public static Column trim(Column e, String trimString)
  
  Trim the specified character from both ends for the specified string column.
  
  Parameters:
  
  e - (undocumented)
  
  trimString - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- upper
  
  public static Column upper(Column e)
  
  Converts a string column to upper case.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- to_binary
  
  public static Column to_binary(Column e, Column f)
  
  Converts the input e to a binary value based on the supplied format. The format can be a case-insensitive string literal of "hex", "utf-8", "utf8", or "base64". By default, the binary format for conversion is "hex" if format is omitted. The function returns NULL if at least one of the input parameters is NULL.
  
  Parameters:
  
  e - (undocumented)
  
  f - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_binary
  
  public static Column to_binary(Column e)
  
  Converts the input e to a binary value based on the default format "hex". The function returns NULL if at least one of the input parameters is NULL.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_char
  
  public static Column to_char(Column e, Column format)
  
  Convert e to a string based on the format. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive:
  
  - '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input value, generating a result string of the same length as the corresponding sequence in the format string. The result string is left-padded with zeros if the 0/9 sequence comprises more digits than the matching part of the decimal value, starts with 0, and is before the decimal point. Otherwise, it is padded with spaces.
  - '.' or 'D': Specifies the position of the decimal point (optional, only allowed once).
  - ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator.
  - '$': Specifies the location of the $ currency sign. This character may only be specified once.
  - 'S' or 'MI': Specifies the position of a '-' or '+' sign (optional, only allowed once at the beginning or end of the format string). Note that 'S' prints '+' for positive values but 'MI' prints a space.
  - 'PR': Only allowed at the end of the format string; specifies that the result string will be wrapped by angle brackets if the input value is negative.
  
  If e is a datetime, format shall be a valid datetime pattern, see Datetime Patterns. If e is a binary, it is converted to a string in one of the formats:
  
  - 'base64': a base 64 string.
  - 'hex': a string in the hexadecimal format.
  - 'utf-8': the input binary is decoded to UTF-8 string.
  
  Parameters:
  
  e - (undocumented)
  
  format - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_varchar
  
  public static Column to_varchar(Column e, Column format)
  
  Convert e to a string based on the format. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive:
  
  - '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input value, generating a result string of the same length as the corresponding sequence in the format string. The result string is left-padded with zeros if the 0/9 sequence comprises more digits than the matching part of the decimal value, starts with 0, and is before the decimal point. Otherwise, it is padded with spaces.
  - '.' or 'D': Specifies the position of the decimal point (optional, only allowed once).
  - ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator.
  - '$': Specifies the location of the $ currency sign. This character may only be specified once.
  - 'S' or 'MI': Specifies the position of a '-' or '+' sign (optional, only allowed once at the beginning or end of the format string). Note that 'S' prints '+' for positive values but 'MI' prints a space.
  - 'PR': Only allowed at the end of the format string; specifies that the result string will be wrapped by angle brackets if the input value is negative.
  
  If e is a datetime, format shall be a valid datetime pattern, see Datetime Patterns. If e is a binary, it is converted to a string in one of the formats:
  
  - 'base64': a base 64 string.
  - 'hex': a string in the hexadecimal format.
  - 'utf-8': the input binary is decoded to UTF-8 string.
  
  Parameters:
  
  e - (undocumented)
  
  format - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_number
  
  public static Column to_number(Column e, Column format)
  
  Convert string 'e' to a number based on the string format 'format'. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive:
  
  - '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input string. If the 0/9 sequence starts with 0 and is before the decimal point, it can only match a digit sequence of the same size. Otherwise, if the sequence starts with 9 or is after the decimal point, it can match a digit sequence that has the same or smaller size.
  - '.' or 'D': Specifies the position of the decimal point (optional, only allowed once).
  - ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator. 'e' must match the grouping separator relevant for the size of the number.
  - '$': Specifies the location of the $ currency sign. This character may only be specified once.
  - 'S' or 'MI': Specifies the position of a '-' or '+' sign (optional, only allowed once at the beginning or end of the format string). Note that 'S' allows '-' but 'MI' does not.
  - 'PR': Only allowed at the end of the format string; specifies that 'e' indicates a negative number with wrapping angled brackets.
  
  Parameters:
  
  e - (undocumented)
  
  format - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- replace
  
  public static Column replace(Column src, Column search, Column replace)
  
  Replaces all occurrences of search with replace.
  
  Parameters:
  
  src - A column of string to be replaced
  
  search - A column of string. If search is not found in src, src is returned unchanged.
  
  replace - A column of string. If replace is not specified or is an empty string, nothing replaces the string that is removed from src.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- replace
  
  public static Column replace(Column src, Column search)
  
  Removes all occurrences of search from src, that is, replaces them with an empty string.
  
  Parameters:
  
  src - A column of string to be replaced
  
  search - A column of string. If search is not found in src, src is returned unchanged.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- split_part
  
  public static Column split_part(Column str, Column delimiter, Column partNum)
  
  Splits str by delimiter and returns the requested part of the split (1-based). If any input is null, returns null. If partNum is out of range of split parts, returns an empty string. If partNum is 0, throws an error. If partNum is negative, the parts are counted backward from the end of the string. If the delimiter is an empty string, str is not split.
  
  Parameters:
  
  str - (undocumented)
  
  delimiter - (undocumented)
  
  partNum - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
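
The rules above translate into a few lines of plain Java. `SplitPartSketch` is a hypothetical helper, and `IllegalArgumentException` merely stands in for whatever error Spark actually raises when partNum is 0.

```java
import java.util.regex.Pattern;

// Hypothetical plain-Java analogue of split_part.
final class SplitPartSketch {
    static String splitPart(String str, String delimiter, int partNum) {
        if (str == null || delimiter == null) return null;
        if (partNum == 0) throw new IllegalArgumentException("partNum must not be 0"); // stand-in error
        // An empty delimiter leaves str unsplit; otherwise split on the literal delimiter.
        String[] parts = delimiter.isEmpty()
                ? new String[] {str}
                : str.split(Pattern.quote(delimiter), -1);
        // Positive partNum counts from the start, negative from the end.
        int idx = partNum > 0 ? partNum - 1 : parts.length + partNum;
        return (idx >= 0 && idx < parts.length) ? parts[idx] : "";
    }

    public static void main(String[] args) {
        System.out.println(splitPart("11.12.13", ".", 3));  // 13
        System.out.println(splitPart("11.12.13", ".", -3)); // 11
        System.out.println(splitPart("11.12.13", ".", 5).isEmpty()); // true
    }
}
```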
- substr
  
  public static Column substr(Column str, Column pos, Column len)
  
  Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.
  
  Parameters:
  
  str - (undocumented)
  
  pos - (undocumented)
  
  len - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- substr
  
  public static Column substr(Column str, Column pos)
  
  Returns the substring of str that starts at pos, or the slice of byte array that starts at pos.
  
  Parameters:
  
  str - (undocumented)
  
  pos - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- parse_url
  
  public static Column parse_url(Column url, Column partToExtract, Column key)
  
  Extracts a part from a URL.
  
  Parameters:
  
  url - (undocumented)
  
  partToExtract - (undocumented)
  
  key - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- parse_url
  
  public static Column parse_url(Column url, Column partToExtract)
  
  Extracts a part from a URL.
  
  Parameters:
  
  url - (undocumented)
  
  partToExtract - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- printf
  
  public static Column printf(Column format, scala.collection.immutable.Seq<Column> arguments)
  
  Formats the arguments in printf-style and returns the result as a string column.
  
  Parameters:
  
  format - (undocumented)
  
  arguments - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- url_decode
  
  public static Column url_decode(Column str)
  
  Decodes a str in 'application/x-www-form-urlencoded' format using a specific encoding scheme.
  
  Parameters:
  
  str - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- url_encode
  
  public static Column url_encode(Column str)
  
  Translates a string into 'application/x-www-form-urlencoded' format using a specific encoding scheme.
  
  Parameters:
  
  str - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
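
The 'application/x-www-form-urlencoded' format is the same one produced by the JDK's `URLEncoder` and consumed by `URLDecoder`. The sketch below assumes UTF-8 as the encoding scheme, which the description above does not state explicitly; `UrlCodecSketch` is a hypothetical name, and the `Charset` overloads used here require Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical plain-Java analogue of url_encode/url_decode, assuming UTF-8.
final class UrlCodecSketch {
    static String urlEncode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    static String urlDecode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = urlEncode("https://spark.apache.org");
        System.out.println(encoded);            // https%3A%2F%2Fspark.apache.org
        System.out.println(urlDecode(encoded)); // https://spark.apache.org
    }
}
```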
- position
  
  public static Column position(Column substr, Column str, Column start)
  
  Returns the position of the first occurrence of substr in str after position start. The given start and return value are 1-based.
  
  Parameters:
  
  substr - (undocumented)
  
  str - (undocumented)
  
  start - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- position
  
  public static Column position(Column substr, Column str)
  
  Returns the position of the first occurrence of substr in str after position 1. The return value is 1-based.
  
  Parameters:
  
  substr - (undocumented)
  
  str - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- endswith
  
  public static Column endswith(Column str, Column suffix)
  
  Returns a boolean. The value is True if str ends with suffix. Returns NULL if either input expression is NULL. Otherwise, returns False. Both str and suffix must be of STRING or BINARY type.
  
  Parameters:
  
  str - (undocumented)
  
  suffix - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- startswith
  
  public static Column startswith(Column str, Column prefix)
  
  Returns a boolean. The value is True if str starts with prefix. Returns NULL if either input expression is NULL. Otherwise, returns False. Both str and prefix must be of STRING or BINARY type.
  
  Parameters:
  
  str - (undocumented)
  
  prefix - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- btrim
  
  public static Column btrim(Column str)
  
  Removes the leading and trailing space characters from str.
  
  Parameters:
  
  str - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- btrim
  
  public static Column btrim(Column str, Column trim)
  
  Removes the leading and trailing trim characters from str.
  
  Parameters:
  
  str - (undocumented)
  
  trim - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
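
Here trim acts as a set of characters to strip rather than as a literal prefix or suffix. A plain-Java sketch of that reading follows; `BtrimSketch` is a hypothetical helper and not Spark's implementation.

```java
// Hypothetical plain-Java analogue of btrim(str, trim).
final class BtrimSketch {
    // Strips from both ends every character that occurs anywhere in trim.
    static String btrim(String str, String trim) {
        int start = 0;
        int end = str.length();
        while (start < end && trim.indexOf(str.charAt(start)) >= 0) start++;
        while (end > start && trim.indexOf(str.charAt(end - 1)) >= 0) end--;
        return str.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(btrim("SSparkSQLS", "SL")); // parkSQ
    }
}
```

Characters from trim that appear in the interior of str, such as the `S` in `parkSQ`, are left in place because stripping stops at the first character not in the set.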
- try_to_binary
  
  public static Column try_to_binary(Column e, Column f)
  
  This is a special version of to_binary that performs the same operation, but returns a NULL value instead of raising an error if the conversion cannot be performed.
  
  Parameters:
  
  e - (undocumented)
  
  f - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- try_to_binary
  
  public static Column try_to_binary(Column e)
  
  This is a special version of to_binary that performs the same operation, but returns a NULL value instead of raising an error if the conversion cannot be performed.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- try_to_number
  
  public static Column try_to_number(Column e, Column format)
  
  Convert string e to a number based on the string format format. Returns NULL if the string e does not match the expected format. The format follows the same semantics as the to_number function.
  
  Parameters:
  
  e - (undocumented)
  
  format - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- char_length
  
  public static Column char_length(Column str)
  
  Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.
  
  Parameters:
  
  str - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- character_length
  
  public static Column character_length(Column str)
  
  Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.
  
  Parameters:
  
  str - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- chr
  
  public static Column chr(Column n)
  
  Returns the ASCII character having the binary equivalent to n. If n is larger than 256, the result is equivalent to chr(n % 256).
  
  Parameters:
  
  n - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- contains
  
  public static Column contains(Column left, Column right)
  
  Returns a boolean. The value is True if right is found inside left. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left and right must be of STRING or BINARY type.
  
  Parameters:
  
  left - (undocumented)
  
  right - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- elt
  
  public static Column elt(scala.collection.immutable.Seq<Column> inputs)
  
  Returns the n-th input, e.g., returns input2 when n is 2. The function returns NULL if the index exceeds the length of the array and spark.sql.ansi.enabled is set to false. If spark.sql.ansi.enabled is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices.
  
  Parameters:
  
  inputs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- find_in_set
  
  public static Column find_in_set(Column str, Column strArray)
  
  Returns the index (1-based) of the given string (str) in the comma-delimited list (strArray). Returns 0, if the string was not found or if the given string (str) contains a comma.
  
  Parameters:
  
  str - (undocumented)
  
  strArray - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
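The lookup rule described for find_in_set can be sketched in plain Java as follows. The names are hypothetical and NULL handling is left out, so treat it as an illustration of the 1-based index and the comma rule only.

```java
final class FindInSetSketch {
    // Returns the 1-based position of str in the comma-delimited list,
    // or 0 when str is absent or itself contains a comma.
    static int findInSet(String str, String strArray) {
        if (str.contains(",")) {
            return 0;
        }
        // The -1 limit keeps trailing empty entries, e.g. "a,b," has three entries.
        String[] items = strArray.split(",", -1);
        for (int i = 0; i < items.length; i++) {
            if (items[i].equals(str)) {
                return i + 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(findInSet("ab", "abc,b,ab,c,def")); // 3
        System.out.println(findInSet("a,b", "a,b"));           // 0
    }
}
```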
- like
  
  public static Column like(Column str, Column pattern, Column escapeChar)
  
  Returns true if str matches pattern with escapeChar, null if any arguments are null, false otherwise.
  
  Parameters:
  
  str - (undocumented)
  
  pattern - (undocumented)
  
  escapeChar - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- like
  
  public static Column like(Column str, Column pattern)
  
  Returns true if str matches pattern with escapeChar('\'), null if any arguments are null, false otherwise.
  
  Parameters:
  
  str - (undocumented)
  
  pattern - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- ilike
  
  public static Column ilike(Column str, Column pattern, Column escapeChar)
  
  Returns true if str matches pattern with escapeChar case-insensitively, null if any arguments are null, false otherwise.
  
  Parameters:
  
  str - (undocumented)
  
  pattern - (undocumented)
  
  escapeChar - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- ilike
  
  public static Column ilike(Column str, Column pattern)
  
  Returns true if str matches pattern with escapeChar('\') case-insensitively, null if any arguments are null, false otherwise.
  
  Parameters:
  
  str - (undocumented)
  
  pattern - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- lcase
  
  public static Column lcase(Column str)
  
  Returns str with all characters changed to lowercase.
  
  Parameters:
  
  str - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- ucase
  
  public static Column ucase(Column str)
  
  Returns str with all characters changed to uppercase.
  
  Parameters:
  
  str - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- left
  
  public static Column left(Column str, Column len)
  
  Returns the leftmost len characters from the string str (len can be of string type). If len is less than or equal to 0, the result is an empty string.
  
  Parameters:
  
  str - (undocumented)
  
  len - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- right
  
  public static Column right(Column str, Column len)
  
  Returns the rightmost len characters from the string str (len can be of string type). If len is less than or equal to 0, the result is an empty string.
  
  Parameters:
  
  str - (undocumented)
  
  len - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
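As a rough illustration of the left and right rules above, here is a plain-Java sketch with hypothetical names. It counts UTF-16 code units and takes len as an int, both of which are simplifying assumptions rather than documented Spark behavior.

```java
final class LeftRightSketch {
    // Leftmost len characters; empty string when len <= 0.
    static String left(String str, int len) {
        if (len <= 0) {
            return "";
        }
        return str.substring(0, Math.min(len, str.length()));
    }

    // Rightmost len characters; empty string when len <= 0.
    static String right(String str, int len) {
        if (len <= 0) {
            return "";
        }
        return str.substring(Math.max(0, str.length() - len));
    }

    public static void main(String[] args) {
        System.out.println(left("Spark SQL", 3));  // Spa
        System.out.println(right("Spark SQL", 3)); // SQL
    }
}
```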
- add_months
  
  public static Column add_months(Column startDate, int numMonths)
  
  Returns the date that is numMonths after startDate.
  
  Parameters:
  
  startDate - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  numMonths - The number of months to add to startDate, can be negative to subtract months
  
  Returns:
  
  A date, or null if startDate was a string that could not be cast to a date
  
  Since:
  
  1.5.0
- add_months
  
  public static Column add_months(Column startDate, Column numMonths)
  
  Returns the date that is numMonths after startDate.
  
  Parameters:
  
  startDate - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  numMonths - A column of the number of months to add to startDate, can be negative to subtract months
  
  Returns:
  
  A date, or null if startDate was a string that could not be cast to a date
  
  Since:
  
  3.0.0
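The month arithmetic behind add_months can be approximated with java.time, as in the sketch below. The names are hypothetical, and the end-of-month clamping shown is java.time's own behavior; the text above does not state how Spark handles that case, so treat it as an assumption.

```java
import java.time.LocalDate;

final class AddMonthsSketch {
    // Adds (or, for negative values, subtracts) whole months.
    // LocalDate.plusMonths clamps to the last valid day of the target month.
    static LocalDate addMonths(LocalDate startDate, int numMonths) {
        return startDate.plusMonths(numMonths);
    }

    public static void main(String[] args) {
        System.out.println(addMonths(LocalDate.of(2015, 3, 15), -2)); // 2015-01-15
        System.out.println(addMonths(LocalDate.of(2016, 8, 31), 1));  // 2016-09-30
    }
}
```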
- curdate
  
  public static Column curdate()
  
  Returns the current date at the start of query evaluation as a date column. All calls of current_date within the same query return the same value.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- current_date
  
  public static Column current_date()
  
  Returns the current date at the start of query evaluation as a date column. All calls of current_date within the same query return the same value.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- current_timezone
  
  public static Column current_timezone()
  
  Returns the current session local timezone.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- current_timestamp
  
  public static Column current_timestamp()
  
  Returns the current timestamp at the start of query evaluation as a timestamp column. All calls of current_timestamp within the same query return the same value.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- now
  
  public static Column now()
  
  Returns the current timestamp at the start of query evaluation.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- localtimestamp
  
  public static Column localtimestamp()
  
  Returns the current timestamp without time zone at the start of query evaluation as a timestamp without time zone column. All calls of localtimestamp within the same query return the same value.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.3.0
- date_format
  
  public static Column date_format(Column dateExpr, String format)
  
  Converts a date/timestamp/string to a value of string in the format specified by the date format given by the second argument.
  See Datetime Patterns for valid date and time format patterns
  
  Parameters:
  
  dateExpr - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  format - A pattern dd.MM.yyyy would return a string like 18.03.1993
  
  Returns:
  
  A string, or null if dateExpr was a string that could not be cast to a timestamp
  
  Throws:
  
  IllegalArgumentException - if the format pattern is invalid
  
  Since:
  
  1.5.0
  
  Note:
  
  Use specialized functions like year(org.apache.spark.sql.Column) whenever possible as they benefit from a specialized implementation.
- date_add
  
  public static Column date_add(Column start, int days)
  
  Returns the date that is days days after start
  
  Parameters:
  
  start - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  days - The number of days to add to start, can be negative to subtract days
  
  Returns:
  
  A date, or null if start was a string that could not be cast to a date
  
  Since:
  
  1.5.0
- date_add
  
  public static Column date_add(Column start, Column days)
  
  Returns the date that is days days after start
  
  Parameters:
  
  start - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  days - A column of the number of days to add to start, can be negative to subtract days
  
  Returns:
  
  A date, or null if start was a string that could not be cast to a date
  
  Since:
  
  3.0.0
- dateadd
  
  public static Column dateadd(Column start, Column days)
  
  Returns the date that is days days after start
  
  Parameters:
  
  start - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  days - A column of the number of days to add to start, can be negative to subtract days
  
  Returns:
  
  A date, or null if start was a string that could not be cast to a date
  
  Since:
  
  3.5.0
- date_sub
  
  public static Column date_sub(Column start, int days)
  
  Returns the date that is days days before start
  
  Parameters:
  
  start - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  days - The number of days to subtract from start, can be negative to add days
  
  Returns:
  
  A date, or null if start was a string that could not be cast to a date
  
  Since:
  
  1.5.0
- date_sub
  
  public static Column date_sub(Column start, Column days)
  
  Returns the date that is days days before start
  
  Parameters:
  
  start - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  days - A column of the number of days to subtract from start, can be negative to add days
  
  Returns:
  
  A date, or null if start was a string that could not be cast to a date
  
  Since:
  
  3.0.0
- datediff
  
  public static Column datediff(Column end, Column start)

  Returns the number of days from start to end.
  Only considers the date part of the input. For example:

  datediff("2018-01-10 00:00:00", "2018-01-09 23:59:59") // returns 1

  Parameters:
  
  end - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  start - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  Returns:
  
  An integer, or null if either end or start were strings that could not be cast to a date. Negative if end is before start
  
  Since:
  
  1.5.0
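The "date part only" rule for datediff can be shown with a small java.time sketch. The names are hypothetical and this is not how Spark evaluates the expression; it only reproduces the documented result.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

final class DateDiffSketch {
    // Drops the time of day from both inputs, then counts days from start to end.
    static long dateDiff(LocalDateTime end, LocalDateTime start) {
        return ChronoUnit.DAYS.between(start.toLocalDate(), end.toLocalDate());
    }

    public static void main(String[] args) {
        LocalDateTime end = LocalDateTime.parse("2018-01-10T00:00:00");
        LocalDateTime start = LocalDateTime.parse("2018-01-09T23:59:59");
        System.out.println(dateDiff(end, start)); // 1, although only one second separates the inputs
        System.out.println(dateDiff(start, end)); // -1, negative because end is before start
    }
}
```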
- date_diff
  
  public static Column date_diff(Column end, Column start)

  Returns the number of days from start to end.
  Only considers the date part of the input. For example:

  date_diff("2018-01-10 00:00:00", "2018-01-09 23:59:59") // returns 1

  Parameters:
  
  end - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  start - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  Returns:
  
  An integer, or null if either end or start were strings that could not be cast to a date. Negative if end is before start
  
  Since:
  
  3.5.0
- date_from_unix_date
  
  public static Column date_from_unix_date(Column days)
  
  Create date from the number of days since 1970-01-01.
  
  Parameters:
  
  days - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- year
  
  public static Column year(Column e)
  
  Extracts the year as an integer from a given date/timestamp/string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a date
  
  Since:
  
  1.5.0
- quarter
  
  public static Column quarter(Column e)
  
  Extracts the quarter as an integer from a given date/timestamp/string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a date
  
  Since:
  
  1.5.0
- month
  
  public static Column month(Column e)
  
  Extracts the month as an integer from a given date/timestamp/string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a date
  
  Since:
  
  1.5.0
- dayofweek
  
  public static Column dayofweek(Column e)
  
  Extracts the day of the week as an integer from a given date/timestamp/string. Ranges from 1 for a Sunday through to 7 for a Saturday
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a date
  
  Since:
  
  2.3.0
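The numbering used by dayofweek (1 for Sunday through 7 for Saturday) differs from java.time, which numbers Monday as 1. A minimal sketch of the conversion, with hypothetical names:

```java
import java.time.LocalDate;

final class DayOfWeekSketch {
    // java.time numbers Monday=1 .. Sunday=7; shift so that Sunday=1 .. Saturday=7.
    static int dayOfWeek(LocalDate date) {
        return date.getDayOfWeek().getValue() % 7 + 1;
    }

    public static void main(String[] args) {
        System.out.println(dayOfWeek(LocalDate.of(2015, 7, 26))); // 1 (a Sunday)
        System.out.println(dayOfWeek(LocalDate.of(2015, 8, 1)));  // 7 (a Saturday)
    }
}
```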
- dayofmonth
  
  public static Column dayofmonth(Column e)
  
  Extracts the day of the month as an integer from a given date/timestamp/string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a date
  
  Since:
  
  1.5.0
- day
  
  public static Column day(Column e)
  
  Extracts the day of the month as an integer from a given date/timestamp/string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a date
  
  Since:
  
  3.5.0
- dayofyear
  
  public static Column dayofyear(Column e)
  
  Extracts the day of the year as an integer from a given date/timestamp/string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a date
  
  Since:
  
  1.5.0
- hour
  
  public static Column hour(Column e)
  
  Extracts the hours as an integer from a given date/timestamp/string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a date
  
  Since:
  
  1.5.0
- extract
  
  public static Column extract(Column field, Column source)
  
  Extracts a part of the date/timestamp or interval source.
  
  Parameters:
  
  field - selects which part of the source should be extracted.
  
  source - a date/timestamp or interval column from where field should be extracted.
  
  Returns:
  
  a part of the date/timestamp or interval source
  
  Since:
  
  3.5.0
- date_part
  
  public static Column date_part(Column field, Column source)
  
  Extracts a part of the date/timestamp or interval source.
  
  Parameters:
  
  field - selects which part of the source should be extracted; the supported string values are the same as the fields of the equivalent function extract.
  
  source - a date/timestamp or interval column from where field should be extracted.
  
  Returns:
  
  a part of the date/timestamp or interval source
  
  Since:
  
  3.5.0
- datepart
  
  public static Column datepart(Column field, Column source)
  
  Extracts a part of the date/timestamp or interval source.
  
  Parameters:
  
  field - selects which part of the source should be extracted; the supported string values are the same as the fields of the equivalent function EXTRACT.
  
  source - a date/timestamp or interval column from where field should be extracted.
  
  Returns:
  
  a part of the date/timestamp or interval source
  
  Since:
  
  3.5.0
- last_day
  
  public static Column last_day(Column e)
  
  Returns the last day of the month which the given date belongs to. For example, input "2015-07-27" returns "2015-07-31" since July 31 is the last day of the month in July 2015.
  
  Parameters:
  
  e - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  Returns:
  
  A date, or null if the input was a string that could not be cast to a date
  
  Since:
  
  1.5.0
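The last_day example above can be reproduced with java.time. This is a sketch with hypothetical names, not Spark's code path.

```java
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

final class LastDaySketch {
    // Moves the date to the final day of its own month.
    static LocalDate lastDay(LocalDate date) {
        return date.with(TemporalAdjusters.lastDayOfMonth());
    }

    public static void main(String[] args) {
        System.out.println(lastDay(LocalDate.of(2015, 7, 27))); // 2015-07-31
        System.out.println(lastDay(LocalDate.of(2016, 2, 10))); // 2016-02-29 (leap year)
    }
}
```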
- minute
  
  public static Column minute(Column e)
  
  Extracts the minutes as an integer from a given date/timestamp/string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a date
  
  Since:
  
  1.5.0
- weekday
  
  public static Column weekday(Column e)
  
  Returns the day of the week for date/timestamp (0 = Monday, 1 = Tuesday, ..., 6 = Sunday).
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
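The weekday numbering (0 for Monday through 6 for Sunday) is java.time's ISO numbering shifted down by one. A minimal sketch with hypothetical names:

```java
import java.time.LocalDate;

final class WeekdaySketch {
    // java.time numbers Monday=1 .. Sunday=7; subtract one to get Monday=0 .. Sunday=6.
    static int weekday(LocalDate date) {
        return date.getDayOfWeek().getValue() - 1;
    }

    public static void main(String[] args) {
        System.out.println(weekday(LocalDate.of(2015, 7, 27))); // 0 (a Monday)
        System.out.println(weekday(LocalDate.of(2015, 8, 2)));  // 6 (a Sunday)
    }
}
```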
- make_date
  
  public static Column make_date(Column year, Column month, Column day)
  
  Parameters:
  
  year - (undocumented)
  
  month - (undocumented)
  
  day - (undocumented)
  
  Returns:
  
  A date created from year, month and day fields.
  
  Since:
  
  3.3.0
- months_between
  
  public static Column months_between(Column end, Column start)

  Returns number of months between dates start and end.
  A whole number is returned if both inputs have the same day of month or both are the last day of their respective months. Otherwise, the difference is calculated assuming 31 days per month.
  For example:

  months_between("2017-11-14", "2017-07-14") // returns 4.0
  months_between("2017-01-01", "2017-01-10") // returns -0.29032258
  months_between("2017-06-01", "2017-06-16 12:00:00") // returns -0.5

  Parameters:
  
  end - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  start - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  Returns:
  
  A double, or null if either end or start were strings that could not be cast to a timestamp. Negative if end is before start
  
  Since:
  
  1.5.0
- months_between
  
  public static Column months_between(Column end, Column start, boolean roundOff)
  
  Returns number of months between dates end and start. If roundOff is set to true, the result is rounded off to 8 digits; it is not rounded otherwise.
  
  Parameters:
  
  end - (undocumented)
  
  start - (undocumented)
  
  roundOff - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
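The calculation described for the months_between overloads above can be sketched in plain Java. The names are hypothetical. Two details are assumptions not stated in the text: the time of day is ignored when the whole-number rule applies, and rounding to 8 digits uses HALF_UP.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDateTime;

final class MonthsBetweenSketch {
    private static final double SECONDS_PER_31_DAY_MONTH = 31 * 86_400.0;

    static double monthsBetween(LocalDateTime end, LocalDateTime start, boolean roundOff) {
        long monthDiff = (end.getYear() * 12L + end.getMonthValue())
                - (start.getYear() * 12L + start.getMonthValue());
        boolean endIsLastDay = end.getDayOfMonth() == end.toLocalDate().lengthOfMonth();
        boolean startIsLastDay = start.getDayOfMonth() == start.toLocalDate().lengthOfMonth();
        // Whole number: same day of month, or both inputs are the last day of their months.
        if (end.getDayOfMonth() == start.getDayOfMonth() || (endIsLastDay && startIsLastDay)) {
            return monthDiff;
        }
        // Otherwise the remaining days and time of day are scaled by a 31-day month.
        long secondsDiff = (end.getDayOfMonth() - start.getDayOfMonth()) * 86_400L
                + end.toLocalTime().toSecondOfDay() - start.toLocalTime().toSecondOfDay();
        double result = monthDiff + secondsDiff / SECONDS_PER_31_DAY_MONTH;
        return roundOff
                ? BigDecimal.valueOf(result).setScale(8, RoundingMode.HALF_UP).doubleValue()
                : result;
    }

    public static void main(String[] args) {
        System.out.println(monthsBetween(
                LocalDateTime.parse("2017-11-14T00:00:00"),
                LocalDateTime.parse("2017-07-14T00:00:00"), true)); // 4.0
        // end is before start here, so the result is negative.
        System.out.println(monthsBetween(
                LocalDateTime.parse("2017-06-01T00:00:00"),
                LocalDateTime.parse("2017-06-16T12:00:00"), true)); // -0.5
    }
}
```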
- next_day
  
  public static Column next_day(Column date, String dayOfWeek)
  
  Returns the first date which is later than the value of the date column that is on the specified day of the week.
  For example, next_day('2015-07-27', "Sunday") returns 2015-08-02 because that is the first Sunday after 2015-07-27.
  
  Parameters:
  
  date - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  dayOfWeek - Case insensitive, and accepts: "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"
  
  Returns:
  
  A date, or null if date was a string that could not be cast to a date or if dayOfWeek was an invalid value
  
  Since:
  
  1.5.0
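The "first date strictly later than the input" rule of next_day matches java.time's TemporalAdjusters.next. A sketch with hypothetical names, taking a DayOfWeek instead of the string abbreviations listed above:

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

final class NextDaySketch {
    // Returns the first date after `date` (never `date` itself) that falls on dayOfWeek.
    static LocalDate nextDay(LocalDate date, DayOfWeek dayOfWeek) {
        return date.with(TemporalAdjusters.next(dayOfWeek));
    }

    public static void main(String[] args) {
        System.out.println(nextDay(LocalDate.of(2015, 7, 27), DayOfWeek.SUNDAY)); // 2015-08-02
        // A Sunday input moves forward a full week rather than staying in place.
        System.out.println(nextDay(LocalDate.of(2015, 8, 2), DayOfWeek.SUNDAY));  // 2015-08-09
    }
}
```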
- next_day
  
  public static Column next_day(Column date, Column dayOfWeek)
  
  Returns the first date which is later than the value of the date column that is on the specified day of the week.
  For example, next_day('2015-07-27', "Sunday") returns 2015-08-02 because that is the first Sunday after 2015-07-27.
  
  Parameters:
  
  date - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  dayOfWeek - A column of the day of week. Case insensitive, and accepts: "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"
  
  Returns:
  
  A date, or null if date was a string that could not be cast to a date or if dayOfWeek was an invalid value
  
  Since:
  
  3.2.0
- second
  
  public static Column second(Column e)
  
  Extracts the seconds as an integer from a given date/timestamp/string.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a timestamp
  
  Since:
  
  1.5.0
- weekofyear
  
  public static Column weekofyear(Column e)
  
  Extracts the week number as an integer from a given date/timestamp/string.
  A week is considered to start on a Monday and week 1 is the first week with more than 3 days, as defined by ISO 8601
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  An integer, or null if the input was a string that could not be cast to a date
  
  Since:
  
  1.5.0
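The ISO 8601 week rule described for weekofyear is available in java.time as IsoFields.WEEK_OF_WEEK_BASED_YEAR. The sketch below uses hypothetical names and shows how early-January dates can belong to the last week of the previous year.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

final class WeekOfYearSketch {
    // ISO 8601 week number: weeks start on Monday, week 1 has more than 3 days in the new year.
    static int weekOfYear(LocalDate date) {
        return date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }

    public static void main(String[] args) {
        System.out.println(weekOfYear(LocalDate.of(2008, 2, 20))); // 8
        System.out.println(weekOfYear(LocalDate.of(2016, 1, 1)));  // 53 (still week 53 of 2015)
    }
}
```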
- from_unixtime
  
  public static Column from_unixtime(Column ut)
  
  Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the yyyy-MM-dd HH:mm:ss format.
  
  Parameters:
  
  ut - A number of a type that is castable to a long, such as string or integer. Can be negative for timestamps before the unix epoch
  
  Returns:
  
  A string, or null if the input was a string that could not be cast to a long
  
  Since:
  
  1.5.0
- from_unixtime
  
  public static Column from_unixtime(Column ut, String f)
  
  Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the given format.
  See Datetime Patterns for valid date and time format patterns
  
  Parameters:
  
  ut - A number of a type that is castable to a long, such as string or integer. Can be negative for timestamps before the unix epoch
  
  f - A date time pattern that the input will be formatted to
  
  Returns:
  
  A string, or null if ut was a string that could not be cast to a long or f was an invalid date time pattern
  
  Since:
  
  1.5.0
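The conversion performed by the from_unixtime overloads above can be approximated with java.time. The names are hypothetical, the time zone is passed explicitly instead of being read from the system or session, and java.time's pattern letters are assumed to be close enough to Spark's Datetime Patterns for these simple formats.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class FromUnixtimeSketch {
    // Formats seconds since 1970-01-01 00:00:00 UTC as a string in the given zone and pattern.
    static String fromUnixtime(long seconds, String pattern, ZoneId zone) {
        return DateTimeFormatter.ofPattern(pattern)
                .withZone(zone)
                .format(Instant.ofEpochSecond(seconds));
    }

    public static void main(String[] args) {
        System.out.println(fromUnixtime(0, "yyyy-MM-dd HH:mm:ss", ZoneOffset.UTC));  // 1970-01-01 00:00:00
        // Negative values address moments before the epoch.
        System.out.println(fromUnixtime(-1, "yyyy-MM-dd HH:mm:ss", ZoneOffset.UTC)); // 1969-12-31 23:59:59
    }
}
```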
- unix_timestamp
  
  public static Column unix_timestamp()
  
  Returns the current Unix timestamp (in seconds) as a long.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
  
  Note:
  
  All calls of unix_timestamp within the same query return the same value (i.e. the current timestamp is calculated at the start of query evaluation).
- unix_timestamp
  
  public static Column unix_timestamp(Column s)
  
  Converts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp (in seconds), using the default timezone and the default locale.
  
  Parameters:
  
  s - A date, timestamp or string. If a string, the data must be in the yyyy-MM-dd HH:mm:ss format
  
  Returns:
  
  A long, or null if the input was a string not of the correct format
  
  Since:
  
  1.5.0
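The parse-or-null behavior described for unix_timestamp(Column s) can be sketched with java.time. The names are hypothetical and the zone is an explicit parameter here, whereas the function above uses the default time zone.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

final class UnixTimestampSketch {
    private static final DateTimeFormatter DEFAULT_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Returns seconds since the epoch, or null when s is not in yyyy-MM-dd HH:mm:ss form.
    static Long unixTimestamp(String s, ZoneId zone) {
        try {
            return LocalDateTime.parse(s, DEFAULT_FORMAT).atZone(zone).toEpochSecond();
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(unixTimestamp("1970-01-02 00:00:00", ZoneOffset.UTC)); // 86400
        System.out.println(unixTimestamp("not a time", ZoneOffset.UTC));          // null
    }
}
```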
- unix_timestamp
  
  public static Column unix_timestamp(Column s, String p)
  
  Converts time string with given pattern to Unix timestamp (in seconds).
  See Datetime Patterns for valid date and time format patterns
  
  Parameters:
  
  s - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  p - A date time pattern detailing the format of s when s is a string
  
  Returns:
  
  A long, or null if s was a string that could not be cast to a date or p was an invalid format
  
  Since:
  
  1.5.0
- to_timestamp
  
  public static Column to_timestamp(Column s)
  
  Converts to a timestamp by casting rules to TimestampType.
  
  Parameters:
  
  s - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  Returns:
  
  A timestamp, or null if the input was a string that could not be cast to a timestamp
  
  Since:
  
  2.2.0
- to_timestamp
  
  public static Column to_timestamp(Column s, String fmt)
  
  Converts time string with the given pattern to timestamp.
  See Datetime Patterns for valid date and time format patterns
  
  Parameters:
  
  s - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  fmt - A date time pattern detailing the format of s when s is a string
  
  Returns:
  
  A timestamp, or null if s was a string that could not be cast to a timestamp or fmt was an invalid format
  
  Since:
  
  2.2.0
- try_to_timestamp
  
  public static Column try_to_timestamp(Column s, Column format)
  
  Parses s with format to a timestamp. The function always returns null on invalid input, whether or not ANSI SQL mode is enabled. The result data type is consistent with the value of configuration spark.sql.timestampType.
  
  Parameters:
  
  s - (undocumented)
  
  format - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- try_to_timestamp
  
  public static Column try_to_timestamp(Column s)
  
  Parses s to a timestamp. The function always returns null on invalid input, whether or not ANSI SQL mode is enabled. It follows casting rules to a timestamp. The result data type is consistent with the value of configuration spark.sql.timestampType.
  
  Parameters:
  
  s - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_date
  
  public static Column to_date(Column e)
  
  Converts the column into DateType by casting rules to DateType.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- to_date
  
  public static Column to_date(Column e, String fmt)
  
  Converts the column into a DateType with a specified format
  See Datetime Patterns for valid date and time format patterns
  
  Parameters:
  
  e - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  fmt - A date time pattern detailing the format of e when e is a string
  
  Returns:
  
  A date, or null if e was a string that could not be cast to a date or fmt was an invalid format
  
  Since:
  
  2.2.0
- unix_date
  
  public static Column unix_date(Column e)
  
  Returns the number of days since 1970-01-01.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- unix_micros
  
  public static Column unix_micros(Column e)
  
  Returns the number of microseconds since 1970-01-01 00:00:00 UTC.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- unix_millis
  
  public static Column unix_millis(Column e)
  
  Returns the number of milliseconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- unix_seconds
  
  public static Column unix_seconds(Column e)
  
  Returns the number of seconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- trunc
  
  public static Column trunc(Column date, String format)
  
  Returns date truncated to the unit specified by the format.
  For example, trunc("2018-11-19 12:01:19", "year") returns 2018-01-01
  
  Parameters:
  
  date - A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  format - 'year', 'yyyy', 'yy' to truncate by year, or 'month', 'mon', 'mm' to truncate by month. Other options are: 'week', 'quarter'.
  
  Returns:
  
  A date, or null if date was a string that could not be cast to a date or format was an invalid value
  
  Since:
  
  1.5.0
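The truncation units listed for trunc can be illustrated with a plain-Java sketch. The names are hypothetical. Treating 'week' as "back to Monday" and matching the unit case-insensitively are assumptions; the text above only names the units and says an invalid format yields null.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.util.Locale;

final class TruncSketch {
    // Returns the first day of the unit containing date, or null for an unknown unit.
    static LocalDate trunc(LocalDate date, String format) {
        switch (format.toLowerCase(Locale.ROOT)) {
            case "year":
            case "yyyy":
            case "yy":
                return date.withDayOfYear(1);
            case "quarter": {
                int firstMonthOfQuarter = (date.getMonthValue() - 1) / 3 * 3 + 1;
                return LocalDate.of(date.getYear(), firstMonthOfQuarter, 1);
            }
            case "month":
            case "mon":
            case "mm":
                return date.withDayOfMonth(1);
            case "week":
                // Assumption: weeks start on Monday.
                return date.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(trunc(LocalDate.of(2018, 11, 19), "year"));    // 2018-01-01
        System.out.println(trunc(LocalDate.of(2018, 11, 19), "quarter")); // 2018-10-01
    }
}
```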
- date_trunc
  
  public static Column date_trunc(String format, Column timestamp)
  
  Returns timestamp truncated to the unit specified by the format.
  For example, date_trunc("year", "2018-11-19 12:01:19") returns 2018-01-01 00:00:00
  
  Parameters:
  
  format - 'year', 'yyyy', 'yy' to truncate by year, 'month', 'mon', 'mm' to truncate by month, 'day', 'dd' to truncate by day. Other options are: 'microsecond', 'millisecond', 'second', 'minute', 'hour', 'week', 'quarter'.
  
  timestamp - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  Returns:
  
  A timestamp, or null if timestamp was a string that could not be cast to a timestamp or format was an invalid value
  
  Since:
  
  2.3.0
- from_utc_timestamp
  
  public static Column from_utc_timestamp(Column ts, String tz)
  
  Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.
  
  Parameters:
  
  ts - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  tz - A string detailing the time zone ID that the input should be adjusted to. It should be in the format of either region-based zone IDs or zone offsets. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. Zone offsets must be in the format '(+|-)HH:mm', for example '-08:00' or '+01:00'. Also 'UTC' and 'Z' are supported as aliases of '+00:00'. Other short names are not recommended to use because they can be ambiguous.
  
  Returns:
  
  A timestamp, or null if ts was a string that could not be cast to a timestamp or tz was an invalid value
  
  Since:
  
  1.5.0
- from_utc_timestamp
  
  public static Column from_utc_timestamp(Column ts, Column tz)
  
  Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.
  
  Parameters:
  
  ts - (undocumented)
  
  tz - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- to_utc_timestamp
  
  public static Column to_utc_timestamp(Column ts, String tz)
  
  Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.
  
  Parameters:
  
  ts - A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
  
  tz - A string detailing the time zone ID that the input should be adjusted to. It should be in the format of either region-based zone IDs or zone offsets. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. Zone offsets must be in the format '(+|-)HH:mm', for example '-08:00' or '+01:00'. Also 'UTC' and 'Z' are supported as aliases of '+00:00'. Other short names are not recommended to use because they can be ambiguous.
  
  Returns:
  
  A timestamp, or null if ts was a string that could not be cast to a timestamp or tz was an invalid value
  
  Since:
  
  1.5.0
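The two directions of the UTC shift described for from_utc_timestamp and to_utc_timestamp can be compared side by side in a java.time sketch. The names are hypothetical, and the zone is given as the offset '+01:00', one of the formats the tz parameter documents.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

final class UtcShiftSketch {
    // Reads ts as a UTC wall-clock time and renders the same instant in tz.
    static LocalDateTime fromUtcTimestamp(LocalDateTime ts, String tz) {
        return ts.atZone(ZoneOffset.UTC).withZoneSameInstant(ZoneId.of(tz)).toLocalDateTime();
    }

    // Reads ts as a wall-clock time in tz and renders the same instant in UTC.
    static LocalDateTime toUtcTimestamp(LocalDateTime ts, String tz) {
        return ts.atZone(ZoneId.of(tz)).withZoneSameInstant(ZoneOffset.UTC).toLocalDateTime();
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.parse("2017-07-14T02:40:00");
        System.out.println(fromUtcTimestamp(ts, "+01:00")); // 2017-07-14T03:40
        System.out.println(toUtcTimestamp(ts, "+01:00"));   // 2017-07-14T01:40
    }
}
```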
- to_utc_timestamp
  
  public static Column to_utc_timestamp(Column ts, Column tz)
  
  Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.
  
  Parameters:
  
  ts - (undocumented)
  
  tz - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- window
  
  public static Column window(Column timeColumn, String windowDuration, String slideDuration, String startTime)
  Bucketize rows into one or more time windows given a timestamp specifying column. Window starts are inclusive but the window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows in the order of months are not supported. The following example takes the average stock price for a one minute window every 10 seconds starting 5 seconds after the hour:
  
  val df = ... // schema => timestamp: TimestampType, stockId: StringType, price: DoubleType
  df.groupBy(window($"timestamp", "1 minute", "10 seconds", "5 seconds"), $"stockId")
    .agg(mean("price"))
  
  The windows will look like:
  
  09:00:05-09:01:05
  09:00:15-09:01:15
  09:00:25-09:01:25 ...
  
  For a streaming query, you may use the function current_timestamp to generate windows on processing time.
  Parameters:
  
  timeColumn - The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType or TimestampNTZType.
  
  windowDuration - A string specifying the width of the window, e.g. 10 minutes, 1 second. Check org.apache.spark.unsafe.types.CalendarInterval for valid duration identifiers. Note that the duration is a fixed length of time, and does not vary over time according to a calendar. For example, 1 day always means 86,400,000 milliseconds, not a calendar day.
  
  slideDuration - A string specifying the sliding interval of the window, e.g. 1 minute. A new window will be generated every slideDuration. Must be less than or equal to the windowDuration. Check org.apache.spark.unsafe.types.CalendarInterval for valid duration identifiers. This duration is likewise absolute, and does not vary according to a calendar.
  
  startTime - The offset with respect to 1970-01-01 00:00:00 UTC with which to start window intervals. For example, in order to have hourly tumbling windows that start 15 minutes past the hour, e.g. 12:15-13:15, 13:15-14:15... provide startTime as 15 minutes.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
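The bucketing rule described for the four-argument window overload can be modelled in a few lines of plain Java. This is only a sketch of the arithmetic under stated assumptions, not Spark's implementation: the class SlidingWindowSketch and its method are hypothetical names, and all times are epoch microseconds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: models which windows contain a timestamp.
final class SlidingWindowSketch {
    /** Returns {start, end} pairs (epoch microseconds) of every window [start, end) containing ts. */
    static List<long[]> windowsFor(long ts, long windowDuration, long slideDuration, long startTime) {
        List<long[]> result = new ArrayList<>();
        // Start of the most recent window that begins at or before ts.
        long lastStart = ts - Math.floorMod(ts - startTime, slideDuration);
        // Step back one slide at a time while the window still covers ts (end is exclusive).
        for (long start = lastStart; start + windowDuration > ts; start -= slideDuration) {
            result.add(new long[] {start, start + windowDuration});
        }
        return result;
    }
}
```

With a 1 minute window, a 10 second slide and a 5 second start offset, a row stamped 09:00:30 lands in six windows in this model, with starts running from 09:00:25 back to 08:59:35.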
- window
  
  public static Column window(Column timeColumn, String windowDuration, String slideDuration)
  Bucketize rows into one or more time windows given a column that specifies the timestamp. Window starts are inclusive but window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows on the order of months are not supported. The windows are aligned to 1970-01-01 00:00:00 UTC. The following example takes the average stock price for a one-minute window every 10 seconds:
  
  val df = ... // schema => timestamp: TimestampType, stockId: StringType, price: DoubleType
  df.groupBy(window($"timestamp", "1 minute", "10 seconds"), $"stockId")
    .agg(mean("price"))
  
  The windows will look like:
  
  09:00:00-09:01:00
  09:00:10-09:01:10
  09:00:20-09:01:20 ...
  
  For a streaming query, you may use the function current_timestamp to generate windows on processing time.
  Parameters:
  
  timeColumn - The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType or TimestampNTZType.
  
  windowDuration - A string specifying the width of the window, e.g. 10 minutes, 1 second. Check org.apache.spark.unsafe.types.CalendarInterval for valid duration identifiers. Note that the duration is a fixed length of time, and does not vary over time according to a calendar. For example, 1 day always means 86,400,000 milliseconds, not a calendar day.
  
  slideDuration - A string specifying the sliding interval of the window, e.g. 1 minute. A new window will be generated every slideDuration. Must be less than or equal to the windowDuration. Check org.apache.spark.unsafe.types.CalendarInterval for valid duration identifiers. This duration is likewise absolute, and does not vary according to a calendar.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- window
  
  public static Column window(Column timeColumn, String windowDuration)
  Generates tumbling time windows given a column that specifies the timestamp. Window starts are inclusive but window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows on the order of months are not supported. The windows are aligned to 1970-01-01 00:00:00 UTC. The following example takes the average stock price for a one-minute tumbling window:
  
  val df = ... // schema => timestamp: TimestampType, stockId: StringType, price: DoubleType
  df.groupBy(window($"timestamp", "1 minute"), $"stockId")
    .agg(mean("price"))
  
  The windows will look like:
  
  09:00:00-09:01:00
  09:01:00-09:02:00
  09:02:00-09:03:00 ...
  
  For a streaming query, you may use the function current_timestamp to generate windows on processing time.
  Parameters:
  
  timeColumn - The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType or TimestampNTZType.
  
  windowDuration - A string specifying the width of the window, e.g. 10 minutes, 1 second. Check org.apache.spark.unsafe.types.CalendarInterval for valid duration identifiers.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
- window_time
  
  public static Column window_time(Column windowColumn)
  
  Extracts the event time from the window column.
  The window column is of StructType { start: Timestamp, end: Timestamp } where start is inclusive and end is exclusive. Since event time can support microsecond precision, window_time(window) = window.end - 1 microsecond.
  
  Parameters:
  
  windowColumn - The window column (typically produced by window aggregation) of type StructType { start: Timestamp, end: Timestamp }
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
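The relation window_time(window) = window.end - 1 microsecond can be checked with java.time. This is a minimal plain-Java illustration, not Spark code; WindowTimeSketch is a hypothetical name.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Spark.
final class WindowTimeSketch {
    /** The last representable event time inside a window whose end is exclusive. */
    static Instant windowTime(Instant windowEnd) {
        return windowEnd.minus(1, ChronoUnit.MICROS);
    }
}
```

For a window ending at 09:01:00, the model returns 09:00:59.999999.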
- session_window
  
  public static Column session_window(Column timeColumn, String gapDuration)
  
  Generates session window given a timestamp specifying column.
  A session window is a dynamic window: its length varies with the given inputs. The end of a session window is defined as "the timestamp of the latest input of the session + gap duration", so when new inputs are bound to the current session window, its end time can be extended accordingly.
  Windows can support microsecond precision. A gapDuration on the order of months is not supported.
  For a streaming query, you may use the function current_timestamp to generate windows on processing time.
  
  Parameters:
  
  timeColumn - The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType or TimestampNTZType.
  
  gapDuration - A string specifying the timeout of the session, e.g. 10 minutes, 1 second. Check org.apache.spark.unsafe.types.CalendarInterval for valid duration identifiers.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
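The way a static gap duration grows a session can be sketched in plain Java. This is a simplified model rather than Spark's implementation: SessionWindowSketch is a hypothetical name, times are plain long values, and treating an event that falls exactly on the current session end as the start of a new session is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark.
final class SessionWindowSketch {
    /** Groups event times into sessions, returned as {start, end} pairs with an exclusive end. */
    static List<long[]> sessions(long[] eventTimes, long gap) {
        long[] sorted = eventTimes.clone();
        Arrays.sort(sorted);
        List<long[]> result = new ArrayList<>();
        long[] current = null;
        for (long ts : sorted) {
            if (current != null && ts < current[1]) {
                // Event falls inside the open session: push the end out to ts + gap.
                current[1] = Math.max(current[1], ts + gap);
            } else {
                // Assumption: an event at or after the current end opens a new session.
                current = new long[] {ts, ts + gap};
                result.add(current);
            }
        }
        return result;
    }
}
```

With events at 0, 4, 7 and 20 and a gap of 5, the model yields the sessions [0, 12) and [20, 25).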
- session_window
  
  public static Column session_window(Column timeColumn, Column gapDuration)
  
  Generates session window given a timestamp specifying column.
  A session window is a dynamic window: its length varies with the given inputs. For a static gap duration, the end of a session window is defined as "the timestamp of the latest input of the session + gap duration", so when new inputs are bound to the current session window, its end time can be extended accordingly.
  Besides a static gap duration value, users can also provide an expression that specifies the gap duration dynamically based on the input row. With a dynamic gap duration, the closing of a session window no longer depends only on the latest input. A session window's range is the union of all events' ranges, each of which is determined by the event start time and the gap duration evaluated during query execution. Note that rows with a negative or zero gap duration are filtered out of the aggregation.
  Windows can support microsecond precision. A gapDuration on the order of months is not supported.
  For a streaming query, you may use the function current_timestamp to generate windows on processing time.
  
  Parameters:
  
  timeColumn - The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType or TimestampNTZType.
  
  gapDuration - A column specifying the timeout of the session. It can be a static value, e.g. 10 minutes, 1 second, or an expression/UDF that specifies the gap duration dynamically based on the input row.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- timestamp_seconds
  
  public static Column timestamp_seconds(Column e)
  
  Converts the number of seconds from the Unix epoch (1970-01-01T00:00:00Z) to a timestamp.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.1.0
- timestamp_millis
  
  public static Column timestamp_millis(Column e)
  
  Creates a timestamp from the number of milliseconds since the UTC epoch.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- timestamp_micros
  
  public static Column timestamp_micros(Column e)
  
  Creates a timestamp from the number of microseconds since the UTC epoch.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
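The three epoch-based constructors (timestamp_seconds, timestamp_millis, timestamp_micros) differ only in the unit of their input. The plain-Java sketch below mirrors that with java.time; EpochSketch is a hypothetical name, and the sketch works in UTC only, whereas Spark renders timestamps in the session time zone.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Spark.
final class EpochSketch {
    static Instant fromSeconds(long seconds) {
        return Instant.ofEpochSecond(seconds);
    }

    static Instant fromMillis(long millis) {
        return Instant.ofEpochMilli(millis);
    }

    static Instant fromMicros(long micros) {
        // Instant has no ofEpochMicro factory, so add the offset to the epoch.
        return Instant.EPOCH.plus(micros, ChronoUnit.MICROS);
    }
}
```

In this model, 1230219000 seconds, 1230219000000 milliseconds and 1230219000000000 microseconds all map to 2008-12-25T15:30:00Z.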
- timestamp_diff
  
  public static Column timestamp_diff(String unit, Column start, Column end)
  
  Gets the difference between the timestamps in the specified units by truncating the fraction part.
  
  Parameters:
  
  unit - (undocumented)
  
  start - (undocumented)
  
  end - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
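"Truncating the fraction part" in timestamp_diff means that only complete units are counted. A plain-Java analogue using java.time is sketched below; TimestampDiffSketch is a hypothetical name, and passing a ChronoUnit in place of Spark's unit string is an assumption made for illustration.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Spark.
final class TimestampDiffSketch {
    /** Number of complete units between start and end; any partial unit is dropped. */
    static long diff(ChronoUnit unit, LocalDateTime start, LocalDateTime end) {
        return unit.between(start, end);
    }
}
```

From 10:00:00 to 12:59:59 on the same day the model reports 2 hours, and from 10:00:00 to 09:59:59 on the next day it reports 0 days.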
- timestamp_add
  
  public static Column timestamp_add(String unit, Column quantity, Column ts)
  
  Adds the specified number of units to the given timestamp.
  
  Parameters:
  
  unit - (undocumented)
  
  quantity - (undocumented)
  
  ts - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- to_timestamp_ltz
  
  public static Column to_timestamp_ltz(Column timestamp, Column format)
  
  Parses the timestamp expression with the format expression to a timestamp with local time zone. Returns null with invalid input.
  
  Parameters:
  
  timestamp - (undocumented)
  
  format - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_timestamp_ltz
  
  public static Column to_timestamp_ltz(Column timestamp)
  
  Parses the timestamp expression with the default format to a timestamp with local time zone. The default format follows casting rules to a timestamp. Returns null with invalid input.
  
  Parameters:
  
  timestamp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_timestamp_ntz
  
  public static Column to_timestamp_ntz(Column timestamp, Column format)
  
  Parses the timestamp expression with the format expression to a timestamp without time zone. Returns null with invalid input.
  
  Parameters:
  
  timestamp - (undocumented)
  
  format - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_timestamp_ntz
  
  public static Column to_timestamp_ntz(Column timestamp)
  
  Parses the timestamp expression with the default format to a timestamp without time zone. The default format follows casting rules to a timestamp. Returns null with invalid input.
  
  Parameters:
  
  timestamp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_unix_timestamp
  
  public static Column to_unix_timestamp(Column timeExp, Column format)
  
  Returns the UNIX timestamp of the given time.
  
  Parameters:
  
  timeExp - (undocumented)
  
  format - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_unix_timestamp
  
  public static Column to_unix_timestamp(Column timeExp)
  
  Returns the UNIX timestamp of the given time.
  
  Parameters:
  
  timeExp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- monthname
  
  public static Column monthname(Column timeExp)
  
  Extracts the three-letter abbreviated month name from a given date/timestamp/string.
  
  Parameters:
  
  timeExp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- dayname
  
  public static Column dayname(Column timeExp)
  
  Extracts the three-letter abbreviated day name from a given date/timestamp/string.
  
  Parameters:
  
  timeExp - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- array_contains
  
  public static Column array_contains(Column column, Object value)
  
  Returns null if the array is null, true if the array contains value, and false otherwise.
  
  Parameters:
  
  column - (undocumented)
  
  value - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- array_append
  
  public static Column array_append(Column column, Object element)
  
  Returns an ARRAY containing all elements from the source ARRAY as well as the new element. The new element/column is located at the end of the ARRAY.
  
  Parameters:
  
  column - (undocumented)
  
  element - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
- arrays_overlap
  
  public static Column arrays_overlap(Column a1, Column a2)
  
  Returns true if a1 and a2 have at least one non-null element in common. If not and both the arrays are non-empty and any of them contains a null, it returns null. It returns false otherwise.
  
  Parameters:
  
  a1 - (undocumented)
  
  a2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
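The three possible results of arrays_overlap (true, null, false) follow the order of checks in the description. The plain-Java model below uses a nullable Boolean for the result; ArraysOverlapSketch is a hypothetical name and the code is an illustration of the stated rule, not Spark's implementation.

```java
import java.util.List;

// Hypothetical helper, not part of Spark.
final class ArraysOverlapSketch {
    /** Returns TRUE, FALSE, or null (unknown), following the rule in the description. */
    static Boolean arraysOverlap(List<Integer> a1, List<Integer> a2) {
        // 1. A shared non-null element decides the result immediately.
        for (Integer x : a1) {
            if (x != null && a2.contains(x)) {
                return true;
            }
        }
        // 2. No shared element, both non-empty, and a null present: unknown.
        boolean anyNull = a1.contains(null) || a2.contains(null);
        if (!a1.isEmpty() && !a2.isEmpty() && anyNull) {
            return null;
        }
        // 3. Otherwise the arrays definitely do not overlap.
        return false;
    }
}
```

Build the input lists with Arrays.asList or ArrayList, since List.of rejects null elements.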
- slice
  
  public static Column slice(Column x, int start, int length)
  
  Returns an array containing all the elements in x from index start (or starting from the end if start is negative) with the specified length.
  
  Parameters:
  
  x - the array column to be sliced
  
  start - the starting index
  
  length - the length of the slice
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
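The index handling of slice (1-based start, negative start counted from the end) is easy to get wrong, so a plain-Java model may help. SliceSketch is a hypothetical name; rejecting start = 0 and a negative length, and returning an empty array when start lies outside the array, reflect my understanding of Spark's behaviour and should be treated as assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Spark.
final class SliceSketch {
    static <T> List<T> slice(List<T> x, int start, int length) {
        if (start == 0) {
            throw new IllegalArgumentException("array indices start at 1");
        }
        if (length < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        // Convert the 1-based (or negative, from-the-end) start to a 0-based offset.
        int from = start > 0 ? start - 1 : x.size() + start;
        if (from < 0 || from >= x.size()) {
            return Collections.emptyList();
        }
        // Clamp the slice to the end of the array.
        int to = from + Math.min(length, x.size() - from);
        return new ArrayList<>(x.subList(from, to));
    }
}
```

For the array [1, 2, 3, 4], slice(x, 2, 2) gives [2, 3] and slice(x, -2, 2) gives [3, 4] in this model.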
- slice
  
  public static Column slice(Column x, Column start, Column length)
  
  Returns an array containing all the elements in x from index start (or starting from the end if start is negative) with the specified length.
  
  Parameters:
  
  x - the array column to be sliced
  
  start - the starting index
  
  length - the length of the slice
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.1.0
- array_join
  
  public static Column array_join(Column column, String delimiter, String nullReplacement)
  
  Concatenates the elements of column using the delimiter. Null values are replaced with nullReplacement.
  
  Parameters:
  
  column - (undocumented)
  
  delimiter - (undocumented)
  
  nullReplacement - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- array_join
  
  public static Column array_join(Column column, String delimiter)
  
  Concatenates the elements of column using the delimiter.
  
  Parameters:
  
  column - (undocumented)
  
  delimiter - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- concat
  
  public static Column concat(scala.collection.immutable.Seq<Column> exprs)
  
  Concatenates multiple input columns together into a single column. The function works with strings, binary and compatible array columns.
  
  Parameters:
  
  exprs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
  
  Note:
  
  Returns null if any of the input columns are null.
- array_position
  
  public static Column array_position(Column column, Object value)
  
  Locates the position of the first occurrence of the value in the given array as a long. Returns null if either of the arguments is null.
  
  Parameters:
  
  column - (undocumented)
  
  value - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
  
  Note:
  
  The position is 1-based, not 0-based. Returns 0 if value could not be found in array.
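The conventions of array_position (1-based result, 0 for "not found", null for null arguments) can be summarised in a small plain-Java model. ArrayPositionSketch is a hypothetical name and the code only illustrates the documented rule.

```java
import java.util.List;

// Hypothetical helper, not part of Spark.
final class ArrayPositionSketch {
    /** 1-based position of the first occurrence, 0 if absent, null if an argument is null. */
    static Long arrayPosition(List<?> array, Object value) {
        if (array == null || value == null) {
            return null;
        }
        // indexOf is 0-based and returns -1 when absent, so adding 1 yields 0 for "not found".
        return (long) (array.indexOf(value) + 1);
    }
}
```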
- element_at
  
  public static Column element_at(Column column, Object value)
  
  Returns the element of the array at the index given by value if column is an array. Returns the value for the key given by value if column is a map.
  
  Parameters:
  
  column - (undocumented)
  
  value - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- try_element_at
  
  public static Column try_element_at(Column column, Column value)
  
  (array, index) - Returns the element of the array at the given (1-based) index. If index is 0, Spark will throw an error. If index < 0, accesses elements from the last to the first. The function always returns NULL if the index exceeds the length of the array.
  (map, key) - Returns value for given key. The function always returns NULL if the key is not contained in the map.
  
  Parameters:
  
  column - (undocumented)
  
  value - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- get
  
  public static Column get(Column column, Column index)
  
  Returns element of array at given (0-based) index. If the index points outside of the array boundaries, then this function returns NULL.
  
  Parameters:
  
  column - (undocumented)
  
  index - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
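try_element_at and get use different index bases, which the following plain-Java model puts side by side. ElementAccessSketch and its methods are hypothetical names; the code models only the array case as described above, not the map case.

```java
import java.util.List;

// Hypothetical helper, not part of Spark.
final class ElementAccessSketch {
    /** 1-based access; negative indices count from the end; out of range yields null. */
    static <T> T tryElementAt(List<T> array, int index) {
        if (index == 0) {
            throw new IllegalArgumentException("index 0 is invalid; indices are 1-based");
        }
        int pos = index > 0 ? index - 1 : array.size() + index;
        return (pos >= 0 && pos < array.size()) ? array.get(pos) : null;
    }

    /** 0-based access; any index outside the array boundaries yields null. */
    static <T> T get(List<T> array, int index) {
        return (index >= 0 && index < array.size()) ? array.get(index) : null;
    }
}
```

For ["a", "b", "c"], tryElementAt(x, 1) and get(x, 0) both return "a", while tryElementAt(x, -1) returns "c" and get(x, -1) returns null.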
- array_sort
  
  public static Column array_sort(Column e)
  
  Sorts the input array in ascending order. The elements of the input array must be orderable. NaN is greater than any non-NaN elements for double/float type. Null elements will be placed at the end of the returned array.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
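The ordering rules of the one-argument array_sort (ascending, NaN above every other number, nulls last) happen to match what java.util.Comparator offers for Double, so they can be illustrated in plain Java. ArraySortSketch is a hypothetical name and this is not Spark's implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Spark.
final class ArraySortSketch {
    static List<Double> arraySort(List<Double> input) {
        List<Double> sorted = new ArrayList<>(input);
        // Double's natural order puts NaN above every other value; nullsLast appends nulls.
        sorted.sort(Comparator.nullsLast(Comparator.<Double>naturalOrder()));
        return sorted;
    }
}
```

Sorting [3.0, null, NaN, 1.0] with this model gives [1.0, 3.0, NaN, null].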
- array_sort
  
  public static Column array_sort(Column e, scala.Function2<Column,Column,Column> comparator)
  
  Sorts the input array based on the given comparator function. The comparator will take two arguments representing two elements of the array. It returns a negative integer, 0, or a positive integer as the first element is less than, equal to, or greater than the second element. If the comparator function returns null, the function will fail and raise an error.
  
  Parameters:
  
  e - (undocumented)
  
  comparator - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
- array_remove
  
  public static Column array_remove(Column column, Object element)
  
  Removes all elements equal to element from the given array.
  
  Parameters:
  
  column - (undocumented)
  
  element - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- array_compact
  
  public static Column array_compact(Column column)
  
  Removes all null elements from the given array.
  
  Parameters:
  
  column - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
- array_prepend
  
  public static Column array_prepend(Column column, Object element)
  
  Returns an array containing element as well as all elements from column. The new element is positioned at the beginning of the array.
  
  Parameters:
  
  column - (undocumented)
  
  element - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- array_distinct
  
  public static Column array_distinct(Column e)
  
  Removes duplicate values from the array.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- array_intersect
  
  public static Column array_intersect(Column col1, Column col2)
  
  Returns an array of the elements in the intersection of the given two arrays, without duplicates.
  
  Parameters:
  
  col1 - (undocumented)
  
  col2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- array_insert
  
  public static Column array_insert(Column arr, Column pos, Column value)
  
  Adds an item into a given array at a specified position.
  
  Parameters:
  
  arr - (undocumented)
  
  pos - (undocumented)
  
  value - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
- array_union
  
  public static Column array_union(Column col1, Column col2)
  
  Returns an array of the elements in the union of the given two arrays, without duplicates.
  
  Parameters:
  
  col1 - (undocumented)
  
  col2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- array_except
  
  public static Column array_except(Column col1, Column col2)
  
  Returns an array of the elements in the first array but not in the second array, without duplicates. The order of elements in the result is not determined.
  
  Parameters:
  
  col1 - (undocumented)
  
  col2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- transform
  
  public static Column transform(Column column, scala.Function1<Column,Column> f)
  Returns an array of elements after applying a transformation to each element in the input array.
  df.select(transform(col("i"), x => x + 1))
  Parameters:
  
  column - the input array column
  
  f - col => transformed_col, the lambda function to transform the input column
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- transform
  
  public static Column transform(Column column, scala.Function2<Column,Column,Column> f)
  Returns an array of elements after applying a transformation to each element in the input array.
  df.select(transform(col("i"), (x, i) => x + i))
  Parameters:
  
  column - the input array column
  
  f - (col, index) => transformed_col, the lambda function to transform the input column given the index. Indices start at 0.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
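The two-argument lambda form of transform passes each element together with its 0-based index. A plain-Java model of that contract is sketched below; TransformSketch is a hypothetical name, and ordinary Java lists and lambdas stand in for Spark columns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical helper, not part of Spark.
final class TransformSketch {
    /** Applies f(element, index) to every element; indices start at 0. */
    static <T, R> List<R> transform(List<T> array, BiFunction<T, Integer, R> f) {
        List<R> out = new ArrayList<>(array.size());
        for (int i = 0; i < array.size(); i++) {
            out.add(f.apply(array.get(i), i));
        }
        return out;
    }
}
```

Applying (x, i) -> x + i to [10, 20, 30] gives [10, 21, 32] in this model.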
- exists
  
  public static Column exists(Column column, scala.Function1<Column,Column> f)
  Returns whether a predicate holds for one or more elements in the array.
  df.select(exists(col("i"), _ % 2 === 0))
  Parameters:
  
  column - the input array column
  
  f - col => predicate, the Boolean predicate to check the input column
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- forall
  
  public static Column forall(Column column, scala.Function1<Column,Column> f)
  Returns whether a predicate holds for every element in the array.
  df.select(forall(col("i"), x => x % 2 === 0))
  Parameters:
  
  column - the input array column
  
  f - col => predicate, the Boolean predicate to check the input column
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- filter
  
  public static Column filter(Column column, scala.Function1<Column,Column> f)
  Returns an array of elements for which a predicate holds in a given array.
  df.select(filter(col("s"), x => x % 2 === 0))
  Parameters:
  
  column - the input array column
  
  f - col => predicate, the Boolean predicate to filter the input column
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- filter
  
  public static Column filter(Column column, scala.Function2<Column,Column,Column> f)
  Returns an array of elements for which a predicate holds in a given array.
  df.select(filter(col("s"), (x, i) => i % 2 === 0))
  Parameters:
  
  column - the input array column
  
  f - (col, index) => predicate, the Boolean predicate to filter the input column given the index. Indices start at 0.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- aggregate
  
  public static Column aggregate(Column expr, Column initialValue, scala.Function2<Column,Column,Column> merge, scala.Function1<Column,Column> finish)
  Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function.
  df.select(aggregate(col("i"), lit(0), (acc, x) => acc + x, _ * 10))
  Parameters:
  
  expr - the input array column
  
  initialValue - the initial value
  
  merge - (combined_value, input_value) => combined_value, the merge function to merge an input value to the combined_value
  
  finish - combined_value => final_value, the lambda function to convert the combined value of all inputs to final result
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
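The merge and finish steps of aggregate amount to a left fold followed by a final conversion. The plain-Java sketch below shows that shape; AggregateSketch is a hypothetical name, and Java lists and lambdas stand in for Spark columns.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical helper, not part of Spark.
final class AggregateSketch {
    static <T, A, R> R aggregate(List<T> array, A initialValue,
                                 BiFunction<A, T, A> merge, Function<A, R> finish) {
        A state = initialValue;
        // Fold the elements into the running state, left to right.
        for (T element : array) {
            state = merge.apply(state, element);
        }
        // Convert the final state into the result.
        return finish.apply(state);
    }
}
```

With the array [1, 2, 3], an initial value of 0, merge (acc, x) -> acc + x and finish acc -> acc * 10, the model returns 60.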
- aggregate
  
  public static Column aggregate(Column expr, Column initialValue, scala.Function2<Column,Column,Column> merge)
  Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state.
  df.select(aggregate(col("i"), lit(0), (acc, x) => acc + x))
  Parameters:
  
  expr - the input array column
  
  initialValue - the initial value
  
  merge - (combined_value, input_value) => combined_value, the merge function to merge an input value to the combined_value
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- reduce
  
  public static Column reduce(Column expr, Column initialValue, scala.Function2<Column,Column,Column> merge, scala.Function1<Column,Column> finish)
  Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function.
  df.select(reduce(col("i"), lit(0), (acc, x) => acc + x, _ * 10))
  Parameters:
  
  expr - the input array column
  
  initialValue - the initial value
  
  merge - (combined_value, input_value) => combined_value, the merge function to merge an input value to the combined_value
  
  finish - combined_value => final_value, the lambda function to convert the combined value of all inputs to final result
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- reduce
  
  public static Column reduce(Column expr, Column initialValue, scala.Function2<Column,Column,Column> merge)
  Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state.
  df.select(reduce(col("i"), lit(0), (acc, x) => acc + x))
  Parameters:
  
  expr - the input array column
  
  initialValue - the initial value
  
  merge - (combined_value, input_value) => combined_value, the merge function to merge an input value to the combined_value
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- zip_with
  
  public static Column zip_with(Column left, Column right, scala.Function2<Column,Column,Column> f)
  Merge two given arrays, element-wise, into a single array using a function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying the function.
  df.select(zip_with(df("val1"), df("val2"), (x, y) => x + y))
  Parameters:
  
  left - the left input array column
  
  right - the right input array column
  
  f - (lCol, rCol) => col, the lambda function to merge two input columns into one column
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
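The padding rule of zip_with (the shorter array is extended with nulls before the function is applied) is sketched below in plain Java. ZipWithSketch is a hypothetical name; the function receives null for the missing positions, as the description states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical helper, not part of Spark.
final class ZipWithSketch {
    static <L, R, O> List<O> zipWith(List<L> left, List<R> right, BiFunction<L, R, O> f) {
        int n = Math.max(left.size(), right.size());
        List<O> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            // Positions past the end of the shorter array are treated as null.
            L l = i < left.size() ? left.get(i) : null;
            R r = i < right.size() ? right.get(i) : null;
            out.add(f.apply(l, r));
        }
        return out;
    }
}
```

Zipping [1, 2, 3] with [10, 20] using (x, y) -> x + ":" + y gives ["1:10", "2:20", "3:null"] in this model, so the function must be prepared to handle nulls.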
- transform_keys
  
  public static Column transform_keys(Column expr, scala.Function2<Column,Column,Column> f)
  Applies a function to every key-value pair in a map and returns a map with the results of those applications as the new keys for the pairs.
  df.select(transform_keys(col("i"), (k, v) => k + v))
  Parameters:
  
  expr - the input map column
  
  f - (key, value) => new_key, the lambda function to transform the key of input map column
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- transform_values
  
  public static Column transform_values(Column expr, scala.Function2<Column,Column,Column> f)
  Applies a function to every key-value pair in a map and returns a map with the results of those applications as the new values for the pairs.
  df.select(transform_values(col("i"), (k, v) => k + v))
  Parameters:
  
  expr - the input map column
  
  f - (key, value) => new_value, the lambda function to transform the value of input map column
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- map_filter
  
  public static Column map_filter(Column expr, scala.Function2<Column,Column,Column> f)
  Returns a map whose key-value pairs satisfy a predicate.
  df.select(map_filter(col("m"), (k, v) => k * 10 === v))
  Parameters:
  
  expr - the input map column
  
  f - (key, value) => predicate, the Boolean predicate to filter the input map column
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- map_zip_with
  
  public static Column map_zip_with(Column left, Column right, scala.Function3<Column,Column,Column,Column> f)
  Merge two given maps, key-wise into a single map using a function.
  df.select(map_zip_with(df("m1"), df("m2"), (k, v1, v2) => k === v1 + v2))
  Parameters:
  
  left - the left input map column
  
  right - the right input map column
  
  f - (key, value1, value2) => new_value, the lambda function to merge the map values
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- explode
  
  public static Column explode(Column e)
  
  Creates a new row for each element in the given array or map column. Uses the default column name col for elements in the array and key and value for elements in the map unless specified otherwise.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- explode_outer
  
  public static Column explode_outer(Column e)
  
  Creates a new row for each element in the given array or map column. Uses the default column name col for elements in the array and key and value for elements in the map unless specified otherwise. Unlike explode, if the array/map is null or empty then null is produced.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.2.0
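The difference between explode and explode_outer only shows up for null or empty input. The plain-Java model below represents the generated rows as a list; ExplodeSketch is a hypothetical name, and the code handles the array case only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Spark.
final class ExplodeSketch {
    /** One output row per element; a null or empty array produces no rows. */
    static <T> List<T> explode(List<T> array) {
        if (array == null) {
            return Collections.<T>emptyList();
        }
        return new ArrayList<>(array);
    }

    /** Like explode, but a null or empty array produces a single null row. */
    static <T> List<T> explodeOuter(List<T> array) {
        if (array == null || array.isEmpty()) {
            return Collections.<T>singletonList(null);
        }
        return new ArrayList<>(array);
    }
}
```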
- posexplode
  
  public static Column posexplode(Column e)
  
  Creates a new row for each element with position in the given array or map column. Uses the default column name pos for position, and col for elements in the array and key and value for elements in the map unless specified otherwise.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- posexplode_outer
  
  public static Column posexplode_outer(Column e)
  
  Creates a new row for each element with position in the given array or map column. Uses the default column name pos for position, and col for elements in the array and key and value for elements in the map unless specified otherwise. Unlike posexplode, if the array/map is null or empty then the row (null, null) is produced.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.2.0
- inline
  
  public static Column inline(Column e)
  
  Creates a new row for each element in the given array of structs.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
- inline_outer
  
  public static Column inline_outer(Column e)
  
  Creates a new row for each element in the given array of structs. Unlike inline, if the array is null or empty then null is produced for each nested column.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
- get_json_object
  
  public static Column get_json_object(Column e, String path)
  
  Extracts json object from a json string based on json path specified, and returns json string of the extracted json object. It will return null if the input json string is invalid.
  
  Parameters:
  
  e - (undocumented)
  
  path - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- json_tuple
  
  public static Column json_tuple(Column json, scala.collection.immutable.Seq<String> fields)
  
  Creates a new row for a json column according to the given field names.
  
  Parameters:
  
  json - (undocumented)
  
  fields - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.6.0
- from_json
  
  public static Column from_json(Column e, StructType schema, scala.collection.immutable.Map<String,String> options)
  
  (Scala-specific) Parses a column containing a JSON string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing JSON data.
  
  schema - the schema to use when parsing the json string
  
  options - options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- from_json
  
  public static Column from_json(Column e, DataType schema, scala.collection.immutable.Map<String,String> options)
  
  (Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema. Returns null, in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing JSON data.
  
  schema - the schema to use when parsing the json string
  
  options - options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.2.0
- from_json
  
  public static Column from_json(Column e, StructType schema, Map<String,String> options)
  
  (Java-specific) Parses a column containing a JSON string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing JSON data.
  
  schema - the schema to use when parsing the json string
  
  options - options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- from_json
  
  public static Column from_json(Column e, DataType schema, Map<String,String> options)
  
  (Java-specific) Parses a column containing a JSON string into a MapType with StringType as the key type, or into a StructType or ArrayType, with the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing JSON data.
  
  schema - the schema to use when parsing the json string
  
  options - options to control how the JSON is parsed. Accepts the same options as the JSON data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.2.0
- from_json
  
  public static Column from_json(Column e, StructType schema)
  
  Parses a column containing a JSON string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing JSON data.
  
  schema - the schema to use when parsing the json string
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- from_json
  
  public static Column from_json(Column e, DataType schema)
  
  Parses a column containing a JSON string into a MapType with StringType as the key type, or into a StructType or ArrayType, with the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing JSON data.
  
  schema - the schema to use when parsing the json string
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.2.0
- from_json
  
  public static Column from_json(Column e, String schema, Map<String,String> options)
  
  (Java-specific) Parses a column containing a JSON string into a MapType with StringType as the key type, or into a StructType or ArrayType, with the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing JSON data.
  
  schema - the schema as a DDL-formatted string.
  
  options - options to control how the JSON is parsed. Accepts the same options as the JSON data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- from_json
  
  public static Column from_json(Column e, String schema, scala.collection.immutable.Map<String,String> options)
  
  (Scala-specific) Parses a column containing a JSON string into a MapType with StringType as the key type, or into a StructType or ArrayType, with the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing JSON data.
  
  schema - the schema as a DDL-formatted string.
  
  options - options to control how the JSON is parsed. Accepts the same options as the JSON data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- from_json
  
  public static Column from_json(Column e, Column schema)
  
  (Scala-specific) Parses a column containing a JSON string into a MapType with StringType as the key type, or into a StructType or ArrayType of StructTypes, with the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing JSON data.
  
  schema - the schema to use when parsing the json string
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- from_json
  
  public static Column from_json(Column e, Column schema, Map<String,String> options)
  
  (Java-specific) Parses a column containing a JSON string into a MapType with StringType as the key type, or into a StructType or ArrayType of StructTypes, with the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing JSON data.
  
  schema - the schema to use when parsing the json string
  
  options - options to control how the JSON is parsed. Accepts the same options as the JSON data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- try_parse_json
  
  public static Column try_parse_json(Column json)
  
  Parses a JSON string and constructs a Variant value. Returns null if the input string is not a valid JSON value.
  
  Parameters:
  
  json - a string column that contains JSON data.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- parse_json
  
  public static Column parse_json(Column json)
  
  Parses a JSON string and constructs a Variant value.
  
  Parameters:
  
  json - a string column that contains JSON data.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- is_variant_null
  
  public static Column is_variant_null(Column v)
  
  Checks whether a variant value is a variant null. Returns true if and only if the input is a variant null, and false otherwise (including in the case of SQL NULL).
  
  Parameters:
  
  v - a variant column.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- variant_get
  
  public static Column variant_get(Column v, String path, String targetType)
  
  Extracts a sub-variant from v according to path, and then casts the sub-variant to targetType. Returns null if the path does not exist. Throws an exception if the cast fails.
  
  Parameters:
  
  v - a variant column.
  
  path - the extraction path. A valid path starts with $, followed by zero or more segments such as [123], .name, ['name'], or ["name"].
  
  targetType - the target data type to cast into, in a DDL-formatted string.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
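  
  The path grammar described above can be approximated with a regular expression. The following is a minimal plain-Java sketch for pre-checking a path string, not Spark's own parser. The class name VariantPathSketch, the method looksValid, and the rule that a .name segment is any run of characters other than '.' and '[' are assumptions made for illustration.
  
  ```java
  import java.util.regex.Pattern;
  
  final class VariantPathSketch {
      // '$' followed by zero or more segments: [123], .name, ['name'], or ["name"].
      // Assumption: a .name segment is any non-empty run of characters other than '.' and '['.
      private static final Pattern PATH = Pattern.compile(
          "\\$(\\[\\d+\\]|\\.[^.\\[]+|\\['[^']*'\\]|\\[\"[^\"]*\"\\])*");
  
      // Returns true when the whole string matches the documented path shape.
      static boolean looksValid(String path) {
          return PATH.matcher(path).matches();
      }
  
      public static void main(String[] args) {
          System.out.println(looksValid("$.a[0]['b']"));
          System.out.println(looksValid("a.b"));
      }
  }
  ```
  
  A check like this only screens the shape of the path; whether the path exists in a given variant value is still decided at query time.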
- try_variant_get
  
  public static Column try_variant_get(Column v, String path, String targetType)
  
  Extracts a sub-variant from v according to path, and then casts the sub-variant to targetType. Returns null if the path does not exist or the cast fails.
  
  Parameters:
  
  v - a variant column.
  
  path - the extraction path. A valid path starts with $, followed by zero or more segments such as [123], .name, ['name'], or ["name"].
  
  targetType - the target data type to cast into, in a DDL-formatted string.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- schema_of_variant
  
  public static Column schema_of_variant(Column v)
  
  Returns the schema of a variant in SQL format.
  
  Parameters:
  
  v - a variant column.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- schema_of_variant_agg
  
  public static Column schema_of_variant_agg(Column v)
  
  Returns the merged schema of a variant column in SQL format.
  
  Parameters:
  
  v - a variant column.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- schema_of_json
  
  public static Column schema_of_json(String json)
  
  Parses a JSON string and infers its schema in DDL format.
  
  Parameters:
  
  json - a JSON string.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- schema_of_json
  
  public static Column schema_of_json(Column json)
  
  Parses a JSON string and infers its schema in DDL format.
  
  Parameters:
  
  json - a foldable string column containing a JSON string.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- schema_of_json
  
  public static Column schema_of_json(Column json, Map<String,String> options)
  
  Parses a JSON string and infers its schema in DDL format using options.
  
  Parameters:
  
  json - a foldable string column containing JSON data.
  
  options - options to control how the JSON is parsed. Accepts the same options as the JSON data source. See Data Source Option in the version you use.
  
  Returns:
  
  a column with string literal containing schema in DDL format.
  
  Since:
  
  3.0.0
- json_array_length
  
  public static Column json_array_length(Column e)
  
  Returns the number of elements in the outermost JSON array. Returns NULL for any other valid JSON string, for NULL input, or for invalid JSON.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- json_object_keys
  
  public static Column json_object_keys(Column e)
  
  Returns all the keys of the outermost JSON object as an array. If a valid JSON object is given, all the keys of the outermost object will be returned as an array. If it is any other valid JSON string, an invalid JSON string or an empty string, the function returns null.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- to_json
  
  public static Column to_json(Column e, scala.collection.immutable.Map<String,String> options)
  
  (Scala-specific) Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema. Throws an exception, in the case of an unsupported type.
  
  Parameters:
  
  e - a column containing a struct, an array or a map.
  
  options - options to control how the struct column is converted into a JSON string. Accepts the same options as the JSON data source. See Data Source Option in the version you use. Additionally, the function supports the pretty option, which enables pretty JSON generation.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- to_json
  
  public static Column to_json(Column e, Map<String,String> options)
  
  (Java-specific) Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema. Throws an exception, in the case of an unsupported type.
  
  Parameters:
  
  e - a column containing a struct, an array or a map.
  
  options - options to control how the struct column is converted into a JSON string. Accepts the same options as the JSON data source. See Data Source Option in the version you use. Additionally, the function supports the pretty option, which enables pretty JSON generation.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- to_json
  
  public static Column to_json(Column e)
  
  Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema. Throws an exception, in the case of an unsupported type.
  
  Parameters:
  
  e - a column containing a struct, an array or a map.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.1.0
- mask
  
  public static Column mask(Column input)
  
  Masks the given string value. The function replaces upper-case characters with 'X', lower-case characters with 'x', and digits with 'n'. This can be useful for creating copies of tables with sensitive information removed.
  
  Parameters:
  
  input - string value to mask. Supported types: STRING, VARCHAR, CHAR
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- mask
  
  public static Column mask(Column input, Column upperChar)
  
  Masks the given string value. The function replaces upper-case characters with the specified character, lower-case characters with 'x', and digits with 'n'. This can be useful for creating copies of tables with sensitive information removed.
  
  Parameters:
  
  input - string value to mask. Supported types: STRING, VARCHAR, CHAR
  
  upperChar - character to replace upper-case characters with. Specify NULL to retain original character.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- mask
  
  public static Column mask(Column input, Column upperChar, Column lowerChar)
  
  Masks the given string value. The function replaces upper-case and lower-case characters with the characters specified respectively, and numbers with 'n'. This can be useful for creating copies of tables with sensitive information removed.
  
  Parameters:
  
  input - string value to mask. Supported types: STRING, VARCHAR, CHAR
  
  upperChar - character to replace upper-case characters with. Specify NULL to retain original character.
  
  lowerChar - character to replace lower-case characters with. Specify NULL to retain original character.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- mask
  
  public static Column mask(Column input, Column upperChar, Column lowerChar, Column digitChar)
  
  Masks the given string value. The function replaces upper-case characters, lower-case characters, and digits with the respective specified characters. This can be useful for creating copies of tables with sensitive information removed.
  
  Parameters:
  
  input - string value to mask. Supported types: STRING, VARCHAR, CHAR
  
  upperChar - character to replace upper-case characters with. Specify NULL to retain original character.
  
  lowerChar - character to replace lower-case characters with. Specify NULL to retain original character.
  
  digitChar - character to replace digit characters with. Specify NULL to retain original character.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- mask
  
  public static Column mask(Column input, Column upperChar, Column lowerChar, Column digitChar, Column otherChar)
  
  Masks the given string value. This can be useful for creating copies of tables with sensitive information removed.
  
  Parameters:
  
  input - string value to mask. Supported types: STRING, VARCHAR, CHAR
  
  upperChar - character to replace upper-case characters with. Specify NULL to retain original character.
  
  lowerChar - character to replace lower-case characters with. Specify NULL to retain original character.
  
  digitChar - character to replace digit characters with. Specify NULL to retain original character.
  
  otherChar - character to replace all other characters with. Specify NULL to retain original character.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
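  
  The replacement rules of the mask overloads can be modelled in a few lines of plain Java. This is an illustrative sketch of the documented behaviour, not Spark's implementation. The class name MaskSketch is hypothetical, and the one-argument defaults ('X', 'x', 'n', other characters retained) are assumed from the descriptions above.
  
  ```java
  final class MaskSketch {
      // A null replacement keeps the original character ("Specify NULL to retain original character").
      static String mask(String input, Character upper, Character lower,
                         Character digit, Character other) {
          StringBuilder out = new StringBuilder(input.length());
          for (int i = 0; i < input.length(); i++) {
              char c = input.charAt(i);
              Character r;
              if (Character.isUpperCase(c)) {
                  r = upper;
              } else if (Character.isLowerCase(c)) {
                  r = lower;
              } else if (Character.isDigit(c)) {
                  r = digit;
              } else {
                  r = other;
              }
              out.append(r == null ? c : r.charValue());
          }
          return out.toString();
      }
  
      // Assumed defaults of the one-argument overload: 'X', 'x', 'n'; other characters are retained.
      static String mask(String input) {
          return mask(input, 'X', 'x', 'n', null);
      }
  
      public static void main(String[] args) {
          System.out.println(mask("AbCD123-@$#"));
      }
  }
  ```
  
  Passing null for one replacement while supplying the others corresponds to retaining only that character class.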
- size
  
  public static Column size(Column e)
  
  Returns the length of an array or map.
  This function returns -1 for null input only if spark.sql.ansi.enabled is false and spark.sql.legacy.sizeOfNull is true. Otherwise, it returns null for null input. With the default settings, the function returns null for null input.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
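  
  The null-input rule for size depends on two settings, which can be read as a small truth table. The sketch below encodes only that rule in plain Java. The class and method names are hypothetical, and a Java null stands in for SQL NULL.
  
  ```java
  final class SizeOfNullSketch {
      // Result of size(null) under the two settings named in the description.
      // -1 only when ANSI mode is off and the legacy flag is on; otherwise null (SQL NULL).
      static Integer sizeOfNull(boolean ansiEnabled, boolean legacySizeOfNull) {
          return (!ansiEnabled && legacySizeOfNull) ? Integer.valueOf(-1) : null;
      }
  
      public static void main(String[] args) {
          System.out.println(sizeOfNull(false, true));
          System.out.println(sizeOfNull(true, true));
      }
  }
  ```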
- cardinality
  
  public static Column cardinality(Column e)
  
  Returns the length of an array or map. This is an alias of the size function.
  This function returns -1 for null input only if spark.sql.ansi.enabled is false and spark.sql.legacy.sizeOfNull is true. Otherwise, it returns null for null input. With the default settings, the function returns null for null input.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- sort_array
  
  public static Column sort_array(Column e)
  
  Sorts the input array for the given column in ascending order, according to the natural ordering of the array elements. Null elements will be placed at the beginning of the returned array.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- sort_array
  
  public static Column sort_array(Column e, boolean asc)
  
  Sorts the input array for the given column in ascending or descending order, according to the natural ordering of the array elements. NaN is greater than any non-NaN elements for double/float type. Null elements will be placed at the beginning of the returned array in ascending order or at the end of the returned array in descending order.
  
  Parameters:
  
  e - (undocumented)
  
  asc - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
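  
  The ordering rules above (nulls first when ascending, nulls last when descending, NaN greater than any other value) can be mirrored with a standard Java comparator. This is a plain-Java model for double elements only, not Spark code, and SortArraySketch is a hypothetical name.
  
  ```java
  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.Comparator;
  import java.util.List;
  
  final class SortArraySketch {
      static List<Double> sortArray(List<Double> input, boolean asc) {
          // Double's natural ordering ranks NaN above every other value;
          // nullsFirst places null elements at the front for ascending order.
          Comparator<Double> ascending = Comparator.nullsFirst(Comparator.<Double>naturalOrder());
          List<Double> out = new ArrayList<>(input);
          // Reversing the whole comparator moves nulls to the end for descending order.
          out.sort(asc ? ascending : ascending.reversed());
          return out;
      }
  
      public static void main(String[] args) {
          List<Double> xs = Arrays.asList(3.0, null, Double.NaN, 1.0);
          System.out.println(sortArray(xs, true));
          System.out.println(sortArray(xs, false));
      }
  }
  ```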
- array_min
  
  public static Column array_min(Column e)
  
  Returns the minimum value in the array. NaN is greater than any non-NaN elements for double/float type. NULL elements are skipped.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- array_max
  
  public static Column array_max(Column e)
  
  Returns the maximum value in the array. NaN is greater than any non-NaN elements for double/float type. NULL elements are skipped.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
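  
  For array_min and array_max, the two rules stated above (NULL elements are skipped, NaN ranks above every other value) can be modelled in plain Java as follows. The class name is hypothetical, and returning null for an array with no non-null elements is an assumption of this sketch.
  
  ```java
  import java.util.Arrays;
  import java.util.List;
  import java.util.Objects;
  
  final class ArrayMinMaxSketch {
      // Double.compare treats NaN as greater than any other double value.
      static Double arrayMin(List<Double> xs) {
          return xs.stream().filter(Objects::nonNull).min(Double::compare).orElse(null);
      }
  
      static Double arrayMax(List<Double> xs) {
          return xs.stream().filter(Objects::nonNull).max(Double::compare).orElse(null);
      }
  
      public static void main(String[] args) {
          List<Double> xs = Arrays.asList(2.0, null, Double.NaN, 1.0);
          System.out.println(arrayMin(xs));
          System.out.println(arrayMax(xs));
      }
  }
  ```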
- array_size
  
  public static Column array_size(Column e)
  
  Returns the total number of elements in the array. The function returns null for null input.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- array_agg
  
  public static Column array_agg(Column e)
  
  Aggregate function: returns a list of objects with duplicates.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
  
  Note:
  
  The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
- shuffle
  
  public static Column shuffle(Column e)
  
  Returns a random permutation of the given array.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
  
  Note:
  
  The function is non-deterministic.
- reverse
  
  public static Column reverse(Column e)
  
  Returns a reversed string or an array with its elements in reverse order.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- flatten
  
  public static Column flatten(Column e)
  
  Creates a single array from an array of arrays. If a structure of nested arrays is deeper than two levels, only one level of nesting is removed.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
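  
  The "only one level of nesting is removed" rule is easiest to see on a small example. The following plain-Java sketch models it with lists; FlattenSketch is a hypothetical name and null handling is left out.
  
  ```java
  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.List;
  
  final class FlattenSketch {
      // Concatenates the inner lists; anything nested deeper stays nested.
      static <T> List<T> flatten(List<List<T>> nested) {
          List<T> out = new ArrayList<>();
          for (List<T> inner : nested) {
              out.addAll(inner);
          }
          return out;
      }
  
      public static void main(String[] args) {
          List<List<Integer>> two = Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3));
          System.out.println(flatten(two));
      }
  }
  ```
  
  Applied to a three-level structure, the result is still an array of arrays, one level shallower than the input.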
- sequence
  
  public static Column sequence(Column start, Column stop, Column step)
  
  Generates a sequence of integers from start to stop, incrementing by step.
  
  Parameters:
  
  start - (undocumented)
  
  stop - (undocumented)
  
  step - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- sequence
  
  public static Column sequence(Column start, Column stop)
  
  Generates a sequence of integers from start to stop, incrementing by 1 if start is less than or equal to stop, otherwise by -1.
  
  Parameters:
  
  start - (undocumented)
  
  stop - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
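  
  The two sequence overloads differ only in how the step is chosen. A plain-Java model of the documented rule is sketched below. The class name is hypothetical, and rejecting a zero step or a step that points away from stop is an assumption of this sketch rather than a statement about Spark's error behaviour.
  
  ```java
  import java.util.ArrayList;
  import java.util.List;
  
  final class SequenceSketch {
      static List<Long> sequence(long start, long stop, long step) {
          List<Long> out = new ArrayList<>();
          if (start == stop) {
              out.add(start);
              return out;
          }
          // Assumption: the step must be non-zero and lead from start toward stop.
          if (step == 0 || (stop > start) != (step > 0)) {
              throw new IllegalArgumentException("step does not lead from start to stop");
          }
          // stop is included only when it is reached exactly.
          for (long v = start; step > 0 ? v <= stop : v >= stop; v += step) {
              out.add(v);
          }
          return out;
      }
  
      // Two-argument form: step is 1 when start <= stop, otherwise -1.
      static List<Long> sequence(long start, long stop) {
          return sequence(start, stop, start <= stop ? 1 : -1);
      }
  
      public static void main(String[] args) {
          System.out.println(sequence(1, 5));
          System.out.println(sequence(5, 1));
          System.out.println(sequence(1, 10, 3));
      }
  }
  ```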
- array_repeat
  
  public static Column array_repeat(Column left, Column right)
  
  Creates an array containing the left argument repeated the number of times given by the right argument.
  
  Parameters:
  
  left - (undocumented)
  
  right - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- array_repeat
  
  public static Column array_repeat(Column e, int count)
  
  Creates an array containing e repeated count times.
  
  Parameters:
  
  e - (undocumented)
  
  count - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- map_contains_key
  
  public static Column map_contains_key(Column column, Object key)
  
  Returns true if the map contains the key.
  
  Parameters:
  
  column - (undocumented)
  
  key - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.3.0
- map_keys
  
  public static Column map_keys(Column e)
  
  Returns an unordered array containing the keys of the map.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- map_values
  
  public static Column map_values(Column e)
  
  Returns an unordered array containing the values of the map.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- map_entries
  
  public static Column map_entries(Column e)
  
  Returns an unordered array of all entries in the given map.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- map_from_entries
  
  public static Column map_from_entries(Column e)
  
  Returns a map created from the given array of entries.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- arrays_zip
  
  public static Column arrays_zip(scala.collection.immutable.Seq<Column> e)
  
  Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
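  
  The "N-th struct contains all N-th values" rule can be pictured with ordinary lists. The sketch below is a plain-Java model in which each struct is represented as a list. The class name is hypothetical, and padding shorter inputs with null is an assumption of this sketch.
  
  ```java
  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.List;
  
  final class ArraysZipSketch {
      static List<List<Object>> arraysZip(List<?>... arrays) {
          int n = 0;
          for (List<?> a : arrays) {
              n = Math.max(n, a.size());
          }
          List<List<Object>> out = new ArrayList<>();
          for (int i = 0; i < n; i++) {
              // Row i collects the i-th value of every input array.
              List<Object> row = new ArrayList<>();
              for (List<?> a : arrays) {
                  // Assumption: positions missing from a shorter array become null.
                  row.add(i < a.size() ? a.get(i) : null);
              }
              out.add(row);
          }
          return out;
      }
  
      public static void main(String[] args) {
          System.out.println(arraysZip(Arrays.asList(1, 2), Arrays.asList("a", "b")));
      }
  }
  ```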
- map_concat
  
  public static Column map_concat(scala.collection.immutable.Seq<Column> cols)
  
  Returns the union of all the given maps.
  
  Parameters:
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.4.0
- from_csv
  
  public static Column from_csv(Column e, StructType schema, scala.collection.immutable.Map<String,String> options)
  
  Parses a column containing a CSV string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing CSV data.
  
  schema - the schema to use when parsing the CSV string
  
  options - options to control how the CSV is parsed. Accepts the same options as the CSV data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- from_csv
  
  public static Column from_csv(Column e, Column schema, Map<String,String> options)
  
  (Java-specific) Parses a column containing a CSV string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing CSV data.
  
  schema - the schema to use when parsing the CSV string
  
  options - options to control how the CSV is parsed. Accepts the same options as the CSV data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- schema_of_csv
  
  public static Column schema_of_csv(String csv)
  
  Parses a CSV string and infers its schema in DDL format.
  
  Parameters:
  
  csv - a CSV string.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- schema_of_csv
  
  public static Column schema_of_csv(Column csv)
  
  Parses a CSV string and infers its schema in DDL format.
  
  Parameters:
  
  csv - a foldable string column containing a CSV string.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- schema_of_csv
  
  public static Column schema_of_csv(Column csv, Map<String,String> options)
  
  Parses a CSV string and infers its schema in DDL format using options.
  
  Parameters:
  
  csv - a foldable string column containing a CSV string.
  
  options - options to control how the CSV is parsed. Accepts the same options as the CSV data source. See Data Source Option in the version you use.
  
  Returns:
  
  a column with string literal containing schema in DDL format.
  
  Since:
  
  3.0.0
- to_csv
  
  public static Column to_csv(Column e, Map<String,String> options)
  
  (Java-specific) Converts a column containing a StructType into a CSV string with the specified schema. Throws an exception, in the case of an unsupported type.
  
  Parameters:
  
  e - a column containing a struct.
  
  options - options to control how the struct column is converted into a CSV string. It accepts the same options as the CSV data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- to_csv
  
  public static Column to_csv(Column e)
  
  Converts a column containing a StructType into a CSV string with the specified schema. Throws an exception, in the case of an unsupported type.
  
  Parameters:
  
  e - a column containing a struct.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- from_xml
  
  public static Column from_xml(Column e, StructType schema, Map<String,String> options)
  
  Parses a column containing an XML string into the data type corresponding to the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing XML data.
  
  schema - the schema to use when parsing the XML string
  
  options - options to control how the XML is parsed. Accepts the same options as the XML data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- from_xml
  
  public static Column from_xml(Column e, String schema, Map<String,String> options)
  
  (Java-specific) Parses a column containing an XML string into a StructType with the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing XML data.
  
  schema - the schema as a DDL-formatted string.
  
  options - options to control how the XML is parsed. Accepts the same options as the XML data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- from_xml
  
  public static Column from_xml(Column e, Column schema)
  
  (Java-specific) Parses a column containing an XML string into a StructType with the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing XML data.
  
  schema - the schema to use when parsing the XML string
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- from_xml
  
  public static Column from_xml(Column e, Column schema, Map<String,String> options)
  
  (Java-specific) Parses a column containing an XML string into a StructType with the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing XML data.
  
  schema - the schema to use when parsing the XML string
  
  options - options to control how the XML is parsed. Accepts the same options as the XML data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- from_xml
  
  public static Column from_xml(Column e, StructType schema)
  
  Parses a column containing an XML string into the data type corresponding to the specified schema. Returns null in the case of an unparseable string.
  
  Parameters:
  
  e - a string column containing XML data.
  
  schema - the schema to use when parsing the XML string
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- schema_of_xml
  
  public static Column schema_of_xml(String xml)
  
  Parses an XML string and infers its schema in DDL format.
  
  Parameters:
  
  xml - an XML string.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- schema_of_xml
  
  public static Column schema_of_xml(Column xml)
  
  Parses an XML string and infers its schema in DDL format.
  
  Parameters:
  
  xml - a foldable string column containing an XML string.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- schema_of_xml
  
  public static Column schema_of_xml(Column xml, Map<String,String> options)
  
  Parses an XML string and infers its schema in DDL format using options.
  
  Parameters:
  
  xml - a foldable string column containing XML data.
  
  options - options to control how the XML is parsed. Accepts the same options as the XML data source. See Data Source Option in the version you use.
  
  Returns:
  
  a column with string literal containing schema in DDL format.
  
  Since:
  
  4.0.0
- to_xml
  
  public static Column to_xml(Column e, Map<String,String> options)
  
  (Java-specific) Converts a column containing a StructType into an XML string with the specified schema. Throws an exception in the case of an unsupported type.
  
  Parameters:
  
  e - a column containing a struct.
  
  options - options to control how the struct column is converted into an XML string. It accepts the same options as the XML data source. See Data Source Option in the version you use.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- to_xml
  
  public static Column to_xml(Column e)
  
  Converts a column containing a StructType into an XML string with the specified schema. Throws an exception in the case of an unsupported type.
  
  Parameters:
  
  e - a column containing a struct.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  4.0.0
- years
  
  public static Column years(Column e)
  
  (Java-specific) A transform for timestamps and dates to partition data into years.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- months
  
  public static Column months(Column e)
  
  (Java-specific) A transform for timestamps and dates to partition data into months.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- days
  
  public static Column days(Column e)
  
  (Java-specific) A transform for timestamps and dates to partition data into days.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- xpath
  
  public static Column xpath(Column xml, Column path)
  
  Returns a string array of values within the nodes of xml that match the XPath expression.
  
  Parameters:
  
  xml - (undocumented)
  
  path - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- xpath_boolean
  
  public static Column xpath_boolean(Column xml, Column path)
  
  Returns true if the XPath expression evaluates to true, or if a matching node is found.
  
  Parameters:
  
  xml - (undocumented)
  
  path - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- xpath_double
  
  public static Column xpath_double(Column xml, Column path)
  
  Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
  
  Parameters:
  
  xml - (undocumented)
  
  path - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- xpath_number
  
  public static Column xpath_number(Column xml, Column path)
  
  Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
  
  Parameters:
  
  xml - (undocumented)
  
  path - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- xpath_float
  
  public static Column xpath_float(Column xml, Column path)
  
  Returns a float value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
  
  Parameters:
  
  xml - (undocumented)
  
  path - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- xpath_int
  
  public static Column xpath_int(Column xml, Column path)
  
  Returns an integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
  
  Parameters:
  
  xml - (undocumented)
  
  path - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- xpath_long
  
  public static Column xpath_long(Column xml, Column path)
  
  Returns a long integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
  
  Parameters:
  
  xml - (undocumented)
  
  path - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- xpath_short
  
  public static Column xpath_short(Column xml, Column path)
  
  Returns a short integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
  
  Parameters:
  
  xml - (undocumented)
  
  path - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- xpath_string
  
  public static Column xpath_string(Column xml, Column path)
  
  Returns the text contents of the first xml node that matches the XPath expression.
  
  Parameters:
  
  xml - (undocumented)
  
  path - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
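  
  The first-match and truth semantics described for xpath_string and xpath_boolean are those of ordinary XPath evaluation, which can be tried locally with the JDK's javax.xml.xpath package. The sketch below is a standalone illustration and makes no claim about how Spark evaluates these functions. The class and method names are hypothetical, and results for non-matching paths may differ from Spark's.
  
  ```java
  import java.io.StringReader;
  import javax.xml.xpath.XPathConstants;
  import javax.xml.xpath.XPathExpressionException;
  import javax.xml.xpath.XPathFactory;
  import org.xml.sax.InputSource;
  
  final class XPathSketch {
      // Text content of the first node (in document order) that matches the expression.
      static String xpathString(String xml, String path) {
          try {
              return XPathFactory.newInstance().newXPath()
                  .evaluate(path, new InputSource(new StringReader(xml)));
          } catch (XPathExpressionException e) {
              throw new IllegalArgumentException("invalid XML or XPath expression", e);
          }
      }
  
      // True when the expression evaluates to true or selects at least one node.
      static boolean xpathBoolean(String xml, String path) {
          try {
              return (Boolean) XPathFactory.newInstance().newXPath()
                  .evaluate(path, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
          } catch (XPathExpressionException e) {
              throw new IllegalArgumentException("invalid XML or XPath expression", e);
          }
      }
  
      public static void main(String[] args) {
          System.out.println(xpathString("<a><b>b</b><c>cc</c></a>", "a/c"));
          System.out.println(xpathBoolean("<a><b>1</b></a>", "a/b"));
      }
  }
  ```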
- hours
  
  public static Column hours(Column e)
  
  (Java-specific) A transform for timestamps to partition data into hours.
  
  Parameters:
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- convert_timezone
  
  public static Column convert_timezone(Column sourceTz, Column targetTz, Column sourceTs)
  
  Converts the timestamp without time zone sourceTs from the sourceTz time zone to targetTz.
  
  Parameters:
  
  sourceTz - the time zone for the input timestamp. If it is omitted, the current session time zone is used as the source time zone.
  
  targetTz - the time zone to which the input timestamp should be converted.
  
  sourceTs - a timestamp without time zone.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
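The conversion described above can be sketched with `java.time`: interpret the zone-less timestamp in the source zone, then read the same instant as a local date-time in the target zone. This is a plain-JDK model under that assumption, not Spark code. `ConvertTimezoneSketch` and `convertTimezone` are hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

/** Plain-JDK model of convert_timezone(sourceTz, targetTz, sourceTs); not Spark code. */
final class ConvertTimezoneSketch {

    // Hypothetical helper mirroring the three-argument overload.
    static LocalDateTime convertTimezone(String sourceTz, String targetTz, LocalDateTime sourceTs) {
        return sourceTs
                .atZone(ZoneId.of(sourceTz))              // attach the source zone
                .withZoneSameInstant(ZoneId.of(targetTz)) // same instant, target zone
                .toLocalDateTime();                       // drop the zone again
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2021, 12, 6, 0, 0);
        System.out.println(convertTimezone("Europe/Brussels", "America/Los_Angeles", ts));
    }
}
```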
- convert_timezone
  
  public static Column convert_timezone(Column targetTz, Column sourceTs)
  
  Converts the timestamp without time zone sourceTs from the current time zone to targetTz.
  
  Parameters:
  
  targetTz - the time zone to which the input timestamp should be converted.
  
  sourceTs - a timestamp without time zone.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_dt_interval
  
  public static Column make_dt_interval(Column days, Column hours, Column mins, Column secs)
  
  Makes a DayTimeIntervalType duration from days, hours, mins and secs.
  
  Parameters:
  
  days - (undocumented)
  
  hours - (undocumented)
  
  mins - (undocumented)
  
  secs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
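As a rough model of how the four fields combine, the sketch below builds the equivalent `java.time.Duration`. It is a plain-JDK illustration, not Spark code. The names `DayTimeIntervalSketch` and `makeDtInterval` are hypothetical, and rounding the seconds to whole microseconds is an assumption of this sketch.

```java
import java.time.Duration;
import java.time.temporal.ChronoUnit;

/** Plain-JDK model of make_dt_interval(days, hours, mins, secs); not Spark code. */
final class DayTimeIntervalSketch {

    // Hypothetical helper: fractional seconds are rounded to whole microseconds.
    static Duration makeDtInterval(int days, int hours, int mins, double secs) {
        long micros = Math.round(secs * 1_000_000d);
        return Duration.ofDays(days)
                .plusHours(hours)
                .plusMinutes(mins)
                .plus(Duration.of(micros, ChronoUnit.MICROS));
    }

    public static void main(String[] args) {
        System.out.println(makeDtInterval(1, 12, 30, 1.5));
    }
}
```

In this model, an overload with fewer arguments corresponds to passing zero for each omitted field.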
- make_dt_interval
  
  public static Column make_dt_interval(Column days, Column hours, Column mins)
  
  Makes a DayTimeIntervalType duration from days, hours and mins.
  
  Parameters:
  
  days - (undocumented)
  
  hours - (undocumented)
  
  mins - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_dt_interval
  
  public static Column make_dt_interval(Column days, Column hours)
  
  Makes a DayTimeIntervalType duration from days and hours.
  
  Parameters:
  
  days - (undocumented)
  
  hours - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_dt_interval
  
  public static Column make_dt_interval(Column days)
  
  Makes a DayTimeIntervalType duration from days.
  
  Parameters:
  
  days - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_dt_interval
  
  public static Column make_dt_interval()
  
  Makes a DayTimeIntervalType duration.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_interval
  
  public static Column make_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins, Column secs)
  
  Makes an interval from years, months, weeks, days, hours, mins and secs.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  weeks - (undocumented)
  
  days - (undocumented)
  
  hours - (undocumented)
  
  mins - (undocumented)
  
  secs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_interval
  
  public static Column make_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins)
  
  Makes an interval from years, months, weeks, days, hours and mins.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  weeks - (undocumented)
  
  days - (undocumented)
  
  hours - (undocumented)
  
  mins - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_interval
  
  public static Column make_interval(Column years, Column months, Column weeks, Column days, Column hours)
  
  Makes an interval from years, months, weeks, days and hours.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  weeks - (undocumented)
  
  days - (undocumented)
  
  hours - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_interval
  
  public static Column make_interval(Column years, Column months, Column weeks, Column days)
  
  Makes an interval from years, months, weeks and days.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  weeks - (undocumented)
  
  days - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_interval
  
  public static Column make_interval(Column years, Column months, Column weeks)
  
  Makes an interval from years, months and weeks.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  weeks - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_interval
  
  public static Column make_interval(Column years, Column months)
  
  Makes an interval from years and months.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_interval
  
  public static Column make_interval(Column years)
  
  Makes an interval from years.
  
  Parameters:
  
  years - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_interval
  
  public static Column make_interval()
  
  Makes an interval.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_timestamp
  
  public static Column make_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone)
  
  Creates a timestamp from the years, months, days, hours, mins, secs and timezone fields. The result data type follows the configuration spark.sql.timestampType. If spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs; otherwise, it throws an error.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  days - (undocumented)
  
  hours - (undocumented)
  
  mins - (undocumented)
  
  secs - (undocumented)
  
  timezone - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_timestamp
  
  public static Column make_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs)
  
  Creates a timestamp from the years, months, days, hours, mins and secs fields. The result data type follows the configuration spark.sql.timestampType. If spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs; otherwise, it throws an error.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  days - (undocumented)
  
  hours - (undocumented)
  
  mins - (undocumented)
  
  secs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_timestamp_ltz
  
  public static Column make_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone)
  
  Creates a timestamp with local time zone from the years, months, days, hours, mins, secs and timezone fields. If spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs; otherwise, it throws an error.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  days - (undocumented)
  
  hours - (undocumented)
  
  mins - (undocumented)
  
  secs - (undocumented)
  
  timezone - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_timestamp_ltz
  
  public static Column make_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs)
  
  Creates a timestamp with local time zone from the years, months, days, hours, mins and secs fields. If spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs; otherwise, it throws an error.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  days - (undocumented)
  
  hours - (undocumented)
  
  mins - (undocumented)
  
  secs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_timestamp_ntz
  
  public static Column make_timestamp_ntz(Column years, Column months, Column days, Column hours, Column mins, Column secs)
  
  Creates a local date-time from the years, months, days, hours, mins and secs fields. If spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs; otherwise, it throws an error.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  days - (undocumented)
  
  hours - (undocumented)
  
  mins - (undocumented)
  
  secs - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
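The two invalid-input behaviors described above (NULL when spark.sql.ansi.enabled is false, an error otherwise) can be modeled with `java.time.LocalDateTime`. This is a plain-JDK sketch, not Spark code. The names `MakeTimestampSketch`, `makeTimestampNtz` and the `ansiEnabled` flag are hypothetical, and whole-number seconds are a simplification made here.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;

/** Plain-JDK model of the make_timestamp_ntz error contract; not Spark code. */
final class MakeTimestampSketch {

    // Hypothetical helper: `ansiEnabled` stands in for spark.sql.ansi.enabled.
    static LocalDateTime makeTimestampNtz(int years, int months, int days,
                                          int hours, int mins, int secs,
                                          boolean ansiEnabled) {
        try {
            return LocalDateTime.of(years, months, days, hours, mins, secs);
        } catch (DateTimeException e) {
            if (ansiEnabled) {
                throw e;     // ANSI mode: invalid fields raise an error
            }
            return null;     // non-ANSI mode: invalid fields yield NULL
        }
    }

    public static void main(String[] args) {
        System.out.println(makeTimestampNtz(2024, 2, 29, 12, 30, 15, false));
        System.out.println(makeTimestampNtz(2024, 2, 30, 0, 0, 0, false));
    }
}
```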
- make_ym_interval
  
  public static Column make_ym_interval(Column years, Column months)
  
  Makes a year-month interval from years and months.
  
  Parameters:
  
  years - (undocumented)
  
  months - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_ym_interval
  
  public static Column make_ym_interval(Column years)
  
  Makes a year-month interval from years.
  
  Parameters:
  
  years - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- make_ym_interval
  
  public static Column make_ym_interval()
  
  Makes a year-month interval.
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- bucket
  
  public static Column bucket(Column numBuckets, Column e)
  
  (Java-specific) A transform for any type that partitions by a hash of the input column.
  
  Parameters:
  
  numBuckets - (undocumented)
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- bucket
  
  public static Column bucket(int numBuckets, Column e)
  
  (Java-specific) A transform for any type that partitions by a hash of the input column.
  
  Parameters:
  
  numBuckets - (undocumented)
  
  e - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.0.0
- ifnull
  
  public static Column ifnull(Column col1, Column col2)
  
  Returns col2 if col1 is null, or col1 otherwise.
  
  Parameters:
  
  col1 - (undocumented)
  
  col2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- isnotnull
  
  public static Column isnotnull(Column col)
  
  Returns true if col is not null, or false otherwise.
  
  Parameters:
  
  col - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- equal_null
  
  public static Column equal_null(Column col1, Column col2)
  
  Returns the same result as the EQUAL(=) operator for non-null operands, but returns true if both are null and false if only one of them is null.
  
  Parameters:
  
  col1 - (undocumented)
  
  col2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- nullif
  
  public static Column nullif(Column col1, Column col2)
  
  Returns null if col1 equals col2, or col1 otherwise.
  
  Parameters:
  
  col1 - (undocumented)
  
  col2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- nvl
  
  public static Column nvl(Column col1, Column col2)
  
  Returns col2 if col1 is null, or col1 otherwise.
  
  Parameters:
  
  col1 - (undocumented)
  
  col2 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- nvl2
  
  public static Column nvl2(Column col1, Column col2, Column col3)
  
  Returns col2 if col1 is not null, or col3 otherwise.
  
  Parameters:
  
  col1 - (undocumented)
  
  col2 - (undocumented)
  
  col3 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
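The null-handling functions above (ifnull, isnotnull, equal_null, nullif, nvl, nvl2) differ only in small ways, so a side-by-side model may help. The sketch below restates their documented per-row results on nullable Java values. It is a plain-JDK illustration, not Spark code, and `NullFunctionsSketch` with its method names is hypothetical.

```java
import java.util.Objects;

/** Plain-JDK model of the documented null-handling results; not Spark code. */
final class NullFunctionsSketch {

    // nvl / ifnull: col2 if col1 is null, col1 otherwise.
    static <T> T nvl(T col1, T col2) {
        return col1 != null ? col1 : col2;
    }

    static <T> T ifnull(T col1, T col2) {
        return nvl(col1, col2);
    }

    // nvl2: col2 if col1 is not null, col3 otherwise.
    static <T> T nvl2(Object col1, T col2, T col3) {
        return col1 != null ? col2 : col3;
    }

    // nullif: null if col1 equals col2, col1 otherwise.
    static <T> T nullif(T col1, T col2) {
        return (col1 != null && col1.equals(col2)) ? null : col1;
    }

    // isnotnull: true if col is not null.
    static boolean isnotnull(Object col) {
        return col != null;
    }

    // equal_null: true if both are null, false if only one is null,
    // ordinary equality otherwise.
    static boolean equalNull(Object col1, Object col2) {
        return Objects.equals(col1, col2);
    }

    public static void main(String[] args) {
        System.out.println(nvl(null, "fallback"));
        System.out.println(nvl2(null, "present", "absent"));
        System.out.println(nullif(7, 7));
        System.out.println(equalNull(null, null));
    }
}
```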
- udaf
  
  public static <IN, BUF, OUT> UserDefinedFunction udaf(Aggregator<IN,BUF,OUT> agg, scala.reflect.api.TypeTags.TypeTag<IN> evidence$3)
  Obtains a UserDefinedFunction that wraps the given Aggregator so that it may be used with untyped DataFrames.

      val agg = // Aggregator[IN, BUF, OUT]

      // declare a UDF based on agg
      val aggUDF = udaf(agg)
      val aggData = df.agg(aggUDF($"colname"))

      // register agg as a named function
      spark.udf.register("myAggName", udaf(agg))

  Parameters:
  
  agg - the typed Aggregator
  
  evidence$3 - (undocumented)
  
  Returns:
  
  a UserDefinedFunction that can be used as an aggregating expression.
  
  Note:
  
  The input encoder is inferred from the input type IN.
- udaf
  
  public static <IN, BUF, OUT> UserDefinedFunction udaf(Aggregator<IN,BUF,OUT> agg, Encoder<IN> inputEncoder)
  Obtains a UserDefinedFunction that wraps the given Aggregator so that it may be used with untyped DataFrames.

      Aggregator<IN, BUF, OUT> agg = // custom Aggregator
      Encoder<IN> enc = // input encoder

      // declare a UDF based on agg
      UserDefinedFunction aggUDF = udaf(agg, enc);
      Dataset<Row> aggData = df.agg(aggUDF.apply(col("colname")));

      // register agg as a named function
      spark.udf().register("myAggName", udaf(agg, enc));

  Parameters:
  
  agg - the typed Aggregator
  
  inputEncoder - a specific input encoder to use
  
  Returns:
  
  a UserDefinedFunction that can be used as an aggregating expression
  
  Note:
  
  This overloading takes an explicit input encoder, to support UDAF declarations in Java.
- udf
  
  public static <RT> UserDefinedFunction udf(scala.Function0<RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$4)
  
  Defines a Scala closure of 0 arguments as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$4 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static <RT, A1> UserDefinedFunction udf(scala.Function1<A1,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$5, scala.reflect.api.TypeTags.TypeTag<A1> evidence$6)
  
  Defines a Scala closure of 1 argument as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$5 - (undocumented)
  
  evidence$6 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static <RT, A1, A2> UserDefinedFunction udf(scala.Function2<A1,A2,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$7, scala.reflect.api.TypeTags.TypeTag<A1> evidence$8, scala.reflect.api.TypeTags.TypeTag<A2> evidence$9)
  
  Defines a Scala closure of 2 arguments as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$7 - (undocumented)
  
  evidence$8 - (undocumented)
  
  evidence$9 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static <RT, A1, A2, A3> UserDefinedFunction udf(scala.Function3<A1,A2,A3,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$10, scala.reflect.api.TypeTags.TypeTag<A1> evidence$11, scala.reflect.api.TypeTags.TypeTag<A2> evidence$12, scala.reflect.api.TypeTags.TypeTag<A3> evidence$13)
  
  Defines a Scala closure of 3 arguments as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$10 - (undocumented)
  
  evidence$11 - (undocumented)
  
  evidence$12 - (undocumented)
  
  evidence$13 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static <RT, A1, A2, A3, A4> UserDefinedFunction udf(scala.Function4<A1,A2,A3,A4,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$14, scala.reflect.api.TypeTags.TypeTag<A1> evidence$15, scala.reflect.api.TypeTags.TypeTag<A2> evidence$16, scala.reflect.api.TypeTags.TypeTag<A3> evidence$17, scala.reflect.api.TypeTags.TypeTag<A4> evidence$18)
  
  Defines a Scala closure of 4 arguments as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$14 - (undocumented)
  
  evidence$15 - (undocumented)
  
  evidence$16 - (undocumented)
  
  evidence$17 - (undocumented)
  
  evidence$18 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static <RT, A1, A2, A3, A4, A5> UserDefinedFunction udf(scala.Function5<A1,A2,A3,A4,A5,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$19, scala.reflect.api.TypeTags.TypeTag<A1> evidence$20, scala.reflect.api.TypeTags.TypeTag<A2> evidence$21, scala.reflect.api.TypeTags.TypeTag<A3> evidence$22, scala.reflect.api.TypeTags.TypeTag<A4> evidence$23, scala.reflect.api.TypeTags.TypeTag<A5> evidence$24)
  
  Defines a Scala closure of 5 arguments as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$19 - (undocumented)
  
  evidence$20 - (undocumented)
  
  evidence$21 - (undocumented)
  
  evidence$22 - (undocumented)
  
  evidence$23 - (undocumented)
  
  evidence$24 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static <RT, A1, A2, A3, A4, A5, A6> UserDefinedFunction udf(scala.Function6<A1,A2,A3,A4,A5,A6,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$25, scala.reflect.api.TypeTags.TypeTag<A1> evidence$26, scala.reflect.api.TypeTags.TypeTag<A2> evidence$27, scala.reflect.api.TypeTags.TypeTag<A3> evidence$28, scala.reflect.api.TypeTags.TypeTag<A4> evidence$29, scala.reflect.api.TypeTags.TypeTag<A5> evidence$30, scala.reflect.api.TypeTags.TypeTag<A6> evidence$31)
  
  Defines a Scala closure of 6 arguments as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$25 - (undocumented)
  
  evidence$26 - (undocumented)
  
  evidence$27 - (undocumented)
  
  evidence$28 - (undocumented)
  
  evidence$29 - (undocumented)
  
  evidence$30 - (undocumented)
  
  evidence$31 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static <RT, A1, A2, A3, A4, A5, A6, A7> UserDefinedFunction udf(scala.Function7<A1,A2,A3,A4,A5,A6,A7,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$32, scala.reflect.api.TypeTags.TypeTag<A1> evidence$33, scala.reflect.api.TypeTags.TypeTag<A2> evidence$34, scala.reflect.api.TypeTags.TypeTag<A3> evidence$35, scala.reflect.api.TypeTags.TypeTag<A4> evidence$36, scala.reflect.api.TypeTags.TypeTag<A5> evidence$37, scala.reflect.api.TypeTags.TypeTag<A6> evidence$38, scala.reflect.api.TypeTags.TypeTag<A7> evidence$39)
  
  Defines a Scala closure of 7 arguments as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$32 - (undocumented)
  
  evidence$33 - (undocumented)
  
  evidence$34 - (undocumented)
  
  evidence$35 - (undocumented)
  
  evidence$36 - (undocumented)
  
  evidence$37 - (undocumented)
  
  evidence$38 - (undocumented)
  
  evidence$39 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static <RT, A1, A2, A3, A4, A5, A6, A7, A8> UserDefinedFunction udf(scala.Function8<A1,A2,A3,A4,A5,A6,A7,A8,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$40, scala.reflect.api.TypeTags.TypeTag<A1> evidence$41, scala.reflect.api.TypeTags.TypeTag<A2> evidence$42, scala.reflect.api.TypeTags.TypeTag<A3> evidence$43, scala.reflect.api.TypeTags.TypeTag<A4> evidence$44, scala.reflect.api.TypeTags.TypeTag<A5> evidence$45, scala.reflect.api.TypeTags.TypeTag<A6> evidence$46, scala.reflect.api.TypeTags.TypeTag<A7> evidence$47, scala.reflect.api.TypeTags.TypeTag<A8> evidence$48)
  
  Defines a Scala closure of 8 arguments as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$40 - (undocumented)
  
  evidence$41 - (undocumented)
  
  evidence$42 - (undocumented)
  
  evidence$43 - (undocumented)
  
  evidence$44 - (undocumented)
  
  evidence$45 - (undocumented)
  
  evidence$46 - (undocumented)
  
  evidence$47 - (undocumented)
  
  evidence$48 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static <RT, A1, A2, A3, A4, A5, A6, A7, A8, A9> UserDefinedFunction udf(scala.Function9<A1,A2,A3,A4,A5,A6,A7,A8,A9,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$49, scala.reflect.api.TypeTags.TypeTag<A1> evidence$50, scala.reflect.api.TypeTags.TypeTag<A2> evidence$51, scala.reflect.api.TypeTags.TypeTag<A3> evidence$52, scala.reflect.api.TypeTags.TypeTag<A4> evidence$53, scala.reflect.api.TypeTags.TypeTag<A5> evidence$54, scala.reflect.api.TypeTags.TypeTag<A6> evidence$55, scala.reflect.api.TypeTags.TypeTag<A7> evidence$56, scala.reflect.api.TypeTags.TypeTag<A8> evidence$57, scala.reflect.api.TypeTags.TypeTag<A9> evidence$58)
  
  Defines a Scala closure of 9 arguments as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$49 - (undocumented)
  
  evidence$50 - (undocumented)
  
  evidence$51 - (undocumented)
  
  evidence$52 - (undocumented)
  
  evidence$53 - (undocumented)
  
  evidence$54 - (undocumented)
  
  evidence$55 - (undocumented)
  
  evidence$56 - (undocumented)
  
  evidence$57 - (undocumented)
  
  evidence$58 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static <RT, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10> UserDefinedFunction udf(scala.Function10<A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$59, scala.reflect.api.TypeTags.TypeTag<A1> evidence$60, scala.reflect.api.TypeTags.TypeTag<A2> evidence$61, scala.reflect.api.TypeTags.TypeTag<A3> evidence$62, scala.reflect.api.TypeTags.TypeTag<A4> evidence$63, scala.reflect.api.TypeTags.TypeTag<A5> evidence$64, scala.reflect.api.TypeTags.TypeTag<A6> evidence$65, scala.reflect.api.TypeTags.TypeTag<A7> evidence$66, scala.reflect.api.TypeTags.TypeTag<A8> evidence$67, scala.reflect.api.TypeTags.TypeTag<A9> evidence$68, scala.reflect.api.TypeTags.TypeTag<A10> evidence$69)
  
  Defines a Scala closure of 10 arguments as a user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  evidence$59 - (undocumented)
  
  evidence$60 - (undocumented)
  
  evidence$61 - (undocumented)
  
  evidence$62 - (undocumented)
  
  evidence$63 - (undocumented)
  
  evidence$64 - (undocumented)
  
  evidence$65 - (undocumented)
  
  evidence$66 - (undocumented)
  
  evidence$67 - (undocumented)
  
  evidence$68 - (undocumented)
  
  evidence$69 - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.3.0
- udf
  
  public static UserDefinedFunction udf(UDF0<?> f, DataType returnType)
  
  Defines a Java UDF0 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(UDF1<?,?> f, DataType returnType)
  
  Defines a Java UDF1 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(UDF2<?,?,?> f, DataType returnType)
  
  Defines a Java UDF2 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(UDF3<?,?,?,?> f, DataType returnType)
  
  Defines a Java UDF3 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(UDF4<?,?,?,?,?> f, DataType returnType)
  
  Defines a Java UDF4 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(UDF5<?,?,?,?,?,?> f, DataType returnType)
  
  Defines a Java UDF5 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(UDF6<?,?,?,?,?,?,?> f, DataType returnType)
  
  Defines a Java UDF6 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(UDF7<?,?,?,?,?,?,?,?> f, DataType returnType)
  
  Defines a Java UDF7 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(UDF8<?,?,?,?,?,?,?,?,?> f, DataType returnType)
  
  Defines a Java UDF8 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(UDF9<?,?,?,?,?,?,?,?,?,?> f, DataType returnType)
  
  Defines a Java UDF9 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(UDF10<?,?,?,?,?,?,?,?,?,?,?> f, DataType returnType)
  
  Defines a Java UDF10 instance as a user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().
  
  Parameters:
  
  f - (undocumented)
  
  returnType - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.3.0
- udf
  
  public static UserDefinedFunction udf(Object f, DataType dataType)
  
  Deprecated.
  The Scala `udf` method with a return type parameter is deprecated since 3.0.0. Please use the Scala `udf` method without a return type parameter.
  
  Defines a deterministic user-defined function (UDF) using a Scala closure. For this variant, the caller must specify the output data type, and there is no automatic input type coercion. By default, the returned UDF is deterministic. To make it nondeterministic, call UserDefinedFunction.asNondeterministic().

  Note that although the Scala closure can have primitive-type arguments, it does not work well with null values. Because the Scala closure is passed in as Any, there is no type information for the function arguments. Without that type information, Spark may blindly pass null to a Scala closure with a primitive-type argument, and the closure will see the default value of the Java type in place of the null. For example, with udf((x: Int) => x, IntegerType), the result is 0 for null input.
  
  Parameters:
  
  f - A closure in Scala
  
  dataType - The output data type of the UDF
  
  Returns:
  
  (undocumented)
  
  Since:
  
  2.0.0
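The null pitfall noted above can be modeled without Spark. In the sketch below, `callUntyped` mimics a caller that has no argument type information and therefore substitutes the primitive default for null, which is the behavior the note describes. `callNullAware` is a contrasting illustration of a caller that knows the argument is primitive and short-circuits to null. Both methods and the class `PrimitiveNullSketch` are hypothetical and only illustrate the contract. They are not Spark's implementation.

```java
import java.util.function.IntUnaryOperator;

/** Plain-JDK model of null input reaching a primitive-typed closure; not Spark code. */
final class PrimitiveNullSketch {

    // No type information: a null input reaches the closure as the int default 0.
    static Integer callUntyped(IntUnaryOperator f, Integer input) {
        int arg = (input == null) ? 0 : input;
        return f.applyAsInt(arg);
    }

    // Type information available: a null input skips the closure and yields null.
    static Integer callNullAware(IntUnaryOperator f, Integer input) {
        return (input == null) ? null : Integer.valueOf(f.applyAsInt(input));
    }

    public static void main(String[] args) {
        IntUnaryOperator identity = x -> x;
        System.out.println(callUntyped(identity, null));   // default value leaks through
        System.out.println(callNullAware(identity, null)); // null is preserved
    }
}
```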
- callUDF
  
  public static Column callUDF(String udfName, scala.collection.immutable.Seq<Column> cols)
  
  Deprecated.
  Use call_udf.
  
  Calls a user-defined function.
  
  Parameters:
  
  udfName - (undocumented)
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  1.5.0
- call_udf
  
  public static Column call_udf(String udfName, scala.collection.immutable.Seq<Column> cols)
  Calls a user-defined function. Example:

      import org.apache.spark.sql._

      val df = Seq(("id1", 1), ("id2", 4), ("id3", 5)).toDF("id", "value")
      val spark = df.sparkSession
      spark.udf.register("simpleUDF", (v: Int) => v * v)
      df.select($"id", call_udf("simpleUDF", $"value"))

  Parameters:
  
  udfName - (undocumented)
  
  cols - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.2.0
- call_function
  
  public static Column call_function(String funcName, scala.collection.immutable.Seq<Column> cols)
  
  Call a SQL function.
  
  Parameters:
  
  funcName - function name that follows the SQL identifier syntax (can be quoted, can be qualified)
  
  cols - the expression parameters of function
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.5.0
- unwrap_udt
  
  public static Column unwrap_udt(Column column)
  
  Unwrap UDT data type column into its underlying type.
  
  Parameters:
  
  column - (undocumented)
  
  Returns:
  
  (undocumented)
  
  Since:
  
  3.4.0
