pyspark.sql.functions.transform
- pyspark.sql.functions.transform(col, f)
- Returns an array of elements after applying a transformation to each element in the input array.
- New in version 3.1.0.
- Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- col : Column or str
- name of column or expression
- f : function
- a function that is applied to each element of the input array. Can take one of the following forms:
  - Unary (x: Column) -> Column: ...
  - Binary (x: Column, i: Column) -> Column..., where the second argument is a 0-based index of the element.
- f can use methods of Column, functions defined in pyspark.sql.functions and Scala UserDefinedFunctions. Python UserDefinedFunctions are not supported (SPARK-27052).

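The two accepted forms of f can be pictured with a plain-Python model. This is only a sketch of the semantics on ordinary Python lists: the helper name model_transform is hypothetical and is not part of PySpark. In Spark itself, f receives Column expressions and is used to build an expression rather than being called once per element, which is why the body of f must be written with Column operations.

```python
import inspect


def model_transform(values, f):
    """Hypothetical plain-Python model of transform (not a PySpark API).

    Dispatches on the number of parameters f declares:
    one parameter  -> f(element)
    two parameters -> f(element, zero_based_index)
    """
    n_params = len(inspect.signature(f).parameters)
    if n_params == 1:
        return [f(x) for x in values]
    if n_params == 2:
        return [f(x, i) for i, x in enumerate(values)]
    raise ValueError("f must take one or two arguments")


# Unary form: only the element is passed.
doubled = model_transform([1, 2, 3, 4], lambda x: x * 2)

# Binary form: the second argument is the 0-based index of the element.
alternated = model_transform([1, 2, 3, 4], lambda x, i: x if i % 2 == 0 else -x)

print(doubled)     # [2, 4, 6, 8]
print(alternated)  # [1, -2, 3, -4]
```
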
- Returns
- Column
- a new array of transformed elements. 
 
- Examples

    >>> from pyspark.sql.functions import transform, when
    >>> df = spark.createDataFrame([(1, [1, 2, 3, 4])], ("key", "values"))
    >>> df.select(transform("values", lambda x: x * 2).alias("doubled")).show()
    +------------+
    |     doubled|
    +------------+
    |[2, 4, 6, 8]|
    +------------+

    >>> def alternate(x, i):
    ...     return when(i % 2 == 0, x).otherwise(-x)
    ...
    >>> df.select(transform("values", alternate).alias("alternated")).show()
    +--------------+
    |    alternated|
    +--------------+
    |[1, -2, 3, -4]|
    +--------------+