| agg {SparkR} | R Documentation |
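Compute aggregates by specifying a list of columns.
Applied to a SparkDataFrame, agg and summarize aggregate over the entire SparkDataFrame without groups. Applied to GroupedData, they aggregate within each group, and the resulting SparkDataFrame also contains the grouping columns.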
agg(x, ...)

summarize(x, ...)

## S4 method for signature 'GroupedData'
agg(x, ...)

## S4 method for signature 'GroupedData'
summarize(x, ...)

## S4 method for signature 'SparkDataFrame'
agg(x, ...)

## S4 method for signature 'SparkDataFrame'
summarize(x, ...)
| x | a SparkDataFrame or GroupedData. |
| ... | further arguments to be passed to or from other methods. |
df2 <- agg(df, <column> = <aggFunction>)
df2 <- agg(df, newColName = aggFunction(column))
A SparkDataFrame.
agg since 1.4.0
summarize since 1.4.0
Other agg_funcs: avg,
countDistinct, count,
first, kurtosis,
last, max,
mean, min, sd,
skewness, stddev_pop,
stddev_samp, sumDistinct,
sum, var_pop,
var_samp, var
Other SparkDataFrame functions: SparkDataFrame-class,
arrange, as.data.frame,
attach, cache,
coalesce, collect,
colnames, coltypes,
createOrReplaceTempView,
crossJoin, dapplyCollect,
dapply, describe,
dim, distinct,
dropDuplicates, dropna,
drop, dtypes,
except, explain,
filter, first,
gapplyCollect, gapply,
getNumPartitions, group_by,
head, histogram,
insertInto, intersect,
isLocal, join,
limit, merge,
mutate, ncol,
nrow, persist,
printSchema, randomSplit,
rbind, registerTempTable,
rename, repartition,
sample, saveAsTable,
schema, selectExpr,
select, showDF,
show, storageLevel,
str, subset,
take, union,
unpersist, withColumn,
with, write.df,
write.jdbc, write.json,
write.orc, write.parquet,
write.text
## Not run:
df2 <- agg(df, age = "sum")  # new column name will be created as 'SUM(age#0)'
df3 <- agg(df, ageSum = sum(df$age))  # creates a new column named ageSum
df4 <- summarize(df, ageMax = max(df$age))  # creates a new column named ageMax
## End(Not run)
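
The examples above assume an existing SparkDataFrame named df. The following is a minimal, self-contained sketch of the same calls on both a SparkDataFrame and a GroupedData. It assumes SparkR 2.0 or later with a local Spark installation available; the data frame people and its columns dept and age are hypothetical names chosen for illustration only.

```r
library(SparkR)

# Assumption: a local Spark installation is available to SparkR.
sparkR.session(master = "local[1]")

# Hypothetical sample data, not taken from the documentation above.
people <- data.frame(dept = c("a", "a", "b"),
                     age = c(30, 40, 50),
                     stringsAsFactors = FALSE)
df <- createDataFrame(people)

# Ungrouped: aggregate over the entire SparkDataFrame.
# The name on the left becomes the name of the result column.
total <- collect(agg(df, ageSum = sum(df$age)))

# Grouped: the result also contains the grouping column 'dept'.
grouped <- groupBy(df, "dept")
by_dept <- collect(summarize(grouped, ageMax = max(df$age)))

# Row order after collect() is not guaranteed, so sort locally.
by_dept <- by_dept[order(by_dept$dept), ]

print(total)
print(by_dept)

sparkR.session.stop()
```

Naming the argument (ageSum, ageMax) gives the result column a predictable name, which is usually easier to work with downstream than the generated name produced by the string form agg(df, age = "sum").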