| Interface | Description |
|---|---|
| JdbcRDD.ConnectionFactory | |
| PartitionCoalescer | ::DeveloperApi:: A PartitionCoalescer defines how to coalesce the partitions of a given RDD (see the coalesce sketch after this table). |
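
For orientation, here is a minimal sketch of where a coalescer comes into play: `RDD.coalesce` shrinks an RDD's partition count and, with the default `shuffle = false`, delegates the grouping of parent partitions to a PartitionCoalescer (the DefaultPartitionCoalescer listed below, unless a custom one is supplied). The SparkContext setup and names are illustrative, not part of this page.

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(
  new SparkConf().setAppName("coalesce-sketch").setMaster("local[*]"))

// An RDD with many small partitions.
val fine = sc.parallelize(1 to 1000, numSlices = 100)

// coalesce(10) groups the 100 parent partitions into 10, so each partition of
// the new RDD computes several parent partitions; with shuffle = false the
// grouping is decided by a PartitionCoalescer.
val coarse = fine.coalesce(10)
println(coarse.getNumPartitions)  // 10
```
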
| Class | Description |
|---|---|
| AsyncRDDActions<T> | A set of asynchronous RDD actions available through an implicit conversion (see the countAsync sketch below this table). |
| CheckpointState | Enumeration to manage the state transitions of an RDD through checkpointing: [ Initialized --> checkpointing in progress --> checkpointed ]. |
| CoGroupedRDD<K> | ::DeveloperApi:: An RDD that cogroups its parents. |
| DefaultPartitionCoalescer | Coalesces the partitions of a parent RDD (prev) into fewer partitions, so that each partition of this RDD computes one or more of the parent's partitions. |
| DoubleRDDFunctions | Extra functions available on RDDs of Doubles through an implicit conversion. |
| HadoopRDD<K,V> | ::DeveloperApi:: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the older MapReduce API (org.apache.hadoop.mapred); see the Hadoop sketch below this table. |
| HadoopRDD.HadoopMapPartitionsWithSplitRDD$ | |
| InputFileNameHolder | Holds the name of the input file being read by the current Spark task. |
| JdbcRDD<T> | An RDD that executes a SQL query on a JDBC connection and reads results (see the JdbcRDD sketch below this table). |
| NewHadoopRDD<K,V> | ::DeveloperApi:: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the new MapReduce API (org.apache.hadoop.mapreduce); see the Hadoop sketch below this table. |
| NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$ | |
| OrderedRDDFunctions<K,V,P extends scala.Product2<K,V>> | Extra functions available, through an implicit conversion, on RDDs of (key, value) pairs where the key is sortable. |
| PairRDDFunctions<K,V> | Extra functions available on RDDs of (key, value) pairs through an implicit conversion (see the combined *RDDFunctions sketch below this table). |
| PartitionGroup | ::DeveloperApi:: A group of Partitions, with a preferred location (prefLoc) for the group. |
| PartitionPruningRDD<T> | ::DeveloperApi:: An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions (see the pruning sketch below this table). |
| RDD<T> | A Resilient Distributed Dataset (RDD), the basic abstraction in Spark. |
| SequenceFileRDDFunctions<K,V> | Extra functions, available through an implicit conversion, on RDDs of (key, value) pairs to create a Hadoop SequenceFile. |
| ShuffledRDD<K,V,C> | ::DeveloperApi:: The resulting RDD from a shuffle (e.g., repartitioning of data). |
| UnionRDD<T> | |
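
The asynchronous actions added by AsyncRDDActions return a FutureAction, which extends scala.concurrent.Future and can be awaited or given callbacks. A minimal sketch, assuming a live SparkContext named `sc`:

```scala
import scala.concurrent.Await
import scala.concurrent.duration._

// countAsync is added to every RDD by an implicit conversion to
// AsyncRDDActions; it submits the job immediately and returns a
// FutureAction[Long].
val pending = sc.parallelize(1 to 100000).countAsync()

// Block here (not at submission) until the job finishes.
val count = Await.result(pending, 60.seconds)
```
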
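The split between HadoopRDD and NewHadoopRDD mirrors the two generations of the Hadoop MapReduce API: SparkContext.hadoopFile is backed by the former and SparkContext.newAPIHadoopFile by the latter. A sketch, again assuming a live `sc` (the HDFS path is a placeholder):

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.TextInputFormat
import org.apache.hadoop.mapreduce.lib.input.{TextInputFormat => NewTextInputFormat}

// Old MapReduce API (org.apache.hadoop.mapred): backed by a HadoopRDD.
val oldStyle = sc.hadoopFile[LongWritable, Text, TextInputFormat]("hdfs:///data/logs")

// New MapReduce API (org.apache.hadoop.mapreduce): backed by a NewHadoopRDD.
val newStyle = sc.newAPIHadoopFile[LongWritable, Text, NewTextInputFormat]("hdfs:///data/logs")
```
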
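For JdbcRDD, the query must contain exactly two `?` placeholders, which Spark binds to each partition's slice of the lowerBound..upperBound key range. A sketch with a hypothetical H2 URL and `users` table, assuming a live `sc`; in Scala the connection factory is a plain function, while the JdbcRDD.ConnectionFactory interface above plays that role for Java callers:

```scala
import java.sql.DriverManager
import org.apache.spark.rdd.JdbcRDD

val users = new JdbcRDD(
  sc,
  () => DriverManager.getConnection("jdbc:h2:mem:demo"),   // connection factory
  "SELECT id, name FROM users WHERE id >= ? AND id <= ?",  // two ? required
  lowerBound = 1L, upperBound = 1000L, numPartitions = 10,
  mapRow = rs => (rs.getInt("id"), rs.getString("name")))
```
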
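The *RDDFunctions classes all follow the same pattern: an implicit conversion adds their methods to RDDs of a matching element type, so they read as if defined on RDD itself. A combined sketch covering PairRDDFunctions, OrderedRDDFunctions, DoubleRDDFunctions, and SequenceFileRDDFunctions, assuming a live `sc` (the output path is a placeholder):

```scala
val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

// PairRDDFunctions: key-value operations such as reduceByKey.
val totals = pairs.reduceByKey(_ + _)

// OrderedRDDFunctions: sortByKey is available because String has an Ordering.
val sorted = totals.sortByKey()

// DoubleRDDFunctions: numeric summaries on an RDD of Doubles.
val meanTotal = totals.values.map(_.toDouble).mean()

// SequenceFileRDDFunctions: write (key, value) pairs as a Hadoop SequenceFile.
sorted.saveAsSequenceFile("hdfs:///tmp/totals-seq")
```
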
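Finally, PartitionPruningRDD is usually built through its create helper: the wrapped RDD only ever computes the partitions that pass the filter, so no tasks are launched for the rest. A sketch, assuming a live `sc`:

```scala
import org.apache.spark.rdd.PartitionPruningRDD

val parent = sc.parallelize(1 to 100, numSlices = 10)

// Keep only the first two partitions; no tasks run for the other eight.
val pruned = PartitionPruningRDD.create(parent, partitionIndex => partitionIndex < 2)
println(pruned.count())  // counts only the elements of partitions 0 and 1
```
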