Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).
Points to note:
JavaStreamingContext object
Class of the keys in the Kafka records
Class of the values in the Kafka records
Class of the key decoder
Class type of the value decoder
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers), specified in host1:port1,host2:port2 form. If not starting from a checkpoint, "auto.offset.reset" may be set to "largest" or "smallest" to determine where the stream starts (defaults to "largest")
Names of the topics to consume
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).
Points to note:
JavaStreamingContext object
Class of the keys in the Kafka records
Class of the values in the Kafka records
Class of the key decoder
Class of the value decoder
Class of the records in DStream
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers), specified in host1:port1,host2:port2 form.
Per-topic/partition Kafka offsets defining the (inclusive) starting point of the stream
Function for translating each message and metadata into the desired type
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).
Points to note:
StreamingContext object
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers), specified in host1:port1,host2:port2 form. If not starting from a checkpoint, "auto.offset.reset" may be set to "largest" or "smallest" to determine where the stream starts (defaults to "largest")
Names of the topics to consume
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver. This stream can guarantee that each message from Kafka is included in transformations exactly once (see points below).
Points to note:
StreamingContext object
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.
Per-topic/partition Kafka offsets defining the (inclusive) starting point of the stream
Function for translating each message and metadata into the desired type
Create a RDD from Kafka using offset ranges for each topic and partition.
Create a RDD from Kafka using offset ranges for each topic and partition. This allows you specify the Kafka leader to connect to (to optimize fetching) and access the message as well as the metadata.
JavaSparkContext object
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.
Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition
Kafka brokers for each TopicAndPartition in offsetRanges. May be an empty map, in which case leaders will be looked up on the driver.
Function for translating each message and metadata into the desired type
Create a RDD from Kafka using offset ranges for each topic and partition.
Create a RDD from Kafka using offset ranges for each topic and partition.
JavaSparkContext object
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.
Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition
Create a RDD from Kafka using offset ranges for each topic and partition.
Create a RDD from Kafka using offset ranges for each topic and partition. This allows you specify the Kafka leader to connect to (to optimize fetching) and access the message as well as the metadata.
SparkContext object
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.
Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition
Kafka brokers for each TopicAndPartition in offsetRanges. May be an empty map, in which case leaders will be looked up on the driver.
Function for translating each message and metadata into the desired type
Create a RDD from Kafka using offset ranges for each topic and partition.
Create a RDD from Kafka using offset ranges for each topic and partition.
SparkContext object
Kafka configuration parameters. Requires "metadata.broker.list" or "bootstrap.servers" to be set with Kafka broker(s) (NOT zookeeper servers) specified in host1:port1,host2:port2 form.
Each OffsetRange in the batch corresponds to a range of offsets for a given Kafka topic/partition
Create an input stream that pulls messages from Kafka Brokers.
Create an input stream that pulls messages from Kafka Brokers.
JavaStreamingContext object
Key type of DStream
value type of Dstream
Type of kafka key decoder
Type of kafka value decoder
Map of kafka configuration parameters, see http://kafka.apache.org/08/configuration.html
Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread
RDD storage level.
Create an input stream that pulls messages from Kafka Brokers.
Create an input stream that pulls messages from Kafka Brokers.
JavaStreamingContext object
Zookeeper quorum (hostname:port,hostname:port,..).
The group id for this consumer.
Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread.
RDD storage level.
Create an input stream that pulls messages from Kafka Brokers.
Create an input stream that pulls messages from Kafka Brokers. Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2.
JavaStreamingContext object
Zookeeper quorum (hostname:port,hostname:port,..)
The group id for this consumer
Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread
Create an input stream that pulls messages from Kafka Brokers.
Create an input stream that pulls messages from Kafka Brokers.
StreamingContext object
Map of kafka configuration parameters, see http://kafka.apache.org/08/configuration.html
Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread.
Storage level to use for storing the received objects
Create an input stream that pulls messages from Kafka Brokers.
Create an input stream that pulls messages from Kafka Brokers.
StreamingContext object
Zookeeper quorum (hostname:port,hostname:port,..)
The group id for this consumer
Map of (topic_name -> numPartitions) to consume. Each partition is consumed in its own thread
Storage level to use for storing the received objects (default: StorageLevel.MEMORY_AND_DISK_SER_2)