StreamingContext (Spark 1.4.1 JavaDoc)

Overview

Package

Class

Tree

Deprecated

Index

Help

PREV CLASS NEXT CLASS

FRAMES NO FRAMES

SUMMARY: NESTED | FIELD | CONSTR | METHOD

DETAIL: FIELD | CONSTR | METHOD

org.apache.spark.streaming
Class StreamingContext

Object
  org.apache.spark.streaming.StreamingContext

All Implemented Interfaces:: Logging

public class StreamingContext
extends Object
implements Logging
extends Object
implements Logging

Main entry point for Spark Streaming functionality. It provides methods used to create DStreams from various input sources. It can be either created by providing a Spark master URL and an appName, or from a org.apache.spark.SparkConf configuration (see core Spark documentation), or from an existing org.apache.spark.SparkContext. The associated SparkContext can be accessed using context.sparkContext. After creating and transforming DStreams, the streaming computation can be started and stopped using context.start() and context.stop(), respectively. context.awaitTermination() allows the current thread to wait for the termination of the context by stop() or by an exception.

Constructor Summary
`StreamingContext(SparkConf conf, Duration batchDuration)` Create a StreamingContext by providing the configuration necessary for a new SparkContext.
`StreamingContext(SparkContext sparkContext, Duration batchDuration)` Create a StreamingContext using an existing SparkContext.
`StreamingContext(String path)` Recreate a StreamingContext from a checkpoint file.
`StreamingContext(String path, org.apache.hadoop.conf.Configuration hadoopConf)` Recreate a StreamingContext from a checkpoint file.
`StreamingContext(String path, SparkContext sparkContext)` Recreate a StreamingContext from a checkpoint file using an existing SparkContext.
`StreamingContext(String master, String appName, Duration batchDuration, String sparkHome, scala.collection.Seq<String> jars, scala.collection.Map<String,String> environment)` Create a StreamingContext by providing the details necessary for creating a new SparkContext.

Method Summary





<T> ReceiverInputDStream<T>

actorStream(akka.actor.Props props,
            String name,
            StorageLevel storageLevel,
            akka.actor.SupervisorStrategy supervisorStrategy,
            scala.reflect.ClassTag<T> evidence$3)

Create an input stream with any arbitrary user implemented actor receiver.

void addStreamingListener(StreamingListener streamingListener)
Add a StreamingListener object for receiving system events related to streaming.

void awaitTermination()
Wait for the execution to stop.

void awaitTermination(long timeout)
Deprecated. As of 1.3.0, replaced by awaitTerminationOrTimeout(Long).

boolean awaitTerminationOrTimeout(long timeout)
Wait for the execution to stop.

DStream<byte[]> binaryRecordsStream(String directory, int recordLength)
:: Experimental ::

void checkpoint(String directory)
Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.

String checkpointDir()

Duration checkpointDuration()

SparkConf conf()

SparkEnv env()





<K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> 


InputDStream<scala.Tuple2<K,V>>

fileStream(String directory,
           scala.reflect.ClassTag<K> evidence$6,
           scala.reflect.ClassTag<V> evidence$7,
           scala.reflect.ClassTag<F> evidence$8)

Create a input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.





<K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> 


InputDStream<scala.Tuple2<K,V>>

fileStream(String directory,
           scala.Function1<org.apache.hadoop.fs.Path,Object> filter,
           boolean newFilesOnly,
           scala.reflect.ClassTag<K> evidence$9,
           scala.reflect.ClassTag<V> evidence$10,
           scala.reflect.ClassTag<F> evidence$11)

Create a input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.





<K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> 


InputDStream<scala.Tuple2<K,V>>

fileStream(String directory,
           scala.Function1<org.apache.hadoop.fs.Path,Object> filter,
           boolean newFilesOnly,
           org.apache.hadoop.conf.Configuration conf,
           scala.reflect.ClassTag<K> evidence$12,
           scala.reflect.ClassTag<V> evidence$13,
           scala.reflect.ClassTag<F> evidence$14)

Create a input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.

static scala.Option<StreamingContext> getActive()
:: Experimental ::

static StreamingContext getActiveOrCreate(scala.Function0<StreamingContext> creatingFunc)
:: Experimental ::

static StreamingContext getActiveOrCreate(String checkpointPath, scala.Function0<StreamingContext> creatingFunc, org.apache.hadoop.conf.Configuration hadoopConf, boolean createOnError)
:: Experimental ::

static StreamingContext getOrCreate(String checkpointPath, scala.Function0<StreamingContext> creatingFunc, org.apache.hadoop.conf.Configuration hadoopConf, boolean createOnError)
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.

StreamingContextState getState()
:: DeveloperApi ::

org.apache.spark.streaming.DStreamGraph graph()

boolean isCheckpointPresent()

static scala.Option<String> jarOfClass(Class<?> cls)
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.





<T> ReceiverInputDStream<T>

networkStream(Receiver<T> receiver,
              scala.reflect.ClassTag<T> evidence$1)

Deprecated. As of 1.0.0", replaced by receiverStream.

org.apache.spark.streaming.ui.StreamingJobProgressListener progressListener()





<T> InputDStream<T>

queueStream(scala.collection.mutable.Queue<RDD<T>> queue,
            boolean oneAtATime,
            scala.reflect.ClassTag<T> evidence$15)

Create an input stream from a queue of RDDs.





<T> InputDStream<T>

queueStream(scala.collection.mutable.Queue<RDD<T>> queue,
            boolean oneAtATime,
            RDD<T> defaultRDD,
            scala.reflect.ClassTag<T> evidence$16)

Create an input stream from a queue of RDDs.





<T> ReceiverInputDStream<T>

rawSocketStream(String hostname,
                int port,
                StorageLevel storageLevel,
                scala.reflect.ClassTag<T> evidence$5)

Create a input stream from network source hostname:port, where data is received as serialized blocks (serialized using the Spark's serializer) that can be directly pushed into the block manager without deserializing them.





<T> ReceiverInputDStream<T>

receiverStream(Receiver<T> receiver,
               scala.reflect.ClassTag<T> evidence$2)

Create an input stream with any arbitrary user implemented receiver.

void remember(Duration duration)
Set each DStreams in this context to remember RDDs it generated in the last given duration.

SparkContext sc()

org.apache.spark.streaming.scheduler.JobScheduler scheduler()





<T> ReceiverInputDStream<T>

socketStream(String hostname,
             int port,
             scala.Function1<java.io.InputStream,scala.collection.Iterator<T>> converter,
             StorageLevel storageLevel,
             scala.reflect.ClassTag<T> evidence$4)

Create a input stream from TCP source hostname:port.

ReceiverInputDStream<String> socketTextStream(String hostname, int port, StorageLevel storageLevel)
Create a input stream from TCP source hostname:port.

SparkContext sparkContext()
Return the associated Spark context

void start()
Start the execution of the streams.

void stop(boolean stopSparkContext)
Stop the execution of the streams immediately (does not wait for all received data to be processed).

void stop(boolean stopSparkContext, boolean stopGracefully)
Stop the execution of the streams, with option of ensuring all received data has been processed.

DStream<String> textFileStream(String directory)
Create a input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).

static



<K,V> PairDStreamFunctions<K,V>

toPairDStreamFunctions(DStream<scala.Tuple2<K,V>> stream,
                       scala.reflect.ClassTag<K> kt,
                       scala.reflect.ClassTag<V> vt,
                       scala.math.Ordering<K> ord)

Deprecated. As of 1.3.0, replaced by implicit functions in the DStream companion object. This is kept here only for backward compatibility.





<T> DStream<T>

transform(scala.collection.Seq<DStream<?>> dstreams,
          scala.Function2<scala.collection.Seq<RDD<?>>,Time,RDD<T>> transformFunc,
          scala.reflect.ClassTag<T> evidence$18)

Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.

scala.Option<org.apache.spark.streaming.ui.StreamingTab> uiTab()





<T> DStream<T>

union(scala.collection.Seq<DStream<T>> streams,
      scala.reflect.ClassTag<T> evidence$17)

Create a unified DStream from multiple DStreams of the same type and same slide duration.

org.apache.spark.streaming.ContextWaiter waiter()

Methods inherited from class Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Methods inherited from interface org.apache.spark.Logging
`initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning`

Constructor Detail

StreamingContext

public StreamingContext(SparkContext sparkContext,
                        Duration batchDuration)

Create a StreamingContext using an existing SparkContext.

Parameters:: sparkContext - existing SparkContext; batchDuration - the time interval at which streaming data will be divided into batches

StreamingContext

public StreamingContext(SparkConf conf,
                        Duration batchDuration)

Create a StreamingContext by providing the configuration necessary for a new SparkContext.

Parameters:: conf - a org.apache.spark.SparkConf object specifying Spark parameters; batchDuration - the time interval at which streaming data will be divided into batches

StreamingContext

public StreamingContext(String master,
                        String appName,
                        Duration batchDuration,
                        String sparkHome,
                        scala.collection.Seq<String> jars,
                        scala.collection.Map<String,String> environment)

Create a StreamingContext by providing the details necessary for creating a new SparkContext.

Parameters:: master - cluster URL to connect to (e.g. mesos://host:port, spark://host:port, local[4]).; appName - a name for your job, to display on the cluster web UI; batchDuration - the time interval at which streaming data will be divided into batches; sparkHome - (undocumented); jars - (undocumented); environment - (undocumented)

StreamingContext

public StreamingContext(String path,
                        org.apache.hadoop.conf.Configuration hadoopConf)

Recreate a StreamingContext from a checkpoint file.

Parameters:: path - Path to the directory that was specified as the checkpoint directory; hadoopConf - Optional, configuration object if necessary for reading from HDFS compatible filesystems

StreamingContext

public StreamingContext(String path)

Recreate a StreamingContext from a checkpoint file.

Parameters:: path - Path to the directory that was specified as the checkpoint directory

StreamingContext

public StreamingContext(String path,
                        SparkContext sparkContext)

Recreate a StreamingContext from a checkpoint file using an existing SparkContext.

Parameters:: path - Path to the directory that was specified as the checkpoint directory; sparkContext - Existing SparkContext

Method Detail

getActive

public static scala.Option<StreamingContext> getActive()

:: Experimental ::

Get the currently active context, if there is one. Active means started but not stopped.

Returns:: (undocumented)

toPairDStreamFunctions

public static <K,V> PairDStreamFunctions<K,V> toPairDStreamFunctions(DStream<scala.Tuple2<K,V>> stream,
                                                                     scala.reflect.ClassTag<K> kt,
                                                                     scala.reflect.ClassTag<V> vt,
                                                                     scala.math.Ordering<K> ord)

Deprecated. As of 1.3.0, replaced by implicit functions in the DStream companion object. This is kept here only for backward compatibility.

Parameters:: stream - (undocumented); kt - (undocumented); vt - (undocumented); ord - (undocumented)
Returns:: (undocumented)

getActiveOrCreate

public static StreamingContext getActiveOrCreate(scala.Function0<StreamingContext> creatingFunc)

:: Experimental ::

Either return the "active" StreamingContext (that is, started but not stopped), or create a new StreamingContext that is

Parameters:: creatingFunc - Function to create a new StreamingContext
Returns:: (undocumented)

getActiveOrCreate

public static StreamingContext getActiveOrCreate(String checkpointPath,
                                                 scala.Function0<StreamingContext> creatingFunc,
                                                 org.apache.hadoop.conf.Configuration hadoopConf,
                                                 boolean createOnError)

:: Experimental ::

Either get the currently active StreamingContext (that is, started but not stopped), OR recreate a StreamingContext from checkpoint data in the given path. If checkpoint data does not exist in the provided, then create a new StreamingContext by calling the provided creatingFunc.

Parameters:: checkpointPath - Checkpoint directory used in an earlier StreamingContext program; creatingFunc - Function to create a new StreamingContext; hadoopConf - Optional Hadoop configuration if necessary for reading from the file system; createOnError - Optional, whether to create a new StreamingContext if there is an error in reading checkpoint data. By default, an exception will be thrown on error.
Returns:: (undocumented)

getOrCreate

public static StreamingContext getOrCreate(String checkpointPath,
                                           scala.Function0<StreamingContext> creatingFunc,
                                           org.apache.hadoop.conf.Configuration hadoopConf,
                                           boolean createOnError)

Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. If checkpoint data exists in the provided checkpointPath, then StreamingContext will be recreated from the checkpoint data. If the data does not exist, then the StreamingContext will be created by called the provided creatingFunc.

Parameters:: checkpointPath - Checkpoint directory used in an earlier StreamingContext program; creatingFunc - Function to create a new StreamingContext; hadoopConf - Optional Hadoop configuration if necessary for reading from the file system; createOnError - Optional, whether to create a new StreamingContext if there is an error in reading checkpoint data. By default, an exception will be thrown on error.
Returns:: (undocumented)

jarOfClass

public static scala.Option<String> jarOfClass(Class<?> cls)

Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.

Parameters:: cls - (undocumented)
Returns:: (undocumented)

isCheckpointPresent

public boolean isCheckpointPresent()

sc

public SparkContext sc()

conf

public SparkConf conf()

env

public SparkEnv env()

graph

public org.apache.spark.streaming.DStreamGraph graph()

checkpointDir

public String checkpointDir()

checkpointDuration

public Duration checkpointDuration()

scheduler

public org.apache.spark.streaming.scheduler.JobScheduler scheduler()

waiter

public org.apache.spark.streaming.ContextWaiter waiter()

progressListener

public org.apache.spark.streaming.ui.StreamingJobProgressListener progressListener()

uiTab

public scala.Option<org.apache.spark.streaming.ui.StreamingTab> uiTab()

sparkContext

public SparkContext sparkContext()

Return the associated Spark context

Returns:: (undocumented)

remember

public void remember(Duration duration)

Set each DStreams in this context to remember RDDs it generated in the last given duration. DStreams remember RDDs only for a limited duration of time and releases them for garbage collection. This method allows the developer to specify how long to remember the RDDs ( if the developer wishes to query old data outside the DStream computation).

Parameters:: duration - Minimum duration that each DStream should remember its RDDs

checkpoint

public void checkpoint(String directory)

Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.

Parameters:: directory - HDFS-compatible directory where the checkpoint data will be reliably stored. Note that this must be a fault-tolerant file system like HDFS for

networkStream

public <T> ReceiverInputDStream<T> networkStream(Receiver<T> receiver,
                                                 scala.reflect.ClassTag<T> evidence$1)

Deprecated. As of 1.0.0", replaced by receiverStream.

Create an input stream with any arbitrary user implemented receiver. Find more details at: http://spark.apache.org/docs/latest/streaming-custom-receivers.html

Parameters:: receiver - Custom implementation of Receiver; evidence$1 - (undocumented)
Returns:: (undocumented)

receiverStream

public <T> ReceiverInputDStream<T> receiverStream(Receiver<T> receiver,
                                                  scala.reflect.ClassTag<T> evidence$2)

Create an input stream with any arbitrary user implemented receiver. Find more details at: http://spark.apache.org/docs/latest/streaming-custom-receivers.html

Parameters:: receiver - Custom implementation of Receiver; evidence$2 - (undocumented)
Returns:: (undocumented)

actorStream

public <T> ReceiverInputDStream<T> actorStream(akka.actor.Props props,
                                               String name,
                                               StorageLevel storageLevel,
                                               akka.actor.SupervisorStrategy supervisorStrategy,
                                               scala.reflect.ClassTag<T> evidence$3)

Create an input stream with any arbitrary user implemented actor receiver. Find more details at: http://spark.apache.org/docs/latest/streaming-custom-receivers.html

Parameters:: props - Props object defining creation of the actor; name - Name of the actor; storageLevel - RDD storage level (default: StorageLevel.MEMORY_AND_DISK_SER_2); supervisorStrategy - (undocumented); evidence$3 - (undocumented)
Returns:: (undocumented)

socketTextStream

public ReceiverInputDStream<String> socketTextStream(String hostname,
                                                     int port,
                                                     StorageLevel storageLevel)

Create a input stream from TCP source hostname:port. Data is received using a TCP socket and the receive bytes is interpreted as UTF8 encoded \n delimited lines.

Parameters:: hostname - Hostname to connect to for receiving data; port - Port to connect to for receiving data; storageLevel - Storage level to use for storing the received objects (default: StorageLevel.MEMORY_AND_DISK_SER_2)
Returns:: (undocumented)

socketStream

public <T> ReceiverInputDStream<T> socketStream(String hostname,
                                                int port,
                                                scala.Function1<java.io.InputStream,scala.collection.Iterator<T>> converter,
                                                StorageLevel storageLevel,
                                                scala.reflect.ClassTag<T> evidence$4)

Create a input stream from TCP source hostname:port. Data is received using a TCP socket and the receive bytes it interepreted as object using the given converter.

Parameters:: hostname - Hostname to connect to for receiving data; port - Port to connect to for receiving data; converter - Function to convert the byte stream to objects; storageLevel - Storage level to use for storing the received objects; evidence$4 - (undocumented)
Returns:: (undocumented)

rawSocketStream

public <T> ReceiverInputDStream<T> rawSocketStream(String hostname,
                                                   int port,
                                                   StorageLevel storageLevel,
                                                   scala.reflect.ClassTag<T> evidence$5)

Parameters:: hostname - Hostname to connect to for receiving data; port - Port to connect to for receiving data; storageLevel - Storage level to use for storing the received objects (default: StorageLevel.MEMORY_AND_DISK_SER_2); evidence$5 - (undocumented)
Returns:: (undocumented)

fileStream

public <K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> InputDStream<scala.Tuple2<K,V>> fileStream(String directory,
                                                                                                               scala.reflect.ClassTag<K> evidence$6,
                                                                                                               scala.reflect.ClassTag<V> evidence$7,
                                                                                                               scala.reflect.ClassTag<F> evidence$8)

Create a input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format. Files must be written to the monitored directory by "moving" them from another location within the same file system. File names starting with . are ignored.

Parameters:: directory - HDFS directory to monitor for new file; evidence$6 - (undocumented); evidence$7 - (undocumented); evidence$8 - (undocumented)
Returns:: (undocumented)

fileStream

public <K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> InputDStream<scala.Tuple2<K,V>> fileStream(String directory,
                                                                                                               scala.Function1<org.apache.hadoop.fs.Path,Object> filter,
                                                                                                               boolean newFilesOnly,
                                                                                                               scala.reflect.ClassTag<K> evidence$9,
                                                                                                               scala.reflect.ClassTag<V> evidence$10,
                                                                                                               scala.reflect.ClassTag<F> evidence$11)

Parameters:: directory - HDFS directory to monitor for new file; filter - Function to filter paths to process; newFilesOnly - Should process only new files and ignore existing files in the directory; evidence$9 - (undocumented); evidence$10 - (undocumented); evidence$11 - (undocumented)
Returns:: (undocumented)

fileStream

public <K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> InputDStream<scala.Tuple2<K,V>> fileStream(String directory,
                                                                                                               scala.Function1<org.apache.hadoop.fs.Path,Object> filter,
                                                                                                               boolean newFilesOnly,
                                                                                                               org.apache.hadoop.conf.Configuration conf,
                                                                                                               scala.reflect.ClassTag<K> evidence$12,
                                                                                                               scala.reflect.ClassTag<V> evidence$13,
                                                                                                               scala.reflect.ClassTag<F> evidence$14)

Parameters:: directory - HDFS directory to monitor for new file; filter - Function to filter paths to process; newFilesOnly - Should process only new files and ignore existing files in the directory; conf - Hadoop configuration; evidence$12 - (undocumented); evidence$13 - (undocumented); evidence$14 - (undocumented)
Returns:: (undocumented)

textFileStream

public DStream<String> textFileStream(String directory)

Create a input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat). Files must be written to the monitored directory by "moving" them from another location within the same file system. File names starting with . are ignored.

Parameters:: directory - HDFS directory to monitor for new file
Returns:: (undocumented)

binaryRecordsStream

public DStream<byte[]> binaryRecordsStream(String directory,
                                           int recordLength)

:: Experimental ::

Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as flat binary files, assuming a fixed length per record, generating one byte array per record. Files must be written to the monitored directory by "moving" them from another location within the same file system. File names starting with . are ignored.

'''Note:''' We ensure that the byte array for each record in the resulting RDDs of the DStream has the provided record length.

Parameters:: directory - HDFS directory to monitor for new file; recordLength - length of each record in bytes
Returns:: (undocumented)

queueStream

public <T> InputDStream<T> queueStream(scala.collection.mutable.Queue<RDD<T>> queue,
                                       boolean oneAtATime,
                                       scala.reflect.ClassTag<T> evidence$15)

Create an input stream from a queue of RDDs. In each batch, it will process either one or all of the RDDs returned by the queue.

NOTE: Arbitrary RDDs can be added to queueStream, there is no way to recover data of those RDDs, so queueStream doesn't support checkpointing.

Parameters:: queue - Queue of RDDs; oneAtATime - Whether only one RDD should be consumed from the queue in every interval; evidence$15 - (undocumented)
Returns:: (undocumented)

queueStream

public <T> InputDStream<T> queueStream(scala.collection.mutable.Queue<RDD<T>> queue,
                                       boolean oneAtATime,
                                       RDD<T> defaultRDD,
                                       scala.reflect.ClassTag<T> evidence$16)

Create an input stream from a queue of RDDs. In each batch, it will process either one or all of the RDDs returned by the queue.

NOTE: Arbitrary RDDs can be added to queueStream, there is no way to recover data of those RDDs, so queueStream doesn't support checkpointing.

Parameters:: queue - Queue of RDDs; oneAtATime - Whether only one RDD should be consumed from the queue in every interval; defaultRDD - Default RDD is returned by the DStream when the queue is empty. Set as null if no RDD should be returned when empty; evidence$16 - (undocumented)
Returns:: (undocumented)

union

public <T> DStream<T> union(scala.collection.Seq<DStream<T>> streams,
                            scala.reflect.ClassTag<T> evidence$17)

Create a unified DStream from multiple DStreams of the same type and same slide duration.

Parameters:: streams - (undocumented); evidence$17 - (undocumented)
Returns:: (undocumented)

transform

public <T> DStream<T> transform(scala.collection.Seq<DStream<?>> dstreams,
                                scala.Function2<scala.collection.Seq<RDD<?>>,Time,RDD<T>> transformFunc,
                                scala.reflect.ClassTag<T> evidence$18)

Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.

Parameters:: dstreams - (undocumented); transformFunc - (undocumented); evidence$18 - (undocumented)
Returns:: (undocumented)

addStreamingListener

public void addStreamingListener(StreamingListener streamingListener)

Add a StreamingListener object for receiving system events related to streaming.

Parameters:: streamingListener - (undocumented)

getState

public StreamingContextState getState()

:: DeveloperApi ::

Return the current state of the context. The context can be in three possible states - - StreamingContextState.INTIALIZED - The context has been created, but not been started yet. Input DStreams, transformations and output operations can be created on the context. - StreamingContextState.ACTIVE - The context has been started, and been not stopped. Input DStreams, transformations and output operations cannot be created on the context. - StreamingContextState.STOPPED - The context has been stopped and cannot be used any more.

Returns:: (undocumented)

start

public void start()

Start the execution of the streams.

Throws:: IllegalStateException - if the StreamingContext is already stopped.

awaitTermination

public void awaitTermination()

Wait for the execution to stop. Any exceptions that occurs during the execution will be thrown in this thread.

awaitTermination

public void awaitTermination(long timeout)

Deprecated. As of 1.3.0, replaced by awaitTerminationOrTimeout(Long).

Wait for the execution to stop. Any exceptions that occurs during the execution will be thrown in this thread.

Parameters:: timeout - time to wait in milliseconds

awaitTerminationOrTimeout

public boolean awaitTerminationOrTimeout(long timeout)

Wait for the execution to stop. Any exceptions that occurs during the execution will be thrown in this thread.

Parameters:: timeout - time to wait in milliseconds
Returns:: true if it's stopped; or throw the reported error during the execution; or false if the waiting time elapsed before returning from the method.

stop

public void stop(boolean stopSparkContext)

Stop the execution of the streams immediately (does not wait for all received data to be processed). By default, if stopSparkContext is not specified, the underlying SparkContext will also be stopped. This implicit behavior can be configured using the SparkConf configuration spark.streaming.stopSparkContextByDefault.

Parameters:: stopSparkContext - If true, stops the associated SparkContext. The underlying SparkContext will be stopped regardless of whether this StreamingContext has been started.

stop

public void stop(boolean stopSparkContext,
                 boolean stopGracefully)

Stop the execution of the streams, with option of ensuring all received data has been processed.

Parameters:: stopSparkContext - if true, stops the associated SparkContext. The underlying SparkContext will be stopped regardless of whether this StreamingContext has been started.; stopGracefully - if true, stops gracefully by waiting for the processing of all received data to be completed

Overview

Package

Class

Tree

Deprecated

Index

Help

PREV CLASS NEXT CLASS

FRAMES NO FRAMES

SUMMARY: NESTED | FIELD | CONSTR | METHOD

DETAIL: FIELD | CONSTR | METHOD

org.apache.spark.streaming Class StreamingContext

StreamingContext

StreamingContext

StreamingContext

StreamingContext

StreamingContext

StreamingContext

getActive

toPairDStreamFunctions

getActiveOrCreate

getActiveOrCreate

getOrCreate

jarOfClass

isCheckpointPresent

sc

conf

env

graph

checkpointDir

checkpointDuration

scheduler

waiter

progressListener

uiTab

sparkContext

remember

checkpoint

networkStream

receiverStream

actorStream

socketTextStream

socketStream

rawSocketStream

fileStream

fileStream

fileStream

textFileStream

binaryRecordsStream

queueStream

queueStream

union

transform

addStreamingListener

getState

start

awaitTermination

awaitTermination

awaitTerminationOrTimeout

stop

stop

org.apache.spark.streaming
Class StreamingContext