Usage Examples

Using DC/OS Apache Spark

This section walks through a basic example and an advanced example of using DC/OS Apache Spark.

Basic

  1. Perform a default installation by following the instructions in the Install and Customize section.

  2. Run a Spark job:

    dcos spark run --submit-args="--class org.apache.spark.examples.SparkPi https://downloads.mesosphere.com/spark/assets/spark-examples_2.11-2.4.0.jar 30"
    
  3. Run a Python Spark job:

    dcos spark run --submit-args="https://downloads.mesosphere.com/spark/examples/pi.py 30"
    
  4. Run an R Spark job:

    dcos spark run --submit-args="https://downloads.mesosphere.com/spark/examples/dataframe.R"
    
  5. View the status of your job in the Spark cluster dispatcher, or open the Mesos UI to see the job logs.

Advanced

Run a Spark Streaming job with Kafka.

Examples of Spark Streaming applications that connect to a secure Kafka cluster can be found in the spark-build repository. As described in the Kerberos section, Spark requires three files: a JAAS file, krb5.conf, and the keytab.

An example of a JAAS file is:

KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/mnt/mesos/sandbox/kafka-client.keytab"
    useTicketCache=false
    serviceName="kafka"
    principal="client@LOCAL";
};

The corresponding dcos spark command is shown below. Two settings deserve attention: spark.mesos.containerizer=mesos is required for secrets, and binary secrets such as the keytab must be base64-encoded in DC/OS 1.10 or lower.

dcos spark run --submit-args="\
--conf spark.mesos.containerizer=mesos \
--conf spark.mesos.uris=<URI_of_jaas.conf> \
--conf spark.mesos.driver.secret.names=spark/__dcos_base64___keytab \
--conf spark.mesos.driver.secret.filenames=kafka-client.keytab \
--conf spark.mesos.executor.secret.names=spark/__dcos_base64___keytab \
--conf spark.mesos.executor.secret.filenames=kafka-client.keytab \
--conf spark.mesos.task.labels=DCOS_SPACE:/spark \
--conf spark.scheduler.minRegisteredResourcesRatio=1.0 \
--conf spark.executorEnv.KRB5_CONFIG_BASE64=W2xpYmRlZmF1bHRzXQpkZWZhdWx0X3JlYWxtID0gTE9DQUwKCltyZWFsbXNdCiAgTE9DQUwgPSB7CiAgICBrZGMgPSBrZGMubWFyYXRob24uYXV0b2lwLmRjb3MudGhpc2Rjb3MuZGlyZWN0b3J5OjI1MDAKICB9Cg== \
--conf spark.mesos.driverEnv.KRB5_CONFIG_BASE64=W2xpYmRlZmF1bHRzXQpkZWZhdWx0X3JlYWxtID0gTE9DQUwKCltyZWFsbXNdCiAgTE9DQUwgPSB7CiAgICBrZGMgPSBrZGMubWFyYXRob24uYXV0b2lwLmRjb3MudGhpc2Rjb3MuZGlyZWN0b3J5OjI1MDAKICB9Cg== \
--class MyAppClass <URL_of_jar> [application args]"
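
The KRB5_CONFIG_BASE64 value in the command above is the base64 encoding of a krb5.conf file. The following is a minimal sketch of how such a value might be produced for your own configuration. The file name krb5.conf in the working directory and the shell variable name are assumptions made for illustration, and the realm and KDC address are the sample values that the string above decodes to; substitute your own.

```shell
# Write a sample krb5.conf (illustrative contents; use your own realm and KDC).
cat > krb5.conf <<'EOF'
[libdefaults]
default_realm = LOCAL

[realms]
  LOCAL = {
    kdc = kdc.marathon.autoip.dcos.thisdcos.directory:2500
  }
EOF

# Encode the file as a single-line base64 string. tr removes the line
# wraps that some base64 implementations insert into long output.
KRB5_CONFIG_BASE64=$(base64 < krb5.conf | tr -d '\n')
echo "$KRB5_CONFIG_BASE64"

# Decode the string again to confirm it round-trips to the original file.
echo "$KRB5_CONFIG_BASE64" | base64 --decode
```

The resulting string can then be passed as the value of both spark.executorEnv.KRB5_CONFIG_BASE64 and spark.mesos.driverEnv.KRB5_CONFIG_BASE64. Piping through tr rather than relying on a no-wrap flag keeps the sketch independent of which base64 implementation is installed.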

NOTE: Additional walkthroughs are available in the docs/walkthroughs/ directory of Mesosphere's spark-build repository.