How to install Spark for Jupyter

To configure the GeoMesa Jupyter kernels, Spark must be installed on the server running Jupyter.

Step-by-step guide

  1. Install Python 2.7 or 3.x for your Linux distribution.
  2. Download Spark.
  3. Unpack Spark and set up environment variables.

Note: Install Python before proceeding; Jupyter requires Python 2.7 or 3.x.

Download and extract the binary files
wget http://d3kbcqa49mib13.cloudfront.net/spark-1.6.2-bin-hadoop2.6.tgz
tar xvf spark-1.6.2-bin-hadoop2.6.tgz
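The directory that tar creates is the archive name minus the .tgz suffix, and that directory is what SPARK_HOME should point to later. As a sketch, both names can be derived from the download URL (the variable names here are illustrative, not part of Spark):

```shell
# Derive the archive and extracted-directory names from the download URL.
url="http://d3kbcqa49mib13.cloudfront.net/spark-1.6.2-bin-hadoop2.6.tgz"
archive="$(basename "$url")"   # file name to pass to tar
dir="${archive%.tgz}"          # directory tar creates; use as SPARK_HOME
echo "$archive"
echo "$dir"
```

Passing a name other than "$archive" to tar (for example, one missing the -bin-hadoop2.6 suffix) will fail, since that file was never downloaded.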

A different Spark version can be selected from the Apache Spark downloads page; if you do, adjust the file and directory names below to match.

Configure the environment variables, editing the paths as needed.

Setup environment variables
export SPARK_HOME=~/spark-1.6.2-bin-hadoop2.6
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH

Note: setting SPARK_HADOOP_VERSION=2.6.0 and running "SPARK_YARN=true sbt/sbt assembly" is only needed when building Spark from source; the prebuilt binary downloaded above does not require a build step.

Add these exports to ~/.bashrc as well so they persist across sessions.
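Before launching Jupyter, it is worth confirming that the two PYTHONPATH entries actually exist on disk; a typo in SPARK_HOME or a different py4j version in the lib directory are common failure modes. The helper below is a hypothetical sketch (not part of Spark) that rebuilds those entries from a given SPARK_HOME:

```python
import os

def spark_python_paths(spark_home, py4j_zip="py4j-0.9-src.zip"):
    """Entries that must be on PYTHONPATH for pyspark imports to work.

    py4j_zip is the py4j archive bundled with Spark 1.6.2; check
    $SPARK_HOME/python/lib/ for the exact file name in other versions.
    """
    return [
        os.path.join(spark_home, "python"),
        os.path.join(spark_home, "python", "lib", py4j_zip),
    ]

# Check each expected entry against the filesystem.
home = os.path.expanduser("~/spark-1.6.2-bin-hadoop2.6")
for path in spark_python_paths(home):
    print(path, "exists" if os.path.exists(path) else "MISSING")
```

If either entry prints MISSING, fix SPARK_HOME or the py4j file name before wiring the kernel up to Jupyter.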