How to Install Apache Spark on Ubuntu 16.04 / Debian 8

This guide shows how to install Apache Spark on Ubuntu 16.04, Debian 8, or Linux Mint 17. Apache Spark is a fast, flexible, open source distributed engine for large-scale data processing. It began as a research project at UC Berkeley's AMPLab and is now maintained by the Apache Software Foundation. Spark can run in its own standalone cluster mode, on Hadoop YARN or Apache Mesos, and on cloud platforms such as Amazon EC2. It can read data from HDFS, HBase, Cassandra, Hive, and other sources.

Step-1 (Add the Java PPA)

# apt-add-repository ppa:webupd8team/java

Step-2 (Install Java)

Update the apt package index

# apt-get update

Install the Oracle Java 8 installer package

# apt-get install oracle-java8-installer

Step-3 (Install Scala)

Create the Scala directory

# mkdir /opt/scala

Download the Scala package

# wget http://downloads.lightbend.com/scala/2.12.1/scala-2.12.1.deb

Install the Scala package

# dpkg -i scala-2.12.1.deb
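Before moving on to Spark, it can help to confirm that Java and Scala are actually reachable from the shell. The snippet below is a minimal sketch, not part of the original steps: the helper name have_cmd is made up for illustration, and it only checks that the commands are on the PATH, not which versions they are.

```shell
# Hypothetical helper: succeeds if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of the two tools installed in Step-2 and Step-3.
for tool in java scala; do
    if have_cmd "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: NOT found on PATH"
    fi
done
```

If either tool is reported as missing, repeat the corresponding step before continuing. Running java -version and scala -version afterwards shows which versions were installed.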

Step-4 (Install Apache Spark)

Download the Apache Spark Tarball

# wget http://d3kbcqa49mib13.cloudfront.net/spark-2.0.2-bin-hadoop2.7.tgz
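A large download can be interrupted or truncated, so it may be worth checking the file before unpacking it. The following is a rough sketch under the assumption that the tarball sits in the current directory; the function name is_valid_tarball is hypothetical. It only confirms that tar can read the archive, and does not verify where the file came from.

```shell
# Hypothetical helper: succeeds only if the file exists and tar can
# list its contents as a gzip-compressed archive.
is_valid_tarball() {
    [ -f "$1" ] && tar -tzf "$1" >/dev/null 2>&1
}

SPARK_TGZ="spark-2.0.2-bin-hadoop2.7.tgz"   # file name from the wget step

if is_valid_tarball "$SPARK_TGZ"; then
    echo "$SPARK_TGZ looks intact"
else
    echo "$SPARK_TGZ is missing or corrupt; download it again"
fi
```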

Extract the Apache Spark Tarball

# tar -xvf spark-2.0.2-bin-hadoop2.7.tgz

Create /opt/spark and copy the extracted files into it

# mkdir -p /opt/spark
# cp -rv spark-2.0.2-bin-hadoop2.7/* /opt/spark

Change to the Spark directory

# cd /opt/spark

Finally, start the Spark shell (an interactive Scala shell) from Bash. The --master local[2] option runs Spark locally with two worker threads

# ./bin/spark-shell --master local[2]
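The command above only works from inside /opt/spark. A common convenience, which the original steps do not include, is to add Spark's bin directory to the PATH so that spark-shell can be started from any directory. The sketch below assumes Spark was copied to /opt/spark as in Step-4; SPARK_HOME is the conventional variable name for that location.

```shell
# Assumes Spark lives in /opt/spark, as in Step-4.
export SPARK_HOME=/opt/spark

# Append $SPARK_HOME/bin to PATH only if it is not already there.
case ":$PATH:" in
    *":$SPARK_HOME/bin:"*) ;;
    *) export PATH="$PATH:$SPARK_HOME/bin" ;;
esac

echo "SPARK_HOME=$SPARK_HOME"
```

To make the setting permanent, the same lines can be placed in ~/.bashrc. After that, running spark-shell --master local[2] from any directory should start the same shell.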

Final Words

That’s all. You have now installed Apache Spark on your Debian-based Linux distribution. If you run into any issues with the steps above, use the comment section below.
