Charmed Spark Tutorial

The Charmed Spark solution delivers Spark utility client applications that allow simple and seamless usage of Apache Spark on Kubernetes.

First, this tutorial explains how to get a simple Kubernetes distribution up and running. Then it describes how to run Spark jobs, either as a separate process or interactively.

Through this tutorial you will learn a variety of operations, ranging from service account management to actual computation. Finally, it provides a list of tips and tricks for common problems you may face when using Spark on Kubernetes.

This tutorial has been split into the following sections:

  1. Setting up your environment with MicroK8s
  2. Set up and manage Spark service accounts
  3. Submit a Spark Job using an enhanced version of spark-submit
  4. Perform a simple analysis with spark-shell in Scala and pyspark in Python
  5. Review a list of common tips, tricks and issues you may face when using Spark on Kubernetes

While this tutorial intends to guide and teach you to deploy Apache Spark using Charmed Spark, it will be most beneficial if you are familiar with:

  • Basic terminal commands.
  • Basic Apache Spark and Kubernetes concepts.

This tutorial can be run as is on Ubuntu 22.04 LTS.