Charmed Spark Documentation

Charmed Spark is a set of Canonical-supported artifacts (including charms, rock OCI images and snaps) that makes operating Apache Spark workloads on Kubernetes seamless, secure and production-ready.

The solution simplifies user interaction with Spark applications and the underlying Kubernetes cluster while retaining the traditional semantics and command-line tooling that users already know. Operators benefit from straightforward, automated deployment of Spark components (e.g. the Spark History Server) to the Kubernetes cluster using Juju.
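As an illustrative sketch of what Juju-driven deployment looks like, the commands below deploy a history server charm; the model name and channel are assumptions, and the charm name should be verified on Charmhub before use:

```shell
# Create a Juju model for Spark components (model name is an example)
juju add-model spark

# Deploy the Spark History Server charm
# (charm name and channel are assumptions; verify on charmhub.io)
juju deploy spark-history-server-k8s --channel=3.4/edge

# Watch the deployment until the unit settles into an active state
juju status --watch 2s
```

Juju handles pod creation, configuration and lifecycle management, so the operator interacts with the application rather than with raw Kubernetes manifests.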

Deploying Spark applications to Kubernetes has several benefits over other cluster resource managers such as Apache YARN: it greatly simplifies deployment, operation and authentication while allowing for flexibility and scaling. However, it requires knowledge of Kubernetes, networking and the coordination between the different components of the Spark ecosystem in order to provide a scalable, secure and production-ready environment. As a consequence, complexity can increase significantly for end users and administrators, since a number of parameters need to be configured and prerequisites must be met before an application deploys correctly or the Spark CLI tools (e.g. pyspark and spark-shell) can be used.
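As a hedged sketch of how the familiar CLI tooling is exposed, the `spark-client` snap wraps the standard Spark commands and pre-wires them to a Kubernetes service account; the channel, username and namespace below are example values, not prescribed ones:

```shell
# Install the spark-client snap (channel is an assumption; check snapcraft.io)
sudo snap install spark-client --channel=3.4/edge

# Register a Kubernetes-backed service account for running Spark jobs
# ("spark" and "spark-ns" are placeholder names)
spark-client.service-account-registry create \
  --username spark --namespace spark-ns

# Use the traditional interactive shells against the K8s cluster
spark-client.pyspark --username spark --namespace spark-ns
spark-client.spark-shell --username spark --namespace spark-ns
```

The registered service account carries the Kubernetes credentials and Spark configuration, so the shells start without the user wiring up master URLs or pod templates by hand.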

Charmed Spark helps to address these usability concerns and provides a consistent management interface for operations engineers and cluster administrators who need to manage enablers like Spark History Server.

Project and community

Charmed Spark is a distribution of Apache Spark. It’s an open-source project that welcomes community contributions, suggestions, fixes and constructive feedback.


Level Path Navlink
1 overview Overview
1 tutorial Tutorial
2 t-overview 1. Introduction
2 t-setup-environment 2. Set up the environment for the tutorial
2 t-spark-shell 3. Interacting with Spark using Interactive Shell
2 t-spark-submit 4. Submitting Jobs using Spark Submit
2 t-spark-streaming 5. Streaming workloads with Charmed Spark
2 t-spark-monitoring 6. Monitoring the Spark cluster
2 t-wrapping-up 7. Wrapping Up
1 how-to How To
2 h-setup-k8s Setup Environment
2 h-manage-service-accounts Manage Service Accounts
2 h-use-spark-client-from-python Use the Spark Client Python API
2 h-run-on-k8s-pod Run on K8s pods
2 h-deploy-spark-history Deploy Spark History Server
2 h-spark-streaming Run Spark Streaming Jobs
2 h-spark-monitoring Enable monitoring
2 h-history-server-authorization Enable authorization on the History Server
1 reference Reference
2 r-requirements Requirements
1 explanation Explanation
2 e-component-overview Component Overview
2 e-configuration Spark Client Hierarchical Configuration


Mapping table
Path Location