Charmed Apache Spark K8s Documentation

The Charmed Apache Spark solution is a set of Canonical-supported artefacts (including charms, rocks and snaps) that make operating Apache Spark workloads on Kubernetes seamless, secure and production-ready. The solution includes the Charmed Apache Spark bundle as well as the Client tools snap for Apache Spark and spark8t. For more information on the contents of the Charmed Apache Spark bundle and the Charmed Apache Spark solution, see the Components explanation page.

Apache Spark is a free, open-source software project by the Apache Software Foundation. Users can find out more at the Apache Spark project page.

The solution simplifies user interaction with Apache Spark applications and the underlying Kubernetes cluster whilst retaining the traditional semantics and command-line tooling that users already know. Operators benefit from straightforward, automated deployment of Apache Spark components (e.g. Spark History Server) to the Kubernetes cluster, using Juju.

Deploying Apache Spark applications to Kubernetes has several benefits over other cluster resource managers such as Apache YARN: it greatly simplifies deployment, operation and authentication while allowing for flexibility and scaling. However, it requires knowledge of Kubernetes, networking and the coordination between the different components of the Apache Spark ecosystem in order to provide a scalable, secure and production-ready environment. This can significantly increase complexity for end users and administrators, because a number of parameters need to be configured and prerequisites met before an application deploys correctly or before the Spark command-line tools (e.g. pyspark and spark-shell) can be used.
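To give a sense of the parameters involved, the following minimal sketch assembles the arguments that a plain Apache Spark `spark-submit` call typically needs for a cluster-mode run on Kubernetes. It is an illustration only, not the Charmed Apache Spark tooling: the helper `build_spark_submit_args` is hypothetical, and the API server address, namespace, service account, image name and application path are placeholder values. The `--master`, `--deploy-mode` and `--conf` options and the `spark.kubernetes.*` property names come from upstream Apache Spark.

```python
import shlex


def build_spark_submit_args(api_server, namespace, service_account,
                            image, app_path, executors=2):
    """Assemble spark-submit arguments for a cluster-mode run on Kubernetes.

    Hypothetical helper for illustration; all argument values are supplied
    by the caller and nothing is submitted to a cluster.
    """
    # Upstream Apache Spark properties that a Kubernetes run usually needs.
    conf = {
        "spark.kubernetes.namespace": namespace,
        "spark.kubernetes.authenticate.driver.serviceAccountName": service_account,
        "spark.kubernetes.container.image": image,
        "spark.executor.instances": str(executors),
    }
    # The Kubernetes master URL is the API server address prefixed with k8s://
    args = ["spark-submit", "--master", f"k8s://{api_server}",
            "--deploy-mode", "cluster"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


if __name__ == "__main__":
    # Placeholder values (assumptions), not defaults of any real deployment.
    example = build_spark_submit_args(
        api_server="https://10.0.0.1:6443",
        namespace="spark",
        service_account="spark",
        image="example.invalid/spark:latest",
        app_path="local:///opt/app.py",
    )
    print(shlex.join(example))
```

Each of these values has to match an existing namespace, a service account with sufficient permissions and a reachable container image, which is the kind of coordination the paragraph above describes.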

Charmed Apache Spark helps to address these usability concerns and provides a consistent management interface for operations engineers and cluster administrators who need to manage enablers like Spark History Server.

Project and community

Charmed Apache Spark is a distribution of Apache Spark. It’s an open-source project that welcomes community contributions, suggestions, fixes and constructive feedback.

Navigation

| Level | Path | Navlink |
|-------|------|---------|
| 1 | overview | Overview |
| 1 | tutorial | Tutorial |
| 2 | t-overview | 1. Introduction |
| 2 | t-setup-environment | 2. Set up the environment for the tutorial |
| 2 | t-spark-shell | 2. Interacting with Spark using Interactive Shell |
| 2 | t-spark-submit | 3. Submitting Jobs using Spark Submit |
| 2 | t-spark-streaming | 4. Streaming workloads with Charmed Apache Spark |
| 2 | t-spark-monitoring | 5. Monitoring the Spark cluster |
| 2 | t-wrapping-up | 6. Wrapping Up |
| 1 | how-to | How To |
| 2 | h-setup-k8s | Setup the Environment |
| 2 | h-deploy | Deploy Charmed Apache Spark |
| 2 | h-manage-service-accounts | Manage Service Accounts using the snap |
| 2 | h-use-spark-client-from-python | Manage Service Accounts using Python |
| 2 | h-use-integration-hub | Manage Service Accounts using Integration Hub |
| 2 | h-spark-monitoring | Enable and Configure Monitoring |
| 2 | h-expose-history-server | Expose Spark History Server using Ingress |
| 2 | h-history-server-authorization | Spark History Server authorization |
| 2 | h-run-on-k8s-pod | Use K8s pods to run Charmed Apache Spark |
| 2 | h-spark-streaming | Run Spark Streaming Jobs |
| 2 | h-spark-gpu | Run Spark with GPU enabled |
| 2 | h-spark-cert | Manage self-signed certificates |
| 1 | reference | Reference |
| 2 | r-requirements | Requirements |
| 2 | r-contacts | Contacts |
| 1 | explanation | Explanation |
| 2 | e-component-overview | Component Overview |
| 2 | e-security | Security |
| 2 | e-hardening | Hardening Guide |
| 2 | e-configuration | Charmed Apache Spark Hierarchical Configuration |
| 2 | e-monitoring | Charmed Apache Spark Monitoring |
| 2 | e-trademarks | Trademarks |
