Get started with Managed Apache Spark on Azure

Get started with Managed Apache Spark on Azure

This guide describes how to get started with Canonical’s fully managed solution of Charmed Apache Spark running on top of Microsoft Azure. It provides supported operation management, deployment, and dedicated customer service.

Requirements

Before starting, ensure you have:

  • Admin rights on your Azure tenant.
  • The Microsoft.Compute and Microsoft.Capacity resource providers registered.

Access Azure Marketplace

Visit Azure Marketplace to find Canonical’s Managed Apache Spark offering.
You will find all the information about the application, available plans and pricing, reviews, and more.

Get started by clicking Create.

Canonical's Managed Apache Spark Create button

Configure initial Azure settings

Configure Apache Spark parameters:

  1. Start by selecting a subscription and a resource group name (create a new one if needed). This is where Apache Spark is set up: Azure subscription and resource group selection

  2. Select the region: Azure region

  3. Insert your contact information: Contact information

  4. Insert the name of your application and the resource group to contain the managed service. You can leave the latter field as is by default: App name

  5. Once completed, click Next.

Configure Apache Spark cluster resources

Let’s configure your newly deployed Apache Spark instance.

In case you see a warning about `Microsoft.Capacity` and `Microsoft.Compute` resource providers, refer to the guide linked in the warning for more information.

  1. Choose whether you want High Availability (HA).

Enabling HA is recommended for production environments, while disabling it in development and testing environments is acceptable for saving on costs.

  1. Choose the VM flavour for the management instance.
    The minimum deployment requires a machine size of at least 4 vCPUs and 16 GB of RAM. The use of vGPUs on a management instance is not recommended. For example, you can use 5 instances of the Standard D4s v3.

Management instance size

The system automatically checks if the required quota for the setup is allowed in your Azure tenant. If it’s not, the procedure generates a message containing all the information needed to increase your quota before continuing with the setup. Please, follow those instructions if needed.

  1. Select the worker nodes Pool Name which is the name given to the AKS NodePool used for the workers.
  2. Configure the Apache Spark workers size and range.
    A minimum viable worker node requires at least 4 vCPUs and 16 GB of RAM but using sizes with multiple vGPUs is recommended for the production environment:

Worker node size

  1. Once completed, click Next.

Configure storage

An Azure storage account is required for your Apache Spark environment.

  1. Select whether you want to use an existing storage account or create a new one.
    Select new or existing storage account

    a. If you decide to create a new storage account, please select a name for it.
    The name must be unique in Azure and avoid naming conflicts with other storage accounts. Additionally, the name must have between 3 and 24 characters in length and use numbers and lower-case letters only.
    We recommend using the suggested name unless there is a conflict. New storage account b. If you decide to use an existing storage account, please select the one you’d like to use and its secret key and container. Existing storage account

  2. Once the storage is configured, click Next.

Configure access

To configure access, you will have to choose the password for admin access to Kyuubi.

  1. Select a password for Kyuubi. The password must be at least 8 characters long and contain only numbers and letters. The username will be admin.

Kyuubi password setup

  1. Once you have set the password, click Next.

Review and submit

  1. Check and agree with the terms and conditions.
  2. Click Create to finish the configuration and start the setup.

The setup should take between 15 and 60 minutes. Once completed, you will be notified via email from canonical.com which will give you all the links and information to start using your setup. If you cannot find it, please check your Spam folder.

Get further help

You can contact Canonical Managed Services for customized deployments if you are a new user.

You can also visit the Support portal or contact out Support team if you are an existing user. You will be asked to provide your Ubuntu One account details, your subscription date, and the email address associated with the deployment.