This guide describes how to integrate Azure Blob Storage within your Charmed Kubeflow deployment.
Requirements
Assign permissions
Before starting to work with Azure Blob Storage, ensure you have the necessary permissions to access the storage account. Follow this guide to self-assign the Storage Blob Data Contributor role.
Create and connect to a new notebook
From the Kubeflow dashboard, navigate to Notebooks, and click on New Notebook. Select a JupyterLab environment, and connect to the newly created notebook.
Install required packages
On the JupyterLab launcher, click on Terminal to start a new terminal session. Next, install the packages required to connect to your Azure account and interact with it using the Python client library:
pip install azure-cli azure-storage-blob azure-identity
The installation may take a few minutes to complete.
Sign in to your Azure account
Sign in to Azure through the Azure CLI using the following command:
az login
Confirm that you have successfully logged in to your account:
az account show
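The az login session created here is what the Python client reuses later through AzureCliCredential. If you want to confirm from Python that the CLI credential is usable, a minimal check such as the following can be run in a notebook cell (a sketch; it requests a token for the standard Azure Storage scope https://storage.azure.com/.default):
from azure.identity import AzureCliCredential

# Request a token for Azure Storage using the existing 'az login' session;
# this raises an error if the CLI session is missing or expired
token = AzureCliCredential().get_token("https://storage.azure.com/.default")
print("Token acquired, expires at (epoch seconds):", token.expires_on)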
Connect to your Azure account via the Python client
Add a new tab in your JupyterLab environment, and then create a new Python 3 notebook. Within a notebook cell, run the following code to connect to your Azure account:
import os, uuid
from azure.identity import AzureCliCredential
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
try:
    print("Azure Blob Storage Python quickstart sample")

    # Build the service client from the storage account URL and the Azure CLI credential
    account_url = "https://<storageaccountname>.blob.core.windows.net"
    default_credential = AzureCliCredential()
    blob_service_client = BlobServiceClient(account_url, credential=default_credential)
except Exception as ex:
    print('Exception:')
    print(ex)
Replace the <storageaccountname> token with the storage account name you want to interact with.
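To confirm that the client can reach the storage account before moving on, you can list its containers. This is an optional check, and it assumes the signed-in identity already holds the Storage Blob Data Contributor role assigned earlier:
# List existing containers to verify the connection and permissions
for container in blob_service_client.list_containers():
    print(container.name)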
Create a new blob container
You can create a new blob container by creating a new text file in the data directory and uploading it as follows:
local_path = "./data"
os.mkdir(local_path)
local_file_name = str(uuid.uuid4()) + ".txt"
upload_file_path = os.path.join(local_path, local_file_name)
file = open(file=upload_file_path, mode='w')
file.write("Hello, World!")
file.close()
container_name = str(uuid.uuid4())
blob_client = blob_service_client.get_blob_client(container=container_name, blob=local_file_name)
print("\nUploading to Azure Storage as blob:\n\t" + local_file_name)
with open(file=upload_file_path, mode="rb") as data:
blob_client.upload_blob(data)
The local file name is defined by the local_file_name variable and the new container name by the container_name variable.
See Naming and Referencing Containers, Blobs, and Metadata for more information about naming containers.
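To check that the upload succeeded, you can query the blob's properties through the same blob_client used above. This is an optional verification sketch:
# Confirm the blob exists and inspect its size in bytes
properties = blob_client.get_blob_properties()
print(properties.name, properties.size)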
List the blobs in a container
You can list all blobs in a specified container as follows:
print("\nListing blobs...")
blob_list = container_client.list_blobs()
for blob in blob_list:
print("\t" + blob.name)
Download blobs
You can download blobs and save them to your local file system. Use the following code to download the blob specified by its name:
# blob.name refers to the last blob returned by the listing in the previous step
download_file_path = os.path.join(local_path, str.replace(local_file_name, '.txt', 'DOWNLOAD.txt'))
container_client = blob_service_client.get_container_client(container=container_name)
print("\nDownloading blob to \n\t" + download_file_path)

with open(file=download_file_path, mode="wb") as download_file:
    download_file.write(container_client.download_blob(blob.name).readall())
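As a quick sanity check, you can compare the downloaded copy against the original local file. This sketch only uses the paths already defined above:
# Verify that the downloaded content matches the uploaded file
with open(upload_file_path, "rb") as original, open(download_file_path, "rb") as downloaded:
    print("Contents match:", original.read() == downloaded.read())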
Clean up resources
Clean up the resources created throughout this guide by running the following code:
print("\nPress the Enter key to begin clean up")
input()
print("Deleting blob container...")
container_client.delete_container()
print("Deleting the local source and downloaded files...")
os.remove(upload_file_path)
os.remove(download_file_path)
os.rmdir(local_path)
print("Done")
Alternatively, you can clean up these resources using the Azure CLI, following this guide.