Sunday, October 18, 2020

Manage Secrets in Azure Databricks Using Azure Key Vault

To manage credentials, Azure Databricks offers Secret Management, which lets users store and share credentials through a secure mechanism. Azure Databricks currently offers two types of secret scopes:

  • Azure Key Vault-backed: To reference secrets stored in an Azure Key Vault, you can create a secret scope backed by Azure Key Vault. Azure Key Vault-backed secret scopes are only supported on the Azure Databricks Premium Plan.
  • Databricks-backed: A Databricks-backed scope is stored in (backed by) an Azure Databricks database. You create a Databricks-backed secret scope using the Databricks CLI (version 0.7.1 and above); a Python sketch using the Secrets REST API is shown after this list.
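
The following is a rough sketch of creating a Databricks-backed scope programmatically through the Databricks Secrets REST API rather than the CLI; the workspace URL, personal access token, and scope name are hypothetical placeholders:

import requests

# Hypothetical values - replace with your workspace URL and a personal access token
workspace_url = "https://<databricks-instance>.azuredatabricks.net"
token = "<personal-access-token>"

# Create a Databricks-backed secret scope via the Secrets REST API
# (the equivalent of "databricks secrets create-scope" in the Databricks CLI)
response = requests.post(
    workspace_url + "/api/2.0/secrets/scopes/create",
    headers={"Authorization": "Bearer " + token},
    json={"scope": "my-databricks-scope", "initial_manage_principal": "users"},
)
response.raise_for_status()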

Creating Azure Key Vault

Open a Web Browser. I am using Chrome.

Enter the URL https://portal.azure.com and hit enter.

Web Browser

Sign in to your Azure account.


Azure Portal  - Login Information

After successfully logging in to the Azure Portal, you should see the following screen.

Azure Portal - Home Page

Click on "All Services" on the top left corner.

Azure Portal - All Services

Search for "Azure Key Vault" in the "All Services" search text box.

Azure Portal - Search for Azure Key Vault service

Click on "Key vaults". It will open the blade for "Key vaults".


Azure Portal - Azure Key Vault Service View

Click on "Add". It will open a new blade for creating a key vault "Create key vault".

Azure Portal - Create Azure Key Vault

Enter all the information and click the "Create" button. Once the resource is created, refresh the screen and the newly created key vault will appear.

Azure Portal - Azure Key Vault Service View with the newly created Azure Key Vault service

Click on the "key vault" name.

Azure Portal - Azure Key Vault Overview Page

Scroll down and click on "Properties".


Azure Portal - Azure Key Vault Menu

Save the following information for the key vault you created. We will use these properties when we connect to the Key Vault from Databricks; example formats are shown below:

  • DNS Name
  • Resource ID
Azure Portal - Azure Key Vault Properties (DNS Name and Resource ID)
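
For reference, the two properties typically look like the following; the vault name, subscription, and resource group are hypothetical placeholders:

# Hypothetical examples of the two Key Vault properties noted above
dns_name = "https://<vault-name>.vault.azure.net/"
resource_id = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.KeyVault/vaults/<vault-name>"
)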

Creating Secret in Azure Key Vault

Click on "Secrets" on the left-hand side.

Azure Portal - Azure Key Vault Menu

Click on "Generate/Import". We will be creating a secret for the "access key" for the "Azure Blob Storage".

Azure Portal - Azure Key Vault Generate/Import View

Enter the required information for creating the "secret".

Azure Portal - Azure Key Vault Secret creation view

After entering all the information, click the "Create" button.

Azure Portal - Azure Key Vault Generate/Import View

Note down the "Name" of the secret.
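
As an alternative to the portal, the same secret can be created from Python using the azure-identity and azure-keyvault-secrets packages. This is a minimal sketch; the vault URL and the secret value are hypothetical placeholders, and the secret name matches the one used in the rest of this article:

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# Hypothetical values - use the DNS Name saved earlier and your real access key
vault_url = "https://<vault-name>.vault.azure.net/"
client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())

# Create (or update) the secret that will hold the Blob Storage access key
client.set_secret("BlobStorageAccessKey", "<storage-account-access-key>")

# Read the secret back to confirm it was stored
print(client.get_secret("BlobStorageAccessKey").name)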

Creating Azure Key Vault Secret Scope in Databricks

Open a Web Browser. I am using Chrome.

Enter the URL https://portal.azure.com and hit enter.

Web Browser 

Sign in to your Azure account.

Azure Portal  - Login Information

Open the Azure Databricks workspace created as mentioned in the Requirements section at the top of the article.

Azure Databricks - Workspace


Click on Launch Workspace to open Azure Databricks.

Azure Databricks - Home Page

Copy the "URL" from the browser window.

Azure Databricks - Home Page

Build the "URL" for creating the secret scope. https://<Databricks_url>#secrets/createScope.

Azure Databricks - Creating the Azure Key Vault backed  secret scope.

Enter all the required information:

  • Scope Name.
  • DNS Name (this is the "DNS name" which we saved when we created the "Azure Key Vault").
  • Resource ID (this is the "Resource ID" which we saved when we created the "Azure Key Vault").
Azure Databricks - Creating the Azure Key Vault backed  secret scope.

Click the "Create" button.

"Databricks" is now connected with "Azure Key Vault".

Using Azure Key Vault Secret Scope and Secret in Azure Databricks Notebook

Open a Web Browser. I am using Chrome.

Enter the URL https://portal.azure.com and hit enter.

Web Browser

Sign in to your Azure account.

Azure Portal  - Login Information

Open the Azure Databricks workspace created as mentioned in the Requirements section at the top of the article.

Azure Databricks - Workspace

Click on "Launch Workspace" to open the "Azure Databricks".

Azure Databricks - Home page

In the left pane, click Workspace. From the Workspace drop-down, click Create, and then click Notebook.

In the Create Notebook dialog box, enter a name and select Python as the language.

Azure Databricks - Create a Python Notebook

Enter the following code in the notebook:

dbutils.secrets.get(scope = "azurekeyvault_secret_scope", key = "BlobStorageAccessKey") 

#azurekeyvault_secret_scope --> Azure Key Vault based scope which we created in Databricks 
#BlobStorageAccessKey --> Secret name which we created in Azure Key Vault 
command line

When you run the above command, the output shows [REDACTED], which confirms that the secret was retrieved from Azure Key Vault and that Databricks redacts its value in the notebook output.

command line
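
A typical next step is to pass the retrieved secret to Spark so the cluster can read from the Blob Storage account. This is a minimal sketch; the storage account name is a hypothetical placeholder:

# Hypothetical storage account name
storage_account_name = "<storage-account-name>"

# Use the secret as the account key so Spark can access the Blob Storage account
spark.conf.set(
    "fs.azure.account.key." + storage_account_name + ".blob.core.windows.net",
    dbutils.secrets.get(scope="azurekeyvault_secret_scope", key="BlobStorageAccessKey"),
)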

In the same notebook, we will add another command cell, this time using Scala as the language.

%scala 
val blob_storage_account_access_key = dbutils.secrets.get(scope = "azurekeyvault_secret_scope", key = "BlobStorageAccessKey") 

//azurekeyvault_secret_scope --> Azure Key Vault based scope which we created in Databricks 
//BlobStorageAccessKey --> Secret name which we created in Azure Key Vault 
command line

When you run the above command, the output again shows [REDACTED], confirming that the secret was retrieved from Azure Key Vault and redacted in the notebook output.

command line
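
To double-check what the workspace can see, the secret utilities can also list the available scopes and the secret names inside a scope (secret values themselves are never listed):

# List all secret scopes visible to this workspace
print(dbutils.secrets.listScopes())

# List the secret names (not values) inside the Key Vault-backed scope
print(dbutils.secrets.list("azurekeyvault_secret_scope"))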

