Posts

Showing posts from September, 2020

Indexing in Azure Cosmos DB

Even though Cosmos DB automatically indexes every property by default, understanding how indexing works is vital for achieving efficient query performance. In Azure Cosmos DB, every property in our items is indexed by default. This is fantastic for developers, as it means we don’t have to spend time managing indexes ourselves. However, there may be times when we want to customize the indexing policy to suit the requirements of our workloads. The purpose of this article is to show you how indexing works in Azure Cosmos DB, what kinds of indexes are available, and how we can employ different indexing strategies to optimize performance depending on what we’re trying to achieve.

How indexing works in Cosmos DB

Azure Cosmos DB persists our items within our containers as JSON documents. We can think of these documents as trees and each property in our item as a node within that tree. Say that I have a document for a customer, and that customer has multip
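To make the tree idea concrete, here is a minimal Python sketch that walks a JSON document and lists the path to every leaf value, roughly the way property paths are written when discussing Cosmos DB indexing. The customer document and the `property_paths` helper are made up for illustration; they are not part of any Cosmos DB SDK.

```python
def property_paths(node, prefix=""):
    """Yield (path, value) for every leaf value in a JSON-like structure."""
    if isinstance(node, dict):
        # Each property name becomes one more segment of the path.
        for key, value in node.items():
            yield from property_paths(value, f"{prefix}/{key}")
    elif isinstance(node, list):
        # Array elements are addressed by their position.
        for index, value in enumerate(node):
            yield from property_paths(value, f"{prefix}/{index}")
    else:
        # A scalar is a leaf of the tree.
        yield prefix, node


# Hypothetical customer item, used only to illustrate the tree shape.
customer = {
    "id": "1",
    "name": "Jane",
    "addresses": [
        {"city": "Auckland"},
        {"city": "Wellington"},
    ],
}

paths = list(property_paths(customer))
for path, value in paths:
    print(path, "=", value)
```

Each printed line corresponds to one root-to-leaf path in the tree, which is the unit an indexing policy includes or excludes.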

Use Azure Key Vault for Azure Databricks

Use Case

I need to use some passwords and keys in my Databricks notebook, but for security reasons I don't want to store them in the notebook. How do I avoid storing sensitive data in Databricks?

Solution

Let's say you want to connect to an Azure SQL Database with SQL Authentication, or to an Azure Blob Storage container with an access key, from Databricks. Instead of storing the password or key in the notebook in plain text, we will store it in an Azure Key Vault as a secret. With one extra line of code we will retrieve the secret and use its value for the connection. The example below shows all individual steps in detail, including creating an Azure Key Vault, but assumes you already have an Azure Databricks notebook and a cluster to run its code. The steps to give Databricks access to the Key Vault deviate slightly from those for Azure Data Factory or an Azure Automation Runbook, because the access policy is set from within Databricks itself. 1
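As a rough sketch of that "extra line of code": inside a notebook, a secret is read with `dbutils.secrets.get(scope=..., key=...)`. The helper below wraps that call so the password never appears in the notebook source. The scope name, secret name, and user name in the usage comment are placeholders rather than values from this post, and `dbutils` is passed in as a parameter because it only exists inside a Databricks notebook.

```python
def sql_connection_properties(dbutils, scope, user, password_key):
    """Build connection properties, reading the password from a secret scope.

    `dbutils` is the object Databricks provides in every notebook.
    `scope` and `password_key` are assumed to name a Key Vault-backed
    secret scope and a secret stored in that vault.
    """
    # The single call that replaces a hard-coded password.
    password = dbutils.secrets.get(scope=scope, key=password_key)
    return {"user": user, "password": password}


# Usage inside a notebook (all names below are placeholders):
# props = sql_connection_properties(dbutils, "my-keyvault-scope",
#                                   "sqladmin", "sql-password")
```

The returned dictionary can then be handed to whatever connector the notebook uses. Databricks redacts secret values when they are printed in notebook output, but it is still best not to print or log them.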