Steps to create Azure Databricks Cluster

Log in to the Azure Databricks workspace.
Select Data Science & Engineering from the sidebar.
Select Create -> New cluster and enter the following details:
- Policy - Unrestricted
- Cluster name - protecto
- Cluster mode - Standard
- Databricks runtime version - latest LTS version
- Enable table access control and only allow Python and SQL commands
- Worker type - Node size - 4 cores, 14 GB RAM (Standard_DS3_v2)
- Driver type - same as worker
- Under Advanced options, add the following to the Spark config:

spark.databricks.acl.dfAclsEnabled true

spark.databricks.repl.allowedLanguages python,sql

spark.databricks.delta.preview.enabled true
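
For teams that script cluster creation instead of using the UI, the settings above can be expressed as a request body for the Databricks Clusters REST API (`POST /api/2.0/clusters/create`). The following is a minimal sketch, not part of the Protecto installation steps. The function name `build_cluster_spec` is hypothetical, and the runtime key and worker count are assumptions because the steps above do not fix them. The UI-only choices (policy, cluster mode, table access control checkbox) are not represented here.

```python
import json


def build_cluster_spec(spark_version: str, num_workers: int = 1) -> dict:
    """Return a request body for POST /api/2.0/clusters/create.

    spark_version: runtime key of the LTS release to use (assumption:
        supplied by the caller, since the steps only say "latest LTS").
    num_workers: worker count (assumption: not stated in the steps).
    """
    return {
        "cluster_name": "protecto",
        "spark_version": spark_version,
        "node_type_id": "Standard_DS3_v2",         # worker type from the steps
        "driver_node_type_id": "Standard_DS3_v2",  # driver same as worker
        "num_workers": num_workers,
        "spark_conf": {
            # The three Spark config entries listed above.
            "spark.databricks.acl.dfAclsEnabled": "true",
            "spark.databricks.repl.allowedLanguages": "python,sql",
            "spark.databricks.delta.preview.enabled": "true",
        },
    }


if __name__ == "__main__":
    # "<lts-runtime-key>" is a placeholder, not a real runtime key.
    print(json.dumps(build_cluster_spec("<lts-runtime-key>"), indent=2))
```

The resulting JSON would be sent with a bearer token to the workspace URL. Check the generated body against the cluster that the UI creates before relying on it.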

Note: The Python notebook and job creation steps will be shared during Protecto product installation. See the attached file (protecto_python_notebook).

protecto_python_notebook.py

Credentials needed to connect to Databricks:

Service principal application id (client id)
Service principal directory id (tenant id)
Service principal application secret (client secret)
Server hostname
Port
SQL endpoint HTTP path
Catalog name (e.g., hive_metastore)
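
To show how these values fit together, here is a hedged sketch of one common way to use them: exchange the service principal credentials for an Azure AD token (client credentials flow), then pass that token to the `databricks-sql-connector` package. This is an illustration, not Protecto's actual connection code. The class and function names are hypothetical, the default port of 443 is an assumption (the usual HTTPS port), and `connect` requires the third-party connector to be installed.

```python
import json
import urllib.parse
import urllib.request
from dataclasses import dataclass

# Azure AD resource ID of the Azure Databricks service.
DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"


@dataclass(frozen=True)
class DatabricksCredentials:
    client_id: str          # service principal application id
    tenant_id: str          # service principal directory id
    client_secret: str      # service principal application secret
    server_hostname: str
    http_path: str          # SQL endpoint HTTP path
    catalog: str = "hive_metastore"
    port: int = 443         # assumption: standard HTTPS port


def build_token_request(creds: DatabricksCredentials) -> tuple[str, bytes]:
    """Build the URL and form body for the Azure AD client credentials flow."""
    url = f"https://login.microsoftonline.com/{creds.tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": creds.client_id,
        "client_secret": creds.client_secret,
        "scope": f"{DATABRICKS_RESOURCE_ID}/.default",
    }).encode()
    return url, body


def fetch_access_token(creds: DatabricksCredentials) -> str:
    """Request a token from Azure AD (performs a network call)."""
    url, body = build_token_request(creds)
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]


def connect(creds: DatabricksCredentials):
    """Open a SQL connection; needs `pip install databricks-sql-connector`."""
    from databricks import sql  # third-party, imported lazily

    return sql.connect(
        server_hostname=creds.server_hostname,
        http_path=creds.http_path,
        access_token=fetch_access_token(creds),
        catalog=creds.catalog,
    )
```

Keeping the request construction in `build_token_request` separate from the network call makes it possible to check the tenant id, client id, and scope without contacting Azure AD.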


Last updated 1 year ago