Steps to create a Databricks Python notebook and schedule a job

Step I: Create a Databricks Python notebook

  1. Log in to Databricks as an Admin user.

  2. Create a new notebook and provide the following details:

    • Name

    • Set default language to Python

  3. Inside the notebook, paste the Python code snippet shared with this document.

  4. Click Schedule in the top-right corner.
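
The snippet shared with this document is not reproduced here. As a rough illustration only, the sketch below shows one way a notebook could turn the host and token parameters (configured in Step II) into authenticated requests for users and groups. It assumes the Databricks SCIM endpoints under /api/2.0/preview/scim/v2; the helper names normalize_host and build_scim_request are hypothetical, and the shared snippet may work differently.

```python
from urllib.request import Request


def normalize_host(host: str) -> str:
    """Return the workspace URL with a scheme and no trailing slash."""
    host = host.strip().rstrip("/")
    if not host.startswith(("http://", "https://")):
        host = "https://" + host
    return host


def build_scim_request(host: str, token: str, resource: str) -> Request:
    """Build (but do not send) a GET request for SCIM Users or Groups.

    Hypothetical helper: assumes the SCIM API path
    /api/2.0/preview/scim/v2/<resource> and Bearer-token authentication.
    """
    if resource not in ("Users", "Groups"):
        raise ValueError("resource must be 'Users' or 'Groups'")
    url = f"{normalize_host(host)}/api/2.0/preview/scim/v2/{resource}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# Inside a Databricks notebook, job parameters are typically read with:
#   host = dbutils.widgets.get("host")
#   token = dbutils.widgets.get("token")
# Placeholder values are used here so the sketch runs outside Databricks.
users_request = build_scim_request("example.cloud.databricks.com", "<token>", "Users")
groups_request = build_scim_request("example.cloud.databricks.com", "<token>", "Groups")
```

The requests are only constructed, not sent, so the sketch can be inspected without a workspace. Sending them with urllib.request.urlopen would require a real host and an admin token.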

Step II: Schedule a job to run the notebook

Fill in the following details:

  1. Job name

  2. Schedule - Choose Scheduled and configure the job to run once every day at 1:00 in the (UTC+00:00) UTC time zone.

  3. Cluster - Choose the cluster created for Protecto in the previous steps.

  4. Parameters - Add two parameters:

    • Key: host, value: <Databricks instance domain>

    • Key: token, value: <Admin Personal Access Token>. The PAT must belong to an admin user. The token is used to retrieve the user and group mapping, which can be extracted only with an admin user's access token.

  5. Alerts - Provide help@protecto.ai for both Success and Failure Alerts.
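
For teams that prefer to script the setup, the same settings can be expressed as a job definition. The sketch below builds a payload in the shape accepted by the Databricks Jobs API 2.1 (POST /api/2.1/jobs/create) without sending it. The function name build_job_payload, the task key, and the example notebook path and cluster ID are placeholders, not values from this document; 1:00 is interpreted as 01:00 in 24-hour time.

```python
import json


def build_job_payload(job_name, notebook_path, cluster_id, host, token,
                      alert_email="help@protecto.ai"):
    """Return a Jobs API 2.1 style job definition mirroring Step II."""
    return {
        "name": job_name,
        "schedule": {
            # Quartz fields: second minute hour day-of-month month day-of-week.
            # "0 0 1 * * ?" fires once a day at 01:00:00.
            "quartz_cron_expression": "0 0 1 * * ?",
            "timezone_id": "UTC",
            "pause_status": "UNPAUSED",
        },
        "tasks": [
            {
                "task_key": "protecto_notebook",  # placeholder task name
                "existing_cluster_id": cluster_id,
                "notebook_task": {
                    "notebook_path": notebook_path,
                    # The two parameters from the Parameters step.
                    "base_parameters": {"host": host, "token": token},
                },
            }
        ],
        "email_notifications": {
            "on_success": [alert_email],
            "on_failure": [alert_email],
        },
    }


# Placeholder values; replace them with the real notebook path, cluster ID,
# Databricks instance domain, and admin Personal Access Token.
payload = build_job_payload(
    job_name="protecto-daily-job",
    notebook_path="/Users/admin@example.com/protecto-notebook",
    cluster_id="<cluster-id>",
    host="<Databricks instance domain>",
    token="<Admin Personal Access Token>",
)
payload_json = json.dumps(payload, indent=2)
```

Passing the token as a plain job parameter follows the steps above. Storing it in a Databricks secret scope would be a safer alternative, but that would require a change to the shared notebook code.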
