Installing the Databricks CLI
Written by: Chris Sutcliffe
The Databricks command-line interface (CLI) provides an easy-to-use interface to the Databricks platform. It is built on top of the Databricks REST API and can be used with the Workspace, DBFS, Jobs, Clusters, Libraries and Secrets APIs.
To get you started, in this blog we'll walk you through all the steps involved, right from the beginning.
Step 1: Installing Python
If you already have Python installed, ensure it is version 2.7.9 or above and skip to Step 3.
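One quick way to confirm the version requirement is to ask the interpreter itself. The sketch below is a minimal check, assuming `python` already runs from your command prompt; the `meets_minimum` helper is a name made up for this example.

```python
import sys

# Minimum version stated in this step
MINIMUM = (2, 7, 9)

def meets_minimum(version, minimum=MINIMUM):
    """Return True if a (major, minor, micro) tuple is at or above the minimum."""
    return tuple(version[:3]) >= minimum

if __name__ == "__main__":
    current = sys.version_info[:3]
    status = "OK" if meets_minimum(current) else "too old"
    print("Python %d.%d.%d: %s" % (current + (status,)))
```

Save it under any name you like (for example check_version.py) and run it with python. If you only want the version number, python --version prints it without the comparison.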
To install Python (in Windows 10), head over to the Python website and download the latest version. I chose the Windows x86-64 executable installer.
Next you'll need to add a PYTHON_HOME system variable pointing to the directory of your Python installation. To do this, click Start, search for "System" and click "Edit the system environment variables".
On the System Properties window Advanced tab, click Environment Variables...
Click New and enter PYTHON_HOME as the variable name, and the path to your Python installation as the variable value.
Next, add PYTHON_HOME to the Path environment variable. To do this, in the Environment Variables window, find the "Path" system variable, click Edit > New and enter: %PYTHON_HOME%\;%PYTHON_HOME%\Scripts\
Click OK to save your changes.
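To double-check the environment variables, you can open a new command prompt (so the changes are picked up) and run a small script like the one below. It is only a sketch: `expected_entries` and `missing_path_entries` are hypothetical helper names, and it assumes the two Path entries are the install folder and its Scripts subfolder, as entered above.

```python
import ntpath
import os

def expected_entries(python_home):
    """The two folders this step adds to Path: the install folder and Scripts."""
    return [python_home, ntpath.join(python_home, "Scripts")]

def _normalize(entry):
    # Windows paths ignore case, and a trailing backslash makes no difference
    return ntpath.normcase(ntpath.normpath(entry.strip()))

def missing_path_entries(python_home, path_value):
    """Return the expected folders that do not appear in a ';'-separated Path."""
    present = {_normalize(p) for p in path_value.split(";") if p.strip()}
    return [e for e in expected_entries(python_home) if _normalize(e) not in present]

if __name__ == "__main__":
    home = os.environ.get("PYTHON_HOME")
    if home is None:
        print("PYTHON_HOME is not set")
    else:
        missing = missing_path_entries(home, os.environ.get("PATH", ""))
        print("Missing from Path: " + (", ".join(missing) or "nothing"))
```

The comparison goes through ntpath so that C:\Python38 and c:\python38\ count as the same folder.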
Step 2: Installing pip
If this is your first time installing Python, you'll next need to install pip. Pip is the standard package manager for Python, and it allows you to install and manage additional packages that are not part of Python's standard library.
To install pip, open your web browser and enter the URL https://bootstrap.pypa.io/get-pip.py, then right-click the page and choose Save as.
Next, open your command prompt, navigate to the folder where you downloaded the get-pip.py file, and execute the following command:
python get-pip.py
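Once the command finishes, it is worth confirming that pip is visible to the interpreter you just installed. The following is a minimal sketch written for Python 3 (it relies on `importlib.util`); `pip_available` is a name invented for this example.

```python
import importlib.util
import subprocess
import sys

def pip_available():
    """Return True if the running interpreter can locate the pip package."""
    return importlib.util.find_spec("pip") is not None

if __name__ == "__main__":
    if pip_available():
        # Same as typing "python -m pip --version" at the command prompt
        subprocess.call([sys.executable, "-m", "pip", "--version"])
    else:
        print("pip not found for " + sys.executable)
```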
Step 3: Installing and configuring the Databricks CLI
From the command prompt, execute the following command:
pip install databricks-cli
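pip places a databricks executable in the Scripts folder of your Python installation, which is why that folder was added to Path in Step 1. A small sketch to confirm the executable can be found is shown below; it assumes Python 3 (for `shutil.which`) and uses a hypothetical helper name, `find_cli`.

```python
import shutil
import subprocess

def find_cli(name="databricks"):
    """Return the full path of the executable if it is on Path, else None."""
    return shutil.which(name)

if __name__ == "__main__":
    cli = find_cli()
    if cli is None:
        print("databricks not found; check that the Scripts folder is on your Path")
    else:
        # Prints the installed CLI version
        subprocess.call([cli, "--version"])
```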
Step 4: Create a Databricks Access Token
Accessing Databricks via the Databricks CLI requires an access token. To generate one, use the same method we explained in a previous blog, Connecting Power BI to Databricks, or follow the steps below:
From the Azure Databricks portal, click on the account icon
Then select User Settings
Click on Access Tokens
Click "Generate New Token" and add a comment and lifetime (leaving the lifetime blank will give it an indefinite lifetime).
This token can then be used when connecting with the Databricks CLI (make sure you copy it, as you only get one chance). It's recommended that you store it in a safe place such as Azure Key Vault.
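Because the CLI sits on top of the REST API, the same token can also authenticate a direct API call, where it is sent as a bearer token in the Authorization header. The sketch below only builds such a request and does not send it; the function name, the example host and the token value are placeholders, not values from this walkthrough.

```python
import urllib.request

def build_list_clusters_request(host, token):
    """Build (but do not send) a GET request for the clusters/list endpoint."""
    url = host.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": "Bearer " + token},
        method="GET",
    )

if __name__ == "__main__":
    # Placeholder values; substitute your own workspace URL and token
    req = build_list_clusters_request("https://example.azuredatabricks.net", "dapiXXXX")
    print(req.get_method(), req.full_url)
```

In a real script, read the token from a secure store rather than typing it into the source file.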
Step 5: Log in to the Databricks Runtime
From your command prompt, type the following command, then enter your workspace URL and token when prompted:
databricks configure --token
Databricks Host (should begin with https://): e.g. https://<your region>.azuredatabricks.net/?o=XXXXXXXXXXXXXX
Token: <paste your token here>
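The configure command saves what you entered to a file named .databrickscfg in your user profile folder, under a [DEFAULT] section with host and token entries. If you want to confirm what was stored without printing the full token, a sketch along these lines may help; `read_profile` is a hypothetical helper and the file layout is assumed to match the description above.

```python
import configparser
import os

CONFIG_PATH = os.path.join(os.path.expanduser("~"), ".databrickscfg")

def read_profile(text, profile="DEFAULT"):
    """Parse config text and return (host, masked token) for one profile."""
    # interpolation=None so a '%' in a value is read literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser[profile]
    token = section.get("token", "")
    masked = token[:4] + "..." if token else "(none)"
    return section.get("host"), masked

if __name__ == "__main__":
    if os.path.exists(CONFIG_PATH):
        with open(CONFIG_PATH) as handle:
            print(read_profile(handle.read()))
    else:
        print("No config file at " + CONFIG_PATH)
```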
If all goes well, you should now be able to manage Databricks using a multitude of commands.
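As a first smoke test, you can try a few read-only commands such as databricks workspace ls, databricks clusters list and databricks fs ls. The sketch below runs them from Python and reports each exit code; it assumes Python 3.7 or later and the command names of the databricks-cli package installed in Step 3, and `run_examples` is a name made up for this example.

```python
import shutil
import subprocess

# Three read-only commands: list the workspace root, the clusters, and the DBFS root
EXAMPLE_COMMANDS = [
    ["databricks", "workspace", "ls", "/"],
    ["databricks", "clusters", "list"],
    ["databricks", "fs", "ls", "dbfs:/"],
]

def run_examples(commands=EXAMPLE_COMMANDS):
    """Run each command and return a list of (command, exit code) pairs."""
    cli = shutil.which("databricks")
    if cli is None:
        print("databricks executable not found on Path; skipping")
        return []
    results = []
    for cmd in commands:
        completed = subprocess.run([cli] + cmd[1:], capture_output=True, text=True)
        results.append((cmd, completed.returncode))
        print(" ".join(cmd) + " -> exit code " + str(completed.returncode))
    return results

if __name__ == "__main__":
    run_examples()
```

An exit code of 0 for each command suggests the host and token were accepted.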
Summary
In this blog, we walked through all the steps to install and access the Databricks CLI to enable you to manage all aspects of your Databricks environment. In future blogs we'll go through more examples of using the CLI.
BizOne's consultants are experts in developing cloud-based solutions on Microsoft Azure using Azure Databricks. For more information on how we can help you leverage cutting-edge tools for your organization, contact us below and schedule a free demo!