
Basic Setup with Databricks

This section describes how to install and configure Privacera encryption in Databricks to create policies for users and groups.


Before enabling encryption for Databricks, make sure you have enabled the following in Privacera Manager:

  • Databricks Spark Plugin (Python/SQL) on AWS, Azure, or GCP.
  • Custom properties for encryption detailed in Crypto.

Methods for Installing Encryption jar

You can install the Privacera encryption jar file in either of the following ways:

  • Install Encryption jar via Databricks CLI
  • Install Encryption jar via Databricks UI

After you install the jar file, define the configuration properties and User-Defined Functions (UDFs) that call the Privacera encryption /protect and /unprotect APIs.

Install Encryption jar via Databricks CLI

  1. Download the jar to a local machine.

    The variable PRIVACERA_BASE_DOWNLOAD_URL depends on the version of the Privacera software you want. See Configure and Install Core Services.

    wget ${PRIVACERA_BASE_DOWNLOAD_URL}/privacera-crypto-jar-with-dependencies.jar -O privacera-crypto-jar-with-dependencies.jar

  2. Upload the jar file to DBFS or to an S3 location that the Databricks cluster can access.

  3. To upload the jar to DBFS, use the Databricks CLI:

    databricks fs ls
    databricks fs mkdirs dbfs:/privacera/crypto/jars
    databricks fs cp privacera-crypto-jar-with-dependencies.jar dbfs:/privacera/crypto/jars/privacera-crypto-jar-with-dependencies.jar
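
A failed download can leave an empty or missing jar on the local machine, which would then be uploaded unnoticed. The following is a minimal sketch, not part of the Privacera or Databricks tooling, that checks the local file before running the same CLI commands as in the steps above. The function names check_jar and upload_jar and the variables JAR_NAME and DBFS_JAR_DIR are hypothetical and chosen only for illustration.

```shell
# Sketch only: helper names are illustrative, not part of Privacera or Databricks.
JAR_NAME="privacera-crypto-jar-with-dependencies.jar"
DBFS_JAR_DIR="dbfs:/privacera/crypto/jars"

# Return non-zero if the local jar is missing or empty
# (for example, after a failed download).
check_jar() {
  jar_path="$1"
  if [ ! -s "$jar_path" ]; then
    echo "jar missing or empty: $jar_path" >&2
    return 1
  fi
}

# Upload the jar with the same CLI commands as the steps above,
# then list the target directory so the result can be inspected.
upload_jar() {
  jar_path="$1"
  check_jar "$jar_path" || return 1
  databricks fs mkdirs "$DBFS_JAR_DIR" &&
    databricks fs cp "$jar_path" "$DBFS_JAR_DIR/$JAR_NAME" &&
    databricks fs ls "$DBFS_JAR_DIR"
}
```

After sourcing these definitions, running upload_jar privacera-crypto-jar-with-dependencies.jar from the download directory would perform the upload only if the local file is present and non-empty. This assumes the Databricks CLI is already installed and configured for the target workspace.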

Install Encryption jar via Databricks UI

  1. Go to the Databricks cluster details page: Clusters > cluster name > Libraries.

  2. Click Install > New.

  3. Drop or upload the jar file.

    Wait until the jar file is installed.

Create and Upload Encryption Configuration Files

The steps below use the default location of the Privacera crypto properties file. To use a different directory, complete these steps first and then see Custom Path to Crypto Properties File in Databricks.

  1. Create the configuration file.

    mkdir -p privacera/crypto/configs
    cd privacera/crypto/configs
    # Edit the file to set the following variables.
    # Mode of encryption/decryption: rpc or native

  2. Upload the configuration file to DBFS.

    databricks fs ls
    databricks fs mkdirs dbfs:/privacera/crypto/configs
    databricks fs cp dbfs:/privacera/crypto/configs/

Create Encryption UDFs

Create the Privacera encryption UDFs (User-Defined Functions) by running the following SQL queries in the Databricks cluster:

  • SQL query to create the Privacera protect UDF:
create database if not exists privacera;
use privacera;
drop function if exists privacera.protect;
CREATE FUNCTION privacera.protect AS 'com.privacera.crypto.PrivaceraEncryptUDF';
  • SQL query to create the Privacera unprotect UDF:
create database if not exists privacera;
drop function if exists privacera.unprotect;
CREATE FUNCTION privacera.unprotect AS 'com.privacera.crypto.PrivaceraDecryptUDF';

Run Sample Queries To Verify

Sample query to run encryption:

select privacera.protect(${colname},'${SCHEME_NAME}') from ${db_name}.${table_name} limit 10;

Sample query to run encryption and decryption in a single query to verify the setup:

select privacera.unprotect(privacera.protect(${colname},'${SCHEME_NAME}'),'${SCHEME_NAME}') from ${db_name}.${table_name} limit 10;
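
The ${...} placeholders in the sample queries must be replaced with real values before the SQL is run. As a rough illustration, the shell function below fills in the round-trip query from four arguments and prints the resulting SQL. The function name render_roundtrip_query is hypothetical, and the values in the usage note are made-up examples, not names from a real deployment.

```shell
# Sketch only: prints the round-trip verification query with the
# placeholders filled in. Arguments: column, scheme, database, table.
# Values are inserted verbatim, so pass trusted identifiers only.
render_roundtrip_query() {
  if [ "$#" -ne 4 ]; then
    echo "usage: render_roundtrip_query COLNAME SCHEME_NAME DB_NAME TABLE_NAME" >&2
    return 2
  fi
  colname="$1"
  scheme_name="$2"
  db_name="$3"
  table_name="$4"
  printf "select privacera.unprotect(privacera.protect(%s,'%s'),'%s') from %s.%s limit 10;\n" \
    "$colname" "$scheme_name" "$scheme_name" "$db_name" "$table_name"
}
```

For example, render_roundtrip_query email MY_SCHEME sales customers prints a query that protects and then unprotects the email column of sales.customers. The printed statement can be pasted into a Databricks notebook or SQL editor.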