Basic Setup with Databricks

This section describes how to install and configure the user-defined functions (UDFs) for Privacera Manager encryption in Databricks, so that you can create policies for users and groups.

For conceptual background, see PEG Architecture and Flow.

The overall approach is as follows:

  1. Install the Privacera Manager encryption jar in Databricks with the Databricks CLI or UI.
  2. Upload the Privacera Manager configuration files to Databricks.
  3. Define UDFs in Databricks that call the Privacera Manager encryption protect and unprotect methods.


Prerequisites

  1. In Databricks, make sure that the users who will use the UDFs have write access to the relevant tables.
  2. In Privacera Manager, make sure that the Databricks datasource is configured: Databricks Spark Plugin (Python/SQL) on AWS, Azure, or GCP.
  3. In Privacera Manager, make sure that Privacera Encryption has been enabled.
  4. In Privacera Manager, make sure that the users who will use the UDFs in Databricks have been given permission to access the encryption scheme policies that are part of the UDF syntax.
  5. In Privacera Manager, make sure that these same users have been given permission to access the encryption keys in the Ranger KMS.

Methods for Installing Encryption jar

You can install the Privacera encryption jar file in either of the following ways:

  • Via the Databricks CLI
  • Via the Databricks UI

After you install the jar file, you need to define some configuration properties and User-Defined Functions (UDFs) to call the Privacera encryption /protect and /unprotect API endpoints.

Install Encryption jar via Databricks CLI

  1. Download the jar to a local machine.

    The value of the variable PRIVACERA_BASE_DOWNLOAD_URL depends on the version of the Privacera software you want to install. See Configure and Install Core Services.

    wget $<PRIVACERA_BASE_DOWNLOAD_URL>/privacera-crypto-jar-with-dependencies.jar -O privacera-crypto-jar-with-dependencies.jar
  2. Upload the jar file to a DBFS or S3 location that the Databricks cluster can access.

  3. With the Databricks CLI, upload the jar to DBFS:

    databricks fs ls
    databricks fs mkdirs dbfs:/privacera/crypto/jars
    databricks fs cp privacera-crypto-jar-with-dependencies.jar dbfs:/privacera/crypto/jars/privacera-crypto-jar-with-dependencies.jar
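
If you script the CLI steps above, for example in a provisioning job, they could be wrapped as in the following Python sketch. It only assembles the same databricks fs commands shown above, plus a final listing of the target directory, and runs them in order. The function names build_upload_commands and run_commands are hypothetical, and the sketch assumes the Databricks CLI is installed and authenticated on the machine where it runs.

```python
import subprocess
from typing import List

JAR_NAME = "privacera-crypto-jar-with-dependencies.jar"
DBFS_JAR_DIR = "dbfs:/privacera/crypto/jars"


def build_upload_commands(local_jar: str = JAR_NAME,
                          dbfs_dir: str = DBFS_JAR_DIR) -> List[List[str]]:
    """Return the CLI commands from the steps above as argument lists."""
    target = f"{dbfs_dir}/{JAR_NAME}"
    return [
        ["databricks", "fs", "mkdirs", dbfs_dir],
        ["databricks", "fs", "cp", local_jar, target],
        # Listing the directory afterwards shows whether the jar arrived.
        ["databricks", "fs", "ls", dbfs_dir],
    ]


def run_commands(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Usage (requires the Databricks CLI on PATH):
# run_commands(build_upload_commands())
```

Building the commands as argument lists rather than one shell string avoids quoting problems if the local jar path contains spaces.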

Install Encryption jar via Databricks UI

  1. Go to the Databricks cluster details page: Clusters > cluster name > Libraries.

  2. Click Install > New.

  3. Drop or upload the jar file.

    Wait until the jar file is installed.

Create and Upload Encryption Configuration Files

The steps here rely on the default location of the Privacera crypto properties file. To use a directory of your choice instead, complete the steps here and then see Custom Path to Crypto Properties File in Databricks.

  1. Create the configuration file on your local machine. You upload it to the Databricks cluster in the next step.

    mkdir -p privacera/crypto/configs
    cd privacera/crypto/configs
    # Edit the file to set the following variables.
    # Mode of encryption/decryption: rpc or native
  2. Upload the configuration file to DBFS.

    databricks fs ls
    databricks fs mkdirs dbfs:/privacera/crypto/configs
    databricks fs cp dbfs:/privacera/crypto/configs/

Create Encryption UDFs

Create Privacera encryption UDFs (User-Defined Functions) by running SQL queries in the Databricks cluster:

  • SQL query to create the Privacera protect UDF:
create database if not exists privacera;
drop function if exists privacera.protect;
CREATE FUNCTION privacera.protect AS 'com.privacera.crypto.PrivaceraEncryptUDF';
  • SQL query to create the Privacera unprotect UDF:
create database if not exists privacera;
drop function if exists privacera.unprotect;
CREATE FUNCTION privacera.unprotect AS 'com.privacera.crypto.PrivaceraDecryptUDF';
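
In a Databricks Python notebook, the same statements could be run through the Spark session instead of a SQL cell. The following is a minimal sketch, assuming the notebook's predefined spark session and a cluster where the encryption jar is already installed. The helper names udf_statements and register_udfs are hypothetical.

```python
from typing import List

# UDF name -> implementing class, as given in the SQL statements above.
UDF_CLASSES = {
    "protect": "com.privacera.crypto.PrivaceraEncryptUDF",
    "unprotect": "com.privacera.crypto.PrivaceraDecryptUDF",
}


def udf_statements(database: str = "privacera") -> List[str]:
    """Build the SQL statements that (re)create both UDFs."""
    statements = [f"CREATE DATABASE IF NOT EXISTS {database}"]
    for name, cls in UDF_CLASSES.items():
        # Dropping first makes the script safe to re-run.
        statements.append(f"DROP FUNCTION IF EXISTS {database}.{name}")
        statements.append(f"CREATE FUNCTION {database}.{name} AS '{cls}'")
    return statements


def register_udfs(spark, database: str = "privacera") -> None:
    """Run each statement, one at a time, on the given SparkSession."""
    for stmt in udf_statements(database):
        spark.sql(stmt)


# Usage in a notebook, where `spark` is predefined:
# register_udfs(spark)
```

Each statement is passed to spark.sql separately and without a trailing semicolon, because spark.sql executes a single statement per call.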

Run Sample Queries To Verify

Sample query to run encryption:

select privacera.protect($<colname>,'$<SCHEME_NAME>') from $<db_name>.$<table_name> limit 10;

Sample query to run encryption and decryption in a single query to verify the setup:

select privacera.unprotect(privacera.protect($<colname>,'$<SCHEME_NAME>'),'$<SCHEME_NAME>') from $<db_name>.$<table_name> limit 10;
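
When the placeholders are filled in programmatically, for instance from a Python notebook, the round-trip query could be assembled as in the sketch below. The helper roundtrip_query and the example names email, EMAIL_SCHEME, sales, and customers are hypothetical. The name checks are a simple precaution against malformed input and are not part of Privacera.

```python
import re

# Plain SQL identifiers only: letters, digits, and underscores.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_identifier(name: str) -> str:
    if not _IDENT.fullmatch(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def roundtrip_query(column: str, scheme: str, db: str, table: str,
                    limit: int = 10) -> str:
    """Build the protect/unprotect verification query shown above."""
    col = _check_identifier(column)
    db = _check_identifier(db)
    table = _check_identifier(table)
    if "'" in scheme:
        # The scheme name is embedded in a single-quoted SQL literal.
        raise ValueError(f"invalid scheme name: {scheme!r}")
    return (
        f"select privacera.unprotect("
        f"privacera.protect({col},'{scheme}'),'{scheme}') "
        f"from {db}.{table} limit {int(limit)}"
    )


query = roundtrip_query("email", "EMAIL_SCHEME", "sales", "customers")
print(query)
# In a notebook, the query could then be run with: spark.sql(query).show()
```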