Skip to content

Databricks SQL Encryption

This procedure enables use of Privacera encryption services in a Databricks SQL notebook. The process is:

  • Create a secret shared by Privacera Encryption Gateway (PEG) and Databricks.
  • Create Resource Policies in Privacera for data access to Databricks SQL resources.
  • Create Privacera encryption and decryption User-Defined Functions (UDFs) in Databricks

About Encryption and Presentation Schemes#

For more information about Privacera encryption schemes, see the Privacera Encryption Guide.

Prerequisites#

Grant Permission in Encryption Scheme Policy#

To use Databricks SQL encryption, you must create a scheme policy for a user that will use the Databricks UDF. This scheme policy must grant the getSchemes permission. See Create Scheme Policies on PrivaceraCloud.

Configure Databricks#

Create Databricks Secrets#

With the Databricks CLI:

  1. Create a secret scope called "privaceracloud"

    databricks secrets create-scope --scope privaceracloud
    
  2. Add secrets to this scope:

    • peg_username, peg_password, and peg_secret are literals and should be entered exactly as shown.
    • The <username>, <password>, and <sharedsecret> values below are the same as what you entered in PrivaceraCloud when adding the PEG service. See Api Key.
    databricks secrets put --scope privaceracloud --key peg_username --string-value <username>
    databricks secrets put --scope privaceracloud --key peg_password --string-value <password> 
    databricks secrets put --scope privaceracloud --key peg_secret --string-value <sharedsecret>
    

Add Privacera Environment Variables to Databricks Cluster#

Add the following variables in your Databricks cluster Environment Variables configuration as shown:

PEG_SECRET={{secrets/privaceracloud/peg_secret}}
PEG_PASSWORD={{secrets/privaceracloud/peg_password}}
PEG_USERNAME={{secrets/privaceracloud/peg_username}}

Note that there can be existing environment variables. Do not remove these.

<img src="../assets/databricks_encryption_image5.png" />

Create Privacera protect and unprotect UDFs#

Log into Databricks, create a notebook, and set the language to SQL.

Run the following SQL commands in Databricks to create UDFs for Privacera encryption services, named protect and unprotect.

Note

com.privacera.crypto functions enable use of encryption schemes, but do not accept presentation schemes.

  • Create Privacera protect UDF:
create database if not exists privacera;
use privacera;
drop function if exists privacera.protect;
CREATE FUNCTION privacera.protect AS com.privacera.crypto.PrivaceraEncryptUDF'
  • Create Privacera unprotect UDF:
use privacera;
drop function if exists privacera.unprotect;
CREATE FUNCTION privacera.unprotect AS com.privacera.crypto.PrivaceraDecryptUDF'

Configure Privacera Resource Policies#

Databricks SQL resources are managed under Access Manager > Resource Policies > privacera_hive. Add resource policies allowing access to selected resources.

  1. Create a policy to give data access users, groups, or roles select privilege to target database resources. See the following sample:

    <img src="../assets/databricks_encryption_image6.png" />

  2. Create a policy to grant data access users, groups, or roles select privilege to the protect and unprotect User Defined Functions. See the following sample:

    <img src="../assets/databricks_encryption_image7.png" />

Using UDFs in SQL to Encrypt and Decrypt#

Syntax and examples for privacera.protect (encrypt) and privacera.unprotect (decrypt) UDFs:

privacera.protect#

Syntax:

select privacera.protect(${colname},'${SCHEME_NAME}') from ${db_name}.${table_name};

where:

  • ${colname} is the identifier of the column to encrypt.
  • ${SCHEME_NAME} is the name of the desired Privacera encryption scheme.
  • ${db_name}.{table_name} are the names of the database and table in that database.

Example

In this example, the email column of the bigdatabase.customer_data table is encrypted with the SYSTEM_EMAIL encryption scheme.

select privacera.protect(email, \'SYSTEM\_EMAIL\') from bigdatabase.customer\_data;

privacera.unprotect#

Syntax:

select privacera.unprotect(${colname},'${SCHEME_NAME}') from ${db_name}.${table_name};

where:

  • ${colname} is the identifier of the column to decrypt.
  • ${SCHEME_NAME} is the name of the desired Privacera encryption scheme, which must be the same encryption scheme used to originally encrypt..
  • ${db_name}.{table_name} are the names of the database and table in that database.

Example

In this example, the email column of the bigdatabase.customer_data table is decrypted with the SYSTEM_EMAIL encryption scheme.

select privacera.unprotect(email, SYSTEM_EMAIL') from bigdatabase.customer_data;

Last update: August 19, 2021