Hive UDFs
These are steps to enable encryption via a Hive User-Defined Function (UDF).
Add Privacera UDF in Hive
-
Login to Privacera Portal.
-
On the Privacera home page, expand Diagnostics and click Health Check.
-
Under the Diagnostics tab, click Encryption.
-
Click Create for UDF - Protect, UnProtect.
-
At the prompt, click Yes.
The UDF is created successfully.
Confirm Privacera UDF in Hive
-
SSH to the instance.
-
Do kinit for $
. -
Connect to beeline.
#Example beeline -u "jdbc:hive2://<hostname>:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" -n $<user>
-
After connecting to beeline, enter the following two commands to describe the functions created by Privacera.
-
DESCRIBE FUNCTION EXTENDED privacera.protect;
-
DESCRIBE FUNCTION EXTENDED privacera.unprotect;
-
-
If the UDF has not been created successfully, you will see the error message “Function ‘privacera.protect’ does not exist”.
Test Privacera UDF in Hive
-
Login to Privacera Portal.
-
Click Encryption & Masking > Schemes .
-
Click Add.
-
Create a scheme by entering the following details, for example, as detailed in Create Custom Encryption Schemes:
- Scheme Name: SWIFT_ID
- Format type: FPE_ALPHA_NUMERIC
- Scope: All
- Algorithm: FPE
-
Click Save.
-
Check if the KMS key is generated for the scheme.
-
Login to Ranger at http://
:6080 with username as keyadmin. -
Click Encryption > Key Manager.
-
-
SSH to the instance.
-
Do kinit for $
. -
Connect to beeline.
#Example beeline -u "jdbc:hive2://<hostname>:2181/; serviceDiscoveryMode=zooKeeper; zooKeeperNamespace=hiveserver2" -n $<user>
-
After connecting to beeline enter the following commands.
select privacera.protect("TEXT" ,"SWIFT_ID"); select privacera.unprotect(privacera.protect("TEXT" ,"SWIFT_ID"), "SWIFT_ID");
-
Check the KMS audits in Ranger.
-
Login to Ranger http://
:6080 with username as keyadmin. -
Click Audit > Access.
-
If the UDF fails, the message "denied access" is recorded in Ranger KMS audits.
If access is denied, you need to give permission to $<user>.
-
If the UDF result is successful, the audits are shown as Allowed.
-
To check crypto UDF logs, run this command:
sudo su tail -f /var/log/hive/hiveserver2.log
-
Give Users Access to the UDFs
Users need access to the UDFs. See general steps in Set User Access in Ranger KMS.
Setup Privacera Encryption
When to Use Shaded Jar
You might need to use a shaded jar depending on certain conditions described below.
To check if the shaded jar is needed:
-
Go to the Hive libraries directory.
cd /opt/cloudera/parcels/CDH/lib/hive/lib/ ls | grep http
You need to install the shaded jar:
-
If the version of the httpclient is less than 4.5.6.
Or
-
If the UDF throws a NoClassDefFound exception for class
org/apache/http/config/Lookup
.
-
-
To install the shaded jar, see shaded jar section.
Create Privacera UDF in CDH/CDP
The following are the steps to install Privacera Encryption jar in CDH/CDP cluster.
-
Download jar to the cluster node.
sudo su - privacera export PRIVACERA_BASE_DOWNLOAD_URL=$<PRIVACERA_BASE_DOWNLOAD_URL> wget $<PRIVACERA_BASE_DOWNLOAD_URL>/privacera-crypto-jar-with-dependencies.jar -O privacera-crypto-jar-with-dependencies.jar
-
Upload jar into HDFS location from where Hive can access it. Following are the commands to upload jar into HDFS using Hadoop CLI.
kinit -kt /opt/privacera/keytab/privacera.headless.keytab privacera hadoop fs -ls hadoop fs -mkdir -p /privacera/crypto/jars hadoop fs -put privacera-crypto-jar-with-dependencies.jar /privacera/crypto/jars/privacera-crypto-jar-with-dependencies.jar
-
Create configuration files.
hadoop fs -mkdir -p /privacera/crypto/configs
vi crypto.properties
privacera.portal.base.url=http://$<PRIVACERA_SERVER>:6868 privacera.portal.username=$<USER_NAME> privacera.portal.password=$<PASSWORD> #Mode of encryption/decryption rpc/native privacera.crypto.mode=native
cp crypto.properties crypto_default.properties
-
Upload configuration to DBFS.
hadoop fs -put crypto.properties /privacera/crypto/configs/crypto.properties hadoop fs -put crypto_default.properties /privacera/crypto/configs/crypto_default.properties
-
Install Crypto jar into Cluster.
- SSH to the cluster node.
- Do kinit for $
(if Kerberos is enabled). - Connect to beeline.
-
If you have Kerberos-enabled cluster, use the following command to login to beeline:
Update the values for for the variables $
and $ below. HIVE_SERVER2_HOSTNAME=<<10.1.1.2>> REALM=<<EXAMPLE.PRIVACERA.US>> beeline -u "jdbc:hive2://$<HIVE_SERVER2_HOSTNAME>:10000/default;principal=hive/$<HIVE_SERVER2_HOSTNAME>@$<REALM>"
-
Add Privacera crypto UDF jar. You need to run multiple SQL queries in Databricks cluster to create privacera encryption functions.
-
SQL query to create the Privacera unprotect UDF.
add jar hdfs:///privacera/crypto/jars/privacera-crypto-jar-with-dependencies.jar;
-
-
Create the Privacera encryption UDF. You need to run multiple SQL queries in beeline to create Privacera encryption UDF.
-
SQL query to create Privacera unprotect udf.
create database if not exists privacera ; drop function if exists privacera.unprotect; create function privacera.unprotect AS 'com.privacera.crypto.PrivaceraDecryptUDF' using jar 'hdfs:///privacera/crypto/jars/privacera-crypto-jar-with-dependencies.jar';
-
SQL query to create Privacera protect UDF.
drop function if exists privacera.protect; create function privacera.protect AS 'com.privacera.crypto.PrivaceraEncryptUDF' using jar 'hdfs:///privacera/crypto/jars/privacera-crypto-jar-with-dependencies.jar';
-
Installing the Shaded Jar
The following are the steps to install the shaded Privacera Encryption jar in a CDH/CDP cluster.
-
Download jar in cluster node.
sudo su - privacera export PRIVACERA_BASE_DOWNLOAD_URL=$<PRIVACERA_BASE_DOWNLOAD_URL> wget $<PRIVACERA_BASE_DOWNLOAD_URL>/privacera-crypto-jar-with-dependencies-shaded.jar -O privacera-crypto-jar-with-dependencies-shaded.jar
-
Upload jar into HDFS location from where Hive can access it. Following are the commands to upload jar into HDFS using Hadoop CLI.
3. Create configuration files.kinit -kt /opt/privacera/keytab/privacera.headless.keytab privacera hadoop fs -ls / hadoop fs -mkdir -p /privacera/crypto/jars hadoop fs -put privacera-crypto-jar-with-dependencies-shaded.jar /privacera/crypto/jars/privacera-crypto-jar-with-dependencies-shaded.jar
hadoop fs -mkdir -p /privacera/crypto/configs
vi crypto.properties
privacera.portal.base.url=http://$<PRIVACERA_SERVER>:6868 privacera.portal.username=$<USER_NAME> privacera.portal.password=$<PASSWORD> #Mode of encryption/decryption rpc/native privacera.crypto.mode=native
cp crypto.properties crypto_default.properties
-
Upload configuration to DBFS.
hadoop fs -put crypto.properties /privacera/crypto/configs/crypto.properties hadoop fs -put crypto_default.properties /privacera/crypto/configs/crypto_default.properties
-
Install Crypto jar into Cluster.
- SSH to the cluster node.
- Do kinit for $
(if Kerberos is enabled). - Connect to beeline.
-
If you have Kerberos-enabled cluster, use the following command to login to beeline:
Update the values for $
and $ as per your environment HIVE_SERVER2_HOSTNAME=<<10.1.1.2>> REALM=<<EXAMPLE.PRIVACERA.US>> beeline -u "jdbc:hive2://$<HIVE_SERVER2_HOSTNAME>:10000/default;principal=hive/$<HIVE_SERVER2_HOSTNAME>@$<REALM>"
-
Add Privacera crypto UDF jar. You need to run multiple SQL queries in Databricks cluster to create privacera encryption functions.
-
SQL query to create the Privacera unprotect UDF.
add jar hdfs:///privacera/crypto/jars/privacera-crypto-jar-with-dependencies-shaded.jar;
-
-
Create the Privacera encryption UDF. (User defined function) You need to run multiple SQL queries in beeline to create Privacera encryption UDF.
-
SQL query to create Privacera unprotect udf.
create database if not exists privacera ; drop function if exists privacera.unprotect; CREATE FUNCTION privacera.unprotect AS 'com.privacera.crypto.PrivaceraDecryptUDF' using jar 'hdfs:///privacera/crypto/jars/privacera-crypto-jar-with-dependencies-shaded.jar';
-
SQL query to create Privacera protect UDF.
drop function if exists privacera.protect; CREATE FUNCTION privacera.protect AS 'com.privacera.crypto.PrivaceraEncryptUDF' using jar 'hdfs:///privacera/crypto/jars/privacera-crypto-jar-with-dependencies-shaded.jar';
-
Hive TEZ: Add Properties File in beeline
If you are using Hive TEZ, before executing the UDFs, you must add the cryptop.properties file in beeline wit hthe following command:
add file hdfs:///privacera/crypto/configs/crypto.properties;
Sample Queries to Verify Setup
-
Sample query to run encryption:
select privacera.protect("test_data","$<SCHEME_NAME>") limit 10;
-
Sample query to run encryption and decryption in one query to verify setup:
select privacera.unprotect(privacera.protect("test_data","$<SCHEME_NAME>"),"$<SCHEME_NAME>") limit 10;
-
For authorization and leveraging Hive Masking policies, you need to install the Hive plug-in in hiveserver2.
If you do not want to install the Hive plug-in, you can authorize use of the keys based on KMS.
-
Create a view on top of your raw table:
create view secure_view as select col1, privacera.protect(col2, ‘SSN’) as col2 from db.table;
select * from secure_view;
-
If the user is not present in Privacera, they can still access the protect function. It is recommended to use the Hive plug-in as they can control the access to the resource using Ranger Policies and it makes it easier to manage them with the simple UI.