Skip to content

Google BigQuery(GBQ)#

Google BigQuery provides fine-grained access control on BigQuery Datasets. This includes

  • Table-level Access Control
  • Column-level Access Control
  • Native/Dynamic secure view-based Row Filter
  • Masking With Dynamic secure views created using PolicySync

Privacera access control for BigQuery relies on the privacera_bigquery connector service.

Prerequisites#

Create PrivaceraPolicySyncRole IAM Role#

You need to give Privacera PolicySync basic access to GCP. To grant that access, create PrivaceraPolicySyncRole IAM role in your GCP project or GCP organization using the following commands on Google Cloud's shell (gcloud). The shell can be installed and accessed locally or through Google Console.

Run the following command to create the file containing the permissions required for the PrivaceraPolicySyncRole role:

ROLE_NAME="PrivaceraPolicySyncRole"

cat << EOF > ${ROLE_NAME}.yaml
title: "${ROLE_NAME}"
description: "${ROLE_NAME}"
stage: "ALPHA"
includedPermissions:
- resourcemanager.projects.get
- resourcemanager.projects.getIamPolicy
- resourcemanager.projects.setIamPolicy
- iam.roles.list
- iam.roles.get
- iam.roles.create
- iam.roles.update
- bigquery.jobs.create
- bigquery.datasets.get
- bigquery.datasets.create
- bigquery.datasets.update
- bigquery.datasets.delete
- bigquery.datasets.getIamPolicy
- bigquery.datasets.setIamPolicy
- bigquery.tables.list
- bigquery.tables.get
- bigquery.tables.getData
- bigquery.tables.create
- bigquery.tables.update
- bigquery.tables.delete
- bigquery.tables.getIamPolicy
- bigquery.tables.setIamPolicy
- bigquery.rowAccessPolicies.list
- bigquery.rowAccessPolicies.create
- bigquery.rowAccessPolicies.update
- bigquery.rowAccessPolicies.delete
- bigquery.rowAccessPolicies.getIamPolicy
- bigquery.rowAccessPolicies.setIamPolicy

EOF

GCP Project-level access#

Note

If you have multiple projects in your GCP organization and want them to be managed by a single BigQuery connector, then repeat the steps below for each project. Assign the role to the same service account which will be used across multiple projects.

  1. Run the following command. Replace <GCP_PROJECT_ID> with your GCP project ID.

    PROJECT_ID="<GCP_PROJECT_ID>"
    
  2. To create PrivaceraPolicySyncRole role in your GCP project, run the following command.

    gcloud iam roles create ${ROLE_NAME} --project=${PROJECT_ID} --file=${ROLE_NAME}.yaml
    

GCP Organization-level access#

  1. Run the following command. Replace <GCP_ORGANIZATION_ID> with your GCP organization ID.

    ORGANIZATION_ID="<GCP_ORGANIZATION_ID>"
    
  2. To create PrivaceraPolicySyncRole role in your GCP organization, run the following command.

    gcloud iam roles create ${ROLE_NAME} --organization=${ORGANIZATION_ID} --file=${ROLE_NAME}.yaml
    

Attach IAM Role to Service Account#

To attach the PrivaceraPolicySyncRole IAM role created above, do the following steps:

  1. Log in to your GCP console.

  2. Select IAM & admin > Service accounts and click + CREATE SERVICE ACCOUNT.

  3. Enter values in the fields and click CREATE.

  4. In Grant this service account access to project, select the role as PrivaceraPolicySyncRole.

  5. On the Services Account Page, find the newly created service account and copy the email address of the new service account for use in a later step.

    Note

    This email will be the Service Account Email for configuring PolicySync in Privacera Manager.

  6. If you are using a Google VM machine to configure GBQ for PolicySync, then you can attach the service account created above to your VM machine and skip below steps.

  7. On the Services Account Page, go to the Keys tab and click Add Key and select Create New Key.

  8. Select the JSON key type, and click CREATE. A JSON key file downloads to your system. Store the file at an accessible location. It will be used for configuring PolicySync in Privacera Manager.

Related Information

Refer the Google documentation for a detailed information on creating a service account - Click here.

Configure Logs for Auditing#

A sink is required to collect all the logs from GBQ. To create a sink, do the following steps:

  1. In the search bar, search for Logging, and then click Logs Router, and click Create Sink.

  2. Enter the sink name as PolicySyncBigQueryAuditSink, and then click Next.

  3. Enter the sink destination.

    1. In the Select sink service, select BigQuery.

    2. In Select BigQuery dataset, click Create new BigQuery dataset.

    3. Enter the Dataset ID as bigquery_audits and click Create Dataset.

    4. Click Next.

  4. Add the BigQuery logs in the sink:

    In the Build an inclusion filter, add the following line:

    resource.type="bigquery_resource"
    
  5. Click Create Sink.

Related Information

Refer the Google documentation for a detailed information on creating a sink - Click here.

CLI Configuration#

  1. SSH to the instance where Privacera is installed.

  2. Do the following, if you’re not using VM attached credentials:

    1. Access the JSON file of the service account you downloaded using the steps above.

    2. Copy the JSON to the config/custom-vars folder.

  3. Run the following commands.

    cd ~/privacera/privacera-manager/
    cp config/sample-vars/vars.policysync.bigquery.yml config/custom-vars/
    vi config/custom-vars/vars.policysync.bigquery.yml
    
  4. Set the properties for your specific installation. For property details and descriptions, see the Configuration Properties section that follows.

    Note

    Along with the above properties, you can add custom properties that are not included by default. For more information about these properties, see BigQuery Connector.

  5. Run the update.

    cd ~/privacera/privacera-manager 
    ./privacera-manager.sh update
    

Configuration Properties#

Property Mandatory Description Default Value Example
BIGQUERY_V2_ENABLE Property to enable/disable BigQuery. true
BIGQUERY_PROJECT_ID Yes This property is used to set the Google project Id for initial interaction with BigQuery APIs to explore other projects and datasets. privacera-demo-project
BIGQUERY_PROJECT_LOCATION Yes Set this property to the geographical region where the PolicySync's taxonomy will be created. us
BIGQUERY_JDBC_URL This property is used to set the JDBC URL for connecting to the bigquery server. jdbc:bigquery:example.com
BIGQUERY_USE_VM_CREDENTIALS Yes

This property is used if you want to connect toBigQueryusing the service account attached to your virtual machine.
It is not necessary to set BIGQUERY_OAUTH_PRIVATE_KEY_PATH when this property is used.

true
BIGQUERY_OAUTH_SERVICE_ACCOUNT_EMAIL Yes This property is used to specify the Email Address for the service account that will be used by PolicySync to connect to BigQuery.    
BIGQUERY_OAUTH_PRIVATE_KEY_PATH  

This property sets the path of the service account credential JSON file downloaded from the Google Service account keys section. This property value is needed when BIGQUERY_USE_VM_CREDENTIALS is set to false.

/workdir/policysync/cust_conf/policysync-gbq-service-account.json  
BIGQUERY_OAUTH_PRIVATE_KEY_FILE_NAME  

This property sets the name of the service account credential JSON file downloaded from the Google Service account keys section. Then BIGQUERY_OAUTH_PRIVATE_KEY_PATH will be auto-populated based on the filename. This property value is needed when BIGQUERY_USE_VM_CREDENTIALS is set to false.

   
BIGQUERY_AUDIT_ENABLE Yes This property enables access audit retrieval from BigQuery. false  
BIGQUERY_AUDIT_EXCLUDED_USERS Yes This property is used to specify a list of users whose access rights we want to ignore.
It takes list of comma-separated Email Addresses of the users.
 
BIGQUERY_MANAGE_DATASET_LIST Yes This property is used to set comma-separated dataset FQDN on which access control will be managed by PolicySync. If you want to manage all datasets then keep the value blank. It supports wildcards. The ignore dataset list has precedence over manage dataset list. testproject1.dataset1,testproject2.dataset2,sales_project*.sales*
BIGQUERY_MANAGE_PROJECT_LIST Yes This property is used to set comma-separated project names on which access control will be managed by PolicySync. If you want to manage all projects then keep the value blank. It supports wildcards. The ignore projects list has precedence over the manage projects list. testproject1,testproject2,sales_project*
BIGQUERY_GRANT_UPDATES Yes This property determines whether actual grant/revoke runs on bigquery. true

BIGQUERY_NATIVE_PUBLIC_GROUP_IDENTITY_NAME Yes Set this property to your preferred value, and policysync will use this native public group for access grants whenever a policy with a public group reference is created. All authenticated users: all GCP project authenticated users.
All users: all Google authenticated users.

Limitations#

Policy Permissions Limitation
View-based Row Filter Only the first satisfied condition is evaluated from row filter policy items.
Native Row Filter All the row filter policy items from the row filter policy will be evaluated as policysync translates each policy item as single row access policy on the BigQuery table and then the BigQuery handles evaluation part.
Lag on Access Control Enforcement PolicySync only process any changes in the policy in Ranger and submit it as a request to BigQuery. BigQuery takes its time to enforce the policies.

For more information, see the following:
User/Group/Role Policies BigQuery connector won’t create any user/group/roles in Google. Only user/group level policies will work.

Validation#

Let's test the access control using BigQuery by defining some test policies for a test user (emily).

  1. Prepare test data in GCP. Refer Google documentation for a detailed information on running queries in BigQuery - Click here.

    1. Log in to GCP console, navigate to BigQuery and then click Compose New Query.

    2. In the Query editor text area, copy the following query:

      -- Create Dataset
      CREATE SCHEMA customer_dataset;
      
      -- Create Table
      CREATE TABLE IF NOT EXISTS customer_dataset.customer_data (
      id INT64, 
      person_name string,
      domain string,
      ssn string,
      country string,
      us_phone string, 
      address string, 
      account_id string, 
      zipcode string);
      
      -- Insert Data into table
      insert into customer_dataset.customer_data values (1, 'Nancy','nancy@yahoo.com','201-99-5532','US','856-232-9702','939 Park Avenue','159635478','33317');
      insert into customer_dataset.customer_data values (2,'Gene','gene@google.us','202-99-5532','UK','954-583-0575','303 Johnston Blvd','236854569','95202');
      insert into customer_dataset.customer_data values (3,'Edward','edward@facebook.com','203-99-5532','US','209-626-9041','130 Hollister','365412985','60173');
      insert into customer_dataset.customer_data values (4,'Pearlene','pearlene@gmail.com','204-99-5532','US','708-471-6810','17 Warren Rd','452189732','90017');
      insert into customer_dataset.customer_data values (5,'James','james@cuvox.de','205-99-5532','US','661-338-6787','898 Newport Gray Rd','517836427','94041');
      insert into customer_dataset.customer_data values (6,'Pamela','pamela@cuvox.de','206-99-5532','UK','650-526-5259','861 Strick Rd','685231473','80214');
      insert into customer_dataset.customer_data values (7,'Donna','donna@fleckens.hu','207-99-5532','US','303-239-4282','1784 S Shore Dr','789563258','1730');
      insert into customer_dataset.customer_data values (8,'Amy','amy@gustr.com','208-99-5532','US','774-553-4736','9522 Apple Valley Dr','854126945','55102');
      insert into customer_dataset.customer_data values (9,'Adam','adam@teleworm.us','209-99-5532','UK','651-297-1448','745 Old Springville Rd','965412381','43201');
      insert into customer_dataset.customer_data values (10,'Lucille','lucille@armyspy.com','210-99-5532','US','740-320-1270','4223  Midway Road','785651236','89102');
      insert into customer_dataset.customer_data values (11,'Edard','edu@gustr.com','211-99-5532','UK','702-257-8796','3659  Dye Street','965121354','53207');
      insert into customer_dataset.customer_data values (12,'Nick','nick@jourrapide.com','212-99-5532','US','414-483-8638','2966  Nutters Barn Lane','563515264','72764');
      insert into customer_dataset.customer_data values (13,'Brian','brian@einrot.com','213-99-5532','US','479-872-9783','3300  Worthington Drive','654621233','91303');
      insert into customer_dataset.customer_data values (14,'Stella','stella@jourrapide.com','214-99-5532','US','818-596-6681','1893  Ingram Road','261613654','35816');
      insert into customer_dataset.customer_data values (15,'Leona','leona@dayrep.com','215-99-5532','UK','256-250-5413','4244  Burnside Court','986513211','75069');
      
      -- Verify table by running select query
      SELECT * FROM customer_dataset.customer_data;
      
    3. Click Run.

  2. Create test user in Privacera Portal as emily. For more information, click here.

  3. In GCP console, the user emily gets added after you create the user in step 2.

    1. Check the user emily can list the projects inside your organization.

    2. Check user emily does not have Full Admin or Full Reader access on BigQuery.

  4. Create a policy for emily to run queries and list dataset and tables.

  5. Check the access control on the test data in GCP.

    A) Table-level Access Control

    1. In Privacera Portal, create a policy Customer Data Full Access for accessing table.

    2. Log in to GCP console with credentials of the test user emily.

    3. Navigate to BigQuery.

    4. Run the following query.

      SELECT * FROM customer_dataset.customer_data;
      SELECT * FROM customer_dataset_secure.customer_data;
      

      User emily can access and view the data.

    5. In Privacera Portal, disable the policy.

    6. In GCP, run the following query.

      SELECT * FROM customer_dataset.customer_data;
      SELECT * FROM customer_dataset_secure.customer_data;
      

      User emily can not access and view the data.

    B) View-based Column-level Access Control

    In the view-based column-level access control, you have to create a column-level policy on the table. The columns which are not permitted from that policy will be shown as NULL in the secure view of the table.

    1. In Privacera Portal, do the following:

      Create a policy Customer Data Column Level Access granting access to a few columns.

    2. Log in to GCP console with credentials of the test user emily.

    3. Navigate to BigQuery.

    4. Run the following queries.

      User emily will not see person_name column in the secure view of customer_data table..

      SELECT * FROM customer_dataset_secure.customer_data;
      
    5. In Privacera Portal, disable the Customer Data Column Level Access policy.

    C) View-based Row-level Filter

    1. In Privacera Portal, enable Customer Data Full Access policy created above.

      emily can view customer_data for US and UK from table and in secure view.

    2. Log in to GCP console with credentials of the test user emily.

    3. Navigate to BigQuery.

    4. Run the following queries.

      Query to view data from customer data secure view. It will show data from countries US and UK.

      SELECT * FROM customer_dataset_secure.customer_data;
      

      User emily can access and view the data.

    5. In Privacera Portal, create a Customer Access by Country policy to access data only from UK.

    6. Run the following queries.

      Query to view data from customer data secure view. The row filter policy gets applied showing data from country UK.

      SELECT * FROM customer_dataset_secure.customer_data;
      

    D) View-based Masking

    1. Log in to GCP console with credentials of the test user emily.

    2. Navigate to BigQuery.

    3. Run the following query.

      SELECT * FROM customer_dataset_secure.customer_data;
      

      User emily can view the SSN values.

    4. In Privacera Portal, create a Mask SSN policy to mask SSN values for emily.

    5. Run the following query.

      SELECT * FROM customer_dataset_secure.customer_data;
      

      User emily cannot view the the SSN values, since it is masked using md5 encryption.

Create custom IAM roles#

By default, PolicySync creates all the IAM roles listed in the table below automatically to perform access control in GBQ. If you want to create the custom IAM roles manually, then disable the BIGQUERY_CREATE_CUSTOM_IAM_ROLES by setting its value to false.

In the GCP console, you need the map the roles with its appropriate permissions. The role and its permissions mapping are given in the table below. For creating a custom role in GCP and adding its corresponding permissions, see Creating a custom role.

Role Name GCP Permissions
PrivaceraGBQProjectListRole resourcemanager.projects.get
PrivaceraGBQJobListRole bigquery.jobs.list
PrivaceraGBQJobListAllRole bigquery.jobs.listAll
PrivaceraGBQJobCreateRole bigquery.jobs.create
PrivaceraGBQJobGetRole bigquery.jobs.get
PrivaceraGBQJobUpdateRole bigquery.jobs.update
PrivaceraGBQJobDeleteRole bigquery.jobs.delete
PrivaceraGBQDatasetCreateRole bigquery.datasets.create
PrivaceraGBQDatasetGetMetadataRole bigquery.datasets.get
PrivaceraGBQDatasetUpdateRole bigquery.datasets.update
PrivaceraGBQDatasetDeleteRole bigquery.datasets.delete
PrivaceraGBQTableListRole bigquery.tables.list
PrivaceraGBQTableCreateRole bigquery.tables.create
PrivaceraGBQTableGetMetadataRole bigquery.tables.get
PrivaceraGBQTableQueryRole bigquery.tables.getData
PrivaceraGBQTableExportRole bigquery.tables.export
PrivaceraGBQTableUpdateMetadataRole bigquery.tables.update
PrivaceraGBQTableUpdateRole bigquery.tables.updateData
PrivaceraGBQTableSetCategoryRole bigquery.tables.setCategory
PrivaceraGBQTableDeleteRole bigquery.tables.delete
PrivaceraGBQTransferUpdateRole bigquery.transfers.update
PrivaceraGBQTransferGetRole bigquery.transfers.get

Based on the GCP Resource hierarchy, you can create the roles above at the organization or project level. If you have multiple projects in your GCP organization, it is recommended to create all the roles at the organization level instead at the project level. Once the roles are defined at the organization level, all the projects under that organization will inherit all the roles.