
Datasource

Datasource refers to the data resources and configuration steps used to connect PrivaceraCloud to data repositories, directory services, and identity providers.

Datasource supports configurations to connect PrivaceraCloud to and from:

  • Data resources: PolicySync and Data Server type connections ('data repositories'), as well as Discovery data source targets.

  • Data access users: UserSync import sourced via the protocols LDAP, LDAP-SSL, and SCIM, and applications built on those protocols: Active Directory, Azure Active Directory, or Okta.

  • Portal users: Directory Service based on LDAP, such as Active Directory Server, or SAML-based Identity Providers.

Terminology#

Data Sources are organized into systems and applications.

A datasource system is a datasource application namespace and acts as an arbitrary grouping for datasource applications. Datasource systems have a name and a description. Common practice is to name systems after the cloud platform type or their purpose. Any number of systems can be created.

A datasource application is a configuration for a data resource or authentication resource to be linked to your PrivaceraCloud account. When creating a datasource application, you provide target- and type-specific properties for the target resource, such as its location (URL) and authentication credentials. Most properties are provided in a dialog for the specific type of resource. Additional custom properties can be added using a key/value pair syntax. Once defined, the set of properties can be exported to a JSON-formatted properties file. This file can then be reimported at a later time or used as a template for other applications.
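As a rough sketch only, such an export is a JSON file of property names and values; the property names below are hypothetical placeholders (an actual export from your account is the authoritative template):

    {
      "application.name": "finance-snowflake",
      "application.code": "fin_snow_01",
      "jdbc.url": "jdbc:snowflake://<ACCOUNT>.snowflakecomputing.com",
      "jdbc.username": "<SERVICE_USER>"
    }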

An authentication resource can be a connection to directory service for data access users, or for portal users.

A data resource is any one of the supported data repositories, such as S3 buckets, Databricks, EMR, and Hive.

Data Source Connectors#

PrivaceraCloud supports three methods for connecting with data resources. These are: Plug-In, Data Access Server (Proxy), and Policy Sync. The appropriate method depends on the level of integration for external data access control made available by the data repository type.

Plug-In configurations are used for EMR/Spark and Databricks. No Datasource configuration steps are required for Plug-In data sources. To configure connections to EMR Spark, EMR Presto DB, EMR Hive, and Databricks, see the topic Connect Data Sources.

Data access server and PolicySync integration methods require additional configuration through this Settings: Datasource interface or using the Setup | Add Service interface.

User Interface#

The Setup and Add Service wizard creates a system named "Setup Wizard System" when connecting a data source. If the wizards were not used, this page contains no systems, and the first step is to create at least one datasource system.

The second step is to create one or more datasource applications, each of which represents a data or authentication connection.

Add System#

Click (+) Add System, found on the upper right. The Add System dialog has two fields: Name and Description. Give the system a unique name and, optionally, a description. Click Add.

Add Application#

Applications are added to systems. Click the wrench icon in the system box (on the right), then click + Add Application. The Add Application dialog opens to the Choose tab.

The following applications are available:

  • hive
  • mssql
  • postgres
  • snowflake
  • redshift
  • s3
  • gcs
  • adls
  • files
  • databricks_sql_analytics
  • databricks_SQL

PrivaceraCloud limits each account to one application of each type: LDAP/AD, DATA SERVER, USER SYNC, and SAML. Multiple applications of type POLICY SYNC can be created.

Common properties#

The first three fields of an application (Application Name, Application Description, and Application Code) are common to all PolicySync and Data Server applications. These are user-assigned values.

  • Application Name: A meaningful and unique name.

  • Application Description (optional): A useful description of this data resource.

  • Application Code: A unique character string value used as an internal identifier.

Actions on an Application#

Once an application has been created, it is listed by name in the datasource system in which it was created, with a status dot on the left.

The status dot is color-coded:

  • Red: Displayed while the connection starts; stays red if there are problems.
  • Green: The connection completed successfully.

To the right, in the same row there are either two or three additional icons: Log/Warning (Triangle), Edit (pen), and Delete (trashcan).

If the Log/Warning indicator is displayed, there may have been startup or other problems connecting to this application or service. Click it to open and view startup or error logs. If you contact Privacera Support for assistance, they are likely to ask for a copy of these logs.

Click on the edit icon to re-open the application configuration. Click on the trashcan to delete this application.

POLICY SYNC: Data Resource#

MSSQL, Snowflake, Redshift, Postgres, Azure Synapse#

Use Policy Sync to add an application for the purpose of configuring a connection to an MSSQL, Snowflake, Redshift, Postgres, or Azure Synapse data repository. PrivaceraCloud supports one application of each type.

For a list of properties for Snowflake, see Description of Snowflake Properties.

  1. Select a datasource system and open + Add Application.

  2. Select POLICY SYNC.

  3. Enter Application Name, Description, and Code.

  4. Select a Service, one of SNOWFLAKE, REDSHIFT, SYNAPSE, MSSQL, or POSTGRES, using the drop-down menu.

  5. Configuration fields specific to the selected service will be displayed. Follow the prompts and enter the connection information appropriate for this service connection.

  6. Custom Properties: Additional properties can be added, in consultation with Privacera Support, by entering key-value pairs in the Add Custom Properties edit box, as in the example below.
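    The key-value syntax is one property per line; the property names below are hypothetical placeholders, since the actual names are supplied by Privacera Support for your specific configuration:

    <property.name.from.support>=<value>
    <another.property.name>=<another_value>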

PEG (Privacera Encryption Gateway)#

The Privacera Encryption Gateway (PEG) is loaded, configured, and activated as a Datasource application.

PrivaceraCloud PEG supports two REST API methods: protect and unprotect. Both use Basic Auth (Base64 encoding), authenticated against a single configured service user.
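As a rough illustration of that Basic Auth pattern only, a protect call might look like the following; the endpoint path, request body, and placeholder names here are assumptions, so consult the PEG API reference for the exact URL, fields, and payload format:

    # Hypothetical sketch: <PEG_SERVICE_USER>/<PEG_SERVICE_PASSWORD> are the credentials
    # configured in the steps below; the path and JSON body are illustrative placeholders.
    curl -u "<PEG_SERVICE_USER>:<PEG_SERVICE_PASSWORD>" \
      -H "Content-Type: application/json" \
      -X POST "https://<PRIVACERACLOUD_HOST>/peg/protect" \
      -d '{"scheme": "<ENCRYPTION_SCHEME>", "data": ["jane.doe@example.com"]}'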

  1. In Settings, select an existing datasource system or create a new one, open the system wrench icon, and select + Add Application.

  2. Select PEG to open the PEG configuration dialog. In the PEG configuration dialog:

    1. Leave Application Name and Application Code unchanged.
    2. Optionally enter a Description.
    3. Under Application Properties:
      1. Enter credentials (username and password) for a PEG service user. These are the Basic Authentication values for the PEG API requests.
      2. Enter a value for a shared secret. This value will be used as a shared secret when configuring embedded encryption using the Privacera Crypto Jar, for use in Databricks. See: Reference: Databricks Encryption for additional setup details, if using PEG with Databricks SQL and User-Defined Functions (UDFs).
  3. Click Save.

DATA SERVER: Data Resource#

Use Data Server to connect to AWS S3, Athena, S3 Databricks, or Azure ADLS data sources. PrivaceraCloud supports one Data Server per account.

  • Object Level Access Control (OLAC) can be added to an existing AWS S3 configuration.

One AWS DATA SERVER can provide services for multiple AWS connection methods. AWS S3 and AWS EMR connections can be configured on the same AWS Data Server instance.

S3, Athena (AWS)#

AWS Account Access#

Connecting to an AWS-hosted data source requires authentication or a trust relationship with those resources. You will provide this information as one step in the AWS data resource connection. You will also need to specify your AWS Account Region.

Access to your AWS environment will be configured using one of two methods. A Use IAM Role toggle is provided to select the type of access.

  • AWS Access Key and Secret Key:

    1. Turn off Use IAM Role.

    2. The PrivaceraCloud configuration user interface provides prompts for each of these values directly.

      Note

      The Access Key and Secret Key are stored encrypted. The Secret Key is never reflected back to the UI or made visible.

  • AWS IAM Role:

    1. In the AWS Console, do the following:

      1. Create or use an existing IAM role in your environment. The role should be given access permissions by attaching an access policy in the AWS Console.

      2. Configure a Trust relationship with PrivaceraCloud. See AWS Access Using IAM Trust Relationship for specific instructions and requirements for configuring this IAM Role.

      Once that role is established, you will provide its full ARN to PrivaceraCloud. A hedged CLI sketch of these AWS-side steps follows this list.

    2. In PrivaceraCloud, do the following:

      1. Turn on Use IAM Role.

      2. Enter the actual IAM Role using a full AWS ARN.

      3. (Optional) Add an external ID. For additional security, an external ID can be attached to the IAM role you configured. This ensures that your IAM role can be assumed by PrivaceraCloud only when the configured external ID is passed.

        Note

        The external ID is stored encrypted. It is never reflected back to the UI or made visible.

      4. Enter AWS Region.
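For reference, the AWS Console side of this setup can also be scripted. The sketch below is a hedged illustration: the principal account ID, external ID, role name, and access policy ARN are placeholders, and the exact trust policy requirements are described in AWS Access Using IAM Trust Relationship.

    # Trust policy allowing PrivaceraCloud to assume the role (values are illustrative placeholders).
    cat > privacera-trust.json <<'EOF'
    {
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": { "AWS": "arn:aws:iam::<PRIVACERA_CLOUD_ACCOUNT_ID>:root" },
        "Action": "sts:AssumeRole",
        "Condition": { "StringEquals": { "sts:ExternalId": "<EXTERNAL_ID>" } }
      }]
    }
    EOF

    # Create the role and attach an access policy that covers the data you want governed.
    aws iam create-role --role-name <PRIVACERA_ACCESS_ROLE> \
      --assume-role-policy-document file://privacera-trust.json
    aws iam attach-role-policy --role-name <PRIVACERA_ACCESS_ROLE> \
      --policy-arn <ACCESS_POLICY_ARN>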

Steps#

  1. Select a datasource system and open + Add Application.

  2. Select DATA SERVER.

  3. Enter Application Name, Description, and Code.

  4. Select Cloud environment AWS.

  5. Save this configuration. After this application is created and saved, re-open it for edit (click on pen icon), then click on Application Properties. Provide your AWS credentials as Access Key/Secret Key or provide an IAM Role, as specified in paragraph AWS Account Access above.

  6. Recommended: Install the AWS CLI.

    1. Open User Interface: Launch Pad and follow the steps to install and configure AWS CLI to your workstation so that it uses the PrivaceraCloud S3 Data Server proxy.
  7. Recommended: Validate connectivity by running AWS CLI commands for S3 or Athena, such as:

    aws s3 ls

    aws athena start-query-execution --query-string "SHOW DATABASES"

Object Level Access Control (AWS EMR Spark)#

These instructions enable Object Level Access Control (OLAC) on an existing connected AWS S3 resource. If AWS S3 is not already configured, first do so by following the instructions above in (Connect Data Resource - AWS: S3, Athena), then return here for additional configuration steps. Either Object Level Access Control (OLAC) or Fine-Grained Access Control (FGAC) can be added to an existing AWS S3 configuration, but not both.

Two subcomponents are installed:

  • Privacera Credential Token Service (P-CTS) is installed to the targeted AWS EMR master node. P-CTS is a secure service running on an EMR master node which provides encrypted access tokens to the requesting user. Tokens are encrypted using a shared secret key with the Privacera Cloud Signing Server.

  • Privacera Signing Agent (P-SA) installed to targeted AWS EMR worker nodes. P-SA redirects Spark S3 requests to the Privacera Cloud Signing Server with a P-CTS access token in the request. P-SA then provides the appropriate signed response to Spark for accessing the S3 data if:
    (a) The incoming request has a valid P-CTS token;
    and (b) The requesting user has permissions on the S3 resource as defined in the “privacera_s3“ service in Access Manager: Resource Policies.

These steps will:

  1. Create an AWS Kerberos-based Security Configuration.

  2. Establish a shared secret between PrivaceraCloud and the AWS EMR Kerberos based Security Configuration.

  3. Create a new AWS Cluster configured to use that Security Configuration. That Cluster will link back to the Privacera Signing Agent (P-SA) and Privacera Credential Token Service (P-CTS).

Prerequisites#

  1. Obtain or determine a character string to serve as a "shared key" between PrivaceraCloud and the AWS EMR Cluster. We'll refer to this as <SHARED_KEY> in the configuration steps below.

  2. Obtain your account's unique call-in <emr-script-download-url> to allow the EMR cluster to obtain additional scripts and setup from PrivaceraCloud. Steps:

    1. Open Settings: Api Key.
    2. Use an existing Active Api Key or create a new one. Set Expiry = Never Expires.
    3. Open the Api Key Info box (click the (i) in the key row).
    4. Copy and store as <emr-script-download-url> using the Copy Url link found under AWS EMR Setup Script.

Steps#

  1. In the PrivaceraCloud console, under Settings: Datasource, locate the existing AWS Data Server datasource application and open it for edit. (Click the edit (pen) icon.)

  2. Click on tab Application Properties. Under Add New Properties, in the Add Custom Properties edit box, add the shared.secret property:

    dataserver.shared.secret=<SHARED_KEY>
    

    Save this configuration.

  3. From your AWS EMR web console:

    1. Create an EMR Security Configuration for Kerberos Authentication.

      1. Open your AWS EMR web console.
      2. Click on Security Configurations, then Create.
      3. Provide a name for this Security Configuration such as PRIVACERA_KDC. We'll refer to this same Security Configuration later.
      4. Under Authentication, Enable Kerberos authentication and complete the fields as appropriate for your environment.
    2. Create a new EMR Cluster and assign to it the new Security Configuration.

      1. In the AWS EMR Console, create a new Cluster.

      2. Open "Advanced Options" (click Go to advanced options, next to Quick Options).

      3. In Step 1: Software and Steps:

        1. Under Software Configuration, select the appropriate EMR release and the desired associated applications.

        2. Under Edit Software Settings, select "Enter configuration", and add the text below.

          [
            {
              "classification": "spark-defaults",
              "properties": {
                "spark.driver.extraJavaOptions": "-javaagent:/usr/lib/spark/jars/privacera-signing-agent.jar",
                "spark.executor.extraJavaOptions": "-javaagent:/usr/lib/spark/jars/privacera-signing-agent.jar"
              }
            }
          ]
          


        3. Under Steps, select Step Type Custom Jar and click Add Step to open the Add Step dialog. Add a step to download and install the Privacera Credential Token Service. Complete the fields as follows, substituting your <emr-script-download-url> value in the wget command. Click Add when all fields are complete.

          • Name: Install Privacera CTS

          • JAR location: command-runner.jar

          • Arguments:

            bash -c "wget <emr-script-download-url> ; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh priv-cts"
            
          • Action on failure: Continue


        4. Click Next to progress to Step 2: Hardware.

      4. In Step 2: Hardware, select Networking, Node, and Instance values as appropriate for your environment.

      5. In Step 3: General Cluster Settings, add two scripts that install the Privacera Signing Agent on the master and worker nodes.

        1. Assign Cluster name, Logging, Debugging, and Termination protection as appropriate for your environment.

        2. Install the Master signing agent:
          Under Additional Options, expand Bootstrap Actions, select bootstrap action Run if, and click Configure and add to open the Add Bootstrap Action dialog. In this dialog, set the name to Privacera Signing Agent for Master, copy and paste the following script into Optional Arguments, using your own <emr-script-download-url>, and click Add when done.

          instance.isMaster=true "wget <emr-script-download-url>; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh spark-fbac"
          

        3. The Worker signing agent is installed in the same way. Under Additional Options, expand Bootstrap Actions, select bootstrap action Run if, and click Configure and add to open the Add Bootstrap Action dialog. In this dialog, set the name to Privacera Signing Agent for Worker, copy and paste the following script into Optional Arguments, using your own <emr-script-download-url>, and click Add when done.

          instance.isMaster=false "wget <emr-script-download-url>; chmod +x ./privacera_emr.sh ; sudo ./privacera_emr.sh spark-fbac"
          
      6. In Step 4: Security

        1. Complete Security Options as appropriate for your environment.

        2. Open Security Configuration, and select the configuration you created earlier, e.g. "PRIVACERA_KDC". Set Realm and enter a KDC admin password.

      7. Click Create cluster to complete.

Databricks (AWS) Clusters for Scala Notebooks#

AWS Databricks can be configured as the first AWS S3 application, or it can be layered onto an existing AWS S3 configuration. Multiple Databricks Clusters within the same hosting AWS account can be attached to PrivaceraCloud under a single Data Server connection. This connection method supports Scala- and Python-based Databricks Cluster notebooks, and Databricks Delta.

For use with Delta, PrivaceraCloud access to AWS Databricks resources should be established using the IAM role method rather than using an Access and Secret Key method. See AWS Account Access in this topic for more information.

Prerequisites#

Establish an AWS S3 connection. Generate AWS Access/Secret Keys and identify an AWS S3 Region, or identify an existing AWS IAM Role that provides sufficient access to your Databricks Cluster host account.

For a Databricks Cluster connection you will also need:

  • <DATABRICKS_URL_LIST>: Comma-separated list of the target Databricks cluster URLs,
    e.g. "https://dbc-yyyyyyyy-xxxx.cloud.databricks.com/".
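    For example, the property value for two workspaces (hypothetical URLs) would be entered as a single comma-separated line:

    dataserver.databricks.allowed.urls=https://dbc-aaaa1111-b2c3.cloud.databricks.com,https://dbc-dddd4444-e5f6.cloud.databricks.com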

Steps#

  1. To create a new AWS S3 Databricks connection:

    a. From Settings > Datasource, select a datasource system and click + Add Application.

    b. Select DATA SERVER.

    c. Enter an Application Name, Description, and Code.

    d. Select AWS as the Cloud environment.

    e. From the Add New Properties > Add Custom Properties edit box, add the following property (substituting your Databricks account information for any variables):

    dataserver.databricks.allowed.urls=<DATABRICKS_URL_LIST>
    

    f. Save your configuration.

    g. From the Datasource main dialog page, open your newly added Databricks application for editing (select the pen icon).

    h. Open Application Properties.

    Provide your AWS credentials as Access Key/Secret Key, or provide an IAM Role, as specified in paragraph AWS Account Access above.

  2. If you are updating an existing AWS Data Server configuration:

    a. From Settings > Datasource open an existing datasource application (select the pen icon).

    b. Open Application Properties.

    Add the following property (substituting your AWS and Databricks account information for any variables):

    dataserver.databricks.allowed.urls=<DATABRICKS_URL_LIST>
    

    c. Save your configuration.

  3. Download the Databricks Init Script:

    a. Log in to the PrivaceraCloud portal.

    b. Generate the new API key and Init Script. For more information, refer to the topic API Key.

    c. In the Databricks Init Script section, click the DOWNLOAD SCRIPT button.

    By default, this script is named privacera_databricks.sh. Save it to a local filesystem or shared storage.

  4. Upload the Databricks Init Script to your Databricks Clusters.

    a. Log in to your Databricks Cluster using administrator privileges.

    b. In the left navigation, click the Data icon.

    c. Click Add Data in the upper right corner.

    d. From the Create New Table dialog box, select Upload File, then select and open privacera_databricks.sh.

    e. Copy the full storage path onto your clipboard.

  5. Add the Databricks Init Script to your target Databricks Clusters:

    a. From the Databricks navigation panel, select Clusters.

    b. Choose a Cluster name from the list provided and click Edit to open the configuration dialog page.

    c. Open Advanced Options and select the Init Scripts tab.

    d. Enter the DBFS Init Script path name you copied earlier.

    e. Click Add.

    f. From Advanced Options, select the Spark tab. Add the following Spark configuration content to the Spark Config edit window.

    spark.databricks.isv.product privacera
    spark.databricks.repl.allowedLanguages sql,python,r,scala
    spark.driver.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
    spark.hadoop.fs.s3.impl com.databricks.s3a.PrivaceraDatabricksS3AFileSystem
    spark.hadoop.fs.s3n.impl com.databricks.s3a.PrivaceraDatabricksS3AFileSystem
    spark.hadoop.fs.s3a.impl com.databricks.s3a.PrivaceraDatabricksS3AFileSystem
    spark.executor.extraJavaOptions -javaagent:/databricks/jars/ranger-spark-plugin-faccess-2.0.0-SNAPSHOT.jar
    spark.hadoop.signed.url.enable true
    

    g. Save and close. 

    h. Restart the Databricks Cluster.

  6. Create an S3 Service in PrivaceraCloud. Add a service to associate with the S3 Databricks datasource. For more information, refer to the topic Service Config.

Your S3 Databricks Cluster data resource is now available for Access Manager policy management, under Access Manager > Resource Policies, Service "privacera_s3".
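As a quick, hedged sanity check, you can list a governed S3 path from a notebook cell on the restarted cluster (the bucket and path below are placeholders); the request should be allowed or denied according to your privacera_s3 policies:

    %fs ls s3a://<YOUR_BUCKET>/<PATH>/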

ADLS Gen 2 (Azure)#

Prerequisites

Obtain your Azure account ADLS Name and ADLS shared key.

Steps#

  1. Select a datasource system and open + Add Application.

  2. Select DATA SERVER.

  3. Enter Application Name, Description, and Code.

  4. Select Cloud environment AZURE.

  5. Under Add New Properties, in the Add Custom Properties edit window, add the following lines, substituting your Azure account ADLS Name and ADLS shared key:

    dataserver.azure.accountName=<AZURE_ADLS_ACCOUNT_NAME>
    dataserver.azure.sharedkey1=<ADLS_SHARED_KEY>
    

  6. Save this configuration.

  7. Recommended: Install and configure the Azure CLI on your client workstation to redirect ADLS requests to this PrivaceraCloud Azure ADLS Data Server proxy. See the topic User Interface: Launch Pad.
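    Once the CLI is configured per the Launch Pad instructions, a simple listing can help confirm connectivity. This is a hedged sketch; depending on how the CLI was configured, you may need to authenticate differently (for example, with --auth-mode login):

    # List ADLS Gen2 filesystems (containers) in the configured storage account.
    az storage fs list --account-name <AZURE_ADLS_ACCOUNT_NAME> --account-key <ADLS_SHARED_KEY> --output table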

USER SYNC: Data Access Users#

Select USER SYNC to connect to an external resource for the purpose of pulling or serving data access users.

The first set of steps are identical for all USER SYNC connections.

  1. Select a datasource system and open + Add Application.

  2. Select USER SYNC to open the Add User Sync Connector dialog.

  3. Select the connection protocol / service:

    • LDAP
    • SCIM (System for Cross-domain Identity Management - Client)
    • SCIM-SERVER (System for Cross-domain Identity Management - Server Endpoint)
    • AD
    • AAD
    • OKTA

LDAP, AD, or AAD Connection#

The configuration wizard will step through configuration pages. Complete the LDAP, AD, or AAD values for the specific Directory Service, filling in the BASIC and ADVANCED tabs as required. Click Next to step through the dialog.

Example property values:

  • Service URL: "ldap://dir.privacera.us:389"
  • Search Base: "DC=ad,DC=privacera,DC=us"
  • Bind DN: "CN=Bind User,OU=privacera,DC=ad,DC=privacera,DC=us"
  • Bind Password: as needed
  • Authentication: Simple
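Before saving, it can help to verify the bind values from a workstation that has the OpenLDAP client tools installed; this is a minimal sketch using the example values above (the -W flag prompts for the bind password):

    ldapsearch -H ldap://dir.privacera.us:389 \
      -D "CN=Bind User,OU=privacera,DC=ad,DC=privacera,DC=us" -W \
      -b "DC=ad,DC=privacera,DC=us" "(objectClass=user)" cn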

SCIM, OKTA, and SCIM Server#

Select one of the following:

  • OKTA: Pull data access users and groups from Okta. PrivaceraCloud will use Okta protocols in client mode to connect to an Okta-enabled SCIM server. It will synchronize with the targeted server to obtain data access users and groups.

  • SCIM: Pull data access users and groups from a generic SCIM 2.0 compliant server.

  • SCIM-SERVER: Configure to allow data access users and groups to be provided (pushed) to your PrivaceraCloud account from a SCIM 2.0 client, including push integration with an Okta Identity Provider. See SCIM Server User-Provisioning for detailed setup instructions.

The configuration wizard advances through the configuration pages. Complete all BASIC values, then review and update ADVANCED values as required. Click Save when complete.

LDAP/AD: Portal Users#

Select LDAP/AD to connect to an LDAP or Active Directory server for the purpose of defining or importing portal users.

LDAP and AD Connections#

  1. Select a datasource system and open + Add Application.

  2. Select LDAP / AD.

  3. Enter Application Name, Description, and Code.

  4. Complete the remaining fields to connect to your LDAP or AD server.

    1. Select "LDAP SSL" to use SSL and to open a certificate upload dialog.

    2. LDAP Authentication Mechanism - supports simple and anonymous.

    3. LDAP BIND ANONYMOUSLY - must be false.


  5. If your LDAP/AD requires additional properties, use Add Custom Properties to include them in the connection configuration.

Note

LDAP Connector searches return 1000 entries per page by default. For UserSync via LDAP with a large number of users/groups, it is advised to enable paging in the ADVANCED section of Configure Connector.

  1. To enable paging for UserSync via LDAP:

    1. From Add UserSync Connector, click ADVANCED.

    2. Select Incremental Search.

    3. In the Add Custom Properties text box, set the following properties:

      usersync.connector.results.paged.enabled=true
      usersync.connector.results.paged.size=<Results_Per_Page>
      
  2. Once all entries are complete, use Test Connection to validate the configuration.

  3. When the connection test passes, click Save.

Extract User Name from Email Address#

In UserSync, you can set a property to extract the username portion of an email address value from the username attribute field. The username then becomes the value to the left of the @-sign of the email address.

In Datasource > UserSync > Configure Connector (Advanced Tab), set the following in Custom Properties:

usersync.connector.attribute.username.value.extractfromemail=true
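For example (hypothetical value), with this property set, a username attribute of jane.doe@example.com is imported as the username jane.doe.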

SAML: Activate Single Sign On (SSO)#

PrivaceraCloud can be configured for SSO with an external Identity Provider. Connecting to an Identity Provider via SAML activates use of Single Sign On.

Prerequisites

Establish an Okta account and obtain key information before configuring Privacera SAML. See Okta Identity Provider Setup to obtain required SAML and metadata information. Once that information is available return to this section to complete the setup.

Steps#

  1. Select a datasource system and open + Add Application.

  2. Select SAML.

  3. Enter Application Name, Description, and Code.

  4. Enter the values in the remaining fields, referring to the mapping below:

    The following list shows how the fields in PrivaceraCloud map to the fields of the SAML app in the Okta account:


    • Entity Id: maps to the Okta field Audience URI (SP Entity ID); value: privacera_portal.

    • Identity Provider Url: maps to the Okta field Embed Link; value: URL. Use the Embed Link from the General > App Embed Link section in the Okta account.

    • Identity Provider Metadata: maps to the Okta field Identity Provider Metadata; value: XML file. Download the XML file from the Sign On > Settings section in the Okta account, and then upload it in PrivaceraCloud.

    • UserName Attribute: maps to the Okta field UserID; value: UserID. Use only the field name from Okta, i.e., UserID.

    • Firstname Attribute (optional): maps to the Okta field Firstname; value: Firstname. Use only the field name from Okta, i.e., Firstname.

    • LastName Attribute (optional): maps to the Okta field LastName; value: LastName. Use only the field name from Okta, i.e., LastName.

    • Email Attribute: maps to the Okta field Email; value: Email. Use only the field name from Okta, i.e., Email.

  5. Click Save.

Once completed, a Single Sign-On (SSO) button will be displayed on the PrivaceraCloud Login page. Existing portal users can authenticate to PrivaceraCloud using their SSO credentials.

Note

SSO authenticated users without a match to a portal user will be authenticated, but not authorized to use any part of the PrivaceraCloud UI or API.

DISCOVERY#

For background, see PrivaceraCloud Discovery.

Data sources to be used for Privacera Discovery targets are connected using Settings > Datasource.

Add a Discovery Datasource System#

Discovery applications can be added to any datasource system, but you may want to group Discovery applications together in a single system.

  1. On the Datasource page, add a Datasource system by clicking on (+) Add System, found on the upper right.

    The Add System dialog box is displayed.

  2. Enter the System Name and Description.

  3. Click Save.

    Note

    Setup for each of the Discovery data source types is nearly identical except for DISCOVERY AWS S3.

Add a Discovery Data Source Type (Application)#

  1. Select a datasource system and add an application by clicking on the ellipsis icon found on the upper right corner of the Datasource system.

    The Add Application dialog box opens to the Choose tab.

  2. Select Discovery data source type from the Application List.

    The Add Application dialog box opens to the Configure tab.

  3. Based on the application chosen, provide the following details as applicable:

    • Application Name: A meaningful and unique name.
    • Application Description (optional): A useful description of this data resource.
    • Application Code: A unique character string value (used as an internal identifier).
    • Application Properties:

      For DISCOVERY DATABRICKS SPARK SQL, DISCOVERY SNOWFLAKE SQL, DISCOVERY PRESTO SQL, DISCOVERY ORACLE SQL, DISCOVERY CASSANDRA SQL, and DISCOVERY MYSQL -

      • Provide the connector jdbc.url, jdbc.username, and jdbc.password to allow access to the targeted cluster. These values can generally be obtained from the cluster itself; a hedged example follows these steps.

      For DISCOVERY AWS S3 -

      To access AWS S3, choose IAM Role or Access Key/Secret Key.

      • Disable Use IAM Role and provide the AWS Access Key, AWS Secret Key, and AWS Region (optional).

      • Enable Use IAM Role and enter the IAM Role ARN and AWS Region (optional); the current default value for AWS Region is "us-east-1". For more information on how to use an IAM Role to access AWS S3, see AWS Access Using IAM Trust Relationship.

  4. Click Test Connection to check if the connection is successful, and then click Save.
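As a hedged illustration of the JDBC values mentioned in step 3 (the host, port, database name, and credentials are placeholders, and the exact URL format depends on the target engine, MySQL in this sketch):

    jdbc.url=jdbc:mysql://<DB_HOST>:3306/<DATABASE_NAME>
    jdbc.username=<DISCOVERY_SERVICE_USER>
    jdbc.password=<DISCOVERY_SERVICE_PASSWORD>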

Go to Discovery: Data Source to add resources using this connection as Discovery targets. See Discovery Scan Targets for quick-start steps, and Privacera Discovery User Guide: Data Source Scanning: Enable a Data Source for Scan for more detailed instructions and options.

Note

Dataserver also supports logging the requesting user's name in AWS CloudWatch Logs. For more information, see Add UserInfo in S3 Requests sent via Dataserver.


Last update: September 30, 2021