Enable Real-time Scanning on Azure ADLS

Prerequisites

Ensure the following prerequisites are met. To configure them, see Account.

  • Select the Enable Real-Time Scanning button.

  • Configure Event Hub for scanning.

  • Create Consumer Group for Pkafka.

  • Configure Checkpoint Storage for Pkafka.

Create a Storage Account and Event Subscription for Scanning

  1. Log in to Azure Portal.

  2. Use an existing storage account or create a new one. Refer to Microsoft documentation on how to create a storage account.

    Enter this storage account name in Storage Account Name when you provide the Application Properties details for the datasource.

  3. Get Storage Account Key:

    1. Navigate to the storage account.

    2. Under Security + networking, click Access keys.

    3. Click Show Keys to reveal the key values.

    4. Enter one of the key values in Storage Account Key when you provide the Application Properties details for the datasource.

  4. Use an existing container or create a new one. Refer to Microsoft documentation on how to create a container.

  5. Get URL Prefix:

    1. Navigate to the container and click Properties.

      Container property details are populated on the right.

    2. Use the URL prefix in the Application Properties details for the datasource.

  6. Create an event subscription. Refer to Microsoft documentation on how to create an Event Grid subscription.

    1. Navigate to the storage account.

    2. On the left menu, select Events and click + Event Subscription.

      The Create Event Subscription page is displayed.

    3. On the Create Event Subscription page within the Basic tab, provide the following values:

      1. Enter the Event Name and Event Schema.
      2. Topics Details are auto-populated.
      3. Choose Blob Created and Blob Deleted as the Event Types.
      4. Choose Event Hubs as the Endpoint type.
      5. Select an Endpoint from the Select Event Hub dialog.
        1. From the Event Hub Namespace dropdown, choose the Event Hub Namespace you created.
        2. From the Event Hub dropdown, choose the Event Hub you created.
        3. Click Confirm Selection.
      6. Click Create.

    Note

    Disable soft delete on the blob storage account. Scanning of ORC and Parquet files is not supported when soft delete is enabled.
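The subscription above forwards Blob Created and Blob Deleted events to the Event Hub. As a rough illustration of what a consumer receives, the following sketch parses one such event, assuming the Event Grid Schema was chosen as the Event Schema (the CloudEvents schema uses different field names). The function name and the sample values (`landing`, `sales/2021/orders.parquet`) are hypothetical and are not part of Privacera or Pkafka.

```python
# Event types selected in the event subscription.
WANTED_EVENT_TYPES = {
    "Microsoft.Storage.BlobCreated",
    "Microsoft.Storage.BlobDeleted",
}

# Storage events carry the blob location in the "subject" field, in the form
# /blobServices/default/containers/<container>/blobs/<blob path>
SUBJECT_PREFIX = "/blobServices/default/containers/"


def parse_blob_event(event):
    """Return (action, container, blob_path), or None if the event is not of interest.

    Hypothetical helper for illustration; assumes the Event Grid Schema.
    """
    event_type = event.get("eventType")
    if event_type not in WANTED_EVENT_TYPES:
        return None
    subject = event.get("subject", "")
    if not subject.startswith(SUBJECT_PREFIX):
        return None
    container, sep, blob_path = subject[len(SUBJECT_PREFIX):].partition("/blobs/")
    if not sep:
        return None
    action = event_type.rsplit(".", 1)[1]  # "BlobCreated" or "BlobDeleted"
    return action, container, blob_path


# Trimmed-down sample event with made-up values.
sample_event = {
    "eventType": "Microsoft.Storage.BlobCreated",
    "subject": "/blobServices/default/containers/landing/blobs/sales/2021/orders.parquet",
}

print(parse_blob_event(sample_event))
```

Events of any other type return `None`, which mirrors the Event Type filter set on the subscription.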

Configure Real-Time Scan for Discovery Azure ADLS

  1. Log in to Privacera Portal.

  2. Add a data source system. See Add a Discovery Datasource System.

  3. Add an application to the datasource. See Add a Discovery Datasource Type.

    The Edit Application dialog box is displayed.

    1. Click the Application Properties tab and enter the following storage account details to scan the datasource:

      1. Enter the Url Prefix associated with the storage account.
      2. Enter the Storage Account Name.
      3. Provide the Storage Account Key of the storage account to be scanned.
    2. Ensure that the Real-Time Enable button is enabled.

    3. Click Test Connection to check if the connection is successful, and then click Save.

  4. To add a resource to be scanned in real time, navigate to Discovery > Data Source. See Discovery.

  5. To see the scan results, navigate to Data Inventory > Classifications.
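If Test Connection fails, one thing worth ruling out is a mismatch between the Url Prefix and the Storage Account Name. The sketch below is a small offline sanity check, assuming the Url Prefix is the container URL shown under the container's Properties and that the account lives in the public Azure cloud (`blob.core.windows.net` or `dfs.core.windows.net`). The function name and sample values are hypothetical; this is not a Privacera API.

```python
import re
from urllib.parse import urlparse

# Azure storage account names are 3-24 characters, lowercase letters and digits only.
ACCOUNT_NAME_RE = re.compile(r"^[a-z0-9]{3,24}$")

# Public-cloud endpoints only; sovereign clouds use other suffixes (assumption).
KNOWN_ENDPOINTS = ("blob.core.windows.net", "dfs.core.windows.net")


def check_storage_details(url_prefix, account_name):
    """Return the container name if url_prefix belongs to account_name.

    Raises ValueError describing the first problem found otherwise.
    """
    if not ACCOUNT_NAME_RE.match(account_name):
        raise ValueError(f"invalid storage account name: {account_name!r}")
    parsed = urlparse(url_prefix)
    host = parsed.hostname or ""
    host_account, _, endpoint = host.partition(".")
    if parsed.scheme != "https" or endpoint not in KNOWN_ENDPOINTS:
        raise ValueError(f"unexpected URL prefix: {url_prefix!r}")
    if host_account != account_name:
        raise ValueError(
            f"URL prefix points at {host_account!r}, not {account_name!r}"
        )
    container = parsed.path.strip("/").split("/")[0]
    if not container:
        raise ValueError("URL prefix does not include a container name")
    return container


# Example with made-up values.
print(check_storage_details(
    "https://examplestore.blob.core.windows.net/landing", "examplestore"
))
```

The check is purely syntactic and never contacts Azure, so a passing result does not replace Test Connection.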


Last update: August 13, 2021