- Platform Release 6.5
- Privacera Platform Installation
- About Privacera Manager (PM)
- Install overview
- Prerequisites
- Installation
- Default services configuration
- Component services configurations
- Access Management
- Data Server
- PolicySync
- Snowflake
- Redshift
- Redshift Spectrum
- PostgreSQL
- Microsoft SQL Server
- Databricks SQL
- RocksDB
- Google BigQuery
- Power BI
- UserSync
- Privacera Plugin
- Databricks
- Spark standalone
- Spark on EKS
- Trino Open Source
- Dremio
- AWS EMR
- AWS EMR with Native Apache Ranger
- GCP Dataproc
- Starburst Enterprise
- Privacera services (Data Assets)
- Audit Fluentd
- Grafana
- Access Request Manager (ARM)
- Ranger Tagsync
- Discovery
- Encryption & Masking
- Privacera Encryption Gateway (PEG) and Cryptography with Ranger KMS
- AWS S3 bucket encryption
- Ranger KMS
- AuthZ / AuthN
- Security
- Access Management
- Reference - Custom Properties
- Validation
- Additional Privacera Manager configurations
- CLI actions
- Debugging and logging
- Advanced service configuration
- Increase Privacera portal timeout for large requests
- Order of precedence in PolicySync filter
- Configure system properties
- PolicySync
- Databricks
- Table properties
- Upgrade Privacera Manager
- Troubleshooting
- Possible Errors and Solutions in Privacera Manager
-
- Unable to Connect to Docker
- Terminate Installation
- 6.5 Platform Installation fails with invalid apiVersion
- Ansible Kubernetes Module does not load
- Unable to connect to Kubernetes Cluster
- Common Errors/Warnings in YAML Config Files
- Delete old unused Privacera Docker images
- Unable to debug error for an Ansible task
- Unable to upgrade from 4.x to 5.x or 6.x due to Zookeeper snapshot issue
- Storage issue in Privacera UserSync & PolicySync
- Permission Denied Errors in PM Docker Installation
- Unable to initialize the Discovery Kubernetes pod
- Portal service
- Grafana service
- Audit server
- Audit Fluentd
- Privacera Plugin
-
- Possible Errors and Solutions in Privacera Manager
- How-to
- Appendix
- AWS topics
- AWS CLI
- AWS IAM
- Configure S3 for real-time scanning
- Install Docker and Docker compose (AWS-Linux-RHEL)
- AWS S3 MinIO quick setup
- Cross account IAM role for Databricks
- Integrate Privacera services in separate VPC
- Securely access S3 buckets ssing IAM roles
- Multiple AWS account support in Dataserver using Databricks
- Multiple AWS S3 IAM role support in Dataserver
- Azure topics
- GCP topics
- Kubernetes
- Microsoft SQL topics
- Snowflake configuration for PolicySync
- Create Azure resources
- Databricks
- Spark Plug-in
- Azure key vault
- Add custom properties
- Migrate Ranger KMS master key
- IAM policy for AWS controller
- Customize topic and table names
- Configure SSL for Privacera
- Configure Real-time scan across projects in GCP
- Upload custom SSL certificates
- Deployment size
- Service-level system properties
- PrestoSQL standalone installation
- AWS topics
- Privacera Platform User Guide
- Introduction to Privacera Platform
- Settings
- Data inventory
- Token generator
- System configuration
- Diagnostics
- Notifications
- How-to
- Privacera Discovery User Guide
- What is Discovery?
- Discovery Dashboard
- Scan Techniques
- Processing order of scan techniques
- Add and scan resources in a data source
- Start or cancel a scan
- Tags
- Dictionaries
- Patterns
- Scan status
- Data zone movement
- Models
- Disallowed Tags Policy
- Rules
- Types of rules
- Example rules and classifications
- Create a structured rule
- Create an unstructured rule
- Create a rule mapping
- Export rules and mappings
- Import rules and mappings
- Post-processing in real-time and offline scans
- Enable post-processing
- Example of post-processing rules on tags
- List of structured rules
- Supported scan file formats
- Data Source Scanning
- Data Inventory
- TagSync using Apache Ranger
- Compliance Workflow
- Data zones and workflow policies
- Workflow Policies
- Alerts Dashboard
- Data Zone Dashboard
- Data zone movement
- Example Workflow Usage
- Discovery health check
- Reports
- Built-in Reports
- Saved reports
- Offline reports
- Reports with the query builder
- How-to
- Privacera Encryption Guide
- Essential Privacera Encryption terminology
- Install Privacera Encryption
- Encryption Key Management
- Schemes
- Scheme Policies
- Encryption Schemes
- Presentation Schemes
- Masking schemes
- Encryption formats, algorithms, and scopes
- Deprecated encryption formats, algorithms, and scopes
- Encryption with PEG REST API
- PEG REST API on Privacera Platform
- PEG API Endpoint
- Encryption Endpoint Summary for Privacera Platform
- Authentication Methods on Privacera Platform
- Anatomy of the /protect API Endpoint on Privacera Platform
- About Constructing the datalist for protect
- About Deconstructing the datalist for unprotect
- Example of Data Transformation with /unprotect and Presentation Scheme
- Example PEG API endpoints
- /unprotect with masking scheme
- REST API Response Partial Success on Bulk Operations
- Audit Details for PEG REST API Accesses
- REST API Reference
- Make calls on behalf of another user
- Troubleshoot REST API Issues on Privacera Platform
- PEG REST API on Privacera Platform
- Encryption with Databricks, Hive, Streamsets, Trino
- Databricks UDFs for encryption and masking
- Hive UDFs
- Streamsets
- Trino UDFs
- Privacera Access Management User Guide
- Privacera Access Management
- How Polices are evaluated
- Resource policies
- Policies overview
- Creating Resource Based Policies
- Configure Policy with Attribute-Based Access Control
- Configuring Policy with Conditional Masking
- Tag Policies
- Entitlement
- Request Access
- Approve access requests
- Service Explorer
- User/Groups/Roles
- Permissions
- Reports
- Audit
- Security Zone
- Access Control using APIs
- AWS User Guide
- Overview of Privacera on AWS
- Set policies for AWS services
- Using Athena with data access server
- Using DynamoDB with data access server
- Databricks access manager policy
- Accessing Kinesis with data access server
- Accessing Firehose with Data Access Server
- EMR user guide
- AWS S3 bucket encryption
- S3 browser
- Getting started with Minio
- Plugins
- How to Get Support
- Coordinated Vulnerability Disclosure (CVD) Program of Privacera
- Shared Security Model
- Privacera documentation changelog
Discovery in AWS
Discovery
This topic allows you to set up the AWS configuration for installing Privacera Discovery in a Docker and Kubernetes (EKS) environment.
IAM policies
To use the Privacera Discovery service, ensure the following IAM policies are attached to the Privacera_PM_Role role to access the AWS services.
Policy to create AWS resourcesPolicy to create AWS resources is required only during installation or when Discovery is updated through Privacera Manager. This policy gives permissions to Privacera Manager to create AWS resources like DynamoDB, Kinesis, SQS, and S3 using terraform.
${AWS_REGION}: AWS region where the resources will get created.
{ "Version":"2012-10-17", "Statement":[ { "Sid":"CreateDynamodb", "Effect":"Allow", "Action":[ "dynamodb:CreateTable", "dynamodb:DescribeTable", "dynamodb:ListTables", "dynamodb:TagResource", "dynamodb:UntagResource", "dynamodb:UpdateTable", "dynamodb:UpdateTableReplicaAutoScaling", "dynamodb:UpdateTimeToLive", "dynamodb:DescribeTimeToLive", "dynamodb:ListTagsOfResource", "dynamodb:DescribeContinuousBackups" ], "Resource":"arn:aws:dynamodb:${AWS_REGION}:*:table/privacera*" }, { "Sid":"CreateKinesis", "Effect":"Allow", "Action":[ "kinesis:CreateStream", "kinesis:ListStreams", "kinesis:UpdateShardCount" ], "Resource":"arn:aws:kinesis:${AWS_REGION}:*:stream/privacera*" }, { "Sid":"CreateS3Bucket", "Effect":"Allow", "Action":[ "s3:CreateBucket", "s3:ListAllMyBuckets", "s3:GetBucketLocation" ], "Resource":[ "arn:aws:s3:::*" ] }, { "Sid":"CreateSQSMessages", "Effect":"Allow", "Action":[ "sqs:CreateQueue", "sqs:ListQueues" ], "Resource":[ "arn:aws:sqs:${AWS_REGION}:${ACCOUNNT_ID}:privacera*" ] } ] }
CLI configuration
SSH to the instance where Privacera is installed.
Configure your environment.
Configure Discovery for a Kubernetes environment. You need to set the Kubernetes cluster name. For more information, see Discovery (Kubernetes Mode)
For a Docker environment, you can skip this step.
Run the following commands.
cd ~/privacera/privacera-manager cp config/sample-vars/vars.discovery.aws.yml config/custom-vars/ vi config/custom-vars/vars.discovery.aws.yml
Edit the following properties. For property details and description, refer to the Configuration Properties below.
DISCOVERY_BUCKET_NAME: "<PLEASE_CHANGE>"
To configure a bucket, add the property as follows, where
bucket-1
is the name of the bucket:DISCOVERY_BUCKET_NAME: "bucket-1"
To configure a bucket containing a folder, add the property as follows:
DISCOVERY_BUCKET_NAME: "bucket-1/folder1"
Uncomment/Add the following variable to enable Autoscalability of Executor pods:
DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_ENABLED: "true"
(Optional) If you want to customize Discovery configuration further, you can add custom Discovery properties. For more information, refer to Discovery Custom Properties.
For example, by default, the username and password for the Discovery service is padmin/padmin. If you choose to change it, refer to Add Custom Properties.
Run the following commands.
cd ~/privacera/privacera-manager ./privacera-manager.sh update
Configuration properties
Property | Description | Example |
---|---|---|
DISCOVERY_BUCKET_NAME | Set the bucket name where Discovery will store its metadata files | container1 |
[Properties of Topic and Table names](../pm-ig/customize_topic_and_tables_names.md) | Topic and Table names are assigned by default in Privacera Discovery. To customize any topic or table name, refer to the link. |
Enable realtime scan
An AWS SQS queue is required, if you want to enable realtime scan on the S3 bucket.
After running the PM update command, an SQS queue will be created for you automatically with the name, privacera_bucket_sqs_{{DEPLOYMENT_ENV_NAME}}
, where {{DEPLOYMENT_ENV_NAME}}
is the environment name you set in the vars.privacera.yml
file. This queue name will appear in the list of queues of your AWS SQS account.
If you have an SQS queue which you want to use, add the DISCOVERY_BUCKET_SQS_NAME
property in the vars.discovery.aws.yml
file and assign your SQS queue name.
If you want to enable realtime scan on the bucket, click here.