Discovery

This topic lists the custom properties that can be configured for the Discovery service and explains how to configure them using the Privacera Manager (PM) CLI.

PM CLI Configuration

To use a custom property from the properties table:

  1. Add the property to one of the following YML files in the custom-vars folder, depending on your environment:

    • vars.discovery.aws.yml

    • vars.discovery.azure.yml

    • vars.discovery.gcp.yml

  2. Run the following command:

    cd ~/privacera/privacera-manager
    ./privacera-manager.sh update
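
As an illustration of step 1, here is a minimal sketch of what the entries in vars.discovery.aws.yml might look like. Both properties and their allowed values come from the Properties Table; setting them to true is only an example choice. Quoting the values as strings follows the style of the examples in the Memory Variables section and is an assumption here, not a documented requirement.

```yaml
# vars.discovery.aws.yml in the custom-vars folder (illustrative entries only)
DISCOVERY_ENABLE: "true"            # enable Discovery
DISCOVERY_REALTIME_ENABLE: "true"   # enable real-time scan (table default: false)
```

After the file is saved, the update command in step 2 applies the change.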
    
Properties Table

Property

Description

Values

Default Value

DISCOVERY_IMAGE_NAME

DISCOVERY_IMAGE_TAG

DISCOVERY_ENABLE

Set to true to enable Discovery.

true,false

USE_DATABRICKS_SPARK

Enable to use Databricks Spark instead of Apache Spark.

true,false

DISCOVERY_INSTALL

DISCOVERY_FS_PREFIX

To access the filesystem of the cloud storage service, do the following:

  • For AWS and GCP, set the filesystem prefix: s3a:// for AWS and gs:// for GCP.

  • For Azure, set the container name. The container belongs to your Azure storage account and holds the blobs containing the data to be scanned.

  • s3a://

  • StorageContainerName

  • gs://

DISCOVERY_CLOUD_TYPE

Set the cloud type used for the Discovery setup.

  • AWS

  • AZURE

  • GCP
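
The two properties above are typically set together. The following sketch pairs each cloud type with the matching DISCOVERY_FS_PREFIX value listed in the table. The Azure container name is a hypothetical placeholder, and the string quoting is an assumption based on the examples in the Memory Variables section.

```yaml
# AWS
DISCOVERY_CLOUD_TYPE: "AWS"
DISCOVERY_FS_PREFIX: "s3a://"

# Azure: the prefix is the storage container name (hypothetical name shown)
# DISCOVERY_CLOUD_TYPE: "AZURE"
# DISCOVERY_FS_PREFIX: "my-discovery-container"

# GCP
# DISCOVERY_CLOUD_TYPE: "GCP"
# DISCOVERY_FS_PREFIX: "gs://"
```

Only one cloud block should be active in a given vars file; the other two are commented out here for comparison.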

DISCOVERY_TRUSTSTORE_PASSWORD

AUTO_START_DATABRICKS_JOB

DISCOVERY_REALTIME_ENABLE

Set to true to enable real-time scan in Discovery.

true,false

false

DISCOVERY_MENU_ENABLE

Set to true to enable Discovery menu on Privacera Portal.

true,false

false

DISCOVERY_LOG_LEVEL

DISCOVERY_FOLDER_TAGGER_ENABLE

DISCOVERY_STORE_SAMPLE_VALUES

Whether any sample values should be stored for a column or field.

true,false

false

DISCOVERY_MAX_SAMPLE_VALUES

Maximum sample values stored for a column or field.

DISCOVERY_ENCRYPT_SAMPLE_VALUES

Whether the samples should be stored encrypted.

true,false

false
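
The three sample-value properties above work together. A minimal sketch, assuming you want samples stored in encrypted form: the value 5 for DISCOVERY_MAX_SAMPLE_VALUES is a hypothetical choice, since the table does not state a default, and the string quoting is an assumption.

```yaml
DISCOVERY_STORE_SAMPLE_VALUES: "true"     # store samples (table default: false)
DISCOVERY_MAX_SAMPLE_VALUES: "5"          # hypothetical limit per column or field
DISCOVERY_ENCRYPT_SAMPLE_VALUES: "true"   # store the samples encrypted (table default: false)
```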

DISCOVERY_STREAM_SUFFIX

DISCOVERY_STREAM_TAGS

DISCOVERY_TABLE_SUFFIX

DISCOVERY_TABLE_TAGS

DISCOVERY_BUCKET_NAME

DISCOVERY_BUCKET_TAGS

DISCOVERY_CREATE_NOSQL_TABLES

DISCOVERY_GEN_TERRAFORM_NOSQL_TABLES

Set to true to create DynamoDB tables using Terraform.

Set to false to disable Terraform and create the resources manually.

true

DISCOVERY_CREATE_STREAMS

DISCOVERY_GEN_TERRAFORM_STREAMS

Set to true to create Kinesis streams using Terraform.

Set to false to disable Terraform and create the resources manually.

true

DISCOVERY_CREATE_BUCKET

DISCOVERY_GEN_TERRAFORM_BUCKET

Set to true to create the S3 bucket using Terraform.

Set to false to disable Terraform and create the resource manually.

true

DISCOVERY_GEN_TERRAFORM_AZURE_ACCOUNT

DISCOVERY_SPARK_DRIVER_MEMORY

DISCOVERY_SPARK_EXECUTOR_MEMORY

DISCOVERY_SPARK_DRIVER_CORES

DISCOVERY_SPARK_EXECUTOR_CORES

DISCOVERY_SPARK_EXECUTOR_INSTANCES

DISCOVERY_CREATE_DEFAULT_APP_IN_PORTAL

DISCOVERY_COSMOSDB_FILE_REPOSITORY_PATH

DISCOVERY_COSMOSDB_DOCUMENT_SIZE_LIMIT

DISCOVERY_COSMOSDB_OFFER_THROUGHPUT

DISCOVERY_AWS_CLOUD_ASSUME_ROLE

Enables or disables granting Discovery access to AWS services to perform the scanning operation.

true

DISCOVERY_AWS_CLOUD_ASSUME_ROLE_ARN

DISCOVERY_BUCKET_SQS_NAME

Set this property to use a custom name for the SQS queue.

privacera_bucket_sqs_{{DEPLOYMENT_ENV_NAME}}

DISCOVERY_SQS_TAGS

DISCOVERY_CREATE_SQS

DISCOVERY_GEN_TERRAFORM_SQS

Set to true to create the SQS resource using Terraform.

Set to false to disable Terraform and create the resource manually.

true
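
If the AWS resources are created manually rather than through Terraform, the four DISCOVERY_GEN_TERRAFORM_* flags described above would each be switched off. The sketch below shows that combination together with a custom SQS queue name. The queue name is a hypothetical placeholder, and the string quoting is an assumption.

```yaml
# Disable Terraform generation; the resources must then be created manually
DISCOVERY_GEN_TERRAFORM_NOSQL_TABLES: "false"   # DynamoDB tables
DISCOVERY_GEN_TERRAFORM_STREAMS: "false"        # Kinesis streams
DISCOVERY_GEN_TERRAFORM_BUCKET: "false"         # S3 bucket
DISCOVERY_GEN_TERRAFORM_SQS: "false"            # SQS resource

# Optional custom queue name (hypothetical); the table default is
# privacera_bucket_sqs_{{DEPLOYMENT_ENV_NAME}}
DISCOVERY_BUCKET_SQS_NAME: "my_discovery_bucket_sqs"
```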

DATABRICKS_INIT_DBFS_FOLDER

DATABRICKS_DISCOVERY_CUST_CONF_ZIP_NAME

DATABRICKS_DISCOVERY_INIT_SCRIPT_PATH

DATABRICKS_DISCOVERY_SPARK_VERSION

The version of Spark used in a Databricks cluster.

  • 6.4.x-scala2.11 (Spark 2.4)

  • 7.3.x-scala2.12 (Spark 3.0)

  • 7.4.x-scala2.12 (Spark 3.0)

  • 7.5.x-scala2.12 (Spark 3.0)

  • 7.6.x-scala2.12 (Spark 3.0)

7.3.x-scala2.12
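
A minimal sketch of selecting Databricks Spark with an explicit runtime version. The version string is the default listed above; whether other properties are also required for a Databricks setup is not covered by this table, and the string quoting is an assumption.

```yaml
USE_DATABRICKS_SPARK: "true"                            # use Databricks Spark instead of Apache Spark
DATABRICKS_DISCOVERY_SPARK_VERSION: "7.3.x-scala2.12"   # table default (Spark 3.0)
```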

DISCOVERY_SPARK_TASK_SCHEDULER_ENABLE

DISCOVERY_RANGER_REST_ENABLED

DISCOVERY_K8S_IMAGE_NAME

DISCOVERY_K8S_IMAGE_TAG

DISCOVERY_K8S_IMAGE_PULL_POLICY

DISCOVERY_K8S_PVC_NAME

DISCOVERY_K8S_PVC_STORAGE_SIZE_MB

DISCOVERY_K8S_PVC_STORAGE_SIZE

DISCOVERY_K8S_STORAGE_PROVISIONER

DISCOVERY_K8S_SC_NAME

DISCOVERY_K8S_PV_ENCRYPTED

DISCOVERY_K8S_PV_KEY

DISCOVERY_K8S_LOADBALANCER_EXTERNAL

DISCOVERY_K8S_ANNOTATION_LOADBALANCER_ANNOTATION

DISCOVERY_K8S_SPARK_UI_PORT

DISCOVERY_K8S_SPARK_UI_PORT_EXTERNAL

Property to change the default port number for Discovery.

4040

DISCOVERY_K8S_SPARK_EVENT_LOG_ENABLED

DISCOVERY_K8S_SPARK_DRIVER_PORT

DISCOVERY_K8S_SPARK_BLOCKMANAGER_PORT

DISCOVERY_K8S_SPARK_PORT_MAX_RETRIES

DISCOVERY_K8S_SPARK_SERVICE_AC_NAME

DISCOVERY_K8S_SPARK_DRIVER_MEMORY

DISCOVERY_K8S_SPARK_EXECUTOR_MEMORY

DISCOVERY_K8S_SPARK_DRIVER_CORES

DISCOVERY_K8S_SPARK_EXECUTOR_CORES

DISCOVERY_K8S_SPARK_EXECUTOR_INSTANCES

DISCOVERY_K8S_SPARK_DRIVER_LIMIT_CORES

DISCOVERY_K8S_SPARK_EXECUTOR_LIMIT_CORES

DISCOVERY_K8S_SPARK_EXECUTOR_REQUEST_CORES

DISCOVERY_K8S_SPARK_MASTER

DISCOVERY_K8S_MEM_LIMITS

DISCOVERY_K8S_MEM_REQUESTS

DISCOVERY_K8S_CPU_LIMITS

DISCOVERY_K8S_CPU_REQUESTS

DISCOVERY_AZURE_APP_CLIENT_ID

DISCOVERY_AZURE_STORAGE_ACCOUNT_NAME

DISCOVERY_AZURE_URL_PREFIX

DISCOVERY_AZURE_AUDIT_TYPE

DISCOVERY_AZURE_LOCATION

CREATE_AZURE_RESOURCES

DISCOVERY_AZURE_RESOURCE_GROUP

DISCOVERY_AZURE_APPLICATION_ID

DISCOVERY_AZURE_TENANTID

DISCOVERY_AZURE_APP_CLIENT_SECRET_BASE64

DISCOVERY_AZURE_SUBSCRIPTION_ID

DISCOVERY_AZURE_COSMOS_DB_ACCOUNT

DISCOVERY_PORTAL_SERVICE_USERNAME

DISCOVERY_PORTAL_SERVICE_PASSWORD

DISCOVERY_CLOUD_MODE

DISCOVERY_AWS_ENDPOINT_ENABLE

DISCOVERY_KINESIS_ENDPOINT_URL

DISCOVERY_DYNAMODB_ENDPOINT_URL

DISCOVERY_SOLR_BASIC_AUTH_ENABLED

DISCOVERY_SOLR_BASIC_AUTH_USER

DISCOVERY_SOLR_BASIC_AUTH_PASSWORD

PRIVACERA_DISCOVERY_SECRETS_FILE

DISCOVERY_ENCRYPT_SECRETS

PRIVACERA_DISCOVERY_SECRETS_KEYSTORE_PASSWORD

DISCOVERY_ENCRYPT_PROPS_LIST

DISCOVERY_PORTAL_SERVICE_PASSWORD

PRIVACERA_DISCOVERY_DATASOURCE_PASSWORD

RANGER_TAGSYNC_PASSWORD

DISCOVERY_SOLR_BASIC_AUTH_PASSWORD

PRIVACERA_DISCOVERY_DATASOURCE_PASSWORD

DISCOVERY_FS_S3A_ACCCESS_KEY

DISCOVERY_FS_S3A_SECRET_KEY

DISCOVERY_CLUSTER_NAME

DISCOVERY_AGENT_MODE

DISCOVERY_LOGS_SOLR_ENABLE

DISCOVERY_RANGER_HOOK_ENABLED

DISCOVERY_SPARK_DOCKER_DRIVER_MEMORY

DISCOVERY_SPARK_DOCKER_EXECUTOR_MEMORY

DISCOVERY_SPARK_DOCKER_DRIVER_CORES

DISCOVERY_SPARK_DOCKER_EXECUTOR_CORES

DISCOVERY_SPARK_DOCKER_EXECUTOR_INSTANCES

DISCOVERY_DOCKER_SPARK_MASTER

DISCOVERY_OFFLINE_SCAN_DEBUG_ENABLED

DISCOVERY_SCAN_BACKUP_CLEANER_INTERVAL_HR

DISCOVERY_RTBF_POLICY_ENABLED

DISCOVERY_WORKFLOW_POLICY_ENABLED

DISCOVERY_WORKFLOW_EXPUNGE_POLICY_ENABLED

DISCOVERY_DEIDENTIFICATION_POLICY_ENABLED

DISCOVERY_CONTENT_SCANNING_ENABLED

DISCOVERY_SCAN_OFFICE_MIME_TYPES_AS_ARCHIVE_ENABLED

DISCOVERY_OFFLINE_SCAN_BACKUP_FOLDER

DISCOVERY_DICT_BASE_PATH

DISCOVERY_ML_BASE_PATH

DISCOVERY_ML_TAG_ACTION_MODEL_PATH

DISCOVERY_SCAN_REQUEST_FILES_DIR

PARTIAL_MATCH_ENABLE

DISCOVERY_COSMOSDB_URL

DISCOVERY_COSMOSDB_KEY

DISCOVERY_GEN_TERRAFORM_WITH_MSI_ROLE

DISCOVERY_AZURE_HNS_ENALBED

DISCOVERY_AZURE_ACCOUNT_REPLICATION_TYPE

DISCOVERY_AZURE_ACCOUNT_KIND

DISCOVERY_SAMPLE_VALUES_MAX_LENGTH

Maximum length of a sample that is stored for a column or field.

DISCOVERY_S3_AUDITS_ENABLE

DISCOVERY_ADLS_AUDITS_ENABLE

DISCOVERY_GCS_AUDITS_ENABLE

DISCOVERY_GBQ_AUDITS_ENABLE

DISCOVERY_DEPLOYMENT_SUFFIX_ID

DISCOVERY_SERVICE_USER

DISCOVERY_VERSION_FILE_NAME

DISCOVERY_HEARTBEAT_UPDATE_INTERVAL_SEC

DISCOVERY_SCAN_BACKUP_CLEANER_THRESHOLD_HR

DISCOVERY_LOOKUP_COPY_TO_HDFS_INTERVAL_SEC

DISCOVERY_GENERATE_SRC_ALERT_INTERVAL_MIN

DISCOVERY_LOOKUP_COPY_TO_HDFS_FROM_AGENT

DISCOVERY_RETRY_ON_FAILURE_INTERVAL_SEC

DISCOVERY_SCAN_DELAY_RETRY_INTERVAL

DISCOVERY_SCAN_DELAY_RETRY_COUNT

DISCOVERY_HOST

DISCOVERY_KAFKA_HEARTBEAT_INTERVAL_MS

DISCOVERY_KAFKA_REQUEST_TIMEOUT_MS

DISCOVERY_KAFKA_SESSION_TIMEOUT_MS

DISCOVERY_KAFKA_CONNECTIONS_MAX_IDLE_MS

DISCOVERY_KAFKA_ENABLE_AUTO_COMMIT

DISCOVERY_KAFKA_AUTO_OFFSET_RESET

DISCOVERY_KERBEROS_ENABLE

DISCOVERY_SOLR_KERBEROS_ENABLE

DISCOVERY_HBASE_KERBEROS_ENABLE

DISCOVERY_KAFKA_KERBEROS_ENABLE

DISCOVERY_KERBEROS_RELOGIN_INTERVAL_SECS

DISCOVERY_PORTAL_KERBEROS_ENABLE

DISCOVERY_SCAN_WORKER_KAFKA_SEND_BUFFER_MEMORY

DISCOVERY_SCAN_WORKER_KAFKA_SEND_LINGERMS

DISCOVERY_SCAN_WORKER_KAFKA_SEND_BATCHSIZE

DISCOVERY_SCAN_WORKER_KAFKA_SEND_RETRIES

DISCOVERY_SOLR_COLLECTION

DISCOVERY_SOLR_LINEAGE_COLLECTION

DISCOVERY_SOLR_ALERT_COLLECTION

DISCOVERY_SOLR_RESOURCE_COLLECTION

DISCOVERY_SOLR_OFFLINE_SCAN_SUMMARY_COLLECTION

DISCOVERY_SOLR_RESOURCE_META_INFO_COLLECTION

DISCOVERY_SOLR_RESOURCE_AUDIT_COLLECTION

DISCOVERY_SOLR_SPARK_EVENT_COLLECTION

DISCOVERY_SOLR_OFFLINE_SCAN_CLEANUP_COLLECTION

DISCOVERY_UNSTRUCTURED_VALUE_CHECKING_ENABLED

DISCOVERY_NUM_TOKENS_FOR_UNSTRUCTURED_DATA_DETECTION

DISCOVERY_SCAN_INCLUDE_PART_FILES_MAX_INDEX

DISCOVERY_ACTIVE_SCAN_ENABLE

DISCOVERY_SPARK_JOB_SCHEDULER_SLEEP_TIME_MS

DISCOVERY_AMOUNT_ARRAYVALUES_EXTRACTED

DISCOVERY_RECOVERY_SPARK_DEFAULT_POOL_NAME

DISCOVERY_CONSUMER_RECORD_WAIT_TIMEOUT_MS

DISCOVERY_CONSUMER_RECORD_BATCH_SIZE

DISCOVERY_RECOVERY_RETRY_MAX

DISCOVERY_GENERAL_CONSUMER_QUEUE_SIZE

DISCOVERY_OFFLINE_CONSUMER_QUEUE_SIZE

DISCOVERY_CONSUMER_RECORD_DB_PATHS

DISCOVERY_CONSUMER_RECORD_HANDLER_THREAD_POOL_SIZE

Configures the thread pool size for handling consumer records.

This value determines how many data source applications the scheduler can handle, so set it higher than the number of data source applications registered in the installation.

100
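
As a worked example of the sizing rule above: assuming a hypothetical installation with 120 registered data source applications, the default of 100 would be too low, so the value is raised above 120. The figure 150 is only an illustrative choice with some headroom, and the string quoting is an assumption.

```yaml
# Hypothetical installation: 120 registered data source applications
DISCOVERY_CONSUMER_RECORD_HANDLER_THREAD_POOL_SIZE: "150"   # must exceed 120; table default is 100
```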

DISCOVERY_SEND_CHILD_TO_EXCLUDE_RESOURCE_INFO_ENABLE

DISCOVERY_DYNAMODB_WRITE_ITEM_MAX_SIZE

DISCOVERY_DYNAMODB_WRITE_BATCH_SIZE

DISCOVERY_DYNAMODB_READ_BATCH_SIZE

DISCOVERY_DYNAMODB_CHILD_COLUMN_LIMIT

DISCOVERY_AZURE_PAYLOAD_LIMIT

DISCOVERY_METASTORE_PAYLOAD_TABLE

DISCOVERY_METANAME_LEAF_ONLY

DISCOVERY_SEND_SPARK_JOB_EVENT

DISCOVERY_RESTART_ON_STUCK_JOBS

DISCOVERY_START_SCRIPT

DISCOVERY_DB_MAX_STATEMENTS

DISCOVERY_DB_MAX_POOL_SIZE

DISCOVERY_DB_ACQUIRE_INCREMENT

DISCOVERY_DB_MIN_POOL_SIZE

DISCOVERY_COSMOSDB_MAX_POOL_SIZE

DISCOVERY_COSMOSDB_RETRY_INTERVAL_SEC

DISCOVERY_COSMOSDB_MAX_RETRY

DISCOVERY_COSMOSDB_DATABASE_NAME

DISCOVERY_SAVE_ARCHIVE_FILES

DISCOVERY_RTBF_USE_ENCRYPTION

DISCOVERY_DATAZONE_MONITOR_OFF_PREMISE_SRC_ENABLE

DISCOVERY_DATAZONE_RESOURCE_REEVALUATE_ENABLED

DISCOVERY_SCAN_NEW_SCANNER_ENABLE

DISCOVERY_RIGHT_TO_PRIVACY_THREAD_POOL_SIZE

DISCOVERY_OFFLINE_SCAN_RETRY_COUNT

DISCOVERY_OFFLINE_SCAN_AUTO_RETRY_ENABLE

DISCOVERY_OFFLINE_FILE_AND_FOLDER_COUNTING_TASK_POLL_TIME_MS

DISCOVERY_OFFLINE_FILE_AND_FOLDER_COUNTING_TASK_TIMEOUT_MS

DISCOVERY_OFFLINE_SCAN_PARTITION_ENABLE

DISCOVERY_MAX_DICT_WORD_TO_SENTENCE_RATIO

DISCOVERY_APPLY_METANAME_DICT_TO_UNSTRUCT

DISCOVERY_MAX_BYTES_FOR_WORKFLOW

DISCOVERY_PRECORDS_PARQUET_VERSION

DISCOVERY_UNSTRUCT_TAGS_FILENAME

DISCOVERY_WORKFLOW_DUPLICATE_FILE_RETRY_MAX_ATTEMPTS

DISCOVERY_WORKFLOW_EXPUNGE_SPARKDF_SINGLE_FILE

DISCOVERY_WORKFLOW_EXPUNGE_SPARKDF_ENABLE

DISCOVERY_CLOUD_USE_ASSUMEROLE

DISCOVERY_GCP_CLOUD_OUTPUTWRITERS_ENABLE

DISCOVERY_DROOLS_POOL_SIZE

DISCOVERY_DROOLS_USE_POOL

DISCOVERY_INVALID_HEADER_CHARS_PAT

DISCOVERY_MAX_HEADER_LEN

DISCOVERY_STRUCT_VALUE_FULL_MATCH_ENABLED

DISCOVERY_CLASSIFIER_AUTO_CREATE_MANUAL_TAG

DISCOVERY_HBASE_BACKUP_TTL_MS

DISCOVERY_HBASE_BACKUP_TTL_ENABLE

DISCOVERY_HBASE_CLIENT_SCANNER_TIMEOUT_MS

DISCOVERY_EXCLUSION_CLEANER_SLEEP_MIN

DISCOVERY_EXCLUSION_CLEANER_BATCH_SIZE

DISCOVERY_EXCLUSION_CLEANER_ENABLE

DISCOVERY_FOLDER_TAGGER_BATCH_SIZE

DISCOVERY_FOLDER_TAGGER_BACKOFF_TIME_SEC

DISCOVERY_FOLDER_TAGGER_SLEEP_TIME_MS

DISCOVERY_CMD_SERVER_ENABLED

DISCOVERY_CMD_SERVER_PORT

DISCOVERY_RULE_ENGINE_ADJUST_SCORES

DISCOVERY_NOUN_LIST_FILE

DISCOVERY_SPARK_JOB_MAX_TIME_MS

DISCOVERY_ClASSIFY_RECORD_MAPPER_TASK_POLL_TIME_MS

DISCOVERY_ClASSIFY_RECORD_MAPPER_TASK_TIMEOUT_MS

DISCOVERY_ATLAS_HOOK_MAPPER_TASK_POLL_TIME_MS

DISCOVERY_ATLAS_HOOK_MAPPER_TASK_TIMEOUT_MS

DISCOVERY_NAV_TO_PRIVACERA_MAPPER_TASK_POLL_TIME_MS

DISCOVERY_NAV_TO_PRIVACERA_MAPPER_TASK_TIMEOUT_MS

DISCOVERY_SCAN_DELAY_MAPPER_TASK_POLL_TIME_MS

DISCOVERY_SCAN_DELAY_MAPPER_TASK_TIMEOUT_MS

DISCOVERY_ADLS_AUDITS_MAPPER_TASK_POLL_TIME_MS

DISCOVERY_ADLS_AUDITS_MAPPER_TASK_TIMEOUT_MS

DISCOVERY_S3_AUDITS_MAPPER_TASK_POLL_TIME_MS

DISCOVERY_S3_AUDITS_MAPPER_TASK_TIMEOUT_MS

DISCOVERY_DYNAMODB_AUDITS_MAPPER_TASK_POLL_TIME_MS

DISCOVERY_DYNAMODB_AUDITS_MAPPER_TASK_TIMEOUT_MS

DISCOVERY_HIVE_AUDITS_MAPPER_TASK_POLL_TIME_MS

DISCOVERY_HIVE_AUDITS_MAPPER_TASK_TIMEOUT_MS

DISCOVERY_CONTENT_CLASSIFIER_MAPPER_TASK_POLL_TIME_MS

DISCOVERY_CONTENT_ClASSIFIER_MAPPER_TASK_TIMEOUT_MS

DISCOVERY_CONTENT_SCAN_WORKER_TOPIC_PARTITION

DISCOVERY_CONTENT_SCAN_COLLECTOR_CYCLE_TIME_MS

DISCOVERY_DEFAULT_SPARK_PARTITION_PERCENT

DISCOVERY_USE_SPARK_PARTITION_CALC

DISCOVERY_HIVE_PROXY_USER_FEATURE

DISCOVERY_KERBEROS_LOGIN_RETRY_INTERVAL_MS

DISCOVERY_KERBEROS_LOGIN_NUM_RETRIES

DISCOVERY_LFS_USE_FILE_MONITOR

DISCOVERY_LFS_USE_FILE_WATCHER

DISCOVERY_OFFLINE_SCAN_CLEANUP_THREAD_POOL_SIZE

DISCOVERY_OFFLINE_SCAN_THREAD_POOL_SIZE

DISCOVERY_QUICK_SCAN_LIMIT

DISCOVERY_QUICK_SCAN_ENABLE

DISCOVERY_DO_HDFS_SCHEMA_MAPPING

DISCOVERY_ALLOW_FUZZY_MATCH_TAGS

DISCOVERY_EXEC_MIMETYPE_REMOVE_DEFAULTS

DISCOVERY_DEV_TEST_MODE

DISCOVERY_TRIGGER_FILE_PATH

DISCOVERY_POST_PROCESS_DROOLS_RULES_FILENAME

DISCOVERY_CLASSIFIER_RULES_UNSTRUCT_FILENAME

DISCOVERY_CLASSIFIER_RULES_FILENAME

DISCOVERY_CLASSIFIER_DROOLS_RULES_FILENAME

DISCOVERY_CHAT_SCAN_SKIP_INVALID_JSON_OUTPUT

DISCOVERY_UNSTRUCT_AS_SINGLE_LINE

DISCOVERY_POST_PROCESS_DATA_KEYSCORE_THRESHOLD

DISCOVERY_UNSTRUCTURED_DATA_KEYSCORE_THRESHOLD

DISCOVERY_STRUCTURED_DATA_KEYSCORE_THRESHOLD

DISCOVERY_USE_KEYSCORE_THRESHOLD

DISCOVERY__ML_PYTHON_FILE

DISCOVERY_ML_CONDA_ENV_PATH

DISCOVERY_ML_NLP_ENABLED

DISCOVERY_POST_PROCESS_RULE_ENGINE_ENABLED

DISCOVERY_RULE_ENGINE_DO_FALLBACK

DISCOVERY_RULE_DATABASE_ENABLED

DISCOVERY_RULE_ENGINE_ENABLED

DISCOVERY_RULE_ENGINE_DROOLS_ENABLED

DISCOVERY_RESOURCE_META_SCAN_MAPPER_CHECK_TASK_ACTIVE_INTERVAL_TIME_MS

DISCOVERY_RESOURCE_META_SCAN_MAPPER_TASK_POLL_TIME_MS

DISCOVERY_RESOURCE_META_SCAN_MAPPER_TASK_TIMEOUT_MS

DISCOVERY_SCHEMA_MAP_BASE_PATH

DISCOVERY_OFFLINE_SCAN_KAFKA_ENABLE

DISCOVERY_ML_ENABLE

DISCOVERY_SAS_SUFFIXES

DISCOVERY_ENABLE_SIMPLE_KAFKA_CONSUMER_FOR_AUDIT_PARSING

DISCOVERY_ENABLE_KAFKA_CONSUMER_FOR_MAPR_AUDIT_PARSING

DISCOVERY_ENABLE_KAFKA_CONSUMER_FOR_AUDIT_PARSING

DISCOVERY_ZIP_LOOKUP_KEY

DISCOVERY_GENERIC_ML_TYPE

DISCOVERY_CORE_NLP_ML_TYPE

DISCOVERY_PHONE_NUMBER_ML_TYPE

DISCOVERY_GEO_LAT_LONG_ML_TYPE

DISCOVERY_DOB_ML_TYPE

DISCOVERY_VIN_ML_TYPE

DISCOVERY_ITIN_ML_TYPE

DISCOVERY_EIN_ML_TYPE

DISCOVERY_SSN_ML_TYPE

DISCOVERY_IMEI_ML_TYPE

DISCOVERY_CC_ML_TYPE

DISCOVERY_ZIP_ML_TYPE

DISCOVERY_LFS_WATCHER_POLLTIME_MS

DISCOVERY_LFS_CREATE_MAX_TIME_MS

DISCOVERY_LFS_WATCHER_CACHE_SIZE

DISCOVERY_LFS_WATCHER_ENABLE

DISCOVERY_LFS_APP_TOPIC

DISCOVERY_LFS_APP

DISCOVERY_GOOGLE_BIGQUERY_PARSE_CTAS

DISCOVERY_DYNAMODB_ENABLE

DISCOVERY_FUZZY_SCORING_SENSE_CHECK_ENABLE

DISCOVERY_FUZZY_SCORING_MIN_CUTOFF_SCORE

DISCOVERY_ML_SRC_DETECT_MODEL_PATH

DISCOVERY_ML_MODEL_PATH

DISCOVERY_ML_CLASSIFY_TAG_ACTION_ENABLE

DISCOVERY_ML_CLASSIFY_SRC_CODE_ENABLE

DISCOVERY_ML_CLASSIFY_TAG_ENABLE

DISCOVERY_ML_STORE_SCAN_RESULTS

DISCOVERY_OUTPUTWRITERS_ENABLE

DISCOVERY_DATABRICKS_SPARK_ENABLE

DISCOVERY_KAFKA_PRODUCER_COMPRESSION_CODEC

DISCOVERY_SET_REMOTE_USER

DISCOVERY_STALE_DATA_RETRY_COUNT

DISCOVERY_AUDITS_TO_SOLR_ENABLED

DISCOVERY_ATLAS_HOOK_SIMPLE

DISCOVERY_ATLAS_HOOK_ENABLED

DISCOVERY_SPLUNK_ENABLE

DISCOVERY_SPLUNK_PORT

DISCOVERY_SPLUNK_ALERT_INDEX

DISCOVERY_SPLUNK_SCHEME

DISCOVERY_SPLUNK_HEC_SOURCE

DISCOVERY_ANOMALY_SCHEDULAR_ENABLE

DISCOVERY_MONITORING_SCHEDULAR_ENABLE

DISCOVERY_METRICS_JVM

DISCOVERY_METRICS_KAFKA_TOPIC

DISCOVERY_METRICS_KAFKA_INTERVAL_SEC

DISCOVERY_METRICS_ENABLE_KAFKA

DISCOVERY_METRICS_GRAPHITE_INTERVAL_SEC

DISCOVERY_METRICS_GRAPHITE_ENABLE

DISCOVERY_METRICS_CONSOLE_INTERVAL_SEC

DISCOVERY_METRICS_ENABLE_CONSOLE

DISCOVERY_METRICS_CSV_INTERVAL_SEC

DISCOVERY_METRICS_ENABLE_CSV

DISCOVERY_METRICS_CSVPATH

DISCOVERY_SOLR_LOGS_COLLECTION

DISCOVERY_SOLR_METRICS_COLLECTION

DISCOVERY_DB_CPDS_TEST_ONCHECKIN

DISCOVERY_DB_CPDS_TEST_ONCHECKOUT

DISCOVERY_DB_CPDS_IDLECONN_TEST_PERIOD_SEC

DISCOVERY_DB_CPDS_TESTQUERY

DISCOVERY_COMMON_EXCLUDE_RESOURCE_LIST

DISCOVERY_CSV_USE_HEADER

DISCOVERY_SCAN_MARK_LIMIT_BYTES

DISCOVERY_SCAN_MIN_CSV_FIELDS

DISCOVERY_SCAN_HIVE_MAX_COLS

Maximum number of columns in a database table or fields in a structured file to be scanned. This can be overridden by using the `record.max.fields` property at the data source level.

2000

DISCOVERY_SCAN_HIVE_MAX_ROWS

Maximum number of rows of a database table to be scanned.

500

DISCOVERY_SCAN_MAX_LINES

Maximum number of records of a structured file to be scanned.

500

DISCOVERY_CONTENT_MAX_CHARACTER

Maximum number of bytes in a column cell or field cell to be scanned.

1000

DISCOVERY_TIKA_MAX_BYTES

Maximum number of bytes of an unstructured file to be scanned.

102400

DISCOVERY_MAX_TAG_SNIPPET_SAMPLE_VALUES

Maximum number of samples to be captured for display in a tag.

3
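
The scan-limit properties above can be collected into one block. The sketch below lists them at the defaults given in the table, except for DISCOVERY_TIKA_MAX_BYTES, which is doubled purely as a hypothetical example of an override. The string quoting is an assumption.

```yaml
DISCOVERY_SCAN_HIVE_MAX_COLS: "2000"            # default: columns or fields scanned
DISCOVERY_SCAN_HIVE_MAX_ROWS: "500"             # default: rows scanned per database table
DISCOVERY_SCAN_MAX_LINES: "500"                 # default: records scanned per structured file
DISCOVERY_CONTENT_MAX_CHARACTER: "1000"         # default: limit per column cell or field cell
DISCOVERY_TIKA_MAX_BYTES: "204800"              # hypothetical override; default is 102400
DISCOVERY_MAX_TAG_SNIPPET_SAMPLE_VALUES: "3"    # default: samples shown per tag
```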

DISCOVERY_QUICK_COUNT_THRESHOLD

DISCOVERY_KAFKA_CLASSIFIEDINFO_MAX_POLL_RECORDS

DISCOVERY_KAFKA_CLASSIFIEDINFO_SESSION_TIMEOUT_MS

DISCOVERY_KAFKA_CLASSIFIEDINFO_REQUEST_TIMEOUT_MS

DISCOVERY_META_SCANNING_ENABLE

DISCOVERY_OFFLINE_SCAN_SUMMARY_SOLR_ENABLE

DISCOVERY_METRICS_SOLR_ENABLE

DISCOVERY_NON_NULL_REPORT_OUTPUT_PATH

DISCOVERY_CLASSIFICATION_NON_NULL_COUNT_ENABLE

DISCOVERY_KAFKA_TOPIC_ENCRYPTION

DISCOVERY_KAFKA_TOPIC_DISCOVERY

DISCOVERY_KAFKA_DISCOVERY

DISCOVERY_KAFKA_DISCOVERY_REQUEST_TIMEOUT_MS

DISCOVERY_KAFKA_DISCOVERY_BOOSTRAP_SERVERS

DISCOVERY_KAFKA_DISCOVERY_USE_SSL

DISCOVERY_KAFKA_DISCOVERY_USE_KERBEROS

DISCOVERY_KAFKA_DISCOVERY_NAME

DISCOVERY_KAFKA_DISCOVERY_GROUP_ID

DISCOVERY_KAFKA_DISCOVERY_POLL_TIME_MS

DISCOVERY_KAFKA_DISCOVERY_ENABLE

DISCOVERY_IS_ATLAS_TAG_ENABLE

DISCOVERY_ATLAS_HOOK_VERSION

DISCOVERY_SCAN_RESOURCE_META_INFO_SOLR

DISCOVERY_IS_ATLAS_ENABLE

DISCOVERY_SPARK_STREAMING_RECEIVER_MAXRATE

DISCOVERY_SPARK_STREAMING_CHECKPOINT

DISCOVERY_SPARK_ENABLE_HIVE_SUPPORT

DISCOVERY_SPARK_LOCAL_MASTER

DISCOVERY_SPARK_APPLICATION_NAME

DISCOVERY_PORTAL_API_SCORE_THRESHOLD

DISCOVERY_PORTAL_API_APP_LIST

DISCOVERY_PORTAL_API_SYSTEM_LIST

DISCOVERY_KERBEROS_PRINCIPAL

DISCOVERY_KAFKA_ALERT_REPLICATION

DISCOVERY_KAFKA_GROUP_ID

DISCOVERY_GRAPHITE_HOST

DISCOVERY_KAFKA_CLASSFICATION_INFO_REPLICATION

DISCOVERY_MONITORING_HDFS_INPUT_PATH

DISCOVERY_KERBEROS_KEYTAB

DISCOVERY_SCAN_WORKER_KAFKA_GROUP_ID

DISCOVERY_SOLR_ALERTS_COLLECTION

DISCOVERY_SOLR_CLASSIFICATION_COLLECTION

DISCOVERY_GRAPHITE_PORT

DISCOVERY_HIVE_METASTORE_USEJDBC

DISCOVERY_INIT_CONTAINER_COMMAND_LIST

You can provide a list of commands to download custom jars to a specified location inside the Discovery container. For example:

    DISCOVERY_INIT_CONTAINER_COMMAND_LIST:
    - wget https://privacera/public/custom-1.jar -O /opt/privacera/discovery/libs/custom-1.jar
    - wget https://privacera/public/custom-2.jar -O /opt/privacera/discovery/libs/custom-2.jar

DISCOVERY_SCAN_PARQUET_ORC_FROM_ARCHIVE_ENABLE

Property to enable/disable the scanning of ORC/Parquet files within a ZIP file.

true, false

false

DISCOVERY_SCAN_PARQUET_ORC_STREAM_FILE_SIZE_LIMIT

Property to set the file size limit on the ORC/Parquet files being scanned from the archive location. The default value, 5242880, is 5 MB expressed in bytes.

5242880

DISCOVERY_SCAN_PARQUET_TEMP_FILE_FROM_ARCHIVE_ENABLE

By default, Parquet files are stored in a temporary file within a zip file.

Set to true to scan the Parquet files from a temporary file.

Set to false to scan the Parquet files from a zip file stream.

true, false

true

DISCOVERY_SCAN_ORC_TEMP_FILE_FROM_ARCHIVE_ENABLE

By default, ORC files are stored in a temporary file within a zip file.

Set to true to scan the ORC files from a temporary file.

Set to false to scan the ORC files from a zip file stream.

true, false

false
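
The four archive-scanning properties above can be combined as follows. This sketch turns on ORC/Parquet scanning inside ZIP files and leaves the other three properties at the defaults listed in the table. The string quoting is an assumption.

```yaml
DISCOVERY_SCAN_PARQUET_ORC_FROM_ARCHIVE_ENABLE: "true"          # table default: false
DISCOVERY_SCAN_PARQUET_ORC_STREAM_FILE_SIZE_LIMIT: "5242880"    # table default
DISCOVERY_SCAN_PARQUET_TEMP_FILE_FROM_ARCHIVE_ENABLE: "true"    # table default: scan Parquet from a temporary file
DISCOVERY_SCAN_ORC_TEMP_FILE_FROM_ARCHIVE_ENABLE: "false"       # table default: scan ORC from the zip file stream
```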

DISCOVERY_GOOGLE_CLOUD_STORAGE_LINEAGE_LOOPBACK_TIME_MS

The time, in milliseconds, for GCS lineage loopback.

-

3000

DISCOVERY_GOOGLE_CLOUD_STORAGE_LINEAGE_CUTOFF_TIME_MS

The cutoff time, in milliseconds, to wait for a GCS log event for lineage.

-

300000

DISCOVERY_GOOGLE_CLOUD_STORAGE_LINEAGE_CUTOFF_TIME_CHECK_INTERVAL_MS

The fixed interval, in milliseconds, at which to check for delayed GCS lineage of pending real-time files.

-

30000

DISCOVERY_CONTENT_SCAN_THREAD_POOL_SIZE

If you are scanning more than two data sources with different projects, set this property to the number of projects you will scan in Discovery.

-

2
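
For example, assuming a hypothetical setup that scans data sources spread across five projects, the pool size would be set to five. The string quoting is an assumption.

```yaml
# Hypothetical setup: data sources in 5 different projects
DISCOVERY_CONTENT_SCAN_THREAD_POOL_SIZE: "5"   # table default is 2
```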

DISCOVERY_CONNECTION_TEST_INTERVAL_SEC

The fixed interval, in seconds, at which all key Privacera internal components are checked. The status of each connection is sent to Portal. See Health Check.

The allowable value is a non-zero integer number of seconds. A short duration is recommended, not exceeding 900 seconds (15 minutes).

60

DISCOVERY_TELEMETRY_UPDATE_TO_SOLR

Set to true to send telemetry to Apache Solr.

Set to false to not send telemetry to Apache Solr.

The following telemetry is sent to Apache Solr:

  • Count of tags.

  • Count of resources scanned based on application and application type.

  • Scan amount based on application and application type.

  • Total compliance count and compliance count for individual policy.

true, false

true

DISCOVERY_RTBF_SUMMARY_ENABLED

Set this property to true to view the summary for RTP policy and Expunge policy on the UI for Auto Run jobs.

Set this property to false to not view the summary.

Although this property string contains "RTBF", the property relates to RTP.

true, false

false

DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_ENABLED

Whether to use dynamic resource allocation, which scales the number of executors registered with this application up and down based on the workload.

true, false

false

DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_SHUFFLE_TRACKING_ENABLED

Enables shuffle file tracking for executors, which allows dynamic allocation without the need for an external shuffle service. This option will try to keep alive executors that are storing shuffle data for active jobs.

true, false

true

DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_EXECUTOR_IDLE_TIMEOUT

If dynamic allocation is enabled and an executor has been idle for more than this duration, the executor will be removed.

-

60s

DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_CACHED_EXECUTOR_IDLE_TIMEOUT

If dynamic allocation is enabled and an executor which has cached data blocks has been idle for more than this duration, the executor will be removed.

-

120s

DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_MAX_EXECUTORS

Upper bound for the number of executors if dynamic allocation is enabled.

-

4

DISCOVERY_K8S_SPARK_MEMORY_OVERHEAD_FACTOR

Sets the memory overhead factor, which determines the memory allocated for non-JVM use, including off-heap memory allocations, non-JVM tasks, and various system processes.

-

0.1
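
The dynamic allocation properties above only take effect once dynamic allocation itself is enabled. The sketch below enables it and spells out the remaining properties at the defaults listed in the table, so the only non-default value is the first line. The string quoting is an assumption.

```yaml
DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_ENABLED: "true"                        # table default: false
DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_SHUFFLE_TRACKING_ENABLED: "true"       # default
DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_EXECUTOR_IDLE_TIMEOUT: "60s"           # default
DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_CACHED_EXECUTOR_IDLE_TIMEOUT: "120s"   # default
DISCOVERY_K8S_SPARK_DYNAMIC_ALLOCATION_MAX_EXECUTORS: "4"                     # default upper bound
DISCOVERY_K8S_SPARK_MEMORY_OVERHEAD_FACTOR: "0.1"                             # default
```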

DISCOVERY_HBASE_RETRY_ON_FAILURE_COUNT

Number of retries for the HBase connection.

-

2

DISCOVERY_HBASE_WAIT_BETWEEN_RETRY_MS

Wait time before retrying the HBase connection.

-

100 ms (milliseconds)

Memory Variables

DISCOVERY_DRIVER_HEAP_MIN_MEMORY_MB

Minimum Java Heap memory in MB used by Discovery Driver. For example, DISCOVERY_DRIVER_HEAP_MIN_MEMORY_MB: "1024"

DISCOVERY_DRIVER_HEAP_MIN_MEMORY

Minimum Java Heap memory used by Discovery Driver. Setting this value will override DISCOVERY_DRIVER_HEAP_MIN_MEMORY_MB. For example, DISCOVERY_DRIVER_HEAP_MIN_MEMORY: "1g"

DISCOVERY_DRIVER_HEAP_MAX_MEMORY_MB

Maximum Java Heap memory in MB used by Discovery Driver. For example, DISCOVERY_DRIVER_HEAP_MAX_MEMORY_MB: "1024"

DISCOVERY_DRIVER_HEAP_MAX_MEMORY

Maximum Java Heap memory used by Discovery Driver. Setting this value will override DISCOVERY_DRIVER_HEAP_MAX_MEMORY_MB. For example, DISCOVERY_DRIVER_HEAP_MAX_MEMORY: "1g"

DISCOVERY_DRIVER_K8S_MEM_REQUESTS_MB

Minimum amount of Kubernetes memory in MB to be requested by Discovery Driver. For example, DISCOVERY_DRIVER_K8S_MEM_REQUESTS_MB: "1024"

DISCOVERY_DRIVER_K8S_MEM_REQUESTS

Minimum amount of Kubernetes memory to be used by Discovery Driver. Setting this value will override DISCOVERY_DRIVER_K8S_MEM_REQUESTS_MB. For example, DISCOVERY_DRIVER_K8S_MEM_REQUESTS: "1G"

DISCOVERY_DRIVER_K8S_MEM_LIMITS_MB

Maximum amount of Kubernetes memory to be requested by Discovery Driver. The value set in this field is treated as megabytes. For example, DISCOVERY_DRIVER_K8S_MEM_LIMITS_MB: "1024"

DISCOVERY_DRIVER_K8S_MEM_LIMITS

Maximum amount of Kubernetes memory to be used by Discovery Driver. Setting this value will override DISCOVERY_DRIVER_K8S_MEM_LIMITS_MB. For example, DISCOVERY_DRIVER_K8S_MEM_LIMITS: "1G"

DISCOVERY_DRIVER_CPU_MIN

Minimum amount of Kubernetes CPU to be requested by Discovery Driver. For example, DISCOVERY_DRIVER_CPU_MIN: "0.5"

DISCOVERY_DRIVER_CPU_MAX

Maximum amount of Kubernetes CPU to be used by Discovery Driver. For example, DISCOVERY_DRIVER_CPU_MAX: "0.5"

DISCOVERY_EXECUTOR_HEAP_MIN_MEMORY_MB

Minimum Java Heap memory in MB used by Discovery Executor. For example, DISCOVERY_EXECUTOR_HEAP_MIN_MEMORY_MB: "1024"

DISCOVERY_EXECUTOR_HEAP_MIN_MEMORY

Minimum Java Heap memory used by Discovery Executor. Setting this value will override DISCOVERY_EXECUTOR_HEAP_MIN_MEMORY_MB. For example, DISCOVERY_EXECUTOR_HEAP_MIN_MEMORY: "1g"

DISCOVERY_EXECUTOR_HEAP_MAX_MEMORY_MB

Maximum Java Heap memory in MB used by Discovery Executor. For example, DISCOVERY_EXECUTOR_HEAP_MAX_MEMORY_MB: "1024"

DISCOVERY_EXECUTOR_HEAP_MAX_MEMORY

Maximum Java Heap memory used by Discovery Executor. Setting this value will override DISCOVERY_EXECUTOR_HEAP_MAX_MEMORY_MB. For example, DISCOVERY_EXECUTOR_HEAP_MAX_MEMORY: "1g"

DISCOVERY_EXECUTOR_K8S_MEM_REQUESTS_MB

Minimum amount of Kubernetes memory in MB to be requested by Discovery Executor. For example, DISCOVERY_EXECUTOR_K8S_MEM_REQUESTS_MB: "1024"

DISCOVERY_EXECUTOR_K8S_MEM_REQUESTS

Minimum amount of Kubernetes memory to be used by Discovery Executor. Setting this value will override DISCOVERY_EXECUTOR_K8S_MEM_REQUESTS_MB. For example, DISCOVERY_EXECUTOR_K8S_MEM_REQUESTS: "1G"

DISCOVERY_EXECUTOR_K8S_MEM_LIMITS_MB

Maximum amount of Kubernetes memory in MB to be requested by Discovery Executor. For example, DISCOVERY_EXECUTOR_K8S_MEM_LIMITS_MB: "1024"

DISCOVERY_EXECUTOR_K8S_MEM_LIMITS

Maximum amount of Kubernetes memory to be used by Discovery Executor. Setting this value will override DISCOVERY_EXECUTOR_K8S_MEM_LIMITS_MB. For example, DISCOVERY_EXECUTOR_K8S_MEM_LIMITS: "1G"

DISCOVERY_EXECUTOR_CPU_MIN

Minimum amount of Kubernetes CPU to be requested by Discovery Executor. For example, DISCOVERY_EXECUTOR_CPU_MIN: "0.5"

DISCOVERY_EXECUTOR_CPU_MAX

Maximum amount of Kubernetes CPU to be used by Discovery Executor. For example, DISCOVERY_EXECUTOR_CPU_MAX: "0.5"
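
The memory variables above come in pairs: a `_MB` form that takes a number of megabytes, and a unit-suffixed form that overrides it when both are set. The sketch below reuses the example values from the descriptions to show the two forms side by side for the driver. It illustrates the syntax and the override rule only and is not a sizing recommendation.

```yaml
# _MB form: plain number, interpreted as megabytes
DISCOVERY_DRIVER_HEAP_MIN_MEMORY_MB: "1024"

# Both forms set: the unit-suffixed value overrides the _MB value
DISCOVERY_DRIVER_HEAP_MAX_MEMORY_MB: "1024"
DISCOVERY_DRIVER_HEAP_MAX_MEMORY: "1g"

# Kubernetes memory request and limit, unit-suffixed form
DISCOVERY_DRIVER_K8S_MEM_REQUESTS: "1G"
DISCOVERY_DRIVER_K8S_MEM_LIMITS: "1G"

# Kubernetes CPU request and limit
DISCOVERY_DRIVER_CPU_MIN: "0.5"
DISCOVERY_DRIVER_CPU_MAX: "0.5"
```

The executor variables follow the same pattern, with EXECUTOR in place of DRIVER in each property name.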

DISCOVERY_CONSUMER_ENABLE

Set this property to true to start a separate consumer pod, which writes Privacera Discovery Classification and Scan Summary data to Solr.

Set this property to false if you do not require a separate consumer pod.

Note

This property is enabled only for AWS.

true, false

false