Question 14
Domain 2
You are building a real-time prediction engine that streams files which may contain Personally Identifiable Information (PII) to Google Cloud. You want to use the Cloud Data Loss Prevention (DLP) API to scan the files. How should you ensure that the PII is not accessible by unauthorized individuals?
Correct answer: D
Explanation
Cloud DLP is used to identify and classify sensitive data before it is exposed to broader access. Writing all incoming files to a "Quarantine" bucket keeps unscanned content isolated until DLP determines whether it belongs in "Sensitive" or "Non-sensitive" storage, reducing the chance that PII is accessible to unauthorized users.
Why each option is right or wrong
A. Stream all files to Google Cloud and then write the data to BigQuery. Periodically conduct a bulk scan of the table using the DLP API.
B. Stream all files to Google Cloud, and write batches of the data to BigQuery. While the data is being written to BigQuery, conduct a bulk scan of the data using the DLP API.
C. Create two buckets of data: Sensitive and Non-sensitive. Write all data to the Non-sensitive bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the sensitive data to the Sensitive bucket.
D. Create three buckets of data: Quarantine, Sensitive, and Non-sensitive. Write all data to the Quarantine bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the data to either the Sensitive or Non-sensitive bucket.
Google Cloud DLP is used to inspect content before it is released into broader storage, so the control point here is to keep all incoming objects isolated until classification is complete. By writing every file first to a Quarantine bucket, you prevent unauthorized access during the scan window; only after DLP review should objects be moved into the Sensitive or Non-sensitive buckets, which is the standard segregation pattern for unclassified PII-bearing data.
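The quarantine-then-classify flow can be illustrated with a minimal, self-contained sketch. It is a simulation under stated assumptions, not a Cloud DLP integration: the three buckets are modeled as plain dictionaries, and `looks_sensitive` is a hypothetical regex-based stand-in for the DLP inspection step. The function name `drain_quarantine` and the patterns are likewise illustrative only.

```python
import re
from typing import Callable, Dict

# Hypothetical stand-in for a DLP inspection call (assumption, not the DLP API):
# it flags e-mail addresses and US SSN-like strings.
_PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
]


def looks_sensitive(content: str) -> bool:
    """Return True if any of the illustrative PII patterns matches."""
    return any(pattern.search(content) for pattern in _PII_PATTERNS)


def drain_quarantine(
    quarantine: Dict[str, str],
    sensitive: Dict[str, str],
    non_sensitive: Dict[str, str],
    inspect: Callable[[str], bool] = looks_sensitive,
) -> Dict[str, int]:
    """Classify every quarantined object and move it to its destination.

    An object is removed from quarantine only after it has been written to
    its destination, so a failed inspection leaves it isolated.
    """
    moved = {"sensitive": 0, "non_sensitive": 0}
    for name in list(quarantine):  # copy keys: the dict is mutated below
        content = quarantine[name]
        if inspect(content):
            sensitive[name] = content
            moved["sensitive"] += 1
        else:
            non_sensitive[name] = content
            moved["non_sensitive"] += 1
        del quarantine[name]
    return moved


if __name__ == "__main__":
    quarantine = {"a.txt": "contact: jane@example.com", "b.txt": "temperature=21"}
    sensitive: Dict[str, str] = {}
    non_sensitive: Dict[str, str] = {}
    print(drain_quarantine(quarantine, sensitive, non_sensitive))
    print(sorted(sensitive), sorted(non_sensitive), quarantine)
```

In a real deployment the `inspect` callable would presumably wrap a call to the DLP API, and access to the Quarantine bucket would be restricted to the scanning service. The key property shown here is that nothing reaches the Sensitive or Non-sensitive store without first being inspected.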