Question 31
Content Domain 1: Data Engineering
A machine learning team is deciding which inputs should be documented during early data discovery. Which set of data-source attributes best aligns with identifying data sources for an ML repository?
Correct answer: B
Explanation
Identifying data sources focuses on where data comes from and what it contains, including primary sources such as user data. Source material: "Identify data sources, for example, content and location, primary sources such as user data."
Why each option is right or wrong
A. Model accuracy, training latency, deployment target, and rollback plan
Identifying data sources concerns the origin and contents of the data, not model performance or deployment planning.
B. Content, location, primary sources, and user data
The source material gives content and location, and primary sources such as user data, as examples of what to identify about a data source. Because the question asks which attributes align with identifying data sources, this set matches the stated examples, with user data serving as an example of a primary source.
C. Feature scaling, label encoding, class balance, and hyperparameters
Data-source identification occurs before preprocessing and model-tuning choices such as scaling or hyperparameters.
D. Encryption method, access password, firewall rule, and audit log
Security controls govern protection and access, not the source characteristics of the data itself.
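Illustration
The attributes in option B could be recorded in a simple data-source inventory during discovery. The following is a minimal sketch, not a prescribed method: the DataSource class, its field names, the primary_sources helper, and the example entries and paths are all hypothetical and do not come from the source material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One entry in a hypothetical data-source inventory."""
    name: str
    content: str      # what the data contains
    location: str     # where the data lives
    is_primary: bool  # True for primary sources such as user data


def primary_sources(sources):
    """Return the names of the entries flagged as primary sources."""
    return [s.name for s in sources if s.is_primary]


# Example entries; the names and paths are made up for illustration.
inventory = [
    DataSource(
        name="user_profiles",
        content="account and preference fields",
        location="s3://example-bucket/users/",
        is_primary=True,
    ),
    DataSource(
        name="vendor_benchmarks",
        content="third-party aggregate statistics",
        location="s3://example-bucket/vendor/",
        is_primary=False,
    ),
]

print(primary_sources(inventory))
```

Note that the record holds only source characteristics (content, location, primary or not). Model metrics, preprocessing settings, and security controls, as in options A, C, and D, would be tracked elsewhere.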