Question 35
Domain 5: Monitor, retrain, and manage ML lifecycle
You have trained a model using a dataset containing data that was collected last year. As this year progresses, you will collect new data. You want to track any changing data trends that might affect the performance of the model. What should you do?
Correct answer: C
Explanation
Data drift monitoring compares a baseline dataset to newer data to detect changes in feature distributions over time. Using the training dataset as the baseline and the newly collected data as the target lets you track shifting trends that may affect model performance.
Why each option is right or wrong
A. Collect the new data in a new version of the existing training dataset, and profile both datasets.
Profiling the two dataset versions produces summary statistics for each, but it does not compare their distributions over time or alert you when they diverge. You would have to compare the profiles manually each time new data arrives.
B. Replace the training dataset with a new dataset that contains both the original training data and the new data.
Merging the new data into the training dataset removes the fixed baseline, so there is nothing left to compare the new data against and no way to track how the trends are changing.
C. Collect the new data in a separate dataset and create a Data Drift Monitor with the training dataset as a baseline and the new dataset as a target.
Azure Machine Learning’s Data Drift Monitor is designed to compare a fixed baseline dataset against a newer target dataset to detect changes in feature distributions over time. In this scenario, the model was trained on last year’s data, so that training set is the correct baseline, and the newly collected data must be registered separately as the target. The service then evaluates drift on the selected columns and can be scheduled to run periodically (for example, daily, weekly, or monthly) to catch trend changes as they emerge.
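The baseline-versus-target comparison described above can be illustrated with a small, library-free sketch. This is not the Azure Machine Learning service or its drift algorithm. It is a stand-in that uses a two-sample Kolmogorov-Smirnov statistic to compare each selected column, and the function names (`ks_statistic`, `drift_report`), the sample data, and the 0.3 threshold are all assumptions made for illustration.

```python
from bisect import bisect_right


def ks_statistic(baseline, target):
    """Largest gap between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    b, t = sorted(baseline), sorted(target)
    points = sorted(set(b) | set(t))
    return max(
        abs(bisect_right(b, x) / len(b) - bisect_right(t, x) / len(t))
        for x in points
    )


def drift_report(baseline_rows, target_rows, features, threshold=0.3):
    """Return {feature: (statistic, drifted)} for the selected columns.

    The threshold is an arbitrary illustrative value, not a service default.
    """
    report = {}
    for name in features:
        stat = ks_statistic(
            [row[name] for row in baseline_rows],
            [row[name] for row in target_rows],
        )
        report[name] = (stat, stat > threshold)
    return report


# Hypothetical data: last year's training rows (baseline) vs. this year's rows (target).
baseline = [{"temp": v, "load": 5.0} for v in (10, 12, 14, 16, 18)]
target = [{"temp": v, "load": 5.0} for v in (20, 22, 24, 26, 28)]

report = drift_report(baseline, target, features=["temp", "load"])
for name, (stat, drifted) in report.items():
    print(f"{name}: statistic={stat:.2f} drifted={drifted}")
```

In this toy data the `temp` column has shifted entirely while `load` is unchanged, so only `temp` is flagged. Keeping the baseline fixed and re-running the comparison as new target data arrives mirrors the scheduled monitoring that option C describes.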