Question 3
Domain 1: Design and prepare a machine learning solutionYou create a multi-class image classification deep learning model that uses a set of labeled bird photographs collected by experts. All photographs are stored in an Azure blob container. You need to access the bird photograph files from Azure ML service workspace for training, minimizing data movement. What should you do?
Correct answer: C
Explanation
Registering the Azure blob storage as a datastore lets Azure Machine Learning access data in place from the workspace, so the training job can read the bird photographs without copying them elsewhere. A datastore is the workspace’s connection to storage, which minimizes data movement while still making the files available for training.
Why each option is right or wrong
A. Create and register a dataset by using TabularDataset class that references the Azure blob storage containing bird photographs.
B. Copy the bird photographs to the blob datastore that was created with your Azure Machine Learning service workspace.
C. Register the Azure blob storage containing the bird photographs as a datastore in Azure Machine Learning service.
Azure Machine Learning datastores are the workspace abstraction for connecting to external storage, and they are the supported way to reference Azure Blob Storage directly from the workspace without first copying the files into the workspace. In this scenario, registering the blob container as a datastore allows the training job to read the labeled bird images in place, which is the least-data-movement approach; by contrast, importing the files into another Azure ML asset would create an unnecessary copy.
D. Create an Azure Data Lake store and move the bird photographs to the store.
E. Create an Azure Cosmos DB database and attach the Azure Blob containing bird photographs storage to the database.