Question 19
Domain 3
You work at an ecommerce startup. You need to create a customer churn prediction model. Your company’s recent sales records are stored in a BigQuery table. You want to understand how your initial model is making predictions. You also want to iterate on the model as quickly as possible while minimizing cost. How should you build your first model?
Correct answer: C
Explanation
BigQuery data can be used directly with Vertex AI by associating the table with a Vertex AI dataset, which avoids an export step and so reduces data movement and cost. For a churn problem, an AutoMLTabularTrainingJob is appropriate because it trains a tabular classification model quickly, without custom model code, and provides model insights such as feature importance for understanding predictions.
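The approach described above can be sketched with the Vertex AI SDK for Python. This is a minimal sketch rather than a definitive implementation: the project ID, region, table name, target column, and display names are hypothetical placeholders that do not come from the question, and running the training function requires the google-cloud-aiplatform package plus valid Google Cloud credentials.

```python
from dataclasses import dataclass


@dataclass
class ChurnJobConfig:
    # Every default here is a hypothetical placeholder, not a value from the question.
    project: str = "my-project"
    location: str = "us-central1"
    bq_table: str = "my-project.sales.customer_records"
    target_column: str = "churned"
    budget_milli_node_hours: int = 1000  # 1,000 milli node hours = 1 node hour


def bq_source_uri(table: str) -> str:
    """Return the table reference in the bq:// URI form used for BigQuery sources."""
    return table if table.startswith("bq://") else f"bq://{table}"


def train_churn_model(cfg: ChurnJobConfig):
    """Create a tabular dataset from BigQuery and train an AutoML classifier."""
    # Imported lazily so the helpers above can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=cfg.project, location=cfg.location)

    # Associate the BigQuery table with a Vertex AI dataset (no export to Cloud Storage).
    dataset = aiplatform.TabularDataset.create(
        display_name="churn-dataset",
        bq_source=bq_source_uri(cfg.bq_table),
    )

    # Churn is a yes/no label, so the prediction type is classification.
    job = aiplatform.AutoMLTabularTrainingJob(
        display_name="churn-automl",
        optimization_prediction_type="classification",
    )

    # A small training budget keeps the first iteration cheap.
    return job.run(
        dataset=dataset,
        target_column=cfg.target_column,
        budget_milli_node_hours=cfg.budget_milli_node_hours,
        model_display_name="churn-model-v1",
    )
```

Calling train_churn_model(ChurnJobConfig(project="...", bq_table="...")) with real values would start a training job and return the trained model. Keeping the budget small on the first run is one way to limit cost while iterating.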
Why each option is right or wrong
A. Export the data to a Cloud Storage bucket. Load the data into a pandas DataFrame on Vertex AI Workbench and train a logistic regression model with scikit-learn.
B. Create a tf.data.Dataset by using the TensorFlow BigQueryClient. Implement a deep neural network in TensorFlow.
C. Prepare the data in BigQuery and associate the data with a Vertex AI dataset. Create an AutoMLTabularTrainingJob to train a classification model.
Vertex AI’s tabular AutoML is the right fit for a churn label because it trains a supervised classification model on structured rows and produces feature attribution outputs for understanding predictions. Using BigQuery as the source and associating it with a Vertex AI dataset avoids exporting the table, which minimizes data movement and speeds iteration. An AutoMLTabularTrainingJob can then train directly from that dataset, with no custom model code required.
D. Export the data to a Cloud Storage bucket. Create a tf.data.Dataset to read the data from Cloud Storage. Implement a deep neural network in TensorFlow.