Professional Machine Learning Engineer Study Guide
Use the official domain outline to connect Google Cloud ML services, Vertex AI workflows, MLOps patterns, model serving, pipeline orchestration, and responsible AI monitoring to scenario-based questions.
How the Exam Is Structured
The Google Cloud Professional Machine Learning Engineer exam validates the ability to build, evaluate, productionize, optimize, monitor, and orchestrate ML and generative AI solutions on Google Cloud. The ExamPal practice bank includes 801 premium questions and 40 free questions mapped across the official blueprint.
| Domain | Weight | Focus |
|---|---|---|
| Domain 1: Architecting low-code AI solutions | 13% | Developing ML models by using BigQuery ML; Building AI solutions by using ML APIs or foundation models |
| Domain 2: Collaborating within and across teams to manage data and models | 14% | Exploring and preprocessing organization-wide data (e.g., Cloud Storage, BigQuery, Spanner, Cloud SQL, Apache Spark, Apache Hadoop); Model prototyping using Jupyter notebooks |
| Domain 3: Scaling prototypes into ML models | 18% | Building models; Training models |
| Domain 4: Serving and scaling models | 20% | Serving models; Scaling online model serving |
| Domain 5: Automating and orchestrating ML pipelines | 22% | Developing end-to-end ML pipelines; Automating model retraining |
| Domain 6: Monitoring AI solutions | 13% | Identifying risks to AI solutions; Monitoring, testing, and troubleshooting AI solutions |
13% of exam
Domain 1: Architecting low-code AI solutions
This section covers building AI and ML solutions with low-code and managed Google Cloud services. It includes BigQuery ML, ML APIs and foundation models, and AutoML workflows for training and debugging models.
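As a rough illustration of the BigQuery ML workflow mentioned above, the sketch below builds the two SQL statements such a workflow typically needs: one to train a model and one to generate predictions. The dataset, table, model, and label names are hypothetical, and the helpers only build the SQL text; they do not run it.

```python
def create_model_sql(model: str, source_table: str, label: str,
                     model_type: str = "logistic_reg") -> str:
    """Build a BigQuery ML CREATE MODEL statement (returned as text, not executed)."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


def predict_sql(model: str, input_table: str) -> str:
    """Build an ML.PREDICT query that scores every row of input_table."""
    return (
        f"SELECT * FROM ML.PREDICT(MODEL `{model}`,\n"
        f"  (SELECT * FROM `{input_table}`))"
    )


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    print(create_model_sql("demo.churn_model", "demo.customers", "churned"))
    print(predict_sql("demo.churn_model", "demo.new_customers"))
```

The generated statements can be pasted into the BigQuery console or submitted through a client library. The point for scenario questions is that training and prediction both stay in SQL, next to the data.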
14% of exam
Domain 2: Collaborating within and across teams to manage data and models
This section covers organization-wide data exploration, preprocessing, notebook-based prototyping, and ML experimentation. It emphasizes Google Cloud data services, Vertex AI tooling, and evaluation of generative AI solutions.
18% of exam
Domain 3: Scaling prototypes into ML models
This section covers model design, training at scale, distributed training, hyperparameter tuning, troubleshooting, and hardware selection. It also includes fine-tuning foundation models and choosing compute and accelerator options.
20% of exam
Domain 4: Serving and scaling models
This section covers batch and online inference, model serving frameworks, model registry, A/B testing, and scaling online serving backends. It also includes hardware selection and production tuning for latency, memory, throughput, and performance.
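The A/B testing idea above can be sketched as a percentage-based traffic split between two model versions. This is a minimal, service-agnostic illustration rather than any Google Cloud API: the function name, variant names, and hash-based bucketing are assumptions chosen for the example.

```python
import hashlib
from typing import Dict


def pick_variant(request_id: str, traffic_split: Dict[str, int]) -> str:
    """Route a request to a model variant according to integer percentages."""
    if sum(traffic_split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    # Derive a stable bucket in [0, 100) from the request ID, so the same
    # ID is always routed to the same variant.
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    cumulative = 0
    for variant, percent in traffic_split.items():
        cumulative += percent
        if bucket < cumulative:
            return variant
    raise AssertionError("unreachable when percentages sum to 100")


if __name__ == "__main__":
    # Hypothetical variant names: 90% to the current model, 10% to the candidate.
    split = {"model_v1": 90, "model_v2": 10}
    counts = {"model_v1": 0, "model_v2": 0}
    for i in range(10_000):
        counts[pick_variant(f"request-{i}", split)] += 1
    print(counts)
```

Hashing the request ID instead of sampling randomly keeps routing deterministic, which makes it easier to compare metrics per variant afterwards.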
22% of exam
Domain 5: Automating and orchestrating ML pipelines
This section covers end-to-end pipeline development, retraining automation, and metadata tracking and auditing. It includes validation, orchestration frameworks, hybrid and multicloud strategies, CI/CD deployment, and lineage/versioning.
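The retraining and validation flow described above can be sketched as a small gate: train a candidate, evaluate it, and deploy only if it beats the current baseline. This is an orchestrator-agnostic sketch in plain Python, not the API of any pipeline framework. The function and the keys of the returned record are hypothetical.

```python
from typing import Any, Callable, Dict


def run_retraining(
    train: Callable[[], Any],
    evaluate: Callable[[Any], float],
    deploy: Callable[[Any], None],
    baseline_metric: float,
    min_gain: float = 0.0,
) -> Dict[str, Any]:
    """Train a candidate, validate it against the baseline, deploy only on a pass."""
    candidate = train()
    candidate_metric = evaluate(candidate)
    passed = candidate_metric >= baseline_metric + min_gain
    if passed:
        deploy(candidate)
    # This record stands in for the metadata a pipeline would log for auditing.
    return {
        "baseline_metric": baseline_metric,
        "candidate_metric": candidate_metric,
        "deployed": passed,
    }


if __name__ == "__main__":
    deployed_models = []
    record = run_retraining(
        train=lambda: "candidate-model",      # placeholder training step
        evaluate=lambda model: 0.91,           # placeholder evaluation metric
        deploy=deployed_models.append,         # placeholder deployment step
        baseline_metric=0.88,
    )
    print(record, deployed_models)
```

In a real pipeline each callable would be a separate component, and the returned record would be written to a metadata store so the deployment decision can be audited later.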
13% of exam
Domain 6: Monitoring AI solutions
This section covers AI risk identification, responsible AI practices, explainability, continuous evaluation, skew and drift monitoring, and troubleshooting. It emphasizes secure AI systems and ongoing monitoring of model performance and errors.
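To make skew and drift monitoring concrete, the sketch below compares two samples of a categorical feature with two common distance measures: L-infinity distance and Jensen-Shannon divergence. Skew is usually measured between training and serving data, and drift between serving data from different time windows. The same comparison applies to both. The function names, sample data, and any alerting threshold are assumptions for illustration.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def distribution(values: Iterable[str]) -> Dict[str, float]:
    """Turn raw categorical values into relative frequencies."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot build a distribution from an empty sample")
    return {key: count / total for key, count in counts.items()}


def l_infinity(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Largest absolute difference in frequency over all categories."""
    keys = set(p) | set(q)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def jensen_shannon(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a: Dict[str, float]) -> float:
        # Terms with zero probability contribute nothing to the sum.
        return sum(a[k] * math.log2(a[k] / mix[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


if __name__ == "__main__":
    # Hypothetical feature values from training data and recent serving traffic.
    training = distribution(["web"] * 8 + ["mobile"] * 2)
    serving = distribution(["web"] * 5 + ["mobile"] * 5)
    print(l_infinity(training, serving), jensen_shannon(training, serving))
```

A monitoring job would compute a score like this per feature on a schedule and raise an alert when it crosses a chosen threshold.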
Key Terms to Know
These terms are loaded from the shared terminology pack and appear across the question explanations.
- A/B testing: A method for comparing different versions of a model to determine which performs better.
- Apache Hadoop: A distributed data processing ecosystem, named in the exam guide as a source of organization-wide data for preprocessing and training.
- Apache Spark: A distributed data processing framework used for preprocessing, notebook-based exploration, and model development.
- AutoML: A Google Cloud machine learning approach for training custom models on prepared data, including tabular workflows and forecasting models.
- BigQuery: Google Cloud’s data warehouse, used for data exploration, preprocessing, and organizing training data.
- BigQuery ML: A Google Cloud capability for developing machine learning models directly in BigQuery, including model building, feature engineering or selection, and prediction generation.
- CI/CD: Continuous integration and continuous delivery, applied in this exam to automating model deployment.
- CPU: Central processing unit, one of the compute options for training and serving models.
- Cloud Build: Google Cloud’s build service, used for pipeline components and CI/CD model deployment.
- Cloud Composer: Google Cloud’s managed orchestration service, used for ML pipeline orchestration.
- Cloud Run: Google Cloud’s serverless compute service, used for pipeline components and other compute needs.
- Cloud SQL: Google Cloud’s managed relational database service, named as a data source. It is also one of the areas where minimum proficiency is expected for interpreting code snippets in questions.
- Cloud Storage: Google Cloud’s object storage, used for organizing data and training on it.
- Colab Enterprise: A Google Cloud Jupyter backend option for model prototyping.
- Dataflow: Google Cloud’s data processing service, used for preprocessing and as a component in ML pipelines.
- Dataproc: Google Cloud’s managed Spark and Hadoop service, named as a Jupyter notebook backend and as a serving and inference environment.
- Document AI API: An industry-specific API, given as an example of an ML API used to build AI solutions.
- Explainable AI: A set of techniques and tools for explaining model predictions and monitoring model behavior.
Official Materials and Guidance
This page is built from the official Google Cloud exam guide, the shared syllabus, topic tree, terminology pack, free pack, and premium pack.
- Google PMLE exam guide HTML
- Google PMLE exam guide text extract