
AWS Certified Machine Learning - Specialty Study Guide

Use the official AWS domain outline to connect data engineering, exploratory data analysis, modeling, and machine learning implementation and operations on AWS to scenario-based questions and explanations.


How the Exam Is Structured

AWS Certified Machine Learning - Specialty (MLS-C01) validates skills in data engineering, exploratory data analysis, modeling, and machine learning implementation and operations on AWS. The ExamPal practice bank includes 457 premium questions and 40 free questions mapped across the official blueprint.

Domain	Weight	Focus
Content Domain 1: Data Engineering	20%	Task 1.1: Create data repositories for ML; Identify data sources
Content Domain 2: Exploratory Data Analysis	24%	Task 2.1: Sanitize and prepare data for modeling; Identify and handle missing data, corrupt data, and stop words
Content Domain 3: Modeling	36%	Task 3.1: Frame business problems as ML problems; Determine when to use and when not to use ML
Content Domain 4: Machine Learning Implementation and Operations	20%	Task 4.1: Build ML solutions for performance, availability, scalability, resiliency, and fault tolerance; Log and monitor AWS environments

20% of exam

Content Domain 1: Data Engineering

This domain covers the data engineering work needed to support machine learning solutions, including creating repositories, ingesting data, and transforming data for ML workloads. It emphasizes selecting appropriate storage, orchestration, and processing services for batch and streaming pipelines.

Task 1.1: Create data repositories for ML

Identify data sources

Determine storage mediums

Task 1.2: Identify and implement a data ingestion solution

Identify data job styles and job types

Orchestrate data ingestion pipelines
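
The difference between batch and streaming job styles in Task 1.2 can be illustrated with a minimal sketch. It assumes a hypothetical list of sensor readings and uses plain Python rather than any AWS service; the function names are illustrative only. The batch function needs the full dataset before it runs, while the streaming function updates its result as each record arrives.

```python
from typing import Iterable, Iterator, Sequence


def batch_mean(records: Sequence[float]) -> float:
    """Batch style: the full dataset is available before processing starts."""
    return sum(records) / len(records)


def streaming_mean(records: Iterable[float]) -> Iterator[float]:
    """Streaming style: update a running mean as each record arrives."""
    count = 0
    mean = 0.0
    for value in records:
        count += 1
        mean += (value - mean) / count  # incremental mean update
        yield mean


readings = [10.0, 20.0, 30.0, 40.0]  # hypothetical sensor readings
print(batch_mean(readings))            # one result after the whole load
print(list(streaming_mean(readings)))  # one result per arriving record
```

The final value of the streaming result matches the batch result; the trade-off is that streaming yields intermediate answers with low latency, while batch is simpler when the data is already at rest.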

Task 1.3: Identify and implement a data transformation solution

24% of exam

Content Domain 2: Exploratory Data Analysis

This domain covers preparing data for machine learning, engineering useful features, and analyzing data to understand patterns before modeling. It includes data sanitation, feature extraction, visualization, descriptive statistics, and cluster analysis.

Task 2.1: Sanitize and prepare data for modeling

Identify and handle missing data, corrupt data, and stop words

Format, normalize, augment, and scale data
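
The sanitation and scaling steps listed under Task 2.1 can be sketched in a few lines of standard-library Python. This is a simplified illustration, not a prescribed method: the stop-word list, the sample values, and the function names are assumptions, and mean imputation is only one of several ways to handle missing data.

```python
import statistics
from typing import List, Optional, Sequence

STOP_WORDS = {"the", "a", "an", "is", "of", "and"}  # tiny illustrative list


def impute_mean(values: Sequence[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


def remove_stop_words(text: str) -> List[str]:
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale linearly into [0, 1]; assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values: Sequence[float]) -> List[float]:
    """Shift to zero mean and unit population standard deviation.

    Assumes the values are not all equal (nonzero standard deviation).
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


ages = impute_mean([20.0, None, 40.0, 60.0])  # hypothetical column
print(ages)
print(min_max_scale(ages))
print(standardize(ages))
print(remove_stop_words("The price of the house is high"))
```

Min-max scaling preserves the shape of the distribution but is sensitive to outliers; standardization is often preferred when features have very different spreads.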

Task 2.2: Perform feature engineering

Identify and extract features from datasets, including from data sources such as text, speech, images, and public datasets

Analyze and evaluate feature engineering concepts
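
Two common feature extraction techniques relevant to Task 2.2, one-hot encoding for categorical values and bag-of-words counts for text, can be sketched as follows. The sample data and function names are hypothetical, and whitespace tokenization is a simplifying assumption.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def one_hot(values: Sequence[str]) -> List[List[int]]:
    """Encode each categorical value as a 0/1 vector over sorted categories."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def bag_of_words(documents: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Return a sorted vocabulary and one term-count vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)  # missing tokens count as 0
        vectors.append([counts[tok] for tok in vocabulary])
    return vocabulary, vectors


colors = ["red", "blue", "red"]  # hypothetical categorical column
print(one_hot(colors))

vocab, vectors = bag_of_words(["good model", "good good data"])
print(vocab)
print(vectors)
```

One-hot encoding avoids implying an order between categories, at the cost of one column per category; bag-of-words discards word order but gives a fixed-length numeric representation of text.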

Task 2.3: Analyze and visualize data for ML

36% of exam

Content Domain 3: Modeling

This domain covers how to frame business problems as machine learning problems, choose appropriate models, train and tune models, and evaluate model performance. It emphasizes both classical ML and modern foundation/LLM approaches, along with the practical tradeoffs involved in model selection and evaluation.

Task 3.1: Frame business problems as ML problems

Determine when to use and when not to use ML

Know the difference between supervised and unsupervised learning
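
The distinction between supervised and unsupervised learning can be made concrete with a toy one-dimensional sketch. It is a simplified illustration under stated assumptions: the data points, labels, and function names are hypothetical, the supervised part is a nearest-centroid classifier, and the unsupervised part is k-means with two clusters and a deterministic start.

```python
from statistics import fmean
from typing import Dict, List, Sequence, Tuple


def fit_centroids(samples: Sequence[Tuple[float, str]]) -> Dict[str, float]:
    """Supervised: labels are given, so compute one mean per known class."""
    by_label: Dict[str, List[float]] = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: fmean(vals) for label, vals in by_label.items()}


def predict(centroids: Dict[str, float], value: float) -> str:
    """Assign the label whose class mean is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def two_means(values: Sequence[float], iterations: int = 10) -> List[int]:
    """Unsupervised: no labels, so discover two groups with 1-D k-means."""
    centers = [min(values), max(values)]  # simple deterministic start
    assignments = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        assignments = [
            0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            for v in values
        ]
        # Update step: move each center to the mean of its members.
        for k in (0, 1):
            members = [v for v, a in zip(values, assignments) if a == k]
            if members:
                centers[k] = fmean(members)
    return assignments


labeled = [(1.0, "low"), (2.0, "low"), (9.0, "high"), (10.0, "high")]
centroids = fit_centroids(labeled)
print(predict(centroids, 3.0))            # uses labels learned from training data
print(two_means([1.0, 2.0, 9.0, 10.0]))   # cluster ids only, no label names
```

The supervised model returns a named class because it was trained on labeled data; the unsupervised model returns only cluster indices, which a person must interpret afterward.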

Task 3.2: Select the appropriate model(s) for a given ML problem

XGBoost, logistic regression, k-means, linear regression, decision trees, random forests, RNN, CNN, ensemble, transfer learning, and large language models (LLMs)

Express the intuition behind models

Task 3.3: Train ML models

20% of exam

Content Domain 4: Machine Learning Implementation and Operations

This domain covers building, securing, deploying, and operating machine learning solutions in AWS. It emphasizes operational qualities such as performance, availability, scalability, resiliency, and fault tolerance, along with service selection and security practices.

Task 4.1: Build ML solutions for performance, availability, scalability, resiliency, and fault tolerance

Log and monitor AWS environments

AWS CloudTrail and Amazon CloudWatch

Task 4.2: Recommend and implement the appropriate ML services and features for a given problem

ML on AWS (application services), for example:

Amazon Polly

Task 4.3: Apply basic AWS security practices to ML solutions

Key Terms to Know

These terms are loaded from the shared terminology pack and appear across the question explanations.

A/B testing: An experimental method that compares two versions of a system by exposing different groups to each version and measuring outcomes.
Amazon Transcribe: An AWS managed service that converts speech audio into text for search, analysis, and downstream processing.
ML model: The trained artifact produced by machine learning that is used to generate predictions or decisions from input data.
ML operations: The practices and workflows used to deploy, monitor, maintain, and update machine learning systems in production.
MLOps: A discipline that combines machine learning, software engineering, and operations to manage the ML lifecycle.
business process improvement: The practice of optimizing operational workflows, sometimes by applying machine learning when data-driven prediction is useful.
data ingestion pipeline: A workflow that collects and moves data from source systems into storage or processing systems.
data preparation: The process of cleaning, transforming, and validating data before modeling.
data sufficiency: The condition that enough relevant data is available to effectively train a machine learning model.
dataset issue: A problem in collected data, such as missing values, bias, noise, or inconsistency, that can reduce model quality.
defined intervals: Specified repeating time periods used to trigger scheduled jobs or workflows.
deployment: The process of making a trained machine learning model available for production use.
deterministic rule: A fixed and explicit rule that always produces the same output for the same input.
inference: The process of using a trained model to generate outputs from new input data.
job scheduling: The configuration of tasks to run automatically at defined times or recurring intervals.
labeled data: Data examples that include the correct output or target value for training supervised models.
machine learning: A method of building systems that learn patterns or relationships from data to make predictions or decisions.
manual intervention: Human action required to start, manage, or correct a process that could otherwise be automated.
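
As an illustration of the A/B testing term above, the outcome of comparing two model variants is often checked with a two-proportion z-test. The sketch below is one common approach, not the only one; the conversion counts are made up, and the function name is hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    success_a: int, total_a: int, success_b: int, total_b: int
) -> Tuple[float, float]:
    """Return (z statistic, two-sided p-value) for the rate of B minus A."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results: variant A converts 100 of 1000, variant B 130 of 1000.
z, p = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference between the variants is unlikely to be due to chance alone; the significance threshold should be chosen before the experiment starts.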

Official Materials and Guidance

This page is built from the official AWS MLS-C01 exam guide, the shared syllabus, topic tree, terminology pack, free pack, and premium pack.

- AWS MLS-C01 Exam Guide
