AWS Certified Machine Learning - Specialty Study Guide
Use the official AWS domain outline to connect data engineering, exploratory data analysis, modeling, and machine learning implementation and operations on AWS to scenario-based questions and explanations.
How the Exam Is Structured
AWS Certified Machine Learning - Specialty (MLS-C01) validates skills in data engineering, exploratory data analysis, modeling, and machine learning implementation and operations on AWS. The ExamPal practice bank includes 457 premium questions and 40 free questions mapped across the official blueprint.
| Domain | Weight | Focus |
|---|---|---|
| Content Domain 1: Data Engineering | 20% | Task 1.1: Create data repositories for ML; Identify data sources |
| Content Domain 2: Exploratory Data Analysis | 24% | Task 2.1: Sanitize and prepare data for modeling; Identify and handle missing data, corrupt data, and stop words |
| Content Domain 3: Modeling | 36% | Task 3.1: Frame business problems as ML problems; Determine when to use and when not to use ML |
| Content Domain 4: Machine Learning Implementation and Operations | 20% | Task 4.1: Build ML solutions for performance, availability, scalability, resiliency, and fault tolerance; Log and monitor AWS environments |
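One way to use the weights above is to split study time in proportion to them. The following is a minimal sketch, not an official recommendation: the function name `allocate_hours` and the 40-hour budget are hypothetical, and only the percentages come from the table.

```python
# Domain weights copied from the table above (percent of exam).
DOMAIN_WEIGHTS = {
    "Data Engineering": 20,
    "Exploratory Data Analysis": 24,
    "Modeling": 36,
    "Machine Learning Implementation and Operations": 20,
}


def allocate_hours(total_hours, weights=DOMAIN_WEIGHTS):
    """Split a study budget across domains in proportion to exam weight."""
    total_weight = sum(weights.values())
    return {name: total_hours * w / total_weight for name, w in weights.items()}


# 40 hours is a hypothetical budget chosen only for illustration.
plan = allocate_hours(40)
for name, hours in plan.items():
    print(f"{name}: {hours:.1f} h")
```

Proportional allocation is only a starting point; shifting hours toward weaker domains after a first practice set is a reasonable adjustment.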
Content Domain 1: Data Engineering (20% of exam)
This domain covers the data engineering work needed to support machine learning solutions, including creating repositories, ingesting data, and transforming data for ML workloads. It emphasizes selecting appropriate storage, orchestration, and processing services for batch and streaming pipelines.
Content Domain 2: Exploratory Data Analysis (24% of exam)
This domain covers preparing data for machine learning, engineering useful features, and analyzing data to understand patterns before modeling. It includes data sanitation, feature extraction, visualization, descriptive statistics, and cluster analysis.
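As an illustration of the data sanitation and descriptive statistics mentioned above, the sketch below fills missing values with the median and then summarizes the column. It uses only the Python standard library; the `ages` column and the helper `impute_median` are hypothetical, and median imputation is just one of several possible strategies.

```python
import statistics

# Hypothetical feature column; None marks a missing reading.
ages = [34, None, 29, 41, None, 38]


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


clean = impute_median(ages)

# Descriptive statistics on the cleaned column.
summary = {
    "mean": statistics.mean(clean),
    "median": statistics.median(clean),
    "stdev": statistics.stdev(clean),
}
print(clean)
print(summary)
```

The median is often preferred over the mean for imputation when a column has outliers, because a few extreme values do not shift it much.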
Content Domain 3: Modeling (36% of exam)
This domain covers how to frame business problems as machine learning problems, choose appropriate models, train and tune models, and evaluate model performance. It emphasizes both classical ML and deep learning approaches, along with the practical tradeoffs involved in model selection and evaluation.
Content Domain 4: Machine Learning Implementation and Operations (20% of exam)
This domain covers building, securing, deploying, and operating machine learning solutions on AWS. It emphasizes operational qualities such as performance, availability, scalability, resiliency, and fault tolerance, along with service selection and security practices.
Key Terms to Know
These terms are loaded from the shared terminology pack and appear across the question explanations.
- A/B testing: An experimental method that compares two versions of a system by exposing different groups to each version and measuring outcomes.
- Amazon Transcribe: An AWS managed service that converts speech audio into text for search, analysis, and downstream processing.
- ML model: The trained artifact produced by machine learning that is used to generate predictions or decisions from input data.
- ML operations: The practices and workflows used to deploy, monitor, maintain, and update machine learning systems in production.
- MLOps: A discipline that combines machine learning, software engineering, and operations to manage the ML lifecycle.
- business process improvement: The practice of optimizing operational workflows, sometimes by applying machine learning when data-driven prediction is useful.
- data ingestion pipeline: A workflow that collects and moves data from source systems into storage or processing systems.
- data preparation: The process of cleaning, transforming, and validating data before modeling.
- data sufficiency: The condition that enough relevant data is available to effectively train a machine learning model.
- dataset issue: A problem in collected data, such as missing values, bias, noise, or inconsistency, that can reduce model quality.
- defined intervals: Specified repeating time periods used to trigger scheduled jobs or workflows.
- deployment: The process of making a trained machine learning model available for production use.
- deterministic rule: A fixed and explicit rule that always produces the same output for the same input.
- inference: The process of using a trained model to generate outputs from new input data.
- job scheduling: The configuration of tasks to run automatically at defined times or recurring intervals.
- labeled data: Data examples that include the correct output or target value for training supervised models.
- machine learning: A method of building systems that learn patterns or relationships from data to make predictions or decisions.
- manual intervention: Human action required to start, manage, or correct a process that could otherwise be automated.
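Several of the terms above (deterministic rule, labeled data, ML model, inference) can be contrasted in a few lines. The sketch below is a toy illustration under stated assumptions: the transaction amounts, the 1000 cutoff, and the helper names `rule_flag`, `train_threshold`, and `predict` are all hypothetical, and the "model" is a single learned threshold rather than anything a real AWS service would train.

```python
# Deterministic rule: fixed and explicit, same output for the same input.
def rule_flag(amount):
    return amount > 1000  # hypothetical hand-written cutoff


# Labeled data: each example pairs an input with its correct target.
labeled = [(120, False), (300, False), (450, False), (900, True), (1500, True)]


def train_threshold(examples):
    """Pick the cut point that misclassifies the fewest labeled examples."""
    candidates = sorted(amount for amount, _ in examples)

    def errors(t):
        return sum((amount >= t) != label for amount, label in examples)

    return min(candidates, key=errors)


# The trained "ML model" in this toy case is just one number.
threshold = train_threshold(labeled)


def predict(amount):
    """Inference: apply the trained threshold to new input."""
    return amount >= threshold


print(threshold, rule_flag(950), predict(950))
```

The hand-written rule and the learned threshold disagree on an input of 950, which is the kind of gap that motivates using labeled data instead of a fixed rule when the right cutoff is not known in advance.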
Official Materials and Guidance
This page is built from the official AWS MLS-C01 exam guide, the shared syllabus, topic tree, terminology pack, free pack, and premium pack.
- AWS MLS-C01 Exam Guide