AIF-C01 Exam Glossary - 947 Terms

Search the terminology pack for AWS Certified AI Practitioner. Use these definitions with the study guide and practice questions.

Download App Study Guide Free Practice Exam

#

200,000-token context windows: The default context window size supported by Claude Opus and Sonnet according to the text.

A

A2I: Abbreviation for Amazon Augmented AI.
accuracy: The proportion of all predictions that the model got right. The text gives the formula as correctly classified emails divided by total emails and notes that accuracy can be misleading when classes are imbalanced.
accuracy threshold: A minimum accuracy level that an interpretable model should verify when auditability is the primary requirement. The text says the interpretable model should be chosen and its accuracy threshold verified.
adapter-matrix training: The training of small LoRA adapter matrices on top of a frozen, quantized base model as part of QLoRA.
Adversarial fine-tuning data: Fine-tuning data supplied with malicious intent; the text gives it as the example for the Poisoning risk.
Agent evaluation: The evaluation of whether an AI agent completes assigned tasks accurately, efficiently, and at acceptable cost.
agent loop: The iterative process in which each iteration consumes tokens; agentic systems are more complex, more expensive to run, and harder to audit than single-shot calls.
AgentCore: An AWS service used to deploy GenAI applications and agentic workflows. The text says deployment through AgentCore can help complete the build in days or weeks, and that AgentCore Policy extends the responsibility model for agentic workflows.
AgentCore authorization policies: The authorization layer within Amazon Bedrock AgentCore that defines what actions a running agent is permitted to take, including which tools it can invoke, which knowledge bases it can query, which external endpoints it can call, and whether it can take irreversible actions such as deleting records or submitting forms.
AgentCore Browser: An AgentCore component that enables web navigation.
AgentCore Code Interpreter: An AgentCore component that provides a sandboxed Python execution environment, allowing an agent to run dynamically generated code such as refund calculation logic safely at runtime.
AgentCore Evaluations: An AgentCore component that runs automated testing.
AgentCore Gateway: An AgentCore component that manages MCP-based tool connections.
AgentCore Identity: An AgentCore component that manages OAuth 2.0 delegated authentication so an agent can authenticate to a company's CRM on behalf of each user without storing credentials in agent code.
AgentCore Memory: An AgentCore component that provides a persistent store for conversation history and user context across sessions, so returning users receive continuity without re-explaining their situation.
AgentCore Observability: An AgentCore component that handles distributed tracing.
AgentCore Policy: A feature that extends the responsibility model to agentic workflows by giving operators formal control over what tools agents are permitted to invoke. The text says it enforces human-readable policies that can be audited independently of the agent code.
AgentCore Runtime: An AgentCore component that serves as the execution environment for the agent itself.
AgentCore-deployed agents: Agents deployed with AgentCore for which Amazon Bedrock provides evaluation capabilities.
agentic AI: A newer AI term identified as central to business conversations and included among the basic AI and ML terms covered in the domain. The text does not further define it.
agentic AI assistant: An AI assistant that can perform actions on behalf of employees, including booking meeting rooms, sending calendar invites, and updating a project tracking system.
agentic AI deployment: An AI deployment in which an agent makes external calls; the text says the key review question is whether every external call is mediated through a managed identity rather than a hard-coded credential.
agentic patterns: Patterns related to agentic behavior in generative systems, included among the conceptual machinery needed to think clearly about generative systems. The text does not further define them.
agentic workflows: Multi-step workflows coordinated by Agents for Amazon Bedrock in which the model can call tools such as external APIs, Knowledge Bases, and AWS Lambda functions.
Agents for Amazon Bedrock: A Bedrock capability that coordinates multi-step agentic workflows by allowing the model to call external APIs, query Knowledge Bases, and execute AWS Lambda functions as tools within a single orchestrated session.
AI governance: The governance layer that the compliance services in the text support for AI workloads.
AI Governance Protocol: A repeating cycle that connects policies, frameworks, reviews, transparency standards, and training into an AI governance program. It is described as addressing distinct accountability gaps and helping the program withstand external scrutiny and remain effective as AI systems evolve.
AI literacy training: Annual training required for all employees that covers what AI is, how the organization's AI-use policy applies to their work, and what to do when they encounter an AI output they suspect is incorrect or harmful.
AI/ML lifecycle: The end-to-end lifecycle of AI and machine learning described in the text as moving data from collection through training, deployment, and monitoring, with AWS services participating at each stage.
AI/ML pipeline: The sequence of steps a team executes to go from raw data to a working model. Each stage receives an artifact from the previous step, transforms it, and passes the result forward.
AIF-C01: The exam guide version referenced in the note about MemoryDB being removed from the vector storage options in version 1.1.
Amazon A2I: Amazon Augmented AI; a service used for ongoing human review of model outputs in production. It routes specific inference outputs to human reviewers when the model’s confidence falls below a threshold or when the task type requires human oversight, and reviewer corrections can feed into a future fine-tuning cycle.
Amazon Athena: A service that can discover datasets registered in AWS Glue Data Catalog.
Amazon Augmented AI: A service used for ongoing human review of model outputs in production. It routes specific inference outputs to human reviewers when the model’s confidence falls below a threshold or when the task type requires human oversight, and reviewer corrections can feed into a future fine-tuning cycle.
Amazon Augmented AI (A2I): An AWS service used for human review of model outputs, addressing ongoing quality assurance and feedback collection.
Amazon Augmented AI (Amazon A2I): An AWS tool used to detect and monitor problems such as bias and related issues.
Amazon Aurora: An AWS vector-storage option included in Task 3.1.
Amazon Aurora PostgreSQL: The PostgreSQL-compatible Aurora database environment referenced for storing embeddings and combining relational and vector queries with pgvector.
Amazon Bedrock: An AWS service mapped to stages of the AI/ML development lifecycle in Task 1.3.
Amazon Bedrock AgentCore: AWS infrastructure for agentic AI that provides multi-step orchestration, memory, and tool use.
Amazon Bedrock AgentCore Identity: A new V1.1 addition identified as part of the AWS services and features used to secure AI systems.
Amazon Bedrock Agents: AWS infrastructure for agentic AI that provides multi-step orchestration, memory, and tool use.
Amazon Bedrock custom models: The AWS entry point listed for continuous pre-training.
Amazon Bedrock Endpoints: A managed FM inference API used for deployment of foundation models.
Amazon Bedrock Fine-Tuning: A service used for FM adaptation through fine-tuning, specifically supervised fine-tuning on private data.
Amazon Bedrock Flows: A Bedrock capability that provides a visual workflow builder for chaining prompts and sub-agents into structured pipelines without writing orchestration code.
Amazon Bedrock Guardrails: A Bedrock feature referenced as guardrails, indicating controls used within Amazon Bedrock.
Amazon Bedrock Guardrails content filters: The platform-level content filtering used as the primary mitigation for jailbreaking.
Amazon Bedrock Guardrails contextual grounding check: A built-in feature that evaluates RAG responses for grounding and relevance by computing a grounding score against retrieved context documents and a relevance score against the original query. Minimum thresholds are configured for each score, and responses below threshold are blocked or flagged; when configured to BLOCK, it acts as a real-time enforcement gate without a separate verification step.
Amazon Bedrock Guardrails sensitive-information filters: Guardrails filters that detect and redact PII from model responses before they reach the user.
Amazon Bedrock Knowledge Bases: A service used for FM adaptation through prompting and RAG, described as supporting vector-store-backed RAG pipelines.
Amazon Bedrock model distillation: The AWS managed workflow for distillation, where an organization designates a Bedrock model as the teacher, specifies the target prompt distribution, and produces a smaller fine-tuned student model.
Amazon Bedrock Model Evaluation: A Bedrock capability that runs automated and human-evaluated benchmark jobs scoring model responses on accuracy, robustness, toxicity, and task-specific metrics so teams can compare models before committing to one.
Amazon Bedrock Model Evaluations: An AWS tool used to document and surface explainability.
Amazon Bedrock model invocation logging: Logging that captures the input prompt, the model response, and metadata for every inference call, supporting reconstruction of what information was provided to users.
Amazon Bedrock Pricing: The pricing information for Amazon Bedrock.
Amazon Bedrock Prompt Management: A V1.1 material included in Task 3.2 on prompt engineering.
Amazon Bedrock's customization APIs: APIs through which fine-tuning can be performed in Amazon Bedrock, used in the text to close accuracy gaps for narrowly defined tasks.
Amazon CloudWatch: A metrics service whose metrics back auto-scaling policies for SageMaker inference endpoints.
Amazon CloudWatch Logs: A logging service that retains application-level logs from Lambda functions, ECS containers, and SageMaker inference endpoints, including runtime errors and output patterns that may reveal unintended data exposure.
Amazon CloudWatch Metrics: A service that captures operational signals used for monitoring policy compliance and anomalous behavior.
Amazon Comprehend: An AWS example service listed for text data.
Amazon Comprehend entity recognition: The AWS tool listed for PII detection and removal in data preparation.
Amazon EC2: An AWS service used for self-hosted APIs by hosting the model container on GPU instances for full flexibility.
Amazon EC2 instances: Compute instances scanned by Amazon Inspector for known software vulnerabilities and unintended network exposures.
Amazon ECR: The Amazon container registry whose container images are continuously scanned by Amazon Inspector for known vulnerabilities.
Amazon ECS: An AWS service whose tasks can have IAM roles attached; those roles inherit restrictions that limit access to Bedrock resources.
Amazon EKS: An AWS service used for self-hosted APIs by deploying the model as a Kubernetes workload for container orchestration at scale.
Amazon Elastic Container Registry (ECR): The registry where container images are stored and scanned by Amazon Inspector.
Amazon Forecast: A managed forecasting service that automatically selects among statistical and deep learning algorithms, computes quantile forecasts such as p50 and p90 demand levels, and writes results to Amazon S3 for downstream consumption.
Amazon Fraud Detector: The AWS managed service that packages the fraud-detection pattern, with pre-built models for online fraud, transaction fraud, and account takeover.
Amazon GuardDuty: A threat-detection service used for AI workloads to identify anomalous behavior patterns that may indicate compromise.
Amazon Inspector: An AWS service listed as supporting compliance in the governance side of the domain.
Amazon Kendra: A managed enterprise search service that underpins many knowledge base implementations, indexing document repositories and returning relevant passages in response to natural-language queries. It serves as the retrieval layer in knowledge base architectures.
Amazon Lex: A service that builds conversational interfaces that understand natural language intents and manage dialog state.
Amazon Macie: An AWS service identified as one of the services and features used to secure AI systems.
Amazon Mechanical Turk: A service used for high-volume annotation. The text says it can supply the human reviewer workforce for human-in-the-loop evaluation patterns.
Amazon Neptune: An AWS vector-storage option included in Task 3.1.
Amazon Neptune Analytics: An AWS service mentioned as supporting vector search.
Amazon Nova: An AWS model family or service referenced for model capabilities and model selection.
Amazon Nova Lite: A fast multimodal model family tier with a 300K-token context window, used for cost-sensitive production.
Amazon Nova Micro: A model designed for high-volume, low-cost text tasks where affordability is the primary constraint, and also described as targeting the lowest-latency tier in the Amazon Nova family.
Amazon Nova model capabilities: The capabilities of Amazon Nova models, referenced in the Amazon Nova User Guide.
Amazon Nova Premier: The most capable, most expensive, and highest-latency option in the Amazon Nova family. The text says it is appropriate for complex multi-step reasoning tasks rather than high-volume conversational support.
Amazon Nova Pro: A model in the Amazon Nova family that supports multimodal input, according to the text.
Amazon Nova User Guide: The user guide for Amazon Nova, cited for model capabilities and model selection.
Amazon OpenSearch Service: An AWS vector-storage option included in Task 3.1.
Amazon Personalize: An AWS service referenced in the context of measuring recommendation effectiveness.
Amazon Polly: A service that converts text to lifelike speech for voice-channel responses.
Amazon Q: An AWS service mapped to stages of the AI/ML development lifecycle in Task 1.3.
Amazon Q Business: An AWS service specifically designed for enterprise conversational assistants that answer questions grounded in an organization's own documents and data sources. It includes built-in connectors for SharePoint, Confluence, S3, and other enterprise systems, handles chunking, indexing, and retrieval automatically, and provides a managed web interface and API without requiring infrastructure management.
Amazon Q Developer: A code-completion and chat-overlay service that works inside existing IDEs. It is aimed at developers using VS Code and JetBrains, is in scope for AIF-C01 v1.1, and is described as superseded for full IDE workflows by Kiro.
Amazon Quick: An AWS service mapped to stages of the AI/ML development lifecycle in Task 1.3.
Amazon QuickSight: The BI-facing product that was rebranded in 2025 under the Amazon Quick name.
Amazon RDS: An AWS database service referenced for PostgreSQL, pgvector extension support, and installation of pgvector.
Amazon RDS for PostgreSQL: An AWS vector-storage option included in Task 3.1.
Amazon Redshift: The data warehouse that Amazon Quick connects to natively for natural-language question answering and chart generation.
Amazon Rekognition: An AWS example service listed for image data.
Amazon Resource Name: The unique identifier format used for a specific version of a prompt when deployed through Bedrock Prompt Management.
Amazon S3: The AWS storage service where data is stored for Amazon SageMaker AI Batch Transform batch inferencing jobs, and where SageMaker AI Async Inference writes outputs for retrieval.
Amazon S3 bucket: The destination to which CloudTrail delivers event records within minutes of an API call.
Amazon S3 buckets: A supported data source for Amazon Bedrock Knowledge Bases and described as the most common choice for document archives.
Amazon S3 integrity checking: The AWS mechanism used for checksum verification to detect data corruption.
Amazon S3 Lifecycle policies: Policies that automate the transition of objects through storage tiers from Standard to Standard-IA to Glacier to Glacier Deep Archive on a defined schedule and enforce deletion at end-of-retention.
Amazon S3 Object Lock: The AWS mechanism used for object lock and versioning to address training data tampering.
Amazon SageMaker: The AWS service referenced as the platform for Clarify. The text uses it in the name Amazon SageMaker Clarify, which addresses bias in classical machine learning models.
Amazon SageMaker AI: An AWS service mapped to stages of the AI/ML development lifecycle in Task 1.3.
Amazon SageMaker AI Developer Guide: The developer guide for Amazon SageMaker AI, cited as the source for Amazon SageMaker Clarify.
Amazon SageMaker Batch Transform: A batch inference service that reads a dataset from Amazon S3, passes each record through the model, and writes results back to S3. It is suitable for tasks like monthly risk scoring across an entire customer portfolio.
Amazon SageMaker Clarify: An AWS service that helps teams detect and measure bias in training data and trained models.
Amazon SageMaker Data Wrangler: An AWS mechanism used for data quality assessment to address data poisoning and model degradation.
Amazon SageMaker Experiments: A service that records the hyperparameters, metrics, and artifact versions associated with each training run, producing a searchable history for reproducibility and comparison.
Amazon SageMaker Ground Truth: An AWS service that helps teams create labeled datasets by combining automated labeling with human review, reducing the time and cost of annotation at scale.
Amazon SageMaker Ground Truth Plus: An AWS tool used for managed labeling workforce support, addressing annotator sourcing and management.
Amazon SageMaker JumpStart: An AWS service in the portfolio for building generative applications.
Amazon SageMaker Model Card: A structured model-documentation capability that records a model’s intended use, training data provenance, evaluation results across subgroups, known limitations, ethical considerations, and usage restrictions in a standardized, human-readable format.
Amazon SageMaker Model Cards: An AWS tool used to document and surface explainability.
Amazon SageMaker Model Monitor: An AWS service designed to detect data drift by computing a statistical baseline from the training dataset, continuously sampling inference inputs, and raising CloudWatch alarms when the production distribution diverges beyond a configured threshold.
Amazon SageMaker notebook instances: Notebook instances whose access can be controlled by a Config rule requiring IAM-based access to prevent direct internet exposure.
Amazon SageMaker Pipelines: The native MLOps orchestration service. It defines pipeline steps in code, stores each step's output as a versioned artifact, integrates with the SageMaker model registry, and supports trigger-based execution.
Amazon Titan: An example of an LLM that underpins modern generative AI applications.
Amazon Titan Embeddings: An embedding model exposed through Amazon Bedrock that accepts text and returns a floating-point vector.
Amazon Transcribe: An AWS service that surfaces NLP capabilities for speech-to-text conversion.
Amazon Translate: An AWS service that surfaces NLP capabilities for language translation.
Amazon VPC: The AWS virtual private cloud integration used for network isolation. The text says organizations can route Bedrock API calls over a VPC endpoint using AWS PrivateLink so inference traffic never traverses the public internet.
anchoring bias: A bias removed by side-by-side comparison because the reviewer does not know which model produced each output. The text says the comparison design helps eliminate this bias.
ANN: The acronym for approximate nearest-neighbor, an algorithm type used to search embeddings quickly in vector databases.
annual AI literacy training: Training required for all employees that covers what AI is, how the organization’s AI-use policy applies to their work, and what to do when they encounter an AI output they suspect is incorrect or harmful.
Anomaly detection: A technique that identifies data points that deviate significantly from the learned distribution of normal behavior; the text also notes that records not fitting any cluster tightly may be flagged as unusual.
anomaly scores: The unsupervised proxy for failure risk produced by anomaly detection.
Anthropic Claude: An example of an LLM that underpins modern generative AI applications.
Anthropic Claude Haiku: A fast multimodal model family tier with 200K-token context, used for high-volume customer interactions.
Anthropic Claude Opus: A flagship multimodal model family tier with 200K standard context and 1M context with beta header, used for complex reasoning and legal/medical tasks.
API: An interface through which Amazon Q Business exposes the assistant without requiring infrastructure management.
API audit logging: The recording of API activity, as done by AWS CloudTrail, to prove who did what and when.
API-level prompt separation: A mitigation for Hijacking / Injection that keeps system instructions separate from user content at the API level.
APIs: The interface through which AWS managed AI services provide pre-trained capabilities.
ARPU: Average revenue per user.
Artifact: A compliance-reporting service mentioned alongside cross-region inference in the final answer choices.
artifact versions: Versioned outputs associated with each training run that Amazon SageMaker Experiments records for reproducibility and comparison.
Artificial Intelligence Risk Management Framework: Generative AI: The title of NIST AI 600-1, a framework focused on managing risks related to generative AI.
attention weight: The relative attention given to information in the context; in lost-in-the-middle behavior, middle content receives proportionally less attention weight than content at the beginning or end.
attention weights: Values that Amazon SageMaker AI can surface as part of explainability tooling; the text says they are imperfect approximations rather than true causal explanations.
AUC: A metric that the AIF-C01 v1.1 exam guide replaced with precision and recall in the v1.1 list; the text does not define it further.
audit and compliance review: The review stage in the data origin documentation flow that follows the SageMaker Model Card and is intended to be followed by regulators or auditors from raw source to deployed model.
audit cycle: The recurring period before which organizations would otherwise manually extract screenshots and log excerpts, but which Audit Manager supports through continuous evidence collection.
audit evidence: Evidence that should be maintained in the governance repository alongside policy documents and training completion records.
audit logging: Logging used so reviewers can trace which training data influenced a response. In the text, it is presented as a way to support interpretability, not as a fix for hallucination.
Audit trail and logging requirements: The security and regulatory need to maintain a complete audit record of AI interactions using AWS CloudTrail, Bedrock model invocation logging, and Amazon CloudWatch.
audit trail of logic: A benefit of chain-of-thought prompting in which the reasoning steps are visible, making it useful when intermediate logic matters as much as the final answer.
audit trails: Records for AI interactions that are part of the security and privacy considerations specific to AI.
audit-logged: Tracked in an audit log. The text says override mechanisms should be prominent, low-friction, and audit-logged.
audit-readiness: An ongoing compliance state maintained through continuous monitoring, evidence collection, and the ability to produce audit evidence for AI workloads.
auditability: The ability for every individual decision to be completely auditable in environments where regulatory accountability requires it.
auditable: A property of source documents in RAG meaning the retrieved chunks are visible in the prompt, allowing a developer to inspect exactly which documents influenced the response.
auditable asset: An AI system that has been converted from a black box into something that can be reviewed through lineage, catalog, and model-card records.
auditable data origins: A practice in the AI workload posture that makes the origins of data traceable and reviewable.
auditable output: The output generated by the traditional ML model in the hybrid pattern so that it can be reviewed and traced.
auditable record: The record provided by the Model Card of what was known at the time of deployment.
auditor: The person who can review the full prompt history at any time.
augment the training dataset: To add additional examples to the training dataset. The text says this is the most direct fix because it specifically adds examples covering the underrepresented edge case types and addresses the distribution mismatch.
automatic evaluation: A Bedrock Model Evaluation job type that runs the selected model against a built-in or custom prompt dataset and scores responses using metrics such as accuracy, robustness, and toxicity without requiring human reviewers.
AWS: The cloud platform that provides services such as Amazon Rekognition, Amazon Comprehend, Amazon Translate, Amazon Transcribe, Amazon Bedrock, Amazon SageMaker AI, and Amazon SageMaker Clarify.
AWS AI Services: A chain of AWS services used in the conceptual voice channel flow to process voice input, transcribe the audio, interpret intent, and synthesize a spoken reply. The text specifically describes the handoff sequence as Transcribe to Lex to Comprehend to Polly.
AWS AI/ML service: An AWS service included in the text as an example of artificial intelligence.
AWS API calls: The AWS calls that Lambda functions can make when permitted by their IAM roles.
AWS Artifact: An AWS service listed as supporting compliance in the governance side of the domain.
AWS Audit Manager: An AWS service listed as supporting compliance in the governance side of the domain.
AWS Audit Manager with the NIST AI RMF framework: A combination used to collect compliance evidence and frame a governance program; the text says it does not detect drift and instead collects evidence about controls, not model input statistics.
AWS Backup: A centralized service for scheduling, monitoring, and enforcing backup and retention policies across S3, RDS, DynamoDB, EFS, and other services, with immutable recovery points that cannot be deleted before a configured retention period expires.
AWS Certification: An AWS certification program referenced as the source of the AI Practitioner (AIF-C01) Exam Guide v1.1, Domain 3.
AWS Certification Exam Guide AIF-C01 v1.1: An AWS certification exam guide identified as AIF-C01 version 1.1, referenced as the source for Task Statement 2.2.
AWS CloudTrail: An AWS service listed as supporting compliance in the governance side of the domain.
AWS Config: An AWS service listed as supporting compliance in the governance side of the domain.
AWS Config conformance packs: A configuration-checking mechanism that can check rules such as GDPR-related configuration, but without underlying S3 and Bedrock region controls it has nothing to check; the text says technical enforcement comes first.
AWS Config rule: A technical control that can be traced to a policy requirement, such as checking encryption status on relevant S3 buckets and SageMaker training volumes.
AWS Customer Carbon Footprint Tool: An AWS tool that helps organizations measure and track carbon emissions associated with their AWS usage. It breaks down emissions by service, region, and time period.
AWS Foundational Security Best Practices: A standard example to which AWS Config conformance packs can be aligned.
AWS Generative AI Security Scoping Matrix: A governance tool included among the governance protocols and also specifically named as part of Task 5.2.
AWS Glue: An AWS service used in the AI/ML pipeline for data storage and preparation, where it performs ETL and cataloging.
AWS Glue Data Catalog: A service that records dataset metadata at each stage, including table schemas, data types, last-modified timestamps, and column-level descriptions.
AWS Glue data quality rules: An AWS mechanism used for data quality assessment to address data poisoning and model degradation.
AWS Glue ETL: ETL jobs that can discover datasets registered in AWS Glue Data Catalog.
AWS IDE ecosystem: The ecosystem in which Kiro is the primary AI-assisted development tool after replacing Amazon Q Developer.
AWS Identity and Access Management: The AWS service used to control access to prompt resources and prompt ARNs through policies.
AWS Identity and Access Management (IAM): The AWS access-control service that determines which identities, roles, and services may call which models. The text says it can be applied with specific model ARNs and specific Bedrock actions such as bedrock:InvokeModel and bedrock:InvokeAgent.
AWS infrastructure: The infrastructure and services on AWS used for building generative AI applications, covered in Task 2.3.
AWS Key Management Service: The AWS service used with customer-managed keys to control encryption keys for data at rest in Amazon Bedrock managed features.
AWS Key Management Service (AWS KMS): The AWS service used to encrypt data at rest for Amazon Bedrock. The text says data sent to Amazon Bedrock is encrypted at rest using AWS KMS.
AWS Key Management Service (KMS): The AWS service that manages the cryptographic keys used for encryption at rest. The text notes that Bedrock Knowledge Bases and custom model fine-tuning jobs support customer-managed KMS keys, allowing the security team to control key rotation and revoke access at any time.
AWS KMS: The AWS service used to encrypt data at rest for Amazon Bedrock. The text says data sent to Amazon Bedrock is encrypted at rest using AWS KMS.
AWS Lake Formation: An AWS service that extends the Glue Data Catalog with fine-grained access control, allowing organizations to grant column-level, row-level, and cell-level permissions on datasets stored in Amazon S3.
AWS Lambda: A service whose functions can be executed as tools within Agents for Amazon Bedrock.
AWS Lambda functions: Functions scanned by Amazon Inspector for known software vulnerabilities and unintended network exposures.
AWS Management Console: An AWS interface through which Model Cards can be published.
AWS model distillation: A capability within Amazon Bedrock that transfers reasoning behavior from a large foundation model into a smaller, faster, cheaper model tuned to a specific task.
AWS Network Firewall: A network security service that, along with NAT Gateways, inspects and restricts outbound traffic from the AI workload to prevent unexpected external calls.
AWS portfolio: The set of AWS offerings across which the basic AI and ML terms are used. The text says the domain covers terms used across the AWS portfolio.
AWS PrivateLink: An AWS service identified as one of the services and features used to secure AI systems.
AWS regions: AWS deployment regions that differ in energy mix. The text says regions closer to renewable energy sources have lower carbon intensity per compute hour.
AWS Secrets Manager: The AWS service that Bedrock AgentCore Identity integrates with to rotate credentials automatically without requiring the agent definition to be updated.
AWS Security Hub: A centralized dashboard that receives findings from Inspector and is used by auditors alongside other compliance signals; it also aggregates findings from Config, Inspector, and Macie.
AWS shared responsibility model: The model that defines the boundary between AWS responsibilities and customer responsibilities for securing AI services such as Amazon Bedrock.
AWS SOC 2 Type II report: A compliance report on AWS's controls, produced by an independent auditor, stored in AWS Artifact, and downloadable by authorized AWS customers under NDA to present as evidence of platform-level security and availability standards.
AWS Trusted Advisor: An AWS service listed as supporting compliance in the governance side of the domain.
AWS Well-Architected Framework: An AWS framework cited in relation to the Machine Learning Lens and performance pillar.
AWS's own compliance certifications and agreements: Compliance certifications and agreements that AWS customers can download from AWS Artifact to demonstrate AWS's controls as the infrastructure provider.

B

Balanced datasets: A dataset characteristic meaning no class label or demographic group is so overrepresented that the model learns to predict that class as a shortcut rather than learning the underlying signal. The text gives an example of an unbalanced fraud dataset with 999 legitimate transactions for every 1 fraudulent transaction, where a model can reach 99.9% accuracy by predicting legitimate for everything while failing at the actual task.
base model compliance: The compliance of the base model that AWS manages in Amazon Bedrock.
Batch inference: Asynchronous processing of many requests together; in Amazon Bedrock it is used instead of real-time submission, with responses arriving minutes to hours after submission, and it typically provides a 50% discount versus on-demand pricing.
batch ML pipelines: Classical machine learning pipelines optimized for throughput rather than speed. The text says they process thousands of records but may take minutes per run.
Bedrock AgentCore: The Bedrock capability referenced as the platform that includes AgentCore Identity and AgentCore authorization policies for agent identity and permissions.
Bedrock AgentCore Identity: An agent credential management service that issues and rotates OAuth tokens and API keys for agent-initiated calls to external services.
Bedrock Agents: Agents for which Amazon Bedrock provides evaluation capabilities.
Bedrock API: The API whose calls can be routed through a VPC endpoint by AWS PrivateLink so they do not traverse the public internet.
Bedrock console: The interface through which Amazon Bedrock offers fine-tuning as part of its customization depth.
Bedrock custom model artifacts: Custom model artifacts for Bedrock whose encryption at rest can be enforced by an AWS Config rule.
Bedrock endpoints: Endpoints to which unusual API call patterns may be detected by GuardDuty in AI workloads.
Bedrock Flows: The workflow system that can reference a prompt ARN in flow node configuration to embed versioned prompts in automated pipelines.
Bedrock Guardrails: A component that sits between the application and the model, inspecting both the inbound prompt and the outbound response before either is passed through. It can approve or block prompts and apply output filtering.
Bedrock Model Evaluation: A service used in evaluation for performance and quality scoring.
Bedrock Model Evaluations: An evaluation tool used in the model lifecycle to validate output safety before deployment. The text places it after evaluations testing and before deployment in the explainability toolchain.
Bedrock model invocation logging: Logging for Amazon Bedrock that can be verified by an AWS Config rule to ensure every model request is captured for audit and analysis.
Bedrock-hosted foundation models: Foundation models hosted in Amazon Bedrock. The text says Amazon Bedrock Model Evaluation is the default path for evaluating them.
bedrock:*: A wildcard IAM permission granting all Bedrock actions on all resources. The text identifies this as overprivileged and recommends replacing it with a narrower permission.
bedrock:InvokeAgent: A specific Bedrock action mentioned as one of the permissions controlled by IAM for calling agents.
bedrock:InvokeModel: A specific Bedrock action mentioned as one of the permissions controlled by IAM for calling models.
benchmark datasets: Standardized collections of prompts and reference answers used to measure a model's performance across specific capability dimensions.
benchmark jobs: Jobs run by Amazon Bedrock Model Evaluation to score model responses.
benchmark saturation: A limitation of benchmarks in which models trained after a benchmark is published can inadvertently absorb the benchmark's answers through web-crawled training data, inflating scores beyond genuine capability improvements.
benchmark sets: Task-specific sets used for automated accuracy assessments before and after model or prompt changes.
BERT encoder: The component that must be run on both the candidate and the reference for BERTScore, adding compute cost and latency.
BERTScore: A technical evaluation method.
BI: Abbreviation used in the text for business intelligence in the phrase "BI + GenAI for business users."
BI dashboards: The dashboards that Amazon Quick returns visualizations for after translating and executing natural-language questions as SQL queries.
Bias: Systematic errors in a model's outputs that arise from flawed training data, flawed algorithm design, or flawed problem framing. It is a core concern in responsible AI governance.
bias assessments: Documentation that must be included in model cards for all production models under internal transparency.
Bias drift: A type of drift tracked by Amazon SageMaker Model Monitor that refers to changes in the bias metrics computed by SageMaker Clarify, indicating that the model is becoming more or less biased over time as the real-world distribution shifts.
bias metrics: Metrics computed by Amazon SageMaker Clarify from training data and model predictions to assess bias, such as the difference in positive-prediction rates between demographic groups.
bias monitoring: Monitoring of bias, which the compliance team wants to know about in AI project review.
Biased Output: A legal risk category of generative AI identified in the text, mitigated by fairness testing.
biased outputs: A legal and responsible-AI risk of working with generative AI mentioned in the text.
Bilingual Evaluation Understudy: The expansion of BLEU.
binary classification: A classification problem in which the category set has two members.
BLEU: A technical evaluation method.
BLEU scores: Word-level precision scores against a reference text; the text says they would not capture whether clinical information is correct.
bucket-level IAM policies: A broader access-control approach contrasted with column-level need-to-know access; the text says column-level control is substantially more precise than bucket-level IAM policies.
built-in evaluation jobs: A Bedrock-provided evaluation feature aligned with agent metrics.
built-in safety guardrails: The model's safety protections that jailbreaking attempts to bypass.

C

chain-of-thought: A standard prompt technique listed with zero-shot, single-shot, and few-shot.
Chain-of-thought prompting: A prompting technique that instructs the model to reason through a problem step by step before producing the final answer. It is especially useful for arithmetic, multi-step reasoning, and tasks where intermediate logic matters as much as the final answer. The text also notes that it can make errors visible by exposing where the reasoning went off course.
change-management process: A formal process for prompts in which a prompt author creates a new version, a reviewer evaluates it against the test dataset, a release manager promotes it to production, and an auditor reviews the full history.
Choosing the right Amazon Nova model: A guide topic about selecting the appropriate Amazon Nova model.
chunking: A retrieval preparation step that Amazon Q Business handles automatically when working with enterprise documents.
chunking strategies: Strategies that may be avoided when a model has a long enough context window to hold the needed content in one prompt.
chunking strategy: A strategy that splits a long document into pieces and processes them in chunks. It costs fewer tokens per chunk but requires additional orchestration logic and may produce less coherent responses.
CI/CD: Continuous integration and continuous delivery; in the text, SageMaker Pipelines provides ML-native CI/CD for automating the pipeline end to end.
citation accuracy: A generation-quality metric for RAG that asks whether the cited sources actually contain the information attributed to them.
claims-processing AI application: The AI application whose prompts are being maintained in Bedrock Prompt Management.
classical machine learning: A contrasting approach used to explain that generative systems differ in cost structure and failure modes from traditional machine learning.
classical machine learning models: Machine learning models that are contrasted with generative AI in the text. Amazon SageMaker Clarify addresses bias in these models rather than in generative AI.
Classical ML inference: Inference that is typically priced per-prediction or per-endpoint-hour.
classification: A task for which Cohere Command models are optimized.
classification tags: Tags such as "contains-PII" or "licensed-for-internal-use" that data owners attach to tables and columns so access policies can honor them automatically.
Claude 4.x generation: The generation of Anthropic Claude models named in the text, consisting of Haiku 4.x, Sonnet 4.x, and Opus 4.x.
Claude Haiku 4.x: A representative Anthropic Claude model described as having low latency, low cost, and being best for customer support.
Claude models: The models used most widely in Amazon Bedrock for text tasks, for which Anthropic's guidance recommends XML-style tags to delimit sections.
Claude Opus: A large-parameter commercial managed model mentioned as an example of a managed API option for foundation model use.
CloudTrail: An audit-layer service listed under Audit and used in the text to log API calls and detect misuse after the fact.
CloudTrail data events: CloudTrail records that capture reads and writes on S3 training buckets and model artifact buckets, providing access logs at the object level.
CloudTrail Lake: CloudTrail's managed analytics layer that allows SQL-based queries against event history without requiring an external log analytics system.
CloudWatch: An AWS service whose metrics back auto-scaling policies for SageMaker inference endpoints and whose alarms can automatically initiate retraining runs when raised by Model Monitor.
CloudWatch alarm: An alarm raised by Model Monitor that can automatically initiate a retraining run through SageMaker Pipelines.
CloudWatch alarms: Alerts raised by Amazon SageMaker Model Monitor when drift exceeds configurable thresholds.
CloudWatch Logs group: A destination that can receive the full prompt and response for every inference call when Bedrock model invocation logging is enabled.
CloudWatch Logs Insights: A tool used to build custom detection rules from Bedrock invocation logging data.
Clustering: An unsupervised technique that partitions the customer base into groups based on the similarity of purchase patterns, which can then be reviewed by an analyst and mapped to business segments.
CLV: Customer lifetime value.
CMK: The acronym used for customer-managed keys.
Cohere Command: A model family in Amazon Bedrock optimized for retrieval, classification, and enterprise search.
Cohere Embed: An embedding model exposed through Amazon Bedrock that accepts text and returns a floating-point vector.
Cohere's Command and Embed families: Commercial foundation model families accessible through AWS.
coherent prose: A type of output that favors foundation models because they can generate it, unlike traditional ML models.
Commercial foundation models: Foundation models offered by AI companies as a managed API service. The consuming organization pays per token processed rather than managing infrastructure, and the vendor continuously updates the model, though the organization has less visibility into training data and weights.
Common Vulnerabilities and Exposures (CVE): The database with which Amazon Inspector correlates findings to identify known software vulnerabilities.
Compliance: A factor covering regulatory and industry requirements. The text gives HIPAA-governed healthcare applications and financial applications with data-egress restrictions as examples of compliance constraints that narrow the eligible model list.
Compliance control: A comparison factor in which AWS manages base model compliance in Amazon Bedrock, while the organization controls the full stack in Amazon SageMaker AI.
compliance dashboard: A dashboard that ingests Trusted Advisor results alongside Config, Inspector, and Security Hub findings.
compliance lifecycle: The ongoing compliance process maintained through continuous monitoring, evidence collection, and audit-readiness rather than a single deployment-time checkpoint.
compliance officers: One of the reviewer/approver roles included in role-specific training.
compliance requirement: A mandatory condition in regulated industries; here, immutable prompt versions are described as a compliance requirement because outputs may be subject to audit.
compliance team: The team that wants to know about bias monitoring in AI project review.
compromised IAM role: An IAM role that has been compromised and may perform lateral movement or unusual API call patterns.
Computer vision: The analysis and interpretation of image and video data. The text says it is not the correct category for a system working with text, APIs, and knowledge bases rather than pixel data.
Computer vision (CV): The branch of AI that enables machines to interpret images and video. It can classify objects in a photo, detect defects on a manufacturing line, or count vehicles in a parking lot.
Config rule: A rule in AWS Config that can require S3 block-public-access settings and provide historical compliance data.
configuration recording: The process performed by AWS Config of tracking the state of AWS resources and recording new configurations when resources change.
configured threshold: The cutoff used by the Guardrails contextual grounding check; responses that fall below it are blocked.
contains-PII: An example of a classification tag used with AWS Lake Formation to mark data that contains PII.
context window: The amount of context a model can hold; its size is one of the elements used to describe model complexity.
Context window size: The maximum number of tokens the model can read in a single call, counting both the input and the output.
context windows: A model context concept used with conversation history in Amazon Bedrock.
continued pre-training: A training approach referenced alongside fine-tuning in Amazon Bedrock.
continuous pre-training: A process the text distinguishes from training, fine-tuning, and distillation.
Control Tower: An AWS governance service whose guardrails scope includes Amazon Bedrock. The text says organizations using Control Tower can apply service control policies to restrict which accounts may use which models.
conversational prompts: Prompts that foundation models accept and use to adjust their behavior based on instructions without retraining.
Converse: An Amazon Bedrock API used to send prompts; the system prompt field maps to role-and-constraints components and the user message carries task, context, format, and input.
Converse API: A cross-model compatibility layer through which Guardrails applies uniformly, including for models hosted outside Amazon Bedrock.
convolutional neural network: A neural network mentioned as an example of a model that can classify chest X-rays and is therefore ML, deep learning, and AI.
Convolutional neural networks: A typical machine learning approach for image data.
cost and latency profile: The performance profile of the distilled model, which is closer to a traditional ML model because it is smaller, faster, and cheaper.
cost per inference: The cost of a single model inference, which grows with context length under token-based pricing.
coverage gap: The gap in model performance for underrepresented users that inclusivity addresses by ensuring training and testing across the full population served.
CreateTrainingJob: A SageMaker API call recorded by CloudTrail.
creation metadata: Metadata retained with each prompt version, including creation timestamp, model association, and inference parameters.
CRFM: An acronym in the citation 'Stanford CRFM' for the cited work on foundation models.
CRM: An example of an enterprise SaaS application in Scope 2; the text says it can summarize account notes using built-in GenAI features.
cross-region inference: An Amazon Bedrock option that lets architects work within data-residency constraints while maintaining availability.
CSAT: Customer Satisfaction Score, a post-interaction survey instrument in which users rate their experience on a numeric scale.
custom classification: A text-analysis capability provided by Amazon Comprehend.
custom dashboards: AWS dashboard views used in the text to monitor operational leading indicators such as inference latency, error rates, and model invocation counts.
custom inference pipelines: Inference pipelines that can use Amazon Augmented AI as the human-review workflow layer. The text mentions them in the context of A2I.
custom model artifact: The resulting model artifact produced by fine-tuning; in Scope 4, the customer is responsible for its encryption, access control, and version management.
Custom model fine-tuning: The process of fine-tuning custom models in Amazon Bedrock.
custom prompt datasets: Organization-created prompt datasets used in evaluation so that models can be assessed on representative inputs rather than relying on generic datasets, helping close the gap between benchmark performance and production behavior.
customer records: Records the agent can look up when integrating internal tools through MCP.
customer-managed KMS key: A KMS key controlled by the customer, used in the text to encrypt training data.
customer-managed KMS keys: KMS keys that the customer controls. In the text, they are supported by Amazon Bedrock Knowledge Bases and custom model fine-tuning jobs, and they let the security team control key rotation and revoke access at any time.
CV: The branch of AI that enables machines to interpret images and video.
CX: A stakeholder group listed for user satisfaction metrics.

D

dashboards: CloudWatch dashboards, referenced together with metrics and alarms.
data classification levels: Levels that, under a data-use policy, may require additional approval gates before data enters a training pipeline.
data governance: A data preparation concern addressed by PII detection and removal.
data governance practices: Practices required by the EU AI Act for training datasets for high-risk AI systems, covering collection purpose, processing operations, and compliance with data-protection law.
data governance team: The team that asks which AWS mechanism best ensures that EU customer data used for model inference does not leave the EU.
data pipeline: The pipeline through which data enters the AI system; it is where quality checks, PETs, access controls, and integrity controls are applied.
Data poisoning: An attack that targets training data and affects model weights, not runtime inference quality in a system where the model itself is unchanged.
data preparation process: An iterative process for a fine-tuning project in which an initial dataset is curated, the model is trained, outputs are evaluated, errors are diagnosed, and training data is corrected or augmented before the next training run.
data quality assessment: A prerequisite for both security and model accuracy that checks completeness, accuracy, consistency, and freshness before data enters the AI pipeline.
data quality baselines: Statistical baselines computed from the training dataset and used by Amazon SageMaker Model Monitor to detect divergence in production inputs.
Data quality drift: A type of drift tracked by Amazon SageMaker Model Monitor that refers to changes in the statistical distribution of input features. The text uses the example of a loan-application model whose live traffic shifts from 30% to 60% college-degree holders, indicating the input distribution has changed and the training data may no longer be representative.
data residency compliance: A compliance requirement that can be given up when choosing to wait for local region support instead of making a cross-region request to an available region.
data-governance strategies: Strategies auditors expect, including data lifecycles, logging, residency, monitoring, observation, and retention.
data-preparation processes: Processes covered in Task 3.3 that change a model's weights to better match a specific task or domain.
dataset: An initial collection of data that is curated before model training in the iterative fine-tuning process.
dataset review: A mitigation for Poisoning that involves reviewing the dataset used for training or fine-tuning.
Dataset size: A fine-tuning consideration describing how many examples are used. The text says typical instruction-tuning for task adaptation requires hundreds to several thousand high-quality labeled examples, with a practical minimum usually in the range of two hundred to five hundred examples, and that improvements typically plateau after several thousand examples for a narrowly defined task.
de-provisions: The automatic removal or shutdown of compute in SageMaker AI Serverless Inference when it is not needed.
decision transparency: A human-centered design element used in explainable AI.
Deep learning: A subset of machine learning that uses layered neural networks. A linear regression model is ML but not deep learning, while a convolutional neural network is both ML and deep learning.
deep learning model: A model in the deep learning subset of machine learning; in the text, a large language model is identified as a deep learning model.
deep neural network: A model described as not transparent in the same way as a decision tree; with billions of parameters, no human can read the weight matrix and understand why a particular token sequence produced a particular output.
demand forecasting: A canonical business example for regression in the table.
demographic groups: Groups whose outcomes and accuracy are affected by bias and variance.
deterministic code: Code used instead of AI/ML when the outcome must be exact and reproducible; the text says it is appropriate for a fixed formula that applies identically and has not changed in five years.
deterministic rule engine: A rule-based system written by a developer that can hit a ceiling when business logic becomes too complex to maintain manually, making ML more attractive for scalability.
Deterministic-outcome requirements: A condition in which a business or regulatory process demands a specific, reproducible answer for a given input rather than a probabilistic estimate; the text gives tax calculations, regulatory eligibility checks, and contractual billing formulas as examples.
distillation: A cost-reduction technique that trains a compact student model to mimic the outputs of a larger teacher model on a target set of prompts. The teacher generates prompt-response pairs for training data, and the resulting student is faster and cheaper per call at inference.
distributed training jobs: Training jobs run on GPU clusters in SageMaker AI.
distribution of likely next tokens: The set of probable next-token choices from which the model samples at each step. The text identifies this sampling process as the source of nondeterminism.
Document classification: A use case in the table where the technical metric is precision on each category and the business metric is staff hours saved per week.
document embeddings: The representations held in the vector store for RAG pipelines.
document processing: A business application area for agentic AI in which an agent reads invoices, extracts line items, and enters them into an ERP.
document retrieval: The action of retrieving documents from Amazon Bedrock Knowledge Bases; the text says access controls can limit this action.
document-processing solution: An operational AI solution that processes documents and is evaluated by how quickly and cheaply it can handle each document compared with a manual baseline. In the text, it is treated as an internal workflow rather than a customer-facing sales funnel.
domain inaccuracy: A problem where the model lacks specialized knowledge in the training corpus. The text contrasts it with hallucination and says it is about being uninformed rather than inventing content.
domain-specific dataset: A smaller dataset used in fine-tuning that is specific to the business domain or task.
DynamoDB: An AWS service included among the services covered by AWS Backup for scheduling, monitoring, and enforcing backup and retention policies.

E

EC2: The AWS compute service referenced as the place to host open-source pre-trained models on self-managed GPU instances.
EC2 instances: The inference infrastructure on which Amazon Inspector is deployed for vulnerability scanning in the text.
ECS: An AWS service whose containers produce application-level logs retained by Amazon CloudWatch Logs.
EDA: An abbreviation for Exploratory data analysis.
EEOC: Abbreviation for Equal Employment Opportunity Commission.
efficient token consumption: A characteristic attributed to Mistral AI models in the text.
EFS: An AWS service included among the services covered by AWS Backup for scheduling, monitoring, and enforcing backup and retention policies.
embedding: A numerical vector that captures the semantic meaning of the user's question in the RAG process.
embedding model call: A component of cost per interaction in a RAG pipeline; the text says it is included in the total cost but does not further define it.
Embedding models: Models referenced for Amazon Bedrock Knowledge Bases that create embeddings.
embeddings: A foundational generative-AI concept included in the conceptual machinery of this domain. The text names embeddings as part of the vocabulary to understand, but does not further define them.
engineering and governance commitments: The commitments that make up responsible AI and determine whether an AI system produces outputs that are fair, accurate, and safe.
ERP: The system into which a document-processing agent enters extracted invoice line items.
ETL: A data-preparation role in the pipeline performed by AWS Glue; the text identifies it as part of data storage and prep, but does not expand the acronym.
EU: A jurisdiction mentioned as one of the places where regulators are putting concrete obligations on organizations that deploy AI.
EU AI Act: A law that classifies AI systems by risk level and assigns compliance obligations accordingly. Highest-risk systems, including those used in critical infrastructure, employment decisions, education, credit scoring, law enforcement, and biometric identification, must meet requirements for transparency, human oversight, data governance, and accuracy before deployment in the EU. Lower-risk systems face lighter obligations, and some AI applications are prohibited entirely.
evaluation: The stage where the adapted or trained model is tested on held-out data and scored against performance metrics. A model that passes evaluation proceeds to deployment; a model that fails returns to an earlier stage.
evaluation corpus: The collection of examples used for evaluation. The text says it should be large enough to smooth out phrasing variation across many examples.
evaluation job: The batch process run during prompt evaluation to score outputs against the test dataset.
Evaluation Layers: A layered evaluation structure shown in the figure, consisting of a Model Layer, App Layer, and Business Layer.
evaluation metrics: The metrics achieved on test sets and recorded in a model card.
evaluation pipeline: A process that can flag high-variance prompts for human review. The text recommends it as part of a mitigation strategy when consistency is mandatory.
explainability: A related but distinct topic from transparency; the text says it can be documented and surfaced by certain AWS tools and is part of human-centered design for explainable AI.
explainability and auditable decision logic: The regulatory factor that most strongly favors traditional ML in lending decisions because regulators require explanation of every decision, including which input features most influenced the outcome.
explainable: A model characteristic contrasted with an opaque model in the context of transparency and explainability.
explainable AI: AI for which human-centered design principles apply, as mentioned in the next chapter focus on transparency and explainability.
extended context windows: A model capability referring to the ability to handle larger context windows.
external transparency: The transparency requirement that includes disclosing to users that they are interacting with an AI system when that interaction may not be obvious, describing the data categories used to personalize the experience, and communicating how the system’s outputs are validated before being presented as factual.

F

F1 score: A technical metric used to evaluate model performance; in the text it is paired with fraud detection and compared against business metrics. A model can have a strong F1 score but still have negative ROI, which would indicate it should be redesigned or replaced.
Fairness: The property of a model that produces equitable outcomes across demographic groups defined by characteristics such as gender, race, or age. A model is considered fair when its bias toward any protected group is below an acceptable threshold.
Fairness testing: A mitigation strategy listed for the Biased Output risk category of generative AI.
FAISS: A paper and associated similarity-search approach referenced in the citation 'Billion-scale similarity search with GPUs.'
feedback and fine-tuning: A lifecycle stage in which production feedback, such as inaccurate responses about newer products, leads the team to run a new fine-tuning job on updated data or implement RAG to provide current information at inference time.
few-shot: A standard prompt technique listed with zero-shot, single-shot, and chain-of-thought.
few-shot examples: Examples included in the prompt that increase token count and therefore the bill under token-based pricing.
few-shot prompt: A prompt used in in-context learning that gives two to five examples.
Few-shot prompting: A prompting technique that includes two to eight examples covering representative variations of the task; it is described as the workhorse technique for business applications because it handles edge cases, enforces format consistency, and reduces the need for exhaustive constraint text, but it increases token cost and can approach the model's context-window limit for long documents.
fine-tuned model: A model produced by fine-tuning, whose training in Amazon Bedrock is charged based on compute time used during the fine-tuning job. Deploying it requires provisioned throughput.
fine-tuned one: The fine-tuned model being compared against a prompted base model when evaluating the quality gap.
fine-tuned variants: Separate model variants produced by fine-tuning for different domains when cross-domain performance is uneven.
fine-tunes: The process in Scope 4 of using proprietary data to adapt a foundation model, creating responsibility for the training dataset, the fine-tuning process, and the model artifact.
Fine-tuning compute: The compute aspect of training job execution, according to the table.
fine-tuning cycle: A later training cycle that incorporates reviewer corrections from human review.
fine-tuning dataset: The dataset used to fine-tune an existing foundation model. In Scope 4, the customer is responsible for this dataset.
fine-tuning in Amazon Bedrock: The process of customizing foundation models in Amazon Bedrock. The text references it as the current platform being contrasted with SageMaker JumpStart.
fine-tuning job: The training job used to create a fine-tuned model in Amazon Bedrock; compute time used during this job is charged.
fine-tuning project: A project whose data preparation process is iterative rather than linear, involving curation, training, evaluation, correction or augmentation, and repeated training runs.
fixed-size chunking: A chunking method supported by Bedrock Knowledge Bases that splits documents every N tokens.
floating-point vector: The numerical vector returned by an embedding model and stored in a vector database for similarity retrieval.
flow node configuration: The configuration in which a prompt ARN is referenced when using Bedrock Flows.
FM: An abbreviation used for foundation model in the text.
FM-powered procurement assistant: An application powered by a foundation model, used in the example where user return behavior and evaluation metrics are discussed.
FMs: An abbreviation for foundation models, which are large models trained on broad, general-purpose datasets at enormous scale.
FN: Abbreviation for False Negative in the confusion matrix.
Forecasting: A process that produces predictions of future values for a time-series variable, such as product demand, energy consumption, or call-center staffing requirements. It uses historical observations of the target variable plus optional related time series to improve accuracy.
foundation model: A model type contrasted with a custom-trained model in the text. The text raises the question of whether a foundation model is the same as a custom-trained model, but gives no further definition.
foundation model API: An API used in Scope 3 to call a foundation model, such as Amazon Bedrock, as part of an application built by a development team.
foundation model options: The set of possible foundation model sources being evaluated in the self-check scenario for a high-volume product description generation service.
foundation models: Models that are the subject of Domain 3 and are part of the foundational generative-AI vocabulary in this domain. The text does not provide a fuller definition beyond their role in generative AI.
Foundation models (FMs): Large models trained on broad, general-purpose datasets at enormous scale. They are not trained for one specific task and can be adapted to many downstream tasks through prompting, retrieval, or fine-tuning.
FP: Abbreviation for False Positive in the confusion matrix.
fraction of a cent per inference: The low per-inference cost target described as favorable to traditional ML in high-volume applications.
full fine-tuning: A training approach that would require approximately 140 GB of GPU memory in the example.
fully auditable: A requirement for traditional ML in regulated settings, meaning the decision path can be reviewed and reproduced.

G

GDPR: A regulatory framework referenced in relation to data residency and processing location requirements, and as a set of configuration rules that can be checked by AWS Config conformance packs.
GenAI: A class of AI that produces new content, such as text, images, audio, or code, in response to a prompt.
General Data Protection Regulation (GDPR): A regulation that restricts the transfer of EU residents' personal data to countries outside the European Economic Area unless specific safeguards are in place.
generative AI: A new V1.1 addition covered under basic AI and ML terminology in Task 1.1. The text does not further define it.
Generative AI (GenAI): A class of AI that produces new content, such as text, images, audio, or code, in response to a prompt. It is typically built on LLMs or similar large-scale generative models and produces variable-length, human-readable output.
generative AI applications: Applications in Amazon Bedrock for which Bedrock Model Evaluation can benchmark foundation models against custom criteria before production deployment.
generic benchmarks: General-purpose benchmark measures on which a model may score well even if it fails the actual business task it was built for.
geographic placement: The placement of compute and storage in specific regions, which the text identifies as the mechanism that controls data residency.
Glue Data Catalog: A catalog extended by AWS Lake Formation to support fine-grained access control over datasets, including column-level, row-level, and cell-level permissions.
governance: A business consideration that differs for agentic AI systems because they take real-world actions rather than producing static outputs.
governance and compliance: The translation of responsible AI principles into organizational accountability through governance structures that create policies, review cycles, audit trails, and regulatory evidence so an organization can explain, defend, and continuously improve its AI systems.
governance artifact: A non-technical artifact used for governance purposes; the text identifies the model card as one.
Governance frameworks: Structured vocabularies and control taxonomies that organizations use to build and communicate their AI governance programs.
governance processes: Processes that were bypassed when prompts were embedded in Lambda functions or environment variables, making them invisible and impossible to audit.
governance protocols: Practices that make data-governance strategies work in practice, including policies, review cadence, transparency standards, team training, and the AWS Generative AI Security Scoping Matrix.
governance repository: The repository where training completion records should be maintained, along with policy documents and audit evidence.
governance structures: Organizational structures that create policies, review cycles, audit trails, and regulatory evidence for AI systems, enabling explanation, defense, and continuous improvement.
GPT-3: The model associated with the cited paper 'Language Models are Few-Shot Learners.'
greedy deterministic mode: The deterministic mode reached when temperature is set to exactly 0, causing the model to always select the highest-probability next token.
Guardrails contextual grounding check: An Amazon Bedrock Guardrails feature that automatically scores a RAG response against the retrieved context documents and blocks responses below the configured threshold. It is presented as the most direct, lowest-effort way to enforce that the assistant does not return information not present in the official documents.
Guardrails with denied topics: A Bedrock Guardrails capability that lets operators define topic categories the model must not engage with, including investment product recommendations, and enforces this policy on every request regardless of how the user phrases the question.

H

Hallucination: The phenomenon where a generative model produces fluent, confident output that is not grounded in factual source material. The text distinguishes it from domain inaccuracy and notes that it can be mitigated by Retrieval Augmented Generation.
Hallucination rate: A generation metric for an AI agent that measures the rate of hallucinated outputs.
hallucination rates: The rate at which a model produces hallucinations. The text mentions higher hallucination rates on very recent events as an example of a failure mode that should be disclosed in a Model Card.
hallucination risk: The risk that an AI application will produce consequential output that is not sufficiently grounded or validated. The text states that any AI application producing consequential output should not be deployed unless it has at least one automated grounding or validation check in the response pipeline.
hallucinations: A failure mode of generative AI that must be handled with a fallback, according to the text.
hierarchical chunking: A chunking method supported by Bedrock Knowledge Bases that produces both a parent summary chunk and smaller child detail chunks so retrieval can operate at two levels of granularity.
high-dimensional vectors: The vector representations into which BERTScore encodes both texts before computing cosine similarity.
high-risk AI systems: AI systems whose training datasets must, under the EU AI Act, be subject to data governance practices covering collection purpose, processing operations, and compliance with data-protection law.
high-variance prompts: Prompts that produce especially divergent outputs across runs. The text says an evaluation pipeline can flag them for human review.
hijacking: A risk of poorly designed prompts.
Hijacking / Injection: A prompt security risk in which user input attempts to override the system prompt. The example is the instruction 'Ignore prior instructions' in user input, and the mitigation is API-level prompt separation and guardrails.
HIPAA: A regulatory requirement governing healthcare applications; the text says healthcare applications governed by HIPAA must use models deployed within a HIPAA-eligible service boundary.
HIPAA Business Associate Addenda: Agreements available for download in AWS Artifact.
HIPAA Business Associate Addenda (BAA): An AWS Artifact agreement available for covered healthcare entities.
HIPAA Eligible Services: AWS services that are eligible under HIPAA, referenced in AWS Compliance.
HIPAA-eligible service boundary: The deployment boundary within which models must be deployed for healthcare applications governed by HIPAA.
HNSW: An index algorithm supported by OpenSearch Service that delivers sub-millisecond retrieval at billions of vectors.
Hugging Face: A provider whose models can be browsed in SageMaker JumpStart.
Hugging Face Documentation: Documentation from Hugging Face cited as the source for text generation and sampling strategies.
human auditor: The expert who reviews a representative sample of predictions during a human audit.
Human audits: A detection practice in which expert judgment is applied to samples of model outputs. A human auditor reviews a representative sample of predictions and evaluates them for accuracy, fairness, and appropriateness; the text says human audits catch failure modes automated metrics may miss, but they are expensive and do not scale to 100% of outputs.
human evaluation: The older term updated in v1.1 to 'human-in-the-loop evaluation.' The text says the update emphasizes that humans are inserted at specific judgment points within a larger automated pipeline.
human evaluation job: A Bedrock Model Evaluations configuration that uses an internal team or AWS-managed workforce.
human-evaluation jobs: Evaluation jobs provided by Amazon Bedrock Model Evaluation. The text says they can be configured with an internal reviewer team or an AWS-managed workforce.
human-feedback processes: Processes used before fine-tuning to align a model's behavior with business expectations.
human-in-the-loop evaluation: The practice of incorporating qualified human reviewers into the evaluation process to assess model outputs against criteria automated metrics cannot capture, such as factual correctness on proprietary topics, tone appropriateness, or safety. The text notes that humans are inserted at specific judgment points within a larger automated pipeline rather than running the evaluation end to end.
human-in-the-loop evaluation workflows: Evaluation workflows in which outputs are routed to a defined reviewer pool, such as clinical nurses or physicians, for rubric-based scoring.

I

IAM: The AWS access-control service that determines which identities, roles, and services may call which models. The text says it can be applied with specific model ARNs and specific Bedrock actions such as bedrock:InvokeModel and bedrock:InvokeAgent.
IAM controls: Access-control measures based on AWS Identity and Access Management that are mentioned as part of the context for addressing prompt injection and governing prompt resources.
IAM policy change: A change affecting model access that is recorded by CloudTrail.
IAM principals: The principals whose ability to read training data is restricted by S3 bucket policies.
IAM role: A role attached to an application component such as a Lambda function, EC2 instance, or ECS task that carries the permissions used to call Bedrock and is subject to least-privilege restrictions.
IAM roles: AWS identity and access management roles identified as services and features used to secure AI systems.
IAM roles and policies: An access-management control in AWS security services for AI workloads that determines which principals can invoke models, agents, and knowledge bases.
IAM-based access: Access control for SageMaker notebook instances that uses IAM and prevents direct internet exposure of notebook environments.
IDE: Abbreviation used in the text for integrated development environment, as in existing IDEs and full IDE workflows.
Identity and Access Management (IAM): The primary mechanism for controlling who and what can interact with AI services. In the text, IAM is used to enforce least-privilege access, restrict access to specific Bedrock models, knowledge bases, or agents, and control access through roles attached to Lambda functions, EC2 instances, or ECS tasks.
image classification: A computer vision task that assigns a label to the whole image.
immutable audit record: The audit record created by AWS CloudTrail that captures every access, modification, or deletion event against S3 buckets.
immutable prompt versioning: A Bedrock Prompt Management feature in which each prompt change creates a new version and prior versions are retained permanently with their full content and metadata. It supports audit requirements by preserving a tamper-evident history of what prompt was used.
immutable prompt versions: Prompt versions that cannot be changed after creation; the text says they are a compliance requirement in regulated industries where model outputs may be subject to audit.
in-context learning: A technique where a foundation model is given examples of input-output pairs directly in the prompt (rather than via fine-tuning) to bias its responses toward a desired format or task. Includes zero-shot, one-shot, and few-shot variants.
inaccuracy in specialized domains: A limitation where a model may know general facts but fail on domain-specific questions because the relevant documents were never in its training corpus. The text gives examples such as hospital clinical protocols, insurance coding rules, and proprietary drug interactions.
inbound prompt: The prompt entering Bedrock Guardrails from the application; it is inspected before being passed through.
independent auditor: The party that produces the SOC 2 Type II report on AWS's controls.
inference API: A security-relevant surface in AI workloads that is part of the added requirements beyond traditional application stacks.
inference call: A single model inference used in the comparison table as the basis for low cost in a plain LLM call.
inference calls: Model execution calls; the text says ROUGE and BLEU require no additional inference calls, while BERTScore requires running the BERT encoder on both candidate and reference.
inference cost: The cost associated with serving model inference, which can be reduced by architectures such as mixture-of-experts.
inference data: New data that accumulates in production and is used by SageMaker Clarify to recompute bias metrics when integrated with SageMaker Model Monitor.
inference data buckets: Storage buckets holding data used for model inference, which must be restricted to EU regions in the scenario.
Inference flexibility: A comparison factor in which Amazon Bedrock provides a managed API with limited runtime config, while Amazon SageMaker AI supports custom inference code and batching strategies.
inference latency: An operational leading indicator that can be tracked with Amazon CloudWatch.
inference level: The level at which SageMaker Clarify addresses explainability.
inference outputs: Model outputs produced during production inference that may be routed to human reviewers by A2I when confidence is low or human oversight is required.
inference parameter: A parameter used at inference time; the text specifically discusses temperature, top-k, and top-p as inference parameters.
inference pipeline: The pipeline into which Amazon Augmented AI integrates human review for low-confidence predictions.
inference pricing: The pricing category from which custom models are distinct; the text contrasts custom model costs with inference pricing.
inference S3 bucket: A storage bucket used for inference data. The text refers to restricting training and inference data buckets to EU regions to support residency compliance.
inference throughput: A performance-related effect of model complexity that influences the infrastructure tier needed to serve the model.
inference time: The time at which retrieval-augmented generation fetches relevant documents to extend the model's knowledge.
inference-time feature derivation: A stage in the typical lifecycle of an AI training dataset.
initial dataset: The dataset curated at the start of the data preparation process.
input prompt: The prompt to which Amazon Bedrock Guardrails applies content policies.
Input token price: The cost per 1,000 input tokens, which varies by model and is directly proportional to prompt length.
Input tokens: The prompt tokens that are priced separately from output tokens when using a foundation model.
instruction tuning: A later step that follows continuous pre-training once a labeled dataset is available.
internal audit teams: Teams that need to verify that the classifier is applying consistent rules across demographic groups.
internal transparency: The minimum transparency requirement that includes publishing model cards for all production models, documenting known limitations, bias assessments, and intended scope, and making that documentation accessible to the security, legal, and compliance teams that oversee the system.
interpretability: The ability to explain or trace why a model produced a given output. The text notes that interpretability tooling can help trace outputs but does not stop hallucinations, and that interpretability issues are distinct from output variability.
invocation counts per model: An operational metric collected by Amazon CloudWatch from Amazon Bedrock and application code.
ISO 27001: One of the third-party audit report categories available on demand through AWS Artifact.
ISO/IEC 42001: The international standard for AI management systems published by the International Organization for Standardization. It defines requirements for establishing, implementing, maintaining, and continually improving an AI management system within an organization, covering objectives, roles, risk assessment, and performance evaluation. Certification provides third-party validation that an organization's AI governance program meets the standard's requirements.

J

jailbreak: A prompt-attack pattern that the Prompt-attack filter is designed to detect. The text uses it as an example of input intended to override the system prompt or bypass content rules.
jailbreaking: A risk of poorly designed prompts.
JSON-structured prompts: Prompts structured in JSON form, which work similarly to XML-style tags for models that process JSON natively.

K

K-Means Clustering Algorithm: An AWS SageMaker AI developer guide topic referenced as the source for unsupervised learning content; the text itself does not define the algorithm beyond identifying it as a clustering algorithm.
k-NN: The acronym/abbreviation for k-nearest-neighbor, an algorithm type used to search embeddings quickly in vector databases.
KMS: A data-security service in the security model, listed under the Data layer and used in the text for encrypting the system prompt and training data.
Knowledge Bases: A Bedrock-related capability used with Retrieval Augmented Generation.
Knowledge Bases for Amazon Bedrock: An Amazon Bedrock feature or capability related to knowledge bases, referenced in the context of retrieval augmented generation.

L

labeled dataset: A dataset created through human labeling at scale for training purposes.
labeled evaluation set: A task-specific set used to measure accuracy.
labeled question-answer training data: Training data consisting of question-answer pairs with labels. The company has none of this data, which is why the text favors a foundation model approach over supervised training.
labeled training data: Data with labels that can be used to reach acceptable accuracy for a single well-defined prediction task without general-purpose reasoning.
Lambda: A service whose functions produce application-level logs retained by Amazon CloudWatch Logs.
Lambda function: The function used in the AWS implementation of the AI agent to perform threshold calculation.
Lambda functions: A form used to define action groups that Amazon Bedrock agents connect to foundation models.
large language model: A newer AI term identified as central to business conversations and included among the basic AI and ML terms covered in the domain. The text does not further define it.
large language model (LLM): A deep learning model, specifically a neural network trained on a massive corpus of text, that can generate, summarize, translate, and reason about language at a level of fluency and flexibility not possible with earlier NLP techniques. Its scale is measured in billions of parameters, which gives broad capability but also makes it expensive to train from scratch.
Large language models: Models trained on text corpora containing hundreds of billions of words.
large language models (LLMs): Large-scale language models accessed through Amazon Bedrock for open-ended language work such as long-form generation, complex summarization, or multi-step reasoning. They are the better fit for these tasks than purpose-built NLP services.
latency: A factor that inferencing patterns are chosen to solve, together with throughput requirements; real-time inferencing returns predictions within milliseconds, while asynchronous inferencing may take seconds to minutes and batch inferencing minutes to hours.
Latency constraints: A limitation that matters when an application needs a response in single-digit milliseconds; the text says certain complex models may require more inference time than that, making a hard-coded decision path the only option that meets the SLA.
latency-sensitive production workloads: Production workloads for which reduced latency variance is beneficial.
length bias: The tendency of an LLM-as-a-judge to score longer answers higher even when accuracy is unchanged.
Lex: An AWS service in the voice channel flow that processes intent after transcription and before synthesized voice is returned.
Lexical metrics: Metrics that are fast but surface-level, such as ROUGE and BLEU.
lexical overlap: Word-level overlap between texts; the text says BERTScore is useful when lexical overlap would unfairly penalize valid paraphrases.
linear regression: An ML algorithm used for predicting continuous values.
linear regression model: A machine learning model that the text says is ML but not deep learning.
Llama: A Meta open-source pre-trained model family; the text notes it is in its 4th generation and includes Llama 4 Scout and Maverick variants with very long context windows.
Llama 4 Maverick: A representative Meta Llama model listed in the model family table.
Llama 4 Scout: A variant in Meta's Llama 4 generation that offers very long context windows.
LLM: A generative model used as the reasoning engine in an agentic AI system.
LLM-as-a-judge: A technical evaluation method.
LLMs: Large-scale language models accessed through Amazon Bedrock for open-ended language work such as long-form generation, complex summarization, or multi-step reasoning.
log-probs: The abbreviated term for log probabilities, used as a proxy for confidence when returned by some model APIs.
long-term storage: The storage destination used when information is offloaded from the working context as part of memory management.
Low inference cost: The cost characteristic of the small student model after distillation.
LSTM: A typical machine learning approach for time-series data.

M

Machine learning: A subset of AI that limits the definition to systems that learn from data. A rule-based fraud filter is AI but not ML, while a fraud model trained on transaction histories is both AI and ML.
Machine learning (ML): A subset of AI in which a system learns patterns from data rather than following rules written explicitly by a programmer. In the hierarchy, ML is narrower than AI and includes deep learning as a subset.
machine learning engineer: A user role that may be granted read access to non-PII columns while being prevented from accessing sensitive PII until a formal data-use agreement is in place.
Machine Learning Lens: A lens within the AWS Well-Architected Framework focused on machine learning.
Machine Learning Operations: The expanded form of MLOps, described as the discipline of applying software-engineering rigor to the ML lifecycle to make model delivery repeatable, scalable, and maintainable over time.
machine translation benchmarking: The evaluation context in which BLEU remains the standard metric.
Macie: An AWS security service whose findings are integrated into AWS Security Hub alongside GuardDuty and Amazon Inspector findings.
Macie scans: Scans of the RAG corpus used to detect sensitive documents before ingestion as a data leakage mitigation.
major AI research labs: The organizations for which pre-training is presented as a practical activity; the text says pre-training is not practical outside these labs.
managed labeling workflows: Labeling workflows managed by Amazon SageMaker Ground Truth. The text says this service can supply the human reviewer workforce for evaluation patterns.
managed vector store: The storage used by Knowledge Bases for Amazon Bedrock to store vector embeddings.
manual prompt engineering: An approach with Amazon Bedrock that would require the team to build all connector and retrieval logic themselves, making it substantially more work than using Q Business directly.
max_new_tokens: The output-length cap shown in the example inference parameter configuration, set to 512.
maximum tokens: An inference parameter stored with a prompt resource that limits the number of tokens.
MCP: An acronym for Model Context Protocol, the standardized protocol for connecting AI agents to external tools, data sources, and services.
MCP client: The AI agent side of the standardized interface defined by the Model Context Protocol; it connects to MCP servers to discover and call external tools or services.
MCP client interface: The client-side interface implemented by Strands Agents for working with the Model Context Protocol. The text says it allows agentic applications to connect to MCP-compliant tools and services.
MCP client support: Built-in support in Strands Agents for using MCP as a client.
MCP server endpoint: The endpoint exposed by a CRM system, ticketing system, or inventory API when it is made available through MCP so an agent can access it through the same protocol.
MCP servers: External tools or services that expose an MCP server endpoint so an AI agent can discover and call them through the Model Context Protocol.
medical records: An example source from which fine-tuning can extract structured fields.
medical-device classifications: An environment cited as one where regulatory accountability requires complete auditability of every individual decision.
Meta Llama: An example of an LLM that underpins modern generative AI applications.
Meta Llama 4 Maverick: An open multimodal model family tier with a 1M-token context window, used for customizable, long-context deployments.
metadata: Additional information stored with a prompt resource.
minimal labeled training data: A condition favoring foundation models when the organization has little labeled data but can express the task as a prompt.
Mistral: An open-source pre-trained model family from Mistral AI.
Mistral 7B: A representative Mistral AI model listed in the model family table.
Mistral AI: A model family in Amazon Bedrock, including Mixtral, described as strong on instruction-following and multilingual tasks with efficient token consumption.
Mistral Large 2: A balanced text-only model family tier with a 128K-token context window, used for European language tasks.
ML engineering capacity: The organization's ability to manage the training and retraining pipeline; sufficient capacity is one of the conditions favoring traditional ML.
ML engineering depth: The level of machine learning engineering capability an organization must have to train a custom model from scratch.
MMLU: Massive Multitask Language Understanding, a benchmark score mentioned as a starting point for evaluating model performance.
MMLU Benchmark: A benchmark referenced from Papers With Code.
model accuracy: A type of evidence the text says AWS Audit Manager can collect quarterly, though it does not detect drift.
model artifact: The model output or deliverable for which the customer is responsible in Scope 4 after fine-tuning a foundation model using proprietary data.
model artifacts: Stored model outputs that are kept in the SageMaker model registry with metadata to support traceability and technical-debt management.
model card: A provider disclosure that the text says should cover training data, environmental cost, and intended use. The responsible-model-selection criteria prefer a model card that exists and is current.
model cards: Documentation required for all production models under internal transparency; they must include known limitations, bias assessments, and intended scope, and be accessible to security, legal, and compliance teams.
Model complexity: A factor referring to parameter count, architecture depth, and the size of the context window a model can hold. The text says more complex models generally perform better on nuanced tasks but are slower and more expensive per token.
Model Context Protocol (MCP): A standardized client-server interface between an AI agent and external tools or services that lets the agent discover available tools, call them with structured arguments, and receive structured results without writing custom integration code for each system.
model distillation: A method to make a model better at a domain; it is one of the options to choose from alongside fine-tuning, in-context learning, and RAG.
Model evaluation: The evaluation of models in Amazon Bedrock, including model evaluations and automated model evaluation jobs.
model evaluation capabilities: Amazon Bedrock capabilities that let teams run automated accuracy assessments against task-specific benchmark sets before and after model or prompt changes.
model inference: The use of a model in the EU credit-scoring scenario, for which the text requires that EU customer data used for inference does not leave the EU.
model inference costs: A cost component included in ROI calculations. The text contrasts it with savings and revenue gains when evaluating a generative AI application.
model invocation: A model call or invocation referenced in the context of logging.
model invocation counts: An operational leading indicator that can be tracked with Amazon CloudWatch.
model invocation logging: Logging that captures each request and response. The text says it allows the team to audit variance across runs and identify which question types produce the most divergent outputs.
Model retraining: The response to monitoring signals. The text says a retraining strategy should specify the trigger, the data window used, and the deployment gate, and that SageMaker Pipelines can be triggered by a CloudWatch alarm raised by Model Monitor.
Model safety training: The attack target for Jailbreaking; it refers to the model’s safety training that attackers try to bypass.
Model training: The optimization process by which an algorithm finds the parameter values that minimize prediction error on the training dataset.
model training jobs: Training processes that consume datasets and are recorded in data lineage.
model-invocation permissions: Permissions held by a compromised IAM role that can be relevant to lateral movement in AI workloads.
MoE: An abbreviation for mixture-of-experts.
Multi-agent systems: Systems where multiple AI agents operate in coordination to handle complex tasks. Patterns include orchestrator/worker (a central coordinator distributes subtasks) and hierarchical (tree of coordinators). Each agent has its own tool access and reasoning loop.
multi-step workflows: Application architectures that combine FM calls, RAG retrievals, business logic, and human handoffs into a complete business process.
multiclass classification: A classification problem in which the category set has more than two members.
multiclass classification model: A model that can route questions to pre-written answers in each language. In the scenario, it is presented as an approach that can route questions but cannot generate personalized responses or maintain context across turns.

N

National Institute of Standards and Technology: The organization that publishes the NIST AI Risk Management Framework (AI RMF).
natural language intents: The intents understood by Amazon Lex conversational interfaces.
Natural language processing (NLP): The branch of AI that enables machines to read, understand, and generate human language. NLP tasks mentioned include sentiment analysis, named entity recognition, translation, and document summarization.
natural language understanding: A capability that foundation models excel at, according to the text.
NDA: A nondisclosure agreement under which authorized AWS customers can download the SOC 2 Type II report from AWS Artifact. The text says the report can be downloaded under NDA.
Negative prompting: A prompting approach that lists prohibited outputs or variants the model must not use. In the text, it can help by naming forbidden category labels, but it scales poorly across multiple categories and is more brittle than positive examples.
neural network: A computational model loosely inspired by the structure of biological neurons. Data passes through layers of interconnected nodes, each of which applies a mathematical transformation, and the network learns which transformations produce accurate outputs by adjusting its internal parameters during training.
neural networks: A type of model included in the text as part of artificial intelligence; deep learning uses layered neural networks.
next tokens: The tokens the model predicts one step at a time. The text says the model samples probabilistically from a distribution of likely next tokens at each step.
NIST AI 600-1: A NIST publication titled Artificial Intelligence Risk Management Framework: Generative AI.
NIST AI Risk Management Framework: A framework that, along with the EU AI Act, imposes traceability requirements on AI systems used in high-stakes decisions.
NIST AI Risk Management Framework (AI RMF): A voluntary framework from the National Institute of Standards and Technology that defines four functions for managing AI risk: Govern, Map, Measure, and Respond. Govern establishes accountability structures and policies; Map identifies and categorizes AI risks in context; Measure quantifies risk levels and tracks mitigation effectiveness; Respond defines actions to address identified risks.
NIST AI RMF: A framework used with AWS Audit Manager to collect compliance evidence and frame the governance program.
NIST RMF: An abbreviated framework reference included under the AI Governance Protocol’s frameworks component, referring to the NIST AI Risk Management Framework.
NIST SP 800-53: A compliance requirement example to which AWS Config conformance packs can be aligned.
NLP: The text-based modeling area whose models learn grammar, semantics, and factual associations from text data.
non-PII: Data or columns that do not contain personally identifiable information; in the text, non-PII columns may be granted for read access while PII columns remain restricted.
Nova Lite: A model in the Amazon Nova series described as part of the range from text-only, low-latency models up to a multimodal flagship.
Nova Micro: A model in the Amazon Nova series described as part of the range from text-only, low-latency models up to a multimodal flagship.
Nova Premier: A model in the Amazon Nova series described as part of the range from text-only, low-latency models up to a multimodal flagship.
Nova Pro: A model in the Amazon Nova series described as part of the range from text-only, low-latency models up to a multimodal flagship.
NPS: Net Promoter Score, a survey instrument that asks whether the user would recommend the application to a colleague.

O

OAuth tokens: Credentials issued and rotated by Bedrock AgentCore Identity for agent-initiated calls to external services.
on-demand inference: The standard Amazon Bedrock billing model in which charges accrue per input and output token with no commitment. It is best for variable or unpredictable workloads and has the highest per-token rate, with no wasted spend during idle periods.
on-demand pricing: A pricing model mentioned for Amazon Bedrock in the self-check scenario. The text says it results in per-token costs that compound at high volume.
on-demand report retrieval: The ability provided by AWS Artifact to download AWS compliance certifications and agreements when needed.
one million tokens: The context size enabled for Claude Opus and Sonnet when using the 1M-context beta header.
one-shot prompting: Another name for single-shot prompting; it includes exactly one example input-output pair before the actual task.
one-time training cost: The upfront cost of distillation that can be recovered by savings on inference in high-volume applications.
ongoing evaluation overhead: A cost component included in ROI calculations for running a generative AI application over time.
open-source evaluation framework: The type of framework Ragas is described as.
open-weight foundation model: A foundation model that a team can fine-tune on proprietary data at scale in SageMaker AI.
OpenSearch: An AWS managed service listed as supporting vector storage, with k-NN and large-scale capabilities.
OpenSearch Service: An AWS vector-store option considered for the RAG knowledge base. The text says it is not the only AWS service that supports vector search and that using it would require provisioning, securing, and operating a separate domain.
OpenSearch Service domain: A separate OpenSearch Service deployment that would need to be provisioned, secured, and operated if OpenSearch Service were chosen.
organizational knowledge bases: Knowledge sources that the Enterprise tier of Amazon Quick can search across through Amazon Q Business indexes.
output token limit: A limit that, if reduced, can reduce output token costs; the text notes that arbitrarily truncating output is likely to reduce quality.
Output token price: The cost per 1,000 output tokens, typically 3x to 5x the input price and directly proportional to response length.
output tokens: The response tokens generated by the model; they are priced separately from input tokens and are consistently more expensive.
overall accuracy: A population-wide performance metric that can mask disparities between demographic groups, as shown by the example of 92% overall accuracy hiding 98% accuracy for the majority group and 71% for the minority group.
OWASP LLM Top 10: A reference that lists prompt injection as the top risk for LLM applications and provides detailed mitigation patterns.
OWASP LLM01: The OWASP category referenced for prompt injection.
OWASP Top 10 for Large Language Model Applications: The industry reference framework cited by the AIF-C01 v1.1 exam guide for GenAI security, around which AWS structures most of its AI-specific risk discussion.

P

partition metadata: Information stored in AWS Glue Data Catalog for datasets in S3.
pay-per-token: A billing model for AWS GenAI services in which charges accumulate only when inference actually runs. It is contrasted with self-hosting, where infrastructure costs accrue around the clock even when no requests are being processed.
PCI DSS: One of the third-party audit report categories available on demand through AWS Artifact.
PCI DSS Attestation of Compliance: A compliance attestation available for download in AWS Artifact.
peak inference period: A period of high inference demand during which a SageMaker endpoint may fail to scale if it is approaching its instance-count limit.
PEFT: An abbreviation for Parameter-efficient fine-tuning, the approach that freezes most model weights and trains only a small set of additional parameters to reduce memory and compute burden.
performance versus interpretability trade-off: The tension in which models with the highest accuracy on complex tasks are often the least interpretable. The text contrasts complex models with interpretable models and frames this as a key exam topic.
personalized responses: Responses generated to draw on the company’s internal product documentation and tailored to the user’s question. The text says the solution must generate personalized responses.
personally identifiable information (PII): Sensitive data that includes information such as names, addresses, and identification numbers; the text contrasts PII columns with non-PII columns and notes that PII may require formal data-use agreements and suppression in model outputs.
PETs: Privacy-enhancing technologies; techniques that allow a dataset to be used for model training or analysis while protecting identity or sensitive attributes.
pgvector: A PostgreSQL extension that can be enabled with a single extension installation to support vector search. The text says it can be added to an existing RDS for PostgreSQL cluster, avoiding a new service.
pgvector extension: An extension for PostgreSQL mentioned as enabling vector search in Amazon Aurora and Amazon RDS for PostgreSQL.
PII: Personally identifiable information; in the table, it is something that must be detected and removed for privacy compliance and data governance.
PII detection and redaction: A PET practice in which PII is detected and then replaced with a placeholder before the data enters the training set.
PII detection and removal: A data preparation activity aimed at privacy compliance and data governance. The text lists Amazon Comprehend entity recognition as the AWS tool used for it.
plain LLM call: A single model call that is described as fast, cheap, and stateless. In the comparison table, it is single-step, uses no external tool access, has no state across steps, has millisecond-to-second latency, low cost, and is appropriate for classification, summarization, and generation.
poisoning: A risk of poorly designed prompts.
Policy in AgentCore: A new V1.1 addition identified as part of the AWS services and features used to secure AI systems.
policy-defined compliance rules: Rules against which AWS Config compares resource configurations to determine whether they are compliant.
positional bias: The tendency of an LLM-as-a-judge to favor whichever candidate appears first in the prompt.
post-hoc explainability layer: An added layer applied after model training to explain a model's predictions, such as by highlighting the image regions the model weighted most heavily.
post-training bias metrics: Bias metrics computed after training by Amazon SageMaker Clarify to measure whether the trained model treats groups differently even when given identical inputs.
PPO: An abbreviation for Proximal Policy Optimization.
practical guardrails: Controls in human-centered design that keep AI assistants useful without becoming inscrutable.
pre-computed embeddings: Embeddings that are stored in advance so they can be searched quickly in a vector database as part of RAG.
pre-trained BERT model: The BERT model used by BERTScore to compute semantic similarity between candidate and reference texts.
pre-training: The stage of the FM lifecycle in which a model learns general representations from a broad dataset using large amounts of compute, producing the base FM weights. The text says it is expensive enough that virtually no business outside hyperscalers does it.
pre-training bias metrics: Bias metrics computed at training time by Amazon SageMaker Clarify to identify whether the training data is skewed.
precision-oriented metric: The type of metric BLEU is described as being, in contrast to recall-oriented summarization metrics.
precision@k: A retrieval metric asking, of the k documents retrieved, what fraction were actually relevant.
preprocessing and validation: A stage in the typical lifecycle of an AI training dataset.
preprocessing steps: The transformations applied to training data before use, such as deduplication, PII removal, and format normalization.
pricing model selection: The decision process in which teams characterize their volume profile, work through pricing model options, and return to optimization levers when costs exceed targets.
privacy compliance: A data preparation concern addressed by PII detection and removal.
private SageMaker endpoint: An endpoint in the user's account infrastructure to which a model can be deployed through SageMaker JumpStart.
PrivateLink: A network-security component in the security model, listed under the Network layer and used in the text to limit network access to the model or storage path.
processing operations: One of the data governance practice areas required for training datasets for high-risk AI systems under the EU AI Act.
procurement assistant: An application used in a procurement context that users interact with to get assistance; the text treats it as the system whose usage and return behavior are being evaluated.
procurement policy database: The database the AI agent must query to determine the company's procurement policy before responding to supplier inquiries.
production LLM-as-a-judge pipelines: Operational evaluation pipelines that typically use a judge model different from and generally larger than the model being evaluated, rotate candidate order in side-by-side comparisons, and calibrate outputs against held-out human ratings.
production-ready prompt: A prompt that is suitable for production use. The text says experienced practitioners rarely produce one on the first attempt and that it is developed through iterative testing and revision.
prohibited-practices provisions: A category of EU AI Act provisions for which violations can incur fines of up to 35 million euros or 7% of global annual turnover, whichever is higher.
prompt: The input that a generative AI system responds to when producing new content.
prompt ARN: An Amazon Resource Name that uniquely identifies a specific version of a prompt. Applications reference it in API calls instead of including the full prompt text in code.
prompt ARNs: Amazon Resource Names that identify specific prompt versions. Applications reference them in API calls, and traffic routing can split live production traffic across them for A/B testing.
prompt author: The person who creates a new prompt version and defines placeholders in prompt variables.
Prompt caching: A cost-reduction technique that reuses the model's processed representation of a static prefix, such as a long system prompt or product catalogue, across multiple calls.
Prompt content: The content of a prompt; in the table it is the attack target for Exposure, meaning sensitive information placed in the prompt itself can be exposed.
prompt distribution: The distribution of prompts that an organization specifies in Amazon Bedrock model distillation as the target for the student model to learn from.
prompt engineers: People whose skills in building prompts for one use case transfer directly to others. The text uses them to illustrate operational savings from using a single adaptable model.
Prompt evaluation: A Bedrock Prompt Management capability that lets teams test a prompt version against a set of test cases and score the outputs before committing to production. Teams define a dataset of representative inputs and expected output criteria, run an evaluation job, and review a structured report.
prompt hijacking: A risk in which user-editable CRM record fields contain adversarial instructions that can influence the prompt. The text notes it requires adversarial intent and is not the most likely explanation for gradual quality degradation across many records.
Prompt Management: A Bedrock feature that controls the prompts teams use, but cannot prevent a user from asking prohibited questions.
prompt resource: A named object in Bedrock Prompt Management that stores the full prompt text, the associated model, inference parameters, and metadata. When the prompt text or parameters change, a new version is created and the previous version is retained.
Prompt security risks: A set of security risks associated with prompt engineering, including Exposure, Poisoning, Hijacking / Injection, and Jailbreaking, each tied to a different attack target and mitigation approach.
prompt templates: Templates stored, versioned, and shared by Amazon Bedrock Prompt Management for consistent production use.
Prompt variables: The parameterization mechanism in Bedrock Prompt Management. A prompt author defines placeholders in stored prompt text, and the application fills them at runtime with actual request values, separating stable prompt elements from variable user data.
prompt-and-response pairs: A dataset format used for instruction tuning, where each prompt is written as an explicit instruction and each response demonstrates the desired behavior.
Prompt-attack filter: A Guardrails detector for jailbreak and prompt-injection patterns in user input. It is distinct from the harm-category content filters and catches attempts to override the system prompt or bypass content rules.
prompt-chaining pipelines: Pipelines built by Amazon Bedrock Flows in the Bedrock console.
prompt-injection: A prompt-attack pattern that the Prompt-attack filter is designed to detect. The text uses it as an example of input intended to override the system prompt or bypass content rules.
prompt-injection defenses: Defensive controls designed to resist prompt injection by instructing the model to ignore instructions that follow a certain template. The text says these defenses become less effective once the template is known.
prompt-response pairs: Pairs generated by the teacher model in distillation and used as the student's training data.
prompted base model: A base model used with prompting rather than fine-tuning; the text contrasts its quality with that of a fine-tuned model.
prompts: Inputs that must be engineered for variability; poorly designed prompts are associated with exposure, poisoning, hijacking, and jailbreaking.
protected demographic group: A demographic group that may be systematically favored or penalized by bias in a model's predictions.
provisioned: Set up in advance; in serverless inferencing, a persistent endpoint does not need to be pre-provisioned.
provisioned cluster: An Aurora deployment type mentioned as distinct from the standard RDS infrastructure used by Amazon RDS for PostgreSQL.
provisioned throughput: An Amazon Bedrock option that lets architects work within data-residency constraints while maintaining availability, and that can reduce latency variance for latency-sensitive production workloads.

Q

quality standards: Standards that each stage of the data lifecycle should have.
quantile forecasts: Forecast outputs computed by Amazon Forecast, such as p50 and p90 demand levels.
quarterly model review process: A governance process that receives alerts from CloudWatch alarms so that detected drift triggers a documented, scheduled review rather than going unaddressed.

R

RAG: An abbreviation for retrieval-augmented generation.
RAG corpus: The collection of documents scanned by Macie to detect sensitive documents before ingestion.
RAG documents: Documents stored in S3 buckets that are used in retrieval-augmented generation and may be subject to anomalous access patterns.
RAG evaluation: The evaluation of RAG pipelines, split into two independent concerns: retrieval quality and generation quality.
RAG grounding: A technique for reducing hallucinations in which the model is instructed to answer only from the documents retrieved for the current query, with retrieved documents inserted into the prompt context and the system prompt instructing the model to cite sources and decline to answer if the documents do not contain the needed information.
RAG index: A storage location for semantic memory containing general world or domain knowledge. The text says semantic memory may be stored in model weights or a RAG index.
RAG ingestion pipelines: Pipelines that consume data for RAG applications and are recorded in data lineage.
RAG knowledge base: The knowledge base used for a RAG application; in the question it is expected to contain approximately 200,000 document chunks.
RAG pipelines: Retrieval Augmented Generation pipelines, where longer context windows may eliminate the need for complex chunking strategies.
RAG retrieval layer: The retrieval component in a RAG system; precision@k is described as a metric for this layer.
Ragas: An open-source evaluation framework that computes RAG metrics automatically by running a judge model over the retrieved documents and the generated answer.
RDS: The standard infrastructure on which Amazon RDS for PostgreSQL runs, contrasted in the text with Aurora serverless or provisioned clusters.
RDS for PostgreSQL cluster: An existing PostgreSQL cluster used for transactional data that can be extended with pgvector to avoid a new service.
RDS PostgreSQL: An AWS managed service listed as supporting vector storage, using pgvector and described as lightweight.
real-time inference endpoints: Endpoints to which models are deployed in SageMaker AI for real-time inference.
recall drop: A decline in recall, used in the self-check scenario to indicate that the model is missing genuine positive cases. The text links this decline to concept drift when customer behavior patterns have shifted.
recall-oriented: Describes ROUGE as emphasizing overlap coverage relative to the reference rather than precision.
recall@k: A retrieval metric asking, of all the relevant documents in the corpus, what fraction appeared in the top-k results.
red-team exercises: Exercises that reviewers and approvers are trained to conduct as part of role-specific training.
regional coverage: A practical constraint based on the fact that not every model is available in every AWS region. It may require cross-region inference or switching to an alternative model available in the desired region.
Regression: A supervised learning problem type that predicts a continuous numerical value, such as expected revenue from a customer in the next quarter.
regulatory audit: An audit that requires the team to demonstrate exactly which prompt was in use on a specific date three months ago and show that no unauthorized change was made to it.
regulatory compliance: The compliance area supported by AWS services in Task Statement 5.2.
regulatory or compliance requirements: A condition that strongly favors traditional ML when it demands a fully auditable, reproducible decision path.
reinforcement learning from human feedback: A training and fine-tuning topic included in Task 3.3.
reinforcement learning from human feedback (RLHF): A study format in which side-by-side comparison is the standard preference-collection method. The text identifies side-by-side comparison as the standard format for preference collection in RLHF studies.
Repeatable processes: Versioned, parameterized pipelines that replace ad hoc scripts and produce consistent outputs from consistent inputs. The text says SageMaker Pipelines defines steps in code, stores each step's output as a versioned artifact, and integrates with the model registry to gate deployments on evaluation thresholds.
representative-deficient dataset: A dataset that lacks sufficient representation of the target cases, especially edge cases. The text says a larger model trained on such a dataset will still perform poorly on edge cases.
Responsible AI: An AI system property that must be designed in from the start rather than bolted on after the model is built. It includes bias, fairness, inclusivity, robustness, safety, and veracity, and it affects how AI projects are scoped, how data is collected, and how outputs are monitored.
retraining pipeline: A pipeline that can receive corrected labels from human review and use them for retraining.
retrieval: A retrieval step that Amazon Q Business handles automatically when working with enterprise documents, enabling grounded answers from organizational data.
retrieval API: A component handled by Amazon Bedrock Knowledge Bases that provides retrieval functionality as part of the RAG implementation.
Retrieval Augmented Generation: A mitigation approach that grounds model responses in retrieved source material. In the text, it is recommended first for hallucination in a product catalog use case, and it constrains the model to generate responses from retrieved product records rather than parametric memory.
Retrieval Augmented Generation (RAG): A technique, along with Amazon Bedrock Knowledge Bases, used to solve the most common grounding problem.
retrieval logic: The logic needed to connect to data sources and retrieve relevant information. The text says a team using Amazon Bedrock with manual prompt engineering would need to build this themselves.
Retrieval metrics: Metrics used to evaluate retrieval performance in an application architecture; examples in the text include Precision@k and Recall@k for a RAG pipeline, and tool selection accuracy and step efficiency for an AI agent.
retrieval quality: In RAG evaluation, the measure of whether the vector store returned the right documents when given the user's query.
retrieval-augmented generation: The full pipeline managed by Knowledge Bases for Amazon Bedrock, involving document ingestion, chunking, vector embedding generation, storage in a managed vector store, and retrieval of relevant chunks at inference time.
retrieval-augmented generation (RAG): An adaptation technique that extends the model's knowledge by fetching relevant documents at inference time.
retrieval, grounding, and answer-generation components: The components the text says would need to be built from scratch if using SageMaker AI with a custom-trained model for the enterprise Q&A use case.
RFPs: Requests for proposals in which criteria such as cost, modality, latency, and multilingual support appear.
RLHF: A process that collects human judgments to improve model weights. In the text, it is contrasted with A2I, which routes inference outputs for human review rather than collecting judgments for weight updates.
ROI: The ratio of net financial benefit to total cost over a defined period. In the example, a fraud detection model that prevents $2 million in annual losses against a total annual cost of $400,000 has an ROI of 400 percent. It is used to justify AI investment to financial leadership and to determine whether a project receives continued funding after initial deployment.
role-specific training: Training tailored to particular roles. For AI builders it covers data governance requirements, model card documentation, Amazon Bedrock Guardrails, SageMaker Model Monitor, and the threat model for generative AI systems; for reviewers and approvers it covers evaluating model cards, interpreting drift monitoring outputs, and conducting model-card reviews and red-team exercises.
roleplay prompt: A prompt that uses roleplay as the attack technique in Jailbreaking, specifically to bypass refusals.
ROUGE: A technical evaluation method.
ROUGE-L: The most common ROUGE variant, which counts the longest common subsequence of words between the candidate and the reference.
runtime config: The limited runtime configuration available in Amazon Bedrock's managed API.
runtime inference quality: The quality of output during runtime inference, which the text says is not affected by data poisoning in a system where the model itself is unchanged.

S

S3: An enterprise data source that Amazon Q Business can connect to as part of its built-in connectors.
S3 block-public-access: A setting referenced in the AWS Config rule example that provides historical compliance data when used to require that S3 public access be blocked.
S3 block-public-access settings: Settings that, when required by an AWS Config rule, provide historical compliance data showing that S3 public access has been blocked.
S3 bucket policies: Policies that restrict which IAM principals can read training data.
S3 buckets: A company data source that Amazon Q Business can connect to.
S3 data events: CloudTrail data events enabled on an S3 bucket to produce a log of every read and write against training files.
S3 data source: The synchronized data source used with Amazon Bedrock Knowledge Bases in the text. It is the source from which the knowledge base is kept updated.
S3 Object Lock: A control that prevents modification or deletion of objects for a defined retention period, making it impossible for an attacker with write access to alter the historical training record.
S3 training-data buckets: Customer S3 buckets containing training data in the audit scenario; the text says auditors require evidence that they have not been publicly accessible at any point in the past year.
S3 Versioning: The S3 mechanism that preserves all previous states of an object so changes are detectable and reversible.
safety evaluation dimension: The Bedrock Model Evaluations dimension that checks for harmful, toxic, or inappropriate content before a model is placed in production.
safety fine-tuning: The model provider's safety training that is targeted by jailbreaking.
SageMaker: An AWS service with which A2I integrates; the text specifically says A2I integrates with SageMaker models.
SageMaker AI: An AWS service used for FM adaptation through fine-tuning on private data.
SageMaker AI Async Inference endpoints: AWS endpoints that accept large payloads, queue them, and write outputs to S3 for retrieval.
SageMaker AI Endpoints: A deployment service for custom ML that supports real-time custom model inference.
SageMaker AI Training: A managed service for classical model training that runs distributed training jobs.
SageMaker Clarify: A tool that can quantify the contribution of techniques such as regularization, cross-validation, and data augmentation by comparing bias metrics before and after their application, giving teams evidence that mitigation efforts produced measurable results.
SageMaker Feature Store: A centralized feature registry used in feature engineering, and also cited as a practice for managing technical debt so the same feature transformation is used consistently in training and inference.
SageMaker inference endpoints: Endpoints whose application-level logs are retained by Amazon CloudWatch Logs and may capture runtime errors and output patterns that reveal unintended data exposure.
SageMaker JumpStart: An AWS tool used for training job execution, specifically for fine-tuning compute and orchestration.
SageMaker Model Card: A structured, human-readable model documentation artifact that records a model's intended use cases, training data provenance, evaluation results across subgroups, known limitations, ethical considerations, and usage restrictions.
SageMaker Model Monitor: A service used for evaluation and monitoring. It performs performance and quality scoring, continuously compares live inference data against a baseline dataset, and raises alarms when drift exceeds a threshold.
SageMaker Model Registry: A SageMaker AI component where models are registered.
SageMaker models: Models integrated with A2I. The text states that A2I integrates with SageMaker models.
SageMaker Pipelines: ML-native CI/CD within Amazon SageMaker AI for automating the pipeline end to end.
SageMaker Python SDK: An AWS tool through which Model Cards can be published.
SageMaker Studio: The integrated development environment within Amazon SageMaker AI.
SageMaker training volumes: Training volumes for SageMaker whose encryption at rest can be enforced by an AWS Config rule.
SageMaker-hosted models: Models hosted on SageMaker that can use Amazon Augmented AI as the human-review workflow layer. The text mentions them in the context of A2I.
sampling process: The process inside most generative models in which the next token is selected probabilistically rather than deterministically, causing nondeterminism.
savings on inference: The cost savings that make distillation economically favorable in high-volume applications.
SDK: An acronym for software development kit. In the text, Strands Agents is described as an AWS open-source SDK for building agentic applications.
self-attention: A mechanism used by the transformer architecture to weigh the relevance of every token in a sequence against every other token when producing each output token. It helps maintain long-range dependencies in text.
self-enhancement bias: The bias that occurs when a model judges its own outputs and favors text that resembles its own style.
self-managed EC2 GPU instances: The hosting environment mentioned for open-source pre-trained models in the self-check scenario. The text says deploying on reserved or spot EC2 instances can reduce marginal cost significantly at 50 million monthly requests.
self-supervised: A learning approach required for unlabeled datasets to extract patterns, according to the text.
self-supervised learning: The learning approach used in the FM lifecycle's pre-training stage, where unlabeled text is used at scale to create general capabilities before later narrowing to specific tasks.
self-supervised techniques: Techniques required for unlabeled datasets to extract patterns.
semantic chunking: A chunking method supported by Bedrock Knowledge Bases that splits documents at natural topic boundaries identified by a secondary model.
semi-supervised: A learning approach that is incorrect in the scenario because no records are labeled.
semi-supervised learning: A learning paradigm that requires at least some labeled examples to guide the model. It is not appropriate when no records are labeled.
SFT: An abbreviation for supervised fine-tuning.
SHAP: A feature attribution method used by Amazon SageMaker Clarify to quantify how much each input feature contributed to a model's prediction.
shared on-demand inference pool: The shared inference pool through which custom models cannot be served, according to the text.
single-shot prompting: A simpler pattern that agentic AI can outperform for tasks requiring multiple tool lookups, mid-execution adaptation, or coordination across specialized sub-systems.
SLA: A response-time standard referenced in the text; the application may need to meet single-digit millisecond latency, and a hard-coded decision path may be the only option that satisfies it.
SOC 2: A third-party compliance report available for download in AWS Artifact.
software artifacts: Prompts are described as software artifacts that require version control, testing, review, and controlled deployment.
sparse or truncated records: CRM data quality changes that leave the model with less information than the prompt assumes.
SQL: A language that users can write through Amazon Quick.
Stability AI: A model family in Amazon Bedrock that handles image and multimodal generation tasks.
STEM: One of the subject areas included among the 57 academic subjects covered by MMLU.
storage: One of the infrastructure components whose geographic placement controls data residency, including the restriction of data buckets to EU regions.
structured evaluation plan: A planned approach to evaluating a foundation model application before deployment. The text compares shipping without one to releasing software without testing.
summarization evaluation: The evaluation of summaries using ROUGE, which is described as the standard metric for summarization because the task requires the key information from the source document to be present in the summary.
supervised fine-tuning: A fine-tuning approach mentioned in connection with adapting models for specific domains, where labeled examples are used to shift the model toward a target task.
supervised learning: A type of learning that must be crisply distinguished from unsupervised learning. The text does not provide further operational details.
supporting storage: Storage that, along with Amazon Bedrock endpoints, should be configured only in eu-* regions in the scenario.
surface-level lexical measure: A metric that evaluates text based on exact word-level overlap rather than meaning. The text contrasts this with semantic similarity measures and notes that ROUGE is such a measure.
sustained high throughput: The high-volume condition under which dedicated instances in Amazon SageMaker AI can be cheaper.
system prompt: A long static prefix that can be reused through prompt caching.
System prompt override: The attack target for Hijacking / Injection; it refers to overriding the system prompt through malicious user input.
system prompts: Longer system prompts increase token count and therefore the bill under token-based pricing.
systematic data pipeline: A structured data preparation approach that organizations should invest in from the start to avoid repeating training jobs more often and at higher total cost.
systematically biased labels: A type of training-data problem that, if present, is inherited by the model's outputs.

T

target set of prompts: The collection of prompts on which the teacher model generates responses for distillation training data.
team training: A governance protocol that helps make data-governance strategies work in practice.
Team training requirements: Requirements recognizing that governance programs fail when the people operating AI systems do not understand the relevant policies and risks. They include annual AI literacy training for all employees, role-specific training for AI builders, and role-specific training for reviewers and approvers, with completion records maintained in the governance repository.
technical accuracy targets: Accuracy goals that a model may meet while users still quietly stop engaging with it. The text contrasts these with actual user engagement and business success.
text classification: A built-in task type supported by Amazon SageMaker Ground Truth.
text-quality evaluation: The broader evaluation use case in which BERTScore is increasingly applied.
third-party evaluations: Evaluations that can be run when deploying an open-source model.
throughput: A factor that inferencing patterns are chosen to solve, together with latency requirements.
Throughput guarantee: A trade-off dimension comparing on-demand shared-pool inference with provisioned throughput; the lower-cost option gives up predictability under high concurrent load.
TII: A provider named in the text whose Falcon series models are available in SageMaker JumpStart.
TLS: The protocol used for encryption in transit to protect data moving between the application and the Bedrock API endpoint, between Bedrock and S3 buckets, and between the agent runtime and external tools.
TN: Abbreviation for True Negative in the confusion matrix.
token budget: The amount of tokens available per interaction; the text says this is a key question to ask when evaluating a generative-AI solution.
token cost: The cost that accumulates rapidly when using a foundation model at high volume; the text contrasts this with the predictable cost of a trained classifier.
token economics: The cost structure implied by token-based pricing, where understanding token consumption is directly relevant to budget planning for generative AI projects.
token level: The level at which BERTScore compares corresponding units of text after encoding them into vectors.
token-based compute charge: An ongoing compute cost model for foundation models in which each inference call is priced by the number of tokens in the input and output. Traditional ML models are described as having no such ongoing charge once trained.
token-based cost: A cost structure for foundation models in which cost depends on token usage; it is acceptable when application volume is low or tasks are sufficiently unique.
token-based pricing: A V1.1 addition to the foundational generative-AI vocabulary. It refers to pricing based on tokens, consistent with the text’s statement that generative systems are billed by tokens, not by predictions.
token-based pricing model: A pricing model in which you pay for the number of tokens consumed, both coming in and going out, rather than for the compute resource that ran the request.
token-level attributions: Explainability outputs that Amazon SageMaker AI can surface; the text notes they are imperfect approximations rather than true causal explanations.
token-limit errors: Errors produced when on-demand inference is throttled at high request rates; the text notes that these errors create visible failures in customer-facing applications.
tokenization: The first step in processing text with a language model, in which text is converted into token IDs by a tokenizer.
tokenizer: The component that converts raw text into token IDs during tokenization.
tokens: The unit used to bill generative systems; the text states that generative systems are billed by tokens, not by predictions, and that token budget per interaction is a key planning question.
tokens-per-minute: The throughput measure guaranteed by Provisioned Throughput when a team reserves a specified number of model units for a set period.
tokens-per-minute throughput level: The minimum throughput level guaranteed by provisioned throughput, measured in tokens per minute.
Tool selection accuracy: A metric for agents that measures whether the agent chose the correct tool at each step, especially when multiple APIs are available and the right choice is deterministic given the task description.
tool use: A capability provided by Amazon Bedrock Agents and Amazon Bedrock AgentCore for agentic AI.
TP: Abbreviation for True Positive in the confusion matrix.
traditional ML model: A model trained on labeled data for a specific, well-bounded task, with high interpretability, low latency, predictable cost, and no ongoing token-based compute charge once trained.
traditional ML models: Models used in production architectures for narrow prediction problems where explainability is mandatory.
training and hosting cost: The cost that must be justified when choosing fine-tuning, which is appropriate only when the performance improvement outweighs this cost.
training and model-artifact storage: A stage in the typical lifecycle of an AI training dataset.
training and retraining pipeline: The pipeline an organization must be able to manage if it has sufficient ML engineering capacity; this is cited as a condition favoring traditional ML.
training compute: The measurable compute effort used for model training, often quoted first by engineering teams.
training corpus: The collection of training data that should be screened and sanitized for PII before training begins.
training data: The data used to train a model; bias can arise from flawed training data, and fit is assessed relative to it.
training data bucket: The S3 bucket holding the training data that Macie scans for PII before ingestion.
training data cards: Documentation published by open-source models that describes the training corpus composition and supports review of data provenance risks.
training data coverage: The extent to which the training data includes the kinds of inputs the model must handle. The text identifies this as the real constraint because edge cases are missing from the training set.
Training data storage: A data preparation activity that provides secure, scalable input to training jobs. The text lists Amazon S3 as the AWS tool used for it.
training data tampering: The risk addressed by Amazon S3 Object Lock and S3 Versioning.
training dataset: The dataset to which an algorithm is applied to produce a model.
training dataset itself is imbalanced: A condition evaluated by pre-training bias metrics.
training distribution: The distribution represented by the training dataset, from which production input data may diverge in data drift.
training epochs: A training setting that could be increased, but the text presents it as not the most direct fix for underrepresented edge cases.
training examples: Examples used to fine-tune a model; they must accurately represent the distribution of inputs the deployed model will encounter.
training infrastructure: The infrastructure for which the customer is fully responsible in Scope 5 when training a model from scratch.
Training job execution: A data preparation activity described as fine-tuning compute and orchestration. The text lists Amazon Bedrock custom models and SageMaker JumpStart as the AWS tools used for it.
training jobs: Jobs that receive secure, scalable input from Amazon S3 and are part of the fine-tuning process.
training process: The process used to train or fine-tune the model. In Scope 4, the customer is responsible for the fine-tuning process itself.
training run: A repeated training iteration that occurs after data is corrected or augmented.
training S3 bucket: A storage bucket used for training data. The text mentions CloudTrail data events on the training S3 bucket in an incorrect answer choice and also refers to restricting training data buckets to EU regions.
training signal: The signal provided by the reward model during RLHF to guide reinforcement learning of the main model.
training-set accuracy: The accuracy measured on the training data; the text says it can be high in overfitting even though performance on new data is poor.
transformer: An architecture introduced in 2017 that uses self-attention to weigh the relevance of every token in a sequence against every other token when producing each output token.
Transformer models: A typical machine learning approach for text data.
Transformer-based large language models (LLMs): The dominant architecture for modern language tasks. They use the transformer architecture and self-attention to weigh the relevance of every token in a sequence against every other token when producing each output token.
transparency: A related but distinct topic from explainability; the text says it involves trade-offs with safety and is documented and surfaced by certain AWS tools.
transparency and explainability: A related but distinct topic in Domain 4 that focuses on distinguishing transparent and explainable models from opaque ones, discussing trade-offs between safety and transparency, and using AWS tools to document and surface explainability.
transparency standards: A governance protocol that helps make data-governance strategies work in practice.
transparent: A model characteristic contrasted with an opaque model in the context of transparency and explainability.

U

ULMFiT: An acronym for Universal Language Model Fine-tuning for Text Classification, referenced in the cited paper title.
unbalanced dataset: A dataset with extreme class imbalance; the text's fraud example has 999 legitimate transactions for every 1 fraudulent transaction, allowing a model to achieve 99.9% accuracy by predicting legitimate for everything while failing at the actual task.
unlabeled records: Records gathered in data collection that the model will learn from, even though they do not have labels.
unsupervised: A learning approach required for unlabeled datasets to extract patterns, according to the text.
unsupervised corpus: The unlabeled domain text used in continuous pre-training, described as the same type of corpus used in original pre-training but drawn from a specific domain.
unsupervised learning: A type of learning that must be crisply distinguished from supervised learning. The text does not provide further operational details.
US: A jurisdiction mentioned as one of the places where regulators are putting concrete obligations on organizations that deploy AI.

V

vector backend: A vector storage service used by Amazon Bedrock Knowledge Bases as the underlying backend for storing and searching embeddings.
vector database: A specialized data store optimized for nearest-neighbor lookups. The text says vectors produced by embedding models are stored in it for similarity retrieval.
vector embeddings: Representations generated from documents in the Knowledge Bases pipeline and stored in a managed vector store for later retrieval.
vector engine: The OpenSearch component described as being optimized for large-scale, high-throughput semantic search workloads.
vector search: Search using vectors, mentioned as a capability of Amazon Neptune Analytics and as part of vector-database-based retrieval.
vector search operation: A component of cost per interaction in a RAG pipeline; the text says it is included in the total cost but does not further define it.
vector store: A storage layer used by RAG to hold indexed report content so relevant sections can be retrieved when questions are asked. In the text, it is synchronized from the report repository and can be updated in minutes when new reports arrive.
vector stores: Storage systems that back knowledge bases used for RAG pipelines in Bedrock.
vector-storage options: AWS storage options used in Task 3.1: Amazon OpenSearch Service, Amazon Aurora, Amazon Neptune, and Amazon RDS for PostgreSQL.
vector-store integration: A component handled by Amazon Bedrock Knowledge Bases that connects the system to vector storage as part of the RAG implementation.
vectors: The vector representations stored in the knowledge base. The text states that at 200,000 vectors both services are technically capable.
versioned artifact: A stored prompt artifact that, when versioned, produces the same distribution of outputs every time the same input arrives.
VPC: The network boundary inside which AWS PrivateLink provides a private endpoint for connecting to Amazon Bedrock APIs.
VPC endpoint: The private network endpoint created for Amazon Bedrock using AWS PrivateLink so traffic does not traverse the public internet.
VPC flow logs: Network logs used in the text to detect unusual network patterns from the AI workload.
VPC isolation: Isolation using a VPC, mentioned as part of the security context for addressing prompt injection.
VRAM: GPU memory available on the instance; in the example, the available GPU instance has 24 GB of VRAM.

W

web-crawled training data: Training data gathered from the web that can inadvertently absorb benchmark answers and inflate benchmark scores.
Workflow evaluation: The evaluation of multi-step pipelines that combine FM calls, RAG retrievals, business logic, and human handoffs into a complete business process.

Z

zero-shot: A standard prompt technique listed with single-shot, few-shot, and chain-of-thought.
zero-shot inference: A form of inference described as viable when the organization has minimal labeled training data but can express the task as a prompt.
zero-shot prompt: A prompt used in in-context learning that gives no examples.
zero-shot prompting: A prompting approach that can categorize text without labeled training data, but the text says it is less suitable at very high volume because token cost accumulates rapidly and latency is higher than a trained classifier.

About These Definitions

These definitions are loaded from the shared release pack. Use them with the study guide and practice questions to connect vocabulary to exam scenarios.

Download App Read the full study guide Take the free practice exam