Question 32
Domain 3: Knowledge Integration, Data Handling, Cognition, Planning, and Memory
You are tasked with building an ETL pipeline that integrates data from a customer relationship management (CRM) platform and a product inventory system into a unified knowledge base for an agentic AI system. Which approach best ensures reliable transformation and alignment of heterogeneous data sources?
Correct answer: D
Explanation
Schema mapping and normalization in the transformation phase align heterogeneous sources by converting different fields, formats, and meanings into a common structure before storage. ETL’s transformation step is where data is standardized so the knowledge store receives consistent, reliable records for downstream AI use.
Why each option is right or wrong
A. Load raw extracts from each source first and apply schema-aware transformations directly in the agent's querying layer at runtime.
B. Use direct database exports from each system without any reconciliation transformation, in order to reduce ingestion processing time.
C. Use hardcoded reconciliation rules embedded in the agent's prompt templates to bridge the differences between source schemas at query time.
D. Apply schema mapping and normalization during the transformation phase before loading into the knowledge store.
The transformation stage of ETL is the point at which heterogeneous source fields are reconciled into a common model, so applying schema mapping and normalization before load is the only choice that prevents mismatched CRM and inventory records from being persisted as-is. In standard ETL practice, this is where differing data types, naming conventions, and value formats are standardized prior to storage, which is essential when the downstream knowledge base must support consistent retrieval and reasoning across sources. Options A and C instead defer reconciliation to query time, and option B skips it entirely, so each of them leaves unreconciled records in the store.
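As a rough illustration of option D, the sketch below maps two differently shaped source records onto one schema and normalizes their values before anything is loaded. All field names, date formats, and the SKU convention are hypothetical assumptions made for the example; the question does not specify the actual CRM or inventory schemas.

```python
from datetime import datetime

# Hypothetical mappings: source field name -> unified field name.
CRM_FIELD_MAP = {
    "CustID": "customer_id",
    "ProductSKU": "sku",
    "PurchaseDate": "order_date",
}
INVENTORY_FIELD_MAP = {
    "sku_code": "sku",
    "qty_on_hand": "quantity",
    "last_updated": "updated_at",
}


def map_fields(record, field_map):
    """Rename source fields to the unified schema, dropping unmapped fields."""
    return {
        target: record[source]
        for source, target in field_map.items()
        if source in record
    }


def normalize_sku(value):
    """Trim, upper-case, and hyphenate a SKU so both sources share one key format."""
    return str(value).strip().upper().replace(" ", "-")


def normalize_date(value, fmt):
    """Convert a source-specific date string to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(value, fmt).date().isoformat()


def transform_crm(record):
    """Schema-map and normalize one CRM record (assumed MM/DD/YYYY dates)."""
    out = map_fields(record, CRM_FIELD_MAP)
    out["sku"] = normalize_sku(out["sku"])
    out["order_date"] = normalize_date(out["order_date"], "%m/%d/%Y")
    return out


def transform_inventory(record):
    """Schema-map and normalize one inventory record (assumed YYYYMMDD dates)."""
    out = map_fields(record, INVENTORY_FIELD_MAP)
    out["sku"] = normalize_sku(out["sku"])
    out["quantity"] = int(out["quantity"])  # source delivers quantity as text
    out["updated_at"] = normalize_date(out["updated_at"], "%Y%m%d")
    return out


crm_row = transform_crm(
    {"CustID": "C-001", "ProductSKU": " ab 123 ", "PurchaseDate": "03/15/2024"}
)
inventory_row = transform_inventory(
    {"sku_code": "AB-123", "qty_on_hand": "42", "last_updated": "20240316"}
)
```

After transformation, both rows carry the same `sku` key and ISO-formatted dates, so the load step and the agent's retrieval layer never have to reason about source-specific formats.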