Question 1

Domain 1: Data Preparation for Machine Learning (ML)

An ML engineer needs to use data with Amazon SageMaker Canvas to train an ML model. The data is stored in Amazon S3 and is complex in structure. The ML engineer must use a file format that minimizes processing time for the data. Which file format will meet these requirements?

A. CSV files compressed with Snappy
B. JSON objects in JSONL format
C. JSON files compressed with gzip
D. Apache Parquet files

Explanation

Why each option is right or wrong
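
As background for comparing the options: CSV and JSON/JSONL are row-oriented text formats, while Apache Parquet is a binary, column-oriented format that stores a schema alongside the data. The sketch below is a toy illustration of that layout difference using only the Python standard library. It does not read or write real Parquet files, and the sample records and helper names are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical sample records with a nested field, standing in for "complex" data.
records = [
    {"id": 1, "label": "cat", "features": {"w": 4.0, "h": 2.5}},
    {"id": 2, "label": "dog", "features": {"w": 9.5, "h": 6.0}},
    {"id": 3, "label": "cat", "features": {"w": 3.8, "h": 2.4}},
]


def to_row_layout(rows):
    """Row-oriented layout (like JSONL): one serialized record per line."""
    return "\n".join(json.dumps(r) for r in rows)


def to_column_layout(rows):
    """Column-oriented layout (the idea behind Parquet): one block per column."""
    columns = {key: [r[key] for r in rows] for key in rows[0]}
    return {key: json.dumps(values) for key, values in columns.items()}


def read_column_from_rows(blob, name):
    """Every record is parsed in full, even though only one field is needed."""
    return [json.loads(line)[name] for line in blob.splitlines()]


def read_column_from_columns(blocks, name):
    """Only the requested column's block is parsed; the rest is never touched."""
    return json.loads(blocks[name])


row_blob = to_row_layout(records)
col_blocks = to_column_layout(records)

labels_from_rows = read_column_from_rows(row_blob, "label")
labels_from_cols = read_column_from_columns(col_blocks, "label")
```

Both reads return the same values, but the column layout gets there without parsing the unrelated fields. Real Parquet adds typed binary encoding and per-column compression on top of this idea, which this toy version does not attempt to model.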