Question 1
Section 1
A team is preparing a dataset for a supervised machine learning project. Which dataset characteristic would make the data labeled rather than unlabeled?
Correct answer: B
Explanation
Labeled data includes input data paired with known target outputs or categories, while unlabeled data contains inputs without those assigned answers. Supervised learning relies on labeled examples to learn the mapping from inputs to outputs.
Source material: Identifying the differences between labeled and unlabeled data. Key Terms: labeled data, unlabeled data
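The distinction can be illustrated with a small sketch. The records, the field names "text" and "label", and the helper is_labeled below are hypothetical and chosen only for illustration; they do not come from the source material.

```python
# Hypothetical toy records; the "text" and "label" field names are assumptions.
unlabeled = [
    {"text": "Win a free prize now"},
    {"text": "Meeting moved to 3 pm"},
]

# The same inputs, each paired with a known category (the target output).
labeled = [
    {"text": "Win a free prize now", "label": "spam"},
    {"text": "Meeting moved to 3 pm", "label": "not_spam"},
]


def is_labeled(records, target_key="label"):
    """Return True if the dataset is non-empty and every record has a target value."""
    return len(records) > 0 and all(
        record.get(target_key) is not None for record in records
    )


print(is_labeled(unlabeled))  # False
print(is_labeled(labeled))    # True
```

Both lists have the same structure and the same number of records; only the presence of a target value for each input makes the second one labeled.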
Why each option is right or wrong
A. The dataset contains raw inputs with no assigned categories or outcomes.
Unlabeled data has inputs without associated target values or classes.
B. The dataset pairs each input example with a known category or expected output.
Labeled data is defined by having each input associated with a known output or category. Those attached targets are what a supervised model learns from, and they are what distinguish labeled data from unlabeled data.
C. The dataset includes a large number of records collected from different sources.
Dataset size or source variety does not determine whether data is labeled.
D. The dataset has been stored in a structured table with consistent columns.
Data structure does not determine whether target labels are present.