Question 30
Content Domain 3: Modeling
A team needs to group unlabeled customer records into segments based on similarity, with no target variable available. Which model is most appropriate for this task?
Correct answer: C
Explanation
Use k-means when the goal is to partition unlabeled data into similarity-based groups. Logistic regression and linear regression both require labeled targets, and tree-based methods such as random forests are typically supervised as well. The source material lists k-means among the modeling approaches to consider for model selection tasks.
Why each option is right or wrong
A. Logistic regression
Logistic regression predicts a categorical target and requires labeled outcomes for training.
B. Linear regression
Linear regression predicts a continuous target and requires labeled numeric outcomes.
C. k-means
k-means is the clustering method in the provided model list. It fits this scenario because the records are unlabeled and must be grouped by similarity rather than predicted against a known target.
D. Random forests
Random forests are generally used for supervised prediction tasks with labeled target variables.
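Illustration
The following is a minimal sketch of how k-means groups records without any target variable, assuming two numeric features per customer. The customer values, the kmeans helper, and the choice to seed centroids with the first k points are all hypothetical and chosen only to keep the example deterministic; in practice a library implementation with better initialization would normally be used.

```python
import math


def kmeans(points, k, iterations=20):
    """Minimal k-means sketch (Lloyd's algorithm). No labels are used anywhere."""
    # Assumption: seed centroids with the first k points (deterministic, not robust).
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Hypothetical customer records: (annual spend in $1000s, visits per month).
customers = [(1.0, 2.0), (9.0, 12.0), (1.2, 1.8), (8.7, 11.5), (0.9, 2.2), (9.3, 12.4)]
segments, centers = kmeans(customers, k=2)
print(segments)  # cluster index per customer
```

With this data the low-spend and high-spend customers land in separate segments, so the printed assignments are [0, 1, 0, 1, 0, 1]. The function only ever receives feature values, which is why the supervised options A, B, and D cannot be substituted here: each of them needs a target column to fit against.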