Question 37
Section 3
Which output-control parameter primarily caps the LENGTH of a model's generation?
Correct answer: A
Explanation
The output-length setting, often exposed as "max output tokens," caps how many tokens the model can generate before it stops. It primarily controls the LENGTH of the response, unlike settings such as temperature, which affect randomness or style.
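To make the distinction concrete, here is a minimal sketch of a toy decoding loop. It does not call any real LLM API: the function `generate`, the four-token vocabulary, and the weights are all hypothetical and chosen only for illustration. It shows that the length cap bounds the loop itself, while temperature only reshapes the weights used to pick each token.

```python
import random

def generate(max_output_tokens, temperature=1.0, seed=0):
    """Toy decoding loop: sample tokens until <eos> or the length cap is hit.

    Hypothetical stand-in for a model; temperature must be > 0.
    """
    rng = random.Random(seed)
    vocab = ["the", "cat", "sat", "<eos>"]
    base_weights = [5.0, 3.0, 2.0, 0.5]  # made-up token weights

    # Temperature reshapes the sampling weights (w ** (1/T)).
    # It changes WHICH tokens are likely, not HOW MANY are emitted.
    weights = [w ** (1.0 / temperature) for w in base_weights]

    tokens = []
    while len(tokens) < max_output_tokens:  # the hard cap on length
        token = rng.choices(vocab, weights=weights, k=1)[0]
        if token == "<eos>":  # the model may also stop early on its own
            break
        tokens.append(token)
    return tokens

if __name__ == "__main__":
    for t in (0.2, 1.0, 2.0):
        out = generate(max_output_tokens=5, temperature=t)
        print(t, len(out), out)
```

Whatever temperature is used, the output never exceeds the cap. The cap is an upper bound rather than a target: generation can still end earlier if the end-of-sequence token is sampled.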
Why each option is right or wrong
A. Output length / max output tokens
Under the standard generation controls used by LLM APIs, the parameter that sets the hard stop on how many tokens the model may emit is the output-length limit, commonly exposed as max_output_tokens or max_tokens. By contrast, parameters like temperature or top_p only change sampling behavior and do not impose a numeric ceiling on the response length.
B. Top-p
Top-p (nucleus sampling) restricts token selection to the smallest set of most-likely tokens whose cumulative probability reaches p. It affects the diversity of the output, not the response length.
C. Temperature
Temperature changes randomness in token choice, not the maximum number of tokens generated.
D. Safety threshold
A safety threshold filters or blocks unsafe content; it is not a length-control setting.
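As an illustration of why top-p (option B) shapes diversity rather than length, the following sketch applies a nucleus filter to a single next-token distribution. The helper name `top_p_filter` and the example probabilities are hypothetical; real APIs apply this step internally at each decoding position.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of highest-probability tokens whose
    cumulative probability reaches p, then renormalize to sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        total += prob
        if total >= p:  # nucleus is complete
            break
    return {token: prob / total for token, prob in kept.items()}

if __name__ == "__main__":
    # Made-up next-token distribution for one decoding step.
    next_token_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
    print(top_p_filter(next_token_probs, 0.7))
    print(top_p_filter(next_token_probs, 0.4))
```

A lower p narrows the set of candidate tokens at each step, which makes output less varied. Nothing in this filter counts or limits how many steps are taken, which is why it is not a length control.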