Question 28
Domain 4
Your automated review analyzes comments and docstrings. The current prompt instructs Claude to "check that comments are accurate and up-to-date." Findings frequently flag acceptable patterns (TODO markers, straightforward descriptions) while missing comments that describe behavior the code no longer implements. What change addresses the root cause of this inconsistent analysis?
Correct answer: D
Explanation
The prompt is too vague: “check that comments are accurate and up-to-date” leaves the model to guess what counts as a problem, so it flags harmless items such as TODOs while overlooking comments that describe behavior the code no longer implements. Defining the rule as “flag comments only when their claimed behavior contradicts actual code behavior” targets the real failure mode and prevents overreporting of acceptable comments.
Why each option is right or wrong
A. Include git blame data so Claude can identify comments that predate recent code modifications
Comment age does not prove inaccuracy; old comments can still match current code behavior.
B. Filter out TODO, FIXME, and descriptive comment patterns before analysis to reduce noise
Noise filtering removes some benign comments but does not define what makes a comment wrong.
C. Add few-shot examples of misleading comments to help the model recognize similar patterns in the codebase
Examples may help pattern matching, but unclear evaluation criteria remain the core problem.
D. Specify explicit criteria: flag comments only when their claimed behavior contradicts actual code behavior
The prompt’s current wording is open-ended, so the model is effectively making a subjective judgment about “accuracy” and “up-to-date” status. That explains the false positives on benign TODOs and plain descriptions. The fix is to constrain the review criterion to a concrete contradiction test: flag a comment or docstring only when it asserts behavior that the code does not actually implement. This directly targets the missed stale-comment cases and removes the discretion that leads to over-flagging.
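The difference between the vague instruction and an explicit contradiction test can be sketched as a prompt template. This is only an illustration under stated assumptions: the function name build_comment_review_prompt, the exact wording of the criteria, the <code> delimiters, and the sample snippet are all hypothetical and are not taken from the original review prompt.

```python
# Hypothetical sketch: none of these names or wordings come from the
# original review system; they only illustrate option D.

# The instruction quoted in the question. It names no test for "accurate".
VAGUE_INSTRUCTION = "Check that comments are accurate and up-to-date."

# An explicit criterion: one concrete contradiction test plus stated exclusions.
EXPLICIT_CRITERIA = """\
Review the comments and docstrings in the code below.
Flag a comment ONLY when the behavior it claims contradicts what the code actually does.
Do NOT flag:
- TODO or FIXME markers
- comments that correctly describe the code, even if the description is simple
For each finding, quote the comment, state the claimed behavior, and state the actual behavior.
"""


def build_comment_review_prompt(source_code: str) -> str:
    """Combine the explicit criteria with the code under review."""
    return f"{EXPLICIT_CRITERIA}\n<code>\n{source_code}\n</code>"


# Illustrative input: one stale comment (should be flagged under the
# criteria above) and one TODO marker (should not be).
SAMPLE = '''\
def fetch(fn):
    # Retries up to 3 times before giving up.
    return fn()

# TODO: add exponential backoff
'''

if __name__ == "__main__":
    print(build_comment_review_prompt(SAMPLE))
```

In the sample, the "Retries up to 3 times" comment claims behavior the function no longer has, so it meets the contradiction test, while the TODO marker makes no claim about current behavior and falls under the stated exclusions.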