Exit 25 / 40

Question 25

Domain 6: Evaluation and Monitoring

A team keeps changing prompts, model versions, and judge rubrics during experiments. Which practice best preserves reproducibility?