Question 21

Domain 2: Evaluation, Tuning, and Quality Optimization

You’re tasked with choosing the better of two candidate agents based on their performance across multiple tasks, including classification, question answering, and summarization. What is the best strategy for comparing their performance?