Question 15
Domain 2: ML Model Development
An ML engineer has trained a neural network by using stochastic gradient descent (SGD). The neural network performs poorly on the test set. The values for training loss and validation loss remain high and show an oscillating pattern. The values decrease for a few epochs and then increase for a few epochs before repeating the same cycle. What should the ML engineer do to improve the training process?
Correct answer: D
Explanation
SGD is sensitive to the learning rate, and an oscillating loss that drops for a few epochs then rises again usually means the updates are too large and overshoot the minimum. Decreasing the learning rate makes parameter updates smaller, which stabilizes training and helps the model converge instead of bouncing around.
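The overshooting effect can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the network from the question: it uses plain (full-batch) gradient descent on the one-dimensional function f(w) = w**2, and the function name `run_gradient_descent`, the starting point, and the two learning rates are hypothetical choices made for the demonstration.

```python
def run_gradient_descent(learning_rate, start=4.0, steps=20):
    """Minimize f(w) = w**2 with plain gradient descent.

    Returns the loss recorded before each update.
    Toy setup (assumption): one parameter, exact gradient, no mini-batches.
    """
    w = start
    losses = []
    for _ in range(steps):
        losses.append(w ** 2)              # loss at the current parameter value
        gradient = 2.0 * w                 # derivative of w**2
        w = w - learning_rate * gradient   # update is scaled by the learning rate
    return losses


# Learning rate too large: each step jumps to the opposite side of the minimum.
large_step_losses = run_gradient_descent(learning_rate=1.0)

# Smaller learning rate: each step moves part of the way toward the minimum.
small_step_losses = run_gradient_descent(learning_rate=0.1)

print("lr=1.0 first and last loss:", large_step_losses[0], large_step_losses[-1])
print("lr=0.1 first and last loss:", small_step_losses[0], small_step_losses[-1])
```

With the larger learning rate the parameter flips between 4.0 and -4.0, so the loss never improves; with the smaller one the loss shrinks at every step. A real network trained with mini-batch SGD shows a noisier version of the same behavior.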
Why each option is right or wrong
A. Introduce early stopping.
Early stopping limits overfitting; it does not fix unstable optimization with persistently high training loss.
B. Increase the size of the test set.
Test set size affects evaluation confidence, not the SGD updates or convergence behavior.
C. Increase the learning rate.
A higher learning rate usually worsens overshooting and makes oscillations more severe.
D. Decrease the learning rate.
In SGD, the learning rate directly scales each parameter update. When it is too large, the optimizer overshoots the minimum, producing the repeated down-up oscillation seen in both training and validation loss. Reducing the learning rate shrinks the step size and is the standard fix for unstable convergence in gradient-descent-based training, especially when the loss does not trend steadily downward over epochs.
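Why step size matters can be made concrete with a short worked derivation. This is a sketch for a simplified case, assuming a one-dimensional quadratic loss with curvature a > 0; the symbols η (learning rate), a, and w_t (parameter at step t) are introduced here for illustration and do not come from the question.

```latex
% Gradient descent update with learning rate \eta
w_{t+1} = w_t - \eta \, \nabla L(w_t)

% Assumed quadratic loss: L(w) = \tfrac{a}{2} w^2, so \nabla L(w) = a w
w_{t+1} = (1 - \eta a) \, w_t

% The iterates shrink toward the minimum at w = 0 only when
|1 - \eta a| < 1 \iff 0 < \eta < \tfrac{2}{a}
```

In this simplified setting, a learning rate between 1/a and 2/a already overshoots the minimum on every step but still converges, while a learning rate of 2/a or more never settles. Lowering η moves the update back into the stable range.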