PMLE Practice Q34

A. Perform data validation to ensure that the input data to the pipeline matches the input data format for the endpoint.

B. Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline and employ the same code in the endpoint.

The endpoint must apply the same preprocessing logic that was used during training; otherwise the model sees differently prepared inputs at serving time (training-serving skew). That is why the transformation code in the batch Dataflow pipeline needs to be extracted into reusable code and invoked at inference time. In this scenario the batch pipeline is not the serving path, so transforms that live only inside that pipeline never run on real-time requests, and those requests would not be normalized the same way as the training data.
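The idea in option B can be sketched as one transformation function that both the batch path and the online path call. This is a minimal illustration, not the pipeline from the question: `FEATURE_STATS`, `preprocess`, `run_batch`, and `handle_request` are hypothetical names, the standardization step is an assumed stand-in for whatever the real transforms are, and the batch path is shown as a plain list comprehension rather than an actual Dataflow job.

```python
from typing import Callable, Dict, List

# Hypothetical statistics, assumed to be computed once from the training data.
FEATURE_STATS = {
    "age": {"mean": 40.0, "std": 10.0},
    "income": {"mean": 50000.0, "std": 20000.0},
}


def preprocess(record: Dict[str, float]) -> Dict[str, float]:
    """Shared transformation: standardize every known feature."""
    return {
        name: (record[name] - stats["mean"]) / stats["std"]
        for name, stats in FEATURE_STATS.items()
    }


def run_batch(records: List[Dict[str, float]]) -> List[Dict[str, float]]:
    # Batch path. In an Apache Beam pipeline running on Dataflow, the same
    # function could be applied with beam.Map(preprocess) instead.
    return [preprocess(r) for r in records]


def handle_request(
    instance: Dict[str, float],
    model: Callable[[Dict[str, float]], float],
) -> float:
    # Online path: the endpoint calls the same function before predicting,
    # so serving inputs are prepared exactly like the training inputs.
    return model(preprocess(instance))


if __name__ == "__main__":
    raw = {"age": 50.0, "income": 70000.0}
    # Both paths produce identical features for the same raw record.
    print(run_batch([raw])[0] == preprocess(raw))
    # A stand-in "model" that just sums the standardized features.
    print(handle_request(raw, lambda features: sum(features.values())))
```

Because both paths import a single function, a change to the transformation reaches training and serving together, which is the property that options A, C, and D do not provide.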

C. Refactor the transformation code in the batch data pipeline so that it can be used outside of the pipeline and share this code with the endpoint's end users.

D. Batch the real-time requests using a time window, preprocess the batched requests using the Dataflow pipeline, and then send the preprocessed requests to the endpoint.

Question 34

Explanation

Why each option is right or wrong