Generative AI Leader Practice Q35

A. Multimodality - native input of text, images, audio, and video in one model

Google’s Gemini models are designed with native multimodal input: a single model can process text, images, audio, and video together rather than relying on a separate system for each. That fits the retailer’s use case directly, because the conversation combines a product image, a written review, and an audio voicemail in one interaction. This is the kind of cross-modal input Gemini is built to handle.
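To make "one model, several media types" concrete, here is a minimal sketch in Python. It does not call Gemini or any real SDK: the Part class, the build_request and modalities helpers, the model name, and the request shape are all hypothetical, invented only for illustration. It shows the retailer's image, review, and voicemail travelling in a single request rather than three.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Part:
    """One piece of input. Hypothetical type, not a real SDK class."""
    mime_type: str  # e.g. "image/jpeg", "text/plain", "audio/wav"
    data: bytes


def build_request(parts: List[Part]) -> dict:
    """Bundle every part into ONE request for ONE model (assumed shape)."""
    return {
        "model": "hypothetical-multimodal-model",  # placeholder name
        "parts": [
            {"mime_type": p.mime_type, "size_bytes": len(p.data)}
            for p in parts
        ],
    }


def modalities(request: dict) -> set:
    """Return the top-level media types present in the request."""
    return {part["mime_type"].split("/")[0] for part in request["parts"]}


# The retailer scenario: image + written review + voicemail together.
request = build_request([
    Part("image/jpeg", b"<product photo bytes>"),
    Part("text/plain", "The zipper broke after a week.".encode("utf-8")),
    Part("audio/wav", b"<voicemail bytes>"),
])

print(sorted(modalities(request)))
```

The point of the sketch is the shape of the call: a pipeline without native multimodality would need a separate vision model, speech-to-text step, and language model, with glue code to merge their outputs.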

B. Federated personalization across user devices

Federated personalization adapts models to individual users across their devices; it does not provide joint understanding of image, text, and audio inputs.

C. Quantization-aware training for on-device inference

Quantization-aware training is a model optimization technique for efficient deployment, not a cross-media understanding capability.

D. Symbolic reasoning over a knowledge graph

Symbolic reasoning over knowledge graphs concerns structured logic, not native processing of mixed media in one model.

