Question 17
Domain 2: Fundamentals of Generative AI
A company is building a mobile app for users who have a visual impairment. The app must be able to hear what users say and provide voice responses. Which solution will meet these requirements?
Correct answer: A
Explanation
Deep learning is “a subset of ML that uses neural networks with many layers,” and it “powers most modern… speech recognition” systems. A neural network can learn speech patterns from audio data, which fits an app that must “hear what users say” and respond by voice.
Why each option is right or wrong
A. Use a deep learning neural network to perform speech recognition.
Speech recognition converts spoken audio into text. The source material identifies it as a deep learning use case powered by neural networks with many layers. On AWS, Amazon Transcribe is the managed speech-to-text service, but it is not among the answer choices. Of the options listed, only a deep learning neural network provides the required audio understanding: trained on speech data, it can infer spoken words and support voice-driven interaction.
B. Build ML models to search for patterns in numeric data.
Models that find patterns in numeric data suit structured prediction tasks; they do not process speech input or produce voice output.
C. Use generative AI summarization to generate human-like text.
Summarization generates text from text; it does not provide speech recognition or text-to-speech.
D. Build custom models for image classification and recognition.
Computer vision interprets images and video, not audio or spoken language.
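To make the service mapping in option A concrete, here is a minimal sketch of how an app might prepare requests for managed speech services through boto3. It is an illustration under assumptions, not part of the exam material: Amazon Polly (text-to-speech) is not mentioned in the explanation above and is included only because the scenario calls for voice responses, and the helper names, job name, S3 URI, and voice are all hypothetical.

```python
from typing import Any, Dict


def build_transcription_request(job_name: str, media_uri: str,
                                language_code: str = "en-US",
                                media_format: str = "wav") -> Dict[str, Any]:
    """Keyword arguments for Amazon Transcribe's start_transcription_job."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": media_format,
        "LanguageCode": language_code,
    }


def build_speech_request(text: str, voice_id: str = "Joanna") -> Dict[str, Any]:
    """Keyword arguments for Amazon Polly's synthesize_speech."""
    return {"Text": text, "OutputFormat": "mp3", "VoiceId": voice_id}


def synthesize_reply(polly_client: Any, text: str) -> bytes:
    """Call Polly through the supplied client and return the audio bytes."""
    response = polly_client.synthesize_speech(**build_speech_request(text))
    return response["AudioStream"].read()


# Hypothetical usage; requires boto3, AWS credentials, and an existing S3 object:
#   import boto3
#   transcribe = boto3.client("transcribe")
#   transcribe.start_transcription_job(
#       **build_transcription_request("demo-job", "s3://example-bucket/question.wav"))
#   audio = synthesize_reply(boto3.client("polly"), "Here is your answer.")
```

The request builders are kept separate from the network calls so the parameters can be inspected without AWS access. A batch transcription job like this one is asynchronous, so a live voice app would more likely use a streaming approach.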