Questions

CS-7643-O01, OAN, OSZ Quiz #4: Module 3

Single choice

Which of the following attention models uses a subset of the input to derive the output and cannot be trained directly with gradient methods?

Options
A. Hard attention
B. Soft attention

Step-by-Step Analysis
Each option implies a different way of attending to the input. Option A, 'Hard attention', means the model selects a discrete subset of the input to focus on. In practice, this selection is non-differentiable, which means gradients cannot flow through the sampling decision in a...
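
The contrast between the two options can be sketched numerically. The following is a minimal illustration, not taken from the course material: the function names (`soft_attention`, `hard_attention`, `finite_difference`) and the toy scores and values are assumptions made for this example. Soft attention averages over all inputs, so its output moves smoothly when a score changes. Hard attention picks a single input, so a small change in a score usually leaves the output unchanged, and the finite-difference gradient through the selection is zero.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(scores, values):
    # Weighted average over ALL values: smooth in the scores.
    weights = softmax(scores)
    return sum(w * v for w, v in zip(weights, values))

def hard_attention(scores, values, u):
    # Select ONE value by inverse-CDF sampling; u is a uniform draw in [0, 1),
    # passed in explicitly (an assumption of this sketch) so the choice is reproducible.
    cumulative = 0.0
    for p, v in zip(softmax(scores), values):
        cumulative += p
        if u < cumulative:
            return v
    return values[-1]

def finite_difference(fn, scores, index, eps=1e-4):
    # Approximate d fn / d scores[index] by bumping one score.
    bumped = list(scores)
    bumped[index] += eps
    return (fn(bumped) - fn(scores)) / eps

# Toy inputs, chosen only for illustration.
scores = [0.5, 1.5, -0.3]
values = [1.0, 2.0, 3.0]
u = 0.4

soft_grad = finite_difference(lambda s: soft_attention(s, values), scores, 0)
hard_grad = finite_difference(lambda s: hard_attention(s, values, u), scores, 0)

print("soft attention gradient:", soft_grad)  # non-zero
print("hard attention gradient:", hard_grad)  # zero: the selected index did not change
```

Because the gradient through the discrete choice carries no useful signal, hard attention is usually trained with a score-function estimator such as REINFORCE rather than plain backpropagation.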