
11785/11685/11485 Quiz-14

Multiple-choice question

Consider a single-headed attention layer. What happens to the dimensions of the value weight matrix Wv when we double the maximum input sequence length? Select all that apply.

Options
A. None of the above
B. Half the number of columns
C. Half the number of rows
D. Double the number of rows
E. Double the number of columns

Answer
A. None of the above
Analysis
Start by recalling the role of the value weight matrix Wv in a single-headed attention layer: Wv is the projection that maps the input feature dimension (commonly denoted d_model) to the value space (often denoted d_v), so its shape is d_model × d_v. Crucially, this matrix is defined by the feature dimensions, not by the sequence length. Doubling the maximum input sequence length only adds rows to the input matrix X (one row per token), and hence to the value matrix V = X·Wv; the shape of Wv itself is unchanged. Since neither its rows nor its columns are halved or doubled, "None of the above" is the correct choice.
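To make this concrete, here is a minimal sketch of the value projection (assuming NumPy; the values d_model = 512 and d_v = 64 are illustrative and not specified in the quiz) showing that Wv keeps the same shape when the sequence length doubles:

```python
import numpy as np

d_model, d_v = 512, 64   # feature dimensions (illustrative values, not from the quiz)

# Wv projects each token's d_model-dim embedding to a d_v-dim value vector,
# so its shape depends only on the feature dimensions.
rng = np.random.default_rng(0)
Wv = rng.standard_normal((d_model, d_v))

for seq_len in (128, 256):                        # doubling the input sequence length
    X = rng.standard_normal((seq_len, d_model))   # token embeddings for one sequence
    V = X @ Wv                                    # value matrix: (seq_len, d_v)
    print(f"seq_len={seq_len}  Wv.shape={Wv.shape}  V.shape={V.shape}")

# Wv.shape stays (512, 64) for both sequence lengths; only the number of rows
# in X and V (the number of tokens) changes.
```

Running the sketch prints the same Wv.shape for both sequence lengths; only X and V gain rows, which is exactly why none of the listed options applies.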
