Question

CS-7643-O01, OAN, OSZ Quiz #4: Module 3

Fill-in-the-blank question (multiple blanks)

As defined in Attention Is All You Need, what is the size of the cross-attention matrix between the encoder and decoder for the following English-to-Spanish translation?

I am very handsome -> Soy muy guapo

Please assume the following:
- d_k = d_q = 64
- d_v = 32
- Ignore the <SOS> and <EOS> tokens.
- Cross-attention means Attention(Q, K, V).

NOTE: Please round to the nearest integer.

[Fill in the blank] rows, [Fill in the blank] columns

Answer
3 rows, 32 columns
Analysis
To determine the size of the cross-attention matrix, recall how cross-attention is defined in the Transformer architecture: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, where Q comes from the decoder (the target sequence being generated) and K and V come from the encoder (the source sequence). Ignoring <SOS> and <EOS>, the Spanish target "Soy muy guapo" has 3 tokens, so Q is 3 x d_q; the English source "I am very handsome" has 4 tokens, so K is 4 x d_k and V is 4 x d_v. The score matrix QK^T is therefore 3 x 4, and multiplying the softmaxed scores by V yields an output of size 3 x d_v = 3 x 32. The cross-attention matrix thus has 3 rows and 32 columns; note that d_k = d_q = 64 only sets the inner dimension of QK^T and drops out of the final shape.
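To make the shape bookkeeping concrete, here is a minimal NumPy sketch of scaled dot-product attention applied to this example. The token counts and dimensions come from the question; the random matrices and the function name are illustrative assumptions, since only the shapes matter here.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (n_q, n_k) score matrix
    # Numerically stable row-wise softmax over the keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # (n_q, d_v) output

n_dec, n_enc = 3, 4   # "Soy muy guapo" (3 tokens), "I am very handsome" (4 tokens)
d_k, d_v = 64, 32     # as given in the question (d_q = d_k)

rng = np.random.default_rng(0)
Q = rng.standard_normal((n_dec, d_k))  # queries come from the decoder
K = rng.standard_normal((n_enc, d_k))  # keys come from the encoder
V = rng.standard_normal((n_enc, d_v))  # values come from the encoder

out = attention(Q, K, V)
print(out.shape)  # (3, 32) -> 3 rows, 32 columns
```

Running the sketch prints (3, 32), matching the derivation above: the decoder length fixes the row count and d_v fixes the column count.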
