Question
Multiple-select question
What operations are part of a standard Transformer block? (Select all that apply.)
Options
A. Layer normalization
B. Convolutional layer
C. Residual connection
D. Self-attention layer
Analysis
The question asks which operations are part of a standard Transformer block and specifies that it is a select-all-that-apply type.
Option 1: Layer normalization. In a typical Transformer block, after the multi-head self-attention and feed-forward sublayers, the outputs are usually passed through la...
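To make the structure concrete, here is a minimal pure-Python sketch of one Transformer block, assuming the post-norm arrangement (each sublayer's output is added to its input and then layer-normalized). It is an illustration, not a reference implementation: all function names, the single attention head, the omission of learned LayerNorm scale/shift and biases, and the toy weights are assumptions made for brevity. Note that the sketch is built only from self-attention, a position-wise feed-forward network, residual additions, and layer normalization; no convolution appears anywhere.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one token vector to zero mean and (near) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention (assumed simplification)."""
    q = [matvec(wq, t) for t in tokens]
    k = [matvec(wk, t) for t in tokens]
    v = [matvec(wv, t) for t in tokens]
    scale = math.sqrt(len(k[0]))
    out = []
    for qi in q:
        weights = softmax([sum(a * b for a, b in zip(qi, kj)) / scale for kj in k])
        out.append([sum(w * vj[d] for w, vj in zip(weights, v))
                    for d in range(len(v[0]))])
    return out

def feed_forward(x, w1, w2):
    """Position-wise two-layer MLP with ReLU, applied to one token."""
    hidden = [max(0.0, h) for h in matvec(w1, x)]
    return matvec(w2, hidden)

def transformer_block(tokens, wq, wk, wv, w1, w2):
    """Post-norm block: LayerNorm(x + Sublayer(x)) for each of the two sublayers."""
    attn = self_attention(tokens, wq, wk, wv)
    # Residual connection around attention, then layer normalization.
    h = [layer_norm([a + b for a, b in zip(t, o)]) for t, o in zip(tokens, attn)]
    ff = [feed_forward(x, w1, w2) for x in h]
    # Residual connection around the feed-forward sublayer, then layer normalization.
    return [layer_norm([a + b for a, b in zip(x, f)]) for x, f in zip(h, ff)]

# Toy inputs (hypothetical values chosen only to exercise the code).
d_model, d_hidden = 4, 8
eye = [[1.0 if i == j else 0.0 for j in range(d_model)] for i in range(d_model)]
w1 = [[0.1 * ((i + j) % 3) for j in range(d_model)] for i in range(d_hidden)]
w2 = [[0.1 * ((i * j) % 2) for j in range(d_hidden)] for i in range(d_model)]
tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0], [0.0, 0.0, 3.0, 1.0]]

out = transformer_block(tokens, eye, eye, eye, w1, w2)
for row in out:
    print([round(v, 3) for v in row])
```

The output keeps the input shape (three tokens of width four), which is what lets blocks be stacked. Many recent models instead apply layer normalization before each sublayer (pre-norm); the set of operations is the same, only their order differs.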