Question
BU.330.775.T2.FA25 Final- Requires Respondus LockDown Browser
Single-choice question
Which statement is correct?
Options
A. Stochastic Gradient Descent (SGD) computes the gradients using the whole training set to update the model parameters once.
B. Batch Gradient Descent (BGD) computes the gradients using one data point to update the model parameters once.
C. Mini-batch Gradient Descent has the most bouncing behavior compared to SGD and BGD.
D. 10 training epochs mean each data point has the opportunity to update the model parameters 10 times.
Analysis
Question restatement: Which statement is correct?
Option A: 'Stochastic Gradient Descent (SGD) computes the gradients using the whole training set to update the model parameters once.' This is inaccurate. SGD updates the parameters using one sample at a time (or, in looser usage, a few samples), not the entire training set. Computing the gradient over the whole training set for a single update is what characterizes Batch Gradient Descent, so the description matches BGD rather than SGD ...
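The distinction drawn above can be illustrated with a minimal sketch. It is not taken from the course material: the function `train_linear`, the toy data, and the learning rate are all hypothetical choices made for illustration. The only thing that changes between the three variants is the batch size, which determines how many parameter updates happen per epoch, while every data point is still visited exactly once per epoch.

```python
import random

def train_linear(xs, ys, batch_size, epochs, lr=0.01, seed=0):
    """Fit y ~ w * x by gradient descent on mean squared error.

    batch_size == len(xs) -> batch gradient descent (one update per epoch)
    batch_size == 1       -> stochastic gradient descent (one update per sample)
    anything in between   -> mini-batch gradient descent
    Returns the fitted weight and the total number of parameter updates.
    """
    rng = random.Random(seed)
    w = 0.0
    updates = 0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # each epoch visits every data point exactly once
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of mean((w*x - y)^2) over the batch with respect to w.
            grad = sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= lr * grad
            updates += 1
    return w, updates

# Hypothetical noiseless toy data with true weight 2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]

for name, bs in [("BGD", len(xs)), ("mini-batch", 4), ("SGD", 1)]:
    w, updates = train_linear(xs, ys, batch_size=bs, epochs=10)
    print(name, "updates:", updates, "w:", round(w, 3))
# With 8 samples and 10 epochs the update counts are 10 (BGD),
# 20 (mini-batch of 4), and 80 (SGD).
```

In all three runs each sample contributes to exactly 10 updates, one per epoch; what differs is whether it shares that update with the whole set, with a few other samples, or with none.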
Similar questions
Which of the following statements about gradient descent and learning rate is true?
Suppose that you are training a network with parameters [4.5, 2.5, 1.2, 0.6], a learning rate of 0.2, and a gradient of [-1, 9, 2, 5]. After one update step of gradient descent, what would your network's parameters be equal to?
How do we update the parameters in gradient descent?
Which of the following best describes the role of the gradient in gradient descent?
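The update rule that the questions above ask about is the standard one: subtract the learning rate times the gradient from each parameter. A minimal sketch, using the numbers from the network question above, follows; the helper name `gradient_descent_step` is hypothetical and introduced only for illustration.

```python
def gradient_descent_step(params, grads, lr):
    """Return new parameters after one step: p_new = p - lr * g."""
    return [p - lr * g for p, g in zip(params, grads)]

params = [4.5, 2.5, 1.2, 0.6]
grads = [-1, 9, 2, 5]

new_params = gradient_descent_step(params, grads, lr=0.2)
# Round to hide floating-point noise in the printed values.
print([round(p, 2) for p in new_params])  # [4.7, 0.7, 0.8, -0.4]
```

A negative gradient component (here the first one) increases the corresponding parameter, because the step always moves against the gradient.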