Question
25S-STATS-102B-LEC-3 S25 Midterm Exam- Requires Respondus LockDown Browser
Multiple-choice question
Consider a bivariate quadratic function g(w) = w^T C w, where w = [w1, w2]^T and C = [[0.01, 0], [0, 12]]. Using RMSprop, provide the vector w_k given that w_{k-1} = [1, 1]^T. Assume the step-length parameter is 0.1, the tuning parameter is 0.9, and that all exponential averages used in RMSprop at step k−2 are 0.
Options
A. w_k = [0.999, 0.76]^T
B. None of these options are correct.
C. w_k = [0.998, −1.4]^T
D. w_k = [0.994, −6.590]^T
E. w_k = [0.610, −1.915]^T
Solution outline
We start by restating the problem setup and the quantities we will use to compute the RMSprop update.
- The bi-variate quadratic function is g(w) = w^T C w with w = [w1, w2]^T and C = [[0.01, 0], [0, 12]]. Since C is symmetric, the gradient is ∇g(w) = (C + C^T) w = 2 C w.
- With w_{k-1} = [1, 1]^T, we compute the gradient at k-1: ∇g(w_{k-1}) = 2 C w_{k-1} = 2 * [[0.01, 0], [0, 12]] * [1, 1]^T = [0.02, 24]^T.
- The RMSprop update maintains an exponential moving average of squared gradients, v_t = β v_{t-1} + (1 − β) g_t^2 (elementwise), and updates w_t = w_{t-1} − α g_t / sqrt(v_t). (The small ε usually added to the denominator for numerical stability is omitted here, following the formula used in this course.)
- Given that at k−2 the exponential averages are 0, we first compute v_{k-1} using g_{k-1} = ∇g(w_{k-1}) = [0.02, 24]. With β = 0.9 and α = 0.1:
  v_{k-1} = 0.9 * 0 + 0.1 * (g_{k-1} ∘ g_{k-1}) = 0.1 * [0.0004, 576]^T = [0.00004, 57.6]^T.
- The update is then w_k = w_{k-1} − α g_{k-1} / sqrt(v_{k-1}) (elementwise):
  w_k = [1, 1]^T − 0.1 * [0.02 / sqrt(0.00004), 24 / sqrt(57.6)]^T = [1, 1]^T − 0.1 * [3.162, 3.162]^T ≈ [0.684, 0.684]^T.
- Notice that with the averages initialized to 0, each component of g / sqrt(v) reduces to sign(g) / sqrt(1 − β) ≈ 3.162, so both coordinates move by the same amount (≈ 0.316) regardless of the gradient's scale. Since [0.684, 0.684]^T matches none of A, C, D, or E, this computation points to option B.
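As a sanity check, the derivation above can be reproduced in a few lines of Python. This is a minimal sketch of a single RMSprop step under the formula stated in this solution (no ε term in the denominator); the variable names are illustrative, not from the original exam.

```python
import math

# Problem setup: g(w) = w^T C w with C = diag(0.01, 12)
C_diag = [0.01, 12.0]
w = [1.0, 1.0]           # w_{k-1}
alpha, beta = 0.1, 0.9   # step length and tuning (decay) parameter

# Gradient of w^T C w is 2 C w (C is symmetric)
grad = [2 * c * wi for c, wi in zip(C_diag, w)]   # [0.02, 24.0]

# Exponential average of squared gradients, starting from v = 0 at k-2
v = [beta * 0.0 + (1 - beta) * g ** 2 for g in grad]

# RMSprop step: w_k = w_{k-1} - alpha * g / sqrt(v), elementwise
w_next = [wi - alpha * g / math.sqrt(vi) for wi, g, vi in zip(w, grad, v)]
print([round(x, 3) for x in w_next])  # → [0.684, 0.684]
```

Both components land at the same value because, with v initialized to 0, the step size alpha / sqrt(1 − beta) is independent of the gradient magnitude.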
Similar questions
Which of the following statements about gradient descent and learning rate is true?
Which statement is correct?
Suppose that you are training a network with parameters [4.5, 2.5, 1.2, 0.6], a learning rate of 0.2, and a gradient of [-1, 9, 2, 5]. After one update step of gradient descent, what would your network's parameters be equal to?
How do we update the parameters in gradient descent?
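For plain gradient descent, the update rule is w ← w − η ∇g. A minimal sketch, using the example numbers from the question above (not an official answer key):

```python
params = [4.5, 2.5, 1.2, 0.6]
lr = 0.2
grad = [-1, 9, 2, 5]

# One gradient-descent step: theta_new = theta - lr * gradient, elementwise
updated = [p - lr * g for p, g in zip(params, grad)]
print([round(x, 1) for x in updated])  # → [4.7, 0.7, 0.8, -0.4]
```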