MATLAB演示梯度上升寻找极值
梯度
梯度的本意是一个向量(矢量),表示某一函数在该点处的方向导数沿着该方向取得最大值,即函数在该点处沿着该方向(此梯度的方向)变化最快,变化率最大(为该梯度的模)。对于二元函数
f
(
x
,
y
)
f(x,y)
f(x,y)的梯度计算公式为:
gradf
(
x
,
y
)
=
∇
f
(
x
,
y
)
=
{
∂
f
∂
x
,
∂
f
∂
y
}
=
f
x
(
x
,
y
)
i
ˉ
+
f
y
(
x
,
y
)
j
ˉ
\operatorname{gradf}(\mathrm{x}, \mathrm{y})=\nabla f(x, y)=\left\{\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right\}=f_{x}(x, y) \bar{i}+f_{y}(x, y) \bar{j}
gradf(x,y)=∇f(x,y)={∂x∂f,∂y∂f}=fx(x,y)iˉ+fy(x,y)jˉ
利用梯度求解以下函数的最大值,同时用MATLAB画出迭代过程。
z
=
f
(
x
,
y
)
=
(
x
2
−
2
x
)
e
−
x
2
−
y
2
−
x
y
z=f(x, y)=\left(x^{2}-2 x\right) \mathrm{e}^{-x^{2}-y^{2}-x y}
z=f(x,y)=(x2−2x)e−x2−y2−xy
利用梯度的话,我们需要先求偏导。可以手算,也可以用MATLAB的符号推导功能。
z
=
(
x
2
−
2
x
)
⋅
e
−
x
2
−
y
2
−
x
y
∂
z
∂
x
=
−
2
(
1
+
x
3
+
(
y
2
−
2
)
x
2
+
(
−
y
−
1
)
x
)
e
−
x
2
−
x
y
−
y
2
∂
z
∂
y
=
−
x
(
x
+
2
y
)
(
x
−
2
)
e
−
x
2
−
x
y
−
y
2
\begin{align} z &= (x^{2}-2x)\cdot{\mathrm e}^{-x^{2}-y^{2}-xy} \\ \frac{\partial z}{\partial x} &= -2 (1+x^{3}+(\frac{y}{2}-2) x^{2}+(-y-1) x) {\mathrm e}^{-x^{2}-x y-y^{2}} \\ \frac{\partial z}{\partial y} &= -x(x+2 y)(x-2) {\mathrm e}^{-x^{2}-x y-y^{2}} \end{align}
z∂x∂z∂y∂z=(x2−2x)⋅e−x2−y2−xy=−2(1+x3+(2y−2)x2+(−y−1)x)e−x2−xy−y2=−x(x+2y)(x−2)e−x2−xy−y2
syms x y
z = (x^2-2*x)*exp(-x^2-y^2-x*y);
zx = simplify(diff(z,x));
zy = diff(z,y);
朝着梯度的方向函数值的上升最快。先设定一个初始值
x
0
=
0
,
y
0
=
0
x_0 = 0,y_0 = 0
x0=0,y0=0和步长
a
=
0.01
a=0.01
a=0.01然后按照梯度的方向开始迭代上升。
x
n
=
x
n
−
1
+
a
∂
f
∂
x
n
−
1
y
n
=
y
n
−
1
+
a
∂
f
∂
y
n
−
1
\begin{align} x_n = x_{n-1} + a\frac{\partial f}{\partial x_{n-1}} \\ y_n = y_{n-1} + a\frac{\partial f}{\partial y_{n-1}} \end{align}
xn=xn−1+a∂xn−1∂fyn=yn−1+a∂yn−1∂f
MATLAB代码
f = @(x,y) (x.^2 - 2.*x).*exp(-x.^2-y.^2-x.*y); % 目标函数
fx = @(x,y) -2.*(1+x.^3+(0.5.*y-2).*x.^2-(y+1).*x).*exp(-x.^2-x.*y-y.^2); % x偏导
fy = @(x,y) -x.*(x+2.*y).*(x-2).*exp(-x.^2-x.*y-y.^2); % y偏导
[x0,y0] = meshgrid(-3:.2:3,-3:.2:3);
z0 = f(x0,y0);
figure
surf(x0,y0,z0)
view(30,30) % 调整视角
hold on
[x,y,a] = deal(0,0,0.01);
z = f(x,y);
[tx,ty,tz] = deal(x,y,z+0.03); % 纪录过程
h = plot3(tx,ty,tz,'r*-');
set(gcf,'Position',[200 200 1000 800])
for i = 1:100
x = x + a*fx(x,y); % x变量梯度上升
y = y + a*fy(x,y); % y变量梯度上升
z = f(x,y);
if mod(i,5) == 0
pause(1);
[tx,ty,tz] = deal([tx x],[ty y],[tz z+0.03]);
h.XData = tx;
h.YData = ty;
h.ZData = tz;
end
end