Adjusting the Learning Rate
1. The lr_scheduler mechanism
Sometimes we need a mechanism for adjusting the learning rate during training, and the torch.optim.lr_scheduler classes exist for exactly this purpose: the torch.optim.lr_scheduler module provides several methods for adjusting the learning rate according to the number of epochs trained. In most cases the learning rate is scheduled to decrease gradually as the epoch count grows, which tends to give better training results.
torch.optim.lr_scheduler.StepLR
class torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma=0.1, last_epoch=-1)
- optimizer (Optimizer): the optimizer whose learning rate will be adjusted;
- step_size (int): the learning rate is updated once every step_size training epochs;
- gamma (float): multiplicative factor applied to the learning rate at each update;
- last_epoch (int): index of the last epoch. When resuming training that was interrupted after many epochs, this corresponds to the epoch of the loaded model. The default of -1 means training starts from scratch, i.e. from epoch 1 (see the resume sketch below).
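When resuming, the usual practice is to save and restore the scheduler's state rather than setting last_epoch by hand. A minimal sketch, assuming you checkpoint both the optimizer and the scheduler (the file name and dict keys here are placeholders, not from the original post):

import torch
import torch.nn as nn
from torch.optim.lr_scheduler import StepLR

model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
scheduler = StepLR(optimizer, step_size=3, gamma=0.1)

# ... train for some epochs, then checkpoint both state dicts ...
torch.save({"optimizer": optimizer.state_dict(),
            "scheduler": scheduler.state_dict()}, "checkpoint.pth")

# ... later, rebuild the objects and restore their state to resume ...
checkpoint = torch.load("checkpoint.pth")
optimizer.load_state_dict(checkpoint["optimizer"])
scheduler.load_state_dict(checkpoint["scheduler"])  # restores last_epoch as well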
Every step_size epochs, the learning rate is updated once:

new\_lr = initial\_lr \times \gamma^{epoch // step\_size}

where new_lr is the resulting learning rate, initial_lr is the initial learning rate, step_size is the step_size parameter, γ is the gamma parameter, and // denotes integer division.
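To make the formula concrete, here is a quick check in plain Python using the same values as the example in the next section (initial_lr = 0.1, gamma = 0.1, step_size = 3):

initial_lr, gamma, step_size = 0.1, 0.1, 3
for epoch in range(10):
    new_lr = initial_lr * gamma ** (epoch // step_size)
    # epochs 0-2: 0.1; epochs 3-5: 0.01; epochs 6-8: 0.001; epoch 9: 0.0001
    # (up to floating-point rounding)
    print(epoch, new_lr)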
2. Example
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import StepLR

initial_lr = 0.1

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3)

    def forward(self, x):
        pass

net_1 = Model()
optimizer_1 = torch.optim.Adam(net_1.parameters(), lr=initial_lr)
# decay the learning rate by a factor of 0.1 every 3 epochs
scheduler_1 = StepLR(optimizer_1, step_size=3, gamma=0.1)

print("Initial learning rate:", optimizer_1.defaults['lr'])

for epoch in range(1, 11):
    # train
    optimizer_1.zero_grad()
    optimizer_1.step()
    print("Learning rate at epoch %d: %f" % (epoch, optimizer_1.param_groups[0]['lr']))
    scheduler_1.step()  # advance the schedule once per epoch
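With step_size=3 and gamma=0.1, the printout should show the learning rate holding at 0.1 for epochs 1-3, dropping to 0.01 for epochs 4-6, then 0.001 for epochs 7-9, and 0.0001 at epoch 10, in line with the formula above.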
3. scheduler.step() should normally be called after optimizer.step()
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Conv2d(3, 64, 3)
optimizer = optim.SGD(model.parameters(), lr=0.5)
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=2)

for i in range(5):
    optimizer.zero_grad()
    x = model(torch.randn(3, 3, 64, 64))
    loss = x.sum()
    loss.backward()
    print('{} optim: {}'.format(i, optimizer.param_groups[0]['lr']))
    optimizer.step()
    # get_last_lr() reports the lr most recently computed by the scheduler;
    # calling get_lr() directly is internal API and raises a warning
    print('{} scheduler: {}'.format(i, lr_scheduler.get_last_lr()[0]))
    lr_scheduler.step()
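Note that if the order is reversed and scheduler.step() runs before the first optimizer.step(), PyTorch (1.1.0 and later) emits a UserWarning and the first value of the learning-rate schedule is skipped.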
Example from the official documentation:
- scheduler.step() is placed after train()
- while optimizer.step() belongs inside train() (parameters are updated once per batch)
- confirming that the correct order for scheduler.step() is after optimizer.step()
>>> # Assuming optimizer uses lr = 0.05 for all groups
>>> # lr = 0.05 if epoch < 30
>>> # lr = 0.005 if 30 <= epoch < 60
>>> # lr = 0.0005 if 60 <= epoch < 90
>>> # ...
>>> scheduler = StepLR(optimizer, step_size=30, gamma=0.1)
>>> for epoch in range(100):
>>>     train(...)
>>>     validate(...)
>>>     scheduler.step()
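To spell out where each call lives, here is a self-contained sketch of that loop; the model, data, and loss function are stand-ins rather than part of the official example:

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR

model = nn.Linear(10, 1)
optimizer = optim.SGD(model.parameters(), lr=0.05)
scheduler = StepLR(optimizer, step_size=30, gamma=0.1)

def train(model, optimizer):
    # one epoch: optimizer.step() runs once per batch
    for _ in range(4):  # stand-in for iterating over a DataLoader
        x, y = torch.randn(8, 10), torch.randn(8, 1)
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()

for epoch in range(100):
    train(model, optimizer)
    scheduler.step()  # once per epoch, after all optimizer.step() calls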