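This section covers a PyTorch implementation of DNN classification, starting with binary classification.
The workflow is the same as in the previous sections.
Code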
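1. Data import
We use the widely used iris dataset.
import pandas as pd

data = pd.read_csv('./iris.csv')
data.columns = ["f1", "f2", "f3", "f4", "label"]
data = data.head(99)  # keep only the first 99 rows
data
Iris is a three-class dataset, so we take only the first 99 rows, which leaves just two classes. The later steps assume the label column is numeric, with values 0 and 1.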
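Taking the first 99 rows only yields two classes if the file is sorted by class. A minimal sketch of a more explicit alternative is to filter on the label column. The small DataFrame and the integer labels 0, 1, 2 below are assumptions made for illustration, not the actual contents of iris.csv.

```python
import pandas as pd

# Hypothetical stand-in for the iris table: three classes encoded as 0, 1, 2
toy = pd.DataFrame({
    "f1": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "f2": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "f3": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    "f4": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
    "label": [0, 0, 1, 1, 2, 2],
})

# Keep only the rows whose label is one of the two classes we want
binary = toy[toy["label"].isin([0, 1])].reset_index(drop=True)
print(binary["label"].value_counts().to_dict())
```

Filtering by value does not depend on row order, so it still works if the file has been shuffled.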
2. Data split
from sklearn.model_selection import train_test_split

# 70% of the rows go to training, the rest to testing
train, test = train_test_split(data, train_size=0.7)
feature_cols = [c for c in data.columns if c != 'label']
train_x = train[feature_cols].values
test_x = test[feature_cols].values
# reshape the labels to (n, 1) so they match the network's output shape
train_y = train.label.values.reshape(-1, 1)
test_y = test.label.values.reshape(-1, 1)
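The split above is random on every run and does not control the class balance. As a hedged sketch, `train_test_split` also accepts `stratify` and `random_state`; the synthetic array below (99 rows, labels 0 and 1) is an assumption standing in for the real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 99 rows, 4 features, 49 rows of class 0 and 50 of class 1
rng = np.random.default_rng(0)
X = rng.normal(size=(99, 4))
y = np.array([0] * 49 + [1] * 50)

# stratify=y keeps the class ratio similar in both parts;
# random_state fixes the shuffle so the split is repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, stratify=y, random_state=0
)
print(X_train.shape, X_test.shape)
print(np.bincount(y_train), np.bincount(y_test))
```

With 99 rows and `train_size=0.7` this gives 69 training rows and 30 test rows, the same sizes as in the text.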
3. To Tensor
import torch

# convert the NumPy arrays to float32 tensors
train_x = torch.from_numpy(train_x).type(torch.FloatTensor)
test_x = torch.from_numpy(test_x).type(torch.FloatTensor)
train_y = torch.from_numpy(train_y).type(torch.FloatTensor)
test_y = torch.from_numpy(test_y).type(torch.FloatTensor)
train_x.shape, train_y.shape
# (torch.Size([69, 4]), torch.Size([69, 1]))
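The conversion matters because `nn.Linear` parameters are float32 by default, while NumPy arrays are usually float64. The sketch below, on a made-up one-row array, shows the dtypes involved and that `torch.from_numpy` shares memory with the array until a dtype conversion makes a copy.

```python
import numpy as np
import torch

arr = np.array([[5.1, 3.5, 1.4, 0.2]])  # NumPy defaults to float64

shared = torch.from_numpy(arr)                   # float64 tensor sharing memory with arr
as_f32 = torch.from_numpy(arr).float()           # converted copy in float32
direct = torch.tensor(arr, dtype=torch.float32)  # torch.tensor always copies

arr[0, 0] = 0.0
print(shared[0, 0].item())  # follows the change made to arr
print(as_f32[0, 0].item())  # unaffected: the conversion made a copy
print(shared.dtype, as_f32.dtype, direct.dtype)
```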
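4. Data restructuring
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

batch = 16  # assumed value; the original does not show where batch is defined
train_ds = TensorDataset(train_x, train_y)
train_dl = DataLoader(train_ds, batch_size=batch, shuffle=True)
test_ds = TensorDataset(test_x, test_y)
# no gradients are stored during evaluation, so a larger batch fits
test_dl = DataLoader(test_ds, batch_size=batch * 2)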
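To see what the DataLoader produces, the sketch below iterates over one with the training-set shape and records the batch sizes. The batch size of 16 and the all-zero tensors are assumptions for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

batch = 16  # assumed batch size, not taken from the original
xs = torch.zeros(69, 4)  # dummy features with the training-set shape
ys = torch.zeros(69, 1)  # dummy labels

dl = DataLoader(TensorDataset(xs, ys), batch_size=batch, shuffle=True)
sizes = [xb.shape[0] for xb, _ in dl]
print(len(dl), sizes)
```

Since 69 is not a multiple of 16, the last batch holds only 5 rows. Passing `drop_last=True` would discard that partial batch instead.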
5. Network definition
from torch import nn
import torch.nn.functional as F

class DNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden1 = nn.Linear(4, 64)   # 4 input features
        self.hidden2 = nn.Linear(64, 64)
        self.hidden3 = nn.Linear(64, 1)   # single output for binary classification

    def forward(self, input):
        x = F.relu(self.hidden1(input))
        x = F.relu(self.hidden2(x))
        x = torch.sigmoid(self.hidden3(x))  # probability in (0, 1)
        return x

# accuracy function for binary classification
def accuracy(out, yb):
    preds = (out > 0.5).type(torch.IntTensor)  # threshold the probability at 0.5
    return (preds == yb).float().mean()
lr = 0.001  # assumed value; the original does not show where lr is defined

def get_model():
    model = DNN()
    return model, torch.optim.Adam(model.parameters(), lr=lr)

loss_fn = nn.BCELoss()  # binary cross-entropy on probabilities
model, opt = get_model()
model  # inspect the network structure
DNN(
  (hidden1): Linear(in_features=4, out_features=64, bias=True)
  (hidden2): Linear(in_features=64, out_features=64, bias=True)
  (hidden3): Linear(in_features=64, out_features=1, bias=True)
)
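The network above applies a sigmoid and pairs it with `nn.BCELoss`. A common alternative, shown here only as a sketch, is to return raw logits and use `nn.BCEWithLogitsLoss`, which folds the sigmoid into the loss and is numerically more stable. The random logits and targets below are stand-ins, not outputs of the model in the text.

```python
import torch
from torch import nn

torch.manual_seed(0)
logits = torch.randn(8, 1)                     # stand-in for the last Linear layer's raw output
targets = torch.randint(0, 2, (8, 1)).float()  # stand-in 0/1 labels

loss_probs = nn.BCELoss()(torch.sigmoid(logits), targets)  # sigmoid first, as in the text
loss_logits = nn.BCEWithLogitsLoss()(logits, targets)      # sigmoid folded into the loss

print(torch.allclose(loss_probs, loss_logits, atol=1e-6))
```

If the model returns logits, the accuracy threshold becomes `out > 0` rather than `out > 0.5`.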
We can also visualize the network using the approach from the previous lesson.
6. Training
import numpy as np

epochs = 100  # the log below runs from epoch 0 to epoch 100
train_loss = []
train_acc = []
test_loss = []
test_acc = []
for epoch in range(epochs + 1):
    model.train()
    for xb, yb in train_dl:
        pred = model(xb)
        loss = loss_fn(pred, yb)
        loss.backward()
        opt.step()
        opt.zero_grad()
    if epoch % 1 == 0:  # evaluate every epoch
        model.eval()
        with torch.no_grad():
            train_epoch_loss = sum(loss_fn(model(xb), yb) for xb, yb in train_dl)
            test_epoch_loss = sum(loss_fn(model(xb), yb) for xb, yb in test_dl)
            acc_mean_train = np.mean([accuracy(model(xb), yb).item() for xb, yb in train_dl])
            acc_mean_val = np.mean([accuracy(model(xb), yb).item() for xb, yb in test_dl])
        # average the summed batch losses over the number of batches in each loader
        train_loss.append(train_epoch_loss.item() / len(train_dl))
        test_loss.append(test_epoch_loss.item() / len(test_dl))
        train_acc.append(acc_mean_train)
        test_acc.append(acc_mean_val)
        template = ("epoch:{:2d}, train loss:{:.5f}, train accuracy:{:.1f}, validation loss:{:.5f}, validation accuracy:{:.1f}")
        print(template.format(epoch, train_loss[-1], acc_mean_train * 100, test_loss[-1], acc_mean_val * 100))
print('Training finished')
The log below is from the original run, which divided the summed training loss by len(test_dl) instead of len(train_dl), so its training-loss values are larger than a per-batch average.
epoch: 0, train loss:3.09122, train accuracy:57.0, validation loss:0.68206, validation accuracy:36.7
epoch: 1, train loss:2.87476, train accuracy:54.3, validation loss:0.69797, validation accuracy:36.7
epoch: 2, train loss:2.62978, train accuracy:61.0, validation loss:0.59363, validation accuracy:36.7
epoch: 3, train loss:2.30378, train accuracy:100.0, validation loss:0.50508, validation accuracy:100.0
epoch: 4, train loss:2.05582, train accuracy:100.0, validation loss:0.44803, validation accuracy:100.0
epoch: 5, train loss:1.76421, train accuracy:100.0, validation loss:0.38924, validation accuracy:100.0
epoch: 6, train loss:1.54745, train accuracy:100.0, validation loss:0.32642, validation accuracy:100.0
…
epoch:98, train loss:0.00304, train accuracy:100.0, validation loss:0.00067, validation accuracy:100.0
epoch:99, train loss:0.00311, train accuracy:100.0, validation loss:0.00067, validation accuracy:100.0
epoch:100, train loss:0.00300, train accuracy:100.0, validation loss:0.00068, validation accuracy:100.0
Training finished
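The loop above averages per-batch means, which gives a small final batch the same weight as a full one. A minimal sketch of a sample-weighted alternative follows; the helper name `evaluate`, the one-layer stand-in model, and the random data are all assumptions made for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, dl, loss_fn):
    """Return (mean loss per sample, accuracy) over a whole DataLoader."""
    model.eval()
    total_loss, correct, count = 0.0, 0, 0
    with torch.no_grad():
        for xb, yb in dl:
            out = model(xb)
            n = xb.shape[0]
            total_loss += loss_fn(out, yb).item() * n  # undo the per-batch mean
            correct += ((out > 0.5).float() == yb).sum().item()
            count += n
    return total_loss / count, correct / count

# Stand-in model and random data, only to exercise the helper
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())
xs = torch.randn(69, 4)
ys = torch.randint(0, 2, (69, 1)).float()
toy_dl = DataLoader(TensorDataset(xs, ys), batch_size=16)

mean_loss, acc = evaluate(toy_model, toy_dl, nn.BCELoss())
with torch.no_grad():
    full_loss = nn.BCELoss()(toy_model(xs), ys).item()  # loss on all rows at once
print(abs(mean_loss - full_loss) < 1e-5, acc)
```

Weighting each batch by its size makes the result match the loss computed on the full dataset in one pass, regardless of how the rows are batched.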
7. Viewing the results
import matplotlib.pyplot as plt

# loss curves
plt.plot(range(len(train_loss)), train_loss, label='train_loss')
plt.plot(range(len(test_loss)), test_loss, label='test_loss')
plt.legend()
plt.show()
# accuracy curves
plt.plot(range(len(train_acc)), train_acc, label='train_acc')
plt.plot(range(len(test_acc)), test_acc, label='test_acc')
plt.legend()
plt.show()
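The two plots can also be drawn side by side in one figure. The sketch below wraps this in a hypothetical helper, `plot_history`; the numbers passed to it are placeholders to exercise the function, not results from the run above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line inside a notebook
import matplotlib.pyplot as plt

def plot_history(train_loss, test_loss, train_acc, test_acc):
    """Draw loss and accuracy curves side by side and return the figure."""
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
    ax_loss.plot(train_loss, label="train_loss")
    ax_loss.plot(test_loss, label="test_loss")
    ax_loss.set_xlabel("epoch")
    ax_loss.legend()
    ax_acc.plot(train_acc, label="train_acc")
    ax_acc.plot(test_acc, label="test_acc")
    ax_acc.set_xlabel("epoch")
    ax_acc.legend()
    fig.tight_layout()
    return fig

# Placeholder numbers purely to exercise the function
fig = plot_history([0.9, 0.5, 0.2], [0.8, 0.6, 0.3], [0.5, 0.8, 1.0], [0.4, 0.7, 1.0])
```

Calling `fig.savefig(...)` with a file name of your choice writes the figure to disk.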