MindSpore 25-Day Learning Camp, Day 16 | ShuffleNet Image Classification



ShuffleNet Image Classification

This example does not support running in static graph mode on GPU devices; all other modes are supported.

Introduction to ShuffleNet

ShuffleNetV1 is a computation-efficient CNN model proposed by Megvii. Like MobileNet and SqueezeNet, it mainly targets mobile deployment, so its design goal is to achieve the best possible accuracy within a limited computation budget. The core of ShuffleNetV1's design is two operations, Pointwise Group Convolution and Channel Shuffle, which greatly reduce the model's computational cost while preserving accuracy. In this respect ShuffleNetV1 is similar to MobileNet: both compress and accelerate the model by designing a more efficient network structure.

For more details about ShuffleNet, see the ShuffleNet paper.

As shown in the figure below, ShuffleNet reduces the parameter count to nearly the minimum while maintaining competitive accuracy; it therefore runs fast, and each parameter contributes an unusually large share of the model's accuracy.

shufflenet1

Image source: Bianco S, Cadene R, Celona L, et al. Benchmark analysis of representative deep neural network architectures[J]. IEEE Access, 2018, 6: 64270-64277.

Through the Pointwise Group Convolution and Channel Shuffle operations, ShuffleNet lowers the model's computational cost while maintaining accuracy.

Model Architecture

ShuffleNet's most distinctive feature is that it reorders channels across groups to overcome the drawback of group convolution. By modifying ResNet's bottleneck unit, it reaches high accuracy at a small computational cost.

Pointwise Group Convolution

The principle of Group Convolution is illustrated below. Compared with a standard convolution, each group's kernels have in_channels/g × k × k weights; with g groups and out_channels kernels in total, the layer has (in_channels/g × k × k) × out_channels parameters, i.e. 1/g of a standard convolution. In a grouped convolution each kernel processes only a subset of the input channels; this reduces the parameter count, while the number of output channels still equals the number of kernels.

shufflenet2

Image source: Huang G, Liu S, Van der Maaten L, et al. CondenseNet: An efficient DenseNet using learned group convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 2752-2761.
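The parameter saving can be checked with a quick back-of-the-envelope calculation. The channel counts below are illustrative, not taken from the paper:

```python
def conv_params(in_channels, out_channels, k, groups=1):
    # Each of the out_channels kernels sees only in_channels/groups input maps.
    return (in_channels // groups) * k * k * out_channels

# Illustrative sizes: 240 -> 240 channels, 3x3 kernels.
normal = conv_params(240, 240, 3)             # standard convolution: 518400
grouped = conv_params(240, 240, 3, groups=3)  # grouped, g = 3: 172800
print(normal, grouped)  # grouped is exactly 1/3 of normal
```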

Depthwise Convolution sets the number of groups g equal to in_channels and convolves each input channel separately, so each kernel processes exactly one channel. With a per-channel kernel of size 1 × k × k, the layer has in_channels × k × k parameters, and the number of output feature-map channels equals the number of input channels.

Pointwise Group Convolution builds on group convolution by fixing each group's kernel size to 1×1, giving (in_channels/g × 1 × 1) × out_channels parameters.
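The three parameter formulas above can be compared side by side. Again the sizes are illustrative choices, not values from the paper:

```python
def depthwise_params(in_channels, k):
    # one k x k kernel per input channel
    return in_channels * k * k

def pointwise_group_params(in_channels, out_channels, g):
    # 1x1 kernels, each connected to in_channels/g input maps
    return (in_channels // g) * 1 * 1 * out_channels

# Illustrative sizes: a 240-channel feature map, 3x3 depthwise kernels, g = 3.
dw = depthwise_params(240, 3)                   # 2160
pw_dense = pointwise_group_params(240, 240, 1)  # 57600 (dense 1x1 convolution)
pw_group = pointwise_group_params(240, 240, 3)  # 19200 (1/3 of the dense cost)
print(dw, pw_dense, pw_group)
```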

%%capture captured_output
# mindspore==2.2.14 is preinstalled in the lab environment; to switch versions, change the version number below
!pip uninstall mindspore -y
!pip install -i https://pypi.mirrors.ustc.edu.cn/simple mindspore==2.2.14
# check the current mindspore version
!pip show mindspore
Name: mindspore
Version: 2.2.14
Summary: MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
Home-page: https://www.mindspore.cn
Author: The MindSpore Authors
Author-email: contact@mindspore.cn
License: Apache 2.0
Location: /home/nginx/miniconda/envs/jupyter/lib/python3.9/site-packages
Requires: asttokens, astunparse, numpy, packaging, pillow, protobuf, psutil, scipy
Required-by: 
from mindspore import nn
import mindspore.ops as ops
from mindspore import Tensor

class GroupConv(nn.Cell):
    """Group convolution: split the input channels into `groups` parts,
    run an independent Conv2d on each part, and concatenate the results."""
    def __init__(self, in_channels, out_channels, kernel_size,
                 stride, pad_mode="pad", pad=0, groups=1, has_bias=False):
        super(GroupConv, self).__init__()
        self.groups = groups
        self.convs = nn.CellList()
        for _ in range(groups):
            # each sub-convolution handles 1/groups of the input and output channels
            self.convs.append(nn.Conv2d(in_channels // groups, out_channels // groups,
                                        kernel_size=kernel_size, stride=stride, has_bias=has_bias,
                                        padding=pad, pad_mode=pad_mode, group=1, weight_init='xavier_uniform'))

    def construct(self, x):
        # split the channel axis (axis=1) into `groups` equal chunks
        features = ops.split(x, split_size_or_sections=int(len(x[0]) // self.groups), axis=1)
        outputs = ()
        for i in range(self.groups):
            outputs = outputs + (self.convs[i](features[i].astype("float32")),)
        # concatenate the per-group outputs back along the channel axis
        out = ops.cat(outputs, axis=1)
        return out

Pointwise Group Convolution combines pointwise (1×1) convolution with group convolution: the input feature map is split into groups, a 1×1 convolution is applied within each group, and the per-group results are concatenated. The GroupConv class above implements this. However, channels in different groups remain informationally isolated from one another.

Channel Shuffle

The drawback of Group Convolution is that channels in different groups cannot exchange information: once GConv layers are stacked, the feature maps of different groups never communicate, as if split into g unrelated lanes where everyone walks their own path, which can weaken the network's feature-extraction ability. This is also why networks such as Xception and MobileNet use dense pointwise convolutions (Dense Pointwise Convolution).

To break this "inbreeding" among group-restricted channels, ShuffleNet avoids the heavy use of dense 1×1 convolutions (whose share of the computation reaches a striking 93.4% when they are used) and introduces the Channel Shuffle mechanism. Intuitively, this operation evenly scatters and recombines the channels of different groups, so that the next layer can process information from all groups.

shufflenet3

As shown in the figure below, for g groups each with n channels, the channel dimension is first reshaped into a matrix with g rows and n columns, the matrix is transposed to n rows and g columns, and the result is flattened to obtain the new channel order. These operations are all differentiable and cheap to compute, so they solve the cross-group information exchange problem while remaining true to ShuffleNet's lightweight design.

shufflenet4
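The reshape-transpose-flatten recipe described above can be replayed on a flat list of channel labels. This is a minimal pure-Python sketch; the group count and group size are illustrative:

```python
def channel_shuffle(channels, g):
    # channels: g * n entries laid out group by group (group-major order)
    n = len(channels) // g
    rows = [channels[i * n:(i + 1) * n] for i in range(g)]  # reshape to g x n
    transposed = list(zip(*rows))                           # transpose to n x g
    return [c for row in transposed for c in row]           # flatten

# 3 groups of 3 channels: after the shuffle, every run of 3 consecutive
# output channels draws one channel from each original group.
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5, 6, 7, 8], 3)
print(shuffled)  # [0, 3, 6, 1, 4, 7, 2, 5, 8]
```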

For readability, the implementation of Channel Shuffle is placed inside the ShuffleNet block code below.

Channel Shuffle reorders channels to resolve the information isolation between channel groups.

ShuffleNet Block

As shown in the figure below, ShuffleNet modifies the bottleneck structure of ResNet from (a) to (b) and (c):

  1. Replace the initial and final 1×1 convolutions (dimension reduction and expansion) with Pointwise Group Convolution;

  2. Perform a Channel Shuffle after the dimension reduction so that information can flow across channel groups;

  3. In the downsampling block, set the stride of the 3×3 Depthwise Convolution to 2, halving the feature map's height and width; accordingly, the shortcut uses a 3×3 average pooling with stride 2, and the element-wise addition is replaced with concatenation.

shufflenet5

class ShuffleV1Block(nn.Cell):
    def __init__(self, inp, oup, group, first_group, mid_channels, ksize, stride):
        super(ShuffleV1Block, self).__init__()
        self.stride = stride
        pad = ksize // 2
        self.group = group
        if stride == 2:
            # the shortcut is concatenated in the stride-2 block, so the
            # main branch only produces the remaining oup - inp channels
            outputs = oup - inp
        else:
            outputs = oup
        self.relu = nn.ReLU()
        branch_main_1 = [
            GroupConv(in_channels=inp, out_channels=mid_channels,
                      kernel_size=1, stride=1, pad_mode="pad", pad=0,
                      groups=1 if first_group else group),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(),
        ]
        branch_main_2 = [
            nn.Conv2d(mid_channels, mid_channels, kernel_size=ksize, stride=stride,
                      pad_mode='pad', padding=pad, group=mid_channels,
                      weight_init='xavier_uniform', has_bias=False),
            nn.BatchNorm2d(mid_channels),
            GroupConv(in_channels=mid_channels, out_channels=outputs,
                      kernel_size=1, stride=1, pad_mode="pad", pad=0,
                      groups=group),
            nn.BatchNorm2d(outputs),
        ]
        self.branch_main_1 = nn.SequentialCell(branch_main_1)
        self.branch_main_2 = nn.SequentialCell(branch_main_2)
        if stride == 2:
            self.branch_proj = nn.AvgPool2d(kernel_size=3, stride=2, pad_mode='same')

    def construct(self, old_x):
        left = old_x
        right = old_x
        out = old_x
        right = self.branch_main_1(right)
        if self.group > 1:
            right = self.channel_shuffle(right)
        right = self.branch_main_2(right)
        if self.stride == 1:
            out = self.relu(left + right)
        elif self.stride == 2:
            left = self.branch_proj(left)
            out = ops.cat((left, right), 1)
            out = self.relu(out)
        return out

    def channel_shuffle(self, x):
        # reshape -> transpose -> reshape: evenly interleave the channels
        # of different groups so the next layer sees all of them
        batchsize, num_channels, height, width = ops.shape(x)
        group_channels = num_channels // self.group
        x = ops.reshape(x, (batchsize, group_channels, self.group, height, width))
        x = ops.transpose(x, (0, 2, 1, 3, 4))
        x = ops.reshape(x, (batchsize, num_channels, height, width))
        return x

The ShuffleV1Block class implements the basic building block of ShuffleNet V1, and its channel_shuffle method performs the channel reordering. In the class, left and right are the outputs of the block's two branches, a modification of the residual-connection bottleneck structure.

Building the ShuffleNet Network

The ShuffleNet architecture is shown in the figure below. Taking a 224×224 input image and 3 groups (g = 3) as an example: the image first passes through a convolution layer with 24 kernels of size 3×3 and stride 2, producing a 112×112 feature map with 24 channels; a stride-2 max-pooling layer then reduces it to 56×56 with the channel count unchanged. Next come three stacked stages of ShuffleNet blocks (Stage2, Stage3, Stage4), repeated 4, 8, and 4 times respectively; each stage begins with one downsampling block ((c) in the figure above) that halves the feature map's height and width and doubles its channels (except the downsampling block of Stage2, which raises the channel count from 24 to 240). Finally, global average pooling produces a 1×1×960 output, and a fully connected layer followed by softmax yields the class probabilities.

shufflenet6
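The spatial sizes in the walkthrough above can be verified with a few lines of arithmetic (pure Python, values for the g = 3, 1.0x configuration):

```python
size, channels = 224, 3
size, channels = size // 2, 24           # 3x3 conv, stride 2 -> 112x112, 24 ch
size //= 2                               # 3x3 max pool, stride 2 -> 56x56
stages = [(4, 240), (8, 480), (4, 960)]  # (repeats, channels) for Stage2-4
trace = []
for repeats, out_c in stages:
    size //= 2                           # first block of each stage downsamples
    channels = out_c
    trace.append((size, channels))
print(trace)  # [(28, 240), (14, 480), (7, 960)] -> global pool gives 1x1x960
```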

class ShuffleNetV1(nn.Cell):
    def __init__(self, n_class=1000, model_size='2.0x', group=3):
        super(ShuffleNetV1, self).__init__()
        print('model size is ', model_size)
        self.stage_repeats = [4, 8, 4]
        self.model_size = model_size
        if group == 3:
            if model_size == '0.5x':
                self.stage_out_channels = [-1, 12, 120, 240, 480]
            elif model_size == '1.0x':
                self.stage_out_channels = [-1, 24, 240, 480, 960]
            elif model_size == '1.5x':
                self.stage_out_channels = [-1, 24, 360, 720, 1440]
            elif model_size == '2.0x':
                self.stage_out_channels = [-1, 48, 480, 960, 1920]
            else:
                raise NotImplementedError
        elif group == 8:
            if model_size == '0.5x':
                self.stage_out_channels = [-1, 16, 192, 384, 768]
            elif model_size == '1.0x':
                self.stage_out_channels = [-1, 24, 384, 768, 1536]
            elif model_size == '1.5x':
                self.stage_out_channels = [-1, 24, 576, 1152, 2304]
            elif model_size == '2.0x':
                self.stage_out_channels = [-1, 48, 768, 1536, 3072]
            else:
                raise NotImplementedError
        input_channel = self.stage_out_channels[1]
        self.first_conv = nn.SequentialCell(
            nn.Conv2d(3, input_channel, 3, 2, 'pad', 1, weight_init='xavier_uniform', has_bias=False),
            nn.BatchNorm2d(input_channel),
            nn.ReLU(),
        )
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode='same')
        features = []
        for idxstage in range(len(self.stage_repeats)):
            numrepeat = self.stage_repeats[idxstage]
            output_channel = self.stage_out_channels[idxstage + 2]
            for i in range(numrepeat):
                stride = 2 if i == 0 else 1
                first_group = idxstage == 0 and i == 0
                features.append(ShuffleV1Block(input_channel, output_channel,
                                               group=group, first_group=first_group,
                                               mid_channels=output_channel // 4, ksize=3, stride=stride))
                input_channel = output_channel
        self.features = nn.SequentialCell(features)
        self.globalpool = nn.AvgPool2d(7)
        self.classifier = nn.Dense(self.stage_out_channels[-1], n_class)

    def construct(self, x):
        x = self.first_conv(x)
        x = self.maxpool(x)
        x = self.features(x)
        x = self.globalpool(x)
        x = ops.reshape(x, (-1, self.stage_out_channels[-1]))
        x = self.classifier(x)
        return x

The ShuffleNetV1 class builds the complete ShuffleNet network from multiple stages of ShuffleV1Block. Based on model_size and group, it initializes the output channels of each stage. It creates the first convolution block first_conv and a max-pooling layer maxpool, then loops to create the stages, each containing several ShuffleV1Blocks; the stride=2 blocks halve the feature-map size stage by stage. Global average pooling globalpool, a reshape of the feature map, and the fully connected layer classifier produce the classification prediction.
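As a sanity check, the constructor's block-building loop can be replayed in plain Python (channel table for model_size='2.0x', group=3, copied from the class above):

```python
stage_repeats = [4, 8, 4]
stage_out_channels = [-1, 48, 480, 960, 1920]
input_channel = stage_out_channels[1]
blocks = []
for idxstage, numrepeat in enumerate(stage_repeats):
    output_channel = stage_out_channels[idxstage + 2]
    for i in range(numrepeat):
        stride = 2 if i == 0 else 1       # only the first block downsamples
        blocks.append((input_channel, output_channel, stride))
        input_channel = output_channel
print(len(blocks))   # 16 blocks in total
print(blocks[:2])    # [(48, 480, 2), (480, 480, 1)]
```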

Model Training and Evaluation

ShuffleNet is pre-trained on the CIFAR-10 dataset.

Preparing and Loading the Training Set

CIFAR-10 contains 60,000 32×32 color images evenly divided into 10 classes, with 50,000 images for training and 10,000 for testing. The example below uses the mindspore.dataset.Cifar10Dataset interface to download and load the CIFAR-10 training set; currently only the binary version (CIFAR-10 binary version) is supported.

from download import download

url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/cifar-10-binary.tar.gz"

download(url, "./dataset", kind="tar.gz", replace=True)
Creating data folder...
Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/cifar-10-binary.tar.gz (162.2 MB)

file_sizes: 100%|█████████████████████████████| 170M/170M [00:00<00:00, 183MB/s]
Extracting tar.gz file...
Successfully downloaded / unzipped to ./dataset


'./dataset'
import mindspore as ms
from mindspore.dataset import Cifar10Dataset
from mindspore.dataset import vision, transforms

def get_dataset(train_dataset_path, batch_size, usage):
    image_trans = []
    if usage == "train":
        image_trans = [
            vision.RandomCrop((32, 32), (4, 4, 4, 4)),
            vision.RandomHorizontalFlip(prob=0.5),
            vision.Resize((224, 224)),
            vision.Rescale(1.0 / 255.0, 0.0),
            vision.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),
            vision.HWC2CHW()
        ]
    elif usage == "test":
        image_trans = [
            vision.Resize((224, 224)),
            vision.Rescale(1.0 / 255.0, 0.0),
            vision.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),
            vision.HWC2CHW()
        ]
    label_trans = transforms.TypeCast(ms.int32)
    dataset = Cifar10Dataset(train_dataset_path, usage=usage, shuffle=True)
    dataset = dataset.map(image_trans, 'image')
    dataset = dataset.map(label_trans, 'label')
    dataset = dataset.batch(batch_size, drop_remainder=True)
    return dataset

dataset = get_dataset("./dataset/cifar-10-batches-bin", 128, "train")
batches_per_epoch = dataset.get_dataset_size()

The download function fetches the CIFAR-10 dataset. The get_dataset function preprocesses the data, including random cropping, flipping, resizing, and normalization, and splits it into batches.
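Since batch(128, drop_remainder=True) keeps only full batches, the number of steps per epoch is easy to predict (plain arithmetic, using the CIFAR-10 training-set size):

```python
train_images, batch_size = 50000, 128
batches_per_epoch = train_images // batch_size
dropped = train_images - batches_per_epoch * batch_size
print(batches_per_epoch, dropped)  # 390 steps per epoch, 80 images dropped
```

This matches the 390 steps per epoch visible in the training log further below.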

Model Training

This section pre-trains the network with randomly initialized parameters. First, ShuffleNetV1 is instantiated to define the network, with model size "2.0x"; the loss function is cross-entropy, the learning rate follows cosine annealing after 4 warmup epochs, and the optimizer is Momentum. The Model interface from mindspore.train then wraps the network, loss function, and optimizer, and model.train() trains the network. Passing ModelCheckpoint, CheckpointConfig, TimeMonitor, and LossMonitor as callbacks prints the epoch count, loss, and elapsed time during training, and saves the ckpt files to the current directory.
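The schedule produced by the cosine_decay_lr call in the code below can be sketched in plain Python. This mirrors per-epoch cosine annealing as we understand MindSpore's formula; treat it as an approximation for intuition, not a definition of the real API:

```python
import math

def cosine_decay(min_lr, max_lr, total_step, step_per_epoch, decay_epoch):
    # anneal from max_lr down to min_lr over decay_epoch epochs, stepwise per epoch
    lrs = []
    for i in range(total_step):
        epoch = i // step_per_epoch
        lrs.append(min_lr + 0.5 * (max_lr - min_lr)
                   * (1 + math.cos(math.pi * epoch / decay_epoch)))
    return lrs

sched = cosine_decay(0.0005, 0.05, 390 * 250, 390, 250)
print(sched[0], sched[-1])  # starts at 0.05, ends just above 0.0005
```

Note that the training cell then passes only lr_scheduler[-1], a single scalar, to the optimizer, so the optimizer effectively receives a constant learning rate; passing the full list instead would apply the schedule step by step.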

import time
import mindspore
import numpy as np
from mindspore import Tensor, nn
from mindspore.train import ModelCheckpoint, CheckpointConfig, TimeMonitor, LossMonitor, Model, Top1CategoricalAccuracy, Top5CategoricalAccuracy

def train():
    mindspore.set_context(mode=mindspore.PYNATIVE_MODE, device_target="Ascend")
    net = ShuffleNetV1(model_size="2.0x", n_class=10)
    loss = nn.CrossEntropyLoss(weight=None, reduction='mean', label_smoothing=0.1)
    min_lr = 0.0005
    base_lr = 0.05
    lr_scheduler = mindspore.nn.cosine_decay_lr(min_lr,
                                                base_lr,
                                                batches_per_epoch*250,
                                                batches_per_epoch,
                                                decay_epoch=250)
    lr = Tensor(lr_scheduler[-1])
    optimizer = nn.Momentum(params=net.trainable_params(), learning_rate=lr, momentum=0.9, weight_decay=0.00004, loss_scale=1024)
    loss_scale_manager = mindspore.amp.FixedLossScaleManager(1024, drop_overflow_update=False)
    model = Model(net, loss_fn=loss, optimizer=optimizer, amp_level="O3", loss_scale_manager=loss_scale_manager)
    callback = [TimeMonitor(), LossMonitor()]
    save_ckpt_path = "./"
    config_ckpt = CheckpointConfig(save_checkpoint_steps=batches_per_epoch, keep_checkpoint_max=5)
    ckpt_callback = ModelCheckpoint("shufflenetv1", directory=save_ckpt_path, config=config_ckpt)
    callback += [ckpt_callback]

    print("============== Starting Training ==============")
    start_time = time.time()
    # For time reasons, epoch = 5; adjust as needed
    model.train(5, dataset, callbacks=callback)
    use_time = time.time() - start_time
    hour = str(int(use_time // 60 // 60))
    minute = str(int(use_time // 60 % 60))
    second = str(int(use_time % 60))
    print("total time:" + hour + "h " + minute + "m " + second + "s")
    print("============== Train Success ==============")

if __name__ == '__main__':
    train()
model size is  2.0x
============== Starting Training ==============
epoch: 1 step: 1, loss is 2.487356185913086
epoch: 1 step: 2, loss is 2.4824328422546387
epoch: 1 step: 3, loss is 2.360534191131592
epoch: 1 step: 4, loss is 2.3881638050079346
epoch: 1 step: 5, loss is 2.4093079566955566
epoch: 1 step: 6, loss is 2.4161176681518555
epoch: 1 step: 7, loss is 2.307513952255249
epoch: 1 step: 8, loss is 2.3627288341522217
epoch: 1 step: 9, loss is 2.3538804054260254
epoch: 1 step: 10, loss is 2.3459255695343018
epoch: 1 step: 11, loss is 2.2624566555023193
epoch: 1 step: 12, loss is 2.315800666809082
epoch: 1 step: 13, loss is 2.308826446533203
epoch: 1 step: 14, loss is 2.276533842086792
epoch: 1 step: 15, loss is 2.3070459365844727
epoch: 1 step: 16, loss is 2.3226258754730225
epoch: 1 step: 17, loss is 2.3349223136901855
epoch: 1 step: 18, loss is 2.2568447589874268
epoch: 1 step: 19, loss is 2.300696849822998
epoch: 1 step: 20, loss is 2.234652519226074
epoch: 1 step: 21, loss is 2.2882838249206543
epoch: 1 step: 22, loss is 2.2472236156463623
epoch: 1 step: 23, loss is 2.2128169536590576
epoch: 1 step: 24, loss is 2.2238047122955322
epoch: 1 step: 25, loss is 2.188947916030884
epoch: 1 step: 26, loss is 2.12522292137146
epoch: 1 step: 27, loss is 2.1997528076171875
epoch: 1 step: 28, loss is 2.2016754150390625
epoch: 1 step: 29, loss is 2.267056465148926
epoch: 1 step: 30, loss is 2.220615863800049
epoch: 1 step: 31, loss is 2.163125514984131
epoch: 1 step: 32, loss is 2.1402995586395264
epoch: 1 step: 33, loss is 2.201935052871704
epoch: 1 step: 34, loss is 2.271354913711548
epoch: 1 step: 35, loss is 2.211737632751465
epoch: 1 step: 36, loss is 2.1788642406463623
epoch: 1 step: 37, loss is 2.142369031906128
epoch: 1 step: 38, loss is 2.150986433029175
epoch: 1 step: 39, loss is 2.1608712673187256
epoch: 1 step: 40, loss is 2.1094672679901123
epoch: 1 step: 41, loss is 2.1704046726226807
epoch: 1 step: 42, loss is 2.133333683013916
epoch: 1 step: 43, loss is 2.107698440551758
epoch: 1 step: 44, loss is 2.1666100025177
epoch: 1 step: 45, loss is 2.092921495437622
epoch: 1 step: 46, loss is 2.152892589569092
epoch: 1 step: 47, loss is 2.2185187339782715
epoch: 1 step: 48, loss is 2.108494758605957
epoch: 1 step: 49, loss is 2.124622344970703
epoch: 1 step: 50, loss is 2.0244524478912354
epoch: 1 step: 51, loss is 2.092057704925537
epoch: 1 step: 52, loss is 2.07167387008667
epoch: 1 step: 53, loss is 2.0988404750823975
epoch: 1 step: 54, loss is 2.0495665073394775
epoch: 1 step: 55, loss is 2.125042676925659
epoch: 1 step: 56, loss is 2.119354248046875
epoch: 1 step: 57, loss is 2.048170804977417
epoch: 1 step: 58, loss is 2.1628291606903076
epoch: 1 step: 59, loss is 2.062399387359619
epoch: 1 step: 60, loss is 2.1404664516448975
epoch: 1 step: 61, loss is 2.083807945251465
epoch: 1 step: 62, loss is 2.0124189853668213
epoch: 1 step: 63, loss is 2.0625486373901367
epoch: 1 step: 64, loss is 2.1296193599700928
epoch: 1 step: 65, loss is 2.095654010772705
epoch: 1 step: 66, loss is 2.019275426864624
epoch: 1 step: 67, loss is 1.9981178045272827
epoch: 1 step: 68, loss is 2.1032302379608154
epoch: 1 step: 69, loss is 2.0070807933807373
epoch: 1 step: 70, loss is 1.982177734375
epoch: 1 step: 71, loss is 2.008201837539673
epoch: 1 step: 72, loss is 2.148066520690918
epoch: 1 step: 73, loss is 2.047478675842285
epoch: 1 step: 74, loss is 2.01863956451416
epoch: 1 step: 76, loss is 2.1667799949645996
epoch: 1 step: 77, loss is 2.0567209720611572
epoch: 1 step: 78, loss is 2.05881929397583
epoch: 1 step: 79, loss is 2.0320184230804443
epoch: 1 step: 80, loss is 2.0250511169433594
epoch: 1 step: 81, loss is 2.0239150524139404
epoch: 1 step: 82, loss is 2.064993143081665
epoch: 1 step: 83, loss is 2.041712522506714
epoch: 1 step: 84, loss is 2.024174690246582
epoch: 1 step: 85, loss is 1.9830849170684814
epoch: 1 step: 86, loss is 2.028733730316162
epoch: 1 step: 87, loss is 1.9664912223815918
epoch: 1 step: 88, loss is 2.0286335945129395
epoch: 1 step: 89, loss is 2.0676581859588623
epoch: 1 step: 90, loss is 2.057992696762085
epoch: 1 step: 91, loss is 2.0323309898376465
epoch: 1 step: 92, loss is 1.984572172164917
epoch: 1 step: 93, loss is 2.0055670738220215
epoch: 1 step: 94, loss is 2.0943844318389893
epoch: 1 step: 95, loss is 1.894536018371582
epoch: 1 step: 96, loss is 1.9870450496673584
epoch: 1 step: 97, loss is 1.9714359045028687
epoch: 1 step: 98, loss is 2.0173873901367188
epoch: 1 step: 99, loss is 2.0027568340301514
epoch: 1 step: 100, loss is 2.0405211448669434
epoch: 1 step: 101, loss is 2.035789728164673
epoch: 1 step: 102, loss is 2.1301562786102295
epoch: 1 step: 103, loss is 1.987611174583435
epoch: 1 step: 104, loss is 2.0551233291625977
epoch: 1 step: 105, loss is 1.9624370336532593
epoch: 1 step: 106, loss is 1.9536328315734863
epoch: 1 step: 107, loss is 2.0937461853027344
epoch: 1 step: 108, loss is 2.0652730464935303
epoch: 1 step: 109, loss is 1.9838111400604248
epoch: 1 step: 110, loss is 2.10089111328125
epoch: 1 step: 111, loss is 2.036222457885742
epoch: 1 step: 112, loss is 1.9383563995361328
epoch: 1 step: 113, loss is 2.059077739715576
epoch: 1 step: 114, loss is 2.040703535079956
epoch: 1 step: 115, loss is 1.9993565082550049
epoch: 1 step: 116, loss is 2.050079584121704
epoch: 1 step: 117, loss is 1.9684908390045166
epoch: 1 step: 118, loss is 1.9950356483459473
epoch: 1 step: 119, loss is 2.0948896408081055
epoch: 1 step: 120, loss is 1.9908406734466553
epoch: 1 step: 121, loss is 2.0056936740875244
epoch: 1 step: 122, loss is 1.9615106582641602
epoch: 1 step: 123, loss is 2.0008866786956787
epoch: 1 step: 124, loss is 2.0557472705841064
epoch: 1 step: 125, loss is 1.9648888111114502
epoch: 1 step: 126, loss is 2.043461799621582
epoch: 1 step: 127, loss is 2.1182963848114014
epoch: 1 step: 128, loss is 2.100033760070801
epoch: 1 step: 129, loss is 1.9823192358016968
epoch: 1 step: 130, loss is 1.8895161151885986
epoch: 1 step: 131, loss is 1.900534749031067
epoch: 1 step: 132, loss is 1.9451504945755005
epoch: 1 step: 133, loss is 1.9374111890792847
epoch: 1 step: 134, loss is 2.0243308544158936
epoch: 1 step: 135, loss is 1.9536951780319214
epoch: 1 step: 136, loss is 2.1073625087738037
epoch: 1 step: 137, loss is 2.0234878063201904
epoch: 1 step: 138, loss is 1.9592859745025635
epoch: 1 step: 139, loss is 2.0593502521514893
epoch: 1 step: 140, loss is 2.0046000480651855
epoch: 1 step: 141, loss is 1.9181723594665527
epoch: 1 step: 142, loss is 1.8924415111541748
epoch: 1 step: 143, loss is 2.0287814140319824
epoch: 1 step: 144, loss is 2.0727789402008057
epoch: 1 step: 145, loss is 2.0406992435455322
epoch: 1 step: 146, loss is 1.9817670583724976
epoch: 1 step: 147, loss is 2.0699660778045654
epoch: 1 step: 148, loss is 2.087092638015747
epoch: 1 step: 149, loss is 2.040200710296631
epoch: 1 step: 150, loss is 2.01078724861145
epoch: 1 step: 151, loss is 2.0484468936920166
epoch: 1 step: 152, loss is 1.9874005317687988
epoch: 1 step: 153, loss is 1.9814248085021973
epoch: 1 step: 154, loss is 1.967137098312378
epoch: 1 step: 155, loss is 1.9417905807495117
epoch: 1 step: 156, loss is 1.9325077533721924
epoch: 1 step: 157, loss is 2.038830041885376
epoch: 1 step: 158, loss is 1.9740386009216309
epoch: 1 step: 159, loss is 1.982688307762146
epoch: 1 step: 160, loss is 2.0671212673187256
epoch: 1 step: 161, loss is 1.966343641281128
epoch: 1 step: 162, loss is 1.9925696849822998
epoch: 1 step: 163, loss is 1.946681261062622
epoch: 1 step: 164, loss is 1.946541428565979
epoch: 1 step: 165, loss is 2.023364543914795
epoch: 1 step: 166, loss is 1.902091145515442
epoch: 1 step: 167, loss is 1.9311779737472534
epoch: 1 step: 168, loss is 2.014444351196289
epoch: 1 step: 169, loss is 1.9326798915863037
epoch: 1 step: 170, loss is 2.029386520385742
epoch: 1 step: 171, loss is 1.9097285270690918
epoch: 1 step: 172, loss is 1.8932981491088867
epoch: 1 step: 173, loss is 1.9614872932434082
epoch: 1 step: 174, loss is 1.9768420457839966
epoch: 1 step: 175, loss is 1.9417202472686768
epoch: 1 step: 176, loss is 1.9478507041931152
epoch: 1 step: 177, loss is 1.935025691986084
epoch: 1 step: 178, loss is 1.9080874919891357
epoch: 1 step: 179, loss is 2.007932662963867
epoch: 1 step: 180, loss is 1.943689227104187
epoch: 1 step: 181, loss is 1.9303019046783447
epoch: 1 step: 182, loss is 2.0126776695251465
epoch: 1 step: 183, loss is 1.951655626296997
epoch: 1 step: 184, loss is 1.8968234062194824
epoch: 1 step: 185, loss is 1.9077074527740479
epoch: 1 step: 186, loss is 1.9576302766799927
epoch: 1 step: 187, loss is 2.0126757621765137
epoch: 1 step: 188, loss is 1.9318937063217163
epoch: 1 step: 189, loss is 2.070370674133301
epoch: 1 step: 190, loss is 1.9427413940429688
epoch: 1 step: 191, loss is 1.9774736166000366
epoch: 1 step: 192, loss is 1.9788503646850586
epoch: 1 step: 193, loss is 1.930885672569275
epoch: 1 step: 194, loss is 1.9480682611465454
epoch: 1 step: 195, loss is 1.9859598875045776
epoch: 1 step: 196, loss is 2.002511978149414
epoch: 1 step: 197, loss is 1.940032958984375
epoch: 1 step: 198, loss is 1.9755717515945435
epoch: 1 step: 199, loss is 1.939618706703186
epoch: 1 step: 200, loss is 2.001927137374878
epoch: 1 step: 201, loss is 1.9595344066619873
epoch: 1 step: 202, loss is 1.9937856197357178
epoch: 1 step: 203, loss is 2.03942608833313
epoch: 1 step: 204, loss is 1.9261990785598755
epoch: 1 step: 205, loss is 1.9673774242401123
epoch: 1 step: 206, loss is 1.9115777015686035
epoch: 1 step: 207, loss is 1.8920965194702148
epoch: 1 step: 208, loss is 2.0317740440368652
epoch: 1 step: 209, loss is 1.9703094959259033
epoch: 1 step: 210, loss is 1.8635448217391968
epoch: 1 step: 211, loss is 1.955244779586792
epoch: 1 step: 212, loss is 2.0128376483917236
epoch: 1 step: 213, loss is 1.9284648895263672
epoch: 1 step: 214, loss is 1.9553477764129639
epoch: 1 step: 215, loss is 1.9950671195983887
epoch: 1 step: 216, loss is 1.919708490371704
epoch: 1 step: 217, loss is 1.9465675354003906
epoch: 1 step: 218, loss is 1.9259291887283325
epoch: 1 step: 219, loss is 1.8622301816940308
epoch: 1 step: 220, loss is 1.925300121307373
epoch: 1 step: 221, loss is 1.8980937004089355
epoch: 1 step: 222, loss is 1.960056185722351
epoch: 1 step: 223, loss is 1.9597039222717285
epoch: 1 step: 224, loss is 1.8744360208511353
epoch: 1 step: 225, loss is 1.9043315649032593
epoch: 1 step: 226, loss is 1.9731948375701904
epoch: 1 step: 227, loss is 1.8819395303726196
epoch: 1 step: 228, loss is 1.8765403032302856
epoch: 1 step: 229, loss is 1.8105037212371826
epoch: 1 step: 230, loss is 1.9757781028747559
epoch: 1 step: 231, loss is 1.986043095588684
epoch: 1 step: 232, loss is 1.9739878177642822
epoch: 1 step: 233, loss is 1.9835622310638428
epoch: 1 step: 234, loss is 1.7967957258224487
epoch: 1 step: 235, loss is 1.9233113527297974
epoch: 1 step: 236, loss is 1.937123417854309
epoch: 1 step: 237, loss is 1.8942066431045532
epoch: 1 step: 238, loss is 1.997687816619873
epoch: 1 step: 239, loss is 1.9943475723266602
epoch: 1 step: 240, loss is 1.897137999534607
epoch: 1 step: 241, loss is 1.9524480104446411
epoch: 1 step: 242, loss is 1.858811378479004
epoch: 1 step: 243, loss is 1.884620189666748
epoch: 1 step: 244, loss is 1.9191231727600098
epoch: 1 step: 245, loss is 1.9229687452316284
epoch: 1 step: 246, loss is 1.9181244373321533
epoch: 1 step: 247, loss is 1.858770489692688
epoch: 1 step: 248, loss is 1.9117258787155151
epoch: 1 step: 249, loss is 1.92829430103302
epoch: 1 step: 250, loss is 1.9199907779693604
epoch: 1 step: 251, loss is 1.9308414459228516
epoch: 1 step: 252, loss is 1.945981502532959
epoch: 1 step: 253, loss is 1.9086549282073975
epoch: 1 step: 254, loss is 2.00075364112854
epoch: 1 step: 255, loss is 1.9397599697113037
epoch: 1 step: 256, loss is 1.9448959827423096
epoch: 1 step: 257, loss is 1.815433382987976
epoch: 1 step: 258, loss is 1.859832763671875
epoch: 1 step: 259, loss is 1.971736192703247
epoch: 1 step: 260, loss is 2.004875898361206
epoch: 1 step: 261, loss is 1.9012647867202759
epoch: 1 step: 262, loss is 1.8306684494018555
epoch: 1 step: 263, loss is 1.984313726425171
epoch: 1 step: 264, loss is 1.8443315029144287
epoch: 1 step: 265, loss is 1.8624560832977295
epoch: 1 step: 266, loss is 1.9326941967010498
epoch: 1 step: 267, loss is 1.9113365411758423
epoch: 1 step: 268, loss is 1.9158111810684204
epoch: 1 step: 269, loss is 2.049499273300171
epoch: 1 step: 270, loss is 1.9675345420837402
epoch: 1 step: 271, loss is 1.9279316663742065
epoch: 1 step: 272, loss is 1.8232439756393433
epoch: 1 step: 273, loss is 1.8978644609451294
epoch: 1 step: 274, loss is 1.89058256149292
epoch: 1 step: 275, loss is 1.8708573579788208
epoch: 1 step: 276, loss is 1.8658140897750854
epoch: 1 step: 277, loss is 1.9091688394546509
epoch: 1 step: 278, loss is 1.8666309118270874
epoch: 1 step: 279, loss is 1.9317255020141602
epoch: 1 step: 280, loss is 1.991161823272705
epoch: 1 step: 281, loss is 1.8272242546081543
epoch: 1 step: 282, loss is 1.9044129848480225
epoch: 1 step: 283, loss is 1.9505462646484375
epoch: 1 step: 284, loss is 1.859652042388916
epoch: 1 step: 285, loss is 1.88494074344635
epoch: 1 step: 286, loss is 1.9080116748809814
epoch: 1 step: 287, loss is 1.8882266283035278
epoch: 1 step: 288, loss is 1.9163529872894287
epoch: 1 step: 289, loss is 1.8715424537658691
epoch: 1 step: 290, loss is 1.895844578742981
epoch: 1 step: 291, loss is 1.926143765449524
epoch: 1 step: 292, loss is 1.8780732154846191
epoch: 1 step: 293, loss is 2.0592093467712402
epoch: 1 step: 294, loss is 1.8703604936599731
epoch: 1 step: 295, loss is 1.927522897720337
epoch: 1 step: 296, loss is 1.7786564826965332
epoch: 1 step: 297, loss is 1.860284686088562
epoch: 1 step: 298, loss is 1.8343312740325928
epoch: 1 step: 299, loss is 1.9373613595962524
epoch: 1 step: 300, loss is 1.8537380695343018
epoch: 1 step: 301, loss is 1.9855666160583496
epoch: 1 step: 302, loss is 1.922060489654541
epoch: 1 step: 303, loss is 1.9835606813430786
epoch: 1 step: 304, loss is 1.8231443166732788
epoch: 1 step: 305, loss is 1.9540646076202393
epoch: 1 step: 306, loss is 1.831480622291565
epoch: 1 step: 307, loss is 1.8689630031585693
epoch: 1 step: 308, loss is 1.9101365804672241
epoch: 1 step: 309, loss is 1.90790593624115
epoch: 1 step: 310, loss is 1.9162652492523193
epoch: 1 step: 311, loss is 1.9321383237838745
epoch: 1 step: 312, loss is 1.9892783164978027
epoch: 1 step: 313, loss is 1.8137577772140503
epoch: 1 step: 314, loss is 1.899651288986206
epoch: 1 step: 315, loss is 1.9436644315719604
epoch: 1 step: 316, loss is 1.7932474613189697
epoch: 1 step: 317, loss is 1.9467377662658691
epoch: 1 step: 318, loss is 1.9173305034637451
epoch: 1 step: 319, loss is 1.9331930875778198
epoch: 1 step: 320, loss is 1.8739713430404663
epoch: 1 step: 321, loss is 1.9036370515823364
epoch: 1 step: 322, loss is 1.8275716304779053
epoch: 1 step: 323, loss is 1.944501280784607
epoch: 1 step: 324, loss is 1.8843615055084229
epoch: 1 step: 325, loss is 1.837125539779663
epoch: 1 step: 326, loss is 1.8234528303146362
epoch: 1 step: 327, loss is 1.8961336612701416
epoch: 1 step: 328, loss is 1.9050860404968262
epoch: 1 step: 329, loss is 1.8687013387680054
epoch: 1 step: 330, loss is 1.8470200300216675
epoch: 1 step: 331, loss is 1.9998462200164795
epoch: 1 step: 332, loss is 1.9475746154785156
epoch: 1 step: 333, loss is 1.8606044054031372
epoch: 1 step: 334, loss is 1.8657100200653076
epoch: 1 step: 335, loss is 1.8894202709197998
epoch: 1 step: 336, loss is 1.8159394264221191
epoch: 1 step: 337, loss is 1.7843891382217407
epoch: 1 step: 338, loss is 1.8916523456573486
epoch: 1 step: 339, loss is 1.7935060262680054
epoch: 1 step: 340, loss is 1.8278096914291382
... (steps 341-387 omitted; loss fluctuates around 1.8-2.0) ...
epoch: 1 step: 388, loss is 1.8902710676193237
epoch: 1 step: 389, loss is 1.801804780960083
epoch: 1 step: 390, loss is 1.8723918199539185
Train epoch time: 521838.712 ms, per step time: 1338.048 ms
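The per-step lines above follow a fixed format ("epoch: E step: S, loss is L"). As an aside, a minimal sketch for post-processing such a log, e.g. to track the mean loss per epoch, could look like this (the `mean_loss_per_epoch` helper is hypothetical, not part of the tutorial code):

```python
import re

# Matches MindSpore LossMonitor-style lines: "epoch: E step: S, loss is L"
LINE_RE = re.compile(r"epoch: (\d+) step: (\d+), loss is ([\d.]+)")

def mean_loss_per_epoch(lines):
    """Return {epoch: mean loss} computed from log lines."""
    sums, counts = {}, {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip non-loss lines such as "Train epoch time: ..."
        epoch, loss = int(m.group(1)), float(m.group(3))
        sums[epoch] = sums.get(epoch, 0.0) + loss
        counts[epoch] = counts.get(epoch, 0) + 1
    return {e: sums[e] / counts[e] for e in sums}

# Small sample taken from the log above
log = [
    "epoch: 1 step: 389, loss is 1.8020504713058472",
    "epoch: 1 step: 390, loss is 1.8723918199539185",
    "Train epoch time: 521838.712 ms, per step time: 1338.048 ms",
    "epoch: 2 step: 1, loss is 1.834862232208252",
]
print(mean_loss_per_epoch(log))
```

This is just a convenience for eyeballing convergence from a saved notebook log; in practice a plotting callback or MindInsight would serve the same purpose.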
epoch: 2 step: 1, loss is 1.834862232208252
epoch: 2 step: 2, loss is 1.849745750427246
epoch: 2 step: 3, loss is 1.7054263353347778
epoch: 2 step: 4, loss is 1.7429084777832031
epoch: 2 step: 5, loss is 1.8363549709320068
... (steps 6-385 omitted; loss drifts from around 1.85 down to roughly 1.7) ...
epoch: 2 step: 386, loss is 1.6377445459365845
epoch: 2 step: 387, loss is 1.6689974069595337
epoch: 2 step: 388, loss is 1.746870994567871
epoch: 2 step: 389, loss is 1.6795246601104736
epoch: 2 step: 390, loss is 1.661218285560608
Train epoch time: 140430.625 ms, per step time: 360.079 ms
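A noticeable detail in the timings: epoch 1 reports 1338 ms per step versus 360 ms for epoch 2. This first-epoch gap typically reflects one-time graph compilation and warm-up rather than a change in the model. A quick check of the ratio (values taken directly from the two "Train epoch time" lines above):

```python
# Wall-clock epoch times reported in the log; the gap on epoch 1 is
# likely one-time graph compilation / warm-up overhead.
epoch1_ms, epoch2_ms = 521838.712, 140430.625
ratio = epoch1_ms / epoch2_ms
print(f"epoch 1 took {ratio:.2f}x as long as epoch 2")
```

So later epochs give a more representative estimate of steady-state training throughput.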
epoch: 3 step: 1, loss is 1.715036153793335
epoch: 3 step: 2, loss is 1.6474828720092773
epoch: 3 step: 3, loss is 1.657715916633606
epoch: 3 step: 4, loss is 1.5708024501800537
epoch: 3 step: 5, loss is 1.7823667526245117
... (steps 6-326 omitted; loss continues to fall, roughly from 1.75 toward 1.65) ...
epoch: 3 step: 327, loss is 1.676952838897705
epoch: 3 step: 328, loss is 1.7036805152893066
epoch: 3 step: 329, loss is 1.7681641578674316
epoch: 3 step: 330, loss is 1.6032052040100098
epoch: 3 step: 331, loss is 1.629135012626648
epoch: 3 step: 332, loss is 1.6627397537231445
epoch: 3 step: 333, loss is 1.589951992034912
epoch: 3 step: 334, loss is 1.6370927095413208
epoch: 3 step: 335, loss is 1.8232088088989258
epoch: 3 step: 336, loss is 1.635468602180481
epoch: 3 step: 337, loss is 1.6314421892166138
epoch: 3 step: 338, loss is 1.7062299251556396
epoch: 3 step: 339, loss is 1.6799325942993164
epoch: 3 step: 340, loss is 1.7260220050811768
epoch: 3 step: 341, loss is 1.7736895084381104
epoch: 3 step: 342, loss is 1.7677066326141357
epoch: 3 step: 343, loss is 1.62371826171875
epoch: 3 step: 344, loss is 1.8004579544067383
epoch: 3 step: 345, loss is 1.6883655786514282
epoch: 3 step: 346, loss is 1.680694341659546
epoch: 3 step: 347, loss is 1.6327433586120605
epoch: 3 step: 348, loss is 1.7157868146896362
epoch: 3 step: 349, loss is 1.553451657295227
epoch: 3 step: 350, loss is 1.5815865993499756
epoch: 3 step: 351, loss is 1.6608421802520752
epoch: 3 step: 352, loss is 1.6624281406402588
epoch: 3 step: 353, loss is 1.7508538961410522
epoch: 3 step: 354, loss is 1.6291323900222778
epoch: 3 step: 355, loss is 1.549778938293457
epoch: 3 step: 356, loss is 1.6602725982666016
epoch: 3 step: 357, loss is 1.6622283458709717
epoch: 3 step: 358, loss is 1.6748517751693726
epoch: 3 step: 359, loss is 1.628075122833252
epoch: 3 step: 360, loss is 1.6040685176849365
epoch: 3 step: 361, loss is 1.6652014255523682
epoch: 3 step: 362, loss is 1.6706931591033936
epoch: 3 step: 363, loss is 1.763084888458252
epoch: 3 step: 364, loss is 1.5816519260406494
epoch: 3 step: 365, loss is 1.6744717359542847
epoch: 3 step: 366, loss is 1.6755478382110596
epoch: 3 step: 367, loss is 1.6712672710418701
epoch: 3 step: 368, loss is 1.644205927848816
epoch: 3 step: 369, loss is 1.8095952272415161
epoch: 3 step: 370, loss is 1.700730800628662
epoch: 3 step: 371, loss is 1.6767041683197021
epoch: 3 step: 372, loss is 1.7047760486602783
epoch: 3 step: 373, loss is 1.6957398653030396
epoch: 3 step: 374, loss is 1.724259853363037
epoch: 3 step: 375, loss is 1.719929814338684
epoch: 3 step: 376, loss is 1.7747626304626465
epoch: 3 step: 377, loss is 1.663613200187683
epoch: 3 step: 378, loss is 1.6453518867492676
epoch: 3 step: 379, loss is 1.7729018926620483
epoch: 3 step: 380, loss is 1.6704776287078857
epoch: 3 step: 381, loss is 1.73311448097229
epoch: 3 step: 382, loss is 1.6858258247375488
epoch: 3 step: 383, loss is 1.7387051582336426
epoch: 3 step: 384, loss is 1.6899727582931519
epoch: 3 step: 385, loss is 1.668046474456787
epoch: 3 step: 386, loss is 1.744113564491272
epoch: 3 step: 387, loss is 1.7295618057250977
epoch: 3 step: 388, loss is 1.6718225479125977
epoch: 3 step: 389, loss is 1.6799012422561646
epoch: 3 step: 390, loss is 1.6646846532821655
Train epoch time: 137277.460 ms, per step time: 351.993 ms
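顺带一提,日志末尾的 "per step time" 就是整个 epoch 的耗时除以该 epoch 的 step 数(此处为 390)。下面是一个仅作演示的小验证(数值取自上面的 epoch 3 日志行,并非训练代码的一部分):

```python
# 仅作演示:验证 "per step time" = epoch 总耗时 / step 数
epoch_time_ms = 137277.460   # 取自上面 epoch 3 的日志行
steps_per_epoch = 390        # 每个 epoch 的 step 数

per_step_ms = epoch_time_ms / steps_per_epoch
print(f"per step time: {per_step_ms:.3f} ms")  # 与日志中的 351.993 ms 一致
```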
epoch: 4 step: 1, loss is 1.7665646076202393
epoch: 4 step: 2, loss is 1.7729929685592651
epoch: 4 step: 3, loss is 1.6470832824707031
epoch: 4 step: 4, loss is 1.6297564506530762
epoch: 4 step: 5, loss is 1.8630563020706177
epoch: 4 step: 6, loss is 1.602109432220459
epoch: 4 step: 7, loss is 1.6661564111709595
epoch: 4 step: 8, loss is 1.6210423707962036
epoch: 4 step: 9, loss is 1.718315601348877
epoch: 4 step: 10, loss is 1.6194649934768677
epoch: 4 step: 11, loss is 1.6551474332809448
epoch: 4 step: 12, loss is 1.587424874305725
epoch: 4 step: 13, loss is 1.7578119039535522
epoch: 4 step: 14, loss is 1.5686590671539307
epoch: 4 step: 15, loss is 1.6391873359680176
epoch: 4 step: 16, loss is 1.6676154136657715
epoch: 4 step: 17, loss is 1.57009756565094
epoch: 4 step: 18, loss is 1.5941141843795776
epoch: 4 step: 19, loss is 1.600658655166626
epoch: 4 step: 20, loss is 1.739237666130066
epoch: 4 step: 21, loss is 1.731091856956482
epoch: 4 step: 22, loss is 1.6422715187072754
epoch: 4 step: 23, loss is 1.707399606704712
epoch: 4 step: 24, loss is 1.611019492149353
epoch: 4 step: 25, loss is 1.6092603206634521
epoch: 4 step: 26, loss is 1.7111058235168457
epoch: 4 step: 27, loss is 1.6297054290771484
epoch: 4 step: 28, loss is 1.5696802139282227
epoch: 4 step: 29, loss is 1.6419718265533447
epoch: 4 step: 30, loss is 1.6743837594985962
epoch: 4 step: 31, loss is 1.582382082939148
epoch: 4 step: 32, loss is 1.577364444732666
epoch: 4 step: 33, loss is 1.6486183404922485
epoch: 4 step: 34, loss is 1.5592169761657715
epoch: 4 step: 35, loss is 1.7035810947418213
epoch: 4 step: 36, loss is 1.6690112352371216
epoch: 4 step: 37, loss is 1.7288247346878052
epoch: 4 step: 38, loss is 1.6016851663589478
epoch: 4 step: 39, loss is 1.5863162279129028
epoch: 4 step: 40, loss is 1.5745949745178223
epoch: 4 step: 41, loss is 1.7239658832550049
epoch: 4 step: 42, loss is 1.6004254817962646
epoch: 4 step: 43, loss is 1.6561836004257202
epoch: 4 step: 44, loss is 1.5571510791778564
epoch: 4 step: 45, loss is 1.674217700958252
epoch: 4 step: 46, loss is 1.6153512001037598
epoch: 4 step: 47, loss is 1.6411802768707275
epoch: 4 step: 48, loss is 1.7577253580093384
epoch: 4 step: 49, loss is 1.5709835290908813
epoch: 4 step: 50, loss is 1.6777337789535522
epoch: 4 step: 51, loss is 1.5701429843902588
epoch: 4 step: 52, loss is 1.6471360921859741
epoch: 4 step: 53, loss is 1.6432157754898071
epoch: 4 step: 54, loss is 1.5561511516571045
epoch: 4 step: 55, loss is 1.526894211769104
epoch: 4 step: 56, loss is 1.602760672569275
epoch: 4 step: 57, loss is 1.6962835788726807
epoch: 4 step: 58, loss is 1.5803269147872925
epoch: 4 step: 59, loss is 1.6841492652893066
epoch: 4 step: 60, loss is 1.5673837661743164
epoch: 4 step: 61, loss is 1.5781997442245483
epoch: 4 step: 62, loss is 1.7395633459091187
epoch: 4 step: 63, loss is 1.7205133438110352
epoch: 4 step: 64, loss is 1.6280759572982788
epoch: 4 step: 65, loss is 1.5832451581954956
epoch: 4 step: 66, loss is 1.6047499179840088
epoch: 4 step: 67, loss is 1.6179279088974
epoch: 4 step: 68, loss is 1.6697516441345215
epoch: 4 step: 69, loss is 1.758061408996582
epoch: 4 step: 70, loss is 1.6961665153503418
epoch: 4 step: 71, loss is 1.6344490051269531
epoch: 4 step: 72, loss is 1.6972894668579102
epoch: 4 step: 73, loss is 1.5842139720916748
epoch: 4 step: 74, loss is 1.689929485321045
epoch: 4 step: 75, loss is 1.7666369676589966
epoch: 4 step: 76, loss is 1.7107584476470947
epoch: 4 step: 77, loss is 1.6161322593688965
epoch: 4 step: 78, loss is 1.6704270839691162
epoch: 4 step: 79, loss is 1.8084357976913452
epoch: 4 step: 80, loss is 1.624877691268921
epoch: 4 step: 81, loss is 1.792837142944336
epoch: 4 step: 82, loss is 1.6415560245513916
epoch: 4 step: 83, loss is 1.8001680374145508
epoch: 4 step: 84, loss is 1.7114930152893066
epoch: 4 step: 85, loss is 1.6641674041748047
epoch: 4 step: 86, loss is 1.7693594694137573
epoch: 4 step: 87, loss is 1.7237135171890259
epoch: 4 step: 88, loss is 1.6091974973678589
epoch: 4 step: 89, loss is 1.5529742240905762
epoch: 4 step: 90, loss is 1.6230971813201904
epoch: 4 step: 91, loss is 1.6457107067108154
epoch: 4 step: 92, loss is 1.756361722946167
epoch: 4 step: 93, loss is 1.627695918083191
epoch: 4 step: 94, loss is 1.6984617710113525
epoch: 4 step: 95, loss is 1.7225611209869385
epoch: 4 step: 96, loss is 1.8529340028762817
epoch: 4 step: 97, loss is 1.6294504404067993
epoch: 4 step: 98, loss is 1.6180821657180786
epoch: 4 step: 99, loss is 1.5178242921829224
epoch: 4 step: 100, loss is 1.6345207691192627
epoch: 4 step: 101, loss is 1.6975094079971313
epoch: 4 step: 102, loss is 1.6876051425933838
epoch: 4 step: 103, loss is 1.5193448066711426
epoch: 4 step: 104, loss is 1.5847586393356323
epoch: 4 step: 105, loss is 1.7236822843551636
epoch: 4 step: 106, loss is 1.6124868392944336
epoch: 4 step: 107, loss is 1.7172006368637085
epoch: 4 step: 108, loss is 1.6853677034378052
epoch: 4 step: 109, loss is 1.6522276401519775
epoch: 4 step: 110, loss is 1.570724368095398
epoch: 4 step: 111, loss is 1.6803762912750244
epoch: 4 step: 112, loss is 1.6421411037445068
epoch: 4 step: 113, loss is 1.816841721534729
epoch: 4 step: 114, loss is 1.685617208480835
epoch: 4 step: 115, loss is 1.6247706413269043
epoch: 4 step: 116, loss is 1.629885196685791
epoch: 4 step: 117, loss is 1.6980822086334229
epoch: 4 step: 118, loss is 1.7601029872894287
epoch: 4 step: 119, loss is 1.7125773429870605
epoch: 4 step: 120, loss is 1.7811102867126465
epoch: 4 step: 121, loss is 1.6890501976013184
epoch: 4 step: 122, loss is 1.5726172924041748
epoch: 4 step: 123, loss is 1.5811424255371094
epoch: 4 step: 124, loss is 1.6147644519805908
epoch: 4 step: 125, loss is 1.7560322284698486
epoch: 4 step: 126, loss is 1.6851052045822144
epoch: 4 step: 127, loss is 1.6187108755111694
epoch: 4 step: 128, loss is 1.6883559226989746
epoch: 4 step: 129, loss is 1.7009637355804443
epoch: 4 step: 130, loss is 1.6632158756256104
epoch: 4 step: 131, loss is 1.620411992073059
epoch: 4 step: 132, loss is 1.5214558839797974
epoch: 4 step: 133, loss is 1.661008358001709
epoch: 4 step: 134, loss is 1.8005833625793457
epoch: 4 step: 135, loss is 1.7580019235610962
epoch: 4 step: 136, loss is 1.6286892890930176
epoch: 4 step: 137, loss is 1.588380217552185
epoch: 4 step: 138, loss is 1.7015607357025146
epoch: 4 step: 139, loss is 1.6343886852264404
epoch: 4 step: 140, loss is 1.766648530960083
epoch: 4 step: 141, loss is 1.6016619205474854
epoch: 4 step: 142, loss is 1.6623468399047852
epoch: 4 step: 143, loss is 1.6932034492492676
epoch: 4 step: 144, loss is 1.5762094259262085
epoch: 4 step: 145, loss is 1.648411750793457
epoch: 4 step: 146, loss is 1.5969412326812744
epoch: 4 step: 147, loss is 1.6773122549057007
epoch: 4 step: 148, loss is 1.581958293914795
epoch: 4 step: 149, loss is 1.7402784824371338
epoch: 4 step: 150, loss is 1.7538864612579346
epoch: 4 step: 151, loss is 1.739711046218872
epoch: 4 step: 152, loss is 1.717599630355835
epoch: 4 step: 153, loss is 1.6932005882263184
epoch: 4 step: 154, loss is 1.5731552839279175
epoch: 4 step: 155, loss is 1.6623612642288208
epoch: 4 step: 156, loss is 1.6327764987945557
epoch: 4 step: 157, loss is 1.791067123413086
epoch: 4 step: 158, loss is 1.649024248123169
epoch: 4 step: 159, loss is 1.6950156688690186
epoch: 4 step: 160, loss is 1.693366289138794
epoch: 4 step: 161, loss is 1.7357239723205566
epoch: 4 step: 162, loss is 1.6453094482421875
epoch: 4 step: 163, loss is 1.7228856086730957
epoch: 4 step: 164, loss is 1.737823724746704
epoch: 4 step: 165, loss is 1.6535261869430542
epoch: 4 step: 166, loss is 1.5822927951812744
epoch: 4 step: 167, loss is 1.7670165300369263
epoch: 4 step: 168, loss is 1.6337506771087646
epoch: 4 step: 169, loss is 1.626481294631958
epoch: 4 step: 170, loss is 1.614647388458252
epoch: 4 step: 171, loss is 1.5730955600738525
epoch: 4 step: 172, loss is 1.6641924381256104
epoch: 4 step: 173, loss is 1.7176563739776611
epoch: 4 step: 174, loss is 1.7853620052337646
epoch: 4 step: 175, loss is 1.6836720705032349
epoch: 4 step: 176, loss is 1.626739740371704
epoch: 4 step: 177, loss is 1.6609660387039185
epoch: 4 step: 178, loss is 1.658376693725586
epoch: 4 step: 179, loss is 1.6433024406433105
epoch: 4 step: 180, loss is 1.683169960975647
epoch: 4 step: 181, loss is 1.731007695198059
epoch: 4 step: 182, loss is 1.5968960523605347
epoch: 4 step: 183, loss is 1.7125822305679321
epoch: 4 step: 184, loss is 1.7482081651687622
epoch: 4 step: 185, loss is 1.4856934547424316
epoch: 4 step: 186, loss is 1.693978190422058
epoch: 4 step: 187, loss is 1.7054548263549805
epoch: 4 step: 188, loss is 1.617133617401123
epoch: 4 step: 189, loss is 1.7198596000671387
epoch: 4 step: 190, loss is 1.5512173175811768
epoch: 4 step: 191, loss is 1.6030992269515991
epoch: 4 step: 192, loss is 1.634267807006836
epoch: 4 step: 193, loss is 1.7750589847564697
epoch: 4 step: 194, loss is 1.6731497049331665
epoch: 4 step: 195, loss is 1.654366374015808
epoch: 4 step: 196, loss is 1.611971139907837
epoch: 4 step: 197, loss is 1.535520076751709
epoch: 4 step: 198, loss is 1.710953950881958
epoch: 4 step: 199, loss is 1.7189371585845947
epoch: 4 step: 200, loss is 1.6510109901428223
epoch: 4 step: 201, loss is 1.6933293342590332
epoch: 4 step: 202, loss is 1.6117509603500366
epoch: 4 step: 203, loss is 1.696395993232727
epoch: 4 step: 204, loss is 1.708708643913269
epoch: 4 step: 205, loss is 1.7166845798492432
epoch: 4 step: 206, loss is 1.549208164215088
epoch: 4 step: 207, loss is 1.646860122680664
epoch: 4 step: 208, loss is 1.761258602142334
epoch: 4 step: 209, loss is 1.632847785949707
epoch: 4 step: 210, loss is 1.6269383430480957
epoch: 4 step: 211, loss is 1.6960309743881226
epoch: 4 step: 212, loss is 1.692077875137329
epoch: 4 step: 213, loss is 1.7015795707702637
epoch: 4 step: 214, loss is 1.6602762937545776
epoch: 4 step: 215, loss is 1.5575273036956787
epoch: 4 step: 216, loss is 1.641848087310791
epoch: 4 step: 217, loss is 1.6609628200531006
epoch: 4 step: 218, loss is 1.647723913192749
epoch: 4 step: 219, loss is 1.5897362232208252
epoch: 4 step: 220, loss is 1.6362701654434204
epoch: 4 step: 221, loss is 1.4809000492095947
epoch: 4 step: 222, loss is 1.6989250183105469
epoch: 4 step: 223, loss is 1.684139609336853
epoch: 4 step: 224, loss is 1.604182243347168
epoch: 4 step: 225, loss is 1.597058892250061
epoch: 4 step: 226, loss is 1.5693527460098267
epoch: 4 step: 227, loss is 1.675795555114746
epoch: 4 step: 228, loss is 1.7298972606658936
epoch: 4 step: 229, loss is 1.5077394247055054
epoch: 4 step: 230, loss is 1.7128574848175049
epoch: 4 step: 231, loss is 1.6700948476791382
epoch: 4 step: 232, loss is 1.7240345478057861
epoch: 4 step: 233, loss is 1.6436165571212769
epoch: 4 step: 234, loss is 1.597033977508545
epoch: 4 step: 235, loss is 1.630689263343811
epoch: 4 step: 236, loss is 1.5650776624679565
epoch: 4 step: 237, loss is 1.5909273624420166
epoch: 4 step: 238, loss is 1.5114649534225464
epoch: 4 step: 239, loss is 1.4799362421035767
epoch: 4 step: 240, loss is 1.6256462335586548
epoch: 4 step: 241, loss is 1.6227055788040161
epoch: 4 step: 242, loss is 1.6424766778945923
epoch: 4 step: 243, loss is 1.6381595134735107
epoch: 4 step: 244, loss is 1.4983725547790527
epoch: 4 step: 245, loss is 1.6759635210037231
epoch: 4 step: 246, loss is 1.6533443927764893
epoch: 4 step: 247, loss is 1.6394306421279907
epoch: 4 step: 248, loss is 1.5157358646392822
epoch: 4 step: 249, loss is 1.794636845588684
epoch: 4 step: 250, loss is 1.543221354484558
epoch: 4 step: 251, loss is 1.7333502769470215
epoch: 4 step: 252, loss is 1.5999860763549805
epoch: 4 step: 253, loss is 1.7100112438201904
epoch: 4 step: 254, loss is 1.6709332466125488
epoch: 4 step: 255, loss is 1.734560489654541
epoch: 4 step: 256, loss is 1.6745027303695679
epoch: 4 step: 257, loss is 1.5218114852905273
epoch: 4 step: 258, loss is 1.6951849460601807
epoch: 4 step: 259, loss is 1.7123606204986572
epoch: 4 step: 260, loss is 1.587955117225647
epoch: 4 step: 261, loss is 1.6506975889205933
epoch: 4 step: 262, loss is 1.679764747619629
epoch: 4 step: 263, loss is 1.5851755142211914
epoch: 4 step: 264, loss is 1.531790018081665
epoch: 4 step: 265, loss is 1.660425066947937
epoch: 4 step: 266, loss is 1.5950064659118652
epoch: 4 step: 267, loss is 1.7657142877578735
epoch: 4 step: 268, loss is 1.6085398197174072
epoch: 4 step: 269, loss is 1.6423161029815674
epoch: 4 step: 270, loss is 1.661104440689087
epoch: 4 step: 271, loss is 1.6248400211334229
epoch: 4 step: 272, loss is 1.5915062427520752
epoch: 4 step: 273, loss is 1.61185884475708
epoch: 4 step: 274, loss is 1.5535991191864014
epoch: 4 step: 275, loss is 1.6014961004257202
epoch: 4 step: 276, loss is 1.4933654069900513
epoch: 4 step: 277, loss is 1.5550355911254883
epoch: 4 step: 278, loss is 1.679355263710022
epoch: 4 step: 279, loss is 1.5373117923736572
epoch: 4 step: 280, loss is 1.5168204307556152
epoch: 4 step: 281, loss is 1.5989758968353271
epoch: 4 step: 282, loss is 1.5473474264144897
epoch: 4 step: 283, loss is 1.6486226320266724
epoch: 4 step: 284, loss is 1.5247681140899658
epoch: 4 step: 285, loss is 1.4759743213653564
epoch: 4 step: 286, loss is 1.5603187084197998
epoch: 4 step: 287, loss is 1.5089948177337646
epoch: 4 step: 288, loss is 1.6216365098953247
epoch: 4 step: 289, loss is 1.6203341484069824
epoch: 4 step: 290, loss is 1.6797423362731934
epoch: 4 step: 291, loss is 1.5971567630767822
epoch: 4 step: 292, loss is 1.6036550998687744
epoch: 4 step: 293, loss is 1.475748062133789
epoch: 4 step: 294, loss is 1.652454137802124
epoch: 4 step: 295, loss is 1.646332025527954
epoch: 4 step: 296, loss is 1.6595573425292969
epoch: 4 step: 297, loss is 1.7773164510726929
epoch: 4 step: 298, loss is 1.487428903579712
epoch: 4 step: 299, loss is 1.7178772687911987
epoch: 4 step: 300, loss is 1.663989782333374
epoch: 4 step: 301, loss is 1.6825196743011475
epoch: 4 step: 302, loss is 1.5681579113006592
epoch: 4 step: 303, loss is 1.7053875923156738
epoch: 4 step: 304, loss is 1.634131669998169
epoch: 4 step: 305, loss is 1.5480968952178955
epoch: 4 step: 306, loss is 1.7233874797821045
epoch: 4 step: 307, loss is 1.5639179944992065
epoch: 4 step: 308, loss is 1.4927529096603394
epoch: 4 step: 309, loss is 1.7415945529937744
epoch: 4 step: 310, loss is 1.6337419748306274
epoch: 4 step: 311, loss is 1.568310260772705
epoch: 4 step: 312, loss is 1.566457748413086
epoch: 4 step: 313, loss is 1.7154525518417358
epoch: 4 step: 314, loss is 1.6517589092254639
epoch: 4 step: 315, loss is 1.6655876636505127
epoch: 4 step: 316, loss is 1.6702594757080078
epoch: 4 step: 317, loss is 1.651951789855957
epoch: 4 step: 318, loss is 1.6761770248413086
epoch: 4 step: 319, loss is 1.5988519191741943
epoch: 4 step: 320, loss is 1.588598370552063
epoch: 4 step: 321, loss is 1.6207096576690674
epoch: 4 step: 322, loss is 1.5871853828430176
epoch: 4 step: 323, loss is 1.6355260610580444
epoch: 4 step: 324, loss is 1.6514222621917725
epoch: 4 step: 325, loss is 1.6661778688430786
epoch: 4 step: 326, loss is 1.5839908123016357
epoch: 4 step: 327, loss is 1.5816900730133057
epoch: 4 step: 328, loss is 1.6220271587371826
epoch: 4 step: 329, loss is 1.5795402526855469
epoch: 4 step: 330, loss is 1.6900765895843506
epoch: 4 step: 331, loss is 1.62031888961792
epoch: 4 step: 332, loss is 1.609192132949829
epoch: 4 step: 333, loss is 1.602006435394287
epoch: 4 step: 334, loss is 1.4889105558395386
epoch: 4 step: 335, loss is 1.6649023294448853
epoch: 4 step: 336, loss is 1.5220891237258911
epoch: 4 step: 337, loss is 1.6232905387878418
epoch: 4 step: 338, loss is 1.636336088180542
epoch: 4 step: 339, loss is 1.5883915424346924
epoch: 4 step: 340, loss is 1.6407790184020996
epoch: 4 step: 341, loss is 1.641093373298645
epoch: 4 step: 342, loss is 1.7343354225158691
epoch: 4 step: 343, loss is 1.6026684045791626
epoch: 4 step: 344, loss is 1.6374675035476685
epoch: 4 step: 345, loss is 1.4993141889572144
epoch: 4 step: 346, loss is 1.479374885559082
epoch: 4 step: 347, loss is 1.615074872970581
epoch: 4 step: 348, loss is 1.50643789768219
epoch: 4 step: 349, loss is 1.5816335678100586
epoch: 4 step: 350, loss is 1.5549730062484741
epoch: 4 step: 351, loss is 1.6186494827270508
epoch: 4 step: 352, loss is 1.5993502140045166
epoch: 4 step: 353, loss is 1.5495643615722656
epoch: 4 step: 354, loss is 1.6571474075317383
epoch: 4 step: 355, loss is 1.560755968093872
epoch: 4 step: 356, loss is 1.5330283641815186
epoch: 4 step: 357, loss is 1.5570311546325684
epoch: 4 step: 358, loss is 1.6629343032836914
epoch: 4 step: 359, loss is 1.5750832557678223
epoch: 4 step: 360, loss is 1.6677916049957275
epoch: 4 step: 361, loss is 1.6378893852233887
epoch: 4 step: 362, loss is 1.6026842594146729
epoch: 4 step: 363, loss is 1.6248705387115479
epoch: 4 step: 364, loss is 1.5606310367584229
epoch: 4 step: 365, loss is 1.5833842754364014
epoch: 4 step: 366, loss is 1.6341984272003174
epoch: 4 step: 367, loss is 1.552811861038208
epoch: 4 step: 368, loss is 1.627872109413147
epoch: 4 step: 369, loss is 1.6115565299987793
epoch: 4 step: 370, loss is 1.635170578956604
epoch: 4 step: 371, loss is 1.5942012071609497
epoch: 4 step: 372, loss is 1.529130220413208
epoch: 4 step: 373, loss is 1.5580171346664429
epoch: 4 step: 374, loss is 1.5679078102111816
epoch: 4 step: 375, loss is 1.5632736682891846
epoch: 4 step: 376, loss is 1.6893508434295654
epoch: 4 step: 377, loss is 1.5821154117584229
epoch: 4 step: 378, loss is 1.5859253406524658
epoch: 4 step: 379, loss is 1.6288336515426636
epoch: 4 step: 380, loss is 1.6055165529251099
epoch: 4 step: 381, loss is 1.6159390211105347
epoch: 4 step: 382, loss is 1.7954552173614502
epoch: 4 step: 383, loss is 1.6712247133255005
epoch: 4 step: 384, loss is 1.490175724029541
epoch: 4 step: 385, loss is 1.6611671447753906
epoch: 4 step: 386, loss is 1.5845706462860107
epoch: 4 step: 387, loss is 1.5821824073791504
epoch: 4 step: 388, loss is 1.554640293121338
epoch: 4 step: 389, loss is 1.5333958864212036
epoch: 4 step: 390, loss is 1.5936779975891113
Train epoch time: 138425.337 ms, per step time: 354.937 ms
epoch: 5 step: 1, loss is 1.5464458465576172
epoch: 5 step: 2, loss is 1.5490660667419434
epoch: 5 step: 3, loss is 1.642794132232666
epoch: 5 step: 4, loss is 1.5246491432189941
epoch: 5 step: 5, loss is 1.6389213800430298
epoch: 5 step: 6, loss is 1.6280150413513184
epoch: 5 step: 7, loss is 1.5669608116149902
epoch: 5 step: 8, loss is 1.5948519706726074
epoch: 5 step: 9, loss is 1.5713417530059814
epoch: 5 step: 10, loss is 1.5953696966171265
epoch: 5 step: 11, loss is 1.7376065254211426
epoch: 5 step: 12, loss is 1.4884297847747803
epoch: 5 step: 13, loss is 1.5369075536727905
epoch: 5 step: 14, loss is 1.5518091917037964
epoch: 5 step: 15, loss is 1.489527940750122
epoch: 5 step: 16, loss is 1.5873897075653076
epoch: 5 step: 17, loss is 1.4995498657226562
epoch: 5 step: 18, loss is 1.6806131601333618
epoch: 5 step: 19, loss is 1.685680627822876
epoch: 5 step: 20, loss is 1.5529975891113281
epoch: 5 step: 21, loss is 1.548069715499878
epoch: 5 step: 22, loss is 1.6290614604949951
epoch: 5 step: 23, loss is 1.6917892694473267
epoch: 5 step: 24, loss is 1.7500019073486328
epoch: 5 step: 25, loss is 1.6475093364715576
epoch: 5 step: 26, loss is 1.6011072397232056
epoch: 5 step: 27, loss is 1.5014625787734985
epoch: 5 step: 28, loss is 1.624285340309143
epoch: 5 step: 29, loss is 1.5557959079742432
epoch: 5 step: 30, loss is 1.6671175956726074
epoch: 5 step: 31, loss is 1.6444437503814697
epoch: 5 step: 32, loss is 1.71543288230896
epoch: 5 step: 33, loss is 1.707556962966919
epoch: 5 step: 34, loss is 1.5246912240982056
epoch: 5 step: 35, loss is 1.5803017616271973
epoch: 5 step: 36, loss is 1.6457101106643677
epoch: 5 step: 37, loss is 1.5279536247253418
epoch: 5 step: 38, loss is 1.6704199314117432
epoch: 5 step: 39, loss is 1.7229416370391846
epoch: 5 step: 40, loss is 1.7658991813659668
epoch: 5 step: 41, loss is 1.5633502006530762
epoch: 5 step: 42, loss is 1.53678560256958
epoch: 5 step: 43, loss is 1.5142439603805542
epoch: 5 step: 44, loss is 1.5646271705627441
epoch: 5 step: 45, loss is 1.6065638065338135
epoch: 5 step: 46, loss is 1.5175371170043945
epoch: 5 step: 47, loss is 1.6855463981628418
epoch: 5 step: 48, loss is 1.4915989637374878
epoch: 5 step: 49, loss is 1.5415297746658325
epoch: 5 step: 50, loss is 1.676391839981079
epoch: 5 step: 51, loss is 1.583856225013733
epoch: 5 step: 52, loss is 1.6200132369995117
epoch: 5 step: 53, loss is 1.5288734436035156
epoch: 5 step: 54, loss is 1.7616413831710815
epoch: 5 step: 55, loss is 1.5649387836456299
epoch: 5 step: 56, loss is 1.5781047344207764
epoch: 5 step: 57, loss is 1.5427920818328857
epoch: 5 step: 58, loss is 1.6305038928985596
epoch: 5 step: 59, loss is 1.5771982669830322
epoch: 5 step: 60, loss is 1.5448408126831055
epoch: 5 step: 61, loss is 1.649052381515503
epoch: 5 step: 62, loss is 1.5636181831359863
epoch: 5 step: 63, loss is 1.5163731575012207
epoch: 5 step: 64, loss is 1.6505489349365234
epoch: 5 step: 65, loss is 1.5638759136199951
epoch: 5 step: 66, loss is 1.4435665607452393
epoch: 5 step: 67, loss is 1.5043169260025024
epoch: 5 step: 68, loss is 1.5111082792282104
epoch: 5 step: 69, loss is 1.5772652626037598
epoch: 5 step: 70, loss is 1.5063726902008057
epoch: 5 step: 71, loss is 1.6912550926208496
epoch: 5 step: 72, loss is 1.5251529216766357
epoch: 5 step: 73, loss is 1.6467955112457275
epoch: 5 step: 74, loss is 1.5381712913513184
epoch: 5 step: 75, loss is 1.7241185903549194
epoch: 5 step: 76, loss is 1.6900105476379395
epoch: 5 step: 77, loss is 1.5301634073257446
epoch: 5 step: 78, loss is 1.6200151443481445
epoch: 5 step: 79, loss is 1.656822919845581
epoch: 5 step: 80, loss is 1.6287442445755005
epoch: 5 step: 81, loss is 1.6830562353134155
epoch: 5 step: 82, loss is 1.5854827165603638
epoch: 5 step: 83, loss is 1.5578904151916504
epoch: 5 step: 84, loss is 1.5953161716461182
epoch: 5 step: 85, loss is 1.5304856300354004
epoch: 5 step: 86, loss is 1.509493112564087
epoch: 5 step: 87, loss is 1.545473337173462
epoch: 5 step: 88, loss is 1.6200509071350098
epoch: 5 step: 89, loss is 1.5903046131134033
epoch: 5 step: 90, loss is 1.5930538177490234
epoch: 5 step: 91, loss is 1.6241464614868164
epoch: 5 step: 92, loss is 1.6602556705474854
epoch: 5 step: 93, loss is 1.6808658838272095
epoch: 5 step: 94, loss is 1.56581711769104
epoch: 5 step: 95, loss is 1.6218782663345337
epoch: 5 step: 96, loss is 1.569502592086792
epoch: 5 step: 97, loss is 1.5460721254348755
epoch: 5 step: 98, loss is 1.5093741416931152
epoch: 5 step: 99, loss is 1.583551049232483
epoch: 5 step: 100, loss is 1.6650238037109375
epoch: 5 step: 101, loss is 1.4839850664138794
epoch: 5 step: 102, loss is 1.5525320768356323
epoch: 5 step: 103, loss is 1.74044930934906
epoch: 5 step: 104, loss is 1.572702407836914
epoch: 5 step: 105, loss is 1.6207044124603271
epoch: 5 step: 106, loss is 1.6691240072250366
epoch: 5 step: 107, loss is 1.609930157661438
epoch: 5 step: 108, loss is 1.624471664428711
epoch: 5 step: 109, loss is 1.6149026155471802
epoch: 5 step: 110, loss is 1.5685765743255615
epoch: 5 step: 111, loss is 1.5273855924606323
epoch: 5 step: 112, loss is 1.4798717498779297
epoch: 5 step: 113, loss is 1.5261898040771484
epoch: 5 step: 114, loss is 1.6411833763122559
epoch: 5 step: 115, loss is 1.7388556003570557
epoch: 5 step: 116, loss is 1.585860252380371
epoch: 5 step: 117, loss is 1.5292375087738037
epoch: 5 step: 118, loss is 1.609330654144287
epoch: 5 step: 119, loss is 1.515244722366333
epoch: 5 step: 120, loss is 1.6614331007003784
epoch: 5 step: 121, loss is 1.6544468402862549
epoch: 5 step: 122, loss is 1.7791221141815186
epoch: 5 step: 123, loss is 1.6416209936141968
epoch: 5 step: 124, loss is 1.6372172832489014
epoch: 5 step: 125, loss is 1.5603387355804443
epoch: 5 step: 126, loss is 1.651618242263794
epoch: 5 step: 127, loss is 1.6001204252243042
epoch: 5 step: 128, loss is 1.6608588695526123
epoch: 5 step: 129, loss is 1.6286869049072266
epoch: 5 step: 130, loss is 1.6249475479125977
epoch: 5 step: 131, loss is 1.612933874130249
epoch: 5 step: 132, loss is 1.6135444641113281
epoch: 5 step: 133, loss is 1.5163543224334717
epoch: 5 step: 134, loss is 1.560084342956543
epoch: 5 step: 135, loss is 1.7146730422973633
epoch: 5 step: 136, loss is 1.5430347919464111
epoch: 5 step: 137, loss is 1.6204291582107544
epoch: 5 step: 138, loss is 1.5874114036560059
epoch: 5 step: 139, loss is 1.667879343032837
epoch: 5 step: 140, loss is 1.50150465965271
epoch: 5 step: 141, loss is 1.524300456047058
epoch: 5 step: 142, loss is 1.6768046617507935
epoch: 5 step: 143, loss is 1.6169962882995605
epoch: 5 step: 144, loss is 1.6757628917694092
epoch: 5 step: 145, loss is 1.7038618326187134
epoch: 5 step: 146, loss is 1.6276432275772095
epoch: 5 step: 147, loss is 1.6057173013687134
epoch: 5 step: 148, loss is 1.5628745555877686
epoch: 5 step: 149, loss is 1.5292811393737793
epoch: 5 step: 150, loss is 1.5407886505126953
epoch: 5 step: 151, loss is 1.7137854099273682
epoch: 5 step: 152, loss is 1.5135602951049805
epoch: 5 step: 153, loss is 1.5987777709960938
epoch: 5 step: 154, loss is 1.5855841636657715
epoch: 5 step: 155, loss is 1.5194423198699951
...(steps 156-387 omitted; loss fluctuates between roughly 1.41 and 1.85)...
epoch: 5 step: 388, loss is 1.720490574836731
epoch: 5 step: 389, loss is 1.5138230323791504
epoch: 5 step: 390, loss is 1.676946997642517
Train epoch time: 140180.187 ms, per step time: 359.436 ms
total time:0h 17m 58s
============== Train Success ==============

The trained model is saved as shufflenetv1-5_390.ckpt in the current directory and is used for evaluation below.

The train function defines the training process. It sets the MindSpore run mode to PyNative and the target device to Ascend. The loss function is cross-entropy, the learning rate follows a cosine-annealing schedule, and the optimizer is Momentum. The Model class wraps the network, loss function, and optimizer, with automatic mixed precision (AMP) enabled. Among the callbacks, CheckpointConfig configures ModelCheckpoint, which saves model parameters automatically; TimeMonitor tracks elapsed time; and LossMonitor reports the loss value.
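The cosine-annealing schedule used above can be sketched in plain Python. The `cosine_decay_lr` helper below is an illustration of the usual per-epoch cosine-decay formula (MindSpore ships an equivalent helper), with the 390 steps per epoch taken from the training log:

```python
import math

def cosine_decay_lr(min_lr, max_lr, total_step, step_per_epoch, decay_epoch):
    """Return one learning-rate value per global step.

    The rate falls from max_lr toward min_lr along a half cosine,
    updated once per epoch (an illustrative sketch, not MindSpore's code).
    """
    lrs = []
    for step in range(total_step):
        epoch = min(step // step_per_epoch, decay_epoch)
        lr = min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * epoch / decay_epoch))
        lrs.append(lr)
    return lrs

lrs = cosine_decay_lr(min_lr=0.0, max_lr=0.8, total_step=5 * 390,
                      step_per_epoch=390, decay_epoch=5)
print(round(lrs[0], 4), round(lrs[-1], 4))  # 0.8 0.0764
```

The schedule is flat within each epoch and steps down between epochs; with more epochs the curve approaches a smooth cosine from max_lr to min_lr.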

Model Evaluation

Evaluate the model on the CIFAR-10 test set.

After setting the path of the checkpoint to evaluate, load the dataset, define Top-1 and Top-5 accuracy as the evaluation metrics, and finally evaluate the model with the model.eval() interface.
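What Top-1 and Top-5 accuracy compute can be sketched in NumPy. The `top_k_accuracy` helper and the toy logits below are illustrative, not MindSpore's metric implementation:

```python
import numpy as np

def top_k_accuracy(logits, labels, k):
    """Fraction of samples whose true label is among the k highest logits."""
    topk = np.argsort(logits, axis=1)[:, -k:]          # indices of the k largest logits
    hits = [label in row for row, label in zip(topk, labels)]
    return float(np.mean(hits))

logits = np.array([[0.1, 0.5, 0.2, 0.9],   # highest logit: class 3
                   [0.8, 0.1, 0.6, 0.3]])  # highest logit: class 0
labels = np.array([3, 2])
print(top_k_accuracy(logits, labels, k=1))  # 0.5: only the first sample is a top-1 hit
print(top_k_accuracy(logits, labels, k=2))  # 1.0: class 2 is second-best for sample 2
```

Top-5 accuracy is much more forgiving than Top-1 on a 10-class problem, which is why the evaluation below reports ~93% Top-5 but only ~51% Top-1 after five epochs.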

import time

import mindspore
from mindspore import load_checkpoint, load_param_into_net
# nn, Model, Top1CategoricalAccuracy, Top5CategoricalAccuracy, get_dataset
# and ShuffleNetV1 come from the earlier cells of this notebook

def test():
    # Evaluate in graph mode on Ascend
    mindspore.set_context(mode=mindspore.GRAPH_MODE, device_target="Ascend")
    dataset = get_dataset("./dataset/cifar-10-batches-bin", 128, "test")
    net = ShuffleNetV1(model_size="2.0x", n_class=10)
    # Load the trained checkpoint and switch the network to inference mode
    param_dict = load_checkpoint("shufflenetv1-5_390.ckpt")
    load_param_into_net(net, param_dict)
    net.set_train(False)
    loss = nn.CrossEntropyLoss(weight=None, reduction='mean', label_smoothing=0.1)
    eval_metrics = {'Loss': nn.Loss(), 'Top_1_Acc': Top1CategoricalAccuracy(),
                    'Top_5_Acc': Top5CategoricalAccuracy()}
    model = Model(net, loss_fn=loss, metrics=eval_metrics)
    start_time = time.time()
    res = model.eval(dataset, dataset_sink_mode=False)
    use_time = time.time() - start_time
    hour = str(int(use_time // 60 // 60))
    minute = str(int(use_time // 60 % 60))
    second = str(int(use_time % 60))
    log = "result:" + str(res) + ", ckpt:'" + "./shufflenetv1-5_390.ckpt" \
        + "', time: " + hour + "h " + minute + "m " + second + "s"
    print(log)
    filename = './eval_log.txt'
    with open(filename, 'a') as file_object:
        file_object.write(log + '\n')

if __name__ == '__main__':
    test()
model size is  2.0x
result:{'Loss': 1.5920430620511372, 'Top_1_Acc': 0.5093149038461539, 'Top_5_Acc': 0.9325921474358975}, ckpt:'./shufflenetv1-5_390.ckpt', time: 0h 1m 37s

Set the MindSpore run mode to graph mode (GRAPH_MODE) and the device to Ascend. Load the CIFAR-10 test set with the previously defined get_dataset function. Load the trained ShuffleNetV1 checkpoint and set the network to evaluation (non-training) mode. Define the loss function and the evaluation metrics (loss, Top-1 accuracy, and Top-5 accuracy). Evaluate the model with the model.eval interface, record the elapsed time, then print the result and append it to a log file.
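The evaluation loss uses label_smoothing=0.1, which softens the one-hot target before computing cross-entropy. A NumPy sketch of that definition (illustrative, not MindSpore's implementation):

```python
import numpy as np

def label_smoothed_ce(logits, label, smoothing=0.1):
    """Cross-entropy against a smoothed one-hot target.

    With C classes, the true class gets target probability
    1 - smoothing + smoothing/C and every other class gets smoothing/C.
    """
    num_classes = logits.shape[-1]
    log_probs = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
    target = np.full(num_classes, smoothing / num_classes)
    target[label] += 1.0 - smoothing
    return float(-np.sum(target * log_probs))

logits = np.array([2.0, 0.5, 0.1, -1.0])
print(label_smoothed_ce(logits, label=0, smoothing=0.1))  # slightly above plain CE
print(label_smoothed_ce(logits, label=0, smoothing=0.0))  # plain cross-entropy
```

Smoothing keeps the loss bounded away from zero even for confident correct predictions, which discourages overconfident logits; this is why the reported eval loss (~1.59) stays relatively high.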

Model Prediction

Run prediction on the CIFAR-10 test set and visualize the results.

import mindspore as ms
import numpy as np
import matplotlib.pyplot as plt
import mindspore.dataset as ds
from mindspore.dataset import vision

net = ShuffleNetV1(model_size="2.0x", n_class=10)
show_lst = []
param_dict = load_checkpoint("shufflenetv1-5_390.ckpt")
load_param_into_net(net, param_dict)
model = Model(net)
dataset_predict = ds.Cifar10Dataset(dataset_dir="./dataset/cifar-10-batches-bin", shuffle=False, usage="test")
dataset_show = ds.Cifar10Dataset(dataset_dir="./dataset/cifar-10-batches-bin", shuffle=False, usage="test")
dataset_show = dataset_show.batch(16)
show_images_lst = next(dataset_show.create_dict_iterator())["image"].asnumpy()
image_trans = [
    vision.RandomCrop((32, 32), (4, 4, 4, 4)),
    vision.RandomHorizontalFlip(prob=0.5),
    vision.Resize((224, 224)),
    vision.Rescale(1.0 / 255.0, 0.0),
    vision.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),
    vision.HWC2CHW()
        ]
dataset_predict = dataset_predict.map(image_trans, 'image')
dataset_predict = dataset_predict.batch(16)
class_dict = {0:"airplane", 1:"automobile", 2:"bird", 3:"cat", 4:"deer", 5:"dog", 6:"frog", 7:"horse", 8:"ship", 9:"truck"}
# Display inference results (the predicted class is shown above each image)
plt.figure(figsize=(16, 5))
predict_data = next(dataset_predict.create_dict_iterator())
output = model.predict(ms.Tensor(predict_data['image']))
pred = np.argmax(output.asnumpy(), axis=1)
index = 0
for image in show_images_lst:
    plt.subplot(2, 8, index+1)
    plt.title('{}'.format(class_dict[pred[index]]))
    index += 1
    plt.imshow(image)
    plt.axis("off")
plt.show()
model size is  2.0x

Load the trained model and its parameters. Prepare the dataset for prediction and preprocess the images (random crop, horizontal flip, resize, rescale, normalization, and HWC-to-CHW conversion). Batch the preprocessed dataset, run model.predict() on one batch, and display the predicted classes alongside the corresponding images.

Core idea: ShuffleNet introduces Pointwise Group Convolution and Channel Shuffle, which significantly reduce the model's computation while preserving accuracy. Achieving good accuracy under a tight compute budget makes it well suited to mobile and embedded devices.

Key techniques: Pointwise Group Convolution reduces the parameter count and computation through grouped 1x1 convolutions; Channel Shuffle rearranges channels to break the information isolation between groups.
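The Channel Shuffle operation can be sketched in NumPy with the standard reshape-transpose-reshape trick (an illustrative helper on an NCHW tensor, not MindSpore's implementation):

```python
import numpy as np

def channel_shuffle(x, groups):
    """Interleave channels across groups so the next grouped convolution
    sees channels drawn from every input group."""
    n, c, h, w = x.shape
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)   # swap the group axis and the per-group channel axis
    return x.reshape(n, c, h, w)

x = np.arange(6).reshape(1, 6, 1, 1)          # channels 0..5 in two groups: [0,1,2] and [3,4,5]
y = channel_shuffle(x, groups=2)
print(y.reshape(-1).tolist())  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, each group of the following grouped convolution receives a mix of channels from both original groups, which is exactly the cross-group information flow that plain grouped convolutions lack.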

Model architecture: the ShuffleNet unit is an improved version of ResNet's Bottleneck unit. The full network stacks multiple ShuffleNet units; across stages, repeated units and downsampling gradually increase the channel count while shrinking the feature maps.

Training and evaluation: the model is trained and evaluated on CIFAR-10 from randomly initialized parameters, using cross-entropy loss, the Momentum optimizer, and a cosine-annealing learning-rate schedule. On the CIFAR-10 test set, Top-1 and Top-5 accuracy serve as the evaluation metrics.

Visualization: plotting the predictions next to the images gives an intuitive view of the model's classification performance.
