Distilling Knowledge With Wide Residual Networks
Hinton et al, ‘Distilling the Knowledge in a Neural Network’ (2015)
Original article: https://arxiv.org/abs/1503.02531
Presentation by:
Abhinav Katoch (katoch@uchicago.edu),
Javid Lakha (javid.lakha@outlook.com),
Sergio Naval (s.naval.m@gmail.com),
Eduard Tuchfeld (eduard.tuchfeld@gmail.com)
Introduction
The paper “Distilling the Knowledge in a Neural Network” (Hinton et al., 2015) develops a methodology for transferring knowledge between classification models. Although knowledge transfer between models can be desirable in many practical scenarios, the paper highlights one objective in particular: obtaining simpler models that achieve performance similar to cumbersome models or ensembles of specialist models, without their computational overhead.
The strategy to generate this simpler model involves:
(1) Training cumbersome models or ensembles of specialist models that maximize predictive performance without regard to computational cost constraints.
(2) Reusing the knowledge generated by the cumbersome models to train simpler models. Specifically, the knowledge reused takes the form of the individual class probabilities.
Generally, the cumbersome model will have been trained with the objective of maximizing the average log probability of the correct class. As a side effect of the patterns it identifies, it will also assign probabilities to all the other classes. For example, in a network trained to recognize MNIST handwritten digits, even if the cumbersome model correctly predicts that a written digit is a 7, if that particular digit looks similar to a 3 the model will assign some probability to the 3 class, reflecting a pattern it has learned between the inputs and that class.
The distillation technique allows simpler models to exploit this additional information generated by the cumbersome model and thereby outperform models of the same complexity trained using only the class targets.
Distillation
As described in the introduction, part of the knowledge generated by the cumbersome model is reflected in the probabilities it assigns to each class for each sample. A common approach to assigning probabilities to multiple classes is a softmax layer. The softmax output layer assigns a probability $q_i$ to class $i$ using the scores (logits) $z_i$ with the following expression:
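$$ q_i = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)} $$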
T is a temperature parameter that is generally set to 1; higher temperatures do not change which class receives the highest probability, they just produce a softer probability distribution over the classes.
As we know from the lectures on Simulated Annealing, exponentiating a function tends to produce a very peaked output. For example, if we have the points (1, 2, 3) in logit space, exponentiating them gives roughly (2.7, 7.4, 20.1). Normalizing in the exponentiated space then yields approximately (9%, 25%, 66%) as the probabilities associated with these points, as opposed to (16.7%, 33.3%, 50%) if we normalized the logits directly. Since the softmax function exponentiates the logits, its effect as an output layer is to exaggerate the probability of the most likely class and downplay the probabilities of the less likely classes.
This is a problem because in the distillation process we are trying to convey information to the simpler model, but if we use the target distribution produced by the more complex model at temperature 1 we lose some of the knowledge it actually developed in producing its outputs. By raising the temperature we shrink the absolute values of the logits fed into the softmax function, and then exponentiate and sum those smaller values. Making the logits smaller pushes the exponentiated class values closer together, so the output probabilities for a given input vary more nearly linearly with the logits.
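As a quick illustration (a minimal sketch of our own, not from the paper), the snippet below applies a softmax at several temperatures to an arbitrary set of logits and shows the distribution softening as T grows:
import numpy as np
#
logits = np.array([1.0, 2.0, 3.0])   #arbitrary example logits
for T in (1, 2, 5, 20):
    z = np.exp(logits / T - np.max(logits / T))   #numerically stabilised softmax at temperature T
    print(T, np.round(z / z.sum(), 3))
#T=1 gives a peaked distribution (roughly [0.09, 0.24, 0.67]); as T grows the
#distribution approaches uniform, so the relative information about the less
#likely classes becomes more visible.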
It is possible to appreciate the impact of the temperature parameter with an example on handwritten digit recognition. The following chart shows the probabilities assigned to each of the classes (0 to 9) depending on the temperature. With a temperature of 1 the model assigns almost all the probability to class 4. As the temperature increases, the model assigns an increasing amount of probability to other classes based on the patterns it has identified that match them. In this example the model assigns probability to class 9 as the second most likely option:
from IPython.display import Image, display
display(Image(filename='four.png', embed=True))
In tasks like MNIST, models generally achieve almost perfect classification with high confidence, resulting in probabilities very near 1 for the correct class and probabilities on the order of 1e-6 or 1e-9 for incorrect classes. By using a high temperature instead of 1, the softmax output layer allows the probabilities to express the knowledge generated by the model in the form of the low probabilities assigned to wrong classes. Typical MNIST probabilities will not reveal patterns of other classes unless a high temperature is used:
display(Image(filename='two.png', embed=True))
from IPython.display import Image, display
display(Image(filename='seven.png', embed=True))
The “distillation” solution introduced in the paper proposes training the simpler model to match the probabilities generated by the cumbersome model at high temperature as a mechanism for transferring knowledge between the models. The student model is trained with an objective function that combines matching these probabilities (soft targets) and the original class labels (hard targets).
For the soft targets, a cross-entropy loss is used between the two sets of probabilities generated at the same temperature. For a task with classes indexed by $i$, teacher probabilities $p_i$ (generated from scores $v_i$) and student probabilities $q_i$ (generated from scores $z_i$), the soft objective is described by the following function:
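$$ C_{soft} = -\sum_i p_i \log q_i, \qquad p_i = \frac{\exp(v_i / T)}{\sum_j \exp(v_j / T)}, \qquad q_i = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)} $$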
Each case in the transfer set contributes a cross-entropy gradient with respect to each score $z_i$ of the distilled model. This gradient is given by:
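$$ \frac{\partial C_{soft}}{\partial z_i} = \frac{1}{T}\left(q_i - p_i\right) = \frac{1}{T}\left(\frac{e^{z_i/T}}{\sum_j e^{z_j/T}} - \frac{e^{v_i/T}}{\sum_j e^{v_j/T}}\right) $$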
Cross-entropy is also used for the hard targets (matching the true classes). For this term the student's softmax output is computed at temperature 1 and compared against the true class labels (1 for the true class, 0 for all other classes).
The paper proposes combining the two cross-entropies using a weighted average. In our implementation we use a parameter $\alpha$ to regulate the weight of the soft target; the hard target obtains weight $1-\alpha$. It is important to note that the magnitudes of the $C_{soft}$ gradients differ from those of the $C_{hard}$ gradients because of the different temperatures: the gradients of the soft targets scale as $1/T^2$. Before combining the two objective functions, the soft cost therefore needs to be scaled by $T^2$. The total objective function is defined as:
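$$ C = \alpha \, T^{2} \, C_{soft} + (1 - \alpha) \, C_{hard} $$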
The parameters introduced for distillation are:
- $T$, the temperature of the softmax layer used for the soft targets, which regulates how much of the teacher model's knowledge the probabilities express.
- $\alpha$, the weight applied to the soft target (the hard target receives $1-\alpha$).
These two parameters are tuned for the specific task.
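To make the combined objective concrete, here is a minimal PyTorch sketch of a distillation loss (our own illustration; the function name and the default values of T and alpha are placeholders to be tuned, not values from the paper). It is written to match the (child_output, labels, parent_output) call signature used by the training harness further below:
import torch.nn as nn
import torch.nn.functional as F
#
def make_distillation_loss(T=4.0, alpha=0.7):
    '''Returns a loss function with signature (child_logits, labels, parent_logits).
    alpha weights the soft (teacher) term; (1 - alpha) weights the hard term.
    The soft term is multiplied by T**2 so its gradients match the scale of the hard term.'''
    hard_loss = nn.CrossEntropyLoss()
    def loss_func(child_logits, labels, parent_logits):
        #teacher and student probabilities at temperature T (soft targets)
        p = F.softmax(parent_logits / T, dim=1)
        log_q = F.log_softmax(child_logits / T, dim=1)
        soft = -(p * log_q).sum(dim=1).mean()   #cross-entropy against the soft targets
        hard = hard_loss(child_logits, labels)  #ordinary cross-entropy at temperature 1
        return alpha * (T ** 2) * soft + (1 - alpha) * hard
    return loss_func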
Below, we implement two Residual Convolutional Neural Networks (ResCNN) - a “cumbersome” parent network that serves as the teacher and a significantly simpler child network that serves as the student.
The implementations are benchmarked on the CIFAR-10 image set. CIFAR-10 was chosen because it is significantly more challenging than the MNIST dataset, allowing us to better differentiate the performance of the parent and child implementations and the knowledge transfer process.
#Torch
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torch.backends.cudnn as cudnn
from torch.autograd import Variable
#TorchVision
import torchvision
import torchvision.transforms as transforms
#Numpy
import numpy as np
#MatplotLib
import matplotlib.pyplot as plt
#System
import os
import sys
import argparse
import time
import copy
from functools import partial
from IPython.display import IFrame, Image, display, clear_output
#Graphviz for Graph Visualization
try:
from graphviz import Digraph
except Exception as e:
print('Cannot import graphviz')
#etc
import re
#
print('PyTorch Version: {}'.format(torch.__version__))
PyTorch Version: 0.4.0
$ \textbf{Prepare Data Loaders} $
#Reference:
#http://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
#
BATCH_SIZE_TRAIN = 128
#
transform_train = transforms.Compose([
transforms.RandomCrop(32, padding=4),
transforms.RandomHorizontalFlip(),
transforms.ToTensor(),
transforms.Normalize( (0.5, 0.5, 0.5), (0.5, 0.5, 0.5) ),
])
transform_test = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize( (0.5, 0.5, 0.5), (0.5, 0.5, 0.5) ),
])
#
pin_memory = False
if torch.cuda.is_available():
torch.cuda.empty_cache()
pin_memory = True
#
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=BATCH_SIZE_TRAIN, shuffle=True, num_workers=2,
pin_memory=pin_memory)
#
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
#
#Test Data Loader
BATCH_SIZE_TEST = 128
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(testset, batch_size=BATCH_SIZE_TEST, shuffle=False, num_workers=2,
pin_memory=pin_memory)
Files already downloaded and verified
Files already downloaded and verified
$ \textbf{Visualizing the Data Set} $
#Reference:
#http://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
#
def imshow(img):
img = img / 2 + 0.5 # unnormalize
npimg = img.numpy()
plt.imshow(np.transpose(npimg, (1, 2, 0)))
# get some random training images
N = 24
dataiter = iter(testloader)
images, labels = dataiter.next()
images_tmp = images[0:N,:,:]
# show images
imshow(torchvision.utils.make_grid(images_tmp))
plt.figure(figsize=(32*N,32*N))
plt.show()
<matplotlib.figure.Figure at 0xa7fc358>
#LETS VISUALIZE THE NETWORK
#REFERENCE: https://discuss.pytorch.org/t/print-autograd-graph/692/
##
##REQUIRES GRAPHVIZ TO RUN
##
def viz_nn(var, params):
""" Produces Graphviz representation of PyTorch autograd graph
"""
try:
param_map = {id(v): k for k, v in params.items()}
print(param_map)
node_attr = dict(style='filled',
shape='box',
align='left',
fontsize='12',
ranksep='0.1',
height='0.2')
dot = Digraph(node_attr=node_attr, graph_attr=dict(size="12,12"))
seen = set()
def size_to_str(size):
return '('+(', ').join(['%d'% v for v in size])+')'
def add_nodes(var):
if var not in seen:
if torch.is_tensor(var):
dot.node(str(id(var)), size_to_str(var.size()), fillcolor='orange')
elif hasattr(var, 'variable'):
u = var.variable
node_name = '%s\n %s' % (param_map.get(id(u)), size_to_str(u.size()))
dot.node(str(id(var)), node_name, fillcolor='lightblue')
else:
dot.node(str(id(var)), str(type(var).__name__))
seen.add(var)
if hasattr(var, 'next_functions'):
for u in var.next_functions:
if u[0] is not None:
dot.edge(str(id(u[0])), str(id(var)))
add_nodes(u[0])
if hasattr(var, 'saved_tensors'):
for t in var.saved_tensors:
dot.edge(str(id(t)), str(id(var)))
add_nodes(t)
add_nodes(var.grad_fn)
return dot
except Exception as e:
print('viz_nn threw error: {}'.format( e ) )
$ \textbf{ Prepare Auxiliary Functions - Saved Model Loaders, Training Harness } $
def load_model( model, optimizer, dir_path='./checkpoint', model_name='DeepResNet_parent' ):
'''loads previously saved model into memory'''
try:
model_path = dir_path + '/' + model_name
if torch.cuda.is_available():
model_path += '_GPU.pth'
checkpoint = torch.load(model_path)
else:
model_path += '.pth'
checkpoint = torch.load(model_path)
#
epoch = checkpoint['epoch']
optimizer_dict = checkpoint['optimizer']
optimizer.load_state_dict(optimizer_dict)
state_dict = checkpoint['state_dict']
model.load_state_dict(state_dict, strict=False)
print('learning rate: {}'.format(optimizer.param_groups[0]['lr']))
print('initial epoch: {}'.format(epoch))
return model, optimizer, epoch
except Exception as e:
print('[load_model] Exception: {}'.format(e))
return 0
#
#if you pass just parent_model, then training_harness will train the parent model
#if you pass parent_model AND child_model, then training_harness assumes the parent model is trained
#and we need to train the child model FROM the parent model
#
def training_harness( dataloader, optimizer, loss_func, parent_model, child_model = None, epochs = 51, init_epoch = 0,
model_name = 'DeepResNet', save_every_n_epochs = 10 ):
'''
if training child from parent via distillation, loss_func needs to have signature
(child_output, labels, parent_output)
'''
#
N_EPOCHS = epochs
train_child_model_from_parent = False
if child_model is not None:
model_name += '_child'
train_child_model_from_parent = True
else:
model_name += '_parent'
#
#
for epoch in range( init_epoch, N_EPOCHS ):
for i, (images, labels) in enumerate(dataloader):
#
if torch.cuda.is_available():
images, labels = images.cuda(), labels.cuda()
#
images = Variable(images)
labels = Variable(labels)
#
if train_child_model_from_parent:
parent_probability_dist = parent_model( images )
parent_probability_dist = Variable( parent_probability_dist, requires_grad=False )
child_probability_dist = child_model( images )
loss = loss_func( child_probability_dist, labels, parent_probability_dist )
else:
parent_probability_dist = parent_model( images )
loss = loss_func( parent_probability_dist, labels )
#
optimizer.zero_grad()
#
loss.backward()
#
optimizer.step()
#
if (i+1) % 10 == 0:
print ("Epoch [%d/%d], Iter [%d/%d] Loss: %.4f" %(epoch+1, N_EPOCHS, i+1, len(trainloader), loss.data[0]))
#save the model results every save_every_n_epochs epochs
if (epoch + 1) % int(save_every_n_epochs) == 0:
#Save Model
print('\n[Saving Checkpoint]')
state = {
'epoch' : epoch + 1,
'optimizer' : optimizer.state_dict()
}
if train_child_model_from_parent:
state['state_dict'] = child_model.state_dict()
else:
state['state_dict'] = parent_model.state_dict()
if not os.path.isdir('checkpoint'):
os.mkdir('checkpoint')
if torch.cuda.is_available():
path = './checkpoint' + '/' + model_name + '_GPU.pth'
torch.save(state, path)
else:
path = './checkpoint' + '/' + model_name + '.pth'
torch.save(state, path)
def test_harness( dataloader, model ):
correct = 0
total = 0
for images, labels in dataloader:
if torch.cuda.is_available():
images, labels = images.cuda(), labels.cuda()
images = Variable(images)
outputs = model(images)
_, predicted = torch.max(outputs.data, 1)
total += labels.size(0)
correct += (predicted == labels).sum()
print('Accuracy of the model on the test images: %d %%' % (100 * correct / total))
return correct, total
Residual CNN Architecture Overview
Reference:
**1. Deep Residual Learning for Image Recognition | Kaiming He et al. | https://arxiv.org/pdf/1512.03385.pdf**
**2. Convolutional Neural Networks (CNNs / ConvNets) | Stanford CS231N | http://cs231n.github.io/convolutional-networks/**
PLEASE NOTE: IMAGES ARE TAKEN FROM REFERENCES (1) AND (2) ABOVE
Convolutional neural networks are deep neural networks designed to work with image data (though they can be used for other tasks too - e.g. natural language processing). When working with images, the inputs to a CNN are pixel data, structured into matrices if the images are in black and white, or into tensors (multidimensional arrays with a third dimension representing the red/green/blue channels) if the images are in colour. Note that a CIFAR-10 image, 32x32 in size, is input as a tensor variable of size [3,32,32], i.e. [$Channels_{in}$, $Width$, $Height$].
The distinguishing feature of convolutional neural networks is the convolutional kernel. These are small matrices (our model uses a standard square 3x3 kernel) which “slide” across the input tensors and learn to ‘pick out’ distinguishing features such as edges (a good visual demonstration, in an image processing context, can be found at http://setosa.io/ev/image-kernels/). This approach has an important property: the patterns the kernel learns are translation invariant. Whereas a traditional neural network would have to re-learn a pattern if it appeared at a different location, if a CNN kernel learns a pattern in one part of an image, it will recognise the same pattern everywhere else in the image.
Below is an illustration from Stanford’s CS231N course on how the convolution kernel works.
from IPython.display import Image, display
display(Image(filename='convolution_operation.JPG', embed=True, width=500, height=500))
##REFERENCE: http://cs231n.github.io/convolutional-networks/
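To make the kernel dimensions concrete, the following short sketch (our own illustration, not from the references above) applies a single 3x3 convolution to a CIFAR-10-sized input and prints the output shape:
import torch
import torch.nn as nn
#
x = torch.randn(1, 3, 32, 32)   #one CIFAR-10-sized RGB image
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
print(conv(x).shape)            #torch.Size([1, 16, 32, 32])
#Each of the 16 output channels is produced by a 3x3x3 kernel (27 weights plus a bias)
#that slides over every spatial position of the input.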
Convolutional neural networks typically have many layers. This is because they learn patterns hierarchically: for example, the first convolutional layer may learn small patterns such as edges; the second layer may learn larger patterns such as an eye; and the third may learn an even larger pattern such as a face.
from IPython.display import Image, display
display(Image(filename='CNN cat.png', embed=True, width=300, height=300))
However, whereas a traditional neural network has a comparatively small number of fully connected layers (so that every input interacts with every output), convolutional neural networks typically have sparse connectivity between neurons, because meaningful patterns can be detected with small image kernels. Although sparse connectivity reduces the processing power required, the depth of a convolutional neural network still makes it computationally intensive - often, this means that computations need to be done on dedicated graphics processing units (GPUs), which are optimised for these tasks.
$\textbf{Multi-Layer Perceptrons (MLP) vs Convolutional Neural Networks}$
Previously in the course, we implemented a simple one-hidden-layer perceptron network and used it to classify images from the MNIST dataset. So why didn’t we use the same architecture on the CIFAR-10 dataset? Clearly, a one-hidden-layer network was not likely to cut it. CIFAR-10 images are significantly more complex, as visualized above, than the MNIST hand-written digit images. There are many more forms a “ship” or an “automobile” can take than a hand-written digit; thus, picking out the distinguishing features is a more complicated task.
So why not implement a fully connected, deep, multi-layer perceptron? We could - and with enough time and training data it should perform as well as or better than the convolutional neural network approach we used. However, with fully connected layers there would be 3x32x32 = 3072 connections (weights) per neuron in the next layer. This approach simply does not scale well with deeper and wider networks or with larger input images. Thus, we decided to implement the gold standard for image recognition: the convolutional neural network!
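A back-of-the-envelope comparison of the weight counts involved (illustrative numbers of our own; the 1024-unit hidden layer is hypothetical):
#Fully connected: every neuron in the first hidden layer sees all 3*32*32 inputs.
print(3 * 32 * 32 * 1024)   #3,145,728 weights in the first layer alone
#Convolutional: a 3x3 kernel over 3 input channels, with 16 output channels.
print(3 * 3 * 3 * 16)       #432 weights, shared across all spatial positions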
$\textbf{ Added Residual Modification to the Convolutional Neural Network}$
We implemented a modified convolutional neural network. Specifically, we used the “residual” modification introduced by Kaiming He et al. (2015). Prior to the work by Kaiming He et al. at Microsoft Research, deep neural networks used for image recognition suffered from performance degradation. The question then becomes - why should a deeper neural network perform worse than a shallower one? Shouldn’t the deeper network perform, in the worst case, no worse than the shallower one? (If the deeper layers simply pass the prior layers’ data downward unchanged - an identity mapping - then the deeper network should perform only as “badly” as the shallower one.) Yet this was not the case, as illustrated below!
**Below Fig 1 Reference: Deep Residual Learning for Image Recognition | Kaiming He et al. | https://arxiv.org/pdf/1512.03385.pdf**
from IPython.display import Image, display
display(Image(filename='deep_res_training_error.JPG', embed=True, width=500, height=500))
#Reference: Deep Residual Learning for Image Recognition | Kaiming He et al. | https://arxiv.org/pdf/1512.03385.pdf**
The issue, as it happens, lies in the “degradation” problem, although it is not obvious what exactly happens as performance degrades in deep neural networks. As the authors put it, “Unexpectedly, such degradation is not caused by overfitting, and adding more layers to a suitably deep model leads to higher training error.” One plausible explanation is that adding additional layers leads to optimization issues - the layers become $\it{harder}$ to train.
The authors of the paper address the issue by introducing $\it{residuals}$ - the identity of the input $\textbf{X}$. As shown in Fig. 2 below, the input is added back, via an identity shortcut, to the output of the trainable layers within a residual block. As the authors hypothesize, “To the extreme, if an identity mapping were optimal, it would be easier to push the residual to zero than to fit an identity mapping by a stack of nonlinear layers.”
from IPython.display import Image, display
display(Image(filename='residual_building_block.JPG', embed=True, width=300, height=300))
#Reference: Deep Residual Learning for Image Recognition | Kaiming He et al. | https://arxiv.org/pdf/1512.03385.pdf**
Mathematically, whilst a layer in a traditional neural network is trained to calculate a function F(x), the residual layers of a ResNet are trained to calculate the function F(x) + x. This means that, in theory, increasing depth should not harm a residual neural network, because (in the worst case) the layers can revert to being identity layers.
This architecture in fact proves to be $\textbf{very effective}$ and mitigates the “degradation” problem described above even for networks more than 100 layers deep. Please see Fig. 3 below, reproduced from the paper.
**Below Fig 3 Reference: Deep Residual Learning for Image Recognition | Kaiming He et al. | https://arxiv.org/pdf/1512.03385.pdf**
from IPython.display import Image, display
display(Image(filename='deep_res_training_error_with_residuals.JPG', embed=True, width=400, height=400))
#Reference: Deep Residual Learning for Image Recognition | Kaiming He et al. | https://arxiv.org/pdf/1512.03385.pdf**
Wide ResNet Parent (Teacher) Model Implementation
#Parent ResNet Implementation Adopted from PyTorch Tutorial
#https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/02-intermediate/deep_residual_network/main.py
#3x3 Convolution
def conv3x3(in_channels, out_channels, stride=1):
return nn.Conv2d(in_channels, out_channels, kernel_size=3,
stride=stride, padding=1, bias=False)
# Residual Block
class ResidualBlock(nn.Module):
def __init__(self, in_channels, out_channels, stride=1, downsample=None):
super(ResidualBlock, self).__init__()
#
self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
torch.nn.init.xavier_uniform(self.conv1.weight)
self.bn1 = nn.BatchNorm2d(out_channels)
self.relu = nn.ReLU(inplace=True)
#
self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False)
torch.nn.init.xavier_uniform(self.conv2.weight)
self.bn2 = nn.BatchNorm2d(out_channels)
#
self.downsample = downsample
def forward(self, x):
#
residual = x
#
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
#
out = F.dropout(out, p=0.25)
#
out = self.conv2(out)
out = self.bn2(out)
#
if self.downsample:
residual = self.downsample(x)
out += residual
#
out = self.relu(out)
#
return out
# ResNet Module
class WideResNet(nn.Module):
def __init__(self, block, layers, num_classes=10):
super(WideResNet, self).__init__()
#
self.in_channels = 16
self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1, bias=False)
torch.nn.init.xavier_uniform(self.conv1.weight)
self.bn = nn.BatchNorm2d(16)
self.relu = nn.ReLU(inplace=True)
#
self.layer1 = self.make_layer(block, 160, layers[0] )
self.layer2 = self.make_layer(block, 320, layers[1], 2 )
self.layer3 = self.make_layer(block, 640, layers[2], 2 )
#
self.avg_pool = nn.AvgPool2d(8)
self.fc = nn.Linear(640, num_classes)
torch.nn.init.xavier_uniform(self.fc.weight)
def weights_init(self, m):
classname = m.__class__.__name__
if classname.find('Conv') != -1:
torch.nn.init.xavier_uniform( m.weight )
def make_layer(self, block, out_channels, blocks, stride=1):
downsample = None
if (stride != 1) or (self.in_channels != out_channels):
downsample = nn.Sequential(
nn.Conv2d(self.in_channels, out_channels, kernel_size=3, padding=1, stride=stride, bias=False),
nn.BatchNorm2d(out_channels))
downsample.apply(self.weights_init)
layers = []
layers.append(block(self.in_channels, out_channels, stride, downsample))
self.in_channels = out_channels
for i in range(1, blocks):
layers.append(block(self.in_channels, out_channels))
return nn.Sequential(*layers)
def forward(self, x):
#
out = self.conv1(x)
out = self.bn(out)
out = self.relu(out)
#
out = self.layer1(out)
out = self.layer2(out)
out = self.layer3(out)
#
out = self.avg_pool(out)
out = out.view(out.size(0), -1)
out = self.fc(out)
return out
$ \textbf{ Graph Visualization of the Wide Residual Parent Network } $
#Uncomment to run visualizer (Graphviz)
#m = WideResNet(ResidualBlock, [4,4,4])
#y = m(Variable(images_tmp))
#g = viz_nn(y, m.state_dict())
#IFrame(g.view(), width=1000, height=1000, embed=True)
display(Image(filename='res_convnet_parent_graph.JPG', embed=True))
resnet_parent = WideResNet(ResidualBlock, [4,4,4])
learning_rate = 0.01
epoch = 0
optimizer_parent = torch.optim.SGD(resnet_parent.parameters(), lr=learning_rate, momentum=0.9, weight_decay=5e-4)
#GPU Acceleration
if torch.cuda.is_available():
gpu_id = torch.cuda.get_device_name(0)
print('ENABLING GPU ACCELERATION || {}'.format(gpu_id))
resnet_parent = torch.nn.DataParallel(resnet_parent, device_ids=range(torch.cuda.device_count()))
resnet_parent.cuda()
cudnn.benchmark = True
#Load Previous Model?
load_previous_model = False
if load_previous_model:
resnet_parent, optimizer_parent, epoch = load_model( resnet_parent, optimizer_parent )
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:54: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:69: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:15: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
from ipykernel import kernelapp as app
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:20: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:64: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
#let's train the parent model
#func signature for training_harness:
#dataloader, optimizer, loss_func, parent_model, child_model = None, epochs = 51, init_epoch = 0,
#model_name = 'DeepResNet', save_every_n_epochs = 10 ):
loss_func = nn.CrossEntropyLoss()
training_harness( trainloader, optimizer_parent, loss_func, resnet_parent )
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:46: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
Epoch [1/51], Iter [10/391] Loss: 2.0253
Epoch [1/51], Iter [20/391] Loss: 1.9886
Epoch [1/51], Iter [30/391] Loss: 2.0261
Epoch [1/51], Iter [40/391] Loss: 1.8313
Epoch [1/51], Iter [50/391] Loss: 1.7924
Epoch [1/51], Iter [60/391] Loss: 1.6717
Epoch [1/51], Iter [70/391] Loss: 1.6798
Epoch [1/51], Iter [80/391] Loss: 1.7018
Epoch [1/51], Iter [90/391] Loss: 1.5539
Epoch [1/51], Iter [100/391] Loss: 1.5560
Epoch [1/51], Iter [110/391] Loss: 1.4972
Epoch [1/51], Iter [120/391] Loss: 1.4055
Epoch [1/51], Iter [130/391] Loss: 1.4232
Epoch [1/51], Iter [140/391] Loss: 1.4804
Epoch [1/51], Iter [150/391] Loss: 1.4402
Epoch [1/51], Iter [160/391] Loss: 1.4663
Epoch [1/51], Iter [170/391] Loss: 1.4200
Epoch [1/51], Iter [180/391] Loss: 1.3065
Epoch [1/51], Iter [190/391] Loss: 1.2539
Epoch [1/51], Iter [200/391] Loss: 1.3325
Epoch [1/51], Iter [210/391] Loss: 1.2613
Epoch [1/51], Iter [220/391] Loss: 1.2503
Epoch [1/51], Iter [230/391] Loss: 1.3215
Epoch [1/51], Iter [240/391] Loss: 1.1664
Epoch [1/51], Iter [250/391] Loss: 1.2119
Epoch [1/51], Iter [260/391] Loss: 1.0775
Epoch [1/51], Iter [270/391] Loss: 1.2370
Epoch [1/51], Iter [280/391] Loss: 1.3742
Epoch [1/51], Iter [290/391] Loss: 1.1783
Epoch [1/51], Iter [300/391] Loss: 1.0394
Epoch [1/51], Iter [310/391] Loss: 1.1139
Epoch [1/51], Iter [320/391] Loss: 1.1557
Epoch [1/51], Iter [330/391] Loss: 1.1958
Epoch [1/51], Iter [340/391] Loss: 0.9712
Epoch [1/51], Iter [350/391] Loss: 1.0658
Epoch [1/51], Iter [360/391] Loss: 1.0048
Epoch [1/51], Iter [370/391] Loss: 0.9204
Epoch [1/51], Iter [380/391] Loss: 1.2974
Epoch [1/51], Iter [390/391] Loss: 0.9679
Epoch [2/51], Iter [10/391] Loss: 0.8928
Epoch [2/51], Iter [20/391] Loss: 1.0614
Epoch [2/51], Iter [30/391] Loss: 1.0362
Epoch [2/51], Iter [40/391] Loss: 0.9566
Epoch [2/51], Iter [50/391] Loss: 0.9094
Epoch [2/51], Iter [60/391] Loss: 0.9292
Epoch [2/51], Iter [70/391] Loss: 0.9766
Epoch [2/51], Iter [80/391] Loss: 0.8804
Epoch [2/51], Iter [90/391] Loss: 0.9196
Epoch [2/51], Iter [100/391] Loss: 0.8430
Epoch [2/51], Iter [110/391] Loss: 0.7953
Epoch [2/51], Iter [120/391] Loss: 1.0470
Epoch [2/51], Iter [130/391] Loss: 0.8837
Epoch [2/51], Iter [140/391] Loss: 0.9246
Epoch [2/51], Iter [150/391] Loss: 0.8357
Epoch [2/51], Iter [160/391] Loss: 0.9535
Epoch [2/51], Iter [170/391] Loss: 0.7751
Epoch [2/51], Iter [180/391] Loss: 0.9818
Epoch [2/51], Iter [190/391] Loss: 0.8772
Epoch [2/51], Iter [200/391] Loss: 0.8569
Epoch [2/51], Iter [210/391] Loss: 0.7671
Epoch [2/51], Iter [220/391] Loss: 0.7005
Epoch [2/51], Iter [230/391] Loss: 0.7309
Epoch [2/51], Iter [240/391] Loss: 0.6750
Epoch [2/51], Iter [250/391] Loss: 0.6941
Epoch [2/51], Iter [260/391] Loss: 0.9540
Epoch [2/51], Iter [270/391] Loss: 0.8348
Epoch [2/51], Iter [280/391] Loss: 0.9087
Epoch [2/51], Iter [290/391] Loss: 0.7309
Epoch [2/51], Iter [300/391] Loss: 0.6678
Epoch [2/51], Iter [310/391] Loss: 0.9465
Epoch [2/51], Iter [320/391] Loss: 0.9523
Epoch [2/51], Iter [330/391] Loss: 0.9161
Epoch [2/51], Iter [340/391] Loss: 0.7952
Epoch [2/51], Iter [350/391] Loss: 0.6022
Epoch [2/51], Iter [360/391] Loss: 0.7817
Epoch [2/51], Iter [370/391] Loss: 0.7931
Epoch [2/51], Iter [380/391] Loss: 0.8099
Epoch [2/51], Iter [390/391] Loss: 0.8243
Epoch [3/51], Iter [10/391] Loss: 0.7902
Epoch [3/51], Iter [20/391] Loss: 0.6759
Epoch [3/51], Iter [30/391] Loss: 0.6360
Epoch [3/51], Iter [40/391] Loss: 0.7320
Epoch [3/51], Iter [50/391] Loss: 0.6285
Epoch [3/51], Iter [60/391] Loss: 0.6488
Epoch [3/51], Iter [70/391] Loss: 0.6378
Epoch [3/51], Iter [80/391] Loss: 0.5531
Epoch [3/51], Iter [90/391] Loss: 0.7009
Epoch [3/51], Iter [100/391] Loss: 0.5981
Epoch [3/51], Iter [110/391] Loss: 0.6663
Epoch [3/51], Iter [120/391] Loss: 0.9001
Epoch [3/51], Iter [130/391] Loss: 0.5232
Epoch [3/51], Iter [140/391] Loss: 0.5854
Epoch [3/51], Iter [150/391] Loss: 0.7308
Epoch [3/51], Iter [160/391] Loss: 0.7944
Epoch [3/51], Iter [170/391] Loss: 0.6368
Epoch [3/51], Iter [180/391] Loss: 0.6183
Epoch [3/51], Iter [190/391] Loss: 0.5542
Epoch [3/51], Iter [200/391] Loss: 0.6507
Epoch [3/51], Iter [210/391] Loss: 0.6460
Epoch [3/51], Iter [220/391] Loss: 0.6555
Epoch [3/51], Iter [230/391] Loss: 0.5031
Epoch [3/51], Iter [240/391] Loss: 0.4790
Epoch [3/51], Iter [250/391] Loss: 0.5314
Epoch [3/51], Iter [260/391] Loss: 0.8747
Epoch [3/51], Iter [270/391] Loss: 0.7780
Epoch [3/51], Iter [280/391] Loss: 0.6667
Epoch [3/51], Iter [290/391] Loss: 0.6339
Epoch [3/51], Iter [300/391] Loss: 0.5372
Epoch [3/51], Iter [310/391] Loss: 0.6202
Epoch [3/51], Iter [320/391] Loss: 0.6630
Epoch [3/51], Iter [330/391] Loss: 0.6755
Epoch [3/51], Iter [340/391] Loss: 0.4813
Epoch [3/51], Iter [350/391] Loss: 0.5644
Epoch [3/51], Iter [360/391] Loss: 0.6194
Epoch [3/51], Iter [370/391] Loss: 0.5740
Epoch [3/51], Iter [380/391] Loss: 0.5463
Epoch [3/51], Iter [390/391] Loss: 0.5272
Epoch [4/51], Iter [10/391] Loss: 0.4478
Epoch [4/51], Iter [20/391] Loss: 0.5927
Epoch [4/51], Iter [30/391] Loss: 0.5346
Epoch [4/51], Iter [40/391] Loss: 0.5817
Epoch [4/51], Iter [50/391] Loss: 0.5204
Epoch [4/51], Iter [60/391] Loss: 0.5620
Epoch [4/51], Iter [70/391] Loss: 0.5390
Epoch [4/51], Iter [80/391] Loss: 0.5167
Epoch [4/51], Iter [90/391] Loss: 0.5492
Epoch [4/51], Iter [100/391] Loss: 0.5951
Epoch [4/51], Iter [110/391] Loss: 0.3805
Epoch [4/51], Iter [120/391] Loss: 0.8020
Epoch [4/51], Iter [130/391] Loss: 0.4213
Epoch [4/51], Iter [140/391] Loss: 0.5546
Epoch [4/51], Iter [150/391] Loss: 0.6417
Epoch [4/51], Iter [160/391] Loss: 0.5448
Epoch [4/51], Iter [170/391] Loss: 0.4980
Epoch [4/51], Iter [180/391] Loss: 0.5574
Epoch [4/51], Iter [190/391] Loss: 0.4242
Epoch [4/51], Iter [200/391] Loss: 0.6783
Epoch [4/51], Iter [210/391] Loss: 0.5495
Epoch [4/51], Iter [220/391] Loss: 0.7103
Epoch [4/51], Iter [230/391] Loss: 0.3709
Epoch [4/51], Iter [240/391] Loss: 0.7642
Epoch [4/51], Iter [250/391] Loss: 0.6522
Epoch [4/51], Iter [260/391] Loss: 0.6016
Epoch [4/51], Iter [270/391] Loss: 0.6154
Epoch [4/51], Iter [280/391] Loss: 0.6665
Epoch [4/51], Iter [290/391] Loss: 0.5109
Epoch [4/51], Iter [300/391] Loss: 0.5098
Epoch [4/51], Iter [310/391] Loss: 0.6103
Epoch [4/51], Iter [320/391] Loss: 0.4568
Epoch [4/51], Iter [330/391] Loss: 0.5321
Epoch [4/51], Iter [340/391] Loss: 0.4309
Epoch [4/51], Iter [350/391] Loss: 0.5022
Epoch [4/51], Iter [360/391] Loss: 0.5260
Epoch [4/51], Iter [370/391] Loss: 0.5107
Epoch [4/51], Iter [380/391] Loss: 0.3819
Epoch [4/51], Iter [390/391] Loss: 0.5066
Epoch [5/51], Iter [10/391] Loss: 0.5101
Epoch [5/51], Iter [20/391] Loss: 0.5596
Epoch [5/51], Iter [30/391] Loss: 0.4691
Epoch [5/51], Iter [40/391] Loss: 0.4391
Epoch [5/51], Iter [50/391] Loss: 0.4803
Epoch [5/51], Iter [60/391] Loss: 0.4539
Epoch [5/51], Iter [70/391] Loss: 0.5865
Epoch [5/51], Iter [80/391] Loss: 0.4088
Epoch [5/51], Iter [90/391] Loss: 0.3907
Epoch [5/51], Iter [100/391] Loss: 0.4782
Epoch [5/51], Iter [110/391] Loss: 0.4904
Epoch [5/51], Iter [120/391] Loss: 0.4109
Epoch [5/51], Iter [130/391] Loss: 0.3567
Epoch [5/51], Iter [140/391] Loss: 0.4787
Epoch [5/51], Iter [150/391] Loss: 0.4190
Epoch [5/51], Iter [160/391] Loss: 0.4920
Epoch [5/51], Iter [170/391] Loss: 0.4813
Epoch [5/51], Iter [180/391] Loss: 0.5149
Epoch [5/51], Iter [190/391] Loss: 0.5529
Epoch [5/51], Iter [200/391] Loss: 0.4072
Epoch [5/51], Iter [210/391] Loss: 0.3453
Epoch [5/51], Iter [220/391] Loss: 0.4094
Epoch [5/51], Iter [230/391] Loss: 0.5059
Epoch [5/51], Iter [240/391] Loss: 0.5581
Epoch [5/51], Iter [250/391] Loss: 0.4755
Epoch [5/51], Iter [260/391] Loss: 0.5502
Epoch [5/51], Iter [270/391] Loss: 0.3544
Epoch [5/51], Iter [280/391] Loss: 0.4040
Epoch [5/51], Iter [290/391] Loss: 0.5260
Epoch [5/51], Iter [300/391] Loss: 0.3757
Epoch [5/51], Iter [310/391] Loss: 0.5549
Epoch [5/51], Iter [320/391] Loss: 0.4223
Epoch [5/51], Iter [330/391] Loss: 0.3622
Epoch [5/51], Iter [340/391] Loss: 0.4769
Epoch [5/51], Iter [350/391] Loss: 0.4163
Epoch [5/51], Iter [360/391] Loss: 0.3149
Epoch [5/51], Iter [370/391] Loss: 0.4451
Epoch [5/51], Iter [380/391] Loss: 0.5428
Epoch [5/51], Iter [390/391] Loss: 0.3623
Epoch [6/51], Iter [10/391] Loss: 0.3673
Epoch [6/51], Iter [20/391] Loss: 0.3525
Epoch [6/51], Iter [30/391] Loss: 0.4825
Epoch [6/51], Iter [40/391] Loss: 0.4908
Epoch [6/51], Iter [50/391] Loss: 0.4333
Epoch [6/51], Iter [60/391] Loss: 0.3765
Epoch [6/51], Iter [70/391] Loss: 0.3985
Epoch [6/51], Iter [80/391] Loss: 0.2935
Epoch [6/51], Iter [90/391] Loss: 0.3902
Epoch [6/51], Iter [100/391] Loss: 0.4368
Epoch [6/51], Iter [110/391] Loss: 0.3292
Epoch [6/51], Iter [120/391] Loss: 0.4512
Epoch [6/51], Iter [130/391] Loss: 0.4048
Epoch [6/51], Iter [140/391] Loss: 0.5215
Epoch [6/51], Iter [150/391] Loss: 0.3648
Epoch [6/51], Iter [160/391] Loss: 0.3047
Epoch [6/51], Iter [170/391] Loss: 0.3335
Epoch [6/51], Iter [180/391] Loss: 0.5230
Epoch [6/51], Iter [190/391] Loss: 0.3384
Epoch [6/51], Iter [200/391] Loss: 0.4199
Epoch [6/51], Iter [210/391] Loss: 0.4263
Epoch [6/51], Iter [220/391] Loss: 0.3901
Epoch [6/51], Iter [230/391] Loss: 0.4686
Epoch [6/51], Iter [240/391] Loss: 0.4242
Epoch [6/51], Iter [250/391] Loss: 0.3506
Epoch [6/51], Iter [260/391] Loss: 0.3807
Epoch [6/51], Iter [270/391] Loss: 0.2305
Epoch [6/51], Iter [280/391] Loss: 0.3754
Epoch [6/51], Iter [290/391] Loss: 0.3566
Epoch [6/51], Iter [300/391] Loss: 0.3269
Epoch [6/51], Iter [310/391] Loss: 0.3828
Epoch [6/51], Iter [320/391] Loss: 0.5872
Epoch [6/51], Iter [330/391] Loss: 0.3726
Epoch [6/51], Iter [340/391] Loss: 0.4469
Epoch [6/51], Iter [350/391] Loss: 0.4436
Epoch [6/51], Iter [360/391] Loss: 0.5045
Epoch [6/51], Iter [370/391] Loss: 0.4462
Epoch [6/51], Iter [380/391] Loss: 0.2581
Epoch [6/51], Iter [390/391] Loss: 0.2772
Epoch [7/51], Iter [10/391] Loss: 0.3774
Epoch [7/51], Iter [20/391] Loss: 0.4721
Epoch [7/51], Iter [30/391] Loss: 0.3648
Epoch [7/51], Iter [40/391] Loss: 0.3712
Epoch [7/51], Iter [50/391] Loss: 0.3209
Epoch [7/51], Iter [60/391] Loss: 0.4442
Epoch [7/51], Iter [70/391] Loss: 0.4480
Epoch [7/51], Iter [80/391] Loss: 0.3401
Epoch [7/51], Iter [90/391] Loss: 0.3707
Epoch [7/51], Iter [100/391] Loss: 0.3496
Epoch [7/51], Iter [110/391] Loss: 0.4482
Epoch [7/51], Iter [120/391] Loss: 0.1894
Epoch [7/51], Iter [130/391] Loss: 0.4873
Epoch [7/51], Iter [140/391] Loss: 0.4583
Epoch [7/51], Iter [150/391] Loss: 0.5869
Epoch [7/51], Iter [160/391] Loss: 0.3126
Epoch [7/51], Iter [170/391] Loss: 0.4160
Epoch [7/51], Iter [180/391] Loss: 0.2832
Epoch [7/51], Iter [190/391] Loss: 0.3873
Epoch [7/51], Iter [200/391] Loss: 0.2915
Epoch [7/51], Iter [210/391] Loss: 0.3970
Epoch [7/51], Iter [220/391] Loss: 0.5006
Epoch [7/51], Iter [230/391] Loss: 0.5323
Epoch [7/51], Iter [240/391] Loss: 0.2921
Epoch [7/51], Iter [250/391] Loss: 0.4082
Epoch [7/51], Iter [260/391] Loss: 0.3651
Epoch [7/51], Iter [270/391] Loss: 0.3553
Epoch [7/51], Iter [280/391] Loss: 0.4040
Epoch [7/51], Iter [290/391] Loss: 0.3693
Epoch [7/51], Iter [300/391] Loss: 0.3770
Epoch [7/51], Iter [310/391] Loss: 0.3637
Epoch [7/51], Iter [320/391] Loss: 0.4709
Epoch [7/51], Iter [330/391] Loss: 0.3719
Epoch [7/51], Iter [340/391] Loss: 0.3212
Epoch [7/51], Iter [350/391] Loss: 0.3025
Epoch [7/51], Iter [360/391] Loss: 0.4560
Epoch [7/51], Iter [370/391] Loss: 0.3663
Epoch [7/51], Iter [380/391] Loss: 0.2460
Epoch [7/51], Iter [390/391] Loss: 0.3333
Epoch [8/51], Iter [10/391] Loss: 0.2876
Epoch [8/51], Iter [20/391] Loss: 0.3259
Epoch [8/51], Iter [30/391] Loss: 0.3265
Epoch [8/51], Iter [40/391] Loss: 0.4530
Epoch [8/51], Iter [50/391] Loss: 0.2807
Epoch [8/51], Iter [60/391] Loss: 0.3829
Epoch [8/51], Iter [70/391] Loss: 0.3323
Epoch [8/51], Iter [80/391] Loss: 0.2945
Epoch [8/51], Iter [90/391] Loss: 0.3715
Epoch [8/51], Iter [100/391] Loss: 0.3725
Epoch [8/51], Iter [110/391] Loss: 0.2919
Epoch [8/51], Iter [120/391] Loss: 0.3556
Epoch [8/51], Iter [130/391] Loss: 0.3101
Epoch [8/51], Iter [140/391] Loss: 0.3927
Epoch [8/51], Iter [150/391] Loss: 0.3106
Epoch [8/51], Iter [160/391] Loss: 0.3291
Epoch [8/51], Iter [170/391] Loss: 0.1854
Epoch [8/51], Iter [180/391] Loss: 0.3121
Epoch [8/51], Iter [190/391] Loss: 0.2865
Epoch [8/51], Iter [200/391] Loss: 0.3123
Epoch [8/51], Iter [210/391] Loss: 0.2070
Epoch [8/51], Iter [220/391] Loss: 0.4619
Epoch [8/51], Iter [230/391] Loss: 0.3027
Epoch [8/51], Iter [240/391] Loss: 0.4155
Epoch [8/51], Iter [250/391] Loss: 0.3539
Epoch [8/51], Iter [260/391] Loss: 0.3598
Epoch [8/51], Iter [270/391] Loss: 0.3184
Epoch [8/51], Iter [280/391] Loss: 0.3070
Epoch [8/51], Iter [290/391] Loss: 0.3438
Epoch [8/51], Iter [300/391] Loss: 0.3367
Epoch [8/51], Iter [310/391] Loss: 0.2633
Epoch [8/51], Iter [320/391] Loss: 0.4018
Epoch [8/51], Iter [330/391] Loss: 0.3813
Epoch [8/51], Iter [340/391] Loss: 0.2128
Epoch [8/51], Iter [350/391] Loss: 0.3391
Epoch [8/51], Iter [360/391] Loss: 0.3288
Epoch [8/51], Iter [370/391] Loss: 0.2326
Epoch [8/51], Iter [380/391] Loss: 0.3481
Epoch [8/51], Iter [390/391] Loss: 0.4156
Epoch [9/51], Iter [10/391] Loss: 0.3558
Epoch [9/51], Iter [20/391] Loss: 0.2532
Epoch [9/51], Iter [30/391] Loss: 0.2974
Epoch [9/51], Iter [40/391] Loss: 0.2597
Epoch [9/51], Iter [50/391] Loss: 0.2166
Epoch [9/51], Iter [60/391] Loss: 0.3292
Epoch [9/51], Iter [70/391] Loss: 0.3572
Epoch [9/51], Iter [80/391] Loss: 0.2863
Epoch [9/51], Iter [90/391] Loss: 0.2979
Epoch [9/51], Iter [100/391] Loss: 0.2657
Epoch [9/51], Iter [110/391] Loss: 0.2937
Epoch [9/51], Iter [120/391] Loss: 0.3881
Epoch [9/51], Iter [130/391] Loss: 0.3357
Epoch [9/51], Iter [140/391] Loss: 0.3044
Epoch [9/51], Iter [150/391] Loss: 0.3412
Epoch [9/51], Iter [160/391] Loss: 0.2759
Epoch [9/51], Iter [170/391] Loss: 0.2440
Epoch [9/51], Iter [180/391] Loss: 0.3071
Epoch [9/51], Iter [190/391] Loss: 0.3265
Epoch [9/51], Iter [200/391] Loss: 0.3516
Epoch [9/51], Iter [210/391] Loss: 0.2661
Epoch [9/51], Iter [220/391] Loss: 0.3152
Epoch [9/51], Iter [230/391] Loss: 0.3075
Epoch [9/51], Iter [240/391] Loss: 0.3170
Epoch [9/51], Iter [250/391] Loss: 0.3460
Epoch [9/51], Iter [260/391] Loss: 0.4506
Epoch [9/51], Iter [270/391] Loss: 0.3680
Epoch [9/51], Iter [280/391] Loss: 0.2660
Epoch [9/51], Iter [290/391] Loss: 0.3075
Epoch [9/51], Iter [300/391] Loss: 0.3469
Epoch [9/51], Iter [310/391] Loss: 0.2495
Epoch [9/51], Iter [320/391] Loss: 0.3165
Epoch [9/51], Iter [330/391] Loss: 0.2596
Epoch [9/51], Iter [340/391] Loss: 0.2839
Epoch [9/51], Iter [350/391] Loss: 0.2556
Epoch [9/51], Iter [360/391] Loss: 0.2973
Epoch [9/51], Iter [370/391] Loss: 0.2658
Epoch [9/51], Iter [380/391] Loss: 0.2741
Epoch [9/51], Iter [390/391] Loss: 0.2652
Epoch [10/51], Iter [10/391] Loss: 0.3302
Epoch [10/51], Iter [20/391] Loss: 0.2052
Epoch [10/51], Iter [30/391] Loss: 0.1528
Epoch [10/51], Iter [40/391] Loss: 0.1540
Epoch [10/51], Iter [50/391] Loss: 0.1351
Epoch [10/51], Iter [60/391] Loss: 0.3083
Epoch [10/51], Iter [70/391] Loss: 0.3251
Epoch [10/51], Iter [80/391] Loss: 0.3196
Epoch [10/51], Iter [90/391] Loss: 0.2895
Epoch [10/51], Iter [100/391] Loss: 0.1794
Epoch [10/51], Iter [110/391] Loss: 0.2217
Epoch [10/51], Iter [120/391] Loss: 0.2537
Epoch [10/51], Iter [130/391] Loss: 0.2621
Epoch [10/51], Iter [140/391] Loss: 0.3361
Epoch [10/51], Iter [150/391] Loss: 0.2499
Epoch [10/51], Iter [160/391] Loss: 0.2623
Epoch [10/51], Iter [170/391] Loss: 0.3462
Epoch [10/51], Iter [180/391] Loss: 0.3614
Epoch [10/51], Iter [190/391] Loss: 0.3087
Epoch [10/51], Iter [200/391] Loss: 0.2595
Epoch [10/51], Iter [210/391] Loss: 0.2701
Epoch [10/51], Iter [220/391] Loss: 0.2849
Epoch [10/51], Iter [230/391] Loss: 0.3225
Epoch [10/51], Iter [240/391] Loss: 0.2371
Epoch [10/51], Iter [250/391] Loss: 0.2836
Epoch [10/51], Iter [260/391] Loss: 0.3209
Epoch [10/51], Iter [270/391] Loss: 0.3023
Epoch [10/51], Iter [280/391] Loss: 0.2925
Epoch [10/51], Iter [290/391] Loss: 0.1608
Epoch [10/51], Iter [300/391] Loss: 0.2910
Epoch [10/51], Iter [310/391] Loss: 0.2224
Epoch [10/51], Iter [320/391] Loss: 0.2824
Epoch [10/51], Iter [330/391] Loss: 0.3145
Epoch [10/51], Iter [340/391] Loss: 0.3086
Epoch [10/51], Iter [350/391] Loss: 0.2783
Epoch [10/51], Iter [360/391] Loss: 0.3334
Epoch [10/51], Iter [370/391] Loss: 0.2750
Epoch [10/51], Iter [380/391] Loss: 0.3528
Epoch [10/51], Iter [390/391] Loss: 0.1847
[Saving Checkpoint]
Epoch [11/51], Iter [10/391] Loss: 0.2828
Epoch [11/51], Iter [20/391] Loss: 0.1861
Epoch [11/51], Iter [30/391] Loss: 0.1897
Epoch [11/51], Iter [40/391] Loss: 0.2999
Epoch [11/51], Iter [50/391] Loss: 0.1270
Epoch [11/51], Iter [60/391] Loss: 0.2406
Epoch [11/51], Iter [70/391] Loss: 0.1170
Epoch [11/51], Iter [80/391] Loss: 0.3510
Epoch [11/51], Iter [90/391] Loss: 0.2706
Epoch [11/51], Iter [100/391] Loss: 0.2945
Epoch [11/51], Iter [110/391] Loss: 0.1949
Epoch [11/51], Iter [120/391] Loss: 0.1371
Epoch [11/51], Iter [130/391] Loss: 0.1880
Epoch [11/51], Iter [140/391] Loss: 0.2691
Epoch [11/51], Iter [150/391] Loss: 0.2978
Epoch [11/51], Iter [160/391] Loss: 0.2500
Epoch [11/51], Iter [170/391] Loss: 0.5706
Epoch [11/51], Iter [180/391] Loss: 0.2087
Epoch [11/51], Iter [190/391] Loss: 0.2218
Epoch [11/51], Iter [200/391] Loss: 0.2162
Epoch [11/51], Iter [210/391] Loss: 0.2666
Epoch [11/51], Iter [220/391] Loss: 0.1812
Epoch [11/51], Iter [230/391] Loss: 0.3489
Epoch [11/51], Iter [240/391] Loss: 0.2499
Epoch [11/51], Iter [250/391] Loss: 0.2212
Epoch [11/51], Iter [260/391] Loss: 0.2406
Epoch [11/51], Iter [270/391] Loss: 0.2228
Epoch [11/51], Iter [280/391] Loss: 0.1828
Epoch [11/51], Iter [290/391] Loss: 0.1864
Epoch [11/51], Iter [300/391] Loss: 0.2701
Epoch [11/51], Iter [310/391] Loss: 0.3366
Epoch [11/51], Iter [320/391] Loss: 0.2777
Epoch [11/51], Iter [330/391] Loss: 0.2421
Epoch [11/51], Iter [340/391] Loss: 0.3030
Epoch [11/51], Iter [350/391] Loss: 0.2268
Epoch [11/51], Iter [360/391] Loss: 0.3625
Epoch [11/51], Iter [370/391] Loss: 0.3297
Epoch [11/51], Iter [380/391] Loss: 0.1912
Epoch [11/51], Iter [390/391] Loss: 0.3824
Epoch [12/51], Iter [10/391] Loss: 0.1580
Epoch [12/51], Iter [20/391] Loss: 0.3155
Epoch [12/51], Iter [30/391] Loss: 0.1379
Epoch [12/51], Iter [40/391] Loss: 0.3351
Epoch [12/51], Iter [50/391] Loss: 0.2297
Epoch [12/51], Iter [60/391] Loss: 0.1389
Epoch [12/51], Iter [70/391] Loss: 0.1628
Epoch [12/51], Iter [80/391] Loss: 0.1874
Epoch [12/51], Iter [90/391] Loss: 0.1373
Epoch [12/51], Iter [100/391] Loss: 0.2857
Epoch [12/51], Iter [110/391] Loss: 0.1746
Epoch [12/51], Iter [120/391] Loss: 0.1713
Epoch [12/51], Iter [130/391] Loss: 0.2843
Epoch [12/51], Iter [140/391] Loss: 0.1432
Epoch [12/51], Iter [150/391] Loss: 0.3063
Epoch [12/51], Iter [160/391] Loss: 0.2891
Epoch [12/51], Iter [170/391] Loss: 0.3255
Epoch [12/51], Iter [180/391] Loss: 0.2674
Epoch [12/51], Iter [190/391] Loss: 0.2608
Epoch [12/51], Iter [200/391] Loss: 0.2035
Epoch [12/51], Iter [210/391] Loss: 0.1966
Epoch [12/51], Iter [220/391] Loss: 0.2331
Epoch [12/51], Iter [230/391] Loss: 0.2446
Epoch [12/51], Iter [240/391] Loss: 0.2859
Epoch [12/51], Iter [250/391] Loss: 0.2649
Epoch [12/51], Iter [260/391] Loss: 0.2463
Epoch [12/51], Iter [270/391] Loss: 0.2293
Epoch [12/51], Iter [280/391] Loss: 0.1782
Epoch [12/51], Iter [290/391] Loss: 0.1886
Epoch [12/51], Iter [300/391] Loss: 0.3310
Epoch [12/51], Iter [310/391] Loss: 0.2842
Epoch [12/51], Iter [320/391] Loss: 0.2562
Epoch [12/51], Iter [330/391] Loss: 0.3179
Epoch [12/51], Iter [340/391] Loss: 0.2692
Epoch [12/51], Iter [350/391] Loss: 0.3663
Epoch [12/51], Iter [360/391] Loss: 0.1761
Epoch [12/51], Iter [370/391] Loss: 0.2325
Epoch [12/51], Iter [380/391] Loss: 0.2892
Epoch [12/51], Iter [390/391] Loss: 0.2449
Epoch [13/51], Iter [10/391] Loss: 0.2556
Epoch [13/51], Iter [20/391] Loss: 0.1267
Epoch [13/51], Iter [30/391] Loss: 0.2005
Epoch [13/51], Iter [40/391] Loss: 0.2045
Epoch [13/51], Iter [50/391] Loss: 0.1626
Epoch [13/51], Iter [60/391] Loss: 0.1095
Epoch [13/51], Iter [70/391] Loss: 0.3359
Epoch [13/51], Iter [80/391] Loss: 0.2149
Epoch [13/51], Iter [90/391] Loss: 0.2359
Epoch [13/51], Iter [100/391] Loss: 0.1496
Epoch [13/51], Iter [110/391] Loss: 0.2013
Epoch [13/51], Iter [120/391] Loss: 0.2478
Epoch [13/51], Iter [130/391] Loss: 0.1788
Epoch [13/51], Iter [140/391] Loss: 0.1837
Epoch [13/51], Iter [150/391] Loss: 0.2696
Epoch [13/51], Iter [160/391] Loss: 0.2160
Epoch [13/51], Iter [170/391] Loss: 0.1921
Epoch [13/51], Iter [180/391] Loss: 0.1955
Epoch [13/51], Iter [190/391] Loss: 0.2004
Epoch [13/51], Iter [200/391] Loss: 0.2191
Epoch [13/51], Iter [210/391] Loss: 0.2764
Epoch [13/51], Iter [220/391] Loss: 0.2895
Epoch [13/51], Iter [230/391] Loss: 0.1520
Epoch [13/51], Iter [240/391] Loss: 0.1515
Epoch [13/51], Iter [250/391] Loss: 0.2255
Epoch [13/51], Iter [260/391] Loss: 0.1286
Epoch [13/51], Iter [270/391] Loss: 0.2239
Epoch [13/51], Iter [280/391] Loss: 0.2390
Epoch [13/51], Iter [290/391] Loss: 0.3146
Epoch [13/51], Iter [300/391] Loss: 0.2724
Epoch [13/51], Iter [310/391] Loss: 0.1780
Epoch [13/51], Iter [320/391] Loss: 0.1480
Epoch [13/51], Iter [330/391] Loss: 0.1408
Epoch [13/51], Iter [340/391] Loss: 0.2693
Epoch [13/51], Iter [350/391] Loss: 0.2213
Epoch [13/51], Iter [360/391] Loss: 0.2365
Epoch [13/51], Iter [370/391] Loss: 0.2124
Epoch [13/51], Iter [380/391] Loss: 0.1601
Epoch [13/51], Iter [390/391] Loss: 0.2021
Epoch [14/51], Iter [10/391] Loss: 0.1398
Epoch [14/51], Iter [20/391] Loss: 0.2626
Epoch [14/51], Iter [30/391] Loss: 0.2114
Epoch [14/51], Iter [40/391] Loss: 0.3358
Epoch [14/51], Iter [50/391] Loss: 0.1748
Epoch [14/51], Iter [60/391] Loss: 0.1898
Epoch [14/51], Iter [70/391] Loss: 0.1914
Epoch [14/51], Iter [80/391] Loss: 0.1711
Epoch [14/51], Iter [90/391] Loss: 0.2650
Epoch [14/51], Iter [100/391] Loss: 0.0958
Epoch [14/51], Iter [110/391] Loss: 0.1819
Epoch [14/51], Iter [120/391] Loss: 0.2395
Epoch [14/51], Iter [130/391] Loss: 0.2162
Epoch [14/51], Iter [140/391] Loss: 0.1970
Epoch [14/51], Iter [150/391] Loss: 0.2203
Epoch [14/51], Iter [160/391] Loss: 0.1603
Epoch [14/51], Iter [170/391] Loss: 0.1518
Epoch [14/51], Iter [180/391] Loss: 0.1105
Epoch [14/51], Iter [190/391] Loss: 0.1531
Epoch [14/51], Iter [200/391] Loss: 0.1774
Epoch [14/51], Iter [210/391] Loss: 0.1407
Epoch [14/51], Iter [220/391] Loss: 0.1669
Epoch [14/51], Iter [230/391] Loss: 0.2263
Epoch [14/51], Iter [240/391] Loss: 0.2120
Epoch [14/51], Iter [250/391] Loss: 0.1583
Epoch [14/51], Iter [260/391] Loss: 0.3533
Epoch [14/51], Iter [270/391] Loss: 0.2933
Epoch [14/51], Iter [280/391] Loss: 0.2491
Epoch [14/51], Iter [290/391] Loss: 0.2256
Epoch [14/51], Iter [300/391] Loss: 0.2344
Epoch [14/51], Iter [310/391] Loss: 0.1973
Epoch [14/51], Iter [320/391] Loss: 0.2305
Epoch [14/51], Iter [330/391] Loss: 0.2268
Epoch [14/51], Iter [340/391] Loss: 0.3032
Epoch [14/51], Iter [350/391] Loss: 0.1797
Epoch [14/51], Iter [360/391] Loss: 0.1983
Epoch [14/51], Iter [370/391] Loss: 0.2322
Epoch [14/51], Iter [380/391] Loss: 0.2620
Epoch [14/51], Iter [390/391] Loss: 0.2534
Epoch [15/51], Iter [10/391] Loss: 0.1648
Epoch [15/51], Iter [20/391] Loss: 0.1102
Epoch [15/51], Iter [30/391] Loss: 0.1768
Epoch [15/51], Iter [40/391] Loss: 0.1012
Epoch [15/51], Iter [50/391] Loss: 0.3027
Epoch [15/51], Iter [60/391] Loss: 0.1928
Epoch [15/51], Iter [70/391] Loss: 0.1448
Epoch [15/51], Iter [80/391] Loss: 0.1238
Epoch [15/51], Iter [90/391] Loss: 0.0837
Epoch [15/51], Iter [100/391] Loss: 0.1435
Epoch [15/51], Iter [110/391] Loss: 0.1538
Epoch [15/51], Iter [120/391] Loss: 0.1158
Epoch [15/51], Iter [130/391] Loss: 0.2321
Epoch [15/51], Iter [140/391] Loss: 0.2350
Epoch [15/51], Iter [150/391] Loss: 0.1049
Epoch [15/51], Iter [160/391] Loss: 0.1295
Epoch [15/51], Iter [170/391] Loss: 0.1562
Epoch [15/51], Iter [180/391] Loss: 0.1188
Epoch [15/51], Iter [190/391] Loss: 0.1277
Epoch [15/51], Iter [200/391] Loss: 0.1871
Epoch [15/51], Iter [210/391] Loss: 0.2019
Epoch [15/51], Iter [220/391] Loss: 0.1743
Epoch [15/51], Iter [230/391] Loss: 0.2524
Epoch [15/51], Iter [240/391] Loss: 0.2452
Epoch [15/51], Iter [250/391] Loss: 0.2008
Epoch [15/51], Iter [260/391] Loss: 0.1096
Epoch [15/51], Iter [270/391] Loss: 0.1598
Epoch [15/51], Iter [280/391] Loss: 0.1813
Epoch [15/51], Iter [290/391] Loss: 0.2211
Epoch [15/51], Iter [300/391] Loss: 0.0912
Epoch [15/51], Iter [310/391] Loss: 0.2240
Epoch [15/51], Iter [320/391] Loss: 0.1286
Epoch [15/51], Iter [330/391] Loss: 0.2866
Epoch [15/51], Iter [340/391] Loss: 0.2105
Epoch [15/51], Iter [350/391] Loss: 0.1818
Epoch [15/51], Iter [360/391] Loss: 0.1614
Epoch [15/51], Iter [370/391] Loss: 0.2205
Epoch [15/51], Iter [380/391] Loss: 0.2798
Epoch [15/51], Iter [390/391] Loss: 0.1106
Epoch [16/51], Iter [10/391] Loss: 0.1934
Epoch [16/51], Iter [20/391] Loss: 0.1264
Epoch [16/51], Iter [30/391] Loss: 0.1369
Epoch [16/51], Iter [40/391] Loss: 0.1295
Epoch [16/51], Iter [50/391] Loss: 0.1792
Epoch [16/51], Iter [60/391] Loss: 0.1852
Epoch [16/51], Iter [70/391] Loss: 0.1344
Epoch [16/51], Iter [80/391] Loss: 0.1826
Epoch [16/51], Iter [90/391] Loss: 0.2068
Epoch [16/51], Iter [100/391] Loss: 0.1930
Epoch [16/51], Iter [110/391] Loss: 0.1477
Epoch [16/51], Iter [120/391] Loss: 0.1432
Epoch [16/51], Iter [130/391] Loss: 0.1398
Epoch [16/51], Iter [140/391] Loss: 0.1916
Epoch [16/51], Iter [150/391] Loss: 0.1994
Epoch [16/51], Iter [160/391] Loss: 0.1503
Epoch [16/51], Iter [170/391] Loss: 0.1161
Epoch [16/51], Iter [180/391] Loss: 0.1225
Epoch [16/51], Iter [190/391] Loss: 0.2046
Epoch [16/51], Iter [200/391] Loss: 0.2253
Epoch [16/51], Iter [210/391] Loss: 0.2156
Epoch [16/51], Iter [220/391] Loss: 0.1150
Epoch [16/51], Iter [230/391] Loss: 0.1191
Epoch [16/51], Iter [240/391] Loss: 0.2027
Epoch [16/51], Iter [250/391] Loss: 0.1761
Epoch [16/51], Iter [260/391] Loss: 0.1853
Epoch [16/51], Iter [270/391] Loss: 0.1924
Epoch [16/51], Iter [280/391] Loss: 0.1455
Epoch [16/51], Iter [290/391] Loss: 0.2373
Epoch [16/51], Iter [300/391] Loss: 0.1995
Epoch [16/51], Iter [310/391] Loss: 0.0954
... (training log truncated for brevity: the per-iteration loss continues to fall from roughly 0.10-0.25 at epoch 16 to roughly 0.02-0.08 by epoch 51, and "[Saving Checkpoint]" is written after epochs 20, 30, 40 and 50) ...
Epoch [51/51], Iter [390/391] Loss: 0.0336
test_harness( testloader, resnet_parent )
Accuracy of the model on the test images: 91 %
(tensor(9143, device='cuda:0'), 10000)
ResNet Child (Student) Model Implementation
# ResNet Module
## NOTE:
## Uses the same 'Residual Block' as the parent model; however, the child (student) model is shallower and not as wide,
## so it is computationally much cheaper to train and deploy than the parent.
##
class ResNetChild(nn.Module):
    def __init__(self, block, layers, num_classes=10):
        super(ResNetChild, self).__init__()
        #
        self.in_channels = 16
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1, bias=False)
        torch.nn.init.xavier_uniform(self.conv1.weight)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU(inplace=True)
        #
        self.layer1 = self.make_layer(block, 160, layers[0])
        self.layer2 = self.make_layer(block, 320, layers[1], 2)
        #
        self.avg_pool = nn.AvgPool2d(16)
        self.fc = nn.Linear(320, num_classes)
        torch.nn.init.xavier_uniform(self.fc.weight)

    def weights_init(self, m):
        classname = m.__class__.__name__
        if classname.find('Conv') != -1:
            torch.nn.init.xavier_uniform(m.weight)

    def make_layer(self, block, out_channels, blocks, stride=1):
        #insert a projection shortcut when the spatial size or channel count changes
        downsample = None
        if (stride != 1) or (self.in_channels != out_channels):
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channels, out_channels, kernel_size=3, padding=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels))
            downsample.apply(self.weights_init)
        layers = []
        layers.append(block(self.in_channels, out_channels, stride, downsample))
        self.in_channels = out_channels
        for i in range(1, blocks):
            layers.append(block(self.in_channels, out_channels))
        return nn.Sequential(*layers)

    def forward(self, x):
        #
        out = self.conv1(x)
        out = self.bn(out)
        out = self.relu(out)
        #
        out = self.layer1(out)
        out = self.layer2(out)
        #
        out = self.avg_pool(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out
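As a quick sanity check of the architecture, the child model can be instantiated and run on a dummy batch to confirm the output shape. This is only a minimal sketch and was not part of the notebook run: it assumes the ResidualBlock class defined earlier in the notebook, CIFAR-10-sized 3×32×32 inputs, and the hypothetical variable names m_check / out_check are introduced here purely for illustration.
#illustrative shape check (assumes ResidualBlock from earlier; not part of the original run)
m_check = ResNetChild(ResidualBlock, [2, 2])
out_check = m_check(torch.randn(1, 3, 32, 32))
print(out_check.shape)  #32x32 -> layer2 halves to 16x16 -> AvgPool2d(16) -> (1, 320) -> fc -> torch.Size([1, 10])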
#Uncomment to run visualizer (GraphViz)
#m = ResNetChild(ResidualBlock, [2,2])
#y = m(Variable(images_tmp))
#g_child = viz_nn(y, m.state_dict())
#IFrame(g_child.view(), width=1000, height=1000, embed=True)
display(Image(filename='res_convnet_child_graph.JPG', embed=True))
#helper function to get back a new child (student) model
def get_new_child_model( block = ResidualBlock, layers = [2, 2], learning_rate = 0.01 ):
    resnet_child = ResNetChild( block, layers )
    optimizer_child = torch.optim.SGD(resnet_child.parameters(), lr=learning_rate, momentum=0.9, weight_decay=5e-4)
    #GPU Acceleration
    if torch.cuda.is_available():
        gpu_id = torch.cuda.get_device_name(0)
        print('ENABLING GPU ACCELERATION || {}'.format(gpu_id))
        resnet_child = torch.nn.DataParallel(resnet_child, device_ids=range(torch.cuda.device_count()))
        resnet_child.cuda()
        cudnn.benchmark = True
    return resnet_child, optimizer_child
resnet_child, optimizer_child = get_new_child_model()
epoch = 0
#Load Previous Model?
load_previous_model = False
if load_previous_model:
    resnet_child, optimizer_child, epoch = load_model( resnet_child, optimizer_child )
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:12: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
if sys.path[0] == '':
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:26: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:15: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
from ipykernel import kernelapp as app
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:20: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:21: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
“Knowledge Distillation” Cost Function Implementation
Below we implement the “knowledge distillation” cost function. It is a weighted (linear) combination of a KL-divergence loss on the soft targets and a cross-entropy loss on the hard targets.
To compare the student’s outputs against the ‘hard targets’, we use the cross-entropy loss already implemented in PyTorch; its implementation applies a softmax layer followed by a cross-entropy calculation against the hard targets (labels).
For the soft objective we instead use the KL-divergence loss implementation; as implemented in PyTorch, it takes target class probabilities. The KL divergence is the difference between the cross-entropy of P and Q and the entropy of P:
$D_{KL}(P \| Q) = H(P, Q) - H(P)$
Since the entropy of P (the soft targets generated by the teacher model) does not depend on the student’s logits z, minimizing the KL divergence instead of the cross-entropy has no effect on the gradients, and we can apply the scaling between the two losses described in the paper.
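As a brief sketch of where the $T^2$ factor used below comes from (this follows the argument in section 2.1 of Hinton et al., not code in this notebook): the gradient of the soft-target cross-entropy $C$ with respect to a student logit $z_i$ is $\partial C / \partial z_i = \frac{1}{T}(q_i - p_i)$, where $q_i$ and $p_i$ are the student’s and teacher’s temperature-softened probabilities. In the high-temperature limit this is approximately $\frac{1}{NT^2}(z_i - v_i)$, with $v_i$ the teacher’s logits and $N$ the number of classes, so the soft-target gradients scale as $1/T^2$. Multiplying the soft loss by $T^2$ therefore keeps the relative contribution of the hard and soft terms roughly unchanged as $T$ varies.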
#Knowledge Distillation Loss Function
def knowledge_distillation_loss( student_output, labels, teacher_output, alpha, T ):
    '''Weighted combination of the soft-target (KL divergence) and hard-target (cross-entropy) losses.'''
    #both the student and the teacher logits are softened with the same temperature T, as in the paper
    student_output_pr = F.log_softmax(student_output/T, dim=1)
    teacher_output_pr = F.softmax(teacher_output/T, dim=1)
    #
    loss_func = nn.KLDivLoss()
    #
    loss = loss_func( student_output_pr, teacher_output_pr ) * alpha * (T * T) #(T*T for gradient scaling)
    loss = loss + F.cross_entropy(student_output, labels) * (1. - alpha)
    #
    return loss
Note that the PyTorch KLDivLoss implementation takes as input:
- Outputs - log probabilities for each class (note F.log_softmax rather than F.softmax)
- Targets - probabilities for each class; in the distillation implementation these are generated by the teacher model.
The PyTorch cross-entropy implementation, F.cross_entropy, instead takes the raw scores (logits) before any softmax layer, together with the hard labels.
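As a small, self-contained illustration of these two input conventions (the tensor values and variable names below are made up for this sketch and this cell is not part of the training pipeline):
#illustration of the two loss-input conventions (made-up values, not part of the training run)
import torch
import torch.nn.functional as F

student_logits = torch.tensor([[1.0, 2.0, 0.5]])   #raw scores from the student
teacher_logits = torch.tensor([[1.5, 2.5, 0.0]])   #raw scores from the teacher
labels = torch.tensor([1])                         #hard-target class index

#F.kl_div expects log-probabilities as the input and probabilities as the target
kl = F.kl_div(F.log_softmax(student_logits, dim=1), F.softmax(teacher_logits, dim=1))

#F.cross_entropy takes the raw logits and the class indices; softmax is applied internally
ce = F.cross_entropy(student_logits, labels)
print(kl.item(), ce.item())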
#let's train the child (student) model
#we can define different knowledge distillation (KD) functions by binding alpha and temperature like this
#alpha = 0, T=1 (as represented by a0_t1) means child network learns from 'labels' - not from parent.
#this is the baseline scenario
kd_loss_a0_t1 = partial( knowledge_distillation_loss, alpha=0, T=1 )
training_harness( trainloader, optimizer_child, kd_loss_a0_t1, resnet_parent, resnet_child )
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:46: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
Epoch [1/51], Iter [10/391] Loss: 2.2543
Epoch [1/51], Iter [20/391] Loss: 1.9698
Epoch [1/51], Iter [30/391] Loss: 1.8502
Epoch [1/51], Iter [40/391] Loss: 1.7565
Epoch [1/51], Iter [50/391] Loss: 1.7479
Epoch [1/51], Iter [60/391] Loss: 1.6707
Epoch [1/51], Iter [70/391] Loss: 1.8412
Epoch [1/51], Iter [80/391] Loss: 1.6146
Epoch [1/51], Iter [90/391] Loss: 1.7598
Epoch [1/51], Iter [100/391] Loss: 1.5648
Epoch [1/51], Iter [110/391] Loss: 1.5721
Epoch [1/51], Iter [120/391] Loss: 1.6673
Epoch [1/51], Iter [130/391] Loss: 1.5275
Epoch [1/51], Iter [140/391] Loss: 1.4643
Epoch [1/51], Iter [150/391] Loss: 1.6041
Epoch [1/51], Iter [160/391] Loss: 1.5165
Epoch [1/51], Iter [170/391] Loss: 1.4318
Epoch [1/51], Iter [180/391] Loss: 1.5028
Epoch [1/51], Iter [190/391] Loss: 1.3602
Epoch [1/51], Iter [200/391] Loss: 1.5109
Epoch [1/51], Iter [210/391] Loss: 1.2951
Epoch [1/51], Iter [220/391] Loss: 1.3385
Epoch [1/51], Iter [230/391] Loss: 1.3036
Epoch [1/51], Iter [240/391] Loss: 1.2850
Epoch [1/51], Iter [250/391] Loss: 1.3674
Epoch [1/51], Iter [260/391] Loss: 1.2867
Epoch [1/51], Iter [270/391] Loss: 1.2084
Epoch [1/51], Iter [280/391] Loss: 1.2379
Epoch [1/51], Iter [290/391] Loss: 1.1892
Epoch [1/51], Iter [300/391] Loss: 1.3938
Epoch [1/51], Iter [310/391] Loss: 1.3314
Epoch [1/51], Iter [320/391] Loss: 1.2846
Epoch [1/51], Iter [330/391] Loss: 1.3483
Epoch [1/51], Iter [340/391] Loss: 1.1913
Epoch [1/51], Iter [350/391] Loss: 1.2884
Epoch [1/51], Iter [360/391] Loss: 1.1806
Epoch [1/51], Iter [370/391] Loss: 1.2718
Epoch [1/51], Iter [380/391] Loss: 1.1063
Epoch [1/51], Iter [390/391] Loss: 1.1321
Epoch [2/51], Iter [10/391] Loss: 1.1250
Epoch [2/51], Iter [20/391] Loss: 1.0256
Epoch [2/51], Iter [30/391] Loss: 1.1243
Epoch [2/51], Iter [40/391] Loss: 1.1361
Epoch [2/51], Iter [50/391] Loss: 1.1288
Epoch [2/51], Iter [60/391] Loss: 1.1648
Epoch [2/51], Iter [70/391] Loss: 1.2306
Epoch [2/51], Iter [80/391] Loss: 1.0333
Epoch [2/51], Iter [90/391] Loss: 1.0290
Epoch [2/51], Iter [100/391] Loss: 1.1369
Epoch [2/51], Iter [110/391] Loss: 1.1180
Epoch [2/51], Iter [120/391] Loss: 1.0079
Epoch [2/51], Iter [130/391] Loss: 1.0252
Epoch [2/51], Iter [140/391] Loss: 0.9666
Epoch [2/51], Iter [150/391] Loss: 1.0758
Epoch [2/51], Iter [160/391] Loss: 1.0348
Epoch [2/51], Iter [170/391] Loss: 1.1567
Epoch [2/51], Iter [180/391] Loss: 1.1383
Epoch [2/51], Iter [190/391] Loss: 1.0508
Epoch [2/51], Iter [200/391] Loss: 0.9120
Epoch [2/51], Iter [210/391] Loss: 1.2081
Epoch [2/51], Iter [220/391] Loss: 1.0984
Epoch [2/51], Iter [230/391] Loss: 1.0180
Epoch [2/51], Iter [240/391] Loss: 0.9810
Epoch [2/51], Iter [250/391] Loss: 1.1745
Epoch [2/51], Iter [260/391] Loss: 0.9038
Epoch [2/51], Iter [270/391] Loss: 0.9240
Epoch [2/51], Iter [280/391] Loss: 1.0182
Epoch [2/51], Iter [290/391] Loss: 1.0739
Epoch [2/51], Iter [300/391] Loss: 0.8797
Epoch [2/51], Iter [310/391] Loss: 1.0376
Epoch [2/51], Iter [320/391] Loss: 1.0616
Epoch [2/51], Iter [330/391] Loss: 0.9933
Epoch [2/51], Iter [340/391] Loss: 0.8483
Epoch [2/51], Iter [350/391] Loss: 0.7524
Epoch [2/51], Iter [360/391] Loss: 0.9141
Epoch [2/51], Iter [370/391] Loss: 0.9415
Epoch [2/51], Iter [380/391] Loss: 1.0335
Epoch [2/51], Iter [390/391] Loss: 0.7997
Epoch [3/51], Iter [10/391] Loss: 0.9460
Epoch [3/51], Iter [20/391] Loss: 0.8170
Epoch [3/51], Iter [30/391] Loss: 0.8376
Epoch [3/51], Iter [40/391] Loss: 0.9451
Epoch [3/51], Iter [50/391] Loss: 1.0840
Epoch [3/51], Iter [60/391] Loss: 0.9801
Epoch [3/51], Iter [70/391] Loss: 0.9370
Epoch [3/51], Iter [80/391] Loss: 0.9017
Epoch [3/51], Iter [90/391] Loss: 0.9249
Epoch [3/51], Iter [100/391] Loss: 1.0162
Epoch [3/51], Iter [110/391] Loss: 0.8539
Epoch [3/51], Iter [120/391] Loss: 0.6952
Epoch [3/51], Iter [130/391] Loss: 0.9344
Epoch [3/51], Iter [140/391] Loss: 0.8056
Epoch [3/51], Iter [150/391] Loss: 1.0895
Epoch [3/51], Iter [160/391] Loss: 0.8263
Epoch [3/51], Iter [170/391] Loss: 1.0447
Epoch [3/51], Iter [180/391] Loss: 1.2192
Epoch [3/51], Iter [190/391] Loss: 0.7197
Epoch [3/51], Iter [200/391] Loss: 0.8167
Epoch [3/51], Iter [210/391] Loss: 0.8456
Epoch [3/51], Iter [220/391] Loss: 0.9917
Epoch [3/51], Iter [230/391] Loss: 1.0930
Epoch [3/51], Iter [240/391] Loss: 0.8934
Epoch [3/51], Iter [250/391] Loss: 0.9302
Epoch [3/51], Iter [260/391] Loss: 0.9201
Epoch [3/51], Iter [270/391] Loss: 0.7880
Epoch [3/51], Iter [280/391] Loss: 0.7620
Epoch [3/51], Iter [290/391] Loss: 0.8508
Epoch [3/51], Iter [300/391] Loss: 0.7889
Epoch [3/51], Iter [310/391] Loss: 0.8330
Epoch [3/51], Iter [320/391] Loss: 0.7811
Epoch [3/51], Iter [330/391] Loss: 0.9330
Epoch [3/51], Iter [340/391] Loss: 0.8466
Epoch [3/51], Iter [350/391] Loss: 0.8151
Epoch [3/51], Iter [360/391] Loss: 0.7568
Epoch [3/51], Iter [370/391] Loss: 0.7510
Epoch [3/51], Iter [380/391] Loss: 0.7044
Epoch [3/51], Iter [390/391] Loss: 0.7835
Epoch [4/51], Iter [10/391] Loss: 0.8822
Epoch [4/51], Iter [20/391] Loss: 0.8021
Epoch [4/51], Iter [30/391] Loss: 0.8458
Epoch [4/51], Iter [40/391] Loss: 0.7600
Epoch [4/51], Iter [50/391] Loss: 0.8137
Epoch [4/51], Iter [60/391] Loss: 0.8224
Epoch [4/51], Iter [70/391] Loss: 0.8889
Epoch [4/51], Iter [80/391] Loss: 0.8185
Epoch [4/51], Iter [90/391] Loss: 0.7325
Epoch [4/51], Iter [100/391] Loss: 0.7788
Epoch [4/51], Iter [110/391] Loss: 0.8990
Epoch [4/51], Iter [120/391] Loss: 0.7503
Epoch [4/51], Iter [130/391] Loss: 0.7329
Epoch [4/51], Iter [140/391] Loss: 0.6821
Epoch [4/51], Iter [150/391] Loss: 1.0431
Epoch [4/51], Iter [160/391] Loss: 0.7241
Epoch [4/51], Iter [170/391] Loss: 0.7323
Epoch [4/51], Iter [180/391] Loss: 0.7754
Epoch [4/51], Iter [190/391] Loss: 0.6716
Epoch [4/51], Iter [200/391] Loss: 0.6997
Epoch [4/51], Iter [210/391] Loss: 0.7166
Epoch [4/51], Iter [220/391] Loss: 0.7052
Epoch [4/51], Iter [230/391] Loss: 0.8842
Epoch [4/51], Iter [240/391] Loss: 0.9578
Epoch [4/51], Iter [250/391] Loss: 0.7822
Epoch [4/51], Iter [260/391] Loss: 0.6698
Epoch [4/51], Iter [270/391] Loss: 0.7308
Epoch [4/51], Iter [280/391] Loss: 0.7059
Epoch [4/51], Iter [290/391] Loss: 0.6414
Epoch [4/51], Iter [300/391] Loss: 0.5475
Epoch [4/51], Iter [310/391] Loss: 0.6783
Epoch [4/51], Iter [320/391] Loss: 0.7847
Epoch [4/51], Iter [330/391] Loss: 0.6354
Epoch [4/51], Iter [340/391] Loss: 0.8120
Epoch [4/51], Iter [350/391] Loss: 0.6986
Epoch [4/51], Iter [360/391] Loss: 0.8864
Epoch [4/51], Iter [370/391] Loss: 0.6154
Epoch [4/51], Iter [380/391] Loss: 0.7190
Epoch [4/51], Iter [390/391] Loss: 0.8972
Epoch [5/51], Iter [10/391] Loss: 0.5740
Epoch [5/51], Iter [20/391] Loss: 0.6669
Epoch [5/51], Iter [30/391] Loss: 0.6273
Epoch [5/51], Iter [40/391] Loss: 0.7709
Epoch [5/51], Iter [50/391] Loss: 0.7160
Epoch [5/51], Iter [60/391] Loss: 0.7196
Epoch [5/51], Iter [70/391] Loss: 0.7036
Epoch [5/51], Iter [80/391] Loss: 0.6084
Epoch [5/51], Iter [90/391] Loss: 0.7472
Epoch [5/51], Iter [100/391] Loss: 0.6216
Epoch [5/51], Iter [110/391] Loss: 0.7499
Epoch [5/51], Iter [120/391] Loss: 0.7700
Epoch [5/51], Iter [130/391] Loss: 0.5635
Epoch [5/51], Iter [140/391] Loss: 0.7627
Epoch [5/51], Iter [150/391] Loss: 0.6018
Epoch [5/51], Iter [160/391] Loss: 0.6257
Epoch [5/51], Iter [170/391] Loss: 0.8939
Epoch [5/51], Iter [180/391] Loss: 0.7454
Epoch [5/51], Iter [190/391] Loss: 0.6821
Epoch [5/51], Iter [200/391] Loss: 0.6832
Epoch [5/51], Iter [210/391] Loss: 0.6374
Epoch [5/51], Iter [220/391] Loss: 0.7065
Epoch [5/51], Iter [230/391] Loss: 0.7465
Epoch [5/51], Iter [240/391] Loss: 0.7286
Epoch [5/51], Iter [250/391] Loss: 0.5434
Epoch [5/51], Iter [260/391] Loss: 0.7350
Epoch [5/51], Iter [270/391] Loss: 0.7031
Epoch [5/51], Iter [280/391] Loss: 0.4654
Epoch [5/51], Iter [290/391] Loss: 0.6237
Epoch [5/51], Iter [300/391] Loss: 0.5479
Epoch [5/51], Iter [310/391] Loss: 0.7906
Epoch [5/51], Iter [320/391] Loss: 0.7707
Epoch [5/51], Iter [330/391] Loss: 0.8023
Epoch [5/51], Iter [340/391] Loss: 0.6752
Epoch [5/51], Iter [350/391] Loss: 0.5402
Epoch [5/51], Iter [360/391] Loss: 0.8599
Epoch [5/51], Iter [370/391] Loss: 0.6516
Epoch [5/51], Iter [380/391] Loss: 0.6821
Epoch [5/51], Iter [390/391] Loss: 0.6251
Epoch [6/51], Iter [10/391] Loss: 0.5845
Epoch [6/51], Iter [20/391] Loss: 0.5511
Epoch [6/51], Iter [30/391] Loss: 0.5964
Epoch [6/51], Iter [40/391] Loss: 0.6140
Epoch [6/51], Iter [50/391] Loss: 0.6461
Epoch [6/51], Iter [60/391] Loss: 0.6707
Epoch [6/51], Iter [70/391] Loss: 0.6832
Epoch [6/51], Iter [80/391] Loss: 0.5837
Epoch [6/51], Iter [90/391] Loss: 0.6337
Epoch [6/51], Iter [100/391] Loss: 0.5147
Epoch [6/51], Iter [110/391] Loss: 0.5561
Epoch [6/51], Iter [120/391] Loss: 0.5062
Epoch [6/51], Iter [130/391] Loss: 0.6485
Epoch [6/51], Iter [140/391] Loss: 0.6147
Epoch [6/51], Iter [150/391] Loss: 0.6160
Epoch [6/51], Iter [160/391] Loss: 0.5719
Epoch [6/51], Iter [170/391] Loss: 0.4889
Epoch [6/51], Iter [180/391] Loss: 0.4962
Epoch [6/51], Iter [190/391] Loss: 0.6084
Epoch [6/51], Iter [200/391] Loss: 0.4790
Epoch [6/51], Iter [210/391] Loss: 0.5801
Epoch [6/51], Iter [220/391] Loss: 0.5635
Epoch [6/51], Iter [230/391] Loss: 0.5999
Epoch [6/51], Iter [240/391] Loss: 0.6626
Epoch [6/51], Iter [250/391] Loss: 0.5631
Epoch [6/51], Iter [260/391] Loss: 0.4483
Epoch [6/51], Iter [270/391] Loss: 0.5422
Epoch [6/51], Iter [280/391] Loss: 0.6450
Epoch [6/51], Iter [290/391] Loss: 0.5213
Epoch [6/51], Iter [300/391] Loss: 0.6056
Epoch [6/51], Iter [310/391] Loss: 0.6452
Epoch [6/51], Iter [320/391] Loss: 0.5067
Epoch [6/51], Iter [330/391] Loss: 0.5934
Epoch [6/51], Iter [340/391] Loss: 0.6037
Epoch [6/51], Iter [350/391] Loss: 0.4807
Epoch [6/51], Iter [360/391] Loss: 0.5568
Epoch [6/51], Iter [370/391] Loss: 0.5841
Epoch [6/51], Iter [380/391] Loss: 0.6145
Epoch [6/51], Iter [390/391] Loss: 0.6214
Epoch [7/51], Iter [10/391] Loss: 0.4965
Epoch [7/51], Iter [20/391] Loss: 0.4967
Epoch [7/51], Iter [30/391] Loss: 0.5746
Epoch [7/51], Iter [40/391] Loss: 0.5655
Epoch [7/51], Iter [50/391] Loss: 0.6411
Epoch [7/51], Iter [60/391] Loss: 0.5381
Epoch [7/51], Iter [70/391] Loss: 0.4965
Epoch [7/51], Iter [80/391] Loss: 0.4617
Epoch [7/51], Iter [90/391] Loss: 0.6435
Epoch [7/51], Iter [100/391] Loss: 0.6019
Epoch [7/51], Iter [110/391] Loss: 0.3816
Epoch [7/51], Iter [120/391] Loss: 0.5999
Epoch [7/51], Iter [130/391] Loss: 0.4370
Epoch [7/51], Iter [140/391] Loss: 0.5489
Epoch [7/51], Iter [150/391] Loss: 0.5361
Epoch [7/51], Iter [160/391] Loss: 0.5753
Epoch [7/51], Iter [170/391] Loss: 0.5460
Epoch [7/51], Iter [180/391] Loss: 0.5190
Epoch [7/51], Iter [190/391] Loss: 0.6735
Epoch [7/51], Iter [200/391] Loss: 0.4385
Epoch [7/51], Iter [210/391] Loss: 0.5838
Epoch [7/51], Iter [220/391] Loss: 0.4400
Epoch [7/51], Iter [230/391] Loss: 0.5060
Epoch [7/51], Iter [240/391] Loss: 0.4225
Epoch [7/51], Iter [250/391] Loss: 0.6380
Epoch [7/51], Iter [260/391] Loss: 0.5989
Epoch [7/51], Iter [270/391] Loss: 0.4947
Epoch [7/51], Iter [280/391] Loss: 0.6721
Epoch [7/51], Iter [290/391] Loss: 0.3767
Epoch [7/51], Iter [300/391] Loss: 0.5097
Epoch [7/51], Iter [310/391] Loss: 0.6008
Epoch [7/51], Iter [320/391] Loss: 0.6951
Epoch [7/51], Iter [330/391] Loss: 0.5371
Epoch [7/51], Iter [340/391] Loss: 0.5692
Epoch [7/51], Iter [350/391] Loss: 0.3526
Epoch [7/51], Iter [360/391] Loss: 0.5547
Epoch [7/51], Iter [370/391] Loss: 0.5627
Epoch [7/51], Iter [380/391] Loss: 0.5655
Epoch [7/51], Iter [390/391] Loss: 0.6204
Epoch [8/51], Iter [10/391] Loss: 0.4085
Epoch [8/51], Iter [20/391] Loss: 0.4517
Epoch [8/51], Iter [30/391] Loss: 0.5653
Epoch [8/51], Iter [40/391] Loss: 0.3639
Epoch [8/51], Iter [50/391] Loss: 0.4966
Epoch [8/51], Iter [60/391] Loss: 0.4063
Epoch [8/51], Iter [70/391] Loss: 0.4786
Epoch [8/51], Iter [80/391] Loss: 0.5775
Epoch [8/51], Iter [90/391] Loss: 0.4419
Epoch [8/51], Iter [100/391] Loss: 0.4971
Epoch [8/51], Iter [110/391] Loss: 0.5830
Epoch [8/51], Iter [120/391] Loss: 0.4499
Epoch [8/51], Iter [130/391] Loss: 0.5179
Epoch [8/51], Iter [140/391] Loss: 0.5564
Epoch [8/51], Iter [150/391] Loss: 0.5127
Epoch [8/51], Iter [160/391] Loss: 0.4534
Epoch [8/51], Iter [170/391] Loss: 0.4891
Epoch [8/51], Iter [180/391] Loss: 0.4378
Epoch [8/51], Iter [190/391] Loss: 0.3664
Epoch [8/51], Iter [200/391] Loss: 0.6946
Epoch [8/51], Iter [210/391] Loss: 0.5098
Epoch [8/51], Iter [220/391] Loss: 0.4230
Epoch [8/51], Iter [230/391] Loss: 0.5760
Epoch [8/51], Iter [240/391] Loss: 0.5190
Epoch [8/51], Iter [250/391] Loss: 0.4676
Epoch [8/51], Iter [260/391] Loss: 0.4959
Epoch [8/51], Iter [270/391] Loss: 0.3979
Epoch [8/51], Iter [280/391] Loss: 0.4227
Epoch [8/51], Iter [290/391] Loss: 0.5648
Epoch [8/51], Iter [300/391] Loss: 0.4814
Epoch [8/51], Iter [310/391] Loss: 0.5910
Epoch [8/51], Iter [320/391] Loss: 0.3997
Epoch [8/51], Iter [330/391] Loss: 0.4538
Epoch [8/51], Iter [340/391] Loss: 0.5846
Epoch [8/51], Iter [350/391] Loss: 0.4051
Epoch [8/51], Iter [360/391] Loss: 0.5096
Epoch [8/51], Iter [370/391] Loss: 0.5039
Epoch [8/51], Iter [380/391] Loss: 0.5074
Epoch [8/51], Iter [390/391] Loss: 0.4866
Epoch [9/51], Iter [10/391] Loss: 0.3557
Epoch [9/51], Iter [20/391] Loss: 0.4282
Epoch [9/51], Iter [30/391] Loss: 0.5040
Epoch [9/51], Iter [40/391] Loss: 0.4272
Epoch [9/51], Iter [50/391] Loss: 0.5546
Epoch [9/51], Iter [60/391] Loss: 0.4725
Epoch [9/51], Iter [70/391] Loss: 0.4638
Epoch [9/51], Iter [80/391] Loss: 0.6853
Epoch [9/51], Iter [90/391] Loss: 0.6471
Epoch [9/51], Iter [100/391] Loss: 0.4294
Epoch [9/51], Iter [110/391] Loss: 0.4977
Epoch [9/51], Iter [120/391] Loss: 0.4171
Epoch [9/51], Iter [130/391] Loss: 0.5049
Epoch [9/51], Iter [140/391] Loss: 0.4431
Epoch [9/51], Iter [150/391] Loss: 0.5461
Epoch [9/51], Iter [160/391] Loss: 0.4894
Epoch [9/51], Iter [170/391] Loss: 0.5279
Epoch [9/51], Iter [180/391] Loss: 0.5577
Epoch [9/51], Iter [190/391] Loss: 0.4415
Epoch [9/51], Iter [200/391] Loss: 0.5352
Epoch [9/51], Iter [210/391] Loss: 0.4550
Epoch [9/51], Iter [220/391] Loss: 0.4540
Epoch [9/51], Iter [230/391] Loss: 0.5123
Epoch [9/51], Iter [240/391] Loss: 0.4089
Epoch [9/51], Iter [250/391] Loss: 0.5250
Epoch [9/51], Iter [260/391] Loss: 0.4715
Epoch [9/51], Iter [270/391] Loss: 0.3946
Epoch [9/51], Iter [280/391] Loss: 0.5808
Epoch [9/51], Iter [290/391] Loss: 0.3856
Epoch [9/51], Iter [300/391] Loss: 0.4584
Epoch [9/51], Iter [310/391] Loss: 0.4996
Epoch [9/51], Iter [320/391] Loss: 0.5744
Epoch [9/51], Iter [330/391] Loss: 0.5115
Epoch [9/51], Iter [340/391] Loss: 0.4686
Epoch [9/51], Iter [350/391] Loss: 0.4881
Epoch [9/51], Iter [360/391] Loss: 0.2829
Epoch [9/51], Iter [370/391] Loss: 0.3471
Epoch [9/51], Iter [380/391] Loss: 0.3832
Epoch [9/51], Iter [390/391] Loss: 0.4719
Epoch [10/51], Iter [10/391] Loss: 0.3656
Epoch [10/51], Iter [20/391] Loss: 0.3506
Epoch [10/51], Iter [30/391] Loss: 0.3193
Epoch [10/51], Iter [40/391] Loss: 0.3916
Epoch [10/51], Iter [50/391] Loss: 0.4774
Epoch [10/51], Iter [60/391] Loss: 0.4976
Epoch [10/51], Iter [70/391] Loss: 0.3874
Epoch [10/51], Iter [80/391] Loss: 0.3561
Epoch [10/51], Iter [90/391] Loss: 0.5140
Epoch [10/51], Iter [100/391] Loss: 0.3367
Epoch [10/51], Iter [110/391] Loss: 0.3110
Epoch [10/51], Iter [120/391] Loss: 0.3934
Epoch [10/51], Iter [130/391] Loss: 0.4034
Epoch [10/51], Iter [140/391] Loss: 0.3980
Epoch [10/51], Iter [150/391] Loss: 0.4661
Epoch [10/51], Iter [160/391] Loss: 0.4791
Epoch [10/51], Iter [170/391] Loss: 0.4085
Epoch [10/51], Iter [180/391] Loss: 0.4019
Epoch [10/51], Iter [190/391] Loss: 0.4834
Epoch [10/51], Iter [200/391] Loss: 0.6046
Epoch [10/51], Iter [210/391] Loss: 0.4421
Epoch [10/51], Iter [220/391] Loss: 0.3409
Epoch [10/51], Iter [230/391] Loss: 0.5361
Epoch [10/51], Iter [240/391] Loss: 0.3915
Epoch [10/51], Iter [250/391] Loss: 0.4282
Epoch [10/51], Iter [260/391] Loss: 0.4541
Epoch [10/51], Iter [270/391] Loss: 0.4328
Epoch [10/51], Iter [280/391] Loss: 0.4906
Epoch [10/51], Iter [290/391] Loss: 0.6196
Epoch [10/51], Iter [300/391] Loss: 0.4984
Epoch [10/51], Iter [310/391] Loss: 0.3491
Epoch [10/51], Iter [320/391] Loss: 0.3821
Epoch [10/51], Iter [330/391] Loss: 0.6899
Epoch [10/51], Iter [340/391] Loss: 0.4819
Epoch [10/51], Iter [350/391] Loss: 0.5484
Epoch [10/51], Iter [360/391] Loss: 0.4656
Epoch [10/51], Iter [370/391] Loss: 0.3492
Epoch [10/51], Iter [380/391] Loss: 0.2800
Epoch [10/51], Iter [390/391] Loss: 0.3663
[Saving Checkpoint]
Epoch [11/51], Iter [10/391] Loss: 0.2423
Epoch [11/51], Iter [20/391] Loss: 0.3662
Epoch [11/51], Iter [30/391] Loss: 0.4482
Epoch [11/51], Iter [40/391] Loss: 0.3433
Epoch [11/51], Iter [50/391] Loss: 0.4427
Epoch [11/51], Iter [60/391] Loss: 0.3703
Epoch [11/51], Iter [70/391] Loss: 0.4209
Epoch [11/51], Iter [80/391] Loss: 0.3447
Epoch [11/51], Iter [90/391] Loss: 0.3864
Epoch [11/51], Iter [100/391] Loss: 0.3484
Epoch [11/51], Iter [110/391] Loss: 0.4563
Epoch [11/51], Iter [120/391] Loss: 0.5370
Epoch [11/51], Iter [130/391] Loss: 0.5135
Epoch [11/51], Iter [140/391] Loss: 0.3203
Epoch [11/51], Iter [150/391] Loss: 0.3630
Epoch [11/51], Iter [160/391] Loss: 0.6241
Epoch [11/51], Iter [170/391] Loss: 0.4407
Epoch [11/51], Iter [180/391] Loss: 0.4519
Epoch [11/51], Iter [190/391] Loss: 0.4971
Epoch [11/51], Iter [200/391] Loss: 0.4521
Epoch [11/51], Iter [210/391] Loss: 0.3455
Epoch [11/51], Iter [220/391] Loss: 0.3956
Epoch [11/51], Iter [230/391] Loss: 0.3820
Epoch [11/51], Iter [240/391] Loss: 0.2715
Epoch [11/51], Iter [250/391] Loss: 0.2731
Epoch [11/51], Iter [260/391] Loss: 0.5480
Epoch [11/51], Iter [270/391] Loss: 0.4312
Epoch [11/51], Iter [280/391] Loss: 0.3734
Epoch [11/51], Iter [290/391] Loss: 0.4430
Epoch [11/51], Iter [300/391] Loss: 0.4145
Epoch [11/51], Iter [310/391] Loss: 0.3338
Epoch [11/51], Iter [320/391] Loss: 0.4436
Epoch [11/51], Iter [330/391] Loss: 0.2851
Epoch [11/51], Iter [340/391] Loss: 0.3519
Epoch [11/51], Iter [350/391] Loss: 0.3430
Epoch [11/51], Iter [360/391] Loss: 0.3554
Epoch [11/51], Iter [370/391] Loss: 0.5069
Epoch [11/51], Iter [380/391] Loss: 0.3424
Epoch [11/51], Iter [390/391] Loss: 0.4395
Epoch [12/51], Iter [10/391] Loss: 0.3882
Epoch [12/51], Iter [20/391] Loss: 0.3034
Epoch [12/51], Iter [30/391] Loss: 0.3626
Epoch [12/51], Iter [40/391] Loss: 0.3128
Epoch [12/51], Iter [50/391] Loss: 0.4349
Epoch [12/51], Iter [60/391] Loss: 0.4297
Epoch [12/51], Iter [70/391] Loss: 0.4297
Epoch [12/51], Iter [80/391] Loss: 0.4123
Epoch [12/51], Iter [90/391] Loss: 0.4208
Epoch [12/51], Iter [100/391] Loss: 0.4643
Epoch [12/51], Iter [110/391] Loss: 0.4603
Epoch [12/51], Iter [120/391] Loss: 0.4281
Epoch [12/51], Iter [130/391] Loss: 0.3752
Epoch [12/51], Iter [140/391] Loss: 0.3684
Epoch [12/51], Iter [150/391] Loss: 0.3551
Epoch [12/51], Iter [160/391] Loss: 0.3467
Epoch [12/51], Iter [170/391] Loss: 0.4482
Epoch [12/51], Iter [180/391] Loss: 0.4648
Epoch [12/51], Iter [190/391] Loss: 0.4173
Epoch [12/51], Iter [200/391] Loss: 0.3834
Epoch [12/51], Iter [210/391] Loss: 0.3838
Epoch [12/51], Iter [220/391] Loss: 0.2803
Epoch [12/51], Iter [230/391] Loss: 0.2371
Epoch [12/51], Iter [240/391] Loss: 0.3242
Epoch [12/51], Iter [250/391] Loss: 0.3454
Epoch [12/51], Iter [260/391] Loss: 0.3833
Epoch [12/51], Iter [270/391] Loss: 0.3102
Epoch [12/51], Iter [280/391] Loss: 0.4915
Epoch [12/51], Iter [290/391] Loss: 0.5191
Epoch [12/51], Iter [300/391] Loss: 0.3419
Epoch [12/51], Iter [310/391] Loss: 0.3922
Epoch [12/51], Iter [320/391] Loss: 0.3581
Epoch [12/51], Iter [330/391] Loss: 0.5035
Epoch [12/51], Iter [340/391] Loss: 0.3418
Epoch [12/51], Iter [350/391] Loss: 0.4431
Epoch [12/51], Iter [360/391] Loss: 0.2796
Epoch [12/51], Iter [370/391] Loss: 0.3126
Epoch [12/51], Iter [380/391] Loss: 0.4001
Epoch [12/51], Iter [390/391] Loss: 0.2784
Epoch [13/51], Iter [10/391] Loss: 0.4139
Epoch [13/51], Iter [20/391] Loss: 0.2935
Epoch [13/51], Iter [30/391] Loss: 0.4752
Epoch [13/51], Iter [40/391] Loss: 0.4078
Epoch [13/51], Iter [50/391] Loss: 0.3310
Epoch [13/51], Iter [60/391] Loss: 0.4685
Epoch [13/51], Iter [70/391] Loss: 0.3252
Epoch [13/51], Iter [80/391] Loss: 0.3094
Epoch [13/51], Iter [90/391] Loss: 0.3161
Epoch [13/51], Iter [100/391] Loss: 0.3539
Epoch [13/51], Iter [110/391] Loss: 0.5221
Epoch [13/51], Iter [120/391] Loss: 0.3382
Epoch [13/51], Iter [130/391] Loss: 0.2548
Epoch [13/51], Iter [140/391] Loss: 0.3016
Epoch [13/51], Iter [150/391] Loss: 0.2138
Epoch [13/51], Iter [160/391] Loss: 0.3649
Epoch [13/51], Iter [170/391] Loss: 0.3692
Epoch [13/51], Iter [180/391] Loss: 0.3642
Epoch [13/51], Iter [190/391] Loss: 0.3013
Epoch [13/51], Iter [200/391] Loss: 0.3440
Epoch [13/51], Iter [210/391] Loss: 0.4246
Epoch [13/51], Iter [220/391] Loss: 0.5115
Epoch [13/51], Iter [230/391] Loss: 0.4449
Epoch [13/51], Iter [240/391] Loss: 0.4302
Epoch [13/51], Iter [250/391] Loss: 0.4328
Epoch [13/51], Iter [260/391] Loss: 0.3513
Epoch [13/51], Iter [270/391] Loss: 0.2471
Epoch [13/51], Iter [280/391] Loss: 0.2545
Epoch [13/51], Iter [290/391] Loss: 0.3478
Epoch [13/51], Iter [300/391] Loss: 0.4870
Epoch [13/51], Iter [310/391] Loss: 0.3450
Epoch [13/51], Iter [320/391] Loss: 0.4476
Epoch [13/51], Iter [330/391] Loss: 0.3684
Epoch [13/51], Iter [340/391] Loss: 0.3738
Epoch [13/51], Iter [350/391] Loss: 0.3767
Epoch [13/51], Iter [360/391] Loss: 0.2969
Epoch [13/51], Iter [370/391] Loss: 0.3959
Epoch [13/51], Iter [380/391] Loss: 0.3299
Epoch [13/51], Iter [390/391] Loss: 0.4694
Epoch [14/51], Iter [10/391] Loss: 0.3032
Epoch [14/51], Iter [20/391] Loss: 0.2793
Epoch [14/51], Iter [30/391] Loss: 0.3144
Epoch [14/51], Iter [40/391] Loss: 0.2300
Epoch [14/51], Iter [50/391] Loss: 0.2261
Epoch [14/51], Iter [60/391] Loss: 0.2470
Epoch [14/51], Iter [70/391] Loss: 0.4309
Epoch [14/51], Iter [80/391] Loss: 0.3721
Epoch [14/51], Iter [90/391] Loss: 0.4239
Epoch [14/51], Iter [100/391] Loss: 0.2909
Epoch [14/51], Iter [110/391] Loss: 0.3771
Epoch [14/51], Iter [120/391] Loss: 0.4695
Epoch [14/51], Iter [130/391] Loss: 0.3408
Epoch [14/51], Iter [140/391] Loss: 0.2663
Epoch [14/51], Iter [150/391] Loss: 0.4586
Epoch [14/51], Iter [160/391] Loss: 0.5057
Epoch [14/51], Iter [170/391] Loss: 0.3756
Epoch [14/51], Iter [180/391] Loss: 0.3088
Epoch [14/51], Iter [190/391] Loss: 0.3458
Epoch [14/51], Iter [200/391] Loss: 0.3724
Epoch [14/51], Iter [210/391] Loss: 0.3038
Epoch [14/51], Iter [220/391] Loss: 0.3837
Epoch [14/51], Iter [230/391] Loss: 0.3690
Epoch [14/51], Iter [240/391] Loss: 0.4092
Epoch [14/51], Iter [250/391] Loss: 0.3756
Epoch [14/51], Iter [260/391] Loss: 0.3520
Epoch [14/51], Iter [270/391] Loss: 0.3983
Epoch [14/51], Iter [280/391] Loss: 0.4317
Epoch [14/51], Iter [290/391] Loss: 0.3383
Epoch [14/51], Iter [300/391] Loss: 0.3108
Epoch [14/51], Iter [310/391] Loss: 0.4413
Epoch [14/51], Iter [320/391] Loss: 0.3877
Epoch [14/51], Iter [330/391] Loss: 0.2791
Epoch [14/51], Iter [340/391] Loss: 0.2712
Epoch [14/51], Iter [350/391] Loss: 0.3126
Epoch [14/51], Iter [360/391] Loss: 0.3106
Epoch [14/51], Iter [370/391] Loss: 0.3091
Epoch [14/51], Iter [380/391] Loss: 0.3145
Epoch [14/51], Iter [390/391] Loss: 0.3230
Epoch [15/51], Iter [10/391] Loss: 0.3073
Epoch [15/51], Iter [20/391] Loss: 0.2976
Epoch [15/51], Iter [30/391] Loss: 0.3931
Epoch [15/51], Iter [40/391] Loss: 0.2735
Epoch [15/51], Iter [50/391] Loss: 0.3012
Epoch [15/51], Iter [60/391] Loss: 0.4287
Epoch [15/51], Iter [70/391] Loss: 0.3975
Epoch [15/51], Iter [80/391] Loss: 0.3319
Epoch [15/51], Iter [90/391] Loss: 0.2930
Epoch [15/51], Iter [100/391] Loss: 0.3291
Epoch [15/51], Iter [110/391] Loss: 0.3632
Epoch [15/51], Iter [120/391] Loss: 0.2785
Epoch [15/51], Iter [130/391] Loss: 0.3319
Epoch [15/51], Iter [140/391] Loss: 0.3024
Epoch [15/51], Iter [150/391] Loss: 0.4363
Epoch [15/51], Iter [160/391] Loss: 0.2242
Epoch [15/51], Iter [170/391] Loss: 0.2904
Epoch [15/51], Iter [180/391] Loss: 0.3055
Epoch [15/51], Iter [190/391] Loss: 0.3076
Epoch [15/51], Iter [200/391] Loss: 0.2821
Epoch [15/51], Iter [210/391] Loss: 0.3058
Epoch [15/51], Iter [220/391] Loss: 0.3288
Epoch [15/51], Iter [230/391] Loss: 0.3111
Epoch [15/51], Iter [240/391] Loss: 0.2470
Epoch [15/51], Iter [250/391] Loss: 0.3635
Epoch [15/51], Iter [260/391] Loss: 0.2983
Epoch [15/51], Iter [270/391] Loss: 0.2078
Epoch [15/51], Iter [280/391] Loss: 0.4800
Epoch [15/51], Iter [290/391] Loss: 0.3682
Epoch [15/51], Iter [300/391] Loss: 0.3217
Epoch [15/51], Iter [310/391] Loss: 0.4123
Epoch [15/51], Iter [320/391] Loss: 0.4583
Epoch [15/51], Iter [330/391] Loss: 0.2647
Epoch [15/51], Iter [340/391] Loss: 0.4146
Epoch [15/51], Iter [350/391] Loss: 0.3001
Epoch [15/51], Iter [360/391] Loss: 0.4164
Epoch [15/51], Iter [370/391] Loss: 0.3347
Epoch [15/51], Iter [380/391] Loss: 0.3765
Epoch [15/51], Iter [390/391] Loss: 0.3078
Epoch [16/51], Iter [10/391] Loss: 0.3211
Epoch [16/51], Iter [20/391] Loss: 0.2957
Epoch [16/51], Iter [30/391] Loss: 0.4255
Epoch [16/51], Iter [40/391] Loss: 0.3164
Epoch [16/51], Iter [50/391] Loss: 0.2125
Epoch [16/51], Iter [60/391] Loss: 0.2456
Epoch [16/51], Iter [70/391] Loss: 0.2846
Epoch [16/51], Iter [80/391] Loss: 0.4225
Epoch [16/51], Iter [90/391] Loss: 0.2608
Epoch [16/51], Iter [100/391] Loss: 0.2572
Epoch [16/51], Iter [110/391] Loss: 0.2679
Epoch [16/51], Iter [120/391] Loss: 0.3201
Epoch [16/51], Iter [130/391] Loss: 0.2078
Epoch [16/51], Iter [140/391] Loss: 0.3742
Epoch [16/51], Iter [150/391] Loss: 0.3038
Epoch [16/51], Iter [160/391] Loss: 0.1934
Epoch [16/51], Iter [170/391] Loss: 0.2532
Epoch [16/51], Iter [180/391] Loss: 0.2434
Epoch [16/51], Iter [190/391] Loss: 0.3525
Epoch [16/51], Iter [200/391] Loss: 0.2994
Epoch [16/51], Iter [210/391] Loss: 0.3205
Epoch [16/51], Iter [220/391] Loss: 0.3359
Epoch [16/51], Iter [230/391] Loss: 0.3155
Epoch [16/51], Iter [240/391] Loss: 0.2287
Epoch [16/51], Iter [250/391] Loss: 0.2595
Epoch [16/51], Iter [260/391] Loss: 0.2514
Epoch [16/51], Iter [270/391] Loss: 0.2760
Epoch [16/51], Iter [280/391] Loss: 0.3841
Epoch [16/51], Iter [290/391] Loss: 0.3521
Epoch [16/51], Iter [300/391] Loss: 0.3570
Epoch [16/51], Iter [310/391] Loss: 0.2736
Epoch [16/51], Iter [320/391] Loss: 0.3355
Epoch [16/51], Iter [330/391] Loss: 0.2005
Epoch [16/51], Iter [340/391] Loss: 0.2147
Epoch [16/51], Iter [350/391] Loss: 0.2458
Epoch [16/51], Iter [360/391] Loss: 0.3633
Epoch [16/51], Iter [370/391] Loss: 0.2526
Epoch [16/51], Iter [380/391] Loss: 0.4338
Epoch [16/51], Iter [390/391] Loss: 0.3053
Epoch [17/51], Iter [10/391] Loss: 0.2132
Epoch [17/51], Iter [20/391] Loss: 0.2483
Epoch [17/51], Iter [30/391] Loss: 0.3324
Epoch [17/51], Iter [40/391] Loss: 0.2100
Epoch [17/51], Iter [50/391] Loss: 0.2621
Epoch [17/51], Iter [60/391] Loss: 0.2990
Epoch [17/51], Iter [70/391] Loss: 0.3501
Epoch [17/51], Iter [80/391] Loss: 0.2749
Epoch [17/51], Iter [90/391] Loss: 0.2886
Epoch [17/51], Iter [100/391] Loss: 0.3512
Epoch [17/51], Iter [110/391] Loss: 0.2814
Epoch [17/51], Iter [120/391] Loss: 0.2342
Epoch [17/51], Iter [130/391] Loss: 0.2463
Epoch [17/51], Iter [140/391] Loss: 0.2692
Epoch [17/51], Iter [150/391] Loss: 0.3006
Epoch [17/51], Iter [160/391] Loss: 0.2963
Epoch [17/51], Iter [170/391] Loss: 0.4115
Epoch [17/51], Iter [180/391] Loss: 0.2661
Epoch [17/51], Iter [190/391] Loss: 0.2159
Epoch [17/51], Iter [200/391] Loss: 0.3202
Epoch [17/51], Iter [210/391] Loss: 0.2067
Epoch [17/51], Iter [220/391] Loss: 0.3173
Epoch [17/51], Iter [230/391] Loss: 0.3628
Epoch [17/51], Iter [240/391] Loss: 0.3557
Epoch [17/51], Iter [250/391] Loss: 0.2393
Epoch [17/51], Iter [260/391] Loss: 0.2540
Epoch [17/51], Iter [270/391] Loss: 0.2672
Epoch [17/51], Iter [280/391] Loss: 0.3437
Epoch [17/51], Iter [290/391] Loss: 0.2203
Epoch [17/51], Iter [300/391] Loss: 0.2445
Epoch [17/51], Iter [310/391] Loss: 0.3812
Epoch [17/51], Iter [320/391] Loss: 0.3232
Epoch [17/51], Iter [330/391] Loss: 0.2091
Epoch [17/51], Iter [340/391] Loss: 0.2755
Epoch [17/51], Iter [350/391] Loss: 0.2222
Epoch [17/51], Iter [360/391] Loss: 0.2827
Epoch [17/51], Iter [370/391] Loss: 0.2796
Epoch [17/51], Iter [380/391] Loss: 0.4948
Epoch [17/51], Iter [390/391] Loss: 0.2394
Epoch [18/51], Iter [10/391] Loss: 0.2737
Epoch [18/51], Iter [20/391] Loss: 0.3301
Epoch [18/51], Iter [30/391] Loss: 0.1926
Epoch [18/51], Iter [40/391] Loss: 0.2140
Epoch [18/51], Iter [50/391] Loss: 0.2025
Epoch [18/51], Iter [60/391] Loss: 0.2034
Epoch [18/51], Iter [70/391] Loss: 0.2202
Epoch [18/51], Iter [80/391] Loss: 0.2090
Epoch [18/51], Iter [90/391] Loss: 0.3269
Epoch [18/51], Iter [100/391] Loss: 0.3544
Epoch [18/51], Iter [110/391] Loss: 0.3542
Epoch [18/51], Iter [120/391] Loss: 0.2549
Epoch [18/51], Iter [130/391] Loss: 0.2730
Epoch [18/51], Iter [140/391] Loss: 0.2037
Epoch [18/51], Iter [150/391] Loss: 0.1788
Epoch [18/51], Iter [160/391] Loss: 0.2346
Epoch [18/51], Iter [170/391] Loss: 0.3478
Epoch [18/51], Iter [180/391] Loss: 0.2977
Epoch [18/51], Iter [190/391] Loss: 0.3338
Epoch [18/51], Iter [200/391] Loss: 0.3734
Epoch [18/51], Iter [210/391] Loss: 0.2623
Epoch [18/51], Iter [220/391] Loss: 0.2320
Epoch [18/51], Iter [230/391] Loss: 0.2620
Epoch [18/51], Iter [240/391] Loss: 0.3570
Epoch [18/51], Iter [250/391] Loss: 0.4484
Epoch [18/51], Iter [260/391] Loss: 0.2784
Epoch [18/51], Iter [270/391] Loss: 0.3204
Epoch [18/51], Iter [280/391] Loss: 0.2860
Epoch [18/51], Iter [290/391] Loss: 0.3967
Epoch [18/51], Iter [300/391] Loss: 0.2787
Epoch [18/51], Iter [310/391] Loss: 0.2478
Epoch [18/51], Iter [320/391] Loss: 0.3857
Epoch [18/51], Iter [330/391] Loss: 0.3510
Epoch [18/51], Iter [340/391] Loss: 0.3203
Epoch [18/51], Iter [350/391] Loss: 0.1980
Epoch [18/51], Iter [360/391] Loss: 0.2025
Epoch [18/51], Iter [370/391] Loss: 0.3112
Epoch [18/51], Iter [380/391] Loss: 0.3464
Epoch [18/51], Iter [390/391] Loss: 0.3522
Epoch [19/51], Iter [10/391] Loss: 0.3317
Epoch [19/51], Iter [20/391] Loss: 0.3158
Epoch [19/51], Iter [30/391] Loss: 0.2677
Epoch [19/51], Iter [40/391] Loss: 0.3668
Epoch [19/51], Iter [50/391] Loss: 0.2643
Epoch [19/51], Iter [60/391] Loss: 0.2768
Epoch [19/51], Iter [70/391] Loss: 0.2109
Epoch [19/51], Iter [80/391] Loss: 0.2487
Epoch [19/51], Iter [90/391] Loss: 0.2514
Epoch [19/51], Iter [100/391] Loss: 0.2197
Epoch [19/51], Iter [110/391] Loss: 0.1692
Epoch [19/51], Iter [120/391] Loss: 0.3910
Epoch [19/51], Iter [130/391] Loss: 0.2320
Epoch [19/51], Iter [140/391] Loss: 0.2980
Epoch [19/51], Iter [150/391] Loss: 0.2313
Epoch [19/51], Iter [160/391] Loss: 0.2017
Epoch [19/51], Iter [170/391] Loss: 0.2842
Epoch [19/51], Iter [180/391] Loss: 0.2213
Epoch [19/51], Iter [190/391] Loss: 0.2147
Epoch [19/51], Iter [200/391] Loss: 0.1669
Epoch [19/51], Iter [210/391] Loss: 0.3192
Epoch [19/51], Iter [220/391] Loss: 0.2682
Epoch [19/51], Iter [230/391] Loss: 0.1932
Epoch [19/51], Iter [240/391] Loss: 0.2271
Epoch [19/51], Iter [250/391] Loss: 0.2125
Epoch [19/51], Iter [260/391] Loss: 0.3469
Epoch [19/51], Iter [270/391] Loss: 0.2639
Epoch [19/51], Iter [280/391] Loss: 0.2919
Epoch [19/51], Iter [290/391] Loss: 0.2272
Epoch [19/51], Iter [300/391] Loss: 0.2433
Epoch [19/51], Iter [310/391] Loss: 0.3183
Epoch [19/51], Iter [320/391] Loss: 0.2644
Epoch [19/51], Iter [330/391] Loss: 0.3475
Epoch [19/51], Iter [340/391] Loss: 0.1950
Epoch [19/51], Iter [350/391] Loss: 0.1931
Epoch [19/51], Iter [360/391] Loss: 0.2521
Epoch [19/51], Iter [370/391] Loss: 0.2435
Epoch [19/51], Iter [380/391] Loss: 0.1661
Epoch [19/51], Iter [390/391] Loss: 0.3829
Epoch [20/51], Iter [10/391] Loss: 0.1949
Epoch [20/51], Iter [20/391] Loss: 0.2430
Epoch [20/51], Iter [30/391] Loss: 0.2228
Epoch [20/51], Iter [40/391] Loss: 0.1211
Epoch [20/51], Iter [50/391] Loss: 0.2338
Epoch [20/51], Iter [60/391] Loss: 0.2566
Epoch [20/51], Iter [70/391] Loss: 0.2686
Epoch [20/51], Iter [80/391] Loss: 0.1191
Epoch [20/51], Iter [90/391] Loss: 0.1674
Epoch [20/51], Iter [100/391] Loss: 0.1903
Epoch [20/51], Iter [110/391] Loss: 0.2670
Epoch [20/51], Iter [120/391] Loss: 0.2240
Epoch [20/51], Iter [130/391] Loss: 0.3218
Epoch [20/51], Iter [140/391] Loss: 0.1954
Epoch [20/51], Iter [150/391] Loss: 0.2408
Epoch [20/51], Iter [160/391] Loss: 0.2242
Epoch [20/51], Iter [170/391] Loss: 0.1962
Epoch [20/51], Iter [180/391] Loss: 0.1987
Epoch [20/51], Iter [190/391] Loss: 0.2211
Epoch [20/51], Iter [200/391] Loss: 0.2574
Epoch [20/51], Iter [210/391] Loss: 0.2719
Epoch [20/51], Iter [220/391] Loss: 0.2427
Epoch [20/51], Iter [230/391] Loss: 0.3085
Epoch [20/51], Iter [240/391] Loss: 0.2258
Epoch [20/51], Iter [250/391] Loss: 0.3169
Epoch [20/51], Iter [260/391] Loss: 0.2353
Epoch [20/51], Iter [270/391] Loss: 0.3026
Epoch [20/51], Iter [280/391] Loss: 0.2752
Epoch [20/51], Iter [290/391] Loss: 0.3217
Epoch [20/51], Iter [300/391] Loss: 0.2710
Epoch [20/51], Iter [310/391] Loss: 0.3172
Epoch [20/51], Iter [320/391] Loss: 0.1832
Epoch [20/51], Iter [330/391] Loss: 0.2309
Epoch [20/51], Iter [340/391] Loss: 0.2052
Epoch [20/51], Iter [350/391] Loss: 0.1681
Epoch [20/51], Iter [360/391] Loss: 0.2561
Epoch [20/51], Iter [370/391] Loss: 0.2353
Epoch [20/51], Iter [380/391] Loss: 0.2479
Epoch [20/51], Iter [390/391] Loss: 0.2421
[Saving Checkpoint]
Epoch [21/51], Iter [10/391] Loss: 0.2840
Epoch [21/51], Iter [20/391] Loss: 0.1539
Epoch [21/51], Iter [30/391] Loss: 0.1729
Epoch [21/51], Iter [40/391] Loss: 0.2907
Epoch [21/51], Iter [50/391] Loss: 0.2068
Epoch [21/51], Iter [60/391] Loss: 0.2832
Epoch [21/51], Iter [70/391] Loss: 0.2043
Epoch [21/51], Iter [80/391] Loss: 0.1719
Epoch [21/51], Iter [90/391] Loss: 0.2159
Epoch [21/51], Iter [100/391] Loss: 0.2655
Epoch [21/51], Iter [110/391] Loss: 0.2289
Epoch [21/51], Iter [120/391] Loss: 0.3018
Epoch [21/51], Iter [130/391] Loss: 0.3564
Epoch [21/51], Iter [140/391] Loss: 0.3097
Epoch [21/51], Iter [150/391] Loss: 0.1708
Epoch [21/51], Iter [160/391] Loss: 0.2042
Epoch [21/51], Iter [170/391] Loss: 0.1748
Epoch [21/51], Iter [180/391] Loss: 0.2331
Epoch [21/51], Iter [190/391] Loss: 0.1452
Epoch [21/51], Iter [200/391] Loss: 0.1819
Epoch [21/51], Iter [210/391] Loss: 0.1770
Epoch [21/51], Iter [220/391] Loss: 0.1716
Epoch [21/51], Iter [230/391] Loss: 0.2537
Epoch [21/51], Iter [240/391] Loss: 0.2088
Epoch [21/51], Iter [250/391] Loss: 0.1798
Epoch [21/51], Iter [260/391] Loss: 0.2719
Epoch [21/51], Iter [270/391] Loss: 0.2105
Epoch [21/51], Iter [280/391] Loss: 0.2609
Epoch [21/51], Iter [290/391] Loss: 0.2754
Epoch [21/51], Iter [300/391] Loss: 0.2530
Epoch [21/51], Iter [310/391] Loss: 0.2543
Epoch [21/51], Iter [320/391] Loss: 0.2102
Epoch [21/51], Iter [330/391] Loss: 0.1896
Epoch [21/51], Iter [340/391] Loss: 0.2776
Epoch [21/51], Iter [350/391] Loss: 0.2902
Epoch [21/51], Iter [360/391] Loss: 0.2117
Epoch [21/51], Iter [370/391] Loss: 0.2326
Epoch [21/51], Iter [380/391] Loss: 0.2511
Epoch [21/51], Iter [390/391] Loss: 0.2124
Epoch [22/51], Iter [10/391] Loss: 0.1576
Epoch [22/51], Iter [20/391] Loss: 0.1589
Epoch [22/51], Iter [30/391] Loss: 0.3167
Epoch [22/51], Iter [40/391] Loss: 0.2160
Epoch [22/51], Iter [50/391] Loss: 0.2097
Epoch [22/51], Iter [60/391] Loss: 0.3254
Epoch [22/51], Iter [70/391] Loss: 0.2087
Epoch [22/51], Iter [80/391] Loss: 0.1866
Epoch [22/51], Iter [90/391] Loss: 0.1752
Epoch [22/51], Iter [100/391] Loss: 0.2244
Epoch [22/51], Iter [110/391] Loss: 0.1704
Epoch [22/51], Iter [120/391] Loss: 0.2593
Epoch [22/51], Iter [130/391] Loss: 0.2261
Epoch [22/51], Iter [140/391] Loss: 0.2828
Epoch [22/51], Iter [150/391] Loss: 0.1710
Epoch [22/51], Iter [160/391] Loss: 0.1896
Epoch [22/51], Iter [170/391] Loss: 0.2370
Epoch [22/51], Iter [180/391] Loss: 0.2110
Epoch [22/51], Iter [190/391] Loss: 0.2508
Epoch [22/51], Iter [200/391] Loss: 0.2055
Epoch [22/51], Iter [210/391] Loss: 0.2440
Epoch [22/51], Iter [220/391] Loss: 0.3124
Epoch [22/51], Iter [230/391] Loss: 0.1813
Epoch [22/51], Iter [240/391] Loss: 0.1970
Epoch [22/51], Iter [250/391] Loss: 0.2173
Epoch [22/51], Iter [260/391] Loss: 0.2775
Epoch [22/51], Iter [270/391] Loss: 0.2825
Epoch [22/51], Iter [280/391] Loss: 0.2739
Epoch [22/51], Iter [290/391] Loss: 0.2440
Epoch [22/51], Iter [300/391] Loss: 0.1993
Epoch [22/51], Iter [310/391] Loss: 0.3035
Epoch [22/51], Iter [320/391] Loss: 0.1339
Epoch [22/51], Iter [330/391] Loss: 0.2275
Epoch [22/51], Iter [340/391] Loss: 0.3402
Epoch [22/51], Iter [350/391] Loss: 0.2527
Epoch [22/51], Iter [360/391] Loss: 0.2304
Epoch [22/51], Iter [370/391] Loss: 0.2847
Epoch [22/51], Iter [380/391] Loss: 0.2586
Epoch [22/51], Iter [390/391] Loss: 0.1942
Epoch [23/51], Iter [10/391] Loss: 0.2225
Epoch [23/51], Iter [20/391] Loss: 0.2006
Epoch [23/51], Iter [30/391] Loss: 0.1564
Epoch [23/51], Iter [40/391] Loss: 0.1629
Epoch [23/51], Iter [50/391] Loss: 0.2056
Epoch [23/51], Iter [60/391] Loss: 0.1697
Epoch [23/51], Iter [70/391] Loss: 0.2368
Epoch [23/51], Iter [80/391] Loss: 0.1586
Epoch [23/51], Iter [90/391] Loss: 0.2094
Epoch [23/51], Iter [100/391] Loss: 0.3860
Epoch [23/51], Iter [110/391] Loss: 0.1634
Epoch [23/51], Iter [120/391] Loss: 0.2190
Epoch [23/51], Iter [130/391] Loss: 0.3373
Epoch [23/51], Iter [140/391] Loss: 0.1531
Epoch [23/51], Iter [150/391] Loss: 0.2296
Epoch [23/51], Iter [160/391] Loss: 0.2641
Epoch [23/51], Iter [170/391] Loss: 0.2845
Epoch [23/51], Iter [180/391] Loss: 0.1868
Epoch [23/51], Iter [190/391] Loss: 0.3164
Epoch [23/51], Iter [200/391] Loss: 0.2272
Epoch [23/51], Iter [210/391] Loss: 0.2051
Epoch [23/51], Iter [220/391] Loss: 0.1986
Epoch [23/51], Iter [230/391] Loss: 0.1806
Epoch [23/51], Iter [240/391] Loss: 0.1354
Epoch [23/51], Iter [250/391] Loss: 0.1528
Epoch [23/51], Iter [260/391] Loss: 0.3033
Epoch [23/51], Iter [270/391] Loss: 0.2021
Epoch [23/51], Iter [280/391] Loss: 0.1891
Epoch [23/51], Iter [290/391] Loss: 0.2471
Epoch [23/51], Iter [300/391] Loss: 0.2267
Epoch [23/51], Iter [310/391] Loss: 0.1739
Epoch [23/51], Iter [320/391] Loss: 0.1785
Epoch [23/51], Iter [330/391] Loss: 0.2209
Epoch [23/51], Iter [340/391] Loss: 0.2120
Epoch [23/51], Iter [350/391] Loss: 0.1800
Epoch [23/51], Iter [360/391] Loss: 0.2761
Epoch [23/51], Iter [370/391] Loss: 0.2661
Epoch [23/51], Iter [380/391] Loss: 0.2037
Epoch [23/51], Iter [390/391] Loss: 0.2558
Epoch [24/51], Iter [10/391] Loss: 0.1601
Epoch [24/51], Iter [20/391] Loss: 0.2737
Epoch [24/51], Iter [30/391] Loss: 0.3102
Epoch [24/51], Iter [40/391] Loss: 0.1906
Epoch [24/51], Iter [50/391] Loss: 0.2094
Epoch [24/51], Iter [60/391] Loss: 0.1180
Epoch [24/51], Iter [70/391] Loss: 0.2066
Epoch [24/51], Iter [80/391] Loss: 0.2320
Epoch [24/51], Iter [90/391] Loss: 0.1838
Epoch [24/51], Iter [100/391] Loss: 0.1789
Epoch [24/51], Iter [110/391] Loss: 0.1722
Epoch [24/51], Iter [120/391] Loss: 0.2068
Epoch [24/51], Iter [130/391] Loss: 0.1868
Epoch [24/51], Iter [140/391] Loss: 0.1722
Epoch [24/51], Iter [150/391] Loss: 0.2248
Epoch [24/51], Iter [160/391] Loss: 0.2907
Epoch [24/51], Iter [170/391] Loss: 0.2373
Epoch [24/51], Iter [180/391] Loss: 0.1490
Epoch [24/51], Iter [190/391] Loss: 0.2415
Epoch [24/51], Iter [200/391] Loss: 0.1456
Epoch [24/51], Iter [210/391] Loss: 0.2420
Epoch [24/51], Iter [220/391] Loss: 0.1825
Epoch [24/51], Iter [230/391] Loss: 0.2154
Epoch [24/51], Iter [240/391] Loss: 0.1529
Epoch [24/51], Iter [250/391] Loss: 0.2123
Epoch [24/51], Iter [260/391] Loss: 0.2230
Epoch [24/51], Iter [270/391] Loss: 0.2295
Epoch [24/51], Iter [280/391] Loss: 0.3090
Epoch [24/51], Iter [290/391] Loss: 0.2323
Epoch [24/51], Iter [300/391] Loss: 0.1148
Epoch [24/51], Iter [310/391] Loss: 0.1760
Epoch [24/51], Iter [320/391] Loss: 0.2239
Epoch [24/51], Iter [330/391] Loss: 0.1587
Epoch [24/51], Iter [340/391] Loss: 0.1902
Epoch [24/51], Iter [350/391] Loss: 0.1979
Epoch [24/51], Iter [360/391] Loss: 0.2588
Epoch [24/51], Iter [370/391] Loss: 0.2015
Epoch [24/51], Iter [380/391] Loss: 0.2381
Epoch [24/51], Iter [390/391] Loss: 0.1752
Epoch [25/51], Iter [10/391] Loss: 0.1775
Epoch [25/51], Iter [20/391] Loss: 0.1883
Epoch [25/51], Iter [30/391] Loss: 0.1313
Epoch [25/51], Iter [40/391] Loss: 0.1421
Epoch [25/51], Iter [50/391] Loss: 0.3248
Epoch [25/51], Iter [60/391] Loss: 0.1869
Epoch [25/51], Iter [70/391] Loss: 0.1419
Epoch [25/51], Iter [80/391] Loss: 0.1423
Epoch [25/51], Iter [90/391] Loss: 0.1530
Epoch [25/51], Iter [100/391] Loss: 0.1653
Epoch [25/51], Iter [110/391] Loss: 0.1376
Epoch [25/51], Iter [120/391] Loss: 0.1137
Epoch [25/51], Iter [130/391] Loss: 0.1570
Epoch [25/51], Iter [140/391] Loss: 0.2315
Epoch [25/51], Iter [150/391] Loss: 0.2983
Epoch [25/51], Iter [160/391] Loss: 0.2657
Epoch [25/51], Iter [170/391] Loss: 0.1346
Epoch [25/51], Iter [180/391] Loss: 0.1921
Epoch [25/51], Iter [190/391] Loss: 0.1621
Epoch [25/51], Iter [200/391] Loss: 0.1479
Epoch [25/51], Iter [210/391] Loss: 0.2298
Epoch [25/51], Iter [220/391] Loss: 0.1770
Epoch [25/51], Iter [230/391] Loss: 0.2658
Epoch [25/51], Iter [240/391] Loss: 0.1543
Epoch [25/51], Iter [250/391] Loss: 0.1121
Epoch [25/51], Iter [260/391] Loss: 0.2435
Epoch [25/51], Iter [270/391] Loss: 0.2874
Epoch [25/51], Iter [280/391] Loss: 0.1780
Epoch [25/51], Iter [290/391] Loss: 0.2636
Epoch [25/51], Iter [300/391] Loss: 0.2078
Epoch [25/51], Iter [310/391] Loss: 0.1548
Epoch [25/51], Iter [320/391] Loss: 0.2096
Epoch [25/51], Iter [330/391] Loss: 0.2676
Epoch [25/51], Iter [340/391] Loss: 0.2774
Epoch [25/51], Iter [350/391] Loss: 0.1617
Epoch [25/51], Iter [360/391] Loss: 0.1771
Epoch [25/51], Iter [370/391] Loss: 0.2269
Epoch [25/51], Iter [380/391] Loss: 0.1464
Epoch [25/51], Iter [390/391] Loss: 0.1546
Epoch [26/51], Iter [10/391] Loss: 0.1242
Epoch [26/51], Iter [20/391] Loss: 0.2002
Epoch [26/51], Iter [30/391] Loss: 0.1348
Epoch [26/51], Iter [40/391] Loss: 0.1276
Epoch [26/51], Iter [50/391] Loss: 0.1358
Epoch [26/51], Iter [60/391] Loss: 0.3175
Epoch [26/51], Iter [70/391] Loss: 0.1540
Epoch [26/51], Iter [80/391] Loss: 0.1549
Epoch [26/51], Iter [90/391] Loss: 0.1344
Epoch [26/51], Iter [100/391] Loss: 0.2982
Epoch [26/51], Iter [110/391] Loss: 0.2413
Epoch [26/51], Iter [120/391] Loss: 0.2246
Epoch [26/51], Iter [130/391] Loss: 0.1318
Epoch [26/51], Iter [140/391] Loss: 0.1885
Epoch [26/51], Iter [150/391] Loss: 0.1815
Epoch [26/51], Iter [160/391] Loss: 0.1714
Epoch [26/51], Iter [170/391] Loss: 0.1851
Epoch [26/51], Iter [180/391] Loss: 0.1797
Epoch [26/51], Iter [190/391] Loss: 0.1302
Epoch [26/51], Iter [200/391] Loss: 0.2122
Epoch [26/51], Iter [210/391] Loss: 0.1292
Epoch [26/51], Iter [220/391] Loss: 0.2397
Epoch [26/51], Iter [230/391] Loss: 0.1683
Epoch [26/51], Iter [240/391] Loss: 0.2078
Epoch [26/51], Iter [250/391] Loss: 0.1471
Epoch [26/51], Iter [260/391] Loss: 0.1573
Epoch [26/51], Iter [270/391] Loss: 0.2104
Epoch [26/51], Iter [280/391] Loss: 0.1521
Epoch [26/51], Iter [290/391] Loss: 0.1945
Epoch [26/51], Iter [300/391] Loss: 0.2227
Epoch [26/51], Iter [310/391] Loss: 0.1867
Epoch [26/51], Iter [320/391] Loss: 0.1961
Epoch [26/51], Iter [330/391] Loss: 0.1807
Epoch [26/51], Iter [340/391] Loss: 0.1845
Epoch [26/51], Iter [350/391] Loss: 0.2971
Epoch [26/51], Iter [360/391] Loss: 0.1705
Epoch [26/51], Iter [370/391] Loss: 0.2217
Epoch [26/51], Iter [380/391] Loss: 0.2304
Epoch [26/51], Iter [390/391] Loss: 0.2830
Epoch [27/51], Iter [10/391] Loss: 0.2239
Epoch [27/51], Iter [20/391] Loss: 0.2323
Epoch [27/51], Iter [30/391] Loss: 0.1689
Epoch [27/51], Iter [40/391] Loss: 0.1235
Epoch [27/51], Iter [50/391] Loss: 0.1522
Epoch [27/51], Iter [60/391] Loss: 0.1462
Epoch [27/51], Iter [70/391] Loss: 0.1506
Epoch [27/51], Iter [80/391] Loss: 0.2733
Epoch [27/51], Iter [90/391] Loss: 0.1205
Epoch [27/51], Iter [100/391] Loss: 0.1308
Epoch [27/51], Iter [110/391] Loss: 0.2126
Epoch [27/51], Iter [120/391] Loss: 0.1561
Epoch [27/51], Iter [130/391] Loss: 0.1680
Epoch [27/51], Iter [140/391] Loss: 0.1182
Epoch [27/51], Iter [150/391] Loss: 0.1823
Epoch [27/51], Iter [160/391] Loss: 0.1545
Epoch [27/51], Iter [170/391] Loss: 0.1632
Epoch [27/51], Iter [180/391] Loss: 0.2144
Epoch [27/51], Iter [190/391] Loss: 0.1682
Epoch [27/51], Iter [200/391] Loss: 0.1640
Epoch [27/51], Iter [210/391] Loss: 0.2436
Epoch [27/51], Iter [220/391] Loss: 0.1574
Epoch [27/51], Iter [230/391] Loss: 0.1846
Epoch [27/51], Iter [240/391] Loss: 0.1194
Epoch [27/51], Iter [250/391] Loss: 0.1494
Epoch [27/51], Iter [260/391] Loss: 0.1194
Epoch [27/51], Iter [270/391] Loss: 0.2307
Epoch [27/51], Iter [280/391] Loss: 0.1412
Epoch [27/51], Iter [290/391] Loss: 0.1412
Epoch [27/51], Iter [300/391] Loss: 0.1334
Epoch [27/51], Iter [310/391] Loss: 0.1567
Epoch [27/51], Iter [320/391] Loss: 0.2395
Epoch [27/51], Iter [330/391] Loss: 0.1897
Epoch [27/51], Iter [340/391] Loss: 0.1952
Epoch [27/51], Iter [350/391] Loss: 0.2048
Epoch [27/51], Iter [360/391] Loss: 0.1727
Epoch [27/51], Iter [370/391] Loss: 0.2419
Epoch [27/51], Iter [380/391] Loss: 0.2483
Epoch [27/51], Iter [390/391] Loss: 0.2372
Epoch [28/51], Iter [10/391] Loss: 0.2016
Epoch [28/51], Iter [20/391] Loss: 0.0947
Epoch [28/51], Iter [30/391] Loss: 0.1457
Epoch [28/51], Iter [40/391] Loss: 0.1599
Epoch [28/51], Iter [50/391] Loss: 0.1557
Epoch [28/51], Iter [60/391] Loss: 0.1656
Epoch [28/51], Iter [70/391] Loss: 0.1595
Epoch [28/51], Iter [80/391] Loss: 0.1057
Epoch [28/51], Iter [90/391] Loss: 0.2010
Epoch [28/51], Iter [100/391] Loss: 0.0825
Epoch [28/51], Iter [110/391] Loss: 0.1845
Epoch [28/51], Iter [120/391] Loss: 0.1211
Epoch [28/51], Iter [130/391] Loss: 0.1505
Epoch [28/51], Iter [140/391] Loss: 0.2306
Epoch [28/51], Iter [150/391] Loss: 0.1128
Epoch [28/51], Iter [160/391] Loss: 0.0826
Epoch [28/51], Iter [170/391] Loss: 0.1972
Epoch [28/51], Iter [180/391] Loss: 0.3549
Epoch [28/51], Iter [190/391] Loss: 0.1055
Epoch [28/51], Iter [200/391] Loss: 0.1512
Epoch [28/51], Iter [210/391] Loss: 0.1874
Epoch [28/51], Iter [220/391] Loss: 0.0978
Epoch [28/51], Iter [230/391] Loss: 0.0957
Epoch [28/51], Iter [240/391] Loss: 0.1555
Epoch [28/51], Iter [250/391] Loss: 0.1499
Epoch [28/51], Iter [260/391] Loss: 0.1698
Epoch [28/51], Iter [270/391] Loss: 0.1717
Epoch [28/51], Iter [280/391] Loss: 0.1341
Epoch [28/51], Iter [290/391] Loss: 0.1403
Epoch [28/51], Iter [300/391] Loss: 0.1761
Epoch [28/51], Iter [310/391] Loss: 0.2649
Epoch [28/51], Iter [320/391] Loss: 0.2703
Epoch [28/51], Iter [330/391] Loss: 0.1527
Epoch [28/51], Iter [340/391] Loss: 0.2300
Epoch [28/51], Iter [350/391] Loss: 0.2157
Epoch [28/51], Iter [360/391] Loss: 0.2482
Epoch [28/51], Iter [370/391] Loss: 0.1485
Epoch [28/51], Iter [380/391] Loss: 0.2044
Epoch [28/51], Iter [390/391] Loss: 0.1119
Epoch [29/51], Iter [10/391] Loss: 0.1961
Epoch [29/51], Iter [20/391] Loss: 0.1506
Epoch [29/51], Iter [30/391] Loss: 0.1726
Epoch [29/51], Iter [40/391] Loss: 0.1821
Epoch [29/51], Iter [50/391] Loss: 0.1705
Epoch [29/51], Iter [60/391] Loss: 0.1326
Epoch [29/51], Iter [70/391] Loss: 0.1333
Epoch [29/51], Iter [80/391] Loss: 0.1151
Epoch [29/51], Iter [90/391] Loss: 0.2014
Epoch [29/51], Iter [100/391] Loss: 0.1340
Epoch [29/51], Iter [110/391] Loss: 0.1204
Epoch [29/51], Iter [120/391] Loss: 0.2421
Epoch [29/51], Iter [130/391] Loss: 0.3064
Epoch [29/51], Iter [140/391] Loss: 0.1329
Epoch [29/51], Iter [150/391] Loss: 0.1294
Epoch [29/51], Iter [160/391] Loss: 0.1733
Epoch [29/51], Iter [170/391] Loss: 0.2263
Epoch [29/51], Iter [180/391] Loss: 0.2113
Epoch [29/51], Iter [190/391] Loss: 0.1163
Epoch [29/51], Iter [200/391] Loss: 0.1605
Epoch [29/51], Iter [210/391] Loss: 0.1144
Epoch [29/51], Iter [220/391] Loss: 0.1951
Epoch [29/51], Iter [230/391] Loss: 0.1686
Epoch [29/51], Iter [240/391] Loss: 0.1788
Epoch [29/51], Iter [250/391] Loss: 0.2004
Epoch [29/51], Iter [260/391] Loss: 0.2082
Epoch [29/51], Iter [270/391] Loss: 0.2567
Epoch [29/51], Iter [280/391] Loss: 0.1946
Epoch [29/51], Iter [290/391] Loss: 0.1558
Epoch [29/51], Iter [300/391] Loss: 0.1386
Epoch [29/51], Iter [310/391] Loss: 0.0898
Epoch [29/51], Iter [320/391] Loss: 0.1763
Epoch [29/51], Iter [330/391] Loss: 0.1401
Epoch [29/51], Iter [340/391] Loss: 0.1953
Epoch [29/51], Iter [350/391] Loss: 0.1245
Epoch [29/51], Iter [360/391] Loss: 0.1979
Epoch [29/51], Iter [370/391] Loss: 0.1223
Epoch [29/51], Iter [380/391] Loss: 0.0958
Epoch [29/51], Iter [390/391] Loss: 0.1310
Epoch [30/51], Iter [10/391] Loss: 0.1359
Epoch [30/51], Iter [20/391] Loss: 0.1035
Epoch [30/51], Iter [30/391] Loss: 0.0787
Epoch [30/51], Iter [40/391] Loss: 0.0859
Epoch [30/51], Iter [50/391] Loss: 0.1521
Epoch [30/51], Iter [60/391] Loss: 0.1699
Epoch [30/51], Iter [70/391] Loss: 0.1547
Epoch [30/51], Iter [80/391] Loss: 0.1257
Epoch [30/51], Iter [90/391] Loss: 0.1700
Epoch [30/51], Iter [100/391] Loss: 0.1689
Epoch [30/51], Iter [110/391] Loss: 0.1615
Epoch [30/51], Iter [120/391] Loss: 0.0776
Epoch [30/51], Iter [130/391] Loss: 0.1756
Epoch [30/51], Iter [140/391] Loss: 0.1167
Epoch [30/51], Iter [150/391] Loss: 0.1575
Epoch [30/51], Iter [160/391] Loss: 0.1161
Epoch [30/51], Iter [170/391] Loss: 0.1139
Epoch [30/51], Iter [180/391] Loss: 0.1849
Epoch [30/51], Iter [190/391] Loss: 0.2283
Epoch [30/51], Iter [200/391] Loss: 0.2036
Epoch [30/51], Iter [210/391] Loss: 0.1541
Epoch [30/51], Iter [220/391] Loss: 0.1978
Epoch [30/51], Iter [230/391] Loss: 0.1760
Epoch [30/51], Iter [240/391] Loss: 0.1581
Epoch [30/51], Iter [250/391] Loss: 0.1839
Epoch [30/51], Iter [260/391] Loss: 0.1493
Epoch [30/51], Iter [270/391] Loss: 0.1740
Epoch [30/51], Iter [280/391] Loss: 0.2307
Epoch [30/51], Iter [290/391] Loss: 0.1929
Epoch [30/51], Iter [300/391] Loss: 0.1820
Epoch [30/51], Iter [310/391] Loss: 0.1900
Epoch [30/51], Iter [320/391] Loss: 0.1207
Epoch [30/51], Iter [330/391] Loss: 0.1436
Epoch [30/51], Iter [340/391] Loss: 0.1504
Epoch [30/51], Iter [350/391] Loss: 0.2274
Epoch [30/51], Iter [360/391] Loss: 0.2147
Epoch [30/51], Iter [370/391] Loss: 0.1882
Epoch [30/51], Iter [380/391] Loss: 0.1941
Epoch [30/51], Iter [390/391] Loss: 0.1888
[Saving Checkpoint]
Epoch [31/51], Iter [10/391] Loss: 0.1019
Epoch [31/51], Iter [20/391] Loss: 0.0967
Epoch [31/51], Iter [30/391] Loss: 0.0966
Epoch [31/51], Iter [40/391] Loss: 0.1501
Epoch [31/51], Iter [50/391] Loss: 0.1462
Epoch [31/51], Iter [60/391] Loss: 0.1327
Epoch [31/51], Iter [70/391] Loss: 0.1232
Epoch [31/51], Iter [80/391] Loss: 0.1178
Epoch [31/51], Iter [90/391] Loss: 0.1122
Epoch [31/51], Iter [100/391] Loss: 0.1448
Epoch [31/51], Iter [110/391] Loss: 0.1517
Epoch [31/51], Iter [120/391] Loss: 0.2564
Epoch [31/51], Iter [130/391] Loss: 0.1451
Epoch [31/51], Iter [140/391] Loss: 0.0573
Epoch [31/51], Iter [150/391] Loss: 0.1962
Epoch [31/51], Iter [160/391] Loss: 0.1273
Epoch [31/51], Iter [170/391] Loss: 0.1776
Epoch [31/51], Iter [180/391] Loss: 0.1260
Epoch [31/51], Iter [190/391] Loss: 0.1311
Epoch [31/51], Iter [200/391] Loss: 0.1331
Epoch [31/51], Iter [210/391] Loss: 0.1299
Epoch [31/51], Iter [220/391] Loss: 0.0995
Epoch [31/51], Iter [230/391] Loss: 0.1627
Epoch [31/51], Iter [240/391] Loss: 0.1858
Epoch [31/51], Iter [250/391] Loss: 0.1538
Epoch [31/51], Iter [260/391] Loss: 0.1138
Epoch [31/51], Iter [270/391] Loss: 0.1375
Epoch [31/51], Iter [280/391] Loss: 0.1153
Epoch [31/51], Iter [290/391] Loss: 0.1672
Epoch [31/51], Iter [300/391] Loss: 0.2461
Epoch [31/51], Iter [310/391] Loss: 0.1098
Epoch [31/51], Iter [320/391] Loss: 0.1192
Epoch [31/51], Iter [330/391] Loss: 0.1478
Epoch [31/51], Iter [340/391] Loss: 0.0915
Epoch [31/51], Iter [350/391] Loss: 0.1116
Epoch [31/51], Iter [360/391] Loss: 0.1261
Epoch [31/51], Iter [370/391] Loss: 0.1852
Epoch [31/51], Iter [380/391] Loss: 0.2047
Epoch [31/51], Iter [390/391] Loss: 0.1474
Epoch [32/51], Iter [10/391] Loss: 0.1212
Epoch [32/51], Iter [20/391] Loss: 0.0899
Epoch [32/51], Iter [30/391] Loss: 0.0922
Epoch [32/51], Iter [40/391] Loss: 0.0825
Epoch [32/51], Iter [50/391] Loss: 0.1282
Epoch [32/51], Iter [60/391] Loss: 0.1019
Epoch [32/51], Iter [70/391] Loss: 0.0720
Epoch [32/51], Iter [80/391] Loss: 0.1000
Epoch [32/51], Iter [90/391] Loss: 0.1148
Epoch [32/51], Iter [100/391] Loss: 0.1835
Epoch [32/51], Iter [110/391] Loss: 0.0987
Epoch [32/51], Iter [120/391] Loss: 0.1309
Epoch [32/51], Iter [130/391] Loss: 0.1428
Epoch [32/51], Iter [140/391] Loss: 0.1034
Epoch [32/51], Iter [150/391] Loss: 0.1386
Epoch [32/51], Iter [160/391] Loss: 0.1548
Epoch [32/51], Iter [170/391] Loss: 0.1089
Epoch [32/51], Iter [180/391] Loss: 0.0959
Epoch [32/51], Iter [190/391] Loss: 0.1321
Epoch [32/51], Iter [200/391] Loss: 0.0733
Epoch [32/51], Iter [210/391] Loss: 0.0852
Epoch [32/51], Iter [220/391] Loss: 0.1174
Epoch [32/51], Iter [230/391] Loss: 0.1616
Epoch [32/51], Iter [240/391] Loss: 0.1167
Epoch [32/51], Iter [250/391] Loss: 0.1921
Epoch [32/51], Iter [260/391] Loss: 0.1374
Epoch [32/51], Iter [270/391] Loss: 0.1265
Epoch [32/51], Iter [280/391] Loss: 0.1258
Epoch [32/51], Iter [290/391] Loss: 0.1690
Epoch [32/51], Iter [300/391] Loss: 0.2148
Epoch [32/51], Iter [310/391] Loss: 0.1631
Epoch [32/51], Iter [320/391] Loss: 0.2101
Epoch [32/51], Iter [330/391] Loss: 0.1327
Epoch [32/51], Iter [340/391] Loss: 0.1657
Epoch [32/51], Iter [350/391] Loss: 0.1331
Epoch [32/51], Iter [360/391] Loss: 0.0635
Epoch [32/51], Iter [370/391] Loss: 0.1720
Epoch [32/51], Iter [380/391] Loss: 0.0775
Epoch [32/51], Iter [390/391] Loss: 0.1103
Epoch [33/51], Iter [10/391] Loss: 0.1452
Epoch [33/51], Iter [20/391] Loss: 0.1295
Epoch [33/51], Iter [30/391] Loss: 0.1295
Epoch [33/51], Iter [40/391] Loss: 0.0776
Epoch [33/51], Iter [50/391] Loss: 0.1551
Epoch [33/51], Iter [60/391] Loss: 0.2136
Epoch [33/51], Iter [70/391] Loss: 0.1084
Epoch [33/51], Iter [80/391] Loss: 0.1510
Epoch [33/51], Iter [90/391] Loss: 0.1040
Epoch [33/51], Iter [100/391] Loss: 0.1641
Epoch [33/51], Iter [110/391] Loss: 0.1286
Epoch [33/51], Iter [120/391] Loss: 0.1447
Epoch [33/51], Iter [130/391] Loss: 0.2033
Epoch [33/51], Iter [140/391] Loss: 0.1207
Epoch [33/51], Iter [150/391] Loss: 0.1140
Epoch [33/51], Iter [160/391] Loss: 0.2032
Epoch [33/51], Iter [170/391] Loss: 0.1402
Epoch [33/51], Iter [180/391] Loss: 0.1141
Epoch [33/51], Iter [190/391] Loss: 0.1099
Epoch [33/51], Iter [200/391] Loss: 0.1620
Epoch [33/51], Iter [210/391] Loss: 0.1623
Epoch [33/51], Iter [220/391] Loss: 0.1259
Epoch [33/51], Iter [230/391] Loss: 0.1296
Epoch [33/51], Iter [240/391] Loss: 0.1138
Epoch [33/51], Iter [250/391] Loss: 0.1194
Epoch [33/51], Iter [260/391] Loss: 0.1177
Epoch [33/51], Iter [270/391] Loss: 0.1982
Epoch [33/51], Iter [280/391] Loss: 0.1390
Epoch [33/51], Iter [290/391] Loss: 0.1610
Epoch [33/51], Iter [300/391] Loss: 0.1828
Epoch [33/51], Iter [310/391] Loss: 0.1453
Epoch [33/51], Iter [320/391] Loss: 0.1749
Epoch [33/51], Iter [330/391] Loss: 0.1176
Epoch [33/51], Iter [340/391] Loss: 0.0997
Epoch [33/51], Iter [350/391] Loss: 0.1398
Epoch [33/51], Iter [360/391] Loss: 0.1349
Epoch [33/51], Iter [370/391] Loss: 0.1384
Epoch [33/51], Iter [380/391] Loss: 0.1782
Epoch [33/51], Iter [390/391] Loss: 0.1766
Epoch [34/51], Iter [10/391] Loss: 0.1060
Epoch [34/51], Iter [20/391] Loss: 0.0568
Epoch [34/51], Iter [30/391] Loss: 0.0794
Epoch [34/51], Iter [40/391] Loss: 0.1937
Epoch [34/51], Iter [50/391] Loss: 0.0689
Epoch [34/51], Iter [60/391] Loss: 0.1019
Epoch [34/51], Iter [70/391] Loss: 0.1054
Epoch [34/51], Iter [80/391] Loss: 0.1123
Epoch [34/51], Iter [90/391] Loss: 0.0951
Epoch [34/51], Iter [100/391] Loss: 0.0912
Epoch [34/51], Iter [110/391] Loss: 0.1700
Epoch [34/51], Iter [120/391] Loss: 0.1097
Epoch [34/51], Iter [130/391] Loss: 0.1320
Epoch [34/51], Iter [140/391] Loss: 0.1244
Epoch [34/51], Iter [150/391] Loss: 0.1355
Epoch [34/51], Iter [160/391] Loss: 0.1237
Epoch [34/51], Iter [170/391] Loss: 0.1456
Epoch [34/51], Iter [180/391] Loss: 0.1577
Epoch [34/51], Iter [190/391] Loss: 0.1451
Epoch [34/51], Iter [200/391] Loss: 0.1671
Epoch [34/51], Iter [210/391] Loss: 0.0828
Epoch [34/51], Iter [220/391] Loss: 0.1269
Epoch [34/51], Iter [230/391] Loss: 0.1171
Epoch [34/51], Iter [240/391] Loss: 0.1283
Epoch [34/51], Iter [250/391] Loss: 0.0967
Epoch [34/51], Iter [260/391] Loss: 0.2144
Epoch [34/51], Iter [270/391] Loss: 0.0866
Epoch [34/51], Iter [280/391] Loss: 0.0486
Epoch [34/51], Iter [290/391] Loss: 0.1698
Epoch [34/51], Iter [300/391] Loss: 0.1194
Epoch [34/51], Iter [310/391] Loss: 0.0801
Epoch [34/51], Iter [320/391] Loss: 0.1644
Epoch [34/51], Iter [330/391] Loss: 0.1116
Epoch [34/51], Iter [340/391] Loss: 0.1178
Epoch [34/51], Iter [350/391] Loss: 0.1289
Epoch [34/51], Iter [360/391] Loss: 0.1495
Epoch [34/51], Iter [370/391] Loss: 0.0872
Epoch [34/51], Iter [380/391] Loss: 0.1242
Epoch [34/51], Iter [390/391] Loss: 0.1607
Epoch [35/51], Iter [10/391] Loss: 0.0913
Epoch [35/51], Iter [20/391] Loss: 0.1495
Epoch [35/51], Iter [30/391] Loss: 0.0756
Epoch [35/51], Iter [40/391] Loss: 0.0798
Epoch [35/51], Iter [50/391] Loss: 0.1354
Epoch [35/51], Iter [60/391] Loss: 0.1150
Epoch [35/51], Iter [70/391] Loss: 0.0584
Epoch [35/51], Iter [80/391] Loss: 0.1228
Epoch [35/51], Iter [90/391] Loss: 0.0497
Epoch [35/51], Iter [100/391] Loss: 0.0922
Epoch [35/51], Iter [110/391] Loss: 0.1618
Epoch [35/51], Iter [120/391] Loss: 0.1268
Epoch [35/51], Iter [130/391] Loss: 0.0852
Epoch [35/51], Iter [140/391] Loss: 0.1827
Epoch [35/51], Iter [150/391] Loss: 0.0715
Epoch [35/51], Iter [160/391] Loss: 0.0919
Epoch [35/51], Iter [170/391] Loss: 0.0947
Epoch [35/51], Iter [180/391] Loss: 0.2076
Epoch [35/51], Iter [190/391] Loss: 0.1071
Epoch [35/51], Iter [200/391] Loss: 0.1290
Epoch [35/51], Iter [210/391] Loss: 0.1302
Epoch [35/51], Iter [220/391] Loss: 0.1512
Epoch [35/51], Iter [230/391] Loss: 0.1070
Epoch [35/51], Iter [240/391] Loss: 0.0676
Epoch [35/51], Iter [250/391] Loss: 0.1390
Epoch [35/51], Iter [260/391] Loss: 0.1033
Epoch [35/51], Iter [270/391] Loss: 0.1460
Epoch [35/51], Iter [280/391] Loss: 0.1716
Epoch [35/51], Iter [290/391] Loss: 0.0804
Epoch [35/51], Iter [300/391] Loss: 0.0913
Epoch [35/51], Iter [310/391] Loss: 0.1078
Epoch [35/51], Iter [320/391] Loss: 0.1577
Epoch [35/51], Iter [330/391] Loss: 0.2225
Epoch [35/51], Iter [340/391] Loss: 0.1493
Epoch [35/51], Iter [350/391] Loss: 0.2129
Epoch [35/51], Iter [360/391] Loss: 0.1245
Epoch [35/51], Iter [370/391] Loss: 0.1604
Epoch [35/51], Iter [380/391] Loss: 0.1456
Epoch [35/51], Iter [390/391] Loss: 0.1279
Epoch [36/51], Iter [10/391] Loss: 0.1232
Epoch [36/51], Iter [20/391] Loss: 0.0984
Epoch [36/51], Iter [30/391] Loss: 0.0898
Epoch [36/51], Iter [40/391] Loss: 0.1080
Epoch [36/51], Iter [50/391] Loss: 0.0819
Epoch [36/51], Iter [60/391] Loss: 0.0915
Epoch [36/51], Iter [70/391] Loss: 0.0658
Epoch [36/51], Iter [80/391] Loss: 0.1923
Epoch [36/51], Iter [90/391] Loss: 0.1086
Epoch [36/51], Iter [100/391] Loss: 0.0816
Epoch [36/51], Iter [110/391] Loss: 0.1500
Epoch [36/51], Iter [120/391] Loss: 0.0799
Epoch [36/51], Iter [130/391] Loss: 0.2279
Epoch [36/51], Iter [140/391] Loss: 0.1015
Epoch [36/51], Iter [150/391] Loss: 0.1244
Epoch [36/51], Iter [160/391] Loss: 0.1259
Epoch [36/51], Iter [170/391] Loss: 0.0710
Epoch [36/51], Iter [180/391] Loss: 0.0777
Epoch [36/51], Iter [190/391] Loss: 0.1594
Epoch [36/51], Iter [200/391] Loss: 0.1241
Epoch [36/51], Iter [210/391] Loss: 0.0897
Epoch [36/51], Iter [220/391] Loss: 0.1102
Epoch [36/51], Iter [230/391] Loss: 0.0512
Epoch [36/51], Iter [240/391] Loss: 0.0855
Epoch [36/51], Iter [250/391] Loss: 0.1494
Epoch [36/51], Iter [260/391] Loss: 0.1364
Epoch [36/51], Iter [270/391] Loss: 0.1121
Epoch [36/51], Iter [280/391] Loss: 0.1362
Epoch [36/51], Iter [290/391] Loss: 0.1434
Epoch [36/51], Iter [300/391] Loss: 0.1229
Epoch [36/51], Iter [310/391] Loss: 0.1100
Epoch [36/51], Iter [320/391] Loss: 0.0675
Epoch [36/51], Iter [330/391] Loss: 0.1484
Epoch [36/51], Iter [340/391] Loss: 0.1658
Epoch [36/51], Iter [350/391] Loss: 0.0962
Epoch [36/51], Iter [360/391] Loss: 0.0883
Epoch [36/51], Iter [370/391] Loss: 0.1428
Epoch [36/51], Iter [380/391] Loss: 0.1647
Epoch [36/51], Iter [390/391] Loss: 0.1392
Epoch [37/51], Iter [10/391] Loss: 0.0645
Epoch [37/51], Iter [20/391] Loss: 0.0848
Epoch [37/51], Iter [30/391] Loss: 0.0429
Epoch [37/51], Iter [40/391] Loss: 0.1081
Epoch [37/51], Iter [50/391] Loss: 0.1323
Epoch [37/51], Iter [60/391] Loss: 0.0611
Epoch [37/51], Iter [70/391] Loss: 0.1273
Epoch [37/51], Iter [80/391] Loss: 0.0792
Epoch [37/51], Iter [90/391] Loss: 0.0891
Epoch [37/51], Iter [100/391] Loss: 0.0800
Epoch [37/51], Iter [110/391] Loss: 0.0863
Epoch [37/51], Iter [120/391] Loss: 0.1125
Epoch [37/51], Iter [130/391] Loss: 0.1304
Epoch [37/51], Iter [140/391] Loss: 0.0805
Epoch [37/51], Iter [150/391] Loss: 0.0816
Epoch [37/51], Iter [160/391] Loss: 0.1223
Epoch [37/51], Iter [170/391] Loss: 0.1188
Epoch [37/51], Iter [180/391] Loss: 0.0778
Epoch [37/51], Iter [190/391] Loss: 0.0975
Epoch [37/51], Iter [200/391] Loss: 0.0939
Epoch [37/51], Iter [210/391] Loss: 0.1365
Epoch [37/51], Iter [220/391] Loss: 0.2405
Epoch [37/51], Iter [230/391] Loss: 0.1528
Epoch [37/51], Iter [240/391] Loss: 0.1783
Epoch [37/51], Iter [250/391] Loss: 0.1225
Epoch [37/51], Iter [260/391] Loss: 0.1291
Epoch [37/51], Iter [270/391] Loss: 0.1840
Epoch [37/51], Iter [280/391] Loss: 0.0949
Epoch [37/51], Iter [290/391] Loss: 0.1112
Epoch [37/51], Iter [300/391] Loss: 0.1673
Epoch [37/51], Iter [310/391] Loss: 0.0914
Epoch [37/51], Iter [320/391] Loss: 0.0693
Epoch [37/51], Iter [330/391] Loss: 0.0665
Epoch [37/51], Iter [340/391] Loss: 0.1525
Epoch [37/51], Iter [350/391] Loss: 0.1158
Epoch [37/51], Iter [360/391] Loss: 0.0655
Epoch [37/51], Iter [370/391] Loss: 0.1639
Epoch [37/51], Iter [380/391] Loss: 0.1205
Epoch [37/51], Iter [390/391] Loss: 0.1826
Epoch [38/51], Iter [10/391] Loss: 0.2065
Epoch [38/51], Iter [20/391] Loss: 0.0826
Epoch [38/51], Iter [30/391] Loss: 0.0505
Epoch [38/51], Iter [40/391] Loss: 0.1134
Epoch [38/51], Iter [50/391] Loss: 0.0726
Epoch [38/51], Iter [60/391] Loss: 0.0959
Epoch [38/51], Iter [70/391] Loss: 0.1623
Epoch [38/51], Iter [80/391] Loss: 0.0940
Epoch [38/51], Iter [90/391] Loss: 0.0683
Epoch [38/51], Iter [100/391] Loss: 0.1488
Epoch [38/51], Iter [110/391] Loss: 0.0457
Epoch [38/51], Iter [120/391] Loss: 0.0871
Epoch [38/51], Iter [130/391] Loss: 0.0827
Epoch [38/51], Iter [140/391] Loss: 0.1592
Epoch [38/51], Iter [150/391] Loss: 0.1013
Epoch [38/51], Iter [160/391] Loss: 0.1605
Epoch [38/51], Iter [170/391] Loss: 0.0759
Epoch [38/51], Iter [180/391] Loss: 0.0762
Epoch [38/51], Iter [190/391] Loss: 0.1509
Epoch [38/51], Iter [200/391] Loss: 0.0653
Epoch [38/51], Iter [210/391] Loss: 0.0510
Epoch [38/51], Iter [220/391] Loss: 0.1338
Epoch [38/51], Iter [230/391] Loss: 0.0911
Epoch [38/51], Iter [240/391] Loss: 0.0526
Epoch [38/51], Iter [250/391] Loss: 0.0953
Epoch [38/51], Iter [260/391] Loss: 0.0939
Epoch [38/51], Iter [270/391] Loss: 0.0864
Epoch [38/51], Iter [280/391] Loss: 0.0851
Epoch [38/51], Iter [290/391] Loss: 0.1850
Epoch [38/51], Iter [300/391] Loss: 0.1144
Epoch [38/51], Iter [310/391] Loss: 0.0877
Epoch [38/51], Iter [320/391] Loss: 0.0751
Epoch [38/51], Iter [330/391] Loss: 0.1158
Epoch [38/51], Iter [340/391] Loss: 0.1258
Epoch [38/51], Iter [350/391] Loss: 0.1751
Epoch [38/51], Iter [360/391] Loss: 0.1343
Epoch [38/51], Iter [370/391] Loss: 0.0821
Epoch [38/51], Iter [380/391] Loss: 0.1022
Epoch [38/51], Iter [390/391] Loss: 0.1105
Epoch [39/51], Iter [10/391] Loss: 0.0820
Epoch [39/51], Iter [20/391] Loss: 0.0614
Epoch [39/51], Iter [30/391] Loss: 0.0654
Epoch [39/51], Iter [40/391] Loss: 0.0774
Epoch [39/51], Iter [50/391] Loss: 0.0930
Epoch [39/51], Iter [60/391] Loss: 0.0940
Epoch [39/51], Iter [70/391] Loss: 0.0622
Epoch [39/51], Iter [80/391] Loss: 0.0639
Epoch [39/51], Iter [90/391] Loss: 0.0841
Epoch [39/51], Iter [100/391] Loss: 0.0759
Epoch [39/51], Iter [110/391] Loss: 0.1290
Epoch [39/51], Iter [120/391] Loss: 0.1232
Epoch [39/51], Iter [130/391] Loss: 0.1230
Epoch [39/51], Iter [140/391] Loss: 0.1683
Epoch [39/51], Iter [150/391] Loss: 0.0986
Epoch [39/51], Iter [160/391] Loss: 0.1050
Epoch [39/51], Iter [170/391] Loss: 0.1949
Epoch [39/51], Iter [180/391] Loss: 0.0838
Epoch [39/51], Iter [190/391] Loss: 0.1138
Epoch [39/51], Iter [200/391] Loss: 0.0811
Epoch [39/51], Iter [210/391] Loss: 0.0750
Epoch [39/51], Iter [220/391] Loss: 0.0620
Epoch [39/51], Iter [230/391] Loss: 0.0785
Epoch [39/51], Iter [240/391] Loss: 0.1250
Epoch [39/51], Iter [250/391] Loss: 0.0927
Epoch [39/51], Iter [260/391] Loss: 0.0834
Epoch [39/51], Iter [270/391] Loss: 0.1294
Epoch [39/51], Iter [280/391] Loss: 0.1072
Epoch [39/51], Iter [290/391] Loss: 0.1110
Epoch [39/51], Iter [300/391] Loss: 0.0887
Epoch [39/51], Iter [310/391] Loss: 0.1065
Epoch [39/51], Iter [320/391] Loss: 0.0674
Epoch [39/51], Iter [330/391] Loss: 0.1004
Epoch [39/51], Iter [340/391] Loss: 0.0951
Epoch [39/51], Iter [350/391] Loss: 0.1620
Epoch [39/51], Iter [360/391] Loss: 0.1122
Epoch [39/51], Iter [370/391] Loss: 0.0862
Epoch [39/51], Iter [380/391] Loss: 0.1260
Epoch [39/51], Iter [390/391] Loss: 0.1692
Epoch [40/51], Iter [10/391] Loss: 0.0760
Epoch [40/51], Iter [20/391] Loss: 0.0783
Epoch [40/51], Iter [30/391] Loss: 0.1234
Epoch [40/51], Iter [40/391] Loss: 0.1185
Epoch [40/51], Iter [50/391] Loss: 0.0913
Epoch [40/51], Iter [60/391] Loss: 0.0908
Epoch [40/51], Iter [70/391] Loss: 0.1110
Epoch [40/51], Iter [80/391] Loss: 0.1024
Epoch [40/51], Iter [90/391] Loss: 0.1182
Epoch [40/51], Iter [100/391] Loss: 0.0447
Epoch [40/51], Iter [110/391] Loss: 0.0718
Epoch [40/51], Iter [120/391] Loss: 0.1036
Epoch [40/51], Iter [130/391] Loss: 0.0946
Epoch [40/51], Iter [140/391] Loss: 0.0560
Epoch [40/51], Iter [150/391] Loss: 0.1636
Epoch [40/51], Iter [160/391] Loss: 0.0900
Epoch [40/51], Iter [170/391] Loss: 0.1431
Epoch [40/51], Iter [180/391] Loss: 0.1003
Epoch [40/51], Iter [190/391] Loss: 0.1131
Epoch [40/51], Iter [200/391] Loss: 0.1043
Epoch [40/51], Iter [210/391] Loss: 0.0834
Epoch [40/51], Iter [220/391] Loss: 0.1855
Epoch [40/51], Iter [230/391] Loss: 0.1044
Epoch [40/51], Iter [240/391] Loss: 0.0953
Epoch [40/51], Iter [250/391] Loss: 0.1267
Epoch [40/51], Iter [260/391] Loss: 0.0966
Epoch [40/51], Iter [270/391] Loss: 0.0858
Epoch [40/51], Iter [280/391] Loss: 0.0621
Epoch [40/51], Iter [290/391] Loss: 0.0803
Epoch [40/51], Iter [300/391] Loss: 0.1052
Epoch [40/51], Iter [310/391] Loss: 0.0860
Epoch [40/51], Iter [320/391] Loss: 0.1120
Epoch [40/51], Iter [330/391] Loss: 0.1128
Epoch [40/51], Iter [340/391] Loss: 0.0746
Epoch [40/51], Iter [350/391] Loss: 0.0920
Epoch [40/51], Iter [360/391] Loss: 0.1423
Epoch [40/51], Iter [370/391] Loss: 0.1035
Epoch [40/51], Iter [380/391] Loss: 0.1689
Epoch [40/51], Iter [390/391] Loss: 0.0820
[Saving Checkpoint]
Epoch [41/51], Iter [10/391] Loss: 0.0969
Epoch [41/51], Iter [20/391] Loss: 0.0825
Epoch [41/51], Iter [30/391] Loss: 0.1080
Epoch [41/51], Iter [40/391] Loss: 0.0782
Epoch [41/51], Iter [50/391] Loss: 0.0719
Epoch [41/51], Iter [60/391] Loss: 0.0816
Epoch [41/51], Iter [70/391] Loss: 0.0682
Epoch [41/51], Iter [80/391] Loss: 0.0927
Epoch [41/51], Iter [90/391] Loss: 0.0899
Epoch [41/51], Iter [100/391] Loss: 0.1174
Epoch [41/51], Iter [110/391] Loss: 0.1343
Epoch [41/51], Iter [120/391] Loss: 0.0864
Epoch [41/51], Iter [130/391] Loss: 0.1131
Epoch [41/51], Iter [140/391] Loss: 0.0994
Epoch [41/51], Iter [150/391] Loss: 0.0474
Epoch [41/51], Iter [160/391] Loss: 0.0849
Epoch [41/51], Iter [170/391] Loss: 0.1581
Epoch [41/51], Iter [180/391] Loss: 0.0904
Epoch [41/51], Iter [190/391] Loss: 0.0600
Epoch [41/51], Iter [200/391] Loss: 0.0648
Epoch [41/51], Iter [210/391] Loss: 0.0709
Epoch [41/51], Iter [220/391] Loss: 0.0853
Epoch [41/51], Iter [230/391] Loss: 0.0428
Epoch [41/51], Iter [240/391] Loss: 0.1103
Epoch [41/51], Iter [250/391] Loss: 0.0721
Epoch [41/51], Iter [260/391] Loss: 0.0549
Epoch [41/51], Iter [270/391] Loss: 0.1891
Epoch [41/51], Iter [280/391] Loss: 0.0914
Epoch [41/51], Iter [290/391] Loss: 0.0770
Epoch [41/51], Iter [300/391] Loss: 0.0809
Epoch [41/51], Iter [310/391] Loss: 0.0761
Epoch [41/51], Iter [320/391] Loss: 0.1145
Epoch [41/51], Iter [330/391] Loss: 0.1701
Epoch [41/51], Iter [340/391] Loss: 0.1278
Epoch [41/51], Iter [350/391] Loss: 0.0794
Epoch [41/51], Iter [360/391] Loss: 0.0861
Epoch [41/51], Iter [370/391] Loss: 0.1264
Epoch [41/51], Iter [380/391] Loss: 0.1797
Epoch [41/51], Iter [390/391] Loss: 0.0768
Epoch [42/51], Iter [10/391] Loss: 0.0986
Epoch [42/51], Iter [20/391] Loss: 0.1005
Epoch [42/51], Iter [30/391] Loss: 0.1181
Epoch [42/51], Iter [40/391] Loss: 0.0476
Epoch [42/51], Iter [50/391] Loss: 0.0992
Epoch [42/51], Iter [60/391] Loss: 0.0615
Epoch [42/51], Iter [70/391] Loss: 0.1285
Epoch [42/51], Iter [80/391] Loss: 0.0846
Epoch [42/51], Iter [90/391] Loss: 0.1006
Epoch [42/51], Iter [100/391] Loss: 0.1089
Epoch [42/51], Iter [110/391] Loss: 0.0520
Epoch [42/51], Iter [120/391] Loss: 0.1082
Epoch [42/51], Iter [130/391] Loss: 0.0852
Epoch [42/51], Iter [140/391] Loss: 0.1041
Epoch [42/51], Iter [150/391] Loss: 0.0946
Epoch [42/51], Iter [160/391] Loss: 0.0921
Epoch [42/51], Iter [170/391] Loss: 0.1097
Epoch [42/51], Iter [180/391] Loss: 0.1285
Epoch [42/51], Iter [190/391] Loss: 0.1332
Epoch [42/51], Iter [200/391] Loss: 0.1418
Epoch [42/51], Iter [210/391] Loss: 0.0831
Epoch [42/51], Iter [220/391] Loss: 0.0725
Epoch [42/51], Iter [230/391] Loss: 0.0757
Epoch [42/51], Iter [240/391] Loss: 0.0465
Epoch [42/51], Iter [250/391] Loss: 0.1158
Epoch [42/51], Iter [260/391] Loss: 0.1116
Epoch [42/51], Iter [270/391] Loss: 0.0865
Epoch [42/51], Iter [280/391] Loss: 0.1227
Epoch [42/51], Iter [290/391] Loss: 0.0997
Epoch [42/51], Iter [300/391] Loss: 0.0968
Epoch [42/51], Iter [310/391] Loss: 0.1560
Epoch [42/51], Iter [320/391] Loss: 0.1045
Epoch [42/51], Iter [330/391] Loss: 0.1548
Epoch [42/51], Iter [340/391] Loss: 0.1771
Epoch [42/51], Iter [350/391] Loss: 0.1170
Epoch [42/51], Iter [360/391] Loss: 0.0930
Epoch [42/51], Iter [370/391] Loss: 0.1071
Epoch [42/51], Iter [380/391] Loss: 0.1427
Epoch [42/51], Iter [390/391] Loss: 0.0744
Epoch [43/51], Iter [10/391] Loss: 0.0768
Epoch [43/51], Iter [20/391] Loss: 0.0880
Epoch [43/51], Iter [30/391] Loss: 0.1036
Epoch [43/51], Iter [40/391] Loss: 0.0392
Epoch [43/51], Iter [50/391] Loss: 0.0617
Epoch [43/51], Iter [60/391] Loss: 0.0635
Epoch [43/51], Iter [70/391] Loss: 0.0995
Epoch [43/51], Iter [80/391] Loss: 0.0649
Epoch [43/51], Iter [90/391] Loss: 0.1067
Epoch [43/51], Iter [100/391] Loss: 0.1146
Epoch [43/51], Iter [110/391] Loss: 0.1294
Epoch [43/51], Iter [120/391] Loss: 0.0507
Epoch [43/51], Iter [130/391] Loss: 0.0779
Epoch [43/51], Iter [140/391] Loss: 0.1000
Epoch [43/51], Iter [150/391] Loss: 0.1337
Epoch [43/51], Iter [160/391] Loss: 0.0863
Epoch [43/51], Iter [170/391] Loss: 0.1156
Epoch [43/51], Iter [180/391] Loss: 0.0864
Epoch [43/51], Iter [190/391] Loss: 0.1170
Epoch [43/51], Iter [200/391] Loss: 0.0511
Epoch [43/51], Iter [210/391] Loss: 0.1049
Epoch [43/51], Iter [220/391] Loss: 0.0514
Epoch [43/51], Iter [230/391] Loss: 0.0996
Epoch [43/51], Iter [240/391] Loss: 0.0820
Epoch [43/51], Iter [250/391] Loss: 0.1207
Epoch [43/51], Iter [260/391] Loss: 0.0419
Epoch [43/51], Iter [270/391] Loss: 0.0667
Epoch [43/51], Iter [280/391] Loss: 0.1247
Epoch [43/51], Iter [290/391] Loss: 0.0927
Epoch [43/51], Iter [300/391] Loss: 0.0829
Epoch [43/51], Iter [310/391] Loss: 0.0896
Epoch [43/51], Iter [320/391] Loss: 0.1024
Epoch [43/51], Iter [330/391] Loss: 0.1113
Epoch [43/51], Iter [340/391] Loss: 0.0896
Epoch [43/51], Iter [350/391] Loss: 0.0861
Epoch [43/51], Iter [360/391] Loss: 0.1346
Epoch [43/51], Iter [370/391] Loss: 0.0873
Epoch [43/51], Iter [380/391] Loss: 0.1346
Epoch [43/51], Iter [390/391] Loss: 0.1456
Epoch [44/51], Iter [10/391] Loss: 0.1296
Epoch [44/51], Iter [20/391] Loss: 0.0846
Epoch [44/51], Iter [30/391] Loss: 0.1108
Epoch [44/51], Iter [40/391] Loss: 0.1107
Epoch [44/51], Iter [50/391] Loss: 0.1121
Epoch [44/51], Iter [60/391] Loss: 0.0470
Epoch [44/51], Iter [70/391] Loss: 0.0934
Epoch [44/51], Iter [80/391] Loss: 0.0348
Epoch [44/51], Iter [90/391] Loss: 0.0618
Epoch [44/51], Iter [100/391] Loss: 0.1146
Epoch [44/51], Iter [110/391] Loss: 0.1298
Epoch [44/51], Iter [120/391] Loss: 0.1084
Epoch [44/51], Iter [130/391] Loss: 0.0926
Epoch [44/51], Iter [140/391] Loss: 0.0652
Epoch [44/51], Iter [150/391] Loss: 0.1094
Epoch [44/51], Iter [160/391] Loss: 0.0553
Epoch [44/51], Iter [170/391] Loss: 0.0779
Epoch [44/51], Iter [180/391] Loss: 0.0937
Epoch [44/51], Iter [190/391] Loss: 0.0811
Epoch [44/51], Iter [200/391] Loss: 0.0406
Epoch [44/51], Iter [210/391] Loss: 0.0895
Epoch [44/51], Iter [220/391] Loss: 0.1356
Epoch [44/51], Iter [230/391] Loss: 0.0869
Epoch [44/51], Iter [240/391] Loss: 0.0744
Epoch [44/51], Iter [250/391] Loss: 0.1880
Epoch [44/51], Iter [260/391] Loss: 0.0878
Epoch [44/51], Iter [270/391] Loss: 0.1081
Epoch [44/51], Iter [280/391] Loss: 0.0932
Epoch [44/51], Iter [290/391] Loss: 0.0738
Epoch [44/51], Iter [300/391] Loss: 0.1118
Epoch [44/51], Iter [310/391] Loss: 0.0653
Epoch [44/51], Iter [320/391] Loss: 0.1322
Epoch [44/51], Iter [330/391] Loss: 0.0814
Epoch [44/51], Iter [340/391] Loss: 0.0790
Epoch [44/51], Iter [350/391] Loss: 0.0777
Epoch [44/51], Iter [360/391] Loss: 0.0857
Epoch [44/51], Iter [370/391] Loss: 0.0868
Epoch [44/51], Iter [380/391] Loss: 0.0843
Epoch [44/51], Iter [390/391] Loss: 0.0859
Epoch [45/51], Iter [10/391] Loss: 0.0997
Epoch [45/51], Iter [20/391] Loss: 0.1385
Epoch [45/51], Iter [30/391] Loss: 0.0562
Epoch [45/51], Iter [40/391] Loss: 0.0868
Epoch [45/51], Iter [50/391] Loss: 0.0583
Epoch [45/51], Iter [60/391] Loss: 0.0628
Epoch [45/51], Iter [70/391] Loss: 0.0796
Epoch [45/51], Iter [80/391] Loss: 0.0262
Epoch [45/51], Iter [90/391] Loss: 0.0249
Epoch [45/51], Iter [100/391] Loss: 0.0678
Epoch [45/51], Iter [110/391] Loss: 0.0826
Epoch [45/51], Iter [120/391] Loss: 0.0705
Epoch [45/51], Iter [130/391] Loss: 0.1174
Epoch [45/51], Iter [140/391] Loss: 0.0901
Epoch [45/51], Iter [150/391] Loss: 0.1066
Epoch [45/51], Iter [160/391] Loss: 0.0673
Epoch [45/51], Iter [170/391] Loss: 0.0861
Epoch [45/51], Iter [180/391] Loss: 0.0655
Epoch [45/51], Iter [190/391] Loss: 0.0445
Epoch [45/51], Iter [200/391] Loss: 0.0858
Epoch [45/51], Iter [210/391] Loss: 0.0852
Epoch [45/51], Iter [220/391] Loss: 0.0943
Epoch [45/51], Iter [230/391] Loss: 0.0793
Epoch [45/51], Iter [240/391] Loss: 0.0967
Epoch [45/51], Iter [250/391] Loss: 0.1143
Epoch [45/51], Iter [260/391] Loss: 0.0810
Epoch [45/51], Iter [270/391] Loss: 0.1087
Epoch [45/51], Iter [280/391] Loss: 0.0828
Epoch [45/51], Iter [290/391] Loss: 0.0997
Epoch [45/51], Iter [300/391] Loss: 0.0686
Epoch [45/51], Iter [310/391] Loss: 0.0827
Epoch [45/51], Iter [320/391] Loss: 0.1624
Epoch [45/51], Iter [330/391] Loss: 0.1034
Epoch [45/51], Iter [340/391] Loss: 0.0914
Epoch [45/51], Iter [350/391] Loss: 0.0609
Epoch [45/51], Iter [360/391] Loss: 0.1156
Epoch [45/51], Iter [370/391] Loss: 0.0795
Epoch [45/51], Iter [380/391] Loss: 0.0562
Epoch [45/51], Iter [390/391] Loss: 0.1483
Epoch [46/51], Iter [10/391] Loss: 0.1244
Epoch [46/51], Iter [20/391] Loss: 0.0594
Epoch [46/51], Iter [30/391] Loss: 0.0737
Epoch [46/51], Iter [40/391] Loss: 0.0498
Epoch [46/51], Iter [50/391] Loss: 0.0470
Epoch [46/51], Iter [60/391] Loss: 0.0914
Epoch [46/51], Iter [70/391] Loss: 0.1011
Epoch [46/51], Iter [80/391] Loss: 0.0548
Epoch [46/51], Iter [90/391] Loss: 0.0759
Epoch [46/51], Iter [100/391] Loss: 0.0848
Epoch [46/51], Iter [110/391] Loss: 0.0779
Epoch [46/51], Iter [120/391] Loss: 0.0789
Epoch [46/51], Iter [130/391] Loss: 0.1010
Epoch [46/51], Iter [140/391] Loss: 0.1218
Epoch [46/51], Iter [150/391] Loss: 0.0569
Epoch [46/51], Iter [160/391] Loss: 0.0797
Epoch [46/51], Iter [170/391] Loss: 0.0446
Epoch [46/51], Iter [180/391] Loss: 0.0489
Epoch [46/51], Iter [190/391] Loss: 0.0564
Epoch [46/51], Iter [200/391] Loss: 0.0663
Epoch [46/51], Iter [210/391] Loss: 0.0509
Epoch [46/51], Iter [220/391] Loss: 0.1387
Epoch [46/51], Iter [230/391] Loss: 0.0473
Epoch [46/51], Iter [240/391] Loss: 0.0510
Epoch [46/51], Iter [250/391] Loss: 0.0596
Epoch [46/51], Iter [260/391] Loss: 0.0615
Epoch [46/51], Iter [270/391] Loss: 0.0849
Epoch [46/51], Iter [280/391] Loss: 0.1181
Epoch [46/51], Iter [290/391] Loss: 0.0665
Epoch [46/51], Iter [300/391] Loss: 0.1003
Epoch [46/51], Iter [310/391] Loss: 0.0996
Epoch [46/51], Iter [320/391] Loss: 0.0583
Epoch [46/51], Iter [330/391] Loss: 0.0785
Epoch [46/51], Iter [340/391] Loss: 0.0760
Epoch [46/51], Iter [350/391] Loss: 0.1502
Epoch [46/51], Iter [360/391] Loss: 0.0713
Epoch [46/51], Iter [370/391] Loss: 0.1021
Epoch [46/51], Iter [380/391] Loss: 0.0947
Epoch [46/51], Iter [390/391] Loss: 0.0242
Epoch [47/51], Iter [10/391] Loss: 0.1128
Epoch [47/51], Iter [20/391] Loss: 0.0637
Epoch [47/51], Iter [30/391] Loss: 0.0668
Epoch [47/51], Iter [40/391] Loss: 0.0307
Epoch [47/51], Iter [50/391] Loss: 0.0707
Epoch [47/51], Iter [60/391] Loss: 0.0616
Epoch [47/51], Iter [70/391] Loss: 0.0680
Epoch [47/51], Iter [80/391] Loss: 0.0695
Epoch [47/51], Iter [90/391] Loss: 0.0609
Epoch [47/51], Iter [100/391] Loss: 0.0986
Epoch [47/51], Iter [110/391] Loss: 0.1098
Epoch [47/51], Iter [120/391] Loss: 0.0984
Epoch [47/51], Iter [130/391] Loss: 0.0828
Epoch [47/51], Iter [140/391] Loss: 0.0533
Epoch [47/51], Iter [150/391] Loss: 0.0771
Epoch [47/51], Iter [160/391] Loss: 0.0809
Epoch [47/51], Iter [170/391] Loss: 0.0257
Epoch [47/51], Iter [180/391] Loss: 0.0495
Epoch [47/51], Iter [190/391] Loss: 0.1240
Epoch [47/51], Iter [200/391] Loss: 0.0955
Epoch [47/51], Iter [210/391] Loss: 0.0363
Epoch [47/51], Iter [220/391] Loss: 0.0706
Epoch [47/51], Iter [230/391] Loss: 0.0605
Epoch [47/51], Iter [240/391] Loss: 0.0777
Epoch [47/51], Iter [250/391] Loss: 0.0860
Epoch [47/51], Iter [260/391] Loss: 0.0709
Epoch [47/51], Iter [270/391] Loss: 0.0655
Epoch [47/51], Iter [280/391] Loss: 0.0418
Epoch [47/51], Iter [290/391] Loss: 0.0462
Epoch [47/51], Iter [300/391] Loss: 0.0603
Epoch [47/51], Iter [310/391] Loss: 0.0710
Epoch [47/51], Iter [320/391] Loss: 0.1026
Epoch [47/51], Iter [330/391] Loss: 0.0992
Epoch [47/51], Iter [340/391] Loss: 0.0810
Epoch [47/51], Iter [350/391] Loss: 0.1301
Epoch [47/51], Iter [360/391] Loss: 0.0919
Epoch [47/51], Iter [370/391] Loss: 0.0669
Epoch [47/51], Iter [380/391] Loss: 0.0900
Epoch [47/51], Iter [390/391] Loss: 0.0901
Epoch [48/51], Iter [10/391] Loss: 0.0686
Epoch [48/51], Iter [20/391] Loss: 0.1178
Epoch [48/51], Iter [30/391] Loss: 0.0988
Epoch [48/51], Iter [40/391] Loss: 0.0796
Epoch [48/51], Iter [50/391] Loss: 0.0628
Epoch [48/51], Iter [60/391] Loss: 0.0608
Epoch [48/51], Iter [70/391] Loss: 0.0712
Epoch [48/51], Iter [80/391] Loss: 0.1118
Epoch [48/51], Iter [90/391] Loss: 0.0736
Epoch [48/51], Iter [100/391] Loss: 0.0683
Epoch [48/51], Iter [110/391] Loss: 0.0770
Epoch [48/51], Iter [120/391] Loss: 0.0729
Epoch [48/51], Iter [130/391] Loss: 0.0749
Epoch [48/51], Iter [140/391] Loss: 0.0566
Epoch [48/51], Iter [150/391] Loss: 0.0903
Epoch [48/51], Iter [160/391] Loss: 0.1059
Epoch [48/51], Iter [170/391] Loss: 0.0467
Epoch [48/51], Iter [180/391] Loss: 0.0919
Epoch [48/51], Iter [190/391] Loss: 0.0967
Epoch [48/51], Iter [200/391] Loss: 0.1006
Epoch [48/51], Iter [210/391] Loss: 0.0584
Epoch [48/51], Iter [220/391] Loss: 0.0737
Epoch [48/51], Iter [230/391] Loss: 0.0386
Epoch [48/51], Iter [240/391] Loss: 0.0892
Epoch [48/51], Iter [250/391] Loss: 0.0536
Epoch [48/51], Iter [260/391] Loss: 0.0525
Epoch [48/51], Iter [270/391] Loss: 0.0438
Epoch [48/51], Iter [280/391] Loss: 0.0487
Epoch [48/51], Iter [290/391] Loss: 0.0660
Epoch [48/51], Iter [300/391] Loss: 0.0936
Epoch [48/51], Iter [310/391] Loss: 0.0886
Epoch [48/51], Iter [320/391] Loss: 0.1018
Epoch [48/51], Iter [330/391] Loss: 0.1539
Epoch [48/51], Iter [340/391] Loss: 0.1371
Epoch [48/51], Iter [350/391] Loss: 0.1155
Epoch [48/51], Iter [360/391] Loss: 0.0763
Epoch [48/51], Iter [370/391] Loss: 0.0541
Epoch [48/51], Iter [380/391] Loss: 0.0892
Epoch [48/51], Iter [390/391] Loss: 0.1113
Epoch [49/51], Iter [10/391] Loss: 0.0529
Epoch [49/51], Iter [20/391] Loss: 0.0546
Epoch [49/51], Iter [30/391] Loss: 0.1051
Epoch [49/51], Iter [40/391] Loss: 0.0524
Epoch [49/51], Iter [50/391] Loss: 0.0677
Epoch [49/51], Iter [60/391] Loss: 0.0849
Epoch [49/51], Iter [70/391] Loss: 0.0497
Epoch [49/51], Iter [80/391] Loss: 0.0984
Epoch [49/51], Iter [90/391] Loss: 0.0619
Epoch [49/51], Iter [100/391] Loss: 0.0711
Epoch [49/51], Iter [110/391] Loss: 0.0783
Epoch [49/51], Iter [120/391] Loss: 0.0529
Epoch [49/51], Iter [130/391] Loss: 0.0494
Epoch [49/51], Iter [140/391] Loss: 0.0656
Epoch [49/51], Iter [150/391] Loss: 0.1139
Epoch [49/51], Iter [160/391] Loss: 0.0657
Epoch [49/51], Iter [170/391] Loss: 0.0513
Epoch [49/51], Iter [180/391] Loss: 0.0621
Epoch [49/51], Iter [190/391] Loss: 0.0824
Epoch [49/51], Iter [200/391] Loss: 0.0811
Epoch [49/51], Iter [210/391] Loss: 0.0443
Epoch [49/51], Iter [220/391] Loss: 0.0860
Epoch [49/51], Iter [230/391] Loss: 0.0303
Epoch [49/51], Iter [240/391] Loss: 0.1173
Epoch [49/51], Iter [250/391] Loss: 0.1354
Epoch [49/51], Iter [260/391] Loss: 0.0999
Epoch [49/51], Iter [270/391] Loss: 0.0854
Epoch [49/51], Iter [280/391] Loss: 0.0858
Epoch [49/51], Iter [290/391] Loss: 0.0788
Epoch [49/51], Iter [300/391] Loss: 0.0780
Epoch [49/51], Iter [310/391] Loss: 0.0680
Epoch [49/51], Iter [320/391] Loss: 0.0686
Epoch [49/51], Iter [330/391] Loss: 0.0601
Epoch [49/51], Iter [340/391] Loss: 0.0646
Epoch [49/51], Iter [350/391] Loss: 0.0653
Epoch [49/51], Iter [360/391] Loss: 0.1118
Epoch [49/51], Iter [370/391] Loss: 0.0953
Epoch [49/51], Iter [380/391] Loss: 0.1203
Epoch [49/51], Iter [390/391] Loss: 0.1383
Epoch [50/51], Iter [10/391] Loss: 0.0528
Epoch [50/51], Iter [20/391] Loss: 0.0736
Epoch [50/51], Iter [30/391] Loss: 0.1021
Epoch [50/51], Iter [40/391] Loss: 0.0761
Epoch [50/51], Iter [50/391] Loss: 0.0462
Epoch [50/51], Iter [60/391] Loss: 0.0659
Epoch [50/51], Iter [70/391] Loss: 0.0346
Epoch [50/51], Iter [80/391] Loss: 0.0539
Epoch [50/51], Iter [90/391] Loss: 0.0441
Epoch [50/51], Iter [100/391] Loss: 0.0802
Epoch [50/51], Iter [110/391] Loss: 0.0746
Epoch [50/51], Iter [120/391] Loss: 0.0696
Epoch [50/51], Iter [130/391] Loss: 0.1012
Epoch [50/51], Iter [140/391] Loss: 0.1286
Epoch [50/51], Iter [150/391] Loss: 0.1165
Epoch [50/51], Iter [160/391] Loss: 0.1265
Epoch [50/51], Iter [170/391] Loss: 0.0583
Epoch [50/51], Iter [180/391] Loss: 0.0702
Epoch [50/51], Iter [190/391] Loss: 0.0446
Epoch [50/51], Iter [200/391] Loss: 0.0902
Epoch [50/51], Iter [210/391] Loss: 0.1049
Epoch [50/51], Iter [220/391] Loss: 0.0396
Epoch [50/51], Iter [230/391] Loss: 0.0605
Epoch [50/51], Iter [240/391] Loss: 0.1211
Epoch [50/51], Iter [250/391] Loss: 0.0460
Epoch [50/51], Iter [260/391] Loss: 0.0752
Epoch [50/51], Iter [270/391] Loss: 0.0511
Epoch [50/51], Iter [280/391] Loss: 0.0796
Epoch [50/51], Iter [290/391] Loss: 0.0432
Epoch [50/51], Iter [300/391] Loss: 0.0655
Epoch [50/51], Iter [310/391] Loss: 0.0590
Epoch [50/51], Iter [320/391] Loss: 0.0852
Epoch [50/51], Iter [330/391] Loss: 0.0754
Epoch [50/51], Iter [340/391] Loss: 0.0795
Epoch [50/51], Iter [350/391] Loss: 0.0768
Epoch [50/51], Iter [360/391] Loss: 0.1009
Epoch [50/51], Iter [370/391] Loss: 0.0673
Epoch [50/51], Iter [380/391] Loss: 0.0844
Epoch [50/51], Iter [390/391] Loss: 0.0735
[Saving Checkpoint]
Epoch [51/51], Iter [10/391] Loss: 0.0610
Epoch [51/51], Iter [20/391] Loss: 0.1347
Epoch [51/51], Iter [30/391] Loss: 0.0737
Epoch [51/51], Iter [40/391] Loss: 0.0777
Epoch [51/51], Iter [50/391] Loss: 0.0463
Epoch [51/51], Iter [60/391] Loss: 0.0892
Epoch [51/51], Iter [70/391] Loss: 0.0423
Epoch [51/51], Iter [80/391] Loss: 0.0813
Epoch [51/51], Iter [90/391] Loss: 0.0535
Epoch [51/51], Iter [100/391] Loss: 0.0314
Epoch [51/51], Iter [110/391] Loss: 0.0791
Epoch [51/51], Iter [120/391] Loss: 0.0543
Epoch [51/51], Iter [130/391] Loss: 0.0747
Epoch [51/51], Iter [140/391] Loss: 0.0846
Epoch [51/51], Iter [150/391] Loss: 0.0709
Epoch [51/51], Iter [160/391] Loss: 0.0984
Epoch [51/51], Iter [170/391] Loss: 0.0760
Epoch [51/51], Iter [180/391] Loss: 0.0691
Epoch [51/51], Iter [190/391] Loss: 0.0411
Epoch [51/51], Iter [200/391] Loss: 0.1041
Epoch [51/51], Iter [210/391] Loss: 0.0641
Epoch [51/51], Iter [220/391] Loss: 0.0767
Epoch [51/51], Iter [230/391] Loss: 0.0724
Epoch [51/51], Iter [240/391] Loss: 0.0546
Epoch [51/51], Iter [250/391] Loss: 0.0690
Epoch [51/51], Iter [260/391] Loss: 0.0556
Epoch [51/51], Iter [270/391] Loss: 0.0958
Epoch [51/51], Iter [280/391] Loss: 0.0878
Epoch [51/51], Iter [290/391] Loss: 0.1402
Epoch [51/51], Iter [300/391] Loss: 0.0817
Epoch [51/51], Iter [310/391] Loss: 0.0787
Epoch [51/51], Iter [320/391] Loss: 0.1262
Epoch [51/51], Iter [330/391] Loss: 0.0572
Epoch [51/51], Iter [340/391] Loss: 0.1191
Epoch [51/51], Iter [350/391] Loss: 0.0856
Epoch [51/51], Iter [360/391] Loss: 0.0689
Epoch [51/51], Iter [370/391] Loss: 0.1304
Epoch [51/51], Iter [380/391] Loss: 0.1116
Epoch [51/51], Iter [390/391] Loss: 0.0913
# | a=0 | T=1 | epochs = 51 |
test_harness( testloader, resnet_child )
Accuracy of the model on the test images: 88 %
(tensor(8859, device='cuda:0'), 10000)
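The tuple printed by test_harness above appears to be (number of correctly classified test images, total number of test images), so the 88 % figure is just 8859 / 10000 rounded down. A one-line check of that arithmetic, assuming this reading of the output:
correct, total = 8859, 10000  # values taken from the test_harness output above
print('Accuracy: {:.2f} %'.format(100 * correct / total))  # 88.59 %, reported above as 88 %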
resnet_child_a0_t1_e51 = copy.deepcopy(resnet_child)  # let's save for future reference
# | a=1 | T=5 | epochs = 51 |
resnet_child, optimizer_child = get_new_child_model()
epoch = 0
kd_loss_a1_t5 = partial( knowledge_distillation_loss, alpha=1, T=5 )
training_harness( trainloader, optimizer_child, kd_loss_a1_t5, resnet_parent, resnet_child, model_name='DeepResNet_a1_t5_e51' )
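For reference, below is a minimal sketch of a Hinton-style distillation loss with the same alpha/T interface as the kd_loss_a1_t5 partial above. The notebook's actual knowledge_distillation_loss is defined earlier and its exact weighting convention may differ; in this sketch alpha weights the softened teacher term (scaled by T^2 so its gradients stay comparable across temperatures) and 1 - alpha weights the ordinary cross-entropy on the hard labels. Under that convention, a=0, T=1 reduces to the plain hard-label baseline evaluated above, while a=1, T=5 trains the student purely against the teacher's softened outputs.
import torch.nn.functional as F

def kd_loss_sketch(student_logits, teacher_logits, labels, alpha=1.0, T=5.0):
    # Soft-target term: KL divergence between temperature-softened student and
    # teacher distributions, scaled by T^2 (Hinton et al. 2015). Sketch only;
    # the notebook's knowledge_distillation_loss may be implemented differently.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction='batchmean') * (T * T)
    # Hard-target term: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard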
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
Epoch [1/51], Iter [10/391] Loss: 1.5539
Epoch [1/51], Iter [20/391] Loss: 1.2895
Epoch [1/51], Iter [30/391] Loss: 1.2595
Epoch [1/51], Iter [40/391] Loss: 1.0947
Epoch [1/51], Iter [50/391] Loss: 1.2370
Epoch [1/51], Iter [60/391] Loss: 1.1683
Epoch [1/51], Iter [70/391] Loss: 1.1829
Epoch [1/51], Iter [80/391] Loss: 1.1049
Epoch [1/51], Iter [90/391] Loss: 1.0549
Epoch [1/51], Iter [100/391] Loss: 1.1035
Epoch [1/51], Iter [110/391] Loss: 0.9451
Epoch [1/51], Iter [120/391] Loss: 1.0377
Epoch [1/51], Iter [130/391] Loss: 1.0612
Epoch [1/51], Iter [140/391] Loss: 0.8896
Epoch [1/51], Iter [150/391] Loss: 1.0050
Epoch [1/51], Iter [160/391] Loss: 0.9043
Epoch [1/51], Iter [170/391] Loss: 0.9472
Epoch [1/51], Iter [180/391] Loss: 0.8607
Epoch [1/51], Iter [190/391] Loss: 0.8388
Epoch [1/51], Iter [200/391] Loss: 0.8349
Epoch [1/51], Iter [210/391] Loss: 0.8813
Epoch [1/51], Iter [220/391] Loss: 0.7765
Epoch [1/51], Iter [230/391] Loss: 0.8249
Epoch [1/51], Iter [240/391] Loss: 0.7388
Epoch [1/51], Iter [250/391] Loss: 0.7343
Epoch [1/51], Iter [260/391] Loss: 0.7891
Epoch [1/51], Iter [270/391] Loss: 0.7792
Epoch [1/51], Iter [280/391] Loss: 0.7161
Epoch [1/51], Iter [290/391] Loss: 0.7517
Epoch [1/51], Iter [300/391] Loss: 0.7769
Epoch [1/51], Iter [310/391] Loss: 0.7508
Epoch [1/51], Iter [320/391] Loss: 0.8041
Epoch [1/51], Iter [330/391] Loss: 0.6999
Epoch [1/51], Iter [340/391] Loss: 0.7070
Epoch [1/51], Iter [350/391] Loss: 0.7430
Epoch [1/51], Iter [360/391] Loss: 0.6771
Epoch [1/51], Iter [370/391] Loss: 0.7332
Epoch [1/51], Iter [380/391] Loss: 0.7566
Epoch [1/51], Iter [390/391] Loss: 0.6466
Epoch [2/51], Iter [10/391] Loss: 0.5477
Epoch [2/51], Iter [20/391] Loss: 0.6320
Epoch [2/51], Iter [30/391] Loss: 0.7609
Epoch [2/51], Iter [40/391] Loss: 0.6844
Epoch [2/51], Iter [50/391] Loss: 0.6719
Epoch [2/51], Iter [60/391] Loss: 0.6773
Epoch [2/51], Iter [70/391] Loss: 0.6654
Epoch [2/51], Iter [80/391] Loss: 0.6369
Epoch [2/51], Iter [90/391] Loss: 0.6346
Epoch [2/51], Iter [100/391] Loss: 0.6081
Epoch [2/51], Iter [110/391] Loss: 0.6289
Epoch [2/51], Iter [120/391] Loss: 0.6895
Epoch [2/51], Iter [130/391] Loss: 0.5974
Epoch [2/51], Iter [140/391] Loss: 0.5882
Epoch [2/51], Iter [150/391] Loss: 0.6184
Epoch [2/51], Iter [160/391] Loss: 0.5577
Epoch [2/51], Iter [170/391] Loss: 0.5910
Epoch [2/51], Iter [180/391] Loss: 0.5471
Epoch [2/51], Iter [190/391] Loss: 0.5743
Epoch [2/51], Iter [200/391] Loss: 0.6097
Epoch [2/51], Iter [210/391] Loss: 0.6105
Epoch [2/51], Iter [220/391] Loss: 0.5643
Epoch [2/51], Iter [230/391] Loss: 0.5759
Epoch [2/51], Iter [240/391] Loss: 0.5589
Epoch [2/51], Iter [250/391] Loss: 0.5494
Epoch [2/51], Iter [260/391] Loss: 0.6296
Epoch [2/51], Iter [270/391] Loss: 0.5301
Epoch [2/51], Iter [280/391] Loss: 0.4870
Epoch [2/51], Iter [290/391] Loss: 0.5778
Epoch [2/51], Iter [300/391] Loss: 0.5609
Epoch [2/51], Iter [310/391] Loss: 0.5560
Epoch [2/51], Iter [320/391] Loss: 0.5554
Epoch [2/51], Iter [330/391] Loss: 0.4975
Epoch [2/51], Iter [340/391] Loss: 0.5244
Epoch [2/51], Iter [350/391] Loss: 0.5056
Epoch [2/51], Iter [360/391] Loss: 0.6225
Epoch [2/51], Iter [370/391] Loss: 0.5147
Epoch [2/51], Iter [380/391] Loss: 0.5090
Epoch [2/51], Iter [390/391] Loss: 0.4871
Epoch [3/51], Iter [10/391] Loss: 0.5398
Epoch [3/51], Iter [20/391] Loss: 0.4859
Epoch [3/51], Iter [30/391] Loss: 0.4874
Epoch [3/51], Iter [40/391] Loss: 0.4871
Epoch [3/51], Iter [50/391] Loss: 0.5128
Epoch [3/51], Iter [60/391] Loss: 0.5252
Epoch [3/51], Iter [70/391] Loss: 0.4880
Epoch [3/51], Iter [80/391] Loss: 0.5025
Epoch [3/51], Iter [90/391] Loss: 0.4940
Epoch [3/51], Iter [100/391] Loss: 0.5106
Epoch [3/51], Iter [110/391] Loss: 0.4498
Epoch [3/51], Iter [120/391] Loss: 0.4993
Epoch [3/51], Iter [130/391] Loss: 0.4998
Epoch [3/51], Iter [140/391] Loss: 0.5198
Epoch [3/51], Iter [150/391] Loss: 0.4474
Epoch [3/51], Iter [160/391] Loss: 0.5326
Epoch [3/51], Iter [170/391] Loss: 0.4571
Epoch [3/51], Iter [180/391] Loss: 0.5662
Epoch [3/51], Iter [190/391] Loss: 0.4566
Epoch [3/51], Iter [200/391] Loss: 0.4662
Epoch [3/51], Iter [210/391] Loss: 0.5020
Epoch [3/51], Iter [220/391] Loss: 0.5041
Epoch [3/51], Iter [230/391] Loss: 0.4968
Epoch [3/51], Iter [240/391] Loss: 0.5077
Epoch [3/51], Iter [250/391] Loss: 0.4754
Epoch [3/51], Iter [260/391] Loss: 0.3731
Epoch [3/51], Iter [270/391] Loss: 0.3949
Epoch [3/51], Iter [280/391] Loss: 0.4533
Epoch [3/51], Iter [290/391] Loss: 0.4443
Epoch [3/51], Iter [300/391] Loss: 0.4243
Epoch [3/51], Iter [310/391] Loss: 0.5196
Epoch [3/51], Iter [320/391] Loss: 0.4857
Epoch [3/51], Iter [330/391] Loss: 0.4660
Epoch [3/51], Iter [340/391] Loss: 0.4274
Epoch [3/51], Iter [350/391] Loss: 0.4614
Epoch [3/51], Iter [360/391] Loss: 0.4179
Epoch [3/51], Iter [370/391] Loss: 0.4182
Epoch [3/51], Iter [380/391] Loss: 0.4631
Epoch [3/51], Iter [390/391] Loss: 0.4055
Epoch [4/51], Iter [10/391] Loss: 0.3708
Epoch [4/51], Iter [20/391] Loss: 0.3466
Epoch [4/51], Iter [30/391] Loss: 0.4450
Epoch [4/51], Iter [40/391] Loss: 0.4001
Epoch [4/51], Iter [50/391] Loss: 0.4274
Epoch [4/51], Iter [60/391] Loss: 0.4213
Epoch [4/51], Iter [70/391] Loss: 0.4095
Epoch [4/51], Iter [80/391] Loss: 0.4161
Epoch [4/51], Iter [90/391] Loss: 0.4101
Epoch [4/51], Iter [100/391] Loss: 0.3877
Epoch [4/51], Iter [110/391] Loss: 0.4125
Epoch [4/51], Iter [120/391] Loss: 0.3378
Epoch [4/51], Iter [130/391] Loss: 0.3796
Epoch [4/51], Iter [140/391] Loss: 0.4044
Epoch [4/51], Iter [150/391] Loss: 0.3239
Epoch [4/51], Iter [160/391] Loss: 0.4205
Epoch [4/51], Iter [170/391] Loss: 0.3268
Epoch [4/51], Iter [180/391] Loss: 0.3613
Epoch [4/51], Iter [190/391] Loss: 0.3703
Epoch [4/51], Iter [200/391] Loss: 0.4276
Epoch [4/51], Iter [210/391] Loss: 0.3740
Epoch [4/51], Iter [220/391] Loss: 0.4131
Epoch [4/51], Iter [230/391] Loss: 0.4618
Epoch [4/51], Iter [240/391] Loss: 0.3694
Epoch [4/51], Iter [250/391] Loss: 0.3793
Epoch [4/51], Iter [260/391] Loss: 0.3620
Epoch [4/51], Iter [270/391] Loss: 0.4025
Epoch [4/51], Iter [280/391] Loss: 0.4137
Epoch [4/51], Iter [290/391] Loss: 0.3867
Epoch [4/51], Iter [300/391] Loss: 0.3508
Epoch [4/51], Iter [310/391] Loss: 0.3830
Epoch [4/51], Iter [320/391] Loss: 0.3385
Epoch [4/51], Iter [330/391] Loss: 0.3867
Epoch [4/51], Iter [340/391] Loss: 0.3803
Epoch [4/51], Iter [350/391] Loss: 0.4624
Epoch [4/51], Iter [360/391] Loss: 0.3774
Epoch [4/51], Iter [370/391] Loss: 0.3798
Epoch [4/51], Iter [380/391] Loss: 0.3187
Epoch [4/51], Iter [390/391] Loss: 0.3553
Epoch [5/51], Iter [10/391] Loss: 0.3613
Epoch [5/51], Iter [20/391] Loss: 0.3256
Epoch [5/51], Iter [30/391] Loss: 0.3671
Epoch [5/51], Iter [40/391] Loss: 0.3681
Epoch [5/51], Iter [50/391] Loss: 0.3654
Epoch [5/51], Iter [60/391] Loss: 0.3721
Epoch [5/51], Iter [70/391] Loss: 0.3544
Epoch [5/51], Iter [80/391] Loss: 0.3462
Epoch [5/51], Iter [90/391] Loss: 0.3540
Epoch [5/51], Iter [100/391] Loss: 0.3234
Epoch [5/51], Iter [110/391] Loss: 0.3411
Epoch [5/51], Iter [120/391] Loss: 0.3144
Epoch [5/51], Iter [130/391] Loss: 0.3437
Epoch [5/51], Iter [140/391] Loss: 0.3908
Epoch [5/51], Iter [150/391] Loss: 0.3197
Epoch [5/51], Iter [160/391] Loss: 0.3604
Epoch [5/51], Iter [170/391] Loss: 0.3569
Epoch [5/51], Iter [180/391] Loss: 0.3273
Epoch [5/51], Iter [190/391] Loss: 0.3352
Epoch [5/51], Iter [200/391] Loss: 0.3343
Epoch [5/51], Iter [210/391] Loss: 0.3035
Epoch [5/51], Iter [220/391] Loss: 0.3601
Epoch [5/51], Iter [230/391] Loss: 0.3321
Epoch [5/51], Iter [240/391] Loss: 0.3311
Epoch [5/51], Iter [250/391] Loss: 0.3428
Epoch [5/51], Iter [260/391] Loss: 0.3194
Epoch [5/51], Iter [270/391] Loss: 0.3175
Epoch [5/51], Iter [280/391] Loss: 0.3463
Epoch [5/51], Iter [290/391] Loss: 0.3144
Epoch [5/51], Iter [300/391] Loss: 0.3377
Epoch [5/51], Iter [310/391] Loss: 0.2621
Epoch [5/51], Iter [320/391] Loss: 0.3378
Epoch [5/51], Iter [330/391] Loss: 0.3431
Epoch [5/51], Iter [340/391] Loss: 0.3213
Epoch [5/51], Iter [350/391] Loss: 0.3430
Epoch [5/51], Iter [360/391] Loss: 0.3135
Epoch [5/51], Iter [370/391] Loss: 0.3301
Epoch [5/51], Iter [380/391] Loss: 0.3440
Epoch [5/51], Iter [390/391] Loss: 0.3146
Epoch [6/51], Iter [10/391] Loss: 0.3164
Epoch [6/51], Iter [20/391] Loss: 0.3078
Epoch [6/51], Iter [30/391] Loss: 0.3426
Epoch [6/51], Iter [40/391] Loss: 0.2865
Epoch [6/51], Iter [50/391] Loss: 0.2862
Epoch [6/51], Iter [60/391] Loss: 0.3171
Epoch [6/51], Iter [70/391] Loss: 0.3213
Epoch [6/51], Iter [80/391] Loss: 0.3148
Epoch [6/51], Iter [90/391] Loss: 0.3020
Epoch [6/51], Iter [100/391] Loss: 0.2684
Epoch [6/51], Iter [110/391] Loss: 0.3335
Epoch [6/51], Iter [120/391] Loss: 0.3011
Epoch [6/51], Iter [130/391] Loss: 0.3695
Epoch [6/51], Iter [140/391] Loss: 0.2610
Epoch [6/51], Iter [150/391] Loss: 0.3329
Epoch [6/51], Iter [160/391] Loss: 0.3176
Epoch [6/51], Iter [170/391] Loss: 0.2896
Epoch [6/51], Iter [180/391] Loss: 0.2705
Epoch [6/51], Iter [190/391] Loss: 0.2790
Epoch [6/51], Iter [200/391] Loss: 0.2570
Epoch [6/51], Iter [210/391] Loss: 0.2581
Epoch [6/51], Iter [220/391] Loss: 0.3173
Epoch [6/51], Iter [230/391] Loss: 0.2838
Epoch [6/51], Iter [240/391] Loss: 0.2576
Epoch [6/51], Iter [250/391] Loss: 0.3166
Epoch [6/51], Iter [260/391] Loss: 0.3046
Epoch [6/51], Iter [270/391] Loss: 0.3109
Epoch [6/51], Iter [280/391] Loss: 0.3419
Epoch [6/51], Iter [290/391] Loss: 0.3222
Epoch [6/51], Iter [300/391] Loss: 0.2936
Epoch [6/51], Iter [310/391] Loss: 0.3068
Epoch [6/51], Iter [320/391] Loss: 0.2996
Epoch [6/51], Iter [330/391] Loss: 0.2576
Epoch [6/51], Iter [340/391] Loss: 0.3150
Epoch [6/51], Iter [350/391] Loss: 0.3180
Epoch [6/51], Iter [360/391] Loss: 0.2530
Epoch [6/51], Iter [370/391] Loss: 0.3215
Epoch [6/51], Iter [380/391] Loss: 0.3302
Epoch [6/51], Iter [390/391] Loss: 0.2842
Epoch [7/51], Iter [10/391] Loss: 0.2807
Epoch [7/51], Iter [20/391] Loss: 0.2690
Epoch [7/51], Iter [30/391] Loss: 0.2702
Epoch [7/51], Iter [40/391] Loss: 0.2266
Epoch [7/51], Iter [50/391] Loss: 0.2488
Epoch [7/51], Iter [60/391] Loss: 0.2548
Epoch [7/51], Iter [70/391] Loss: 0.2542
Epoch [7/51], Iter [80/391] Loss: 0.2404
Epoch [7/51], Iter [90/391] Loss: 0.2921
Epoch [7/51], Iter [100/391] Loss: 0.2663
Epoch [7/51], Iter [110/391] Loss: 0.2581
Epoch [7/51], Iter [120/391] Loss: 0.2854
Epoch [7/51], Iter [130/391] Loss: 0.2973
Epoch [7/51], Iter [140/391] Loss: 0.3196
Epoch [7/51], Iter [150/391] Loss: 0.2482
Epoch [7/51], Iter [160/391] Loss: 0.3221
Epoch [7/51], Iter [170/391] Loss: 0.2796
Epoch [7/51], Iter [180/391] Loss: 0.2669
Epoch [7/51], Iter [190/391] Loss: 0.2354
Epoch [7/51], Iter [200/391] Loss: 0.3217
Epoch [7/51], Iter [210/391] Loss: 0.2468
Epoch [7/51], Iter [220/391] Loss: 0.2711
Epoch [7/51], Iter [230/391] Loss: 0.2917
Epoch [7/51], Iter [240/391] Loss: 0.2584
Epoch [7/51], Iter [250/391] Loss: 0.2419
Epoch [7/51], Iter [260/391] Loss: 0.2726
Epoch [7/51], Iter [270/391] Loss: 0.2599
Epoch [7/51], Iter [280/391] Loss: 0.2567
Epoch [7/51], Iter [290/391] Loss: 0.2456
Epoch [7/51], Iter [300/391] Loss: 0.2549
Epoch [7/51], Iter [310/391] Loss: 0.2564
Epoch [7/51], Iter [320/391] Loss: 0.2338
Epoch [7/51], Iter [330/391] Loss: 0.2652
Epoch [7/51], Iter [340/391] Loss: 0.2358
Epoch [7/51], Iter [350/391] Loss: 0.2524
Epoch [7/51], Iter [360/391] Loss: 0.2899
Epoch [7/51], Iter [370/391] Loss: 0.2634
Epoch [7/51], Iter [380/391] Loss: 0.2749
Epoch [7/51], Iter [390/391] Loss: 0.2213
Epoch [8/51], Iter [10/391] Loss: 0.2381
Epoch [8/51], Iter [20/391] Loss: 0.2790
Epoch [8/51], Iter [30/391] Loss: 0.2551
Epoch [8/51], Iter [40/391] Loss: 0.2630
Epoch [8/51], Iter [50/391] Loss: 0.2926
Epoch [8/51], Iter [60/391] Loss: 0.2570
Epoch [8/51], Iter [70/391] Loss: 0.2424
Epoch [8/51], Iter [80/391] Loss: 0.2193
Epoch [8/51], Iter [90/391] Loss: 0.2161
Epoch [8/51], Iter [100/391] Loss: 0.2951
Epoch [8/51], Iter [110/391] Loss: 0.2443
Epoch [8/51], Iter [120/391] Loss: 0.2334
Epoch [8/51], Iter [130/391] Loss: 0.2195
Epoch [8/51], Iter [140/391] Loss: 0.2216
Epoch [8/51], Iter [150/391] Loss: 0.2660
Epoch [8/51], Iter [160/391] Loss: 0.2442
Epoch [8/51], Iter [170/391] Loss: 0.2706
Epoch [8/51], Iter [180/391] Loss: 0.2716
Epoch [8/51], Iter [190/391] Loss: 0.2558
Epoch [8/51], Iter [200/391] Loss: 0.2078
Epoch [8/51], Iter [210/391] Loss: 0.2454
Epoch [8/51], Iter [220/391] Loss: 0.2042
Epoch [8/51], Iter [230/391] Loss: 0.2190
Epoch [8/51], Iter [240/391] Loss: 0.2383
Epoch [8/51], Iter [250/391] Loss: 0.2287
Epoch [8/51], Iter [260/391] Loss: 0.2788
Epoch [8/51], Iter [270/391] Loss: 0.2145
Epoch [8/51], Iter [280/391] Loss: 0.2151
Epoch [8/51], Iter [290/391] Loss: 0.2418
Epoch [8/51], Iter [300/391] Loss: 0.1974
Epoch [8/51], Iter [310/391] Loss: 0.2432
Epoch [8/51], Iter [320/391] Loss: 0.2389
Epoch [8/51], Iter [330/391] Loss: 0.2106
Epoch [8/51], Iter [340/391] Loss: 0.2376
Epoch [8/51], Iter [350/391] Loss: 0.2076
Epoch [8/51], Iter [360/391] Loss: 0.2581
Epoch [8/51], Iter [370/391] Loss: 0.2551
Epoch [8/51], Iter [380/391] Loss: 0.2233
Epoch [8/51], Iter [390/391] Loss: 0.2717
Epoch [9/51], Iter [10/391] Loss: 0.1825
Epoch [9/51], Iter [20/391] Loss: 0.2331
Epoch [9/51], Iter [30/391] Loss: 0.2213
Epoch [9/51], Iter [40/391] Loss: 0.2619
Epoch [9/51], Iter [50/391] Loss: 0.2441
Epoch [9/51], Iter [60/391] Loss: 0.2550
Epoch [9/51], Iter [70/391] Loss: 0.2073
Epoch [9/51], Iter [80/391] Loss: 0.2290
Epoch [9/51], Iter [90/391] Loss: 0.2589
Epoch [9/51], Iter [100/391] Loss: 0.2355
Epoch [9/51], Iter [110/391] Loss: 0.2454
Epoch [9/51], Iter [120/391] Loss: 0.2422
Epoch [9/51], Iter [130/391] Loss: 0.2398
Epoch [9/51], Iter [140/391] Loss: 0.2183
Epoch [9/51], Iter [150/391] Loss: 0.2026
Epoch [9/51], Iter [160/391] Loss: 0.2270
Epoch [9/51], Iter [170/391] Loss: 0.2146
Epoch [9/51], Iter [180/391] Loss: 0.2065
Epoch [9/51], Iter [190/391] Loss: 0.2709
Epoch [9/51], Iter [200/391] Loss: 0.2461
Epoch [9/51], Iter [210/391] Loss: 0.2062
Epoch [9/51], Iter [220/391] Loss: 0.2179
Epoch [9/51], Iter [230/391] Loss: 0.2305
Epoch [9/51], Iter [240/391] Loss: 0.2246
Epoch [9/51], Iter [250/391] Loss: 0.2168
Epoch [9/51], Iter [260/391] Loss: 0.2074
Epoch [9/51], Iter [270/391] Loss: 0.2182
Epoch [9/51], Iter [280/391] Loss: 0.2062
Epoch [9/51], Iter [290/391] Loss: 0.2431
Epoch [9/51], Iter [300/391] Loss: 0.2348
Epoch [9/51], Iter [310/391] Loss: 0.2147
Epoch [9/51], Iter [320/391] Loss: 0.2100
Epoch [9/51], Iter [330/391] Loss: 0.2419
Epoch [9/51], Iter [340/391] Loss: 0.2532
Epoch [9/51], Iter [350/391] Loss: 0.2092
Epoch [9/51], Iter [360/391] Loss: 0.2227
Epoch [9/51], Iter [370/391] Loss: 0.1928
Epoch [9/51], Iter [380/391] Loss: 0.2266
Epoch [9/51], Iter [390/391] Loss: 0.2091
Epoch [10/51], Iter [10/391] Loss: 0.1909
Epoch [10/51], Iter [20/391] Loss: 0.1866
Epoch [10/51], Iter [30/391] Loss: 0.2178
Epoch [10/51], Iter [40/391] Loss: 0.2077
Epoch [10/51], Iter [50/391] Loss: 0.2163
Epoch [10/51], Iter [60/391] Loss: 0.1955
Epoch [10/51], Iter [70/391] Loss: 0.1961
Epoch [10/51], Iter [80/391] Loss: 0.2292
Epoch [10/51], Iter [90/391] Loss: 0.2065
Epoch [10/51], Iter [100/391] Loss: 0.2615
Epoch [10/51], Iter [110/391] Loss: 0.2110
Epoch [10/51], Iter [120/391] Loss: 0.1844
Epoch [10/51], Iter [130/391] Loss: 0.2192
Epoch [10/51], Iter [140/391] Loss: 0.2013
Epoch [10/51], Iter [150/391] Loss: 0.2242
Epoch [10/51], Iter [160/391] Loss: 0.2355
Epoch [10/51], Iter [170/391] Loss: 0.2137
Epoch [10/51], Iter [180/391] Loss: 0.2050
Epoch [10/51], Iter [190/391] Loss: 0.1925
Epoch [10/51], Iter [200/391] Loss: 0.2030
Epoch [10/51], Iter [210/391] Loss: 0.2288
Epoch [10/51], Iter [220/391] Loss: 0.2063
Epoch [10/51], Iter [230/391] Loss: 0.2363
Epoch [10/51], Iter [240/391] Loss: 0.2074
Epoch [10/51], Iter [250/391] Loss: 0.1998
Epoch [10/51], Iter [260/391] Loss: 0.2006
Epoch [10/51], Iter [270/391] Loss: 0.2055
Epoch [10/51], Iter [280/391] Loss: 0.2534
Epoch [10/51], Iter [290/391] Loss: 0.1952
Epoch [10/51], Iter [300/391] Loss: 0.1680
Epoch [10/51], Iter [310/391] Loss: 0.1863
Epoch [10/51], Iter [320/391] Loss: 0.2120
Epoch [10/51], Iter [330/391] Loss: 0.2006
Epoch [10/51], Iter [340/391] Loss: 0.1980
Epoch [10/51], Iter [350/391] Loss: 0.2395
Epoch [10/51], Iter [360/391] Loss: 0.1888
Epoch [10/51], Iter [370/391] Loss: 0.2050
Epoch [10/51], Iter [380/391] Loss: 0.2294
Epoch [10/51], Iter [390/391] Loss: 0.1762
[Saving Checkpoint]
Epoch [11/51], Iter [10/391] Loss: 0.2014
Epoch [11/51], Iter [20/391] Loss: 0.2127
Epoch [11/51], Iter [30/391] Loss: 0.2101
Epoch [11/51], Iter [40/391] Loss: 0.2286
Epoch [11/51], Iter [50/391] Loss: 0.2047
Epoch [11/51], Iter [60/391] Loss: 0.1956
Epoch [11/51], Iter [70/391] Loss: 0.2313
Epoch [11/51], Iter [80/391] Loss: 0.2087
Epoch [11/51], Iter [90/391] Loss: 0.2058
Epoch [11/51], Iter [100/391] Loss: 0.2050
Epoch [11/51], Iter [110/391] Loss: 0.2416
Epoch [11/51], Iter [120/391] Loss: 0.2077
Epoch [11/51], Iter [130/391] Loss: 0.1983
Epoch [11/51], Iter [140/391] Loss: 0.1847
Epoch [11/51], Iter [150/391] Loss: 0.1942
Epoch [11/51], Iter [160/391] Loss: 0.2070
Epoch [11/51], Iter [170/391] Loss: 0.1982
Epoch [11/51], Iter [180/391] Loss: 0.1797
Epoch [11/51], Iter [190/391] Loss: 0.2037
Epoch [11/51], Iter [200/391] Loss: 0.2056
Epoch [11/51], Iter [210/391] Loss: 0.2210
Epoch [11/51], Iter [220/391] Loss: 0.1900
Epoch [11/51], Iter [230/391] Loss: 0.2051
Epoch [11/51], Iter [240/391] Loss: 0.1660
Epoch [11/51], Iter [250/391] Loss: 0.2081
Epoch [11/51], Iter [260/391] Loss: 0.1513
Epoch [11/51], Iter [270/391] Loss: 0.1845
Epoch [11/51], Iter [280/391] Loss: 0.1551
Epoch [11/51], Iter [290/391] Loss: 0.1771
Epoch [11/51], Iter [300/391] Loss: 0.1919
Epoch [11/51], Iter [310/391] Loss: 0.1977
Epoch [11/51], Iter [320/391] Loss: 0.1752
Epoch [11/51], Iter [330/391] Loss: 0.2047
Epoch [11/51], Iter [340/391] Loss: 0.1614
Epoch [11/51], Iter [350/391] Loss: 0.1882
Epoch [11/51], Iter [360/391] Loss: 0.1834
Epoch [11/51], Iter [370/391] Loss: 0.1899
Epoch [11/51], Iter [380/391] Loss: 0.2046
Epoch [11/51], Iter [390/391] Loss: 0.2113
Epoch [12/51], Iter [10/391] Loss: 0.2101
Epoch [12/51], Iter [20/391] Loss: 0.1810
Epoch [12/51], Iter [30/391] Loss: 0.1948
Epoch [12/51], Iter [40/391] Loss: 0.1937
Epoch [12/51], Iter [50/391] Loss: 0.1860
Epoch [12/51], Iter [60/391] Loss: 0.1745
Epoch [12/51], Iter [70/391] Loss: 0.1851
Epoch [12/51], Iter [80/391] Loss: 0.1938
Epoch [12/51], Iter [90/391] Loss: 0.1869
Epoch [12/51], Iter [100/391] Loss: 0.1615
Epoch [12/51], Iter [110/391] Loss: 0.1658
Epoch [12/51], Iter [120/391] Loss: 0.1570
Epoch [12/51], Iter [130/391] Loss: 0.2144
Epoch [12/51], Iter [140/391] Loss: 0.2355
Epoch [12/51], Iter [150/391] Loss: 0.1559
Epoch [12/51], Iter [160/391] Loss: 0.1662
Epoch [12/51], Iter [170/391] Loss: 0.1885
Epoch [12/51], Iter [180/391] Loss: 0.1945
Epoch [12/51], Iter [190/391] Loss: 0.1654
Epoch [12/51], Iter [200/391] Loss: 0.1913
Epoch [12/51], Iter [210/391] Loss: 0.1761
Epoch [12/51], Iter [220/391] Loss: 0.2174
Epoch [12/51], Iter [230/391] Loss: 0.2185
[Per-iteration loss logging truncated for readability; the final-iteration loss of each epoch and checkpoint saves are shown below]
Epoch [12/51], Iter [390/391] Loss: 0.1702
Epoch [13/51], Iter [390/391] Loss: 0.1646
Epoch [14/51], Iter [390/391] Loss: 0.1832
Epoch [15/51], Iter [390/391] Loss: 0.1479
Epoch [16/51], Iter [390/391] Loss: 0.1299
Epoch [17/51], Iter [390/391] Loss: 0.1333
Epoch [18/51], Iter [390/391] Loss: 0.1528
Epoch [19/51], Iter [390/391] Loss: 0.1168
Epoch [20/51], Iter [390/391] Loss: 0.1255
[Saving Checkpoint]
Epoch [21/51], Iter [390/391] Loss: 0.1001
Epoch [22/51], Iter [390/391] Loss: 0.1144
Epoch [23/51], Iter [390/391] Loss: 0.1101
Epoch [24/51], Iter [390/391] Loss: 0.1111
Epoch [25/51], Iter [390/391] Loss: 0.0978
Epoch [26/51], Iter [390/391] Loss: 0.1187
Epoch [27/51], Iter [390/391] Loss: 0.1011
Epoch [28/51], Iter [390/391] Loss: 0.0978
Epoch [29/51], Iter [390/391] Loss: 0.0883
Epoch [30/51], Iter [390/391] Loss: 0.0941
[Saving Checkpoint]
Epoch [31/51], Iter [390/391] Loss: 0.0894
Epoch [32/51], Iter [390/391] Loss: 0.0973
Epoch [33/51], Iter [390/391] Loss: 0.0814
Epoch [34/51], Iter [390/391] Loss: 0.0729
Epoch [35/51], Iter [390/391] Loss: 0.0848
Epoch [36/51], Iter [390/391] Loss: 0.0778
Epoch [37/51], Iter [390/391] Loss: 0.0691
Epoch [38/51], Iter [390/391] Loss: 0.0824
Epoch [39/51], Iter [390/391] Loss: 0.0831
Epoch [40/51], Iter [390/391] Loss: 0.0762
[Saving Checkpoint]
Epoch [41/51], Iter [390/391] Loss: 0.0813
Epoch [42/51], Iter [390/391] Loss: 0.0692
Epoch [43/51], Iter [390/391] Loss: 0.0665
Epoch [44/51], Iter [390/391] Loss: 0.0743
Epoch [45/51], Iter [390/391] Loss: 0.0884
Epoch [46/51], Iter [390/391] Loss: 0.0581
Epoch [47/51], Iter [390/391] Loss: 0.0559
Epoch [48/51], Iter [390/391] Loss: 0.0673
Epoch [49/51], Iter [390/391] Loss: 0.0674
Epoch [50/51], Iter [390/391] Loss: 0.0588
[Saving Checkpoint]
Epoch [51/51], Iter [10/391] Loss: 0.0686
Epoch [51/51], Iter [20/391] Loss: 0.0629
Epoch [51/51], Iter [30/391] Loss: 0.0527
Epoch [51/51], Iter [40/391] Loss: 0.0474
Epoch [51/51], Iter [50/391] Loss: 0.0602
Epoch [51/51], Iter [60/391] Loss: 0.0575
Epoch [51/51], Iter [70/391] Loss: 0.0682
Epoch [51/51], Iter [80/391] Loss: 0.0579
Epoch [51/51], Iter [90/391] Loss: 0.0647
Epoch [51/51], Iter [100/391] Loss: 0.0584
Epoch [51/51], Iter [110/391] Loss: 0.0648
Epoch [51/51], Iter [120/391] Loss: 0.0571
Epoch [51/51], Iter [130/391] Loss: 0.0557
Epoch [51/51], Iter [140/391] Loss: 0.0515
Epoch [51/51], Iter [150/391] Loss: 0.0692
Epoch [51/51], Iter [160/391] Loss: 0.0591
Epoch [51/51], Iter [170/391] Loss: 0.0471
Epoch [51/51], Iter [180/391] Loss: 0.0681
Epoch [51/51], Iter [190/391] Loss: 0.0517
Epoch [51/51], Iter [200/391] Loss: 0.0544
Epoch [51/51], Iter [210/391] Loss: 0.0597
Epoch [51/51], Iter [220/391] Loss: 0.0641
Epoch [51/51], Iter [230/391] Loss: 0.0614
Epoch [51/51], Iter [240/391] Loss: 0.0583
Epoch [51/51], Iter [250/391] Loss: 0.0590
Epoch [51/51], Iter [260/391] Loss: 0.0703
Epoch [51/51], Iter [270/391] Loss: 0.0576
Epoch [51/51], Iter [280/391] Loss: 0.0581
Epoch [51/51], Iter [290/391] Loss: 0.0540
Epoch [51/51], Iter [300/391] Loss: 0.0662
Epoch [51/51], Iter [310/391] Loss: 0.0630
Epoch [51/51], Iter [320/391] Loss: 0.0569
Epoch [51/51], Iter [330/391] Loss: 0.0761
Epoch [51/51], Iter [340/391] Loss: 0.0562
Epoch [51/51], Iter [350/391] Loss: 0.0746
Epoch [51/51], Iter [360/391] Loss: 0.0690
Epoch [51/51], Iter [370/391] Loss: 0.0687
Epoch [51/51], Iter [380/391] Loss: 0.0618
Epoch [51/51], Iter [390/391] Loss: 0.0595
# | a=1 | T=5 | epochs = 51 |
resnet_child_a1_t5_e51 = copy.deepcopy(resnet_child)  # keep a copy of the a=1, T=5 student for later comparison
test_harness( testloader, resnet_child_a1_t5_e51 )  # evaluate the distilled student on the test set
Accuracy of the model on the test images: 90 %
(tensor(9055, device='cuda:0'), 10000)
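Before re-running training with a higher temperature below, it is worth recalling what the distillation objective being partially applied looks like. The sketch that follows is only a reminder written in current PyTorch idiom; the notebook's actual knowledge_distillation_loss is defined earlier, and the function name, argument order and reduction used here are assumptions, not its exact implementation.
import torch.nn.functional as F
def kd_loss_sketch(student_logits, teacher_logits, targets, alpha=1.0, T=10.0):
    # Illustrative sketch only -- not the notebook's knowledge_distillation_loss.
    # Teacher probabilities and student log-probabilities, both softened by T.
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    student_log_probs = F.log_softmax(student_logits / T, dim=1)
    # KL divergence between the softened distributions, scaled by T^2 so the
    # soft-target gradients keep a comparable magnitude across temperatures.
    soft_loss = F.kl_div(student_log_probs, soft_targets, reduction='batchmean') * (T * T)
    # Ordinary cross-entropy against the hard labels.
    hard_loss = F.cross_entropy(student_logits, targets)
    # alpha=1 puts all of the weight on the teacher's soft targets.
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
The partial application in the next cell simply fixes alpha=1 and T=10, leaving the remaining arguments to be supplied by the training harness.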
# | a=1 | T=10 | epochs = 51 |
resnet_child, optimizer_child = get_new_child_model()  # fresh student network and optimizer
epoch = 0
kd_loss_a1_t10 = partial( knowledge_distillation_loss, alpha=1, T=10 )  # fix alpha=1 and T=10 for this run
# train the student against the parent's softened outputs for 51 epochs
training_harness( trainloader, optimizer_child, kd_loss_a1_t10, resnet_parent, resnet_child, model_name='DeepResNet_a1_t10_e51' )
[Repeated UserWarning from ipykernel_launcher.py: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.]
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
C:\Users\Eduard\Anaconda3\lib\site-packages\ipykernel_launcher.py:46: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
Epoch [1/51], Iter [10/391] Loss: 1.2352
Epoch [1/51], Iter [20/391] Loss: 1.1462
Epoch [1/51], Iter [30/391] Loss: 1.0883
Epoch [1/51], Iter [40/391] Loss: 1.0511
Epoch [1/51], Iter [50/391] Loss: 1.0525
Epoch [1/51], Iter [60/391] Loss: 0.9907
Epoch [1/51], Iter [70/391] Loss: 1.0051
Epoch [1/51], Iter [80/391] Loss: 0.9242
Epoch [1/51], Iter [90/391] Loss: 0.8790
Epoch [1/51], Iter [100/391] Loss: 0.9338
Epoch [1/51], Iter [110/391] Loss: 0.9081
Epoch [1/51], Iter [120/391] Loss: 0.8858
Epoch [1/51], Iter [130/391] Loss: 0.8146
Epoch [1/51], Iter [140/391] Loss: 0.7810
Epoch [1/51], Iter [150/391] Loss: 0.7846
Epoch [1/51], Iter [160/391] Loss: 0.7865
Epoch [1/51], Iter [170/391] Loss: 0.7582
Epoch [1/51], Iter [180/391] Loss: 0.7543
Epoch [1/51], Iter [190/391] Loss: 0.7260
Epoch [1/51], Iter [200/391] Loss: 0.7703
Epoch [1/51], Iter [210/391] Loss: 0.6977
Epoch [1/51], Iter [220/391] Loss: 0.7752
Epoch [1/51], Iter [230/391] Loss: 0.6916
Epoch [1/51], Iter [240/391] Loss: 0.6998
Epoch [1/51], Iter [250/391] Loss: 0.7489
Epoch [1/51], Iter [260/391] Loss: 0.6650
Epoch [1/51], Iter [270/391] Loss: 0.6815
Epoch [1/51], Iter [280/391] Loss: 0.7247
Epoch [1/51], Iter [290/391] Loss: 0.6767
Epoch [1/51], Iter [300/391] Loss: 0.6956
Epoch [1/51], Iter [310/391] Loss: 0.6168
Epoch [1/51], Iter [320/391] Loss: 0.6474
Epoch [1/51], Iter [330/391] Loss: 0.6698
Epoch [1/51], Iter [340/391] Loss: 0.6143
Epoch [1/51], Iter [350/391] Loss: 0.5981
Epoch [1/51], Iter [360/391] Loss: 0.6481
Epoch [1/51], Iter [370/391] Loss: 0.6421
Epoch [1/51], Iter [380/391] Loss: 0.6205
Epoch [1/51], Iter [390/391] Loss: 0.6047
Epoch [2/51], Iter [10/391] Loss: 0.6496
Epoch [2/51], Iter [20/391] Loss: 0.5868
Epoch [2/51], Iter [30/391] Loss: 0.6259
Epoch [2/51], Iter [40/391] Loss: 0.5760
Epoch [2/51], Iter [50/391] Loss: 0.6337
Epoch [2/51], Iter [60/391] Loss: 0.5380
Epoch [2/51], Iter [70/391] Loss: 0.5726
Epoch [2/51], Iter [80/391] Loss: 0.5597
Epoch [2/51], Iter [90/391] Loss: 0.5616
Epoch [2/51], Iter [100/391] Loss: 0.5152
Epoch [2/51], Iter [110/391] Loss: 0.5300
Epoch [2/51], Iter [120/391] Loss: 0.5083
Epoch [2/51], Iter [130/391] Loss: 0.5249
Epoch [2/51], Iter [140/391] Loss: 0.6317
Epoch [2/51], Iter [150/391] Loss: 0.5508
Epoch [2/51], Iter [160/391] Loss: 0.6073
Epoch [2/51], Iter [170/391] Loss: 0.5812
Epoch [2/51], Iter [180/391] Loss: 0.4504
Epoch [2/51], Iter [190/391] Loss: 0.5417
Epoch [2/51], Iter [200/391] Loss: 0.5175
Epoch [2/51], Iter [210/391] Loss: 0.5099
Epoch [2/51], Iter [220/391] Loss: 0.5250
Epoch [2/51], Iter [230/391] Loss: 0.5284
Epoch [2/51], Iter [240/391] Loss: 0.5018
Epoch [2/51], Iter [250/391] Loss: 0.5033
Epoch [2/51], Iter [260/391] Loss: 0.4816
Epoch [2/51], Iter [270/391] Loss: 0.4463
Epoch [2/51], Iter [280/391] Loss: 0.4472
Epoch [2/51], Iter [290/391] Loss: 0.4509
Epoch [2/51], Iter [300/391] Loss: 0.4576
Epoch [2/51], Iter [310/391] Loss: 0.4931
Epoch [2/51], Iter [320/391] Loss: 0.4350
Epoch [2/51], Iter [330/391] Loss: 0.4670
Epoch [2/51], Iter [340/391] Loss: 0.4870
Epoch [2/51], Iter [350/391] Loss: 0.4848
Epoch [2/51], Iter [360/391] Loss: 0.4582
Epoch [2/51], Iter [370/391] Loss: 0.4387
Epoch [2/51], Iter [380/391] Loss: 0.4720
Epoch [2/51], Iter [390/391] Loss: 0.4832
Epoch [3/51], Iter [10/391] Loss: 0.4736
Epoch [3/51], Iter [20/391] Loss: 0.4430
Epoch [3/51], Iter [30/391] Loss: 0.4626
Epoch [3/51], Iter [40/391] Loss: 0.4429
Epoch [3/51], Iter [50/391] Loss: 0.4518
Epoch [3/51], Iter [60/391] Loss: 0.4098
Epoch [3/51], Iter [70/391] Loss: 0.4435
Epoch [3/51], Iter [80/391] Loss: 0.4241
Epoch [3/51], Iter [90/391] Loss: 0.4235
Epoch [3/51], Iter [100/391] Loss: 0.4466
Epoch [3/51], Iter [110/391] Loss: 0.4213
Epoch [3/51], Iter [120/391] Loss: 0.4596
Epoch [3/51], Iter [130/391] Loss: 0.3873
Epoch [3/51], Iter [140/391] Loss: 0.3520
Epoch [3/51], Iter [150/391] Loss: 0.4216
Epoch [3/51], Iter [160/391] Loss: 0.3559
Epoch [3/51], Iter [170/391] Loss: 0.4207
Epoch [3/51], Iter [180/391] Loss: 0.3706
Epoch [3/51], Iter [190/391] Loss: 0.4661
Epoch [3/51], Iter [200/391] Loss: 0.4294
Epoch [3/51], Iter [210/391] Loss: 0.3566
Epoch [3/51], Iter [220/391] Loss: 0.4288
Epoch [3/51], Iter [230/391] Loss: 0.4264
Epoch [3/51], Iter [240/391] Loss: 0.3771
Epoch [3/51], Iter [250/391] Loss: 0.3775
Epoch [3/51], Iter [260/391] Loss: 0.4001
Epoch [3/51], Iter [270/391] Loss: 0.4091
Epoch [3/51], Iter [280/391] Loss: 0.3465
Epoch [3/51], Iter [290/391] Loss: 0.3839
Epoch [3/51], Iter [300/391] Loss: 0.4213
Epoch [3/51], Iter [310/391] Loss: 0.4016
Epoch [3/51], Iter [320/391] Loss: 0.4078
Epoch [3/51], Iter [330/391] Loss: 0.4044
Epoch [3/51], Iter [340/391] Loss: 0.3899
Epoch [3/51], Iter [350/391] Loss: 0.3316
Epoch [3/51], Iter [360/391] Loss: 0.3870
Epoch [3/51], Iter [370/391] Loss: 0.3342
Epoch [3/51], Iter [380/391] Loss: 0.4209
Epoch [3/51], Iter [390/391] Loss: 0.4058
Epoch [4/51], Iter [10/391] Loss: 0.3701
Epoch [4/51], Iter [20/391] Loss: 0.3514
Epoch [4/51], Iter [30/391] Loss: 0.3495
Epoch [4/51], Iter [40/391] Loss: 0.3662
Epoch [4/51], Iter [50/391] Loss: 0.3271
Epoch [4/51], Iter [60/391] Loss: 0.3656
Epoch [4/51], Iter [70/391] Loss: 0.3583
Epoch [4/51], Iter [80/391] Loss: 0.3589
Epoch [4/51], Iter [90/391] Loss: 0.3550
Epoch [4/51], Iter [100/391] Loss: 0.3477
Epoch [4/51], Iter [110/391] Loss: 0.3696
Epoch [4/51], Iter [120/391] Loss: 0.3639
Epoch [4/51], Iter [130/391] Loss: 0.3681
Epoch [4/51], Iter [140/391] Loss: 0.3235
Epoch [4/51], Iter [150/391] Loss: 0.3001
Epoch [4/51], Iter [160/391] Loss: 0.3080
Epoch [4/51], Iter [170/391] Loss: 0.3550
Epoch [4/51], Iter [180/391] Loss: 0.3640
Epoch [4/51], Iter [190/391] Loss: 0.3397
Epoch [4/51], Iter [200/391] Loss: 0.2849
Epoch [4/51], Iter [210/391] Loss: 0.3406
Epoch [4/51], Iter [220/391] Loss: 0.3697
Epoch [4/51], Iter [230/391] Loss: 0.3420
Epoch [4/51], Iter [240/391] Loss: 0.3192
Epoch [4/51], Iter [250/391] Loss: 0.3321
Epoch [4/51], Iter [260/391] Loss: 0.3331
Epoch [4/51], Iter [270/391] Loss: 0.3154
Epoch [4/51], Iter [280/391] Loss: 0.3310
Epoch [4/51], Iter [290/391] Loss: 0.3117
Epoch [4/51], Iter [300/391] Loss: 0.3432
Epoch [4/51], Iter [310/391] Loss: 0.3084
Epoch [4/51], Iter [320/391] Loss: 0.2941
Epoch [4/51], Iter [330/391] Loss: 0.3095
Epoch [4/51], Iter [340/391] Loss: 0.3104
Epoch [4/51], Iter [350/391] Loss: 0.2954
Epoch [4/51], Iter [360/391] Loss: 0.3372
Epoch [4/51], Iter [370/391] Loss: 0.3223
Epoch [4/51], Iter [380/391] Loss: 0.2969
Epoch [4/51], Iter [390/391] Loss: 0.3556
Epoch [5/51], Iter [10/391] Loss: 0.3114
Epoch [5/51], Iter [20/391] Loss: 0.3752
Epoch [5/51], Iter [30/391] Loss: 0.3184
Epoch [5/51], Iter [40/391] Loss: 0.3330
Epoch [5/51], Iter [50/391] Loss: 0.3326
Epoch [5/51], Iter [60/391] Loss: 0.2694
Epoch [5/51], Iter [70/391] Loss: 0.2642
Epoch [5/51], Iter [80/391] Loss: 0.2699
Epoch [5/51], Iter [90/391] Loss: 0.2789
Epoch [5/51], Iter [100/391] Loss: 0.3129
Epoch [5/51], Iter [110/391] Loss: 0.2801
Epoch [5/51], Iter [120/391] Loss: 0.3270
Epoch [5/51], Iter [130/391] Loss: 0.2770
Epoch [5/51], Iter [140/391] Loss: 0.2591
Epoch [5/51], Iter [150/391] Loss: 0.3086
Epoch [5/51], Iter [160/391] Loss: 0.2949
Epoch [5/51], Iter [170/391] Loss: 0.2698
Epoch [5/51], Iter [180/391] Loss: 0.2919
Epoch [5/51], Iter [190/391] Loss: 0.2660
Epoch [5/51], Iter [200/391] Loss: 0.2753
Epoch [5/51], Iter [210/391] Loss: 0.2955
Epoch [5/51], Iter [220/391] Loss: 0.2707
Epoch [5/51], Iter [230/391] Loss: 0.2569
Epoch [5/51], Iter [240/391] Loss: 0.3298
Epoch [5/51], Iter [250/391] Loss: 0.3037
Epoch [5/51], Iter [260/391] Loss: 0.2933
Epoch [5/51], Iter [270/391] Loss: 0.2730
Epoch [5/51], Iter [280/391] Loss: 0.2780
Epoch [5/51], Iter [290/391] Loss: 0.2669
Epoch [5/51], Iter [300/391] Loss: 0.2715
Epoch [5/51], Iter [310/391] Loss: 0.3153
Epoch [5/51], Iter [320/391] Loss: 0.2936
Epoch [5/51], Iter [330/391] Loss: 0.2929
Epoch [5/51], Iter [340/391] Loss: 0.2510
Epoch [5/51], Iter [350/391] Loss: 0.2519
Epoch [5/51], Iter [360/391] Loss: 0.3046
Epoch [5/51], Iter [370/391] Loss: 0.3453
Epoch [5/51], Iter [380/391] Loss: 0.2950
Epoch [5/51], Iter [390/391] Loss: 0.2628
Epoch [6/51], Iter [10/391] Loss: 0.2779
Epoch [6/51], Iter [20/391] Loss: 0.2645
Epoch [6/51], Iter [30/391] Loss: 0.2968
Epoch [6/51], Iter [40/391] Loss: 0.2508
Epoch [6/51], Iter [50/391] Loss: 0.2603
Epoch [6/51], Iter [60/391] Loss: 0.2478
Epoch [6/51], Iter [70/391] Loss: 0.2305
Epoch [6/51], Iter [80/391] Loss: 0.2521
Epoch [6/51], Iter [90/391] Loss: 0.2809
Epoch [6/51], Iter [100/391] Loss: 0.2863
Epoch [6/51], Iter [110/391] Loss: 0.2964
Epoch [6/51], Iter [120/391] Loss: 0.2393
Epoch [6/51], Iter [130/391] Loss: 0.2608
Epoch [6/51], Iter [140/391] Loss: 0.2913
Epoch [6/51], Iter [150/391] Loss: 0.2623
Epoch [6/51], Iter [160/391] Loss: 0.2451
Epoch [6/51], Iter [170/391] Loss: 0.2644
Epoch [6/51], Iter [180/391] Loss: 0.2369
Epoch [6/51], Iter [190/391] Loss: 0.3021
Epoch [6/51], Iter [200/391] Loss: 0.2543
Epoch [6/51], Iter [210/391] Loss: 0.2842
Epoch [6/51], Iter [220/391] Loss: 0.2494
Epoch [6/51], Iter [230/391] Loss: 0.2807
Epoch [6/51], Iter [240/391] Loss: 0.2556
Epoch [6/51], Iter [250/391] Loss: 0.2640
Epoch [6/51], Iter [260/391] Loss: 0.2775
Epoch [6/51], Iter [270/391] Loss: 0.2385
Epoch [6/51], Iter [280/391] Loss: 0.2698
Epoch [6/51], Iter [290/391] Loss: 0.2752
Epoch [6/51], Iter [300/391] Loss: 0.2349
Epoch [6/51], Iter [310/391] Loss: 0.2049
Epoch [6/51], Iter [320/391] Loss: 0.2223
Epoch [6/51], Iter [330/391] Loss: 0.2728
Epoch [6/51], Iter [340/391] Loss: 0.2318
Epoch [6/51], Iter [350/391] Loss: 0.2493
Epoch [6/51], Iter [360/391] Loss: 0.2451
Epoch [6/51], Iter [370/391] Loss: 0.2312
Epoch [6/51], Iter [380/391] Loss: 0.2091
Epoch [6/51], Iter [390/391] Loss: 0.2231
Epoch [7/51], Iter [10/391] Loss: 0.2325
Epoch [7/51], Iter [20/391] Loss: 0.2359
Epoch [7/51], Iter [30/391] Loss: 0.2048
Epoch [7/51], Iter [40/391] Loss: 0.2122
Epoch [7/51], Iter [50/391] Loss: 0.2458
Epoch [7/51], Iter [60/391] Loss: 0.2642
Epoch [7/51], Iter [70/391] Loss: 0.2169
Epoch [7/51], Iter [80/391] Loss: 0.2548
Epoch [7/51], Iter [90/391] Loss: 0.2232
Epoch [7/51], Iter [100/391] Loss: 0.2378
Epoch [7/51], Iter [110/391] Loss: 0.2483
Epoch [7/51], Iter [120/391] Loss: 0.2213
Epoch [7/51], Iter [130/391] Loss: 0.2064
Epoch [7/51], Iter [140/391] Loss: 0.2411
Epoch [7/51], Iter [150/391] Loss: 0.2092
Epoch [7/51], Iter [160/391] Loss: 0.2707
Epoch [7/51], Iter [170/391] Loss: 0.2345
Epoch [7/51], Iter [180/391] Loss: 0.2674
Epoch [7/51], Iter [190/391] Loss: 0.2270
Epoch [7/51], Iter [200/391] Loss: 0.2315
Epoch [7/51], Iter [210/391] Loss: 0.2376
Epoch [7/51], Iter [220/391] Loss: 0.2555
Epoch [7/51], Iter [230/391] Loss: 0.2463
Epoch [7/51], Iter [240/391] Loss: 0.2078
Epoch [7/51], Iter [250/391] Loss: 0.2607
Epoch [7/51], Iter [260/391] Loss: 0.2326
Epoch [7/51], Iter [270/391] Loss: 0.2112
Epoch [7/51], Iter [280/391] Loss: 0.2169
Epoch [7/51], Iter [290/391] Loss: 0.2044
Epoch [7/51], Iter [300/391] Loss: 0.2538
Epoch [7/51], Iter [310/391] Loss: 0.2331
Epoch [7/51], Iter [320/391] Loss: 0.2485
Epoch [7/51], Iter [330/391] Loss: 0.2177
Epoch [7/51], Iter [340/391] Loss: 0.2552
Epoch [7/51], Iter [350/391] Loss: 0.2541
Epoch [7/51], Iter [360/391] Loss: 0.2044
Epoch [7/51], Iter [370/391] Loss: 0.2418
Epoch [7/51], Iter [380/391] Loss: 0.1980
Epoch [7/51], Iter [390/391] Loss: 0.2212
Epoch [8/51], Iter [10/391] Loss: 0.2020
Epoch [8/51], Iter [20/391] Loss: 0.2075
Epoch [8/51], Iter [30/391] Loss: 0.2275
Epoch [8/51], Iter [40/391] Loss: 0.2224
Epoch [8/51], Iter [50/391] Loss: 0.2145
Epoch [8/51], Iter [60/391] Loss: 0.2215
Epoch [8/51], Iter [70/391] Loss: 0.2151
Epoch [8/51], Iter [80/391] Loss: 0.2284
Epoch [8/51], Iter [90/391] Loss: 0.2266
Epoch [8/51], Iter [100/391] Loss: 0.1890
Epoch [8/51], Iter [110/391] Loss: 0.2441
Epoch [8/51], Iter [120/391] Loss: 0.2242
Epoch [8/51], Iter [130/391] Loss: 0.2218
Epoch [8/51], Iter [140/391] Loss: 0.2159
Epoch [8/51], Iter [150/391] Loss: 0.2493
Epoch [8/51], Iter [160/391] Loss: 0.2274
Epoch [8/51], Iter [170/391] Loss: 0.2106
Epoch [8/51], Iter [180/391] Loss: 0.1921
Epoch [8/51], Iter [190/391] Loss: 0.1905
Epoch [8/51], Iter [200/391] Loss: 0.2220
Epoch [8/51], Iter [210/391] Loss: 0.2202
Epoch [8/51], Iter [220/391] Loss: 0.2169
Epoch [8/51], Iter [230/391] Loss: 0.2078
Epoch [8/51], Iter [240/391] Loss: 0.1943
Epoch [8/51], Iter [250/391] Loss: 0.2397
Epoch [8/51], Iter [260/391] Loss: 0.2205
Epoch [8/51], Iter [270/391] Loss: 0.2403
Epoch [8/51], Iter [280/391] Loss: 0.2151
Epoch [8/51], Iter [290/391] Loss: 0.2032
Epoch [8/51], Iter [300/391] Loss: 0.1853
Epoch [8/51], Iter [310/391] Loss: 0.2227
Epoch [8/51], Iter [320/391] Loss: 0.2021
Epoch [8/51], Iter [330/391] Loss: 0.2094
Epoch [8/51], Iter [340/391] Loss: 0.1991
Epoch [8/51], Iter [350/391] Loss: 0.2162
Epoch [8/51], Iter [360/391] Loss: 0.2038
Epoch [8/51], Iter [370/391] Loss: 0.1953
Epoch [8/51], Iter [380/391] Loss: 0.2135
Epoch [8/51], Iter [390/391] Loss: 0.1909
Epoch [9/51], Iter [10/391] Loss: 0.1882
Epoch [9/51], Iter [20/391] Loss: 0.1908
Epoch [9/51], Iter [30/391] Loss: 0.1996
Epoch [9/51], Iter [40/391] Loss: 0.2304
Epoch [9/51], Iter [50/391] Loss: 0.2290
Epoch [9/51], Iter [60/391] Loss: 0.2129
Epoch [9/51], Iter [70/391] Loss: 0.1996
Epoch [9/51], Iter [80/391] Loss: 0.2014
Epoch [9/51], Iter [90/391] Loss: 0.2021
Epoch [9/51], Iter [100/391] Loss: 0.2140
Epoch [9/51], Iter [110/391] Loss: 0.2073
Epoch [9/51], Iter [120/391] Loss: 0.1890
Epoch [9/51], Iter [130/391] Loss: 0.2152
Epoch [9/51], Iter [140/391] Loss: 0.2167
Epoch [9/51], Iter [150/391] Loss: 0.1746
Epoch [9/51], Iter [160/391] Loss: 0.1922
Epoch [9/51], Iter [170/391] Loss: 0.1911
Epoch [9/51], Iter [180/391] Loss: 0.1893
Epoch [9/51], Iter [190/391] Loss: 0.1925
Epoch [9/51], Iter [200/391] Loss: 0.2030
Epoch [9/51], Iter [210/391] Loss: 0.1968
Epoch [9/51], Iter [220/391] Loss: 0.1809
Epoch [9/51], Iter [230/391] Loss: 0.1837
Epoch [9/51], Iter [240/391] Loss: 0.1998
Epoch [9/51], Iter [250/391] Loss: 0.2580
Epoch [9/51], Iter [260/391] Loss: 0.1761
Epoch [9/51], Iter [270/391] Loss: 0.1815
Epoch [9/51], Iter [280/391] Loss: 0.1799
Epoch [9/51], Iter [290/391] Loss: 0.2092
Epoch [9/51], Iter [300/391] Loss: 0.2238
Epoch [9/51], Iter [310/391] Loss: 0.2250
Epoch [9/51], Iter [320/391] Loss: 0.2361
Epoch [9/51], Iter [330/391] Loss: 0.1749
Epoch [9/51], Iter [340/391] Loss: 0.1674
Epoch [9/51], Iter [350/391] Loss: 0.2058
Epoch [9/51], Iter [360/391] Loss: 0.1866
Epoch [9/51], Iter [370/391] Loss: 0.1792
Epoch [9/51], Iter [380/391] Loss: 0.2231
Epoch [9/51], Iter [390/391] Loss: 0.1955
Epoch [10/51], Iter [10/391] Loss: 0.1954
Epoch [10/51], Iter [20/391] Loss: 0.1955
Epoch [10/51], Iter [30/391] Loss: 0.1828
Epoch [10/51], Iter [40/391] Loss: 0.1889
Epoch [10/51], Iter [50/391] Loss: 0.1674
Epoch [10/51], Iter [60/391] Loss: 0.1957
Epoch [10/51], Iter [70/391] Loss: 0.2017
Epoch [10/51], Iter [80/391] Loss: 0.1774
Epoch [10/51], Iter [90/391] Loss: 0.2092
Epoch [10/51], Iter [100/391] Loss: 0.2020
Epoch [10/51], Iter [110/391] Loss: 0.1864
Epoch [10/51], Iter [120/391] Loss: 0.1704
Epoch [10/51], Iter [130/391] Loss: 0.1887
Epoch [10/51], Iter [140/391] Loss: 0.1977
Epoch [10/51], Iter [150/391] Loss: 0.1871
Epoch [10/51], Iter [160/391] Loss: 0.1821
Epoch [10/51], Iter [170/391] Loss: 0.1764
Epoch [10/51], Iter [180/391] Loss: 0.1780
Epoch [10/51], Iter [190/391] Loss: 0.2001
Epoch [10/51], Iter [200/391] Loss: 0.1869
Epoch [10/51], Iter [210/391] Loss: 0.1679
Epoch [10/51], Iter [220/391] Loss: 0.1950
Epoch [10/51], Iter [230/391] Loss: 0.2147
Epoch [10/51], Iter [240/391] Loss: 0.1992
Epoch [10/51], Iter [250/391] Loss: 0.1973
Epoch [10/51], Iter [260/391] Loss: 0.2001
Epoch [10/51], Iter [270/391] Loss: 0.1721
Epoch [10/51], Iter [280/391] Loss: 0.1992
Epoch [10/51], Iter [290/391] Loss: 0.1849
Epoch [10/51], Iter [300/391] Loss: 0.1564
Epoch [10/51], Iter [310/391] Loss: 0.1703
Epoch [10/51], Iter [320/391] Loss: 0.1566
Epoch [10/51], Iter [330/391] Loss: 0.1873
Epoch [10/51], Iter [340/391] Loss: 0.2015
Epoch [10/51], Iter [350/391] Loss: 0.2151
Epoch [10/51], Iter [360/391] Loss: 0.1891
Epoch [10/51], Iter [370/391] Loss: 0.1932
Epoch [10/51], Iter [380/391] Loss: 0.1784
Epoch [10/51], Iter [390/391] Loss: 0.1574
[Saving Checkpoint]
Epoch [11/51], Iter [10/391] Loss: 0.1683
Epoch [11/51], Iter [20/391] Loss: 0.1655
Epoch [11/51], Iter [30/391] Loss: 0.1757
Epoch [11/51], Iter [40/391] Loss: 0.1571
Epoch [11/51], Iter [50/391] Loss: 0.1611
Epoch [11/51], Iter [60/391] Loss: 0.1682
Epoch [11/51], Iter [70/391] Loss: 0.1593
Epoch [11/51], Iter [80/391] Loss: 0.1720
Epoch [11/51], Iter [90/391] Loss: 0.1699
Epoch [11/51], Iter [100/391] Loss: 0.1918
Epoch [11/51], Iter [110/391] Loss: 0.1742
Epoch [11/51], Iter [120/391] Loss: 0.1827
Epoch [11/51], Iter [130/391] Loss: 0.1737
Epoch [11/51], Iter [140/391] Loss: 0.1754
Epoch [11/51], Iter [150/391] Loss: 0.1771
Epoch [11/51], Iter [160/391] Loss: 0.1832
Epoch [11/51], Iter [170/391] Loss: 0.1805
Epoch [11/51], Iter [180/391] Loss: 0.1729
Epoch [11/51], Iter [190/391] Loss: 0.1758
Epoch [11/51], Iter [200/391] Loss: 0.1486
Epoch [11/51], Iter [210/391] Loss: 0.2009
Epoch [11/51], Iter [220/391] Loss: 0.1727
Epoch [11/51], Iter [230/391] Loss: 0.1886
Epoch [11/51], Iter [240/391] Loss: 0.1894
Epoch [11/51], Iter [250/391] Loss: 0.1558
Epoch [11/51], Iter [260/391] Loss: 0.1746
Epoch [11/51], Iter [270/391] Loss: 0.1532
Epoch [11/51], Iter [280/391] Loss: 0.1650
Epoch [11/51], Iter [290/391] Loss: 0.1630
Epoch [11/51], Iter [300/391] Loss: 0.1649
Epoch [11/51], Iter [310/391] Loss: 0.1785
Epoch [11/51], Iter [320/391] Loss: 0.1976
Epoch [11/51], Iter [330/391] Loss: 0.1662
Epoch [11/51], Iter [340/391] Loss: 0.1773
Epoch [11/51], Iter [350/391] Loss: 0.1919
Epoch [11/51], Iter [360/391] Loss: 0.1589
Epoch [11/51], Iter [370/391] Loss: 0.1601
Epoch [11/51], Iter [380/391] Loss: 0.1594
Epoch [11/51], Iter [390/391] Loss: 0.1808
Epoch [12/51], Iter [10/391] Loss: 0.1535
Epoch [12/51], Iter [20/391] Loss: 0.1651
Epoch [12/51], Iter [30/391] Loss: 0.1466
Epoch [12/51], Iter [40/391] Loss: 0.1653
Epoch [12/51], Iter [50/391] Loss: 0.1658
Epoch [12/51], Iter [60/391] Loss: 0.1477
Epoch [12/51], Iter [70/391] Loss: 0.1511
Epoch [12/51], Iter [80/391] Loss: 0.1674
Epoch [12/51], Iter [90/391] Loss: 0.1409
Epoch [12/51], Iter [100/391] Loss: 0.1430
Epoch [12/51], Iter [110/391] Loss: 0.1625
Epoch [12/51], Iter [120/391] Loss: 0.1607
Epoch [12/51], Iter [130/391] Loss: 0.1596
Epoch [12/51], Iter [140/391] Loss: 0.1336
Epoch [12/51], Iter [150/391] Loss: 0.1939
Epoch [12/51], Iter [160/391] Loss: 0.1507
Epoch [12/51], Iter [170/391] Loss: 0.1807
Epoch [12/51], Iter [180/391] Loss: 0.1870
Epoch [12/51], Iter [190/391] Loss: 0.1805
Epoch [12/51], Iter [200/391] Loss: 0.1659
Epoch [12/51], Iter [210/391] Loss: 0.1479
Epoch [12/51], Iter [220/391] Loss: 0.1512
Epoch [12/51], Iter [230/391] Loss: 0.1751
Epoch [12/51], Iter [240/391] Loss: 0.1731
Epoch [12/51], Iter [250/391] Loss: 0.1424
Epoch [12/51], Iter [260/391] Loss: 0.1513
Epoch [12/51], Iter [270/391] Loss: 0.1466
Epoch [12/51], Iter [280/391] Loss: 0.1675
Epoch [12/51], Iter [290/391] Loss: 0.1420
Epoch [12/51], Iter [300/391] Loss: 0.1540
Epoch [12/51], Iter [310/391] Loss: 0.1510
Epoch [12/51], Iter [320/391] Loss: 0.1404
Epoch [12/51], Iter [330/391] Loss: 0.1628
Epoch [12/51], Iter [340/391] Loss: 0.1526
Epoch [12/51], Iter [350/391] Loss: 0.1403
Epoch [12/51], Iter [360/391] Loss: 0.1694
Epoch [12/51], Iter [370/391] Loss: 0.1789
Epoch [12/51], Iter [380/391] Loss: 0.1578
Epoch [12/51], Iter [390/391] Loss: 0.1656
Epoch [13/51], Iter [10/391] Loss: 0.1377
Epoch [13/51], Iter [20/391] Loss: 0.1411
Epoch [13/51], Iter [30/391] Loss: 0.1535
Epoch [13/51], Iter [40/391] Loss: 0.1371
Epoch [13/51], Iter [50/391] Loss: 0.1342
Epoch [13/51], Iter [60/391] Loss: 0.1648
Epoch [13/51], Iter [70/391] Loss: 0.1556
Epoch [13/51], Iter [80/391] Loss: 0.1789
Epoch [13/51], Iter [90/391] Loss: 0.1730
Epoch [13/51], Iter [100/391] Loss: 0.1528
Epoch [13/51], Iter [110/391] Loss: 0.1491
Epoch [13/51], Iter [120/391] Loss: 0.1580
Epoch [13/51], Iter [130/391] Loss: 0.1820
Epoch [13/51], Iter [140/391] Loss: 0.1498
Epoch [13/51], Iter [150/391] Loss: 0.1432
Epoch [13/51], Iter [160/391] Loss: 0.1642
Epoch [13/51], Iter [170/391] Loss: 0.1577
Epoch [13/51], Iter [180/391] Loss: 0.1496
Epoch [13/51], Iter [190/391] Loss: 0.1357
Epoch [13/51], Iter [200/391] Loss: 0.1511
Epoch [13/51], Iter [210/391] Loss: 0.1670
Epoch [13/51], Iter [220/391] Loss: 0.1502
Epoch [13/51], Iter [230/391] Loss: 0.1415
Epoch [13/51], Iter [240/391] Loss: 0.1413
Epoch [13/51], Iter [250/391] Loss: 0.1633
Epoch [13/51], Iter [260/391] Loss: 0.1315
Epoch [13/51], Iter [270/391] Loss: 0.1665
Epoch [13/51], Iter [280/391] Loss: 0.1676
Epoch [13/51], Iter [290/391] Loss: 0.1563
Epoch [13/51], Iter [300/391] Loss: 0.1413
Epoch [13/51], Iter [310/391] Loss: 0.1258
Epoch [13/51], Iter [320/391] Loss: 0.1362
Epoch [13/51], Iter [330/391] Loss: 0.1509
Epoch [13/51], Iter [340/391] Loss: 0.1648
Epoch [13/51], Iter [350/391] Loss: 0.1632
Epoch [13/51], Iter [360/391] Loss: 0.1386
Epoch [13/51], Iter [370/391] Loss: 0.1381
Epoch [13/51], Iter [380/391] Loss: 0.1640
Epoch [13/51], Iter [390/391] Loss: 0.1643
Epoch [14/51], Iter [10/391] Loss: 0.1263
Epoch [14/51], Iter [20/391] Loss: 0.1606
Epoch [14/51], Iter [30/391] Loss: 0.1307
Epoch [14/51], Iter [40/391] Loss: 0.1538
Epoch [14/51], Iter [50/391] Loss: 0.1549
Epoch [14/51], Iter [60/391] Loss: 0.1334
Epoch [14/51], Iter [70/391] Loss: 0.1333
Epoch [14/51], Iter [80/391] Loss: 0.1553
Epoch [14/51], Iter [90/391] Loss: 0.1488
Epoch [14/51], Iter [100/391] Loss: 0.1468
Epoch [14/51], Iter [110/391] Loss: 0.1619
Epoch [14/51], Iter [120/391] Loss: 0.1268
Epoch [14/51], Iter [130/391] Loss: 0.1487
Epoch [14/51], Iter [140/391] Loss: 0.1165
Epoch [14/51], Iter [150/391] Loss: 0.1508
Epoch [14/51], Iter [160/391] Loss: 0.1548
Epoch [14/51], Iter [170/391] Loss: 0.1436
Epoch [14/51], Iter [180/391] Loss: 0.1685
Epoch [14/51], Iter [190/391] Loss: 0.1325
Epoch [14/51], Iter [200/391] Loss: 0.1528
Epoch [14/51], Iter [210/391] Loss: 0.1204
Epoch [14/51], Iter [220/391] Loss: 0.1389
Epoch [14/51], Iter [230/391] Loss: 0.1464
Epoch [14/51], Iter [240/391] Loss: 0.1576
Epoch [14/51], Iter [250/391] Loss: 0.1423
Epoch [14/51], Iter [260/391] Loss: 0.1364
Epoch [14/51], Iter [270/391] Loss: 0.1584
Epoch [14/51], Iter [280/391] Loss: 0.1523
Epoch [14/51], Iter [290/391] Loss: 0.1513
Epoch [14/51], Iter [300/391] Loss: 0.1587
Epoch [14/51], Iter [310/391] Loss: 0.1426
Epoch [14/51], Iter [320/391] Loss: 0.1230
Epoch [14/51], Iter [330/391] Loss: 0.1472
Epoch [14/51], Iter [340/391] Loss: 0.1384
Epoch [14/51], Iter [350/391] Loss: 0.1373
Epoch [14/51], Iter [360/391] Loss: 0.1224
Epoch [14/51], Iter [370/391] Loss: 0.1446
Epoch [14/51], Iter [380/391] Loss: 0.1124
Epoch [14/51], Iter [390/391] Loss: 0.1320
Epoch [15/51], Iter [10/391] Loss: 0.1540
Epoch [15/51], Iter [20/391] Loss: 0.1260
Epoch [15/51], Iter [30/391] Loss: 0.1369
Epoch [15/51], Iter [40/391] Loss: 0.1259
Epoch [15/51], Iter [50/391] Loss: 0.1413
Epoch [15/51], Iter [60/391] Loss: 0.1636
Epoch [15/51], Iter [70/391] Loss: 0.1350
Epoch [15/51], Iter [80/391] Loss: 0.1415
Epoch [15/51], Iter [90/391] Loss: 0.1229
Epoch [15/51], Iter [100/391] Loss: 0.1451
Epoch [15/51], Iter [110/391] Loss: 0.1558
Epoch [15/51], Iter [120/391] Loss: 0.1244
Epoch [15/51], Iter [130/391] Loss: 0.1383
Epoch [15/51], Iter [140/391] Loss: 0.1325
Epoch [15/51], Iter [150/391] Loss: 0.1255
Epoch [15/51], Iter [160/391] Loss: 0.1409
Epoch [15/51], Iter [170/391] Loss: 0.1413
Epoch [15/51], Iter [180/391] Loss: 0.1377
Epoch [15/51], Iter [190/391] Loss: 0.1256
Epoch [15/51], Iter [200/391] Loss: 0.1533
Epoch [15/51], Iter [210/391] Loss: 0.1568
Epoch [15/51], Iter [220/391] Loss: 0.1379
Epoch [15/51], Iter [230/391] Loss: 0.1619
Epoch [15/51], Iter [240/391] Loss: 0.1361
Epoch [15/51], Iter [250/391] Loss: 0.1628
Epoch [15/51], Iter [260/391] Loss: 0.1478
Epoch [15/51], Iter [270/391] Loss: 0.1331
Epoch [15/51], Iter [280/391] Loss: 0.1463
Epoch [15/51], Iter [290/391] Loss: 0.1276
Epoch [15/51], Iter [300/391] Loss: 0.1350
Epoch [15/51], Iter [310/391] Loss: 0.1418
Epoch [15/51], Iter [320/391] Loss: 0.1502
Epoch [15/51], Iter [330/391] Loss: 0.1272
Epoch [15/51], Iter [340/391] Loss: 0.1452
Epoch [15/51], Iter [350/391] Loss: 0.1387
Epoch [15/51], Iter [360/391] Loss: 0.1453
Epoch [15/51], Iter [370/391] Loss: 0.1439
Epoch [15/51], Iter [380/391] Loss: 0.1388
Epoch [15/51], Iter [390/391] Loss: 0.1368
Epoch [16/51], Iter [10/391] Loss: 0.1217
Epoch [16/51], Iter [20/391] Loss: 0.1413
Epoch [16/51], Iter [30/391] Loss: 0.1179
Epoch [16/51], Iter [40/391] Loss: 0.1464
Epoch [16/51], Iter [50/391] Loss: 0.1465
Epoch [16/51], Iter [60/391] Loss: 0.1450
Epoch [16/51], Iter [70/391] Loss: 0.1231
Epoch [16/51], Iter [80/391] Loss: 0.1385
Epoch [16/51], Iter [90/391] Loss: 0.1274
Epoch [16/51], Iter [100/391] Loss: 0.1474
Epoch [16/51], Iter [110/391] Loss: 0.1270
Epoch [16/51], Iter [120/391] Loss: 0.1129
Epoch [16/51], Iter [130/391] Loss: 0.1268
Epoch [16/51], Iter [140/391] Loss: 0.1388
Epoch [16/51], Iter [150/391] Loss: 0.1565
Epoch [16/51], Iter [160/391] Loss: 0.1514
Epoch [16/51], Iter [170/391] Loss: 0.1425
Epoch [16/51], Iter [180/391] Loss: 0.1245
Epoch [16/51], Iter [190/391] Loss: 0.1421
Epoch [16/51], Iter [200/391] Loss: 0.1359
Epoch [16/51], Iter [210/391] Loss: 0.1338
Epoch [16/51], Iter [220/391] Loss: 0.1295
Epoch [16/51], Iter [230/391] Loss: 0.1362
Epoch [16/51], Iter [240/391] Loss: 0.1251
Epoch [16/51], Iter [250/391] Loss: 0.1323
Epoch [16/51], Iter [260/391] Loss: 0.1333
Epoch [16/51], Iter [270/391] Loss: 0.1458
Epoch [16/51], Iter [280/391] Loss: 0.1053
Epoch [16/51], Iter [290/391] Loss: 0.1488
Epoch [16/51], Iter [300/391] Loss: 0.1361
Epoch [16/51], Iter [310/391] Loss: 0.1223
Epoch [16/51], Iter [320/391] Loss: 0.1405
Epoch [16/51], Iter [330/391] Loss: 0.1443
Epoch [16/51], Iter [340/391] Loss: 0.1350
Epoch [16/51], Iter [350/391] Loss: 0.1192
Epoch [16/51], Iter [360/391] Loss: 0.1184
Epoch [16/51], Iter [370/391] Loss: 0.1260
Epoch [16/51], Iter [380/391] Loss: 0.1432
Epoch [16/51], Iter [390/391] Loss: 0.1178
Epoch [17/51], Iter [10/391] Loss: 0.1265
Epoch [17/51], Iter [20/391] Loss: 0.1095
Epoch [17/51], Iter [30/391] Loss: 0.1371
Epoch [17/51], Iter [40/391] Loss: 0.1405
Epoch [17/51], Iter [50/391] Loss: 0.1184
Epoch [17/51], Iter [60/391] Loss: 0.1142
Epoch [17/51], Iter [70/391] Loss: 0.1262
Epoch [17/51], Iter [80/391] Loss: 0.1310
Epoch [17/51], Iter [90/391] Loss: 0.1234
Epoch [17/51], Iter [100/391] Loss: 0.1303
Epoch [17/51], Iter [110/391] Loss: 0.1340
Epoch [17/51], Iter [120/391] Loss: 0.1362
Epoch [17/51], Iter [130/391] Loss: 0.1067
Epoch [17/51], Iter [140/391] Loss: 0.1275
Epoch [17/51], Iter [150/391] Loss: 0.1287
Epoch [17/51], Iter [160/391] Loss: 0.1075
Epoch [17/51], Iter [170/391] Loss: 0.1364
Epoch [17/51], Iter [180/391] Loss: 0.1148
Epoch [17/51], Iter [190/391] Loss: 0.1427
Epoch [17/51], Iter [200/391] Loss: 0.1097
Epoch [17/51], Iter [210/391] Loss: 0.1294
Epoch [17/51], Iter [220/391] Loss: 0.1314
Epoch [17/51], Iter [230/391] Loss: 0.1115
Epoch [17/51], Iter [240/391] Loss: 0.1325
Epoch [17/51], Iter [250/391] Loss: 0.1304
Epoch [17/51], Iter [260/391] Loss: 0.1378
Epoch [17/51], Iter [270/391] Loss: 0.1190
Epoch [17/51], Iter [280/391] Loss: 0.1413
Epoch [17/51], Iter [290/391] Loss: 0.1167
Epoch [17/51], Iter [300/391] Loss: 0.1313
Epoch [17/51], Iter [310/391] Loss: 0.1187
Epoch [17/51], Iter [320/391] Loss: 0.1267
Epoch [17/51], Iter [330/391] Loss: 0.1339
Epoch [17/51], Iter [340/391] Loss: 0.1465
Epoch [17/51], Iter [350/391] Loss: 0.1229
Epoch [17/51], Iter [360/391] Loss: 0.1211
Epoch [17/51], Iter [370/391] Loss: 0.1361
Epoch [17/51], Iter [380/391] Loss: 0.1117
Epoch [17/51], Iter [390/391] Loss: 0.1085
Epoch [18/51], Iter [10/391] Loss: 0.1431
Epoch [18/51], Iter [20/391] Loss: 0.1315
Epoch [18/51], Iter [30/391] Loss: 0.1327
Epoch [18/51], Iter [40/391] Loss: 0.1261
Epoch [18/51], Iter [50/391] Loss: 0.1124
Epoch [18/51], Iter [60/391] Loss: 0.1377
Epoch [18/51], Iter [70/391] Loss: 0.1312
Epoch [18/51], Iter [80/391] Loss: 0.1416
Epoch [18/51], Iter [90/391] Loss: 0.1082
Epoch [18/51], Iter [100/391] Loss: 0.1240
Epoch [18/51], Iter [110/391] Loss: 0.1281
Epoch [18/51], Iter [120/391] Loss: 0.1283
Epoch [18/51], Iter [130/391] Loss: 0.1139
Epoch [18/51], Iter [140/391] Loss: 0.1237
Epoch [18/51], Iter [150/391] Loss: 0.1144
Epoch [18/51], Iter [160/391] Loss: 0.1212
Epoch [18/51], Iter [170/391] Loss: 0.1101
Epoch [18/51], Iter [180/391] Loss: 0.1188
Epoch [18/51], Iter [190/391] Loss: 0.1316
Epoch [18/51], Iter [200/391] Loss: 0.1255
Epoch [18/51], Iter [210/391] Loss: 0.1237
Epoch [18/51], Iter [220/391] Loss: 0.1250
Epoch [18/51], Iter [230/391] Loss: 0.1386
Epoch [18/51], Iter [240/391] Loss: 0.1224
Epoch [18/51], Iter [250/391] Loss: 0.1370
Epoch [18/51], Iter [260/391] Loss: 0.1283
Epoch [18/51], Iter [270/391] Loss: 0.1380
Epoch [18/51], Iter [280/391] Loss: 0.0986
Epoch [18/51], Iter [290/391] Loss: 0.1244
Epoch [18/51], Iter [300/391] Loss: 0.1087
Epoch [18/51], Iter [310/391] Loss: 0.1151
Epoch [18/51], Iter [320/391] Loss: 0.1229
Epoch [18/51], Iter [330/391] Loss: 0.1394
Epoch [18/51], Iter [340/391] Loss: 0.1118
Epoch [18/51], Iter [350/391] Loss: 0.1304
Epoch [18/51], Iter [360/391] Loss: 0.1248
Epoch [18/51], Iter [370/391] Loss: 0.0971
Epoch [18/51], Iter [380/391] Loss: 0.1083
Epoch [18/51], Iter [390/391] Loss: 0.1201
Epoch [19/51], Iter [10/391] Loss: 0.1028
Epoch [19/51], Iter [20/391] Loss: 0.1131
Epoch [19/51], Iter [30/391] Loss: 0.1190
Epoch [19/51], Iter [40/391] Loss: 0.1051
Epoch [19/51], Iter [50/391] Loss: 0.0987
Epoch [19/51], Iter [60/391] Loss: 0.1071
Epoch [19/51], Iter [70/391] Loss: 0.1309
Epoch [19/51], Iter [80/391] Loss: 0.1094
Epoch [19/51], Iter [90/391] Loss: 0.1348
Epoch [19/51], Iter [100/391] Loss: 0.1003
Epoch [19/51], Iter [110/391] Loss: 0.1225
Epoch [19/51], Iter [120/391] Loss: 0.0933
Epoch [19/51], Iter [130/391] Loss: 0.1267
Epoch [19/51], Iter [140/391] Loss: 0.1323
Epoch [19/51], Iter [150/391] Loss: 0.0959
Epoch [19/51], Iter [160/391] Loss: 0.1392
Epoch [19/51], Iter [170/391] Loss: 0.1005
Epoch [19/51], Iter [180/391] Loss: 0.1028
Epoch [19/51], Iter [190/391] Loss: 0.1037
Epoch [19/51], Iter [200/391] Loss: 0.1283
Epoch [19/51], Iter [210/391] Loss: 0.1181
Epoch [19/51], Iter [220/391] Loss: 0.1330
Epoch [19/51], Iter [230/391] Loss: 0.1098
Epoch [19/51], Iter [240/391] Loss: 0.1279
Epoch [19/51], Iter [250/391] Loss: 0.1128
Epoch [19/51], Iter [260/391] Loss: 0.1180
Epoch [19/51], Iter [270/391] Loss: 0.1085
Epoch [19/51], Iter [280/391] Loss: 0.1228
Epoch [19/51], Iter [290/391] Loss: 0.1088
Epoch [19/51], Iter [300/391] Loss: 0.1275
Epoch [19/51], Iter [310/391] Loss: 0.1178
Epoch [19/51], Iter [320/391] Loss: 0.1223
Epoch [19/51], Iter [330/391] Loss: 0.1279
Epoch [19/51], Iter [340/391] Loss: 0.1202
Epoch [19/51], Iter [350/391] Loss: 0.1081
Epoch [19/51], Iter [360/391] Loss: 0.1108
Epoch [19/51], Iter [370/391] Loss: 0.1174
Epoch [19/51], Iter [380/391] Loss: 0.1456
Epoch [19/51], Iter [390/391] Loss: 0.1149
Epoch [20/51], Iter [10/391] Loss: 0.1086
Epoch [20/51], Iter [20/391] Loss: 0.1136
Epoch [20/51], Iter [30/391] Loss: 0.1148
Epoch [20/51], Iter [40/391] Loss: 0.0999
Epoch [20/51], Iter [50/391] Loss: 0.1087
Epoch [20/51], Iter [60/391] Loss: 0.1118
Epoch [20/51], Iter [70/391] Loss: 0.1203
Epoch [20/51], Iter [80/391] Loss: 0.1247
Epoch [20/51], Iter [90/391] Loss: 0.1127
Epoch [20/51], Iter [100/391] Loss: 0.0972
Epoch [20/51], Iter [110/391] Loss: 0.1061
Epoch [20/51], Iter [120/391] Loss: 0.1116
Epoch [20/51], Iter [130/391] Loss: 0.1128
Epoch [20/51], Iter [140/391] Loss: 0.0917
Epoch [20/51], Iter [150/391] Loss: 0.1299
Epoch [20/51], Iter [160/391] Loss: 0.1161
Epoch [20/51], Iter [170/391] Loss: 0.1245
Epoch [20/51], Iter [180/391] Loss: 0.1288
Epoch [20/51], Iter [190/391] Loss: 0.1076
Epoch [20/51], Iter [200/391] Loss: 0.1299
Epoch [20/51], Iter [210/391] Loss: 0.1097
Epoch [20/51], Iter [220/391] Loss: 0.1110
Epoch [20/51], Iter [230/391] Loss: 0.1346
Epoch [20/51], Iter [240/391] Loss: 0.1181
Epoch [20/51], Iter [250/391] Loss: 0.1263
Epoch [20/51], Iter [260/391] Loss: 0.1126
Epoch [20/51], Iter [270/391] Loss: 0.1222
Epoch [20/51], Iter [280/391] Loss: 0.1250
Epoch [20/51], Iter [290/391] Loss: 0.1134
Epoch [20/51], Iter [300/391] Loss: 0.1243
Epoch [20/51], Iter [310/391] Loss: 0.1067
Epoch [20/51], Iter [320/391] Loss: 0.1046
Epoch [20/51], Iter [330/391] Loss: 0.0893
Epoch [20/51], Iter [340/391] Loss: 0.1160
Epoch [20/51], Iter [350/391] Loss: 0.1214
Epoch [20/51], Iter [360/391] Loss: 0.1139
Epoch [20/51], Iter [370/391] Loss: 0.1017
Epoch [20/51], Iter [380/391] Loss: 0.1332
Epoch [20/51], Iter [390/391] Loss: 0.1493
[Saving Checkpoint]
Epoch [21/51], Iter [10/391] Loss: 0.1036
Epoch [21/51], Iter [20/391] Loss: 0.0991
Epoch [21/51], Iter [30/391] Loss: 0.1371
Epoch [21/51], Iter [40/391] Loss: 0.1010
Epoch [21/51], Iter [50/391] Loss: 0.1102
Epoch [21/51], Iter [60/391] Loss: 0.0932
Epoch [21/51], Iter [70/391] Loss: 0.1161
Epoch [21/51], Iter [80/391] Loss: 0.1072
Epoch [21/51], Iter [90/391] Loss: 0.1079
Epoch [21/51], Iter [100/391] Loss: 0.1269
Epoch [21/51], Iter [110/391] Loss: 0.1206
Epoch [21/51], Iter [120/391] Loss: 0.1020
Epoch [21/51], Iter [130/391] Loss: 0.0984
Epoch [21/51], Iter [140/391] Loss: 0.1064
Epoch [21/51], Iter [150/391] Loss: 0.0967
Epoch [21/51], Iter [160/391] Loss: 0.1186
Epoch [21/51], Iter [170/391] Loss: 0.0933
Epoch [21/51], Iter [180/391] Loss: 0.1063
Epoch [21/51], Iter [190/391] Loss: 0.0912
Epoch [21/51], Iter [200/391] Loss: 0.1207
Epoch [21/51], Iter [210/391] Loss: 0.1135
Epoch [21/51], Iter [220/391] Loss: 0.1138
Epoch [21/51], Iter [230/391] Loss: 0.1277
Epoch [21/51], Iter [240/391] Loss: 0.0942
Epoch [21/51], Iter [250/391] Loss: 0.0896
Epoch [21/51], Iter [260/391] Loss: 0.1390
Epoch [21/51], Iter [270/391] Loss: 0.1283
Epoch [21/51], Iter [280/391] Loss: 0.0910
Epoch [21/51], Iter [290/391] Loss: 0.1170
Epoch [21/51], Iter [300/391] Loss: 0.1143
Epoch [21/51], Iter [310/391] Loss: 0.1047
Epoch [21/51], Iter [320/391] Loss: 0.1005
Epoch [21/51], Iter [330/391] Loss: 0.1247
Epoch [21/51], Iter [340/391] Loss: 0.1115
Epoch [21/51], Iter [350/391] Loss: 0.1118
Epoch [21/51], Iter [360/391] Loss: 0.0984
Epoch [21/51], Iter [370/391] Loss: 0.1252
Epoch [21/51], Iter [380/391] Loss: 0.1191
Epoch [21/51], Iter [390/391] Loss: 0.1046
Epoch [22/51], Iter [10/391] Loss: 0.1089
Epoch [22/51], Iter [20/391] Loss: 0.1345
Epoch [22/51], Iter [30/391] Loss: 0.0975
Epoch [22/51], Iter [40/391] Loss: 0.1009
Epoch [22/51], Iter [50/391] Loss: 0.1099
Epoch [22/51], Iter [60/391] Loss: 0.1037
Epoch [22/51], Iter [70/391] Loss: 0.1179
Epoch [22/51], Iter [80/391] Loss: 0.0979
Epoch [22/51], Iter [90/391] Loss: 0.0998
Epoch [22/51], Iter [100/391] Loss: 0.1163
Epoch [22/51], Iter [110/391] Loss: 0.1044
Epoch [22/51], Iter [120/391] Loss: 0.0972
Epoch [22/51], Iter [130/391] Loss: 0.0969
Epoch [22/51], Iter [140/391] Loss: 0.1070
Epoch [22/51], Iter [150/391] Loss: 0.0934
Epoch [22/51], Iter [160/391] Loss: 0.1140
Epoch [22/51], Iter [170/391] Loss: 0.1074
Epoch [22/51], Iter [180/391] Loss: 0.1096
Epoch [22/51], Iter [190/391] Loss: 0.1034
Epoch [22/51], Iter [200/391] Loss: 0.1062
Epoch [22/51], Iter [210/391] Loss: 0.1040
Epoch [22/51], Iter [220/391] Loss: 0.1036
Epoch [22/51], Iter [230/391] Loss: 0.1176
Epoch [22/51], Iter [240/391] Loss: 0.1208
Epoch [22/51], Iter [250/391] Loss: 0.0996
Epoch [22/51], Iter [260/391] Loss: 0.1089
Epoch [22/51], Iter [270/391] Loss: 0.1159
Epoch [22/51], Iter [280/391] Loss: 0.1036
Epoch [22/51], Iter [290/391] Loss: 0.1080
Epoch [22/51], Iter [300/391] Loss: 0.1118
Epoch [22/51], Iter [310/391] Loss: 0.0882
Epoch [22/51], Iter [320/391] Loss: 0.0993
Epoch [22/51], Iter [330/391] Loss: 0.1023
Epoch [22/51], Iter [340/391] Loss: 0.1061
Epoch [22/51], Iter [350/391] Loss: 0.0893
Epoch [22/51], Iter [360/391] Loss: 0.1164
Epoch [22/51], Iter [370/391] Loss: 0.0935
Epoch [22/51], Iter [380/391] Loss: 0.1243
Epoch [22/51], Iter [390/391] Loss: 0.0892
Epoch [23/51], Iter [10/391] Loss: 0.0983
Epoch [23/51], Iter [20/391] Loss: 0.1019
Epoch [23/51], Iter [30/391] Loss: 0.0856
Epoch [23/51], Iter [40/391] Loss: 0.1157
Epoch [23/51], Iter [50/391] Loss: 0.1005
Epoch [23/51], Iter [60/391] Loss: 0.0963
Epoch [23/51], Iter [70/391] Loss: 0.0934
Epoch [23/51], Iter [80/391] Loss: 0.0887
Epoch [23/51], Iter [90/391] Loss: 0.0943
Epoch [23/51], Iter [100/391] Loss: 0.1035
Epoch [23/51], Iter [110/391] Loss: 0.1122
Epoch [23/51], Iter [120/391] Loss: 0.0871
Epoch [23/51], Iter [130/391] Loss: 0.1012
Epoch [23/51], Iter [140/391] Loss: 0.0886
Epoch [23/51], Iter [150/391] Loss: 0.1193
Epoch [23/51], Iter [160/391] Loss: 0.1240
Epoch [23/51], Iter [170/391] Loss: 0.0982
Epoch [23/51], Iter [180/391] Loss: 0.1072
Epoch [23/51], Iter [190/391] Loss: 0.1076
Epoch [23/51], Iter [200/391] Loss: 0.1044
Epoch [23/51], Iter [210/391] Loss: 0.1027
Epoch [23/51], Iter [220/391] Loss: 0.1065
Epoch [23/51], Iter [230/391] Loss: 0.0959
Epoch [23/51], Iter [240/391] Loss: 0.1032
Epoch [23/51], Iter [250/391] Loss: 0.1023
Epoch [23/51], Iter [260/391] Loss: 0.1220
Epoch [23/51], Iter [270/391] Loss: 0.1046
Epoch [23/51], Iter [280/391] Loss: 0.0885
Epoch [23/51], Iter [290/391] Loss: 0.0969
Epoch [23/51], Iter [300/391] Loss: 0.0914
Epoch [23/51], Iter [310/391] Loss: 0.1096
Epoch [23/51], Iter [320/391] Loss: 0.0894
Epoch [23/51], Iter [330/391] Loss: 0.1188
Epoch [23/51], Iter [340/391] Loss: 0.1020
Epoch [23/51], Iter [350/391] Loss: 0.1062
Epoch [23/51], Iter [360/391] Loss: 0.1063
Epoch [23/51], Iter [370/391] Loss: 0.1057
Epoch [23/51], Iter [380/391] Loss: 0.1105
Epoch [23/51], Iter [390/391] Loss: 0.0895
Epoch [24/51], Iter [10/391] Loss: 0.1002
Epoch [24/51], Iter [20/391] Loss: 0.0946
Epoch [24/51], Iter [30/391] Loss: 0.0850
Epoch [24/51], Iter [40/391] Loss: 0.0901
Epoch [24/51], Iter [50/391] Loss: 0.0834
Epoch [24/51], Iter [60/391] Loss: 0.0784
Epoch [24/51], Iter [70/391] Loss: 0.0886
Epoch [24/51], Iter [80/391] Loss: 0.1129
Epoch [24/51], Iter [90/391] Loss: 0.1001
Epoch [24/51], Iter [100/391] Loss: 0.0934
Epoch [24/51], Iter [110/391] Loss: 0.0948
Epoch [24/51], Iter [120/391] Loss: 0.1035
Epoch [24/51], Iter [130/391] Loss: 0.1021
Epoch [24/51], Iter [140/391] Loss: 0.0904
Epoch [24/51], Iter [150/391] Loss: 0.0886
Epoch [24/51], Iter [160/391] Loss: 0.1096
Epoch [24/51], Iter [170/391] Loss: 0.0971
Epoch [24/51], Iter [180/391] Loss: 0.0985
Epoch [24/51], Iter [190/391] Loss: 0.1365
Epoch [24/51], Iter [200/391] Loss: 0.0918
Epoch [24/51], Iter [210/391] Loss: 0.1033
Epoch [24/51], Iter [220/391] Loss: 0.1145
Epoch [24/51], Iter [230/391] Loss: 0.1144
Epoch [24/51], Iter [240/391] Loss: 0.1018
Epoch [24/51], Iter [250/391] Loss: 0.0907
Epoch [24/51], Iter [260/391] Loss: 0.1060
Epoch [24/51], Iter [270/391] Loss: 0.0809
Epoch [24/51], Iter [280/391] Loss: 0.0942
Epoch [24/51], Iter [290/391] Loss: 0.0992
Epoch [24/51], Iter [300/391] Loss: 0.0866
Epoch [24/51], Iter [310/391] Loss: 0.0983
Epoch [24/51], Iter [320/391] Loss: 0.0968
Epoch [24/51], Iter [330/391] Loss: 0.0863
Epoch [24/51], Iter [340/391] Loss: 0.1061
Epoch [24/51], Iter [350/391] Loss: 0.0939
Epoch [24/51], Iter [360/391] Loss: 0.1075
Epoch [24/51], Iter [370/391] Loss: 0.0901
Epoch [24/51], Iter [380/391] Loss: 0.0751
Epoch [24/51], Iter [390/391] Loss: 0.0950
Epoch [25/51], Iter [10/391] Loss: 0.1030
Epoch [25/51], Iter [20/391] Loss: 0.0978
Epoch [25/51], Iter [30/391] Loss: 0.0825
Epoch [25/51], Iter [40/391] Loss: 0.0849
Epoch [25/51], Iter [50/391] Loss: 0.0861
Epoch [25/51], Iter [60/391] Loss: 0.0990
Epoch [25/51], Iter [70/391] Loss: 0.0863
Epoch [25/51], Iter [80/391] Loss: 0.1086
Epoch [25/51], Iter [90/391] Loss: 0.1054
Epoch [25/51], Iter [100/391] Loss: 0.0811
Epoch [25/51], Iter [110/391] Loss: 0.0868
Epoch [25/51], Iter [120/391] Loss: 0.0959
Epoch [25/51], Iter [130/391] Loss: 0.0978
Epoch [25/51], Iter [140/391] Loss: 0.0935
Epoch [25/51], Iter [150/391] Loss: 0.0923
Epoch [25/51], Iter [160/391] Loss: 0.0944
Epoch [25/51], Iter [170/391] Loss: 0.0905
Epoch [25/51], Iter [180/391] Loss: 0.0913
Epoch [25/51], Iter [190/391] Loss: 0.0920
Epoch [25/51], Iter [200/391] Loss: 0.0816
Epoch [25/51], Iter [210/391] Loss: 0.0853
Epoch [25/51], Iter [220/391] Loss: 0.1135
Epoch [25/51], Iter [230/391] Loss: 0.1077
Epoch [25/51], Iter [240/391] Loss: 0.1010
Epoch [25/51], Iter [250/391] Loss: 0.1098
Epoch [25/51], Iter [260/391] Loss: 0.0901
Epoch [25/51], Iter [270/391] Loss: 0.0999
Epoch [25/51], Iter [280/391] Loss: 0.0883
Epoch [25/51], Iter [290/391] Loss: 0.0896
Epoch [25/51], Iter [300/391] Loss: 0.0927
Epoch [25/51], Iter [310/391] Loss: 0.0898
Epoch [25/51], Iter [320/391] Loss: 0.1143
Epoch [25/51], Iter [330/391] Loss: 0.0889
Epoch [25/51], Iter [340/391] Loss: 0.0978
Epoch [25/51], Iter [350/391] Loss: 0.0986
Epoch [25/51], Iter [360/391] Loss: 0.0793
Epoch [25/51], Iter [370/391] Loss: 0.0887
Epoch [25/51], Iter [380/391] Loss: 0.0988
Epoch [25/51], Iter [390/391] Loss: 0.0916
Epoch [26/51], Iter [10/391] Loss: 0.0843
Epoch [26/51], Iter [20/391] Loss: 0.0906
Epoch [26/51], Iter [30/391] Loss: 0.0871
Epoch [26/51], Iter [40/391] Loss: 0.0976
Epoch [26/51], Iter [50/391] Loss: 0.0949
Epoch [26/51], Iter [60/391] Loss: 0.1015
Epoch [26/51], Iter [70/391] Loss: 0.0964
Epoch [26/51], Iter [80/391] Loss: 0.0783
Epoch [26/51], Iter [90/391] Loss: 0.0787
Epoch [26/51], Iter [100/391] Loss: 0.0943
Epoch [26/51], Iter [110/391] Loss: 0.0876
Epoch [26/51], Iter [120/391] Loss: 0.0789
Epoch [26/51], Iter [130/391] Loss: 0.1089
Epoch [26/51], Iter [140/391] Loss: 0.0923
Epoch [26/51], Iter [150/391] Loss: 0.0904
Epoch [26/51], Iter [160/391] Loss: 0.1027
Epoch [26/51], Iter [170/391] Loss: 0.0995
Epoch [26/51], Iter [180/391] Loss: 0.0941
Epoch [26/51], Iter [190/391] Loss: 0.1042
Epoch [26/51], Iter [200/391] Loss: 0.0832
Epoch [26/51], Iter [210/391] Loss: 0.1019
Epoch [26/51], Iter [220/391] Loss: 0.1076
Epoch [26/51], Iter [230/391] Loss: 0.1004
Epoch [26/51], Iter [240/391] Loss: 0.0892
Epoch [26/51], Iter [250/391] Loss: 0.0794
Epoch [26/51], Iter [260/391] Loss: 0.0994
Epoch [26/51], Iter [270/391] Loss: 0.1009
Epoch [26/51], Iter [280/391] Loss: 0.1128
Epoch [26/51], Iter [290/391] Loss: 0.0967
Epoch [26/51], Iter [300/391] Loss: 0.0894
Epoch [26/51], Iter [310/391] Loss: 0.0947
Epoch [26/51], Iter [320/391] Loss: 0.0974
Epoch [26/51], Iter [330/391] Loss: 0.0885
Epoch [26/51], Iter [340/391] Loss: 0.0723
Epoch [26/51], Iter [350/391] Loss: 0.0850
Epoch [26/51], Iter [360/391] Loss: 0.0899
Epoch [26/51], Iter [370/391] Loss: 0.1112
Epoch [26/51], Iter [380/391] Loss: 0.0831
Epoch [26/51], Iter [390/391] Loss: 0.0838
Epoch [27/51], Iter [10/391] Loss: 0.0843
Epoch [27/51], Iter [20/391] Loss: 0.0782
Epoch [27/51], Iter [30/391] Loss: 0.0892
Epoch [27/51], Iter [40/391] Loss: 0.0820
Epoch [27/51], Iter [50/391] Loss: 0.0951
Epoch [27/51], Iter [60/391] Loss: 0.0913
Epoch [27/51], Iter [70/391] Loss: 0.0871
Epoch [27/51], Iter [80/391] Loss: 0.0875
Epoch [27/51], Iter [90/391] Loss: 0.0872
Epoch [27/51], Iter [100/391] Loss: 0.0964
Epoch [27/51], Iter [110/391] Loss: 0.0918
Epoch [27/51], Iter [120/391] Loss: 0.1046
Epoch [27/51], Iter [130/391] Loss: 0.0884
Epoch [27/51], Iter [140/391] Loss: 0.0889
Epoch [27/51], Iter [150/391] Loss: 0.1018
Epoch [27/51], Iter [160/391] Loss: 0.0792
Epoch [27/51], Iter [170/391] Loss: 0.0982
Epoch [27/51], Iter [180/391] Loss: 0.0853
Epoch [27/51], Iter [190/391] Loss: 0.0799
Epoch [27/51], Iter [200/391] Loss: 0.0931
Epoch [27/51], Iter [210/391] Loss: 0.0837
Epoch [27/51], Iter [220/391] Loss: 0.0904
Epoch [27/51], Iter [230/391] Loss: 0.0884
Epoch [27/51], Iter [240/391] Loss: 0.0895
Epoch [27/51], Iter [250/391] Loss: 0.0879
Epoch [27/51], Iter [260/391] Loss: 0.0921
Epoch [27/51], Iter [270/391] Loss: 0.0779
Epoch [27/51], Iter [280/391] Loss: 0.0880
Epoch [27/51], Iter [290/391] Loss: 0.0829
Epoch [27/51], Iter [300/391] Loss: 0.0890
Epoch [27/51], Iter [310/391] Loss: 0.1119
Epoch [27/51], Iter [320/391] Loss: 0.0805
Epoch [27/51], Iter [330/391] Loss: 0.0811
Epoch [27/51], Iter [340/391] Loss: 0.0959
Epoch [27/51], Iter [350/391] Loss: 0.0987
Epoch [27/51], Iter [360/391] Loss: 0.0952
Epoch [27/51], Iter [370/391] Loss: 0.0865
Epoch [27/51], Iter [380/391] Loss: 0.0959
Epoch [27/51], Iter [390/391] Loss: 0.0939
Epoch [28/51], Iter [10/391] Loss: 0.0802
Epoch [28/51], Iter [20/391] Loss: 0.0773
Epoch [28/51], Iter [30/391] Loss: 0.0880
Epoch [28/51], Iter [40/391] Loss: 0.0740
Epoch [28/51], Iter [50/391] Loss: 0.0847
Epoch [28/51], Iter [60/391] Loss: 0.0841
Epoch [28/51], Iter [70/391] Loss: 0.0914
Epoch [28/51], Iter [80/391] Loss: 0.0981
Epoch [28/51], Iter [90/391] Loss: 0.0888
Epoch [28/51], Iter [100/391] Loss: 0.0889
Epoch [28/51], Iter [110/391] Loss: 0.0831
Epoch [28/51], Iter [120/391] Loss: 0.0902
Epoch [28/51], Iter [130/391] Loss: 0.0921
Epoch [28/51], Iter [140/391] Loss: 0.0895
Epoch [28/51], Iter [150/391] Loss: 0.0991
Epoch [28/51], Iter [160/391] Loss: 0.0794
Epoch [28/51], Iter [170/391] Loss: 0.0789
Epoch [28/51], Iter [180/391] Loss: 0.0934
Epoch [28/51], Iter [190/391] Loss: 0.0820
Epoch [28/51], Iter [200/391] Loss: 0.0773
Epoch [28/51], Iter [210/391] Loss: 0.1030
Epoch [28/51], Iter [220/391] Loss: 0.0962
Epoch [28/51], Iter [230/391] Loss: 0.1007
Epoch [28/51], Iter [240/391] Loss: 0.0931
Epoch [28/51], Iter [250/391] Loss: 0.0801
Epoch [28/51], Iter [260/391] Loss: 0.0895
Epoch [28/51], Iter [270/391] Loss: 0.0829
Epoch [28/51], Iter [280/391] Loss: 0.0796
Epoch [28/51], Iter [290/391] Loss: 0.0905
Epoch [28/51], Iter [300/391] Loss: 0.0812
Epoch [28/51], Iter [310/391] Loss: 0.0769
Epoch [28/51], Iter [320/391] Loss: 0.0988
Epoch [28/51], Iter [330/391] Loss: 0.0945
Epoch [28/51], Iter [340/391] Loss: 0.0842
Epoch [28/51], Iter [350/391] Loss: 0.0874
Epoch [28/51], Iter [360/391] Loss: 0.0985
Epoch [28/51], Iter [370/391] Loss: 0.0995
Epoch [28/51], Iter [380/391] Loss: 0.0739
Epoch [28/51], Iter [390/391] Loss: 0.0861
Epoch [29/51], Iter [10/391] Loss: 0.0909
Epoch [29/51], Iter [20/391] Loss: 0.0880
Epoch [29/51], Iter [30/391] Loss: 0.0963
Epoch [29/51], Iter [40/391] Loss: 0.0931
Epoch [29/51], Iter [50/391] Loss: 0.0847
Epoch [29/51], Iter [60/391] Loss: 0.0741
Epoch [29/51], Iter [70/391] Loss: 0.0878
Epoch [29/51], Iter [80/391] Loss: 0.0830
Epoch [29/51], Iter [90/391] Loss: 0.1009
Epoch [29/51], Iter [100/391] Loss: 0.0801
Epoch [29/51], Iter [110/391] Loss: 0.0785
Epoch [29/51], Iter [120/391] Loss: 0.0733
Epoch [29/51], Iter [130/391] Loss: 0.0914
Epoch [29/51], Iter [140/391] Loss: 0.0790
Epoch [29/51], Iter [150/391] Loss: 0.0884
Epoch [29/51], Iter [160/391] Loss: 0.0924
Epoch [29/51], Iter [170/391] Loss: 0.0830
Epoch [29/51], Iter [180/391] Loss: 0.0913
Epoch [29/51], Iter [190/391] Loss: 0.0898
Epoch [29/51], Iter [200/391] Loss: 0.0808
Epoch [29/51], Iter [210/391] Loss: 0.0804
Epoch [29/51], Iter [220/391] Loss: 0.0908
Epoch [29/51], Iter [230/391] Loss: 0.0777
Epoch [29/51], Iter [240/391] Loss: 0.0926
Epoch [29/51], Iter [250/391] Loss: 0.0879
Epoch [29/51], Iter [260/391] Loss: 0.0872
Epoch [29/51], Iter [270/391] Loss: 0.0711
Epoch [29/51], Iter [280/391] Loss: 0.0940
Epoch [29/51], Iter [290/391] Loss: 0.0807
Epoch [29/51], Iter [300/391] Loss: 0.0911
Epoch [29/51], Iter [310/391] Loss: 0.0926
Epoch [29/51], Iter [320/391] Loss: 0.0862
Epoch [29/51], Iter [330/391] Loss: 0.0797
Epoch [29/51], Iter [340/391] Loss: 0.0986
Epoch [29/51], Iter [350/391] Loss: 0.0833
Epoch [29/51], Iter [360/391] Loss: 0.0876
Epoch [29/51], Iter [370/391] Loss: 0.0776
Epoch [29/51], Iter [380/391] Loss: 0.0854
Epoch [29/51], Iter [390/391] Loss: 0.0916
Epoch [30/51], Iter [10/391] Loss: 0.0762
Epoch [30/51], Iter [20/391] Loss: 0.0840
Epoch [30/51], Iter [30/391] Loss: 0.0879
Epoch [30/51], Iter [40/391] Loss: 0.0711
Epoch [30/51], Iter [50/391] Loss: 0.0829
Epoch [30/51], Iter [60/391] Loss: 0.0869
Epoch [30/51], Iter [70/391] Loss: 0.0782
Epoch [30/51], Iter [80/391] Loss: 0.0997
Epoch [30/51], Iter [90/391] Loss: 0.0833
Epoch [30/51], Iter [100/391] Loss: 0.0761
Epoch [30/51], Iter [110/391] Loss: 0.0840
Epoch [30/51], Iter [120/391] Loss: 0.0791
Epoch [30/51], Iter [130/391] Loss: 0.0805
Epoch [30/51], Iter [140/391] Loss: 0.0882
Epoch [30/51], Iter [150/391] Loss: 0.0803
Epoch [30/51], Iter [160/391] Loss: 0.0988
Epoch [30/51], Iter [170/391] Loss: 0.0736
Epoch [30/51], Iter [180/391] Loss: 0.0822
Epoch [30/51], Iter [190/391] Loss: 0.0883
Epoch [30/51], Iter [200/391] Loss: 0.0776
Epoch [30/51], Iter [210/391] Loss: 0.0840
Epoch [30/51], Iter [220/391] Loss: 0.0829
Epoch [30/51], Iter [230/391] Loss: 0.0867
Epoch [30/51], Iter [240/391] Loss: 0.0731
Epoch [30/51], Iter [250/391] Loss: 0.0905
Epoch [30/51], Iter [260/391] Loss: 0.0880
Epoch [30/51], Iter [270/391] Loss: 0.0841
Epoch [30/51], Iter [280/391] Loss: 0.0878
Epoch [30/51], Iter [290/391] Loss: 0.1001
Epoch [30/51], Iter [300/391] Loss: 0.0702
Epoch [30/51], Iter [310/391] Loss: 0.0791
Epoch [30/51], Iter [320/391] Loss: 0.0826
Epoch [30/51], Iter [330/391] Loss: 0.0909
Epoch [30/51], Iter [340/391] Loss: 0.0876
Epoch [30/51], Iter [350/391] Loss: 0.0862
Epoch [30/51], Iter [360/391] Loss: 0.0911
Epoch [30/51], Iter [370/391] Loss: 0.0804
Epoch [30/51], Iter [380/391] Loss: 0.0800
Epoch [30/51], Iter [390/391] Loss: 0.0864
[Saving Checkpoint]
Epoch [31/51], Iter [10/391] Loss: 0.0818
Epoch [31/51], Iter [20/391] Loss: 0.0869
Epoch [31/51], Iter [30/391] Loss: 0.0747
Epoch [31/51], Iter [40/391] Loss: 0.0770
Epoch [31/51], Iter [50/391] Loss: 0.0873
Epoch [31/51], Iter [60/391] Loss: 0.0874
Epoch [31/51], Iter [70/391] Loss: 0.0783
Epoch [31/51], Iter [80/391] Loss: 0.0776
Epoch [31/51], Iter [90/391] Loss: 0.0781
Epoch [31/51], Iter [100/391] Loss: 0.0846
Epoch [31/51], Iter [110/391] Loss: 0.0791
Epoch [31/51], Iter [120/391] Loss: 0.0911
Epoch [31/51], Iter [130/391] Loss: 0.0981
Epoch [31/51], Iter [140/391] Loss: 0.0898
Epoch [31/51], Iter [150/391] Loss: 0.0877
Epoch [31/51], Iter [160/391] Loss: 0.0720
Epoch [31/51], Iter [170/391] Loss: 0.0643
Epoch [31/51], Iter [180/391] Loss: 0.0921
Epoch [31/51], Iter [190/391] Loss: 0.0930
Epoch [31/51], Iter [200/391] Loss: 0.0729
Epoch [31/51], Iter [210/391] Loss: 0.0987
Epoch [31/51], Iter [220/391] Loss: 0.0742
Epoch [31/51], Iter [230/391] Loss: 0.0796
Epoch [31/51], Iter [240/391] Loss: 0.0771
Epoch [31/51], Iter [250/391] Loss: 0.0797
Epoch [31/51], Iter [260/391] Loss: 0.0917
Epoch [31/51], Iter [270/391] Loss: 0.0742
Epoch [31/51], Iter [280/391] Loss: 0.0918
Epoch [31/51], Iter [290/391] Loss: 0.0830
Epoch [31/51], Iter [300/391] Loss: 0.0787
Epoch [31/51], Iter [310/391] Loss: 0.0809
Epoch [31/51], Iter [320/391] Loss: 0.0778
Epoch [31/51], Iter [330/391] Loss: 0.0846
Epoch [31/51], Iter [340/391] Loss: 0.1042
Epoch [31/51], Iter [350/391] Loss: 0.0937
Epoch [31/51], Iter [360/391] Loss: 0.0821
Epoch [31/51], Iter [370/391] Loss: 0.0874
Epoch [31/51], Iter [380/391] Loss: 0.0785
Epoch [31/51], Iter [390/391] Loss: 0.0779
Epoch [32/51], Iter [10/391] Loss: 0.0726
Epoch [32/51], Iter [20/391] Loss: 0.0766
Epoch [32/51], Iter [30/391] Loss: 0.0861
Epoch [32/51], Iter [40/391] Loss: 0.0717
Epoch [32/51], Iter [50/391] Loss: 0.0777
Epoch [32/51], Iter [60/391] Loss: 0.0785
Epoch [32/51], Iter [70/391] Loss: 0.0793
Epoch [32/51], Iter [80/391] Loss: 0.0731
Epoch [32/51], Iter [90/391] Loss: 0.0703
Epoch [32/51], Iter [100/391] Loss: 0.0791
Epoch [32/51], Iter [110/391] Loss: 0.0709
Epoch [32/51], Iter [120/391] Loss: 0.0834
Epoch [32/51], Iter [130/391] Loss: 0.0787
Epoch [32/51], Iter [140/391] Loss: 0.0743
Epoch [32/51], Iter [150/391] Loss: 0.0658
Epoch [32/51], Iter [160/391] Loss: 0.0780
Epoch [32/51], Iter [170/391] Loss: 0.0893
Epoch [32/51], Iter [180/391] Loss: 0.0958
Epoch [32/51], Iter [190/391] Loss: 0.0848
Epoch [32/51], Iter [200/391] Loss: 0.0861
Epoch [32/51], Iter [210/391] Loss: 0.0811
Epoch [32/51], Iter [220/391] Loss: 0.0800
Epoch [32/51], Iter [230/391] Loss: 0.0764
Epoch [32/51], Iter [240/391] Loss: 0.0906
Epoch [32/51], Iter [250/391] Loss: 0.0807
Epoch [32/51], Iter [260/391] Loss: 0.0934
Epoch [32/51], Iter [270/391] Loss: 0.0850
Epoch [32/51], Iter [280/391] Loss: 0.0885
Epoch [32/51], Iter [290/391] Loss: 0.0942
Epoch [32/51], Iter [300/391] Loss: 0.0890
Epoch [32/51], Iter [310/391] Loss: 0.0834
Epoch [32/51], Iter [320/391] Loss: 0.0887
Epoch [32/51], Iter [330/391] Loss: 0.0900
Epoch [32/51], Iter [340/391] Loss: 0.0842
Epoch [32/51], Iter [350/391] Loss: 0.0771
Epoch [32/51], Iter [360/391] Loss: 0.0910
Epoch [32/51], Iter [370/391] Loss: 0.0754
Epoch [32/51], Iter [380/391] Loss: 0.0911
Epoch [32/51], Iter [390/391] Loss: 0.0731
Epoch [33/51], Iter [10/391] Loss: 0.0757
Epoch [33/51], Iter [20/391] Loss: 0.0827
Epoch [33/51], Iter [30/391] Loss: 0.0732
Epoch [33/51], Iter [40/391] Loss: 0.0774
Epoch [33/51], Iter [50/391] Loss: 0.0715
Epoch [33/51], Iter [60/391] Loss: 0.0753
Epoch [33/51], Iter [70/391] Loss: 0.0878
Epoch [33/51], Iter [80/391] Loss: 0.0794
Epoch [33/51], Iter [90/391] Loss: 0.0664
Epoch [33/51], Iter [100/391] Loss: 0.0683
Epoch [33/51], Iter [110/391] Loss: 0.0758
Epoch [33/51], Iter [120/391] Loss: 0.0690
Epoch [33/51], Iter [130/391] Loss: 0.0672
Epoch [33/51], Iter [140/391] Loss: 0.0750
Epoch [33/51], Iter [150/391] Loss: 0.0766
Epoch [33/51], Iter [160/391] Loss: 0.0808
Epoch [33/51], Iter [170/391] Loss: 0.0726
Epoch [33/51], Iter [180/391] Loss: 0.0827
Epoch [33/51], Iter [190/391] Loss: 0.0658
Epoch [33/51], Iter [200/391] Loss: 0.0787
Epoch [33/51], Iter [210/391] Loss: 0.0616
Epoch [33/51], Iter [220/391] Loss: 0.0754
Epoch [33/51], Iter [230/391] Loss: 0.0759
Epoch [33/51], Iter [240/391] Loss: 0.0793
Epoch [33/51], Iter [250/391] Loss: 0.0780
Epoch [33/51], Iter [260/391] Loss: 0.0702
Epoch [33/51], Iter [270/391] Loss: 0.0864
Epoch [33/51], Iter [280/391] Loss: 0.0851
Epoch [33/51], Iter [290/391] Loss: 0.0767
Epoch [33/51], Iter [300/391] Loss: 0.0835
Epoch [33/51], Iter [310/391] Loss: 0.0729
Epoch [33/51], Iter [320/391] Loss: 0.0774
Epoch [33/51], Iter [330/391] Loss: 0.0801
Epoch [33/51], Iter [340/391] Loss: 0.0997
Epoch [33/51], Iter [350/391] Loss: 0.0756
Epoch [33/51], Iter [360/391] Loss: 0.0780
Epoch [33/51], Iter [370/391] Loss: 0.0820
Epoch [33/51], Iter [380/391] Loss: 0.0933
Epoch [33/51], Iter [390/391] Loss: 0.0825
Epoch [34/51], Iter [10/391] Loss: 0.0715
Epoch [34/51], Iter [20/391] Loss: 0.0809
Epoch [34/51], Iter [30/391] Loss: 0.0774
Epoch [34/51], Iter [40/391] Loss: 0.0718
Epoch [34/51], Iter [50/391] Loss: 0.0754
Epoch [34/51], Iter [60/391] Loss: 0.0813
Epoch [34/51], Iter [70/391] Loss: 0.0765
Epoch [34/51], Iter [80/391] Loss: 0.0694
Epoch [34/51], Iter [90/391] Loss: 0.0763
Epoch [34/51], Iter [100/391] Loss: 0.0768
Epoch [34/51], Iter [110/391] Loss: 0.0755
Epoch [34/51], Iter [120/391] Loss: 0.0821
Epoch [34/51], Iter [130/391] Loss: 0.0781
Epoch [34/51], Iter [140/391] Loss: 0.0800
Epoch [34/51], Iter [150/391] Loss: 0.0818
Epoch [34/51], Iter [160/391] Loss: 0.0789
Epoch [34/51], Iter [170/391] Loss: 0.0640
Epoch [34/51], Iter [180/391] Loss: 0.0685
Epoch [34/51], Iter [190/391] Loss: 0.0751
Epoch [34/51], Iter [200/391] Loss: 0.0716
Epoch [34/51], Iter [210/391] Loss: 0.0818
Epoch [34/51], Iter [220/391] Loss: 0.0763
Epoch [34/51], Iter [230/391] Loss: 0.0669
Epoch [34/51], Iter [240/391] Loss: 0.0681
Epoch [34/51], Iter [250/391] Loss: 0.0746
Epoch [34/51], Iter [260/391] Loss: 0.0755
Epoch [34/51], Iter [270/391] Loss: 0.0791
Epoch [34/51], Iter [280/391] Loss: 0.0747
Epoch [34/51], Iter [290/391] Loss: 0.0664
Epoch [34/51], Iter [300/391] Loss: 0.0968
Epoch [34/51], Iter [310/391] Loss: 0.0824
Epoch [34/51], Iter [320/391] Loss: 0.0897
Epoch [34/51], Iter [330/391] Loss: 0.0791
Epoch [34/51], Iter [340/391] Loss: 0.0750
Epoch [34/51], Iter [350/391] Loss: 0.0831
Epoch [34/51], Iter [360/391] Loss: 0.0753
Epoch [34/51], Iter [370/391] Loss: 0.0945
Epoch [34/51], Iter [380/391] Loss: 0.0856
Epoch [34/51], Iter [390/391] Loss: 0.0876
Epoch [35/51], Iter [10/391] Loss: 0.0793
Epoch [35/51], Iter [20/391] Loss: 0.0615
Epoch [35/51], Iter [30/391] Loss: 0.0769
Epoch [35/51], Iter [40/391] Loss: 0.0688
Epoch [35/51], Iter [50/391] Loss: 0.0864
Epoch [35/51], Iter [60/391] Loss: 0.0799
Epoch [35/51], Iter [70/391] Loss: 0.0681
Epoch [35/51], Iter [80/391] Loss: 0.0626
Epoch [35/51], Iter [90/391] Loss: 0.0648
Epoch [35/51], Iter [100/391] Loss: 0.0757
Epoch [35/51], Iter [110/391] Loss: 0.0694
Epoch [35/51], Iter [120/391] Loss: 0.0818
Epoch [35/51], Iter [130/391] Loss: 0.0819
Epoch [35/51], Iter [140/391] Loss: 0.0897
Epoch [35/51], Iter [150/391] Loss: 0.0796
Epoch [35/51], Iter [160/391] Loss: 0.0698
Epoch [35/51], Iter [170/391] Loss: 0.0785
Epoch [35/51], Iter [180/391] Loss: 0.0763
Epoch [35/51], Iter [190/391] Loss: 0.0790
Epoch [35/51], Iter [200/391] Loss: 0.0638
Epoch [35/51], Iter [210/391] Loss: 0.0888
Epoch [35/51], Iter [220/391] Loss: 0.0656
Epoch [35/51], Iter [230/391] Loss: 0.0723
Epoch [35/51], Iter [240/391] Loss: 0.0698
Epoch [35/51], Iter [250/391] Loss: 0.0846
Epoch [35/51], Iter [260/391] Loss: 0.0805
Epoch [35/51], Iter [270/391] Loss: 0.0747
Epoch [35/51], Iter [280/391] Loss: 0.0960
Epoch [35/51], Iter [290/391] Loss: 0.0773
Epoch [35/51], Iter [300/391] Loss: 0.0783
Epoch [35/51], Iter [310/391] Loss: 0.0730
Epoch [35/51], Iter [320/391] Loss: 0.0780
Epoch [35/51], Iter [330/391] Loss: 0.0753
Epoch [35/51], Iter [340/391] Loss: 0.0940
Epoch [35/51], Iter [350/391] Loss: 0.0903
Epoch [35/51], Iter [360/391] Loss: 0.0857
Epoch [35/51], Iter [370/391] Loss: 0.0741
Epoch [35/51], Iter [380/391] Loss: 0.0737
Epoch [35/51], Iter [390/391] Loss: 0.0665
Epoch [36/51], Iter [10/391] Loss: 0.0800
Epoch [36/51], Iter [20/391] Loss: 0.0761
Epoch [36/51], Iter [30/391] Loss: 0.0727
Epoch [36/51], Iter [40/391] Loss: 0.0735
Epoch [36/51], Iter [50/391] Loss: 0.0631
Epoch [36/51], Iter [60/391] Loss: 0.0835
Epoch [36/51], Iter [70/391] Loss: 0.0649
Epoch [36/51], Iter [80/391] Loss: 0.0673
Epoch [36/51], Iter [90/391] Loss: 0.0757
Epoch [36/51], Iter [100/391] Loss: 0.0715
Epoch [36/51], Iter [110/391] Loss: 0.0645
Epoch [36/51], Iter [120/391] Loss: 0.0743
Epoch [36/51], Iter [130/391] Loss: 0.0664
Epoch [36/51], Iter [140/391] Loss: 0.0756
Epoch [36/51], Iter [150/391] Loss: 0.0785
Epoch [36/51], Iter [160/391] Loss: 0.0783
Epoch [36/51], Iter [170/391] Loss: 0.0596
Epoch [36/51], Iter [180/391] Loss: 0.0837
Epoch [36/51], Iter [190/391] Loss: 0.0783
Epoch [36/51], Iter [200/391] Loss: 0.0656
Epoch [36/51], Iter [210/391] Loss: 0.0887
Epoch [36/51], Iter [220/391] Loss: 0.0794
Epoch [36/51], Iter [230/391] Loss: 0.0742
Epoch [36/51], Iter [240/391] Loss: 0.0824
Epoch [36/51], Iter [250/391] Loss: 0.0732
Epoch [36/51], Iter [260/391] Loss: 0.0770
Epoch [36/51], Iter [270/391] Loss: 0.0719
Epoch [36/51], Iter [280/391] Loss: 0.0691
Epoch [36/51], Iter [290/391] Loss: 0.0842
Epoch [36/51], Iter [300/391] Loss: 0.0763
Epoch [36/51], Iter [310/391] Loss: 0.0786
Epoch [36/51], Iter [320/391] Loss: 0.0645
Epoch [36/51], Iter [330/391] Loss: 0.0779
Epoch [36/51], Iter [340/391] Loss: 0.0828
Epoch [36/51], Iter [350/391] Loss: 0.0705
Epoch [36/51], Iter [360/391] Loss: 0.0819
Epoch [36/51], Iter [370/391] Loss: 0.0701
Epoch [36/51], Iter [380/391] Loss: 0.0944
Epoch [36/51], Iter [390/391] Loss: 0.0747
Epoch [37/51], Iter [10/391] Loss: 0.0799
Epoch [37/51], Iter [20/391] Loss: 0.0668
Epoch [37/51], Iter [30/391] Loss: 0.0665
Epoch [37/51], Iter [40/391] Loss: 0.0645
Epoch [37/51], Iter [50/391] Loss: 0.0644
Epoch [37/51], Iter [60/391] Loss: 0.0741
Epoch [37/51], Iter [70/391] Loss: 0.0771
Epoch [37/51], Iter [80/391] Loss: 0.0726
Epoch [37/51], Iter [90/391] Loss: 0.0738
Epoch [37/51], Iter [100/391] Loss: 0.0691
Epoch [37/51], Iter [110/391] Loss: 0.0714
Epoch [37/51], Iter [120/391] Loss: 0.0633
Epoch [37/51], Iter [130/391] Loss: 0.0689
Epoch [37/51], Iter [140/391] Loss: 0.0766
Epoch [37/51], Iter [150/391] Loss: 0.0746
Epoch [37/51], Iter [160/391] Loss: 0.0659
Epoch [37/51], Iter [170/391] Loss: 0.0796
Epoch [37/51], Iter [180/391] Loss: 0.0599
Epoch [37/51], Iter [190/391] Loss: 0.0711
Epoch [37/51], Iter [200/391] Loss: 0.0753
Epoch [37/51], Iter [210/391] Loss: 0.0679
Epoch [37/51], Iter [220/391] Loss: 0.0786
Epoch [37/51], Iter [230/391] Loss: 0.0779
Epoch [37/51], Iter [240/391] Loss: 0.0691
Epoch [37/51], Iter [250/391] Loss: 0.0765
Epoch [37/51], Iter [260/391] Loss: 0.0794
Epoch [37/51], Iter [270/391] Loss: 0.0642
Epoch [37/51], Iter [280/391] Loss: 0.0733
Epoch [37/51], Iter [290/391] Loss: 0.0765
Epoch [37/51], Iter [300/391] Loss: 0.0716
Epoch [37/51], Iter [310/391] Loss: 0.0745
Epoch [37/51], Iter [320/391] Loss: 0.0817
Epoch [37/51], Iter [330/391] Loss: 0.0734
Epoch [37/51], Iter [340/391] Loss: 0.0811
Epoch [37/51], Iter [350/391] Loss: 0.0706
Epoch [37/51], Iter [360/391] Loss: 0.0687
Epoch [37/51], Iter [370/391] Loss: 0.0662
Epoch [37/51], Iter [380/391] Loss: 0.0791
Epoch [37/51], Iter [390/391] Loss: 0.0887
Epoch [38/51], Iter [10/391] Loss: 0.0684
Epoch [38/51], Iter [20/391] Loss: 0.0824
Epoch [38/51], Iter [30/391] Loss: 0.0684
Epoch [38/51], Iter [40/391] Loss: 0.0745
Epoch [38/51], Iter [50/391] Loss: 0.0734
Epoch [38/51], Iter [60/391] Loss: 0.0700
Epoch [38/51], Iter [70/391] Loss: 0.0789
Epoch [38/51], Iter [80/391] Loss: 0.0622
Epoch [38/51], Iter [90/391] Loss: 0.0818
Epoch [38/51], Iter [100/391] Loss: 0.0633
Epoch [38/51], Iter [110/391] Loss: 0.0642
Epoch [38/51], Iter [120/391] Loss: 0.0769
Epoch [38/51], Iter [130/391] Loss: 0.0683
Epoch [38/51], Iter [140/391] Loss: 0.0687
Epoch [38/51], Iter [150/391] Loss: 0.0692
Epoch [38/51], Iter [160/391] Loss: 0.0712
Epoch [38/51], Iter [170/391] Loss: 0.0769
Epoch [38/51], Iter [180/391] Loss: 0.0697
Epoch [38/51], Iter [190/391] Loss: 0.0680
Epoch [38/51], Iter [200/391] Loss: 0.0695
Epoch [38/51], Iter [210/391] Loss: 0.0886
Epoch [38/51], Iter [220/391] Loss: 0.0845
Epoch [38/51], Iter [230/391] Loss: 0.0738
Epoch [38/51], Iter [240/391] Loss: 0.0649
Epoch [38/51], Iter [250/391] Loss: 0.0635
Epoch [38/51], Iter [260/391] Loss: 0.0584
Epoch [38/51], Iter [270/391] Loss: 0.0712
Epoch [38/51], Iter [280/391] Loss: 0.0777
Epoch [38/51], Iter [290/391] Loss: 0.0675
Epoch [38/51], Iter [300/391] Loss: 0.0794
Epoch [38/51], Iter [310/391] Loss: 0.0665
Epoch [38/51], Iter [320/391] Loss: 0.0761
Epoch [38/51], Iter [330/391] Loss: 0.0800
Epoch [38/51], Iter [340/391] Loss: 0.0690
Epoch [38/51], Iter [350/391] Loss: 0.0657
Epoch [38/51], Iter [360/391] Loss: 0.0803
Epoch [38/51], Iter [370/391] Loss: 0.0703
Epoch [38/51], Iter [380/391] Loss: 0.0603
Epoch [38/51], Iter [390/391] Loss: 0.0719
Epoch [39/51], Iter [10/391] Loss: 0.0641
Epoch [39/51], Iter [20/391] Loss: 0.0626
Epoch [39/51], Iter [30/391] Loss: 0.0608
Epoch [39/51], Iter [40/391] Loss: 0.0665
Epoch [39/51], Iter [50/391] Loss: 0.0799
Epoch [39/51], Iter [60/391] Loss: 0.0728
Epoch [39/51], Iter [70/391] Loss: 0.0721
Epoch [39/51], Iter [80/391] Loss: 0.0665
Epoch [39/51], Iter [90/391] Loss: 0.0783
Epoch [39/51], Iter [100/391] Loss: 0.0594
Epoch [39/51], Iter [110/391] Loss: 0.0686
Epoch [39/51], Iter [120/391] Loss: 0.0624
Epoch [39/51], Iter [130/391] Loss: 0.0709
Epoch [39/51], Iter [140/391] Loss: 0.0580
Epoch [39/51], Iter [150/391] Loss: 0.0619
Epoch [39/51], Iter [160/391] Loss: 0.0704
Epoch [39/51], Iter [170/391] Loss: 0.0718
Epoch [39/51], Iter [180/391] Loss: 0.0646
Epoch [39/51], Iter [190/391] Loss: 0.0715
Epoch [39/51], Iter [200/391] Loss: 0.0827
Epoch [39/51], Iter [210/391] Loss: 0.0658
Epoch [39/51], Iter [220/391] Loss: 0.0706
Epoch [39/51], Iter [230/391] Loss: 0.0650
Epoch [39/51], Iter [240/391] Loss: 0.0720
Epoch [39/51], Iter [250/391] Loss: 0.0795
Epoch [39/51], Iter [260/391] Loss: 0.0676
Epoch [39/51], Iter [270/391] Loss: 0.0713
Epoch [39/51], Iter [280/391] Loss: 0.0648
Epoch [39/51], Iter [290/391] Loss: 0.0797
Epoch [39/51], Iter [300/391] Loss: 0.0712
Epoch [39/51], Iter [310/391] Loss: 0.0719
Epoch [39/51], Iter [320/391] Loss: 0.0813
Epoch [39/51], Iter [330/391] Loss: 0.0647
Epoch [39/51], Iter [340/391] Loss: 0.0783
Epoch [39/51], Iter [350/391] Loss: 0.0666
Epoch [39/51], Iter [360/391] Loss: 0.0799
Epoch [39/51], Iter [370/391] Loss: 0.0734
Epoch [39/51], Iter [380/391] Loss: 0.0739
Epoch [39/51], Iter [390/391] Loss: 0.0675
Epoch [40/51], Iter [10/391] Loss: 0.0655
Epoch [40/51], Iter [20/391] Loss: 0.0765
Epoch [40/51], Iter [30/391] Loss: 0.0638
Epoch [40/51], Iter [40/391] Loss: 0.0590
Epoch [40/51], Iter [50/391] Loss: 0.0768
Epoch [40/51], Iter [60/391] Loss: 0.0678
Epoch [40/51], Iter [70/391] Loss: 0.0673
Epoch [40/51], Iter [80/391] Loss: 0.0779
Epoch [40/51], Iter [90/391] Loss: 0.0639
Epoch [40/51], Iter [100/391] Loss: 0.0856
Epoch [40/51], Iter [110/391] Loss: 0.0725
Epoch [40/51], Iter [120/391] Loss: 0.0738
Epoch [40/51], Iter [130/391] Loss: 0.0694
Epoch [40/51], Iter [140/391] Loss: 0.0630
Epoch [40/51], Iter [150/391] Loss: 0.0637
Epoch [40/51], Iter [160/391] Loss: 0.0698
Epoch [40/51], Iter [170/391] Loss: 0.0698
Epoch [40/51], Iter [180/391] Loss: 0.0596
Epoch [40/51], Iter [190/391] Loss: 0.0786
Epoch [40/51], Iter [200/391] Loss: 0.0667
Epoch [40/51], Iter [210/391] Loss: 0.0726
Epoch [40/51], Iter [220/391] Loss: 0.0707
Epoch [40/51], Iter [230/391] Loss: 0.0708
Epoch [40/51], Iter [240/391] Loss: 0.0586
Epoch [40/51], Iter [250/391] Loss: 0.0619
Epoch [40/51], Iter [260/391] Loss: 0.0693
Epoch [40/51], Iter [270/391] Loss: 0.0658
Epoch [40/51], Iter [280/391] Loss: 0.0700
Epoch [40/51], Iter [290/391] Loss: 0.0667
Epoch [40/51], Iter [300/391] Loss: 0.0745
Epoch [40/51], Iter [310/391] Loss: 0.0674
Epoch [40/51], Iter [320/391] Loss: 0.0672
Epoch [40/51], Iter [330/391] Loss: 0.0571
Epoch [40/51], Iter [340/391] Loss: 0.0699
Epoch [40/51], Iter [350/391] Loss: 0.0664
Epoch [40/51], Iter [360/391] Loss: 0.0709
Epoch [40/51], Iter [370/391] Loss: 0.0725
Epoch [40/51], Iter [380/391] Loss: 0.0708
Epoch [40/51], Iter [390/391] Loss: 0.0681
[Saving Checkpoint]
Epoch [41/51], Iter [10/391] Loss: 0.0617
Epoch [41/51], Iter [20/391] Loss: 0.0599
Epoch [41/51], Iter [30/391] Loss: 0.0676
Epoch [41/51], Iter [40/391] Loss: 0.0851
Epoch [41/51], Iter [50/391] Loss: 0.0742
Epoch [41/51], Iter [60/391] Loss: 0.0639
Epoch [41/51], Iter [70/391] Loss: 0.0646
Epoch [41/51], Iter [80/391] Loss: 0.0627
Epoch [41/51], Iter [90/391] Loss: 0.0744
Epoch [41/51], Iter [100/391] Loss: 0.0653
Epoch [41/51], Iter [110/391] Loss: 0.0621
Epoch [41/51], Iter [120/391] Loss: 0.0560
Epoch [41/51], Iter [130/391] Loss: 0.0727
Epoch [41/51], Iter [140/391] Loss: 0.0626
Epoch [41/51], Iter [150/391] Loss: 0.0776
Epoch [41/51], Iter [160/391] Loss: 0.0694
Epoch [41/51], Iter [170/391] Loss: 0.0621
Epoch [41/51], Iter [180/391] Loss: 0.0675
Epoch [41/51], Iter [190/391] Loss: 0.0756
Epoch [41/51], Iter [200/391] Loss: 0.0718
Epoch [41/51], Iter [210/391] Loss: 0.0596
Epoch [41/51], Iter [220/391] Loss: 0.0723
Epoch [41/51], Iter [230/391] Loss: 0.0669
Epoch [41/51], Iter [240/391] Loss: 0.0677
Epoch [41/51], Iter [250/391] Loss: 0.0657
Epoch [41/51], Iter [260/391] Loss: 0.0662
Epoch [41/51], Iter [270/391] Loss: 0.0679
Epoch [41/51], Iter [280/391] Loss: 0.0560
Epoch [41/51], Iter [290/391] Loss: 0.0782
Epoch [41/51], Iter [300/391] Loss: 0.0565
Epoch [41/51], Iter [310/391] Loss: 0.0645
Epoch [41/51], Iter [320/391] Loss: 0.0676
Epoch [41/51], Iter [330/391] Loss: 0.0700
Epoch [41/51], Iter [340/391] Loss: 0.0632
Epoch [41/51], Iter [350/391] Loss: 0.0808
Epoch [41/51], Iter [360/391] Loss: 0.0542
Epoch [41/51], Iter [370/391] Loss: 0.0671
Epoch [41/51], Iter [380/391] Loss: 0.0626
Epoch [41/51], Iter [390/391] Loss: 0.0674
Epoch [42/51], Iter [10/391] Loss: 0.0560
Epoch [42/51], Iter [20/391] Loss: 0.0656
Epoch [42/51], Iter [30/391] Loss: 0.0727
Epoch [42/51], Iter [40/391] Loss: 0.0569
Epoch [42/51], Iter [50/391] Loss: 0.0603
Epoch [42/51], Iter [60/391] Loss: 0.0686
Epoch [42/51], Iter [70/391] Loss: 0.0549
Epoch [42/51], Iter [80/391] Loss: 0.0592
Epoch [42/51], Iter [90/391] Loss: 0.0727
Epoch [42/51], Iter [100/391] Loss: 0.0595
Epoch [42/51], Iter [110/391] Loss: 0.0527
Epoch [42/51], Iter [120/391] Loss: 0.0838
Epoch [42/51], Iter [130/391] Loss: 0.0687
Epoch [42/51], Iter [140/391] Loss: 0.0699
Epoch [42/51], Iter [150/391] Loss: 0.0753
Epoch [42/51], Iter [160/391] Loss: 0.0558
Epoch [42/51], Iter [170/391] Loss: 0.0703
Epoch [42/51], Iter [180/391] Loss: 0.0616
Epoch [42/51], Iter [190/391] Loss: 0.0633
Epoch [42/51], Iter [200/391] Loss: 0.0608
Epoch [42/51], Iter [210/391] Loss: 0.0758
Epoch [42/51], Iter [220/391] Loss: 0.0627
Epoch [42/51], Iter [230/391] Loss: 0.0816
Epoch [42/51], Iter [240/391] Loss: 0.0678
Epoch [42/51], Iter [250/391] Loss: 0.0742
Epoch [42/51], Iter [260/391] Loss: 0.0690
Epoch [42/51], Iter [270/391] Loss: 0.0631
Epoch [42/51], Iter [280/391] Loss: 0.0725
Epoch [42/51], Iter [290/391] Loss: 0.0627
Epoch [42/51], Iter [300/391] Loss: 0.0635
Epoch [42/51], Iter [310/391] Loss: 0.0606
Epoch [42/51], Iter [320/391] Loss: 0.0679
Epoch [42/51], Iter [330/391] Loss: 0.0609
Epoch [42/51], Iter [340/391] Loss: 0.0727
Epoch [42/51], Iter [350/391] Loss: 0.0606
Epoch [42/51], Iter [360/391] Loss: 0.0713
Epoch [42/51], Iter [370/391] Loss: 0.0728
Epoch [42/51], Iter [380/391] Loss: 0.0582
Epoch [42/51], Iter [390/391] Loss: 0.0689
Epoch [43/51], Iter [10/391] Loss: 0.0581
Epoch [43/51], Iter [20/391] Loss: 0.0674
Epoch [43/51], Iter [30/391] Loss: 0.0752
Epoch [43/51], Iter [40/391] Loss: 0.0700
Epoch [43/51], Iter [50/391] Loss: 0.0587
Epoch [43/51], Iter [60/391] Loss: 0.0572
Epoch [43/51], Iter [70/391] Loss: 0.0647
Epoch [43/51], Iter [80/391] Loss: 0.0636
Epoch [43/51], Iter [90/391] Loss: 0.0584
Epoch [43/51], Iter [100/391] Loss: 0.0727
Epoch [43/51], Iter [110/391] Loss: 0.0651
Epoch [43/51], Iter [120/391] Loss: 0.0718
Epoch [43/51], Iter [130/391] Loss: 0.0750
Epoch [43/51], Iter [140/391] Loss: 0.0589
Epoch [43/51], Iter [150/391] Loss: 0.0625
Epoch [43/51], Iter [160/391] Loss: 0.0634
Epoch [43/51], Iter [170/391] Loss: 0.0631
Epoch [43/51], Iter [180/391] Loss: 0.0727
Epoch [43/51], Iter [190/391] Loss: 0.0557
Epoch [43/51], Iter [200/391] Loss: 0.0590
Epoch [43/51], Iter [210/391] Loss: 0.0663
Epoch [43/51], Iter [220/391] Loss: 0.0671
Epoch [43/51], Iter [230/391] Loss: 0.0651
Epoch [43/51], Iter [240/391] Loss: 0.0615
Epoch [43/51], Iter [250/391] Loss: 0.0638
Epoch [43/51], Iter [260/391] Loss: 0.0659
Epoch [43/51], Iter [270/391] Loss: 0.0700
Epoch [43/51], Iter [280/391] Loss: 0.0669
Epoch [43/51], Iter [290/391] Loss: 0.0593
Epoch [43/51], Iter [300/391] Loss: 0.0603
Epoch [43/51], Iter [310/391] Loss: 0.0656
Epoch [43/51], Iter [320/391] Loss: 0.0640
Epoch [43/51], Iter [330/391] Loss: 0.0706
Epoch [43/51], Iter [340/391] Loss: 0.0785
Epoch [43/51], Iter [350/391] Loss: 0.0685
Epoch [43/51], Iter [360/391] Loss: 0.0624
Epoch [43/51], Iter [370/391] Loss: 0.0668
Epoch [43/51], Iter [380/391] Loss: 0.0662
Epoch [43/51], Iter [390/391] Loss: 0.0714
Epoch [44/51], Iter [10/391] Loss: 0.0667
Epoch [44/51], Iter [20/391] Loss: 0.0586
Epoch [44/51], Iter [30/391] Loss: 0.0557
Epoch [44/51], Iter [40/391] Loss: 0.0568
Epoch [44/51], Iter [50/391] Loss: 0.0607
Epoch [44/51], Iter [60/391] Loss: 0.0694
Epoch [44/51], Iter [70/391] Loss: 0.0707
Epoch [44/51], Iter [80/391] Loss: 0.0609
Epoch [44/51], Iter [90/391] Loss: 0.0699
Epoch [44/51], Iter [100/391] Loss: 0.0664
Epoch [44/51], Iter [110/391] Loss: 0.0668
Epoch [44/51], Iter [120/391] Loss: 0.0581
Epoch [44/51], Iter [130/391] Loss: 0.0742
Epoch [44/51], Iter [140/391] Loss: 0.0558
Epoch [44/51], Iter [150/391] Loss: 0.0608
Epoch [44/51], Iter [160/391] Loss: 0.0819
Epoch [44/51], Iter [170/391] Loss: 0.0619
Epoch [44/51], Iter [180/391] Loss: 0.0800
Epoch [44/51], Iter [190/391] Loss: 0.0840
Epoch [44/51], Iter [200/391] Loss: 0.0656
Epoch [44/51], Iter [210/391] Loss: 0.0565
Epoch [44/51], Iter [220/391] Loss: 0.0632
Epoch [44/51], Iter [230/391] Loss: 0.0665
Epoch [44/51], Iter [240/391] Loss: 0.0594
Epoch [44/51], Iter [250/391] Loss: 0.0691
Epoch [44/51], Iter [260/391] Loss: 0.0673
Epoch [44/51], Iter [270/391] Loss: 0.0710
Epoch [44/51], Iter [280/391] Loss: 0.0635
Epoch [44/51], Iter [290/391] Loss: 0.0620
Epoch [44/51], Iter [300/391] Loss: 0.0707
Epoch [44/51], Iter [310/391] Loss: 0.0652
Epoch [44/51], Iter [320/391] Loss: 0.0608
Epoch [44/51], Iter [330/391] Loss: 0.0810
Epoch [44/51], Iter [340/391] Loss: 0.0661
Epoch [44/51], Iter [350/391] Loss: 0.0612
Epoch [44/51], Iter [360/391] Loss: 0.0685
Epoch [44/51], Iter [370/391] Loss: 0.0701
Epoch [44/51], Iter [380/391] Loss: 0.0663
Epoch [44/51], Iter [390/391] Loss: 0.0592
Epoch [45/51], Iter [10/391] Loss: 0.0542
Epoch [45/51], Iter [20/391] Loss: 0.0625
Epoch [45/51], Iter [30/391] Loss: 0.0484
Epoch [45/51], Iter [40/391] Loss: 0.0589
Epoch [45/51], Iter [50/391] Loss: 0.0607
Epoch [45/51], Iter [60/391] Loss: 0.0556
Epoch [45/51], Iter [70/391] Loss: 0.0635
Epoch [45/51], Iter [80/391] Loss: 0.0575
Epoch [45/51], Iter [90/391] Loss: 0.0687
Epoch [45/51], Iter [100/391] Loss: 0.0706
Epoch [45/51], Iter [110/391] Loss: 0.0676
Epoch [45/51], Iter [120/391] Loss: 0.0665
Epoch [45/51], Iter [130/391] Loss: 0.0645
Epoch [45/51], Iter [140/391] Loss: 0.0641
Epoch [45/51], Iter [150/391] Loss: 0.0645
Epoch [45/51], Iter [160/391] Loss: 0.0554
Epoch [45/51], Iter [170/391] Loss: 0.0561
Epoch [45/51], Iter [180/391] Loss: 0.0612
Epoch [45/51], Iter [190/391] Loss: 0.0746
Epoch [45/51], Iter [200/391] Loss: 0.0600
Epoch [45/51], Iter [210/391] Loss: 0.0620
Epoch [45/51], Iter [220/391] Loss: 0.0670
Epoch [45/51], Iter [230/391] Loss: 0.0699
Epoch [45/51], Iter [240/391] Loss: 0.0588
Epoch [45/51], Iter [250/391] Loss: 0.0600
Epoch [45/51], Iter [260/391] Loss: 0.0559
Epoch [45/51], Iter [270/391] Loss: 0.0641
Epoch [45/51], Iter [280/391] Loss: 0.0644
Epoch [45/51], Iter [290/391] Loss: 0.0669
Epoch [45/51], Iter [300/391] Loss: 0.0704
Epoch [45/51], Iter [310/391] Loss: 0.0523
Epoch [45/51], Iter [320/391] Loss: 0.0751
Epoch [45/51], Iter [330/391] Loss: 0.0640
Epoch [45/51], Iter [340/391] Loss: 0.0575
Epoch [45/51], Iter [350/391] Loss: 0.0659
Epoch [45/51], Iter [360/391] Loss: 0.0597
Epoch [45/51], Iter [370/391] Loss: 0.0649
Epoch [45/51], Iter [380/391] Loss: 0.0669
Epoch [45/51], Iter [390/391] Loss: 0.0577
Epoch [46/51], Iter [10/391] Loss: 0.0705
Epoch [46/51], Iter [20/391] Loss: 0.0586
Epoch [46/51], Iter [30/391] Loss: 0.0579
Epoch [46/51], Iter [40/391] Loss: 0.0635
Epoch [46/51], Iter [50/391] Loss: 0.0553
Epoch [46/51], Iter [60/391] Loss: 0.0571
Epoch [46/51], Iter [70/391] Loss: 0.0536
Epoch [46/51], Iter [80/391] Loss: 0.0639
Epoch [46/51], Iter [90/391] Loss: 0.0540
Epoch [46/51], Iter [100/391] Loss: 0.0591
Epoch [46/51], Iter [110/391] Loss: 0.0511
Epoch [46/51], Iter [120/391] Loss: 0.0594
Epoch [46/51], Iter [130/391] Loss: 0.0640
Epoch [46/51], Iter [140/391] Loss: 0.0607
Epoch [46/51], Iter [150/391] Loss: 0.0584
Epoch [46/51], Iter [160/391] Loss: 0.0681
Epoch [46/51], Iter [170/391] Loss: 0.0685
Epoch [46/51], Iter [180/391] Loss: 0.0671
Epoch [46/51], Iter [190/391] Loss: 0.0736
Epoch [46/51], Iter [200/391] Loss: 0.0554
Epoch [46/51], Iter [210/391] Loss: 0.0606
Epoch [46/51], Iter [220/391] Loss: 0.0622
Epoch [46/51], Iter [230/391] Loss: 0.0562
Epoch [46/51], Iter [240/391] Loss: 0.0646
Epoch [46/51], Iter [250/391] Loss: 0.0736
Epoch [46/51], Iter [260/391] Loss: 0.0631
Epoch [46/51], Iter [270/391] Loss: 0.0561
Epoch [46/51], Iter [280/391] Loss: 0.0499
Epoch [46/51], Iter [290/391] Loss: 0.0709
Epoch [46/51], Iter [300/391] Loss: 0.0704
Epoch [46/51], Iter [310/391] Loss: 0.0487
Epoch [46/51], Iter [320/391] Loss: 0.0629
Epoch [46/51], Iter [330/391] Loss: 0.0619
Epoch [46/51], Iter [340/391] Loss: 0.0744
Epoch [46/51], Iter [350/391] Loss: 0.0574
Epoch [46/51], Iter [360/391] Loss: 0.0658
Epoch [46/51], Iter [370/391] Loss: 0.0699
Epoch [46/51], Iter [380/391] Loss: 0.0600
Epoch [46/51], Iter [390/391] Loss: 0.0549
Epoch [47/51], Iter [10/391] Loss: 0.0592
Epoch [47/51], Iter [20/391] Loss: 0.0581
Epoch [47/51], Iter [30/391] Loss: 0.0650
Epoch [47/51], Iter [40/391] Loss: 0.0552
Epoch [47/51], Iter [50/391] Loss: 0.0648
Epoch [47/51], Iter [60/391] Loss: 0.0568
Epoch [47/51], Iter [70/391] Loss: 0.0592
Epoch [47/51], Iter [80/391] Loss: 0.0567
Epoch [47/51], Iter [90/391] Loss: 0.0555
Epoch [47/51], Iter [100/391] Loss: 0.0619
Epoch [47/51], Iter [110/391] Loss: 0.0519
Epoch [47/51], Iter [120/391] Loss: 0.0595
Epoch [47/51], Iter [130/391] Loss: 0.0606
Epoch [47/51], Iter [140/391] Loss: 0.0748
Epoch [47/51], Iter [150/391] Loss: 0.0746
Epoch [47/51], Iter [160/391] Loss: 0.0574
Epoch [47/51], Iter [170/391] Loss: 0.0544
Epoch [47/51], Iter [180/391] Loss: 0.0689
Epoch [47/51], Iter [190/391] Loss: 0.0609
Epoch [47/51], Iter [200/391] Loss: 0.0536
Epoch [47/51], Iter [210/391] Loss: 0.0617
Epoch [47/51], Iter [220/391] Loss: 0.0681
Epoch [47/51], Iter [230/391] Loss: 0.0562
Epoch [47/51], Iter [240/391] Loss: 0.0573
Epoch [47/51], Iter [250/391] Loss: 0.0570
Epoch [47/51], Iter [260/391] Loss: 0.0684
Epoch [47/51], Iter [270/391] Loss: 0.0551
Epoch [47/51], Iter [280/391] Loss: 0.0781
Epoch [47/51], Iter [290/391] Loss: 0.0594
Epoch [47/51], Iter [300/391] Loss: 0.0646
Epoch [47/51], Iter [310/391] Loss: 0.0623
Epoch [47/51], Iter [320/391] Loss: 0.0501
Epoch [47/51], Iter [330/391] Loss: 0.0545
Epoch [47/51], Iter [340/391] Loss: 0.0682
Epoch [47/51], Iter [350/391] Loss: 0.0645
Epoch [47/51], Iter [360/391] Loss: 0.0643
Epoch [47/51], Iter [370/391] Loss: 0.0624
Epoch [47/51], Iter [380/391] Loss: 0.0554
Epoch [47/51], Iter [390/391] Loss: 0.0843
Epoch [48/51], Iter [10/391] Loss: 0.0551
Epoch [48/51], Iter [20/391] Loss: 0.0587
Epoch [48/51], Iter [30/391] Loss: 0.0547
Epoch [48/51], Iter [40/391] Loss: 0.0515
Epoch [48/51], Iter [50/391] Loss: 0.0548
Epoch [48/51], Iter [60/391] Loss: 0.0684
Epoch [48/51], Iter [70/391] Loss: 0.0537
Epoch [48/51], Iter [80/391] Loss: 0.0510
Epoch [48/51], Iter [90/391] Loss: 0.0564
Epoch [48/51], Iter [100/391] Loss: 0.0672
Epoch [48/51], Iter [110/391] Loss: 0.0584
Epoch [48/51], Iter [120/391] Loss: 0.0625
Epoch [48/51], Iter [130/391] Loss: 0.0669
Epoch [48/51], Iter [140/391] Loss: 0.0699
Epoch [48/51], Iter [150/391] Loss: 0.0653
Epoch [48/51], Iter [160/391] Loss: 0.0605
Epoch [48/51], Iter [170/391] Loss: 0.0606
Epoch [48/51], Iter [180/391] Loss: 0.0516
Epoch [48/51], Iter [190/391] Loss: 0.0590
Epoch [48/51], Iter [200/391] Loss: 0.0570
Epoch [48/51], Iter [210/391] Loss: 0.0500
Epoch [48/51], Iter [220/391] Loss: 0.0606
Epoch [48/51], Iter [230/391] Loss: 0.0603
Epoch [48/51], Iter [240/391] Loss: 0.0654
Epoch [48/51], Iter [250/391] Loss: 0.0709
Epoch [48/51], Iter [260/391] Loss: 0.0615
Epoch [48/51], Iter [270/391] Loss: 0.0623
Epoch [48/51], Iter [280/391] Loss: 0.0644
Epoch [48/51], Iter [290/391] Loss: 0.0624
Epoch [48/51], Iter [300/391] Loss: 0.0619
Epoch [48/51], Iter [310/391] Loss: 0.0634
Epoch [48/51], Iter [320/391] Loss: 0.0633
Epoch [48/51], Iter [330/391] Loss: 0.0559
Epoch [48/51], Iter [340/391] Loss: 0.0608
Epoch [48/51], Iter [350/391] Loss: 0.0606
Epoch [48/51], Iter [360/391] Loss: 0.0556
Epoch [48/51], Iter [370/391] Loss: 0.0568
Epoch [48/51], Iter [380/391] Loss: 0.0663
Epoch [48/51], Iter [390/391] Loss: 0.0617
Epoch [49/51], Iter [10/391] Loss: 0.0614
Epoch [49/51], Iter [20/391] Loss: 0.0612
Epoch [49/51], Iter [30/391] Loss: 0.0599
Epoch [49/51], Iter [40/391] Loss: 0.0559
Epoch [49/51], Iter [50/391] Loss: 0.0586
Epoch [49/51], Iter [60/391] Loss: 0.0586
Epoch [49/51], Iter [70/391] Loss: 0.0599
Epoch [49/51], Iter [80/391] Loss: 0.0597
Epoch [49/51], Iter [90/391] Loss: 0.0622
Epoch [49/51], Iter [100/391] Loss: 0.0569
Epoch [49/51], Iter [110/391] Loss: 0.0633
Epoch [49/51], Iter [120/391] Loss: 0.0602
Epoch [49/51], Iter [130/391] Loss: 0.0585
Epoch [49/51], Iter [140/391] Loss: 0.0606
Epoch [49/51], Iter [150/391] Loss: 0.0609
Epoch [49/51], Iter [160/391] Loss: 0.0630
Epoch [49/51], Iter [170/391] Loss: 0.0573
Epoch [49/51], Iter [180/391] Loss: 0.0635
Epoch [49/51], Iter [190/391] Loss: 0.0684
Epoch [49/51], Iter [200/391] Loss: 0.0528
Epoch [49/51], Iter [210/391] Loss: 0.0751
Epoch [49/51], Iter [220/391] Loss: 0.0556
Epoch [49/51], Iter [230/391] Loss: 0.0586
Epoch [49/51], Iter [240/391] Loss: 0.0634
Epoch [49/51], Iter [250/391] Loss: 0.0637
Epoch [49/51], Iter [260/391] Loss: 0.0580
Epoch [49/51], Iter [270/391] Loss: 0.0518
Epoch [49/51], Iter [280/391] Loss: 0.0529
Epoch [49/51], Iter [290/391] Loss: 0.0593
Epoch [49/51], Iter [300/391] Loss: 0.0626
Epoch [49/51], Iter [310/391] Loss: 0.0581
Epoch [49/51], Iter [320/391] Loss: 0.0517
Epoch [49/51], Iter [330/391] Loss: 0.0586
Epoch [49/51], Iter [340/391] Loss: 0.0689
Epoch [49/51], Iter [350/391] Loss: 0.0557
Epoch [49/51], Iter [360/391] Loss: 0.0567
Epoch [49/51], Iter [370/391] Loss: 0.0598
Epoch [49/51], Iter [380/391] Loss: 0.0568
Epoch [49/51], Iter [390/391] Loss: 0.0657
Epoch [50/51], Iter [10/391] Loss: 0.0546
Epoch [50/51], Iter [20/391] Loss: 0.0638
Epoch [50/51], Iter [30/391] Loss: 0.0551
Epoch [50/51], Iter [40/391] Loss: 0.0531
Epoch [50/51], Iter [50/391] Loss: 0.0556
Epoch [50/51], Iter [60/391] Loss: 0.0564
Epoch [50/51], Iter [70/391] Loss: 0.0644
Epoch [50/51], Iter [80/391] Loss: 0.0490
Epoch [50/51], Iter [90/391] Loss: 0.0543
Epoch [50/51], Iter [100/391] Loss: 0.0623
Epoch [50/51], Iter [110/391] Loss: 0.0567
Epoch [50/51], Iter [120/391] Loss: 0.0723
Epoch [50/51], Iter [130/391] Loss: 0.0547
Epoch [50/51], Iter [140/391] Loss: 0.0524
Epoch [50/51], Iter [150/391] Loss: 0.0536
Epoch [50/51], Iter [160/391] Loss: 0.0570
Epoch [50/51], Iter [170/391] Loss: 0.0565
Epoch [50/51], Iter [180/391] Loss: 0.0585
Epoch [50/51], Iter [190/391] Loss: 0.0541
Epoch [50/51], Iter [200/391] Loss: 0.0561
Epoch [50/51], Iter [210/391] Loss: 0.0650
Epoch [50/51], Iter [220/391] Loss: 0.0582
Epoch [50/51], Iter [230/391] Loss: 0.0623
Epoch [50/51], Iter [240/391] Loss: 0.0601
Epoch [50/51], Iter [250/391] Loss: 0.0646
Epoch [50/51], Iter [260/391] Loss: 0.0701
Epoch [50/51], Iter [270/391] Loss: 0.0508
Epoch [50/51], Iter [280/391] Loss: 0.0581
Epoch [50/51], Iter [290/391] Loss: 0.0504
Epoch [50/51], Iter [300/391] Loss: 0.0590
Epoch [50/51], Iter [310/391] Loss: 0.0614
Epoch [50/51], Iter [320/391] Loss: 0.0677
Epoch [50/51], Iter [330/391] Loss: 0.0652
Epoch [50/51], Iter [340/391] Loss: 0.0621
Epoch [50/51], Iter [350/391] Loss: 0.0637
Epoch [50/51], Iter [360/391] Loss: 0.0587
Epoch [50/51], Iter [370/391] Loss: 0.0535
Epoch [50/51], Iter [380/391] Loss: 0.0564
Epoch [50/51], Iter [390/391] Loss: 0.0507
[Saving Checkpoint]
Epoch [51/51], Iter [10/391] Loss: 0.0561
Epoch [51/51], Iter [20/391] Loss: 0.0495
Epoch [51/51], Iter [30/391] Loss: 0.0528
Epoch [51/51], Iter [40/391] Loss: 0.0578
Epoch [51/51], Iter [50/391] Loss: 0.0522
Epoch [51/51], Iter [60/391] Loss: 0.0600
Epoch [51/51], Iter [70/391] Loss: 0.0659
Epoch [51/51], Iter [80/391] Loss: 0.0543
Epoch [51/51], Iter [90/391] Loss: 0.0557
Epoch [51/51], Iter [100/391] Loss: 0.0658
Epoch [51/51], Iter [110/391] Loss: 0.0615
Epoch [51/51], Iter [120/391] Loss: 0.0508
Epoch [51/51], Iter [130/391] Loss: 0.0437
Epoch [51/51], Iter [140/391] Loss: 0.0526
Epoch [51/51], Iter [150/391] Loss: 0.0595
Epoch [51/51], Iter [160/391] Loss: 0.0498
Epoch [51/51], Iter [170/391] Loss: 0.0496
Epoch [51/51], Iter [180/391] Loss: 0.0528
Epoch [51/51], Iter [190/391] Loss: 0.0523
Epoch [51/51], Iter [200/391] Loss: 0.0534
Epoch [51/51], Iter [210/391] Loss: 0.0627
Epoch [51/51], Iter [220/391] Loss: 0.0605
Epoch [51/51], Iter [230/391] Loss: 0.0544
Epoch [51/51], Iter [240/391] Loss: 0.0550
Epoch [51/51], Iter [250/391] Loss: 0.0563
Epoch [51/51], Iter [260/391] Loss: 0.0565
Epoch [51/51], Iter [270/391] Loss: 0.0617
Epoch [51/51], Iter [280/391] Loss: 0.0647
Epoch [51/51], Iter [290/391] Loss: 0.0528
Epoch [51/51], Iter [300/391] Loss: 0.0517
Epoch [51/51], Iter [310/391] Loss: 0.0574
Epoch [51/51], Iter [320/391] Loss: 0.0694
Epoch [51/51], Iter [330/391] Loss: 0.0604
Epoch [51/51], Iter [340/391] Loss: 0.0611
Epoch [51/51], Iter [350/391] Loss: 0.0590
Epoch [51/51], Iter [360/391] Loss: 0.0503
Epoch [51/51], Iter [370/391] Loss: 0.0511
Epoch [51/51], Iter [380/391] Loss: 0.0608
Epoch [51/51], Iter [390/391] Loss: 0.0534
# | a=1 | T=10 | epochs = 51 |
resnet_child_a1_t10_e51 = copy.deepcopy(resnet_child)  # keep a copy of the a=1, T=10 student for later comparison
test_harness( testloader, resnet_child_a1_t10_e51 )
Accuracy of the model on the test images: 89 %
(tensor(8987, device='cuda:0'), 10000)
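For reference, these runs rely on the notebook's knowledge_distillation_loss, parameterised by a mixing weight alpha and a temperature T (e.g. partial(knowledge_distillation_loss, alpha=1, T=15) in the next cell). Its definition is not reproduced in this excerpt, so the following is only a minimal sketch of the standard Hinton-style loss it presumably implements; the function name, signature and the T-squared scaling below are illustrative assumptions, not the notebook's exact code.
import torch.nn.functional as F

def kd_loss_sketch(student_logits, teacher_logits, targets, alpha=1.0, T=10.0):
    # Soft-target term: KL divergence between temperature-softened student and teacher
    # distributions; the T*T factor keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction='batchmean') * (T * T)
    # Hard-target term: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, targets)
    # With alpha = 1, as in these runs, only the teacher's softened outputs drive the student.
    return alpha * soft + (1.0 - alpha) * hard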
# | a=1 | T=15 | epochs = 51 |
resnet_child, optimizer_child = get_new_child_model()
epoch = 0
kd_loss_a1_t15 = partial( knowledge_distillation_loss, alpha=1, T=15 )
training_harness( trainloader, optimizer_child, kd_loss_a1_t15, resnet_parent, resnet_child, model_name='DeepResNet_a1_t15_e51' )
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
Epoch [1/51], Iter [10/391] Loss: 2.0751
Epoch [1/51], Iter [20/391] Loss: 1.2181
Epoch [1/51], Iter [30/391] Loss: 1.0546
Epoch [1/51], Iter [40/391] Loss: 1.0536
Epoch [1/51], Iter [50/391] Loss: 0.9236
Epoch [1/51], Iter [60/391] Loss: 0.8526
Epoch [1/51], Iter [70/391] Loss: 0.8670
Epoch [1/51], Iter [80/391] Loss: 0.8116
Epoch [1/51], Iter [90/391] Loss: 0.7796
Epoch [1/51], Iter [100/391] Loss: 0.8379
Epoch [1/51], Iter [110/391] Loss: 0.8630
Epoch [1/51], Iter [120/391] Loss: 0.7857
Epoch [1/51], Iter [130/391] Loss: 0.7264
Epoch [1/51], Iter [140/391] Loss: 0.8099
Epoch [1/51], Iter [150/391] Loss: 0.7501
Epoch [1/51], Iter [160/391] Loss: 0.7292
Epoch [1/51], Iter [170/391] Loss: 0.6561
Epoch [1/51], Iter [180/391] Loss: 0.7113
Epoch [1/51], Iter [190/391] Loss: 0.7339
Epoch [1/51], Iter [200/391] Loss: 0.7282
Epoch [1/51], Iter [210/391] Loss: 0.6503
Epoch [1/51], Iter [220/391] Loss: 0.6643
Epoch [1/51], Iter [230/391] Loss: 0.7011
Epoch [1/51], Iter [240/391] Loss: 0.6724
Epoch [1/51], Iter [250/391] Loss: 0.6557
Epoch [1/51], Iter [260/391] Loss: 0.6457
Epoch [1/51], Iter [270/391] Loss: 0.6315
Epoch [1/51], Iter [280/391] Loss: 0.6297
Epoch [1/51], Iter [290/391] Loss: 0.6337
Epoch [1/51], Iter [300/391] Loss: 0.6012
Epoch [1/51], Iter [310/391] Loss: 0.6433
Epoch [1/51], Iter [320/391] Loss: 0.5737
Epoch [1/51], Iter [330/391] Loss: 0.6016
Epoch [1/51], Iter [340/391] Loss: 0.6162
Epoch [1/51], Iter [350/391] Loss: 0.5651
Epoch [1/51], Iter [360/391] Loss: 0.6521
Epoch [1/51], Iter [370/391] Loss: 0.6148
Epoch [1/51], Iter [380/391] Loss: 0.5786
Epoch [1/51], Iter [390/391] Loss: 0.5515
Epoch [2/51], Iter [10/391] Loss: 0.5629
Epoch [2/51], Iter [20/391] Loss: 0.5791
Epoch [2/51], Iter [30/391] Loss: 0.5869
Epoch [2/51], Iter [40/391] Loss: 0.5405
Epoch [2/51], Iter [50/391] Loss: 0.5567
Epoch [2/51], Iter [60/391] Loss: 0.5418
Epoch [2/51], Iter [70/391] Loss: 0.5608
Epoch [2/51], Iter [80/391] Loss: 0.5636
Epoch [2/51], Iter [90/391] Loss: 0.5419
Epoch [2/51], Iter [100/391] Loss: 0.5643
Epoch [2/51], Iter [110/391] Loss: 0.5155
Epoch [2/51], Iter [120/391] Loss: 0.5261
Epoch [2/51], Iter [130/391] Loss: 0.4992
Epoch [2/51], Iter [140/391] Loss: 0.4995
Epoch [2/51], Iter [150/391] Loss: 0.5283
Epoch [2/51], Iter [160/391] Loss: 0.5073
Epoch [2/51], Iter [170/391] Loss: 0.4879
Epoch [2/51], Iter [180/391] Loss: 0.4619
Epoch [2/51], Iter [190/391] Loss: 0.4907
Epoch [2/51], Iter [200/391] Loss: 0.5161
Epoch [2/51], Iter [210/391] Loss: 0.4804
Epoch [2/51], Iter [220/391] Loss: 0.4885
Epoch [2/51], Iter [230/391] Loss: 0.4802
Epoch [2/51], Iter [240/391] Loss: 0.5091
Epoch [2/51], Iter [250/391] Loss: 0.4811
Epoch [2/51], Iter [260/391] Loss: 0.4468
Epoch [2/51], Iter [270/391] Loss: 0.4574
Epoch [2/51], Iter [280/391] Loss: 0.4945
Epoch [2/51], Iter [290/391] Loss: 0.4423
Epoch [2/51], Iter [300/391] Loss: 0.4376
Epoch [2/51], Iter [310/391] Loss: 0.4783
Epoch [2/51], Iter [320/391] Loss: 0.4282
Epoch [2/51], Iter [330/391] Loss: 0.4548
Epoch [2/51], Iter [340/391] Loss: 0.4035
Epoch [2/51], Iter [350/391] Loss: 0.4545
Epoch [2/51], Iter [360/391] Loss: 0.4657
Epoch [2/51], Iter [370/391] Loss: 0.4180
Epoch [2/51], Iter [380/391] Loss: 0.4036
Epoch [2/51], Iter [390/391] Loss: 0.4457
Epoch [3/51], Iter [10/391] Loss: 0.4136
Epoch [3/51], Iter [20/391] Loss: 0.4512
Epoch [3/51], Iter [30/391] Loss: 0.4034
Epoch [3/51], Iter [40/391] Loss: 0.4614
Epoch [3/51], Iter [50/391] Loss: 0.3816
Epoch [3/51], Iter [60/391] Loss: 0.4333
Epoch [3/51], Iter [70/391] Loss: 0.4101
Epoch [3/51], Iter [80/391] Loss: 0.3989
Epoch [3/51], Iter [90/391] Loss: 0.4027
Epoch [3/51], Iter [100/391] Loss: 0.4060
Epoch [3/51], Iter [110/391] Loss: 0.4514
Epoch [3/51], Iter [120/391] Loss: 0.3780
Epoch [3/51], Iter [130/391] Loss: 0.4164
Epoch [3/51], Iter [140/391] Loss: 0.3900
Epoch [3/51], Iter [150/391] Loss: 0.4221
Epoch [3/51], Iter [160/391] Loss: 0.3942
Epoch [3/51], Iter [170/391] Loss: 0.4559
Epoch [3/51], Iter [180/391] Loss: 0.3741
Epoch [3/51], Iter [190/391] Loss: 0.4003
Epoch [3/51], Iter [200/391] Loss: 0.3985
Epoch [3/51], Iter [210/391] Loss: 0.4032
Epoch [3/51], Iter [220/391] Loss: 0.3833
Epoch [3/51], Iter [230/391] Loss: 0.3612
Epoch [3/51], Iter [240/391] Loss: 0.3880
Epoch [3/51], Iter [250/391] Loss: 0.4064
Epoch [3/51], Iter [260/391] Loss: 0.3310
Epoch [3/51], Iter [270/391] Loss: 0.3828
Epoch [3/51], Iter [280/391] Loss: 0.3595
Epoch [3/51], Iter [290/391] Loss: 0.3831
Epoch [3/51], Iter [300/391] Loss: 0.3620
Epoch [3/51], Iter [310/391] Loss: 0.3706
Epoch [3/51], Iter [320/391] Loss: 0.3573
Epoch [3/51], Iter [330/391] Loss: 0.3808
Epoch [3/51], Iter [340/391] Loss: 0.3639
Epoch [3/51], Iter [350/391] Loss: 0.3644
Epoch [3/51], Iter [360/391] Loss: 0.3472
Epoch [3/51], Iter [370/391] Loss: 0.3605
Epoch [3/51], Iter [380/391] Loss: 0.3442
Epoch [3/51], Iter [390/391] Loss: 0.4041
Epoch [4/51], Iter [10/391] Loss: 0.3347
Epoch [4/51], Iter [20/391] Loss: 0.3653
Epoch [4/51], Iter [30/391] Loss: 0.3785
Epoch [4/51], Iter [40/391] Loss: 0.3358
Epoch [4/51], Iter [50/391] Loss: 0.3420
Epoch [4/51], Iter [60/391] Loss: 0.3708
Epoch [4/51], Iter [70/391] Loss: 0.3508
Epoch [4/51], Iter [80/391] Loss: 0.3423
Epoch [4/51], Iter [90/391] Loss: 0.3305
Epoch [4/51], Iter [100/391] Loss: 0.3750
Epoch [4/51], Iter [110/391] Loss: 0.3361
Epoch [4/51], Iter [120/391] Loss: 0.3340
Epoch [4/51], Iter [130/391] Loss: 0.3627
Epoch [4/51], Iter [140/391] Loss: 0.3263
Epoch [4/51], Iter [150/391] Loss: 0.3178
Epoch [4/51], Iter [160/391] Loss: 0.2888
Epoch [4/51], Iter [170/391] Loss: 0.3315
Epoch [4/51], Iter [180/391] Loss: 0.3282
Epoch [4/51], Iter [190/391] Loss: 0.3533
Epoch [4/51], Iter [200/391] Loss: 0.3251
Epoch [4/51], Iter [210/391] Loss: 0.3476
Epoch [4/51], Iter [220/391] Loss: 0.3145
Epoch [4/51], Iter [230/391] Loss: 0.3155
Epoch [4/51], Iter [240/391] Loss: 0.2780
Epoch [4/51], Iter [250/391] Loss: 0.2873
Epoch [4/51], Iter [260/391] Loss: 0.3114
Epoch [4/51], Iter [270/391] Loss: 0.3214
Epoch [4/51], Iter [280/391] Loss: 0.3306
Epoch [4/51], Iter [290/391] Loss: 0.3420
Epoch [4/51], Iter [300/391] Loss: 0.2970
Epoch [4/51], Iter [310/391] Loss: 0.3466
Epoch [4/51], Iter [320/391] Loss: 0.2926
Epoch [4/51], Iter [330/391] Loss: 0.2985
Epoch [4/51], Iter [340/391] Loss: 0.2897
Epoch [4/51], Iter [350/391] Loss: 0.3201
Epoch [4/51], Iter [360/391] Loss: 0.3265
Epoch [4/51], Iter [370/391] Loss: 0.3120
Epoch [4/51], Iter [380/391] Loss: 0.3253
Epoch [4/51], Iter [390/391] Loss: 0.3061
Epoch [5/51], Iter [10/391] Loss: 0.2572
Epoch [5/51], Iter [20/391] Loss: 0.2694
Epoch [5/51], Iter [30/391] Loss: 0.3058
Epoch [5/51], Iter [40/391] Loss: 0.2887
Epoch [5/51], Iter [50/391] Loss: 0.2807
Epoch [5/51], Iter [60/391] Loss: 0.2728
Epoch [5/51], Iter [70/391] Loss: 0.3223
Epoch [5/51], Iter [80/391] Loss: 0.3094
Epoch [5/51], Iter [90/391] Loss: 0.3030
Epoch [5/51], Iter [100/391] Loss: 0.2995
Epoch [5/51], Iter [110/391] Loss: 0.3020
Epoch [5/51], Iter [120/391] Loss: 0.3060
Epoch [5/51], Iter [130/391] Loss: 0.2337
Epoch [5/51], Iter [140/391] Loss: 0.2974
Epoch [5/51], Iter [150/391] Loss: 0.3075
Epoch [5/51], Iter [160/391] Loss: 0.2758
Epoch [5/51], Iter [170/391] Loss: 0.2919
Epoch [5/51], Iter [180/391] Loss: 0.2797
Epoch [5/51], Iter [190/391] Loss: 0.2840
Epoch [5/51], Iter [200/391] Loss: 0.2832
Epoch [5/51], Iter [210/391] Loss: 0.2796
Epoch [5/51], Iter [220/391] Loss: 0.2875
Epoch [5/51], Iter [230/391] Loss: 0.2561
Epoch [5/51], Iter [240/391] Loss: 0.2777
Epoch [5/51], Iter [250/391] Loss: 0.2686
Epoch [5/51], Iter [260/391] Loss: 0.2852
Epoch [5/51], Iter [270/391] Loss: 0.2567
Epoch [5/51], Iter [280/391] Loss: 0.3212
Epoch [5/51], Iter [290/391] Loss: 0.2958
Epoch [5/51], Iter [300/391] Loss: 0.2811
Epoch [5/51], Iter [310/391] Loss: 0.2371
Epoch [5/51], Iter [320/391] Loss: 0.2815
Epoch [5/51], Iter [330/391] Loss: 0.2517
Epoch [5/51], Iter [340/391] Loss: 0.2404
Epoch [5/51], Iter [350/391] Loss: 0.2392
Epoch [5/51], Iter [360/391] Loss: 0.2638
Epoch [5/51], Iter [370/391] Loss: 0.2517
Epoch [5/51], Iter [380/391] Loss: 0.2452
Epoch [5/51], Iter [390/391] Loss: 0.2664
Epoch [6/51], Iter [10/391] Loss: 0.2341
Epoch [6/51], Iter [20/391] Loss: 0.2536
Epoch [6/51], Iter [30/391] Loss: 0.2659
Epoch [6/51], Iter [40/391] Loss: 0.2436
Epoch [6/51], Iter [50/391] Loss: 0.2570
Epoch [6/51], Iter [60/391] Loss: 0.2295
Epoch [6/51], Iter [70/391] Loss: 0.2181
Epoch [6/51], Iter [80/391] Loss: 0.2542
Epoch [6/51], Iter [90/391] Loss: 0.2565
Epoch [6/51], Iter [100/391] Loss: 0.2489
Epoch [6/51], Iter [110/391] Loss: 0.2635
Epoch [6/51], Iter [120/391] Loss: 0.2417
Epoch [6/51], Iter [130/391] Loss: 0.2440
Epoch [6/51], Iter [140/391] Loss: 0.2464
Epoch [6/51], Iter [150/391] Loss: 0.2810
Epoch [6/51], Iter [160/391] Loss: 0.2404
Epoch [6/51], Iter [170/391] Loss: 0.2310
Epoch [6/51], Iter [180/391] Loss: 0.2638
Epoch [6/51], Iter [190/391] Loss: 0.2153
Epoch [6/51], Iter [200/391] Loss: 0.2144
Epoch [6/51], Iter [210/391] Loss: 0.2427
Epoch [6/51], Iter [220/391] Loss: 0.2290
Epoch [6/51], Iter [230/391] Loss: 0.2247
Epoch [6/51], Iter [240/391] Loss: 0.2512
Epoch [6/51], Iter [250/391] Loss: 0.2336
Epoch [6/51], Iter [260/391] Loss: 0.2530
Epoch [6/51], Iter [270/391] Loss: 0.2414
Epoch [6/51], Iter [280/391] Loss: 0.2447
Epoch [6/51], Iter [290/391] Loss: 0.2596
Epoch [6/51], Iter [300/391] Loss: 0.2458
Epoch [6/51], Iter [310/391] Loss: 0.2398
Epoch [6/51], Iter [320/391] Loss: 0.2537
Epoch [6/51], Iter [330/391] Loss: 0.2349
Epoch [6/51], Iter [340/391] Loss: 0.2758
Epoch [6/51], Iter [350/391] Loss: 0.2482
Epoch [6/51], Iter [360/391] Loss: 0.2364
Epoch [6/51], Iter [370/391] Loss: 0.2152
Epoch [6/51], Iter [380/391] Loss: 0.2514
Epoch [6/51], Iter [390/391] Loss: 0.2434
Epoch [7/51], Iter [10/391] Loss: 0.2474
Epoch [7/51], Iter [20/391] Loss: 0.2215
Epoch [7/51], Iter [30/391] Loss: 0.2042
Epoch [7/51], Iter [40/391] Loss: 0.2328
Epoch [7/51], Iter [50/391] Loss: 0.2103
Epoch [7/51], Iter [60/391] Loss: 0.2392
Epoch [7/51], Iter [70/391] Loss: 0.2190
Epoch [7/51], Iter [80/391] Loss: 0.2359
Epoch [7/51], Iter [90/391] Loss: 0.2410
Epoch [7/51], Iter [100/391] Loss: 0.2058
Epoch [7/51], Iter [110/391] Loss: 0.1881
Epoch [7/51], Iter [120/391] Loss: 0.2266
Epoch [7/51], Iter [130/391] Loss: 0.2234
Epoch [7/51], Iter [140/391] Loss: 0.2251
Epoch [7/51], Iter [150/391] Loss: 0.2479
Epoch [7/51], Iter [160/391] Loss: 0.2200
Epoch [7/51], Iter [170/391] Loss: 0.2300
Epoch [7/51], Iter [180/391] Loss: 0.2229
Epoch [7/51], Iter [190/391] Loss: 0.2164
Epoch [7/51], Iter [200/391] Loss: 0.2292
Epoch [7/51], Iter [210/391] Loss: 0.2302
Epoch [7/51], Iter [220/391] Loss: 0.2263
Epoch [7/51], Iter [230/391] Loss: 0.2254
Epoch [7/51], Iter [240/391] Loss: 0.2396
Epoch [7/51], Iter [250/391] Loss: 0.2123
Epoch [7/51], Iter [260/391] Loss: 0.1900
Epoch [7/51], Iter [270/391] Loss: 0.1965
Epoch [7/51], Iter [280/391] Loss: 0.1999
Epoch [7/51], Iter [290/391] Loss: 0.2131
Epoch [7/51], Iter [300/391] Loss: 0.2498
Epoch [7/51], Iter [310/391] Loss: 0.1955
Epoch [7/51], Iter [320/391] Loss: 0.2152
Epoch [7/51], Iter [330/391] Loss: 0.2419
Epoch [7/51], Iter [340/391] Loss: 0.1741
Epoch [7/51], Iter [350/391] Loss: 0.2043
Epoch [7/51], Iter [360/391] Loss: 0.2117
Epoch [7/51], Iter [370/391] Loss: 0.2012
Epoch [7/51], Iter [380/391] Loss: 0.2037
Epoch [7/51], Iter [390/391] Loss: 0.1852
Epoch [8/51], Iter [10/391] Loss: 0.1891
Epoch [8/51], Iter [20/391] Loss: 0.1916
Epoch [8/51], Iter [30/391] Loss: 0.1967
Epoch [8/51], Iter [40/391] Loss: 0.2068
Epoch [8/51], Iter [50/391] Loss: 0.1909
Epoch [8/51], Iter [60/391] Loss: 0.1777
Epoch [8/51], Iter [70/391] Loss: 0.1784
Epoch [8/51], Iter [80/391] Loss: 0.1817
Epoch [8/51], Iter [90/391] Loss: 0.1919
Epoch [8/51], Iter [100/391] Loss: 0.2075
Epoch [8/51], Iter [110/391] Loss: 0.2353
Epoch [8/51], Iter [120/391] Loss: 0.1990
Epoch [8/51], Iter [130/391] Loss: 0.1775
Epoch [8/51], Iter [140/391] Loss: 0.2016
Epoch [8/51], Iter [150/391] Loss: 0.2394
Epoch [8/51], Iter [160/391] Loss: 0.1961
Epoch [8/51], Iter [170/391] Loss: 0.2324
Epoch [8/51], Iter [180/391] Loss: 0.2009
Epoch [8/51], Iter [190/391] Loss: 0.2127
Epoch [8/51], Iter [200/391] Loss: 0.1832
Epoch [8/51], Iter [210/391] Loss: 0.1937
Epoch [8/51], Iter [220/391] Loss: 0.1897
Epoch [8/51], Iter [230/391] Loss: 0.2205
Epoch [8/51], Iter [240/391] Loss: 0.1976
Epoch [8/51], Iter [250/391] Loss: 0.1866
Epoch [8/51], Iter [260/391] Loss: 0.2070
Epoch [8/51], Iter [270/391] Loss: 0.1912
Epoch [8/51], Iter [280/391] Loss: 0.2064
Epoch [8/51], Iter [290/391] Loss: 0.1992
Epoch [8/51], Iter [300/391] Loss: 0.2176
Epoch [8/51], Iter [310/391] Loss: 0.2012
Epoch [8/51], Iter [320/391] Loss: 0.1856
Epoch [8/51], Iter [330/391] Loss: 0.1863
Epoch [8/51], Iter [340/391] Loss: 0.2234
Epoch [8/51], Iter [350/391] Loss: 0.1814
Epoch [8/51], Iter [360/391] Loss: 0.1987
Epoch [8/51], Iter [370/391] Loss: 0.1981
Epoch [8/51], Iter [380/391] Loss: 0.1617
Epoch [8/51], Iter [390/391] Loss: 0.2075
Epoch [9/51], Iter [10/391] Loss: 0.1962
Epoch [9/51], Iter [20/391] Loss: 0.2092
Epoch [9/51], Iter [30/391] Loss: 0.2007
Epoch [9/51], Iter [40/391] Loss: 0.1959
Epoch [9/51], Iter [50/391] Loss: 0.1663
Epoch [9/51], Iter [60/391] Loss: 0.2095
Epoch [9/51], Iter [70/391] Loss: 0.2124
Epoch [9/51], Iter [80/391] Loss: 0.2026
Epoch [9/51], Iter [90/391] Loss: 0.1953
Epoch [9/51], Iter [100/391] Loss: 0.1954
Epoch [9/51], Iter [110/391] Loss: 0.1776
Epoch [9/51], Iter [120/391] Loss: 0.1775
Epoch [9/51], Iter [130/391] Loss: 0.2043
Epoch [9/51], Iter [140/391] Loss: 0.1718
Epoch [9/51], Iter [150/391] Loss: 0.1936
Epoch [9/51], Iter [160/391] Loss: 0.1865
Epoch [9/51], Iter [170/391] Loss: 0.2020
Epoch [9/51], Iter [180/391] Loss: 0.1756
Epoch [9/51], Iter [190/391] Loss: 0.2043
Epoch [9/51], Iter [200/391] Loss: 0.2057
Epoch [9/51], Iter [210/391] Loss: 0.2059
Epoch [9/51], Iter [220/391] Loss: 0.1898
Epoch [9/51], Iter [230/391] Loss: 0.1925
Epoch [9/51], Iter [240/391] Loss: 0.1751
Epoch [9/51], Iter [250/391] Loss: 0.1908
Epoch [9/51], Iter [260/391] Loss: 0.2026
Epoch [9/51], Iter [270/391] Loss: 0.1729
Epoch [9/51], Iter [280/391] Loss: 0.1767
Epoch [9/51], Iter [290/391] Loss: 0.1883
Epoch [9/51], Iter [300/391] Loss: 0.1903
Epoch [9/51], Iter [310/391] Loss: 0.2034
Epoch [9/51], Iter [320/391] Loss: 0.1926
Epoch [9/51], Iter [330/391] Loss: 0.1826
Epoch [9/51], Iter [340/391] Loss: 0.2234
Epoch [9/51], Iter [350/391] Loss: 0.1520
Epoch [9/51], Iter [360/391] Loss: 0.1747
Epoch [9/51], Iter [370/391] Loss: 0.1957
Epoch [9/51], Iter [380/391] Loss: 0.2010
Epoch [9/51], Iter [390/391] Loss: 0.1785
Epoch [10/51], Iter [10/391] Loss: 0.1720
Epoch [10/51], Iter [20/391] Loss: 0.1721
Epoch [10/51], Iter [30/391] Loss: 0.1589
Epoch [10/51], Iter [40/391] Loss: 0.1785
Epoch [10/51], Iter [50/391] Loss: 0.1661
Epoch [10/51], Iter [60/391] Loss: 0.1950
Epoch [10/51], Iter [70/391] Loss: 0.1704
Epoch [10/51], Iter [80/391] Loss: 0.1567
Epoch [10/51], Iter [90/391] Loss: 0.1462
Epoch [10/51], Iter [100/391] Loss: 0.1619
Epoch [10/51], Iter [110/391] Loss: 0.1677
Epoch [10/51], Iter [120/391] Loss: 0.1788
Epoch [10/51], Iter [130/391] Loss: 0.1587
Epoch [10/51], Iter [140/391] Loss: 0.1683
Epoch [10/51], Iter [150/391] Loss: 0.1778
Epoch [10/51], Iter [160/391] Loss: 0.1894
Epoch [10/51], Iter [170/391] Loss: 0.1815
Epoch [10/51], Iter [180/391] Loss: 0.1772
Epoch [10/51], Iter [190/391] Loss: 0.1694
Epoch [10/51], Iter [200/391] Loss: 0.1790
Epoch [10/51], Iter [210/391] Loss: 0.1513
Epoch [10/51], Iter [220/391] Loss: 0.1429
Epoch [10/51], Iter [230/391] Loss: 0.1823
Epoch [10/51], Iter [240/391] Loss: 0.1793
Epoch [10/51], Iter [250/391] Loss: 0.1757
Epoch [10/51], Iter [260/391] Loss: 0.1649
Epoch [10/51], Iter [270/391] Loss: 0.2123
Epoch [10/51], Iter [280/391] Loss: 0.1593
Epoch [10/51], Iter [290/391] Loss: 0.1839
Epoch [10/51], Iter [300/391] Loss: 0.1648
Epoch [10/51], Iter [310/391] Loss: 0.1868
Epoch [10/51], Iter [320/391] Loss: 0.1659
Epoch [10/51], Iter [330/391] Loss: 0.1586
Epoch [10/51], Iter [340/391] Loss: 0.1603
Epoch [10/51], Iter [350/391] Loss: 0.1844
Epoch [10/51], Iter [360/391] Loss: 0.1731
Epoch [10/51], Iter [370/391] Loss: 0.1834
Epoch [10/51], Iter [380/391] Loss: 0.1898
Epoch [10/51], Iter [390/391] Loss: 0.1552
[Saving Checkpoint]
Epoch [11/51], Iter [10/391] Loss: 0.1605
Epoch [11/51], Iter [20/391] Loss: 0.1455
Epoch [11/51], Iter [30/391] Loss: 0.1742
Epoch [11/51], Iter [40/391] Loss: 0.1688
Epoch [11/51], Iter [50/391] Loss: 0.1646
Epoch [11/51], Iter [60/391] Loss: 0.1734
Epoch [11/51], Iter [70/391] Loss: 0.1737
Epoch [11/51], Iter [80/391] Loss: 0.1587
Epoch [11/51], Iter [90/391] Loss: 0.1628
Epoch [11/51], Iter [390/391] Loss: 0.1638
Epoch [12/51], Iter [390/391] Loss: 0.1699
Epoch [13/51], Iter [390/391] Loss: 0.1469
Epoch [14/51], Iter [390/391] Loss: 0.1258
Epoch [15/51], Iter [390/391] Loss: 0.1461
Epoch [16/51], Iter [390/391] Loss: 0.1213
Epoch [17/51], Iter [390/391] Loss: 0.1139
Epoch [18/51], Iter [390/391] Loss: 0.0919
Epoch [19/51], Iter [390/391] Loss: 0.0988
Epoch [20/51], Iter [390/391] Loss: 0.1173
[Saving Checkpoint]
Epoch [21/51], Iter [390/391] Loss: 0.1065
Epoch [22/51], Iter [390/391] Loss: 0.1060
Epoch [23/51], Iter [390/391] Loss: 0.0910
Epoch [24/51], Iter [390/391] Loss: 0.0832
Epoch [25/51], Iter [390/391] Loss: 0.0942
Epoch [26/51], Iter [390/391] Loss: 0.0950
Epoch [27/51], Iter [390/391] Loss: 0.0829
Epoch [28/51], Iter [390/391] Loss: 0.0732
Epoch [29/51], Iter [390/391] Loss: 0.0811
Epoch [30/51], Iter [390/391] Loss: 0.0790
[Saving Checkpoint]
Epoch [31/51], Iter [390/391] Loss: 0.0792
Epoch [32/51], Iter [390/391] Loss: 0.0689
Epoch [33/51], Iter [390/391] Loss: 0.0617
Epoch [34/51], Iter [390/391] Loss: 0.0735
Epoch [35/51], Iter [390/391] Loss: 0.0667
Epoch [36/51], Iter [390/391] Loss: 0.0689
Epoch [37/51], Iter [390/391] Loss: 0.0682
Epoch [38/51], Iter [390/391] Loss: 0.0711
Epoch [39/51], Iter [390/391] Loss: 0.0630
Epoch [40/51], Iter [390/391] Loss: 0.0660
[Saving Checkpoint]
Epoch [41/51], Iter [390/391] Loss: 0.0694
Epoch [42/51], Iter [390/391] Loss: 0.0666
Epoch [43/51], Iter [390/391] Loss: 0.0665
Epoch [44/51], Iter [390/391] Loss: 0.0532
Epoch [45/51], Iter [390/391] Loss: 0.0578
Epoch [46/51], Iter [390/391] Loss: 0.0541
Epoch [47/51], Iter [390/391] Loss: 0.0592
Epoch [48/51], Iter [390/391] Loss: 0.0482
Epoch [49/51], Iter [10/391] Loss: 0.0601
Epoch [49/51], Iter [20/391] Loss: 0.0485
Epoch [49/51], Iter [30/391] Loss: 0.0598
Epoch [49/51], Iter [40/391] Loss: 0.0495
Epoch [49/51], Iter [50/391] Loss: 0.0520
Epoch [49/51], Iter [60/391] Loss: 0.0536
Epoch [49/51], Iter [70/391] Loss: 0.0603
Epoch [49/51], Iter [80/391] Loss: 0.0515
Epoch [49/51], Iter [90/391] Loss: 0.0525
Epoch [49/51], Iter [100/391] Loss: 0.0467
Epoch [49/51], Iter [110/391] Loss: 0.0544
Epoch [49/51], Iter [120/391] Loss: 0.0551
Epoch [49/51], Iter [130/391] Loss: 0.0606
Epoch [49/51], Iter [140/391] Loss: 0.0500
Epoch [49/51], Iter [150/391] Loss: 0.0490
Epoch [49/51], Iter [160/391] Loss: 0.0555
Epoch [49/51], Iter [170/391] Loss: 0.0522
Epoch [49/51], Iter [180/391] Loss: 0.0562
Epoch [49/51], Iter [190/391] Loss: 0.0592
Epoch [49/51], Iter [200/391] Loss: 0.0582
Epoch [49/51], Iter [210/391] Loss: 0.0492
Epoch [49/51], Iter [220/391] Loss: 0.0525
Epoch [49/51], Iter [230/391] Loss: 0.0615
Epoch [49/51], Iter [240/391] Loss: 0.0660
Epoch [49/51], Iter [250/391] Loss: 0.0621
Epoch [49/51], Iter [260/391] Loss: 0.0590
Epoch [49/51], Iter [270/391] Loss: 0.0481
Epoch [49/51], Iter [280/391] Loss: 0.0605
Epoch [49/51], Iter [290/391] Loss: 0.0576
Epoch [49/51], Iter [300/391] Loss: 0.0519
Epoch [49/51], Iter [310/391] Loss: 0.0612
Epoch [49/51], Iter [320/391] Loss: 0.0586
Epoch [49/51], Iter [330/391] Loss: 0.0515
Epoch [49/51], Iter [340/391] Loss: 0.0486
Epoch [49/51], Iter [350/391] Loss: 0.0669
Epoch [49/51], Iter [360/391] Loss: 0.0632
Epoch [49/51], Iter [370/391] Loss: 0.0607
Epoch [49/51], Iter [380/391] Loss: 0.0562
Epoch [49/51], Iter [390/391] Loss: 0.0525
Epoch [50/51], Iter [10/391] Loss: 0.0501
Epoch [50/51], Iter [20/391] Loss: 0.0513
Epoch [50/51], Iter [30/391] Loss: 0.0508
Epoch [50/51], Iter [40/391] Loss: 0.0599
Epoch [50/51], Iter [50/391] Loss: 0.0489
Epoch [50/51], Iter [60/391] Loss: 0.0504
Epoch [50/51], Iter [70/391] Loss: 0.0451
Epoch [50/51], Iter [80/391] Loss: 0.0509
Epoch [50/51], Iter [90/391] Loss: 0.0553
Epoch [50/51], Iter [100/391] Loss: 0.0432
Epoch [50/51], Iter [110/391] Loss: 0.0513
Epoch [50/51], Iter [120/391] Loss: 0.0612
Epoch [50/51], Iter [130/391] Loss: 0.0541
Epoch [50/51], Iter [140/391] Loss: 0.0554
Epoch [50/51], Iter [150/391] Loss: 0.0514
Epoch [50/51], Iter [160/391] Loss: 0.0513
Epoch [50/51], Iter [170/391] Loss: 0.0514
Epoch [50/51], Iter [180/391] Loss: 0.0566
Epoch [50/51], Iter [190/391] Loss: 0.0547
Epoch [50/51], Iter [200/391] Loss: 0.0551
Epoch [50/51], Iter [210/391] Loss: 0.0517
Epoch [50/51], Iter [220/391] Loss: 0.0592
Epoch [50/51], Iter [230/391] Loss: 0.0517
Epoch [50/51], Iter [240/391] Loss: 0.0487
Epoch [50/51], Iter [250/391] Loss: 0.0570
Epoch [50/51], Iter [260/391] Loss: 0.0593
Epoch [50/51], Iter [270/391] Loss: 0.0496
Epoch [50/51], Iter [280/391] Loss: 0.0543
Epoch [50/51], Iter [290/391] Loss: 0.0484
Epoch [50/51], Iter [300/391] Loss: 0.0539
Epoch [50/51], Iter [310/391] Loss: 0.0604
Epoch [50/51], Iter [320/391] Loss: 0.0630
Epoch [50/51], Iter [330/391] Loss: 0.0518
Epoch [50/51], Iter [340/391] Loss: 0.0520
Epoch [50/51], Iter [350/391] Loss: 0.0467
Epoch [50/51], Iter [360/391] Loss: 0.0515
Epoch [50/51], Iter [370/391] Loss: 0.0465
Epoch [50/51], Iter [380/391] Loss: 0.0612
Epoch [50/51], Iter [390/391] Loss: 0.0592
[Saving Checkpoint]
Epoch [51/51], Iter [10/391] Loss: 0.0480
Epoch [51/51], Iter [20/391] Loss: 0.0494
Epoch [51/51], Iter [30/391] Loss: 0.0509
Epoch [51/51], Iter [40/391] Loss: 0.0575
Epoch [51/51], Iter [50/391] Loss: 0.0468
Epoch [51/51], Iter [60/391] Loss: 0.0577
Epoch [51/51], Iter [70/391] Loss: 0.0539
Epoch [51/51], Iter [80/391] Loss: 0.0552
Epoch [51/51], Iter [90/391] Loss: 0.0575
Epoch [51/51], Iter [100/391] Loss: 0.0543
Epoch [51/51], Iter [110/391] Loss: 0.0586
Epoch [51/51], Iter [120/391] Loss: 0.0470
Epoch [51/51], Iter [130/391] Loss: 0.0569
Epoch [51/51], Iter [140/391] Loss: 0.0591
Epoch [51/51], Iter [150/391] Loss: 0.0563
Epoch [51/51], Iter [160/391] Loss: 0.0453
Epoch [51/51], Iter [170/391] Loss: 0.0554
Epoch [51/51], Iter [180/391] Loss: 0.0459
Epoch [51/51], Iter [190/391] Loss: 0.0583
Epoch [51/51], Iter [200/391] Loss: 0.0578
Epoch [51/51], Iter [210/391] Loss: 0.0546
Epoch [51/51], Iter [220/391] Loss: 0.0539
Epoch [51/51], Iter [230/391] Loss: 0.0568
Epoch [51/51], Iter [240/391] Loss: 0.0535
Epoch [51/51], Iter [250/391] Loss: 0.0544
Epoch [51/51], Iter [260/391] Loss: 0.0589
Epoch [51/51], Iter [270/391] Loss: 0.0535
Epoch [51/51], Iter [280/391] Loss: 0.0584
Epoch [51/51], Iter [290/391] Loss: 0.0598
Epoch [51/51], Iter [300/391] Loss: 0.0597
Epoch [51/51], Iter [310/391] Loss: 0.0513
Epoch [51/51], Iter [320/391] Loss: 0.0541
Epoch [51/51], Iter [330/391] Loss: 0.0618
Epoch [51/51], Iter [340/391] Loss: 0.0492
Epoch [51/51], Iter [350/391] Loss: 0.0527
Epoch [51/51], Iter [360/391] Loss: 0.0617
Epoch [51/51], Iter [370/391] Loss: 0.0496
Epoch [51/51], Iter [380/391] Loss: 0.0521
Epoch [51/51], Iter [390/391] Loss: 0.0570
# | a=1 | T=15 | epochs = 51 |
resnet_child_a1_t15_e51 = copy.deepcopy(resnet_child)  # save the trained student for later reference
test_harness( testloader, resnet_child_a1_t15_e51 )  # evaluate the distilled student on the held-out test set
Accuracy of the model on the test images: 89 %
(tensor(8910, device='cuda:0'), 10000)
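The next run repeats the distillation experiment at a much lower temperature (T=2). For context, the sketch below shows the usual form of a distillation loss of the kind passed to training_harness via partial(knowledge_distillation_loss, alpha, T). The name kd_loss_sketch, its signature, and the exact alpha weighting are illustrative assumptions, not the notebook's own knowledge_distillation_loss (which is defined earlier).

import torch.nn.functional as F

def kd_loss_sketch(student_logits, teacher_logits, labels, alpha=1.0, T=2.0):
    # Soft-target term: KL divergence between the teacher's and the student's
    # temperature-softened distributions, scaled by T^2 so gradient magnitudes
    # stay comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction='batchmean',
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha = 1 trains on soft targets only (as in the runs above);
    # alpha = 0 recovers plain supervised training.
    return alpha * soft + (1 - alpha) * hard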
# | a=1 | T=2 | epochs = 51 |
resnet_child, optimizer_child = get_new_child_model()  # fresh student network and optimiser
epoch = 0
kd_loss_a1_t2 = partial( knowledge_distillation_loss, alpha=1, T=2 )  # distillation loss with temperature T=2
training_harness( trainloader, optimizer_child, kd_loss_a1_t2, resnet_parent, resnet_child, model_name='DeepResNet_a1_t2_e51' )
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
Epoch [1/51], Iter [10/391] Loss: 0.7287
Epoch [1/51], Iter [20/391] Loss: 0.6564
Epoch [1/51], Iter [30/391] Loss: 0.6365
Epoch [1/51], Iter [40/391] Loss: 0.6439
Epoch [1/51], Iter [50/391] Loss: 0.6296
Epoch [1/51], Iter [60/391] Loss: 0.5961
Epoch [1/51], Iter [70/391] Loss: 0.5753
Epoch [1/51], Iter [80/391] Loss: 0.5947
Epoch [1/51], Iter [90/391] Loss: 0.5449
Epoch [1/51], Iter [100/391] Loss: 0.5392
Epoch [1/51], Iter [110/391] Loss: 0.5424
Epoch [1/51], Iter [120/391] Loss: 0.5472
Epoch [1/51], Iter [130/391] Loss: 0.5221
Epoch [1/51], Iter [140/391] Loss: 0.5191
Epoch [1/51], Iter [150/391] Loss: 0.4782
Epoch [1/51], Iter [160/391] Loss: 0.5652
Epoch [1/51], Iter [170/391] Loss: 0.5151
Epoch [1/51], Iter [180/391] Loss: 0.5071
Epoch [1/51], Iter [190/391] Loss: 0.5263
Epoch [1/51], Iter [200/391] Loss: 0.4934
Epoch [1/51], Iter [210/391] Loss: 0.5426
Epoch [1/51], Iter [220/391] Loss: 0.4848
Epoch [1/51], Iter [230/391] Loss: 0.5039
Epoch [1/51], Iter [240/391] Loss: 0.4823
Epoch [1/51], Iter [250/391] Loss: 0.4724
Epoch [1/51], Iter [260/391] Loss: 0.5216
Epoch [1/51], Iter [270/391] Loss: 0.4595
Epoch [1/51], Iter [280/391] Loss: 0.4459
Epoch [1/51], Iter [290/391] Loss: 0.4502
Epoch [1/51], Iter [300/391] Loss: 0.4308
Epoch [1/51], Iter [310/391] Loss: 0.5056
Epoch [1/51], Iter [320/391] Loss: 0.4216
Epoch [1/51], Iter [330/391] Loss: 0.4161
Epoch [1/51], Iter [340/391] Loss: 0.4246
Epoch [1/51], Iter [350/391] Loss: 0.4243
Epoch [1/51], Iter [360/391] Loss: 0.4877
Epoch [1/51], Iter [370/391] Loss: 0.4588
Epoch [1/51], Iter [380/391] Loss: 0.4258
Epoch [1/51], Iter [390/391] Loss: 0.4175
Epoch [2/51], Iter [10/391] Loss: 0.4224
Epoch [2/51], Iter [20/391] Loss: 0.4143
Epoch [2/51], Iter [30/391] Loss: 0.4256
Epoch [2/51], Iter [40/391] Loss: 0.4203
Epoch [2/51], Iter [50/391] Loss: 0.4496
Epoch [2/51], Iter [60/391] Loss: 0.4343
Epoch [2/51], Iter [70/391] Loss: 0.4387
Epoch [2/51], Iter [80/391] Loss: 0.4023
Epoch [2/51], Iter [90/391] Loss: 0.3991
Epoch [2/51], Iter [100/391] Loss: 0.4035
Epoch [2/51], Iter [110/391] Loss: 0.3642
Epoch [2/51], Iter [120/391] Loss: 0.4030
Epoch [2/51], Iter [130/391] Loss: 0.4059
Epoch [2/51], Iter [140/391] Loss: 0.4105
Epoch [2/51], Iter [150/391] Loss: 0.4326
Epoch [2/51], Iter [160/391] Loss: 0.3883
Epoch [2/51], Iter [170/391] Loss: 0.3835
Epoch [2/51], Iter [180/391] Loss: 0.3559
Epoch [2/51], Iter [190/391] Loss: 0.3539
Epoch [2/51], Iter [200/391] Loss: 0.3222
Epoch [2/51], Iter [210/391] Loss: 0.3774
Epoch [2/51], Iter [220/391] Loss: 0.3938
Epoch [2/51], Iter [230/391] Loss: 0.3730
Epoch [2/51], Iter [240/391] Loss: 0.3533
Epoch [2/51], Iter [250/391] Loss: 0.3246
Epoch [2/51], Iter [260/391] Loss: 0.3535
Epoch [2/51], Iter [270/391] Loss: 0.3312
Epoch [2/51], Iter [280/391] Loss: 0.3484
Epoch [2/51], Iter [290/391] Loss: 0.3017
Epoch [2/51], Iter [300/391] Loss: 0.3443
Epoch [2/51], Iter [310/391] Loss: 0.3724
Epoch [2/51], Iter [320/391] Loss: 0.3243
Epoch [2/51], Iter [330/391] Loss: 0.3494
Epoch [2/51], Iter [340/391] Loss: 0.3413
Epoch [2/51], Iter [350/391] Loss: 0.3840
Epoch [2/51], Iter [360/391] Loss: 0.3127
Epoch [2/51], Iter [370/391] Loss: 0.3404
Epoch [2/51], Iter [380/391] Loss: 0.3396
Epoch [2/51], Iter [390/391] Loss: 0.3419
Epoch [3/51], Iter [10/391] Loss: 0.3014
Epoch [3/51], Iter [20/391] Loss: 0.2938
Epoch [3/51], Iter [30/391] Loss: 0.4256
Epoch [3/51], Iter [40/391] Loss: 0.2903
Epoch [3/51], Iter [50/391] Loss: 0.3196
Epoch [3/51], Iter [60/391] Loss: 0.3488
Epoch [3/51], Iter [70/391] Loss: 0.2968
Epoch [3/51], Iter [80/391] Loss: 0.3447
Epoch [3/51], Iter [90/391] Loss: 0.3249
Epoch [3/51], Iter [100/391] Loss: 0.3133
Epoch [3/51], Iter [110/391] Loss: 0.3324
Epoch [3/51], Iter [120/391] Loss: 0.2932
Epoch [3/51], Iter [130/391] Loss: 0.3034
Epoch [3/51], Iter [140/391] Loss: 0.2659
Epoch [3/51], Iter [150/391] Loss: 0.3411
Epoch [3/51], Iter [160/391] Loss: 0.3363
Epoch [3/51], Iter [170/391] Loss: 0.2934
Epoch [3/51], Iter [180/391] Loss: 0.3665
Epoch [3/51], Iter [190/391] Loss: 0.3173
Epoch [3/51], Iter [200/391] Loss: 0.3280
Epoch [3/51], Iter [210/391] Loss: 0.3543
Epoch [3/51], Iter [220/391] Loss: 0.2722
Epoch [3/51], Iter [230/391] Loss: 0.3021
Epoch [3/51], Iter [240/391] Loss: 0.3400
Epoch [3/51], Iter [250/391] Loss: 0.2898
Epoch [3/51], Iter [260/391] Loss: 0.3037
Epoch [3/51], Iter [270/391] Loss: 0.3138
Epoch [3/51], Iter [280/391] Loss: 0.2726
Epoch [3/51], Iter [290/391] Loss: 0.3417
Epoch [3/51], Iter [300/391] Loss: 0.3018
Epoch [3/51], Iter [310/391] Loss: 0.2761
Epoch [3/51], Iter [320/391] Loss: 0.3119
Epoch [3/51], Iter [330/391] Loss: 0.2726
Epoch [3/51], Iter [340/391] Loss: 0.2310
Epoch [3/51], Iter [350/391] Loss: 0.2771
Epoch [3/51], Iter [360/391] Loss: 0.2871
Epoch [3/51], Iter [370/391] Loss: 0.3266
Epoch [3/51], Iter [380/391] Loss: 0.2655
Epoch [3/51], Iter [390/391] Loss: 0.2565
Epoch [4/51], Iter [10/391] Loss: 0.2369
Epoch [4/51], Iter [20/391] Loss: 0.2623
Epoch [4/51], Iter [30/391] Loss: 0.3251
Epoch [4/51], Iter [40/391] Loss: 0.2857
Epoch [4/51], Iter [50/391] Loss: 0.2791
Epoch [4/51], Iter [60/391] Loss: 0.2441
Epoch [4/51], Iter [70/391] Loss: 0.3080
Epoch [4/51], Iter [80/391] Loss: 0.2653
Epoch [4/51], Iter [90/391] Loss: 0.3278
Epoch [4/51], Iter [100/391] Loss: 0.2704
Epoch [4/51], Iter [110/391] Loss: 0.2668
Epoch [4/51], Iter [120/391] Loss: 0.2829
Epoch [4/51], Iter [130/391] Loss: 0.2553
Epoch [4/51], Iter [140/391] Loss: 0.2954
Epoch [4/51], Iter [150/391] Loss: 0.3039
Epoch [4/51], Iter [160/391] Loss: 0.2789
Epoch [4/51], Iter [170/391] Loss: 0.2228
Epoch [4/51], Iter [180/391] Loss: 0.3330
Epoch [4/51], Iter [190/391] Loss: 0.2671
Epoch [4/51], Iter [200/391] Loss: 0.2353
Epoch [4/51], Iter [210/391] Loss: 0.2560
Epoch [4/51], Iter [220/391] Loss: 0.2392
Epoch [4/51], Iter [230/391] Loss: 0.2451
Epoch [4/51], Iter [240/391] Loss: 0.2821
Epoch [4/51], Iter [250/391] Loss: 0.2742
Epoch [4/51], Iter [260/391] Loss: 0.2585
Epoch [4/51], Iter [270/391] Loss: 0.2576
Epoch [4/51], Iter [280/391] Loss: 0.2888
Epoch [4/51], Iter [290/391] Loss: 0.2670
Epoch [4/51], Iter [300/391] Loss: 0.2279
Epoch [4/51], Iter [310/391] Loss: 0.2644
Epoch [4/51], Iter [320/391] Loss: 0.2755
Epoch [4/51], Iter [330/391] Loss: 0.2707
Epoch [4/51], Iter [340/391] Loss: 0.2599
Epoch [4/51], Iter [350/391] Loss: 0.2970
Epoch [4/51], Iter [360/391] Loss: 0.2545
Epoch [4/51], Iter [370/391] Loss: 0.2359
Epoch [4/51], Iter [380/391] Loss: 0.2591
Epoch [4/51], Iter [390/391] Loss: 0.2571
Epoch [5/51], Iter [10/391] Loss: 0.2305
Epoch [5/51], Iter [20/391] Loss: 0.2575
Epoch [5/51], Iter [30/391] Loss: 0.2276
Epoch [5/51], Iter [40/391] Loss: 0.2402
Epoch [5/51], Iter [50/391] Loss: 0.2350
Epoch [5/51], Iter [60/391] Loss: 0.2474
Epoch [5/51], Iter [70/391] Loss: 0.2696
Epoch [5/51], Iter [80/391] Loss: 0.2092
Epoch [5/51], Iter [90/391] Loss: 0.2804
Epoch [5/51], Iter [100/391] Loss: 0.2367
Epoch [5/51], Iter [110/391] Loss: 0.2530
Epoch [5/51], Iter [120/391] Loss: 0.2164
Epoch [5/51], Iter [130/391] Loss: 0.2236
Epoch [5/51], Iter [140/391] Loss: 0.2737
Epoch [5/51], Iter [150/391] Loss: 0.2303
Epoch [5/51], Iter [160/391] Loss: 0.2352
Epoch [5/51], Iter [170/391] Loss: 0.2014
Epoch [5/51], Iter [180/391] Loss: 0.2408
Epoch [5/51], Iter [190/391] Loss: 0.2149
Epoch [5/51], Iter [200/391] Loss: 0.2523
Epoch [5/51], Iter [210/391] Loss: 0.2430
Epoch [5/51], Iter [220/391] Loss: 0.2339
Epoch [5/51], Iter [230/391] Loss: 0.2417
Epoch [5/51], Iter [240/391] Loss: 0.2100
Epoch [5/51], Iter [250/391] Loss: 0.2580
Epoch [5/51], Iter [260/391] Loss: 0.1994
Epoch [5/51], Iter [270/391] Loss: 0.2441
Epoch [5/51], Iter [280/391] Loss: 0.2329
Epoch [5/51], Iter [290/391] Loss: 0.2662
Epoch [5/51], Iter [300/391] Loss: 0.2063
Epoch [5/51], Iter [310/391] Loss: 0.2113
Epoch [5/51], Iter [320/391] Loss: 0.2394
Epoch [5/51], Iter [330/391] Loss: 0.2586
Epoch [5/51], Iter [340/391] Loss: 0.2248
Epoch [5/51], Iter [350/391] Loss: 0.2070
Epoch [5/51], Iter [360/391] Loss: 0.2451
Epoch [5/51], Iter [370/391] Loss: 0.2499
Epoch [5/51], Iter [380/391] Loss: 0.2186
Epoch [5/51], Iter [390/391] Loss: 0.2663
Epoch [6/51], Iter [10/391] Loss: 0.1951
Epoch [6/51], Iter [20/391] Loss: 0.2014
Epoch [6/51], Iter [30/391] Loss: 0.2264
Epoch [6/51], Iter [40/391] Loss: 0.2030
Epoch [6/51], Iter [50/391] Loss: 0.1940
Epoch [6/51], Iter [60/391] Loss: 0.2375
Epoch [6/51], Iter [70/391] Loss: 0.2450
Epoch [6/51], Iter [80/391] Loss: 0.1844
Epoch [6/51], Iter [90/391] Loss: 0.2171
Epoch [6/51], Iter [100/391] Loss: 0.2403
Epoch [6/51], Iter [110/391] Loss: 0.2299
Epoch [6/51], Iter [120/391] Loss: 0.2010
Epoch [6/51], Iter [130/391] Loss: 0.2352
Epoch [6/51], Iter [140/391] Loss: 0.2319
Epoch [6/51], Iter [150/391] Loss: 0.2084
Epoch [6/51], Iter [160/391] Loss: 0.1692
Epoch [6/51], Iter [170/391] Loss: 0.1786
Epoch [6/51], Iter [180/391] Loss: 0.2236
Epoch [6/51], Iter [190/391] Loss: 0.2194
Epoch [6/51], Iter [200/391] Loss: 0.2560
Epoch [6/51], Iter [210/391] Loss: 0.2207
Epoch [6/51], Iter [220/391] Loss: 0.2245
Epoch [6/51], Iter [230/391] Loss: 0.2046
Epoch [6/51], Iter [240/391] Loss: 0.2371
Epoch [6/51], Iter [250/391] Loss: 0.2185
Epoch [6/51], Iter [260/391] Loss: 0.2349
Epoch [6/51], Iter [270/391] Loss: 0.2360
Epoch [6/51], Iter [280/391] Loss: 0.2521
Epoch [6/51], Iter [290/391] Loss: 0.2059
Epoch [6/51], Iter [300/391] Loss: 0.2545
Epoch [6/51], Iter [310/391] Loss: 0.2344
Epoch [6/51], Iter [320/391] Loss: 0.2660
Epoch [6/51], Iter [330/391] Loss: 0.2042
Epoch [6/51], Iter [340/391] Loss: 0.1885
Epoch [6/51], Iter [350/391] Loss: 0.1858
Epoch [6/51], Iter [360/391] Loss: 0.2094
Epoch [6/51], Iter [370/391] Loss: 0.2046
Epoch [6/51], Iter [380/391] Loss: 0.2054
Epoch [6/51], Iter [390/391] Loss: 0.1976
Epoch [7/51], Iter [10/391] Loss: 0.2086
Epoch [7/51], Iter [20/391] Loss: 0.1906
Epoch [7/51], Iter [30/391] Loss: 0.2024
Epoch [7/51], Iter [40/391] Loss: 0.1850
Epoch [7/51], Iter [50/391] Loss: 0.2178
Epoch [7/51], Iter [60/391] Loss: 0.1793
Epoch [7/51], Iter [70/391] Loss: 0.1768
Epoch [7/51], Iter [80/391] Loss: 0.1917
Epoch [7/51], Iter [90/391] Loss: 0.1751
Epoch [7/51], Iter [100/391] Loss: 0.1960
Epoch [7/51], Iter [110/391] Loss: 0.2046
Epoch [7/51], Iter [120/391] Loss: 0.1864
Epoch [7/51], Iter [130/391] Loss: 0.1759
Epoch [7/51], Iter [140/391] Loss: 0.1965
Epoch [7/51], Iter [150/391] Loss: 0.1559
Epoch [7/51], Iter [160/391] Loss: 0.2109
Epoch [7/51], Iter [170/391] Loss: 0.1869
Epoch [7/51], Iter [180/391] Loss: 0.1676
Epoch [7/51], Iter [190/391] Loss: 0.2175
Epoch [7/51], Iter [200/391] Loss: 0.1958
Epoch [7/51], Iter [210/391] Loss: 0.2130
Epoch [7/51], Iter [220/391] Loss: 0.1910
Epoch [7/51], Iter [230/391] Loss: 0.1986
Epoch [7/51], Iter [240/391] Loss: 0.1960
Epoch [7/51], Iter [250/391] Loss: 0.1673
Epoch [7/51], Iter [260/391] Loss: 0.1568
Epoch [7/51], Iter [270/391] Loss: 0.1931
Epoch [7/51], Iter [280/391] Loss: 0.1675
Epoch [7/51], Iter [290/391] Loss: 0.1791
Epoch [7/51], Iter [300/391] Loss: 0.2134
Epoch [7/51], Iter [310/391] Loss: 0.1970
Epoch [7/51], Iter [320/391] Loss: 0.1663
Epoch [7/51], Iter [330/391] Loss: 0.1817
Epoch [7/51], Iter [340/391] Loss: 0.1706
Epoch [7/51], Iter [350/391] Loss: 0.2026
Epoch [7/51], Iter [360/391] Loss: 0.1908
Epoch [7/51], Iter [370/391] Loss: 0.1809
Epoch [7/51], Iter [380/391] Loss: 0.1818
Epoch [7/51], Iter [390/391] Loss: 0.1775
Epoch [8/51], Iter [10/391] Loss: 0.1706
Epoch [8/51], Iter [20/391] Loss: 0.1705
Epoch [8/51], Iter [30/391] Loss: 0.1727
Epoch [8/51], Iter [40/391] Loss: 0.1644
Epoch [8/51], Iter [50/391] Loss: 0.1590
Epoch [8/51], Iter [60/391] Loss: 0.1758
Epoch [8/51], Iter [70/391] Loss: 0.1917
Epoch [8/51], Iter [80/391] Loss: 0.1927
Epoch [8/51], Iter [90/391] Loss: 0.1573
Epoch [8/51], Iter [100/391] Loss: 0.1685
Epoch [8/51], Iter [110/391] Loss: 0.1683
Epoch [8/51], Iter [120/391] Loss: 0.1648
Epoch [8/51], Iter [130/391] Loss: 0.1893
Epoch [8/51], Iter [140/391] Loss: 0.1802
Epoch [8/51], Iter [150/391] Loss: 0.1995
Epoch [8/51], Iter [160/391] Loss: 0.2130
Epoch [8/51], Iter [170/391] Loss: 0.1657
Epoch [8/51], Iter [180/391] Loss: 0.1749
Epoch [8/51], Iter [190/391] Loss: 0.1894
Epoch [8/51], Iter [200/391] Loss: 0.1528
Epoch [8/51], Iter [210/391] Loss: 0.1610
Epoch [8/51], Iter [220/391] Loss: 0.2136
Epoch [8/51], Iter [230/391] Loss: 0.1945
Epoch [8/51], Iter [240/391] Loss: 0.1595
Epoch [8/51], Iter [250/391] Loss: 0.1485
Epoch [8/51], Iter [260/391] Loss: 0.2264
Epoch [8/51], Iter [270/391] Loss: 0.1408
Epoch [8/51], Iter [280/391] Loss: 0.2095
Epoch [8/51], Iter [290/391] Loss: 0.1615
Epoch [8/51], Iter [300/391] Loss: 0.1633
Epoch [8/51], Iter [310/391] Loss: 0.1737
Epoch [8/51], Iter [320/391] Loss: 0.1703
Epoch [8/51], Iter [330/391] Loss: 0.1764
Epoch [8/51], Iter [340/391] Loss: 0.1650
Epoch [8/51], Iter [350/391] Loss: 0.1637
Epoch [8/51], Iter [360/391] Loss: 0.1804
Epoch [8/51], Iter [370/391] Loss: 0.2045
Epoch [8/51], Iter [380/391] Loss: 0.2540
Epoch [8/51], Iter [390/391] Loss: 0.1662
Epoch [9/51], Iter [10/391] Loss: 0.2097
Epoch [9/51], Iter [20/391] Loss: 0.1279
Epoch [9/51], Iter [30/391] Loss: 0.1729
Epoch [9/51], Iter [40/391] Loss: 0.1974
Epoch [9/51], Iter [50/391] Loss: 0.1722
Epoch [9/51], Iter [60/391] Loss: 0.1342
Epoch [9/51], Iter [70/391] Loss: 0.1589
Epoch [9/51], Iter [80/391] Loss: 0.1548
Epoch [9/51], Iter [90/391] Loss: 0.1624
Epoch [9/51], Iter [100/391] Loss: 0.1470
Epoch [9/51], Iter [110/391] Loss: 0.1376
Epoch [9/51], Iter [120/391] Loss: 0.1439
Epoch [9/51], Iter [130/391] Loss: 0.1650
Epoch [9/51], Iter [140/391] Loss: 0.1792
Epoch [9/51], Iter [150/391] Loss: 0.2014
Epoch [9/51], Iter [160/391] Loss: 0.1745
Epoch [9/51], Iter [170/391] Loss: 0.1499
Epoch [9/51], Iter [180/391] Loss: 0.1724
Epoch [9/51], Iter [190/391] Loss: 0.1529
Epoch [9/51], Iter [200/391] Loss: 0.1677
Epoch [9/51], Iter [210/391] Loss: 0.1718
Epoch [9/51], Iter [220/391] Loss: 0.1421
Epoch [9/51], Iter [230/391] Loss: 0.1749
Epoch [9/51], Iter [240/391] Loss: 0.1557
Epoch [9/51], Iter [250/391] Loss: 0.1483
Epoch [9/51], Iter [260/391] Loss: 0.1533
Epoch [9/51], Iter [270/391] Loss: 0.1347
Epoch [9/51], Iter [280/391] Loss: 0.1369
Epoch [9/51], Iter [290/391] Loss: 0.1518
Epoch [9/51], Iter [300/391] Loss: 0.1490
Epoch [9/51], Iter [310/391] Loss: 0.1567
Epoch [9/51], Iter [320/391] Loss: 0.1748
Epoch [9/51], Iter [330/391] Loss: 0.1458
Epoch [9/51], Iter [340/391] Loss: 0.1856
Epoch [9/51], Iter [350/391] Loss: 0.2018
Epoch [9/51], Iter [360/391] Loss: 0.1323
Epoch [9/51], Iter [370/391] Loss: 0.1670
Epoch [9/51], Iter [380/391] Loss: 0.1728
Epoch [9/51], Iter [390/391] Loss: 0.1720
Epoch [10/51], Iter [10/391] Loss: 0.1455
Epoch [10/51], Iter [20/391] Loss: 0.1554
Epoch [10/51], Iter [30/391] Loss: 0.1313
Epoch [10/51], Iter [40/391] Loss: 0.1600
Epoch [10/51], Iter [50/391] Loss: 0.1150
Epoch [10/51], Iter [60/391] Loss: 0.1536
Epoch [10/51], Iter [70/391] Loss: 0.1270
Epoch [10/51], Iter [80/391] Loss: 0.1279
Epoch [10/51], Iter [90/391] Loss: 0.1670
Epoch [10/51], Iter [100/391] Loss: 0.1399
Epoch [10/51], Iter [110/391] Loss: 0.1572
Epoch [10/51], Iter [120/391] Loss: 0.1479
Epoch [10/51], Iter [130/391] Loss: 0.1551
Epoch [10/51], Iter [140/391] Loss: 0.1791
Epoch [10/51], Iter [150/391] Loss: 0.1424
Epoch [10/51], Iter [160/391] Loss: 0.1705
Epoch [10/51], Iter [170/391] Loss: 0.1546
Epoch [10/51], Iter [180/391] Loss: 0.1546
Epoch [10/51], Iter [190/391] Loss: 0.1641
Epoch [10/51], Iter [200/391] Loss: 0.1104
Epoch [10/51], Iter [210/391] Loss: 0.1463
Epoch [10/51], Iter [220/391] Loss: 0.1223
Epoch [10/51], Iter [230/391] Loss: 0.1389
Epoch [10/51], Iter [240/391] Loss: 0.1477
Epoch [10/51], Iter [250/391] Loss: 0.1781
Epoch [10/51], Iter [260/391] Loss: 0.1446
Epoch [10/51], Iter [270/391] Loss: 0.1818
Epoch [10/51], Iter [280/391] Loss: 0.1725
Epoch [10/51], Iter [290/391] Loss: 0.1277
Epoch [10/51], Iter [300/391] Loss: 0.1444
Epoch [10/51], Iter [310/391] Loss: 0.1648
Epoch [10/51], Iter [320/391] Loss: 0.1580
Epoch [10/51], Iter [330/391] Loss: 0.1716
Epoch [10/51], Iter [340/391] Loss: 0.1599
Epoch [10/51], Iter [350/391] Loss: 0.1191
Epoch [10/51], Iter [360/391] Loss: 0.1778
Epoch [10/51], Iter [370/391] Loss: 0.1485
Epoch [10/51], Iter [380/391] Loss: 0.1488
Epoch [10/51], Iter [390/391] Loss: 0.1572
[Saving Checkpoint]
Epoch [11/51], Iter [10/391] Loss: 0.1674
Epoch [11/51], Iter [20/391] Loss: 0.1201
Epoch [11/51], Iter [30/391] Loss: 0.1416
Epoch [11/51], Iter [40/391] Loss: 0.1391
Epoch [11/51], Iter [50/391] Loss: 0.1430
Epoch [11/51], Iter [60/391] Loss: 0.1544
Epoch [11/51], Iter [70/391] Loss: 0.1290
Epoch [11/51], Iter [80/391] Loss: 0.1623
Epoch [11/51], Iter [90/391] Loss: 0.1570
Epoch [11/51], Iter [100/391] Loss: 0.1046
Epoch [11/51], Iter [110/391] Loss: 0.1695
Epoch [11/51], Iter [120/391] Loss: 0.1372
Epoch [11/51], Iter [130/391] Loss: 0.1399
Epoch [11/51], Iter [140/391] Loss: 0.1689
Epoch [11/51], Iter [150/391] Loss: 0.1240
Epoch [11/51], Iter [160/391] Loss: 0.1381
Epoch [11/51], Iter [170/391] Loss: 0.1135
Epoch [11/51], Iter [180/391] Loss: 0.1402
Epoch [11/51], Iter [190/391] Loss: 0.1391
Epoch [11/51], Iter [200/391] Loss: 0.1327
Epoch [11/51], Iter [210/391] Loss: 0.1402
Epoch [11/51], Iter [220/391] Loss: 0.1688
Epoch [11/51], Iter [230/391] Loss: 0.1244
Epoch [11/51], Iter [240/391] Loss: 0.1308
Epoch [11/51], Iter [250/391] Loss: 0.1143
Epoch [11/51], Iter [260/391] Loss: 0.1459
Epoch [11/51], Iter [270/391] Loss: 0.1474
Epoch [11/51], Iter [280/391] Loss: 0.1254
Epoch [11/51], Iter [290/391] Loss: 0.1557
Epoch [11/51], Iter [300/391] Loss: 0.1352
Epoch [11/51], Iter [310/391] Loss: 0.1596
Epoch [11/51], Iter [320/391] Loss: 0.1242
Epoch [11/51], Iter [330/391] Loss: 0.1229
Epoch [11/51], Iter [340/391] Loss: 0.1348
Epoch [11/51], Iter [350/391] Loss: 0.1524
Epoch [11/51], Iter [360/391] Loss: 0.1559
Epoch [11/51], Iter [370/391] Loss: 0.1537
Epoch [11/51], Iter [380/391] Loss: 0.1223
Epoch [11/51], Iter [390/391] Loss: 0.1431
Epoch [12/51], Iter [10/391] Loss: 0.1974
Epoch [12/51], Iter [20/391] Loss: 0.1609
Epoch [12/51], Iter [30/391] Loss: 0.1369
Epoch [12/51], Iter [40/391] Loss: 0.1351
Epoch [12/51], Iter [50/391] Loss: 0.0933
Epoch [12/51], Iter [60/391] Loss: 0.1569
Epoch [12/51], Iter [70/391] Loss: 0.1510
Epoch [12/51], Iter [80/391] Loss: 0.1376
Epoch [12/51], Iter [90/391] Loss: 0.1201
Epoch [12/51], Iter [100/391] Loss: 0.1562
Epoch [12/51], Iter [110/391] Loss: 0.1317
Epoch [12/51], Iter [120/391] Loss: 0.1378
Epoch [12/51], Iter [130/391] Loss: 0.1569
Epoch [12/51], Iter [140/391] Loss: 0.1411
Epoch [12/51], Iter [150/391] Loss: 0.1444
Epoch [12/51], Iter [160/391] Loss: 0.1214
Epoch [12/51], Iter [170/391] Loss: 0.1166
Epoch [12/51], Iter [180/391] Loss: 0.1161
Epoch [12/51], Iter [190/391] Loss: 0.1214
Epoch [12/51], Iter [200/391] Loss: 0.1191
Epoch [12/51], Iter [210/391] Loss: 0.1526
Epoch [12/51], Iter [220/391] Loss: 0.1657
Epoch [12/51], Iter [230/391] Loss: 0.1468
Epoch [12/51], Iter [240/391] Loss: 0.1288
Epoch [12/51], Iter [250/391] Loss: 0.1421
Epoch [12/51], Iter [260/391] Loss: 0.1499
Epoch [12/51], Iter [270/391] Loss: 0.1542
Epoch [12/51], Iter [280/391] Loss: 0.1324
Epoch [12/51], Iter [290/391] Loss: 0.1350
Epoch [12/51], Iter [300/391] Loss: 0.1552
Epoch [12/51], Iter [310/391] Loss: 0.1594
Epoch [12/51], Iter [320/391] Loss: 0.1306
Epoch [12/51], Iter [330/391] Loss: 0.1221
Epoch [12/51], Iter [340/391] Loss: 0.1313
Epoch [12/51], Iter [350/391] Loss: 0.1340
Epoch [12/51], Iter [360/391] Loss: 0.1222
Epoch [12/51], Iter [370/391] Loss: 0.1387
Epoch [12/51], Iter [380/391] Loss: 0.1398
Epoch [12/51], Iter [390/391] Loss: 0.1262
Epoch [13/51], Iter [10/391] Loss: 0.1269
Epoch [13/51], Iter [20/391] Loss: 0.0937
Epoch [13/51], Iter [30/391] Loss: 0.1008
Epoch [13/51], Iter [40/391] Loss: 0.1276
Epoch [13/51], Iter [50/391] Loss: 0.1175
Epoch [13/51], Iter [60/391] Loss: 0.1097
Epoch [13/51], Iter [70/391] Loss: 0.1257
Epoch [13/51], Iter [80/391] Loss: 0.0977
Epoch [13/51], Iter [90/391] Loss: 0.1346
Epoch [13/51], Iter [100/391] Loss: 0.1260
Epoch [13/51], Iter [110/391] Loss: 0.1370
Epoch [13/51], Iter [120/391] Loss: 0.1456
Epoch [13/51], Iter [130/391] Loss: 0.1376
Epoch [13/51], Iter [140/391] Loss: 0.1182
Epoch [13/51], Iter [150/391] Loss: 0.1241
Epoch [13/51], Iter [160/391] Loss: 0.1576
Epoch [13/51], Iter [170/391] Loss: 0.1366
Epoch [13/51], Iter [180/391] Loss: 0.1182
Epoch [13/51], Iter [190/391] Loss: 0.1091
Epoch [13/51], Iter [200/391] Loss: 0.0940
Epoch [13/51], Iter [210/391] Loss: 0.1159
Epoch [13/51], Iter [220/391] Loss: 0.1426
Epoch [13/51], Iter [230/391] Loss: 0.0771
Epoch [13/51], Iter [240/391] Loss: 0.1584
Epoch [13/51], Iter [250/391] Loss: 0.1221
Epoch [13/51], Iter [260/391] Loss: 0.1326
Epoch [13/51], Iter [270/391] Loss: 0.1407
Epoch [13/51], Iter [280/391] Loss: 0.1370
Epoch [13/51], Iter [290/391] Loss: 0.1607
Epoch [13/51], Iter [300/391] Loss: 0.1368
Epoch [13/51], Iter [310/391] Loss: 0.1414
Epoch [13/51], Iter [320/391] Loss: 0.1371
Epoch [13/51], Iter [330/391] Loss: 0.1471
Epoch [13/51], Iter [340/391] Loss: 0.1396
Epoch [13/51], Iter [350/391] Loss: 0.1535
Epoch [13/51], Iter [360/391] Loss: 0.1121
Epoch [13/51], Iter [370/391] Loss: 0.1343
Epoch [13/51], Iter [380/391] Loss: 0.1340
Epoch [13/51], Iter [390/391] Loss: 0.1128
Epoch [14/51], Iter [10/391] Loss: 0.1247
Epoch [14/51], Iter [20/391] Loss: 0.1334
Epoch [14/51], Iter [30/391] Loss: 0.1434
Epoch [14/51], Iter [40/391] Loss: 0.1275
Epoch [14/51], Iter [50/391] Loss: 0.1196
Epoch [14/51], Iter [60/391] Loss: 0.1320
Epoch [14/51], Iter [70/391] Loss: 0.1345
Epoch [14/51], Iter [80/391] Loss: 0.1359
Epoch [14/51], Iter [90/391] Loss: 0.1094
Epoch [14/51], Iter [100/391] Loss: 0.1222
Epoch [14/51], Iter [110/391] Loss: 0.1086
Epoch [14/51], Iter [120/391] Loss: 0.1056
Epoch [14/51], Iter [130/391] Loss: 0.1358
Epoch [14/51], Iter [140/391] Loss: 0.1250
Epoch [14/51], Iter [150/391] Loss: 0.1166
Epoch [14/51], Iter [160/391] Loss: 0.1173
Epoch [14/51], Iter [170/391] Loss: 0.1354
Epoch [14/51], Iter [180/391] Loss: 0.1163
Epoch [14/51], Iter [190/391] Loss: 0.1137
Epoch [14/51], Iter [200/391] Loss: 0.1159
Epoch [14/51], Iter [210/391] Loss: 0.1087
Epoch [14/51], Iter [220/391] Loss: 0.1573
Epoch [14/51], Iter [230/391] Loss: 0.1336
Epoch [14/51], Iter [240/391] Loss: 0.1384
Epoch [14/51], Iter [250/391] Loss: 0.1326
Epoch [14/51], Iter [260/391] Loss: 0.1521
Epoch [14/51], Iter [270/391] Loss: 0.1040
Epoch [14/51], Iter [280/391] Loss: 0.1023
Epoch [14/51], Iter [290/391] Loss: 0.1087
Epoch [14/51], Iter [300/391] Loss: 0.1246
Epoch [14/51], Iter [310/391] Loss: 0.1247
Epoch [14/51], Iter [320/391] Loss: 0.1016
Epoch [14/51], Iter [330/391] Loss: 0.1081
Epoch [14/51], Iter [340/391] Loss: 0.1012
Epoch [14/51], Iter [350/391] Loss: 0.1158
Epoch [14/51], Iter [360/391] Loss: 0.1153
Epoch [14/51], Iter [370/391] Loss: 0.1409
Epoch [14/51], Iter [380/391] Loss: 0.1313
Epoch [14/51], Iter [390/391] Loss: 0.1260
Epoch [15/51], Iter [10/391] Loss: 0.0941
Epoch [15/51], Iter [20/391] Loss: 0.1130
Epoch [15/51], Iter [30/391] Loss: 0.1066
Epoch [15/51], Iter [40/391] Loss: 0.0930
Epoch [15/51], Iter [50/391] Loss: 0.1209
Epoch [15/51], Iter [60/391] Loss: 0.1226
Epoch [15/51], Iter [70/391] Loss: 0.1100
Epoch [15/51], Iter [80/391] Loss: 0.1243
Epoch [15/51], Iter [90/391] Loss: 0.1115
Epoch [15/51], Iter [100/391] Loss: 0.0919
Epoch [15/51], Iter [110/391] Loss: 0.1064
Epoch [15/51], Iter [120/391] Loss: 0.1230
Epoch [15/51], Iter [130/391] Loss: 0.1047
Epoch [15/51], Iter [140/391] Loss: 0.1151
Epoch [15/51], Iter [150/391] Loss: 0.1090
Epoch [15/51], Iter [160/391] Loss: 0.1382
Epoch [15/51], Iter [170/391] Loss: 0.1240
Epoch [15/51], Iter [180/391] Loss: 0.1520
Epoch [15/51], Iter [190/391] Loss: 0.1518
Epoch [15/51], Iter [200/391] Loss: 0.1070
Epoch [15/51], Iter [210/391] Loss: 0.1118
Epoch [15/51], Iter [220/391] Loss: 0.1611
Epoch [15/51], Iter [230/391] Loss: 0.1099
Epoch [15/51], Iter [240/391] Loss: 0.1336
Epoch [15/51], Iter [250/391] Loss: 0.1026
Epoch [15/51], Iter [260/391] Loss: 0.1349
Epoch [15/51], Iter [270/391] Loss: 0.1167
Epoch [15/51], Iter [280/391] Loss: 0.1060
Epoch [15/51], Iter [290/391] Loss: 0.1307
Epoch [15/51], Iter [300/391] Loss: 0.0955
Epoch [15/51], Iter [310/391] Loss: 0.1155
Epoch [15/51], Iter [320/391] Loss: 0.1262
Epoch [15/51], Iter [330/391] Loss: 0.0927
Epoch [15/51], Iter [340/391] Loss: 0.0903
Epoch [15/51], Iter [350/391] Loss: 0.1073
Epoch [15/51], Iter [360/391] Loss: 0.1070
Epoch [15/51], Iter [370/391] Loss: 0.1296
Epoch [15/51], Iter [380/391] Loss: 0.1222
Epoch [15/51], Iter [390/391] Loss: 0.0967
Epoch [16/51], Iter [10/391] Loss: 0.0822
Epoch [16/51], Iter [20/391] Loss: 0.1188
Epoch [16/51], Iter [30/391] Loss: 0.0803
Epoch [16/51], Iter [40/391] Loss: 0.0846
Epoch [16/51], Iter [50/391] Loss: 0.1100
Epoch [16/51], Iter [60/391] Loss: 0.1321
Epoch [16/51], Iter [70/391] Loss: 0.0879
Epoch [16/51], Iter [80/391] Loss: 0.0722
Epoch [16/51], Iter [90/391] Loss: 0.1100
Epoch [16/51], Iter [100/391] Loss: 0.1106
Epoch [16/51], Iter [110/391] Loss: 0.1047
Epoch [16/51], Iter [120/391] Loss: 0.1353
Epoch [16/51], Iter [130/391] Loss: 0.1160
Epoch [16/51], Iter [140/391] Loss: 0.1127
Epoch [16/51], Iter [150/391] Loss: 0.1237
Epoch [16/51], Iter [160/391] Loss: 0.0940
Epoch [16/51], Iter [170/391] Loss: 0.0789
Epoch [16/51], Iter [180/391] Loss: 0.1244
Epoch [16/51], Iter [190/391] Loss: 0.0988
Epoch [16/51], Iter [200/391] Loss: 0.0849
Epoch [16/51], Iter [210/391] Loss: 0.1238
Epoch [16/51], Iter [220/391] Loss: 0.1070
Epoch [16/51], Iter [230/391] Loss: 0.1278
Epoch [16/51], Iter [240/391] Loss: 0.1063
Epoch [16/51], Iter [250/391] Loss: 0.1036
Epoch [16/51], Iter [260/391] Loss: 0.0895
Epoch [16/51], Iter [270/391] Loss: 0.1148
Epoch [16/51], Iter [280/391] Loss: 0.1143
Epoch [16/51], Iter [290/391] Loss: 0.0990
Epoch [16/51], Iter [300/391] Loss: 0.1028
Epoch [16/51], Iter [310/391] Loss: 0.1399
Epoch [16/51], Iter [320/391] Loss: 0.0984
Epoch [16/51], Iter [330/391] Loss: 0.1103
Epoch [16/51], Iter [340/391] Loss: 0.1250
Epoch [16/51], Iter [350/391] Loss: 0.1126
Epoch [16/51], Iter [360/391] Loss: 0.0996
Epoch [16/51], Iter [370/391] Loss: 0.1100
Epoch [16/51], Iter [380/391] Loss: 0.0998
Epoch [16/51], Iter [390/391] Loss: 0.1176
Epoch [17/51], Iter [10/391] Loss: 0.0971
Epoch [17/51], Iter [20/391] Loss: 0.1088
Epoch [17/51], Iter [30/391] Loss: 0.1085
Epoch [17/51], Iter [40/391] Loss: 0.1090
Epoch [17/51], Iter [50/391] Loss: 0.0985
Epoch [17/51], Iter [60/391] Loss: 0.1028
Epoch [17/51], Iter [70/391] Loss: 0.0855
Epoch [17/51], Iter [80/391] Loss: 0.0979
Epoch [17/51], Iter [90/391] Loss: 0.0815
Epoch [17/51], Iter [100/391] Loss: 0.0897
Epoch [17/51], Iter [110/391] Loss: 0.1138
Epoch [17/51], Iter [120/391] Loss: 0.0835
Epoch [17/51], Iter [130/391] Loss: 0.1165
Epoch [17/51], Iter [140/391] Loss: 0.0891
Epoch [17/51], Iter [150/391] Loss: 0.1465
Epoch [17/51], Iter [160/391] Loss: 0.1171
Epoch [17/51], Iter [170/391] Loss: 0.1021
Epoch [17/51], Iter [180/391] Loss: 0.0932
Epoch [17/51], Iter [190/391] Loss: 0.1333
Epoch [17/51], Iter [200/391] Loss: 0.0985
Epoch [17/51], Iter [210/391] Loss: 0.1074
Epoch [17/51], Iter [220/391] Loss: 0.0894
Epoch [17/51], Iter [230/391] Loss: 0.1021
Epoch [17/51], Iter [240/391] Loss: 0.1102
Epoch [17/51], Iter [250/391] Loss: 0.0895
Epoch [17/51], Iter [260/391] Loss: 0.0948
Epoch [17/51], Iter [270/391] Loss: 0.0944
Epoch [17/51], Iter [280/391] Loss: 0.0931
Epoch [17/51], Iter [290/391] Loss: 0.1069
Epoch [17/51], Iter [300/391] Loss: 0.0973
Epoch [17/51], Iter [310/391] Loss: 0.0683
Epoch [17/51], Iter [320/391] Loss: 0.1172
Epoch [17/51], Iter [330/391] Loss: 0.1142
Epoch [17/51], Iter [340/391] Loss: 0.1232
Epoch [17/51], Iter [350/391] Loss: 0.1144
Epoch [17/51], Iter [360/391] Loss: 0.1031
Epoch [17/51], Iter [370/391] Loss: 0.1097
Epoch [17/51], Iter [380/391] Loss: 0.1033
Epoch [17/51], Iter [390/391] Loss: 0.1139
Epoch [18/51], Iter [10/391] Loss: 0.0894
Epoch [18/51], Iter [20/391] Loss: 0.0873
Epoch [18/51], Iter [30/391] Loss: 0.0947
Epoch [18/51], Iter [40/391] Loss: 0.0916
Epoch [18/51], Iter [50/391] Loss: 0.0996
Epoch [18/51], Iter [60/391] Loss: 0.0726
Epoch [18/51], Iter [70/391] Loss: 0.1169
Epoch [18/51], Iter [80/391] Loss: 0.1003
Epoch [18/51], Iter [90/391] Loss: 0.1134
Epoch [18/51], Iter [100/391] Loss: 0.1294
Epoch [18/51], Iter [110/391] Loss: 0.0969
Epoch [18/51], Iter [120/391] Loss: 0.0989
Epoch [18/51], Iter [130/391] Loss: 0.0926
Epoch [18/51], Iter [140/391] Loss: 0.0993
Epoch [18/51], Iter [150/391] Loss: 0.0908
Epoch [18/51], Iter [160/391] Loss: 0.1028
Epoch [18/51], Iter [170/391] Loss: 0.0795
Epoch [18/51], Iter [180/391] Loss: 0.1117
Epoch [18/51], Iter [190/391] Loss: 0.1154
Epoch [18/51], Iter [200/391] Loss: 0.1070
Epoch [18/51], Iter [210/391] Loss: 0.1350
Epoch [18/51], Iter [220/391] Loss: 0.1080
Epoch [18/51], Iter [230/391] Loss: 0.1045
Epoch [18/51], Iter [240/391] Loss: 0.0894
Epoch [18/51], Iter [250/391] Loss: 0.0960
Epoch [18/51], Iter [260/391] Loss: 0.0830
Epoch [18/51], Iter [270/391] Loss: 0.1082
Epoch [18/51], Iter [280/391] Loss: 0.1036
Epoch [18/51], Iter [290/391] Loss: 0.1331
Epoch [18/51], Iter [300/391] Loss: 0.0815
Epoch [18/51], Iter [310/391] Loss: 0.0995
Epoch [18/51], Iter [320/391] Loss: 0.0985
Epoch [18/51], Iter [330/391] Loss: 0.1005
Epoch [18/51], Iter [340/391] Loss: 0.0812
Epoch [18/51], Iter [350/391] Loss: 0.0797
Epoch [18/51], Iter [360/391] Loss: 0.0873
Epoch [18/51], Iter [370/391] Loss: 0.1005
Epoch [18/51], Iter [380/391] Loss: 0.0904
Epoch [18/51], Iter [390/391] Loss: 0.0649
Epoch [19/51], Iter [10/391] Loss: 0.0853
Epoch [19/51], Iter [20/391] Loss: 0.0902
Epoch [19/51], Iter [30/391] Loss: 0.0864
Epoch [19/51], Iter [40/391] Loss: 0.0892
Epoch [19/51], Iter [50/391] Loss: 0.1081
Epoch [19/51], Iter [60/391] Loss: 0.0965
Epoch [19/51], Iter [70/391] Loss: 0.0920
Epoch [19/51], Iter [80/391] Loss: 0.0953
Epoch [19/51], Iter [90/391] Loss: 0.0869
Epoch [19/51], Iter [100/391] Loss: 0.1140
Epoch [19/51], Iter [110/391] Loss: 0.0811
Epoch [19/51], Iter [120/391] Loss: 0.0864
Epoch [19/51], Iter [130/391] Loss: 0.0966
Epoch [19/51], Iter [140/391] Loss: 0.0966
Epoch [19/51], Iter [150/391] Loss: 0.0819
Epoch [19/51], Iter [160/391] Loss: 0.0936
Epoch [19/51], Iter [170/391] Loss: 0.0860
Epoch [19/51], Iter [180/391] Loss: 0.0884
Epoch [19/51], Iter [190/391] Loss: 0.1164
Epoch [19/51], Iter [200/391] Loss: 0.0710
Epoch [19/51], Iter [210/391] Loss: 0.0694
Epoch [19/51], Iter [220/391] Loss: 0.0763
Epoch [19/51], Iter [230/391] Loss: 0.1140
Epoch [19/51], Iter [240/391] Loss: 0.1021
Epoch [19/51], Iter [250/391] Loss: 0.0720
Epoch [19/51], Iter [260/391] Loss: 0.0801
Epoch [19/51], Iter [270/391] Loss: 0.0770
Epoch [19/51], Iter [280/391] Loss: 0.0944
Epoch [19/51], Iter [290/391] Loss: 0.1043
Epoch [19/51], Iter [300/391] Loss: 0.0830
Epoch [19/51], Iter [310/391] Loss: 0.1013
Epoch [19/51], Iter [320/391] Loss: 0.0960
Epoch [19/51], Iter [330/391] Loss: 0.0858
Epoch [19/51], Iter [340/391] Loss: 0.1358
Epoch [19/51], Iter [350/391] Loss: 0.0899
Epoch [19/51], Iter [360/391] Loss: 0.0892
Epoch [19/51], Iter [370/391] Loss: 0.0975
Epoch [19/51], Iter [380/391] Loss: 0.1134
Epoch [19/51], Iter [390/391] Loss: 0.1053
Epoch [20/51], Iter [10/391] Loss: 0.0834
Epoch [20/51], Iter [20/391] Loss: 0.0936
Epoch [20/51], Iter [30/391] Loss: 0.0828
Epoch [20/51], Iter [40/391] Loss: 0.1061
Epoch [20/51], Iter [50/391] Loss: 0.1294
Epoch [20/51], Iter [60/391] Loss: 0.0866
Epoch [20/51], Iter [70/391] Loss: 0.1044
Epoch [20/51], Iter [80/391] Loss: 0.0820
Epoch [20/51], Iter [90/391] Loss: 0.0931
Epoch [20/51], Iter [100/391] Loss: 0.0764
Epoch [20/51], Iter [110/391] Loss: 0.0929
Epoch [20/51], Iter [120/391] Loss: 0.0908
Epoch [20/51], Iter [130/391] Loss: 0.0721
Epoch [20/51], Iter [140/391] Loss: 0.0692
Epoch [20/51], Iter [150/391] Loss: 0.0722
Epoch [20/51], Iter [160/391] Loss: 0.1053
Epoch [20/51], Iter [170/391] Loss: 0.0930
Epoch [20/51], Iter [180/391] Loss: 0.0926
Epoch [20/51], Iter [190/391] Loss: 0.0793
Epoch [20/51], Iter [200/391] Loss: 0.1198
Epoch [20/51], Iter [210/391] Loss: 0.0853
Epoch [20/51], Iter [220/391] Loss: 0.0736
Epoch [20/51], Iter [230/391] Loss: 0.0992
Epoch [20/51], Iter [240/391] Loss: 0.1042
Epoch [20/51], Iter [250/391] Loss: 0.0892
Epoch [20/51], Iter [260/391] Loss: 0.0917
Epoch [20/51], Iter [270/391] Loss: 0.0636
Epoch [20/51], Iter [280/391] Loss: 0.1085
Epoch [20/51], Iter [290/391] Loss: 0.0925
Epoch [20/51], Iter [300/391] Loss: 0.0847
Epoch [20/51], Iter [310/391] Loss: 0.0870
Epoch [20/51], Iter [320/391] Loss: 0.0933
Epoch [20/51], Iter [330/391] Loss: 0.0716
Epoch [20/51], Iter [340/391] Loss: 0.0769
Epoch [20/51], Iter [350/391] Loss: 0.0837
Epoch [20/51], Iter [360/391] Loss: 0.0743
Epoch [20/51], Iter [370/391] Loss: 0.0920
Epoch [20/51], Iter [380/391] Loss: 0.0719
Epoch [20/51], Iter [390/391] Loss: 0.1442
[Saving Checkpoint]
Epoch [21/51], Iter [10/391] Loss: 0.1396
Epoch [21/51], Iter [20/391] Loss: 0.1007
Epoch [21/51], Iter [30/391] Loss: 0.1043
Epoch [21/51], Iter [40/391] Loss: 0.0823
Epoch [21/51], Iter [50/391] Loss: 0.0876
Epoch [21/51], Iter [60/391] Loss: 0.1128
Epoch [21/51], Iter [70/391] Loss: 0.0657
Epoch [21/51], Iter [80/391] Loss: 0.0797
Epoch [21/51], Iter [90/391] Loss: 0.0639
Epoch [21/51], Iter [100/391] Loss: 0.1161
Epoch [21/51], Iter [110/391] Loss: 0.0838
Epoch [21/51], Iter [120/391] Loss: 0.0754
Epoch [21/51], Iter [130/391] Loss: 0.0742
Epoch [21/51], Iter [140/391] Loss: 0.0854
Epoch [21/51], Iter [150/391] Loss: 0.0955
Epoch [21/51], Iter [160/391] Loss: 0.0727
Epoch [21/51], Iter [170/391] Loss: 0.1086
Epoch [21/51], Iter [180/391] Loss: 0.1018
Epoch [21/51], Iter [190/391] Loss: 0.0947
Epoch [21/51], Iter [200/391] Loss: 0.0938
Epoch [21/51], Iter [210/391] Loss: 0.0865
Epoch [21/51], Iter [220/391] Loss: 0.0780
Epoch [21/51], Iter [230/391] Loss: 0.0774
Epoch [21/51], Iter [240/391] Loss: 0.0778
Epoch [21/51], Iter [250/391] Loss: 0.0786
Epoch [21/51], Iter [260/391] Loss: 0.0869
Epoch [21/51], Iter [270/391] Loss: 0.0701
Epoch [21/51], Iter [280/391] Loss: 0.0776
Epoch [21/51], Iter [290/391] Loss: 0.0875
Epoch [21/51], Iter [300/391] Loss: 0.0754
Epoch [21/51], Iter [310/391] Loss: 0.1025
Epoch [21/51], Iter [320/391] Loss: 0.0912
Epoch [21/51], Iter [330/391] Loss: 0.0763
Epoch [21/51], Iter [340/391] Loss: 0.1232
Epoch [21/51], Iter [350/391] Loss: 0.0988
Epoch [21/51], Iter [360/391] Loss: 0.1029
Epoch [21/51], Iter [370/391] Loss: 0.0912
Epoch [21/51], Iter [380/391] Loss: 0.0758
Epoch [21/51], Iter [390/391] Loss: 0.0902
Epoch [22/51], Iter [10/391] Loss: 0.0668
Epoch [22/51], Iter [20/391] Loss: 0.0751
Epoch [22/51], Iter [30/391] Loss: 0.0735
Epoch [22/51], Iter [40/391] Loss: 0.0520
Epoch [22/51], Iter [50/391] Loss: 0.0785
Epoch [22/51], Iter [60/391] Loss: 0.0936
Epoch [22/51], Iter [70/391] Loss: 0.0811
Epoch [22/51], Iter [80/391] Loss: 0.0872
Epoch [22/51], Iter [90/391] Loss: 0.0684
Epoch [22/51], Iter [100/391] Loss: 0.0714
Epoch [22/51], Iter [110/391] Loss: 0.0880
Epoch [22/51], Iter [120/391] Loss: 0.0813
Epoch [22/51], Iter [130/391] Loss: 0.0736
Epoch [22/51], Iter [140/391] Loss: 0.0873
Epoch [22/51], Iter [150/391] Loss: 0.0834
Epoch [22/51], Iter [160/391] Loss: 0.0852
Epoch [22/51], Iter [170/391] Loss: 0.1060
Epoch [22/51], Iter [180/391] Loss: 0.0746
Epoch [22/51], Iter [190/391] Loss: 0.0701
Epoch [22/51], Iter [200/391] Loss: 0.0818
Epoch [22/51], Iter [210/391] Loss: 0.0695
Epoch [22/51], Iter [220/391] Loss: 0.0887
Epoch [22/51], Iter [230/391] Loss: 0.0794
Epoch [22/51], Iter [240/391] Loss: 0.0692
Epoch [22/51], Iter [250/391] Loss: 0.1052
Epoch [22/51], Iter [260/391] Loss: 0.0908
Epoch [22/51], Iter [270/391] Loss: 0.0994
Epoch [22/51], Iter [280/391] Loss: 0.1001
Epoch [22/51], Iter [290/391] Loss: 0.0965
Epoch [22/51], Iter [300/391] Loss: 0.0814
Epoch [22/51], Iter [310/391] Loss: 0.0838
Epoch [22/51], Iter [320/391] Loss: 0.0968
Epoch [22/51], Iter [330/391] Loss: 0.0626
Epoch [22/51], Iter [340/391] Loss: 0.0782
Epoch [22/51], Iter [350/391] Loss: 0.0723
Epoch [22/51], Iter [360/391] Loss: 0.0780
Epoch [22/51], Iter [370/391] Loss: 0.0897
Epoch [22/51], Iter [380/391] Loss: 0.0949
Epoch [22/51], Iter [390/391] Loss: 0.1266
Epoch [23/51], Iter [10/391] Loss: 0.0945
Epoch [23/51], Iter [20/391] Loss: 0.0847
Epoch [23/51], Iter [30/391] Loss: 0.0737
Epoch [23/51], Iter [40/391] Loss: 0.0772
Epoch [23/51], Iter [50/391] Loss: 0.0979
Epoch [23/51], Iter [60/391] Loss: 0.0828
Epoch [23/51], Iter [70/391] Loss: 0.0940
Epoch [23/51], Iter [80/391] Loss: 0.0963
Epoch [23/51], Iter [90/391] Loss: 0.0920
Epoch [23/51], Iter [100/391] Loss: 0.0858
Epoch [23/51], Iter [110/391] Loss: 0.0693
Epoch [23/51], Iter [120/391] Loss: 0.0851
Epoch [23/51], Iter [130/391] Loss: 0.0899
Epoch [23/51], Iter [140/391] Loss: 0.0724
Epoch [23/51], Iter [150/391] Loss: 0.0779
Epoch [23/51], Iter [160/391] Loss: 0.0865
Epoch [23/51], Iter [170/391] Loss: 0.0779
Epoch [23/51], Iter [180/391] Loss: 0.0831
Epoch [23/51], Iter [190/391] Loss: 0.0714
Epoch [23/51], Iter [200/391] Loss: 0.0747
Epoch [23/51], Iter [210/391] Loss: 0.0819
Epoch [23/51], Iter [220/391] Loss: 0.0972
Epoch [23/51], Iter [230/391] Loss: 0.0826
Epoch [23/51], Iter [240/391] Loss: 0.0789
Epoch [23/51], Iter [250/391] Loss: 0.0909
Epoch [23/51], Iter [260/391] Loss: 0.0776
Epoch [23/51], Iter [270/391] Loss: 0.0949
Epoch [23/51], Iter [280/391] Loss: 0.0779
Epoch [23/51], Iter [290/391] Loss: 0.0740
Epoch [23/51], Iter [300/391] Loss: 0.0786
Epoch [23/51], Iter [310/391] Loss: 0.0792
Epoch [23/51], Iter [320/391] Loss: 0.0889
Epoch [23/51], Iter [330/391] Loss: 0.0760
Epoch [23/51], Iter [340/391] Loss: 0.0625
Epoch [23/51], Iter [350/391] Loss: 0.0654
Epoch [23/51], Iter [360/391] Loss: 0.0828
Epoch [23/51], Iter [370/391] Loss: 0.0741
Epoch [23/51], Iter [380/391] Loss: 0.0812
Epoch [23/51], Iter [390/391] Loss: 0.0950
Epoch [24/51], Iter [10/391] Loss: 0.0933
Epoch [24/51], Iter [20/391] Loss: 0.0945
Epoch [24/51], Iter [30/391] Loss: 0.0812
Epoch [24/51], Iter [40/391] Loss: 0.0790
Epoch [24/51], Iter [50/391] Loss: 0.0636
Epoch [24/51], Iter [60/391] Loss: 0.0903
Epoch [24/51], Iter [70/391] Loss: 0.0766
Epoch [24/51], Iter [80/391] Loss: 0.1057
Epoch [24/51], Iter [90/391] Loss: 0.0630
Epoch [24/51], Iter [100/391] Loss: 0.0576
Epoch [24/51], Iter [110/391] Loss: 0.0963
Epoch [24/51], Iter [120/391] Loss: 0.0932
Epoch [24/51], Iter [130/391] Loss: 0.0673
Epoch [24/51], Iter [140/391] Loss: 0.0791
Epoch [24/51], Iter [150/391] Loss: 0.0722
Epoch [24/51], Iter [160/391] Loss: 0.0715
Epoch [24/51], Iter [170/391] Loss: 0.0824
Epoch [24/51], Iter [180/391] Loss: 0.0842
Epoch [24/51], Iter [190/391] Loss: 0.0561
Epoch [24/51], Iter [200/391] Loss: 0.0836
Epoch [24/51], Iter [210/391] Loss: 0.0676
Epoch [24/51], Iter [220/391] Loss: 0.0748
Epoch [24/51], Iter [230/391] Loss: 0.0934
Epoch [24/51], Iter [240/391] Loss: 0.0882
Epoch [24/51], Iter [250/391] Loss: 0.0701
Epoch [24/51], Iter [260/391] Loss: 0.0937
Epoch [24/51], Iter [270/391] Loss: 0.0813
Epoch [24/51], Iter [280/391] Loss: 0.0876
Epoch [24/51], Iter [290/391] Loss: 0.0932
Epoch [24/51], Iter [300/391] Loss: 0.0835
Epoch [24/51], Iter [310/391] Loss: 0.0616
Epoch [24/51], Iter [320/391] Loss: 0.0964
Epoch [24/51], Iter [330/391] Loss: 0.0723
Epoch [24/51], Iter [340/391] Loss: 0.0924
Epoch [24/51], Iter [350/391] Loss: 0.0714
Epoch [24/51], Iter [360/391] Loss: 0.0660
Epoch [24/51], Iter [370/391] Loss: 0.0857
Epoch [24/51], Iter [380/391] Loss: 0.0770
Epoch [24/51], Iter [390/391] Loss: 0.0975
Epoch [25/51], Iter [10/391] Loss: 0.0770
Epoch [25/51], Iter [20/391] Loss: 0.0691
Epoch [25/51], Iter [30/391] Loss: 0.0896
Epoch [25/51], Iter [40/391] Loss: 0.0874
Epoch [25/51], Iter [50/391] Loss: 0.0673
Epoch [25/51], Iter [60/391] Loss: 0.0684
Epoch [25/51], Iter [70/391] Loss: 0.0804
Epoch [25/51], Iter [80/391] Loss: 0.0598
Epoch [25/51], Iter [90/391] Loss: 0.0712
Epoch [25/51], Iter [100/391] Loss: 0.1000
Epoch [25/51], Iter [110/391] Loss: 0.0639
Epoch [25/51], Iter [120/391] Loss: 0.0698
Epoch [25/51], Iter [130/391] Loss: 0.0632
Epoch [25/51], Iter [140/391] Loss: 0.0732
Epoch [25/51], Iter [150/391] Loss: 0.0794
Epoch [25/51], Iter [160/391] Loss: 0.0843
Epoch [25/51], Iter [170/391] Loss: 0.0860
Epoch [25/51], Iter [180/391] Loss: 0.0585
Epoch [25/51], Iter [190/391] Loss: 0.0851
Epoch [25/51], Iter [200/391] Loss: 0.0845
Epoch [25/51], Iter [210/391] Loss: 0.0679
Epoch [25/51], Iter [220/391] Loss: 0.0705
Epoch [25/51], Iter [230/391] Loss: 0.0719
Epoch [25/51], Iter [240/391] Loss: 0.0721
Epoch [25/51], Iter [250/391] Loss: 0.0815
Epoch [25/51], Iter [260/391] Loss: 0.0904
Epoch [25/51], Iter [270/391] Loss: 0.0924
Epoch [25/51], Iter [280/391] Loss: 0.0845
Epoch [25/51], Iter [290/391] Loss: 0.0834
Epoch [25/51], Iter [300/391] Loss: 0.0807
Epoch [25/51], Iter [310/391] Loss: 0.1092
Epoch [25/51], Iter [320/391] Loss: 0.0779
Epoch [25/51], Iter [330/391] Loss: 0.0777
Epoch [25/51], Iter [340/391] Loss: 0.0845
Epoch [25/51], Iter [350/391] Loss: 0.0658
Epoch [25/51], Iter [360/391] Loss: 0.0623
Epoch [25/51], Iter [370/391] Loss: 0.1008
Epoch [25/51], Iter [380/391] Loss: 0.0576
Epoch [25/51], Iter [390/391] Loss: 0.0718
Epoch [26/51], Iter [10/391] Loss: 0.0834
Epoch [26/51], Iter [20/391] Loss: 0.0525
Epoch [26/51], Iter [30/391] Loss: 0.0696
Epoch [26/51], Iter [40/391] Loss: 0.0759
Epoch [26/51], Iter [50/391] Loss: 0.0510
Epoch [26/51], Iter [60/391] Loss: 0.0662
Epoch [26/51], Iter [70/391] Loss: 0.0398
Epoch [26/51], Iter [80/391] Loss: 0.0722
Epoch [26/51], Iter [90/391] Loss: 0.0797
Epoch [26/51], Iter [100/391] Loss: 0.0754
Epoch [26/51], Iter [110/391] Loss: 0.0664
Epoch [26/51], Iter [120/391] Loss: 0.0764
Epoch [26/51], Iter [130/391] Loss: 0.0766
Epoch [26/51], Iter [140/391] Loss: 0.0576
Epoch [26/51], Iter [150/391] Loss: 0.0808
Epoch [26/51], Iter [160/391] Loss: 0.0814
Epoch [26/51], Iter [170/391] Loss: 0.0836
Epoch [26/51], Iter [180/391] Loss: 0.0617
Epoch [26/51], Iter [190/391] Loss: 0.0806
Epoch [26/51], Iter [200/391] Loss: 0.0649
Epoch [26/51], Iter [210/391] Loss: 0.0597
Epoch [26/51], Iter [220/391] Loss: 0.0572
Epoch [26/51], Iter [230/391] Loss: 0.0692
Epoch [26/51], Iter [240/391] Loss: 0.0760
Epoch [26/51], Iter [250/391] Loss: 0.0630
Epoch [26/51], Iter [260/391] Loss: 0.0600
Epoch [26/51], Iter [270/391] Loss: 0.0743
Epoch [26/51], Iter [280/391] Loss: 0.0743
Epoch [26/51], Iter [290/391] Loss: 0.0593
Epoch [26/51], Iter [300/391] Loss: 0.0817
Epoch [26/51], Iter [310/391] Loss: 0.0727
Epoch [26/51], Iter [320/391] Loss: 0.0660
Epoch [26/51], Iter [330/391] Loss: 0.0726
Epoch [26/51], Iter [340/391] Loss: 0.0829
Epoch [26/51], Iter [350/391] Loss: 0.0790
Epoch [26/51], Iter [360/391] Loss: 0.0611
Epoch [26/51], Iter [370/391] Loss: 0.0719
Epoch [26/51], Iter [380/391] Loss: 0.0572
Epoch [26/51], Iter [390/391] Loss: 0.0851
Epoch [27/51], Iter [10/391] Loss: 0.0737
Epoch [27/51], Iter [20/391] Loss: 0.0828
Epoch [27/51], Iter [30/391] Loss: 0.0648
Epoch [27/51], Iter [40/391] Loss: 0.0547
Epoch [27/51], Iter [50/391] Loss: 0.0781
Epoch [27/51], Iter [60/391] Loss: 0.0607
Epoch [27/51], Iter [70/391] Loss: 0.0539
Epoch [27/51], Iter [80/391] Loss: 0.0594
Epoch [27/51], Iter [90/391] Loss: 0.0791
Epoch [27/51], Iter [100/391] Loss: 0.0589
Epoch [27/51], Iter [110/391] Loss: 0.0738
Epoch [27/51], Iter [120/391] Loss: 0.0933
Epoch [27/51], Iter [130/391] Loss: 0.0769
Epoch [27/51], Iter [140/391] Loss: 0.0959
Epoch [27/51], Iter [150/391] Loss: 0.0921
Epoch [27/51], Iter [160/391] Loss: 0.0596
Epoch [27/51], Iter [170/391] Loss: 0.0740
Epoch [27/51], Iter [180/391] Loss: 0.0942
Epoch [27/51], Iter [190/391] Loss: 0.0729
Epoch [27/51], Iter [200/391] Loss: 0.0852
Epoch [27/51], Iter [210/391] Loss: 0.0644
Epoch [27/51], Iter [220/391] Loss: 0.0485
Epoch [27/51], Iter [230/391] Loss: 0.0760
[... training log truncated: the loss drifts down from roughly 0.05-0.09 in epoch 27 to roughly 0.03-0.06 by epoch 51, and "[Saving Checkpoint]" is printed after epochs 30, 40 and 50 ...]
Epoch [51/51], Iter [390/391] Loss: 0.0273
# | a=1 | T=2 | epochs = 51 |
resnet_child_a1_t2_e51 = copy.deepcopy(resnet_child)  # let's save for future reference
test_harness( testloader, resnet_child_a1_t2_e51 )
Accuracy of the model on the test images: 88 %
(tensor(8827, device='cuda:0'), 10000)
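The tuple (tensor(8827, device='cuda:0'), 10000) alongside the reported accuracy means that 8,827 of the 10,000 test images were classified correctly, i.e. 88.27%. Both this run and the next are controlled by the two hyperparameters named in the cell headers: alpha, the weight given to the soft (teacher) targets, and T, the softmax temperature. The knowledge_distillation_loss passed to the training harness is defined earlier in the notebook; the sketch below is only an illustration of the standard Hinton-style formulation such a loss is assumed to implement, written against a recent PyTorch API with hypothetical names.

# Minimal sketch, not the notebook's actual definition: a Hinton-style distillation loss,
# assuming it receives the student logits, the teacher logits and the hard labels.
import torch.nn.functional as F

def kd_loss_sketch(student_logits, teacher_logits, labels, alpha=1.0, T=2.0):
    # Soft-target term: KL divergence between the temperature-softened teacher and
    # student distributions, scaled by T^2 so gradient magnitudes stay comparable
    # across temperatures (as recommended in Hinton et al. 2015).
    soft_term = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction='batchmean',
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy against the ground-truth labels.
    hard_term = F.cross_entropy(student_logits, labels)
    # alpha weights the soft-target term: alpha=1 trains the child purely on the
    # parent's softened outputs (the run above); alpha=0.5 mixes in the hard labels
    # equally (the run below).
    return alpha * soft_term + (1.0 - alpha) * hard_term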
# | a=0.5 | T=2 | epochs = 51 |
resnet_child, optimizer_child = get_new_child_model()
epoch = 0
kd_loss_a0dot5_t2 = partial( knowledge_distillation_loss, alpha=0.5, T=2 )
training_harness( trainloader, optimizer_child, kd_loss_a0dot5_t2, resnet_parent, resnet_child, model_name='DeepResNet_a0dot5_t2_e51' )
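Note that functools.partial pre-binds alpha=0.5 and T=2, so the training harness can invoke the loss with only the per-batch tensors. A small illustration, assuming (hypothetically) that the loss takes the student logits, teacher logits and labels as its positional arguments:

from functools import partial

def knowledge_distillation_loss_stub(student_logits, teacher_logits, labels, alpha, T):
    raise NotImplementedError  # stand-in for the real loss defined earlier in the notebook

kd_loss_demo = partial(knowledge_distillation_loss_stub, alpha=0.5, T=2)
# Inside the harness: loss = kd_loss_demo(student_logits, teacher_logits, labels)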
[Repeated PyTorch UserWarnings omitted: nn.init.xavier_uniform is deprecated in favor of nn.init.xavier_uniform_, and indexing a 0-dim tensor should be replaced with tensor.item()]
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
Epoch [1/51], Iter [10/391] Loss: 1.3926
[... training log truncated: the loss falls from roughly 1.39 at the start of epoch 1 to roughly 0.2-0.3 by epoch 13, and "[Saving Checkpoint]" is printed after epoch 10 ...]
Epoch [14/51], Iter [10/391] Loss: 0.2444
Epoch [14/51], Iter [20/391] Loss: 0.2752
Epoch [14/51], Iter [30/391] Loss: 0.1754
Epoch [14/51], Iter [40/391] Loss: 0.2230
Epoch [14/51], Iter [50/391] Loss: 0.3148
Epoch [14/51], Iter [60/391] Loss: 0.2136
Epoch [14/51], Iter [70/391] Loss: 0.2362
Epoch [14/51], Iter [80/391] Loss: 0.3125
Epoch [14/51], Iter [90/391] Loss: 0.1733
Epoch [14/51], Iter [100/391] Loss: 0.2483
Epoch [14/51], Iter [110/391] Loss: 0.2671
Epoch [14/51], Iter [120/391] Loss: 0.1360
Epoch [14/51], Iter [130/391] Loss: 0.2797
Epoch [14/51], Iter [140/391] Loss: 0.2469
Epoch [14/51], Iter [150/391] Loss: 0.1958
Epoch [14/51], Iter [160/391] Loss: 0.3307
Epoch [14/51], Iter [170/391] Loss: 0.3253
Epoch [14/51], Iter [180/391] Loss: 0.2827
Epoch [14/51], Iter [190/391] Loss: 0.2440
Epoch [14/51], Iter [200/391] Loss: 0.2500
Epoch [14/51], Iter [210/391] Loss: 0.2124
Epoch [14/51], Iter [220/391] Loss: 0.2362
Epoch [14/51], Iter [230/391] Loss: 0.2387
Epoch [14/51], Iter [240/391] Loss: 0.3139
Epoch [14/51], Iter [250/391] Loss: 0.2042
Epoch [14/51], Iter [260/391] Loss: 0.2627
Epoch [14/51], Iter [270/391] Loss: 0.2947
Epoch [14/51], Iter [280/391] Loss: 0.2432
Epoch [14/51], Iter [290/391] Loss: 0.1856
Epoch [14/51], Iter [300/391] Loss: 0.2978
Epoch [14/51], Iter [310/391] Loss: 0.2248
Epoch [14/51], Iter [320/391] Loss: 0.2768
Epoch [14/51], Iter [330/391] Loss: 0.2253
Epoch [14/51], Iter [340/391] Loss: 0.2675
Epoch [14/51], Iter [350/391] Loss: 0.2277
Epoch [14/51], Iter [360/391] Loss: 0.3018
Epoch [14/51], Iter [370/391] Loss: 0.2662
Epoch [14/51], Iter [380/391] Loss: 0.2120
Epoch [14/51], Iter [390/391] Loss: 0.1942
Epoch [15/51], Iter [10/391] Loss: 0.2276
Epoch [15/51], Iter [20/391] Loss: 0.2307
Epoch [15/51], Iter [30/391] Loss: 0.2856
Epoch [15/51], Iter [40/391] Loss: 0.1707
Epoch [15/51], Iter [50/391] Loss: 0.1619
Epoch [15/51], Iter [60/391] Loss: 0.2227
Epoch [15/51], Iter [70/391] Loss: 0.1946
Epoch [15/51], Iter [80/391] Loss: 0.2541
Epoch [15/51], Iter [90/391] Loss: 0.2753
Epoch [15/51], Iter [100/391] Loss: 0.2129
Epoch [15/51], Iter [110/391] Loss: 0.1934
Epoch [15/51], Iter [120/391] Loss: 0.1776
Epoch [15/51], Iter [130/391] Loss: 0.2419
Epoch [15/51], Iter [140/391] Loss: 0.2912
Epoch [15/51], Iter [150/391] Loss: 0.2377
Epoch [15/51], Iter [160/391] Loss: 0.2269
Epoch [15/51], Iter [170/391] Loss: 0.2288
Epoch [15/51], Iter [180/391] Loss: 0.2539
Epoch [15/51], Iter [190/391] Loss: 0.2423
Epoch [15/51], Iter [200/391] Loss: 0.2128
Epoch [15/51], Iter [210/391] Loss: 0.2508
Epoch [15/51], Iter [220/391] Loss: 0.2721
Epoch [15/51], Iter [230/391] Loss: 0.2753
Epoch [15/51], Iter [240/391] Loss: 0.2305
Epoch [15/51], Iter [250/391] Loss: 0.2661
Epoch [15/51], Iter [260/391] Loss: 0.1635
Epoch [15/51], Iter [270/391] Loss: 0.2123
Epoch [15/51], Iter [280/391] Loss: 0.1988
Epoch [15/51], Iter [290/391] Loss: 0.3072
Epoch [15/51], Iter [300/391] Loss: 0.2781
Epoch [15/51], Iter [310/391] Loss: 0.2316
Epoch [15/51], Iter [320/391] Loss: 0.2519
Epoch [15/51], Iter [330/391] Loss: 0.1903
Epoch [15/51], Iter [340/391] Loss: 0.2235
Epoch [15/51], Iter [350/391] Loss: 0.2380
Epoch [15/51], Iter [360/391] Loss: 0.2323
Epoch [15/51], Iter [370/391] Loss: 0.2247
Epoch [15/51], Iter [380/391] Loss: 0.2654
Epoch [15/51], Iter [390/391] Loss: 0.2141
Epoch [16/51], Iter [10/391] Loss: 0.1897
Epoch [16/51], Iter [20/391] Loss: 0.2143
Epoch [16/51], Iter [30/391] Loss: 0.2061
Epoch [16/51], Iter [40/391] Loss: 0.2586
Epoch [16/51], Iter [50/391] Loss: 0.2565
Epoch [16/51], Iter [60/391] Loss: 0.2534
Epoch [16/51], Iter [70/391] Loss: 0.2296
Epoch [16/51], Iter [80/391] Loss: 0.2438
Epoch [16/51], Iter [90/391] Loss: 0.2090
Epoch [16/51], Iter [100/391] Loss: 0.2657
Epoch [16/51], Iter [110/391] Loss: 0.2244
Epoch [16/51], Iter [120/391] Loss: 0.1692
Epoch [16/51], Iter [130/391] Loss: 0.2493
Epoch [16/51], Iter [140/391] Loss: 0.2740
Epoch [16/51], Iter [150/391] Loss: 0.2787
Epoch [16/51], Iter [160/391] Loss: 0.1876
Epoch [16/51], Iter [170/391] Loss: 0.2442
Epoch [16/51], Iter [180/391] Loss: 0.2348
Epoch [16/51], Iter [190/391] Loss: 0.1812
Epoch [16/51], Iter [200/391] Loss: 0.2509
Epoch [16/51], Iter [210/391] Loss: 0.2091
Epoch [16/51], Iter [220/391] Loss: 0.2139
Epoch [16/51], Iter [230/391] Loss: 0.2053
Epoch [16/51], Iter [240/391] Loss: 0.2539
Epoch [16/51], Iter [250/391] Loss: 0.2033
Epoch [16/51], Iter [260/391] Loss: 0.2391
Epoch [16/51], Iter [270/391] Loss: 0.2708
Epoch [16/51], Iter [280/391] Loss: 0.2478
Epoch [16/51], Iter [290/391] Loss: 0.1756
Epoch [16/51], Iter [300/391] Loss: 0.2370
Epoch [16/51], Iter [310/391] Loss: 0.2052
Epoch [16/51], Iter [320/391] Loss: 0.2944
Epoch [16/51], Iter [330/391] Loss: 0.2068
Epoch [16/51], Iter [340/391] Loss: 0.1969
Epoch [16/51], Iter [350/391] Loss: 0.1492
Epoch [16/51], Iter [360/391] Loss: 0.2010
Epoch [16/51], Iter [370/391] Loss: 0.2047
Epoch [16/51], Iter [380/391] Loss: 0.2370
Epoch [16/51], Iter [390/391] Loss: 0.1895
Epoch [17/51], Iter [10/391] Loss: 0.2108
Epoch [17/51], Iter [20/391] Loss: 0.1835
Epoch [17/51], Iter [30/391] Loss: 0.2724
Epoch [17/51], Iter [40/391] Loss: 0.2241
Epoch [17/51], Iter [50/391] Loss: 0.2101
Epoch [17/51], Iter [60/391] Loss: 0.1935
Epoch [17/51], Iter [70/391] Loss: 0.2284
Epoch [17/51], Iter [80/391] Loss: 0.1944
Epoch [17/51], Iter [90/391] Loss: 0.1828
Epoch [17/51], Iter [100/391] Loss: 0.2570
Epoch [17/51], Iter [110/391] Loss: 0.2164
Epoch [17/51], Iter [120/391] Loss: 0.2181
Epoch [17/51], Iter [130/391] Loss: 0.2173
Epoch [17/51], Iter [140/391] Loss: 0.2070
Epoch [17/51], Iter [150/391] Loss: 0.2111
Epoch [17/51], Iter [160/391] Loss: 0.1808
Epoch [17/51], Iter [170/391] Loss: 0.2278
Epoch [17/51], Iter [180/391] Loss: 0.3323
Epoch [17/51], Iter [190/391] Loss: 0.1916
Epoch [17/51], Iter [200/391] Loss: 0.1758
Epoch [17/51], Iter [210/391] Loss: 0.1860
Epoch [17/51], Iter [220/391] Loss: 0.2152
Epoch [17/51], Iter [230/391] Loss: 0.2412
Epoch [17/51], Iter [240/391] Loss: 0.2425
Epoch [17/51], Iter [250/391] Loss: 0.2135
Epoch [17/51], Iter [260/391] Loss: 0.2036
Epoch [17/51], Iter [270/391] Loss: 0.1919
Epoch [17/51], Iter [280/391] Loss: 0.2407
Epoch [17/51], Iter [290/391] Loss: 0.1971
Epoch [17/51], Iter [300/391] Loss: 0.1762
Epoch [17/51], Iter [310/391] Loss: 0.2534
Epoch [17/51], Iter [320/391] Loss: 0.2695
Epoch [17/51], Iter [330/391] Loss: 0.2468
Epoch [17/51], Iter [340/391] Loss: 0.2957
Epoch [17/51], Iter [350/391] Loss: 0.2598
Epoch [17/51], Iter [360/391] Loss: 0.2013
Epoch [17/51], Iter [370/391] Loss: 0.2297
Epoch [17/51], Iter [380/391] Loss: 0.2659
Epoch [17/51], Iter [390/391] Loss: 0.2238
Epoch [18/51], Iter [10/391] Loss: 0.1774
Epoch [18/51], Iter [20/391] Loss: 0.2292
Epoch [18/51], Iter [30/391] Loss: 0.2264
Epoch [18/51], Iter [40/391] Loss: 0.2334
Epoch [18/51], Iter [50/391] Loss: 0.2521
Epoch [18/51], Iter [60/391] Loss: 0.2189
Epoch [18/51], Iter [70/391] Loss: 0.2551
Epoch [18/51], Iter [80/391] Loss: 0.1853
Epoch [18/51], Iter [90/391] Loss: 0.1860
Epoch [18/51], Iter [100/391] Loss: 0.2111
Epoch [18/51], Iter [110/391] Loss: 0.2150
Epoch [18/51], Iter [120/391] Loss: 0.1800
Epoch [18/51], Iter [130/391] Loss: 0.2481
Epoch [18/51], Iter [140/391] Loss: 0.1492
Epoch [18/51], Iter [150/391] Loss: 0.1748
Epoch [18/51], Iter [160/391] Loss: 0.1969
Epoch [18/51], Iter [170/391] Loss: 0.2175
Epoch [18/51], Iter [180/391] Loss: 0.2375
Epoch [18/51], Iter [190/391] Loss: 0.2342
Epoch [18/51], Iter [200/391] Loss: 0.2083
Epoch [18/51], Iter [210/391] Loss: 0.2665
Epoch [18/51], Iter [220/391] Loss: 0.1960
Epoch [18/51], Iter [230/391] Loss: 0.1401
Epoch [18/51], Iter [240/391] Loss: 0.2149
Epoch [18/51], Iter [250/391] Loss: 0.1984
Epoch [18/51], Iter [260/391] Loss: 0.1518
Epoch [18/51], Iter [270/391] Loss: 0.1962
Epoch [18/51], Iter [280/391] Loss: 0.2254
Epoch [18/51], Iter [290/391] Loss: 0.2323
Epoch [18/51], Iter [300/391] Loss: 0.1193
Epoch [18/51], Iter [310/391] Loss: 0.2120
Epoch [18/51], Iter [320/391] Loss: 0.1679
Epoch [18/51], Iter [330/391] Loss: 0.1665
Epoch [18/51], Iter [340/391] Loss: 0.1407
Epoch [18/51], Iter [350/391] Loss: 0.2089
Epoch [18/51], Iter [360/391] Loss: 0.1611
Epoch [18/51], Iter [370/391] Loss: 0.2122
Epoch [18/51], Iter [380/391] Loss: 0.2667
Epoch [18/51], Iter [390/391] Loss: 0.1761
Epoch [19/51], Iter [10/391] Loss: 0.2292
Epoch [19/51], Iter [20/391] Loss: 0.1666
Epoch [19/51], Iter [30/391] Loss: 0.1670
Epoch [19/51], Iter [40/391] Loss: 0.1948
Epoch [19/51], Iter [50/391] Loss: 0.1512
Epoch [19/51], Iter [60/391] Loss: 0.1718
Epoch [19/51], Iter [70/391] Loss: 0.1823
Epoch [19/51], Iter [80/391] Loss: 0.1634
Epoch [19/51], Iter [90/391] Loss: 0.1845
Epoch [19/51], Iter [100/391] Loss: 0.1970
Epoch [19/51], Iter [110/391] Loss: 0.2269
Epoch [19/51], Iter [120/391] Loss: 0.2066
Epoch [19/51], Iter [130/391] Loss: 0.2389
Epoch [19/51], Iter [140/391] Loss: 0.1851
Epoch [19/51], Iter [150/391] Loss: 0.1862
Epoch [19/51], Iter [160/391] Loss: 0.2101
Epoch [19/51], Iter [170/391] Loss: 0.1711
Epoch [19/51], Iter [180/391] Loss: 0.2213
Epoch [19/51], Iter [190/391] Loss: 0.2366
Epoch [19/51], Iter [200/391] Loss: 0.1666
Epoch [19/51], Iter [210/391] Loss: 0.1710
Epoch [19/51], Iter [220/391] Loss: 0.2404
Epoch [19/51], Iter [230/391] Loss: 0.2466
Epoch [19/51], Iter [240/391] Loss: 0.1931
Epoch [19/51], Iter [250/391] Loss: 0.1885
Epoch [19/51], Iter [260/391] Loss: 0.2216
Epoch [19/51], Iter [270/391] Loss: 0.1956
Epoch [19/51], Iter [280/391] Loss: 0.1806
Epoch [19/51], Iter [290/391] Loss: 0.2017
Epoch [19/51], Iter [300/391] Loss: 0.2082
Epoch [19/51], Iter [310/391] Loss: 0.1945
Epoch [19/51], Iter [320/391] Loss: 0.2113
Epoch [19/51], Iter [330/391] Loss: 0.1962
Epoch [19/51], Iter [340/391] Loss: 0.2156
Epoch [19/51], Iter [350/391] Loss: 0.1892
Epoch [19/51], Iter [360/391] Loss: 0.1763
Epoch [19/51], Iter [370/391] Loss: 0.2610
Epoch [19/51], Iter [380/391] Loss: 0.2116
Epoch [19/51], Iter [390/391] Loss: 0.2442
Epoch [20/51], Iter [10/391] Loss: 0.1462
Epoch [20/51], Iter [20/391] Loss: 0.1450
Epoch [20/51], Iter [30/391] Loss: 0.2036
Epoch [20/51], Iter [40/391] Loss: 0.1492
Epoch [20/51], Iter [50/391] Loss: 0.2024
Epoch [20/51], Iter [60/391] Loss: 0.2163
Epoch [20/51], Iter [70/391] Loss: 0.1665
Epoch [20/51], Iter [80/391] Loss: 0.1931
Epoch [20/51], Iter [90/391] Loss: 0.1667
Epoch [20/51], Iter [100/391] Loss: 0.1724
Epoch [20/51], Iter [110/391] Loss: 0.2071
Epoch [20/51], Iter [120/391] Loss: 0.2238
Epoch [20/51], Iter [130/391] Loss: 0.1694
Epoch [20/51], Iter [140/391] Loss: 0.2080
Epoch [20/51], Iter [150/391] Loss: 0.1914
Epoch [20/51], Iter [160/391] Loss: 0.1631
Epoch [20/51], Iter [170/391] Loss: 0.1794
Epoch [20/51], Iter [180/391] Loss: 0.2417
Epoch [20/51], Iter [190/391] Loss: 0.2443
Epoch [20/51], Iter [200/391] Loss: 0.1835
Epoch [20/51], Iter [210/391] Loss: 0.1557
Epoch [20/51], Iter [220/391] Loss: 0.2165
Epoch [20/51], Iter [230/391] Loss: 0.1900
Epoch [20/51], Iter [240/391] Loss: 0.1667
Epoch [20/51], Iter [250/391] Loss: 0.1537
Epoch [20/51], Iter [260/391] Loss: 0.1460
Epoch [20/51], Iter [270/391] Loss: 0.2133
Epoch [20/51], Iter [280/391] Loss: 0.2145
Epoch [20/51], Iter [290/391] Loss: 0.1945
Epoch [20/51], Iter [300/391] Loss: 0.2474
Epoch [20/51], Iter [310/391] Loss: 0.2476
Epoch [20/51], Iter [320/391] Loss: 0.1609
Epoch [20/51], Iter [330/391] Loss: 0.1644
Epoch [20/51], Iter [340/391] Loss: 0.1990
Epoch [20/51], Iter [350/391] Loss: 0.1742
Epoch [20/51], Iter [360/391] Loss: 0.2125
Epoch [20/51], Iter [370/391] Loss: 0.1425
Epoch [20/51], Iter [380/391] Loss: 0.2046
Epoch [20/51], Iter [390/391] Loss: 0.1751
[Saving Checkpoint]
Epoch [21/51], Iter [10/391] Loss: 0.1314
Epoch [21/51], Iter [20/391] Loss: 0.1459
Epoch [21/51], Iter [30/391] Loss: 0.1365
Epoch [21/51], Iter [40/391] Loss: 0.1535
Epoch [21/51], Iter [50/391] Loss: 0.1548
Epoch [21/51], Iter [60/391] Loss: 0.1804
Epoch [21/51], Iter [70/391] Loss: 0.2136
Epoch [21/51], Iter [80/391] Loss: 0.2080
Epoch [21/51], Iter [90/391] Loss: 0.1475
Epoch [21/51], Iter [100/391] Loss: 0.2224
Epoch [21/51], Iter [110/391] Loss: 0.1814
Epoch [21/51], Iter [120/391] Loss: 0.1555
Epoch [21/51], Iter [130/391] Loss: 0.2027
Epoch [21/51], Iter [140/391] Loss: 0.1415
Epoch [21/51], Iter [150/391] Loss: 0.2037
Epoch [21/51], Iter [160/391] Loss: 0.2205
Epoch [21/51], Iter [170/391] Loss: 0.1892
Epoch [21/51], Iter [180/391] Loss: 0.1816
Epoch [21/51], Iter [190/391] Loss: 0.1536
Epoch [21/51], Iter [200/391] Loss: 0.2220
Epoch [21/51], Iter [210/391] Loss: 0.1410
Epoch [21/51], Iter [220/391] Loss: 0.1089
Epoch [21/51], Iter [230/391] Loss: 0.1994
Epoch [21/51], Iter [240/391] Loss: 0.2110
Epoch [21/51], Iter [250/391] Loss: 0.2292
Epoch [21/51], Iter [260/391] Loss: 0.1597
Epoch [21/51], Iter [270/391] Loss: 0.1712
Epoch [21/51], Iter [280/391] Loss: 0.1936
Epoch [21/51], Iter [290/391] Loss: 0.1961
Epoch [21/51], Iter [300/391] Loss: 0.2047
Epoch [21/51], Iter [310/391] Loss: 0.2304
Epoch [21/51], Iter [320/391] Loss: 0.2012
Epoch [21/51], Iter [330/391] Loss: 0.1574
Epoch [21/51], Iter [340/391] Loss: 0.2183
Epoch [21/51], Iter [350/391] Loss: 0.2195
Epoch [21/51], Iter [360/391] Loss: 0.2264
Epoch [21/51], Iter [370/391] Loss: 0.2013
Epoch [21/51], Iter [380/391] Loss: 0.1547
Epoch [21/51], Iter [390/391] Loss: 0.1447
Epoch [22/51], Iter [10/391] Loss: 0.1557
Epoch [22/51], Iter [20/391] Loss: 0.1086
Epoch [22/51], Iter [30/391] Loss: 0.1611
Epoch [22/51], Iter [40/391] Loss: 0.1483
Epoch [22/51], Iter [50/391] Loss: 0.1393
Epoch [22/51], Iter [60/391] Loss: 0.1523
Epoch [22/51], Iter [70/391] Loss: 0.1403
Epoch [22/51], Iter [80/391] Loss: 0.1808
Epoch [22/51], Iter [90/391] Loss: 0.1522
Epoch [22/51], Iter [100/391] Loss: 0.1900
Epoch [22/51], Iter [110/391] Loss: 0.1194
Epoch [22/51], Iter [120/391] Loss: 0.1464
Epoch [22/51], Iter [130/391] Loss: 0.1427
Epoch [22/51], Iter [140/391] Loss: 0.1806
Epoch [22/51], Iter [150/391] Loss: 0.1974
Epoch [22/51], Iter [160/391] Loss: 0.1922
Epoch [22/51], Iter [170/391] Loss: 0.1586
Epoch [22/51], Iter [180/391] Loss: 0.1260
Epoch [22/51], Iter [190/391] Loss: 0.1582
Epoch [22/51], Iter [200/391] Loss: 0.1628
Epoch [22/51], Iter [210/391] Loss: 0.1550
Epoch [22/51], Iter [220/391] Loss: 0.1150
Epoch [22/51], Iter [230/391] Loss: 0.1696
Epoch [22/51], Iter [240/391] Loss: 0.1659
Epoch [22/51], Iter [250/391] Loss: 0.1529
Epoch [22/51], Iter [260/391] Loss: 0.1520
Epoch [22/51], Iter [270/391] Loss: 0.1467
Epoch [22/51], Iter [280/391] Loss: 0.1896
Epoch [22/51], Iter [290/391] Loss: 0.1442
Epoch [22/51], Iter [300/391] Loss: 0.1816
Epoch [22/51], Iter [310/391] Loss: 0.1425
Epoch [22/51], Iter [320/391] Loss: 0.2079
Epoch [22/51], Iter [330/391] Loss: 0.2279
Epoch [22/51], Iter [340/391] Loss: 0.1683
Epoch [22/51], Iter [350/391] Loss: 0.1800
Epoch [22/51], Iter [360/391] Loss: 0.2558
Epoch [22/51], Iter [370/391] Loss: 0.2041
Epoch [22/51], Iter [380/391] Loss: 0.1873
Epoch [22/51], Iter [390/391] Loss: 0.1931
Epoch [23/51], Iter [10/391] Loss: 0.1784
Epoch [23/51], Iter [20/391] Loss: 0.1394
Epoch [23/51], Iter [30/391] Loss: 0.1350
Epoch [23/51], Iter [40/391] Loss: 0.1317
Epoch [23/51], Iter [50/391] Loss: 0.1360
Epoch [23/51], Iter [60/391] Loss: 0.1999
Epoch [23/51], Iter [70/391] Loss: 0.1244
Epoch [23/51], Iter [80/391] Loss: 0.2092
Epoch [23/51], Iter [90/391] Loss: 0.1855
Epoch [23/51], Iter [100/391] Loss: 0.1427
Epoch [23/51], Iter [110/391] Loss: 0.2154
Epoch [23/51], Iter [120/391] Loss: 0.1285
Epoch [23/51], Iter [130/391] Loss: 0.1742
Epoch [23/51], Iter [140/391] Loss: 0.1510
Epoch [23/51], Iter [150/391] Loss: 0.1561
Epoch [23/51], Iter [160/391] Loss: 0.1814
Epoch [23/51], Iter [170/391] Loss: 0.1449
Epoch [23/51], Iter [180/391] Loss: 0.1938
Epoch [23/51], Iter [190/391] Loss: 0.1659
Epoch [23/51], Iter [200/391] Loss: 0.2382
Epoch [23/51], Iter [210/391] Loss: 0.1907
Epoch [23/51], Iter [220/391] Loss: 0.1666
Epoch [23/51], Iter [230/391] Loss: 0.1696
Epoch [23/51], Iter [240/391] Loss: 0.1878
Epoch [23/51], Iter [250/391] Loss: 0.1542
Epoch [23/51], Iter [260/391] Loss: 0.1465
Epoch [23/51], Iter [270/391] Loss: 0.2158
Epoch [23/51], Iter [280/391] Loss: 0.1408
Epoch [23/51], Iter [290/391] Loss: 0.1739
Epoch [23/51], Iter [300/391] Loss: 0.2165
Epoch [23/51], Iter [310/391] Loss: 0.1831
Epoch [23/51], Iter [320/391] Loss: 0.1804
Epoch [23/51], Iter [330/391] Loss: 0.1498
Epoch [23/51], Iter [340/391] Loss: 0.2048
Epoch [23/51], Iter [350/391] Loss: 0.1698
Epoch [23/51], Iter [360/391] Loss: 0.1551
Epoch [23/51], Iter [370/391] Loss: 0.1346
Epoch [23/51], Iter [380/391] Loss: 0.1355
Epoch [23/51], Iter [390/391] Loss: 0.1687
Epoch [24/51], Iter [10/391] Loss: 0.1303
Epoch [24/51], Iter [20/391] Loss: 0.1637
Epoch [24/51], Iter [30/391] Loss: 0.1296
Epoch [24/51], Iter [40/391] Loss: 0.1879
Epoch [24/51], Iter [50/391] Loss: 0.2202
Epoch [24/51], Iter [60/391] Loss: 0.1986
Epoch [24/51], Iter [70/391] Loss: 0.1633
Epoch [24/51], Iter [80/391] Loss: 0.1614
Epoch [24/51], Iter [90/391] Loss: 0.1420
Epoch [24/51], Iter [100/391] Loss: 0.1940
Epoch [24/51], Iter [110/391] Loss: 0.1004
Epoch [24/51], Iter [120/391] Loss: 0.1483
Epoch [24/51], Iter [130/391] Loss: 0.1686
Epoch [24/51], Iter [140/391] Loss: 0.1300
Epoch [24/51], Iter [150/391] Loss: 0.2004
Epoch [24/51], Iter [160/391] Loss: 0.1001
Epoch [24/51], Iter [170/391] Loss: 0.1276
Epoch [24/51], Iter [180/391] Loss: 0.1167
Epoch [24/51], Iter [190/391] Loss: 0.1413
Epoch [24/51], Iter [200/391] Loss: 0.1978
Epoch [24/51], Iter [210/391] Loss: 0.1966
Epoch [24/51], Iter [220/391] Loss: 0.1206
Epoch [24/51], Iter [230/391] Loss: 0.1654
Epoch [24/51], Iter [240/391] Loss: 0.1835
Epoch [24/51], Iter [250/391] Loss: 0.2339
Epoch [24/51], Iter [260/391] Loss: 0.1770
Epoch [24/51], Iter [270/391] Loss: 0.1865
Epoch [24/51], Iter [280/391] Loss: 0.1550
Epoch [24/51], Iter [290/391] Loss: 0.1687
Epoch [24/51], Iter [300/391] Loss: 0.1250
Epoch [24/51], Iter [310/391] Loss: 0.1626
Epoch [24/51], Iter [320/391] Loss: 0.1649
Epoch [24/51], Iter [330/391] Loss: 0.1315
Epoch [24/51], Iter [340/391] Loss: 0.1858
Epoch [24/51], Iter [350/391] Loss: 0.1250
Epoch [24/51], Iter [360/391] Loss: 0.2379
Epoch [24/51], Iter [370/391] Loss: 0.1629
Epoch [24/51], Iter [380/391] Loss: 0.2082
Epoch [24/51], Iter [390/391] Loss: 0.1149
Epoch [25/51], Iter [10/391] Loss: 0.1964
Epoch [25/51], Iter [20/391] Loss: 0.1591
Epoch [25/51], Iter [30/391] Loss: 0.0960
Epoch [25/51], Iter [40/391] Loss: 0.1605
Epoch [25/51], Iter [50/391] Loss: 0.1179
Epoch [25/51], Iter [60/391] Loss: 0.1425
Epoch [25/51], Iter [70/391] Loss: 0.1741
Epoch [25/51], Iter [80/391] Loss: 0.1520
Epoch [25/51], Iter [90/391] Loss: 0.1410
Epoch [25/51], Iter [100/391] Loss: 0.1322
Epoch [25/51], Iter [110/391] Loss: 0.1469
Epoch [25/51], Iter [120/391] Loss: 0.1698
Epoch [25/51], Iter [130/391] Loss: 0.1604
Epoch [25/51], Iter [140/391] Loss: 0.1310
Epoch [25/51], Iter [150/391] Loss: 0.1502
Epoch [25/51], Iter [160/391] Loss: 0.1696
Epoch [25/51], Iter [170/391] Loss: 0.1894
Epoch [25/51], Iter [180/391] Loss: 0.1271
Epoch [25/51], Iter [190/391] Loss: 0.1371
Epoch [25/51], Iter [200/391] Loss: 0.1772
Epoch [25/51], Iter [210/391] Loss: 0.1198
Epoch [25/51], Iter [220/391] Loss: 0.1611
Epoch [25/51], Iter [230/391] Loss: 0.1937
Epoch [25/51], Iter [240/391] Loss: 0.1589
Epoch [25/51], Iter [250/391] Loss: 0.1402
Epoch [25/51], Iter [260/391] Loss: 0.1615
Epoch [25/51], Iter [270/391] Loss: 0.1195
Epoch [25/51], Iter [280/391] Loss: 0.1842
Epoch [25/51], Iter [290/391] Loss: 0.1789
Epoch [25/51], Iter [300/391] Loss: 0.1234
Epoch [25/51], Iter [310/391] Loss: 0.1159
Epoch [25/51], Iter [320/391] Loss: 0.1671
Epoch [25/51], Iter [330/391] Loss: 0.1645
Epoch [25/51], Iter [340/391] Loss: 0.1360
Epoch [25/51], Iter [350/391] Loss: 0.1700
Epoch [25/51], Iter [360/391] Loss: 0.1689
Epoch [25/51], Iter [370/391] Loss: 0.1042
Epoch [25/51], Iter [380/391] Loss: 0.1639
Epoch [25/51], Iter [390/391] Loss: 0.1839
Epoch [26/51], Iter [10/391] Loss: 0.0770
Epoch [26/51], Iter [20/391] Loss: 0.1218
Epoch [26/51], Iter [30/391] Loss: 0.1252
Epoch [26/51], Iter [40/391] Loss: 0.0726
Epoch [26/51], Iter [50/391] Loss: 0.1718
Epoch [26/51], Iter [60/391] Loss: 0.1218
Epoch [26/51], Iter [70/391] Loss: 0.1513
Epoch [26/51], Iter [80/391] Loss: 0.1087
Epoch [26/51], Iter [90/391] Loss: 0.1970
Epoch [26/51], Iter [100/391] Loss: 0.1406
Epoch [26/51], Iter [110/391] Loss: 0.1317
Epoch [26/51], Iter [120/391] Loss: 0.1539
Epoch [26/51], Iter [130/391] Loss: 0.1536
Epoch [26/51], Iter [140/391] Loss: 0.1697
Epoch [26/51], Iter [150/391] Loss: 0.2012
Epoch [26/51], Iter [160/391] Loss: 0.0935
Epoch [26/51], Iter [170/391] Loss: 0.1466
Epoch [26/51], Iter [180/391] Loss: 0.1905
Epoch [26/51], Iter [190/391] Loss: 0.1436
Epoch [26/51], Iter [200/391] Loss: 0.1676
Epoch [26/51], Iter [210/391] Loss: 0.1616
Epoch [26/51], Iter [220/391] Loss: 0.1124
Epoch [26/51], Iter [230/391] Loss: 0.1407
Epoch [26/51], Iter [240/391] Loss: 0.1445
Epoch [26/51], Iter [250/391] Loss: 0.1320
Epoch [26/51], Iter [260/391] Loss: 0.1836
Epoch [26/51], Iter [270/391] Loss: 0.1753
Epoch [26/51], Iter [280/391] Loss: 0.1268
Epoch [26/51], Iter [290/391] Loss: 0.1158
Epoch [26/51], Iter [300/391] Loss: 0.1414
Epoch [26/51], Iter [310/391] Loss: 0.1877
Epoch [26/51], Iter [320/391] Loss: 0.1363
Epoch [26/51], Iter [330/391] Loss: 0.1323
Epoch [26/51], Iter [340/391] Loss: 0.1298
Epoch [26/51], Iter [350/391] Loss: 0.1325
Epoch [26/51], Iter [360/391] Loss: 0.1624
Epoch [26/51], Iter [370/391] Loss: 0.1484
Epoch [26/51], Iter [380/391] Loss: 0.1922
Epoch [26/51], Iter [390/391] Loss: 0.1740
Epoch [27/51], Iter [10/391] Loss: 0.1525
Epoch [27/51], Iter [20/391] Loss: 0.1352
Epoch [27/51], Iter [30/391] Loss: 0.1151
Epoch [27/51], Iter [40/391] Loss: 0.1451
Epoch [27/51], Iter [50/391] Loss: 0.1236
Epoch [27/51], Iter [60/391] Loss: 0.1287
Epoch [27/51], Iter [70/391] Loss: 0.1598
Epoch [27/51], Iter [80/391] Loss: 0.1828
Epoch [27/51], Iter [90/391] Loss: 0.1299
Epoch [27/51], Iter [100/391] Loss: 0.2057
Epoch [27/51], Iter [110/391] Loss: 0.1436
Epoch [27/51], Iter [120/391] Loss: 0.1231
Epoch [27/51], Iter [130/391] Loss: 0.1445
Epoch [27/51], Iter [140/391] Loss: 0.1690
Epoch [27/51], Iter [150/391] Loss: 0.1602
Epoch [27/51], Iter [160/391] Loss: 0.1327
Epoch [27/51], Iter [170/391] Loss: 0.1202
Epoch [27/51], Iter [180/391] Loss: 0.1086
Epoch [27/51], Iter [190/391] Loss: 0.1247
Epoch [27/51], Iter [200/391] Loss: 0.1411
Epoch [27/51], Iter [210/391] Loss: 0.1541
Epoch [27/51], Iter [220/391] Loss: 0.1549
Epoch [27/51], Iter [230/391] Loss: 0.0929
Epoch [27/51], Iter [240/391] Loss: 0.1786
Epoch [27/51], Iter [250/391] Loss: 0.1614
Epoch [27/51], Iter [260/391] Loss: 0.1409
Epoch [27/51], Iter [270/391] Loss: 0.1802
Epoch [27/51], Iter [280/391] Loss: 0.1145
Epoch [27/51], Iter [290/391] Loss: 0.1172
Epoch [27/51], Iter [300/391] Loss: 0.1143
Epoch [27/51], Iter [310/391] Loss: 0.1338
Epoch [27/51], Iter [320/391] Loss: 0.1767
Epoch [27/51], Iter [330/391] Loss: 0.1213
Epoch [27/51], Iter [340/391] Loss: 0.1571
Epoch [27/51], Iter [350/391] Loss: 0.1541
Epoch [27/51], Iter [360/391] Loss: 0.1751
Epoch [27/51], Iter [370/391] Loss: 0.1214
Epoch [27/51], Iter [380/391] Loss: 0.1544
Epoch [27/51], Iter [390/391] Loss: 0.1416
Epoch [28/51], Iter [10/391] Loss: 0.1515
Epoch [28/51], Iter [20/391] Loss: 0.1002
Epoch [28/51], Iter [30/391] Loss: 0.1030
Epoch [28/51], Iter [40/391] Loss: 0.1582
Epoch [28/51], Iter [50/391] Loss: 0.0875
Epoch [28/51], Iter [60/391] Loss: 0.1055
Epoch [28/51], Iter [70/391] Loss: 0.1442
Epoch [28/51], Iter [80/391] Loss: 0.1204
Epoch [28/51], Iter [90/391] Loss: 0.1111
Epoch [28/51], Iter [100/391] Loss: 0.1295
Epoch [28/51], Iter [110/391] Loss: 0.1159
Epoch [28/51], Iter [120/391] Loss: 0.1541
Epoch [28/51], Iter [130/391] Loss: 0.1312
Epoch [28/51], Iter [140/391] Loss: 0.1005
Epoch [28/51], Iter [150/391] Loss: 0.1301
Epoch [28/51], Iter [160/391] Loss: 0.1323
Epoch [28/51], Iter [170/391] Loss: 0.1186
Epoch [28/51], Iter [180/391] Loss: 0.1647
Epoch [28/51], Iter [190/391] Loss: 0.1222
Epoch [28/51], Iter [200/391] Loss: 0.1398
Epoch [28/51], Iter [210/391] Loss: 0.1839
Epoch [28/51], Iter [220/391] Loss: 0.1313
Epoch [28/51], Iter [230/391] Loss: 0.0850
Epoch [28/51], Iter [240/391] Loss: 0.1354
Epoch [28/51], Iter [250/391] Loss: 0.1602
Epoch [28/51], Iter [260/391] Loss: 0.1588
Epoch [28/51], Iter [270/391] Loss: 0.1245
Epoch [28/51], Iter [280/391] Loss: 0.1808
Epoch [28/51], Iter [290/391] Loss: 0.1412
Epoch [28/51], Iter [300/391] Loss: 0.1278
Epoch [28/51], Iter [310/391] Loss: 0.1600
Epoch [28/51], Iter [320/391] Loss: 0.1737
Epoch [28/51], Iter [330/391] Loss: 0.1266
Epoch [28/51], Iter [340/391] Loss: 0.1585
Epoch [28/51], Iter [350/391] Loss: 0.1274
Epoch [28/51], Iter [360/391] Loss: 0.1136
Epoch [28/51], Iter [370/391] Loss: 0.1430
Epoch [28/51], Iter [380/391] Loss: 0.1188
Epoch [28/51], Iter [390/391] Loss: 0.1160
Epoch [29/51], Iter [10/391] Loss: 0.1313
Epoch [29/51], Iter [20/391] Loss: 0.1107
Epoch [29/51], Iter [30/391] Loss: 0.1454
Epoch [29/51], Iter [40/391] Loss: 0.1333
Epoch [29/51], Iter [50/391] Loss: 0.1130
Epoch [29/51], Iter [60/391] Loss: 0.1313
Epoch [29/51], Iter [70/391] Loss: 0.1365
Epoch [29/51], Iter [80/391] Loss: 0.1227
Epoch [29/51], Iter [90/391] Loss: 0.1125
Epoch [29/51], Iter [100/391] Loss: 0.1277
Epoch [29/51], Iter [110/391] Loss: 0.0966
Epoch [29/51], Iter [120/391] Loss: 0.1315
Epoch [29/51], Iter [130/391] Loss: 0.0813
Epoch [29/51], Iter [140/391] Loss: 0.1236
Epoch [29/51], Iter [150/391] Loss: 0.1456
Epoch [29/51], Iter [160/391] Loss: 0.0983
Epoch [29/51], Iter [170/391] Loss: 0.1231
Epoch [29/51], Iter [180/391] Loss: 0.1348
Epoch [29/51], Iter [190/391] Loss: 0.1028
Epoch [29/51], Iter [200/391] Loss: 0.1399
Epoch [29/51], Iter [210/391] Loss: 0.1285
Epoch [29/51], Iter [220/391] Loss: 0.1060
Epoch [29/51], Iter [230/391] Loss: 0.1226
Epoch [29/51], Iter [240/391] Loss: 0.1787
Epoch [29/51], Iter [250/391] Loss: 0.1431
Epoch [29/51], Iter [260/391] Loss: 0.1472
Epoch [29/51], Iter [270/391] Loss: 0.1563
Epoch [29/51], Iter [280/391] Loss: 0.1554
Epoch [29/51], Iter [290/391] Loss: 0.1139
Epoch [29/51], Iter [300/391] Loss: 0.0891
Epoch [29/51], Iter [310/391] Loss: 0.1440
Epoch [29/51], Iter [320/391] Loss: 0.1484
Epoch [29/51], Iter [330/391] Loss: 0.1262
Epoch [29/51], Iter [340/391] Loss: 0.1414
Epoch [29/51], Iter [350/391] Loss: 0.1501
Epoch [29/51], Iter [360/391] Loss: 0.1078
Epoch [29/51], Iter [370/391] Loss: 0.0964
Epoch [29/51], Iter [380/391] Loss: 0.1359
Epoch [29/51], Iter [390/391] Loss: 0.1641
Epoch [30/51], Iter [10/391] Loss: 0.0982
Epoch [30/51], Iter [20/391] Loss: 0.1128
Epoch [30/51], Iter [30/391] Loss: 0.0995
Epoch [30/51], Iter [40/391] Loss: 0.1636
Epoch [30/51], Iter [50/391] Loss: 0.1490
Epoch [30/51], Iter [60/391] Loss: 0.1477
Epoch [30/51], Iter [70/391] Loss: 0.1147
Epoch [30/51], Iter [80/391] Loss: 0.0860
Epoch [30/51], Iter [90/391] Loss: 0.1123
Epoch [30/51], Iter [100/391] Loss: 0.1292
Epoch [30/51], Iter [110/391] Loss: 0.1206
Epoch [30/51], Iter [120/391] Loss: 0.1373
Epoch [30/51], Iter [130/391] Loss: 0.1271
Epoch [30/51], Iter [140/391] Loss: 0.1412
Epoch [30/51], Iter [150/391] Loss: 0.1292
Epoch [30/51], Iter [160/391] Loss: 0.1245
Epoch [30/51], Iter [170/391] Loss: 0.1190
Epoch [30/51], Iter [180/391] Loss: 0.0861
Epoch [30/51], Iter [190/391] Loss: 0.1084
Epoch [30/51], Iter [200/391] Loss: 0.1446
Epoch [30/51], Iter [210/391] Loss: 0.1720
Epoch [30/51], Iter [220/391] Loss: 0.1066
Epoch [30/51], Iter [230/391] Loss: 0.1175
Epoch [30/51], Iter [240/391] Loss: 0.1280
Epoch [30/51], Iter [250/391] Loss: 0.1459
Epoch [30/51], Iter [260/391] Loss: 0.1226
Epoch [30/51], Iter [270/391] Loss: 0.1298
Epoch [30/51], Iter [280/391] Loss: 0.1491
Epoch [30/51], Iter [290/391] Loss: 0.1041
Epoch [30/51], Iter [300/391] Loss: 0.1344
Epoch [30/51], Iter [310/391] Loss: 0.1201
Epoch [30/51], Iter [320/391] Loss: 0.1255
Epoch [30/51], Iter [330/391] Loss: 0.1245
Epoch [30/51], Iter [340/391] Loss: 0.1840
Epoch [30/51], Iter [350/391] Loss: 0.1181
Epoch [30/51], Iter [360/391] Loss: 0.1506
Epoch [30/51], Iter [370/391] Loss: 0.0949
Epoch [30/51], Iter [380/391] Loss: 0.1652
Epoch [30/51], Iter [390/391] Loss: 0.1271
[Saving Checkpoint]
Epoch [31/51], Iter [10/391] Loss: 0.1297
Epoch [31/51], Iter [20/391] Loss: 0.1378
Epoch [31/51], Iter [30/391] Loss: 0.0923
Epoch [31/51], Iter [40/391] Loss: 0.1120
Epoch [31/51], Iter [50/391] Loss: 0.1543
Epoch [31/51], Iter [60/391] Loss: 0.0934
Epoch [31/51], Iter [70/391] Loss: 0.1131
Epoch [31/51], Iter [80/391] Loss: 0.1360
Epoch [31/51], Iter [90/391] Loss: 0.1222
Epoch [31/51], Iter [100/391] Loss: 0.1013
Epoch [31/51], Iter [110/391] Loss: 0.1367
Epoch [31/51], Iter [120/391] Loss: 0.1087
Epoch [31/51], Iter [130/391] Loss: 0.1483
Epoch [31/51], Iter [140/391] Loss: 0.1517
Epoch [31/51], Iter [150/391] Loss: 0.1280
Epoch [31/51], Iter [160/391] Loss: 0.1163
Epoch [31/51], Iter [170/391] Loss: 0.0938
Epoch [31/51], Iter [180/391] Loss: 0.1451
Epoch [31/51], Iter [190/391] Loss: 0.1240
Epoch [31/51], Iter [200/391] Loss: 0.0935
Epoch [31/51], Iter [210/391] Loss: 0.1051
Epoch [31/51], Iter [220/391] Loss: 0.1088
Epoch [31/51], Iter [230/391] Loss: 0.0975
Epoch [31/51], Iter [240/391] Loss: 0.1543
Epoch [31/51], Iter [250/391] Loss: 0.1543
Epoch [31/51], Iter [260/391] Loss: 0.1381
Epoch [31/51], Iter [270/391] Loss: 0.0815
Epoch [31/51], Iter [280/391] Loss: 0.1341
Epoch [31/51], Iter [290/391] Loss: 0.1409
Epoch [31/51], Iter [300/391] Loss: 0.1475
Epoch [31/51], Iter [310/391] Loss: 0.1328
Epoch [31/51], Iter [320/391] Loss: 0.0914
Epoch [31/51], Iter [330/391] Loss: 0.1298
Epoch [31/51], Iter [340/391] Loss: 0.1417
Epoch [31/51], Iter [350/391] Loss: 0.1119
Epoch [31/51], Iter [360/391] Loss: 0.1160
Epoch [31/51], Iter [370/391] Loss: 0.1168
Epoch [31/51], Iter [380/391] Loss: 0.1105
Epoch [31/51], Iter [390/391] Loss: 0.1031
Epoch [32/51], Iter [10/391] Loss: 0.1215
Epoch [32/51], Iter [20/391] Loss: 0.1064
Epoch [32/51], Iter [30/391] Loss: 0.1085
Epoch [32/51], Iter [40/391] Loss: 0.1315
Epoch [32/51], Iter [50/391] Loss: 0.1253
Epoch [32/51], Iter [60/391] Loss: 0.1160
Epoch [32/51], Iter [70/391] Loss: 0.1135
Epoch [32/51], Iter [80/391] Loss: 0.1215
Epoch [32/51], Iter [90/391] Loss: 0.1079
Epoch [32/51], Iter [100/391] Loss: 0.1143
Epoch [32/51], Iter [110/391] Loss: 0.0719
Epoch [32/51], Iter [120/391] Loss: 0.1025
Epoch [32/51], Iter [130/391] Loss: 0.1095
Epoch [32/51], Iter [140/391] Loss: 0.1337
Epoch [32/51], Iter [150/391] Loss: 0.1359
Epoch [32/51], Iter [160/391] Loss: 0.1307
Epoch [32/51], Iter [170/391] Loss: 0.1731
Epoch [32/51], Iter [180/391] Loss: 0.1605
Epoch [32/51], Iter [190/391] Loss: 0.0963
Epoch [32/51], Iter [200/391] Loss: 0.1451
Epoch [32/51], Iter [210/391] Loss: 0.1315
Epoch [32/51], Iter [220/391] Loss: 0.1084
Epoch [32/51], Iter [230/391] Loss: 0.1357
Epoch [32/51], Iter [240/391] Loss: 0.1094
Epoch [32/51], Iter [250/391] Loss: 0.1050
Epoch [32/51], Iter [260/391] Loss: 0.1289
Epoch [32/51], Iter [270/391] Loss: 0.1054
Epoch [32/51], Iter [280/391] Loss: 0.1523
Epoch [32/51], Iter [290/391] Loss: 0.1235
Epoch [32/51], Iter [300/391] Loss: 0.1154
Epoch [32/51], Iter [310/391] Loss: 0.1362
Epoch [32/51], Iter [320/391] Loss: 0.0909
Epoch [32/51], Iter [330/391] Loss: 0.1299
Epoch [32/51], Iter [340/391] Loss: 0.1214
Epoch [32/51], Iter [350/391] Loss: 0.1580
Epoch [32/51], Iter [360/391] Loss: 0.1238
Epoch [32/51], Iter [370/391] Loss: 0.1094
Epoch [32/51], Iter [380/391] Loss: 0.0643
Epoch [32/51], Iter [390/391] Loss: 0.1307
Epoch [33/51], Iter [10/391] Loss: 0.1252
Epoch [33/51], Iter [20/391] Loss: 0.1201
Epoch [33/51], Iter [30/391] Loss: 0.1305
Epoch [33/51], Iter [40/391] Loss: 0.0723
Epoch [33/51], Iter [50/391] Loss: 0.0740
Epoch [33/51], Iter [60/391] Loss: 0.1129
Epoch [33/51], Iter [70/391] Loss: 0.1253
Epoch [33/51], Iter [80/391] Loss: 0.0950
Epoch [33/51], Iter [90/391] Loss: 0.1060
Epoch [33/51], Iter [100/391] Loss: 0.1005
Epoch [33/51], Iter [110/391] Loss: 0.1081
Epoch [33/51], Iter [120/391] Loss: 0.1470
Epoch [33/51], Iter [130/391] Loss: 0.1092
Epoch [33/51], Iter [140/391] Loss: 0.1081
Epoch [33/51], Iter [150/391] Loss: 0.0942
Epoch [33/51], Iter [160/391] Loss: 0.1227
Epoch [33/51], Iter [170/391] Loss: 0.0742
Epoch [33/51], Iter [180/391] Loss: 0.1299
Epoch [33/51], Iter [190/391] Loss: 0.1207
Epoch [33/51], Iter [200/391] Loss: 0.1211
Epoch [33/51], Iter [210/391] Loss: 0.0893
Epoch [33/51], Iter [220/391] Loss: 0.1041
Epoch [33/51], Iter [230/391] Loss: 0.1220
Epoch [33/51], Iter [240/391] Loss: 0.0980
Epoch [33/51], Iter [250/391] Loss: 0.1195
Epoch [33/51], Iter [260/391] Loss: 0.1021
Epoch [33/51], Iter [270/391] Loss: 0.1223
Epoch [33/51], Iter [280/391] Loss: 0.1424
Epoch [33/51], Iter [290/391] Loss: 0.1241
Epoch [33/51], Iter [300/391] Loss: 0.1165
Epoch [33/51], Iter [310/391] Loss: 0.1350
Epoch [33/51], Iter [320/391] Loss: 0.1037
Epoch [33/51], Iter [330/391] Loss: 0.0913
Epoch [33/51], Iter [340/391] Loss: 0.1081
Epoch [33/51], Iter [350/391] Loss: 0.1401
Epoch [33/51], Iter [360/391] Loss: 0.1341
Epoch [33/51], Iter [370/391] Loss: 0.1325
Epoch [33/51], Iter [380/391] Loss: 0.2021
Epoch [33/51], Iter [390/391] Loss: 0.1269
Epoch [34/51], Iter [10/391] Loss: 0.0842
Epoch [34/51], Iter [20/391] Loss: 0.1383
Epoch [34/51], Iter [30/391] Loss: 0.1189
Epoch [34/51], Iter [40/391] Loss: 0.1388
Epoch [34/51], Iter [50/391] Loss: 0.1285
Epoch [34/51], Iter [60/391] Loss: 0.0705
Epoch [34/51], Iter [70/391] Loss: 0.1033
Epoch [34/51], Iter [80/391] Loss: 0.1219
Epoch [34/51], Iter [90/391] Loss: 0.0934
Epoch [34/51], Iter [100/391] Loss: 0.1115
Epoch [34/51], Iter [110/391] Loss: 0.0935
Epoch [34/51], Iter [120/391] Loss: 0.1227
Epoch [34/51], Iter [130/391] Loss: 0.1245
Epoch [34/51], Iter [140/391] Loss: 0.1008
Epoch [34/51], Iter [150/391] Loss: 0.0747
Epoch [34/51], Iter [160/391] Loss: 0.1370
Epoch [34/51], Iter [170/391] Loss: 0.1364
Epoch [34/51], Iter [180/391] Loss: 0.1456
Epoch [34/51], Iter [190/391] Loss: 0.1233
Epoch [34/51], Iter [200/391] Loss: 0.1092
Epoch [34/51], Iter [210/391] Loss: 0.0939
Epoch [34/51], Iter [220/391] Loss: 0.1086
Epoch [34/51], Iter [230/391] Loss: 0.0774
Epoch [34/51], Iter [240/391] Loss: 0.1127
Epoch [34/51], Iter [250/391] Loss: 0.1102
Epoch [34/51], Iter [260/391] Loss: 0.1353
Epoch [34/51], Iter [270/391] Loss: 0.1091
Epoch [34/51], Iter [280/391] Loss: 0.1229
Epoch [34/51], Iter [290/391] Loss: 0.0955
Epoch [34/51], Iter [300/391] Loss: 0.1010
Epoch [34/51], Iter [310/391] Loss: 0.1700
Epoch [34/51], Iter [320/391] Loss: 0.1163
Epoch [34/51], Iter [330/391] Loss: 0.1409
Epoch [34/51], Iter [340/391] Loss: 0.0807
Epoch [34/51], Iter [350/391] Loss: 0.1254
Epoch [34/51], Iter [360/391] Loss: 0.0957
Epoch [34/51], Iter [370/391] Loss: 0.1745
Epoch [34/51], Iter [380/391] Loss: 0.1255
Epoch [34/51], Iter [390/391] Loss: 0.1418
Epoch [35/51], Iter [10/391] Loss: 0.1149
Epoch [35/51], Iter [20/391] Loss: 0.1204
Epoch [35/51], Iter [30/391] Loss: 0.1428
Epoch [35/51], Iter [40/391] Loss: 0.0836
Epoch [35/51], Iter [50/391] Loss: 0.0821
Epoch [35/51], Iter [60/391] Loss: 0.1154
Epoch [35/51], Iter [70/391] Loss: 0.1017
Epoch [35/51], Iter [80/391] Loss: 0.1125
Epoch [35/51], Iter [90/391] Loss: 0.0931
Epoch [35/51], Iter [100/391] Loss: 0.1069
Epoch [35/51], Iter [110/391] Loss: 0.1202
Epoch [35/51], Iter [120/391] Loss: 0.1121
Epoch [35/51], Iter [130/391] Loss: 0.0826
Epoch [35/51], Iter [140/391] Loss: 0.0973
Epoch [35/51], Iter [150/391] Loss: 0.0994
Epoch [35/51], Iter [160/391] Loss: 0.1225
Epoch [35/51], Iter [170/391] Loss: 0.0976
Epoch [35/51], Iter [180/391] Loss: 0.0850
Epoch [35/51], Iter [190/391] Loss: 0.1354
Epoch [35/51], Iter [200/391] Loss: 0.1123
Epoch [35/51], Iter [210/391] Loss: 0.0912
Epoch [35/51], Iter [220/391] Loss: 0.1076
Epoch [35/51], Iter [230/391] Loss: 0.1168
Epoch [35/51], Iter [240/391] Loss: 0.0983
Epoch [35/51], Iter [250/391] Loss: 0.1297
Epoch [35/51], Iter [260/391] Loss: 0.0885
Epoch [35/51], Iter [270/391] Loss: 0.1162
Epoch [35/51], Iter [280/391] Loss: 0.1068
Epoch [35/51], Iter [290/391] Loss: 0.1228
Epoch [35/51], Iter [300/391] Loss: 0.1092
Epoch [35/51], Iter [310/391] Loss: 0.1316
Epoch [35/51], Iter [320/391] Loss: 0.1046
Epoch [35/51], Iter [330/391] Loss: 0.1401
Epoch [35/51], Iter [340/391] Loss: 0.1180
Epoch [35/51], Iter [350/391] Loss: 0.1511
Epoch [35/51], Iter [360/391] Loss: 0.1404
Epoch [35/51], Iter [370/391] Loss: 0.1039
Epoch [35/51], Iter [380/391] Loss: 0.1509
Epoch [35/51], Iter [390/391] Loss: 0.1417
Epoch [36/51], Iter [10/391] Loss: 0.1174
Epoch [36/51], Iter [20/391] Loss: 0.1188
Epoch [36/51], Iter [30/391] Loss: 0.1098
Epoch [36/51], Iter [40/391] Loss: 0.0802
Epoch [36/51], Iter [50/391] Loss: 0.1145
Epoch [36/51], Iter [60/391] Loss: 0.1020
Epoch [36/51], Iter [70/391] Loss: 0.1089
Epoch [36/51], Iter [80/391] Loss: 0.1384
Epoch [36/51], Iter [90/391] Loss: 0.0912
Epoch [36/51], Iter [100/391] Loss: 0.0997
Epoch [36/51], Iter [110/391] Loss: 0.1254
Epoch [36/51], Iter [120/391] Loss: 0.1169
Epoch [36/51], Iter [130/391] Loss: 0.0937
Epoch [36/51], Iter [140/391] Loss: 0.0854
Epoch [36/51], Iter [150/391] Loss: 0.1448
Epoch [36/51], Iter [160/391] Loss: 0.0984
Epoch [36/51], Iter [170/391] Loss: 0.1445
Epoch [36/51], Iter [180/391] Loss: 0.1358
Epoch [36/51], Iter [190/391] Loss: 0.1012
Epoch [36/51], Iter [200/391] Loss: 0.1163
Epoch [36/51], Iter [210/391] Loss: 0.1028
Epoch [36/51], Iter [220/391] Loss: 0.1038
Epoch [36/51], Iter [230/391] Loss: 0.1142
Epoch [36/51], Iter [240/391] Loss: 0.1162
Epoch [36/51], Iter [250/391] Loss: 0.1140
Epoch [36/51], Iter [260/391] Loss: 0.1007
Epoch [36/51], Iter [270/391] Loss: 0.1234
Epoch [36/51], Iter [280/391] Loss: 0.0902
Epoch [36/51], Iter [290/391] Loss: 0.1027
Epoch [36/51], Iter [300/391] Loss: 0.1335
Epoch [36/51], Iter [310/391] Loss: 0.1036
Epoch [36/51], Iter [320/391] Loss: 0.0803
Epoch [36/51], Iter [330/391] Loss: 0.1663
Epoch [36/51], Iter [340/391] Loss: 0.1273
Epoch [36/51], Iter [350/391] Loss: 0.0835
Epoch [36/51], Iter [360/391] Loss: 0.0962
Epoch [36/51], Iter [370/391] Loss: 0.0837
Epoch [36/51], Iter [380/391] Loss: 0.0625
Epoch [36/51], Iter [390/391] Loss: 0.1322
Epoch [37/51], Iter [10/391] Loss: 0.0999
Epoch [37/51], Iter [20/391] Loss: 0.1283
Epoch [37/51], Iter [30/391] Loss: 0.0903
Epoch [37/51], Iter [40/391] Loss: 0.0890
Epoch [37/51], Iter [50/391] Loss: 0.0798
Epoch [37/51], Iter [60/391] Loss: 0.0749
Epoch [37/51], Iter [70/391] Loss: 0.1266
Epoch [37/51], Iter [80/391] Loss: 0.1140
Epoch [37/51], Iter [90/391] Loss: 0.0626
Epoch [37/51], Iter [100/391] Loss: 0.0992
Epoch [37/51], Iter [110/391] Loss: 0.1035
Epoch [37/51], Iter [120/391] Loss: 0.1152
Epoch [37/51], Iter [130/391] Loss: 0.0749
Epoch [37/51], Iter [140/391] Loss: 0.0870
Epoch [37/51], Iter [150/391] Loss: 0.0875
Epoch [37/51], Iter [160/391] Loss: 0.1117
Epoch [37/51], Iter [170/391] Loss: 0.1208
Epoch [37/51], Iter [180/391] Loss: 0.0897
Epoch [37/51], Iter [190/391] Loss: 0.1045
Epoch [37/51], Iter [200/391] Loss: 0.1153
Epoch [37/51], Iter [210/391] Loss: 0.0882
Epoch [37/51], Iter [220/391] Loss: 0.0816
Epoch [37/51], Iter [230/391] Loss: 0.0992
Epoch [37/51], Iter [240/391] Loss: 0.1028
Epoch [37/51], Iter [250/391] Loss: 0.1076
Epoch [37/51], Iter [260/391] Loss: 0.0802
Epoch [37/51], Iter [270/391] Loss: 0.1019
Epoch [37/51], Iter [280/391] Loss: 0.0887
Epoch [37/51], Iter [290/391] Loss: 0.1384
Epoch [37/51], Iter [300/391] Loss: 0.1303
Epoch [37/51], Iter [310/391] Loss: 0.1321
Epoch [37/51], Iter [320/391] Loss: 0.1134
Epoch [37/51], Iter [330/391] Loss: 0.1001
Epoch [37/51], Iter [340/391] Loss: 0.0987
Epoch [37/51], Iter [350/391] Loss: 0.0922
Epoch [37/51], Iter [360/391] Loss: 0.1091
Epoch [37/51], Iter [370/391] Loss: 0.1018
Epoch [37/51], Iter [380/391] Loss: 0.0770
Epoch [37/51], Iter [390/391] Loss: 0.0886
Epoch [38/51], Iter [10/391] Loss: 0.1086
Epoch [38/51], Iter [20/391] Loss: 0.0960
Epoch [38/51], Iter [30/391] Loss: 0.1369
Epoch [38/51], Iter [40/391] Loss: 0.0655
Epoch [38/51], Iter [50/391] Loss: 0.0866
Epoch [38/51], Iter [60/391] Loss: 0.1018
Epoch [38/51], Iter [70/391] Loss: 0.0989
Epoch [38/51], Iter [80/391] Loss: 0.0762
Epoch [38/51], Iter [90/391] Loss: 0.0945
Epoch [38/51], Iter [100/391] Loss: 0.1028
Epoch [38/51], Iter [110/391] Loss: 0.0891
Epoch [38/51], Iter [120/391] Loss: 0.1184
Epoch [38/51], Iter [130/391] Loss: 0.0920
Epoch [38/51], Iter [140/391] Loss: 0.1140
Epoch [38/51], Iter [150/391] Loss: 0.1149
Epoch [38/51], Iter [160/391] Loss: 0.1078
Epoch [38/51], Iter [170/391] Loss: 0.0787
Epoch [38/51], Iter [180/391] Loss: 0.1063
Epoch [38/51], Iter [190/391] Loss: 0.1023
Epoch [38/51], Iter [200/391] Loss: 0.0617
Epoch [38/51], Iter [210/391] Loss: 0.0995
Epoch [38/51], Iter [220/391] Loss: 0.0910
Epoch [38/51], Iter [230/391] Loss: 0.1323
Epoch [38/51], Iter [240/391] Loss: 0.1201
Epoch [38/51], Iter [250/391] Loss: 0.1036
Epoch [38/51], Iter [260/391] Loss: 0.0761
Epoch [38/51], Iter [270/391] Loss: 0.1155
Epoch [38/51], Iter [280/391] Loss: 0.0964
Epoch [38/51], Iter [290/391] Loss: 0.0877
Epoch [38/51], Iter [300/391] Loss: 0.1084
Epoch [38/51], Iter [310/391] Loss: 0.1019
Epoch [38/51], Iter [320/391] Loss: 0.0968
Epoch [38/51], Iter [330/391] Loss: 0.1226
Epoch [38/51], Iter [340/391] Loss: 0.0930
Epoch [38/51], Iter [350/391] Loss: 0.0815
Epoch [38/51], Iter [360/391] Loss: 0.0900
Epoch [38/51], Iter [370/391] Loss: 0.1023
Epoch [38/51], Iter [380/391] Loss: 0.1044
Epoch [38/51], Iter [390/391] Loss: 0.0974
Epoch [39/51], Iter [10/391] Loss: 0.0831
Epoch [39/51], Iter [20/391] Loss: 0.0771
Epoch [39/51], Iter [30/391] Loss: 0.0710
Epoch [39/51], Iter [40/391] Loss: 0.0678
Epoch [39/51], Iter [50/391] Loss: 0.0659
Epoch [39/51], Iter [60/391] Loss: 0.0939
Epoch [39/51], Iter [70/391] Loss: 0.1194
Epoch [39/51], Iter [80/391] Loss: 0.1112
Epoch [39/51], Iter [90/391] Loss: 0.1131
Epoch [39/51], Iter [100/391] Loss: 0.0929
Epoch [39/51], Iter [110/391] Loss: 0.1431
Epoch [39/51], Iter [120/391] Loss: 0.0970
Epoch [39/51], Iter [130/391] Loss: 0.0665
Epoch [39/51], Iter [140/391] Loss: 0.0989
Epoch [39/51], Iter [150/391] Loss: 0.0706
Epoch [39/51], Iter [160/391] Loss: 0.0894
Epoch [39/51], Iter [170/391] Loss: 0.0673
Epoch [39/51], Iter [180/391] Loss: 0.1089
Epoch [39/51], Iter [190/391] Loss: 0.0834
Epoch [39/51], Iter [200/391] Loss: 0.0751
Epoch [39/51], Iter [210/391] Loss: 0.0807
Epoch [39/51], Iter [220/391] Loss: 0.0999
Epoch [39/51], Iter [230/391] Loss: 0.0677
Epoch [39/51], Iter [240/391] Loss: 0.0600
Epoch [39/51], Iter [250/391] Loss: 0.0932
Epoch [39/51], Iter [260/391] Loss: 0.1192
Epoch [39/51], Iter [270/391] Loss: 0.0896
Epoch [39/51], Iter [280/391] Loss: 0.0654
Epoch [39/51], Iter [290/391] Loss: 0.0935
Epoch [39/51], Iter [300/391] Loss: 0.0975
Epoch [39/51], Iter [310/391] Loss: 0.1107
Epoch [39/51], Iter [320/391] Loss: 0.0932
Epoch [39/51], Iter [330/391] Loss: 0.1153
Epoch [39/51], Iter [340/391] Loss: 0.0887
Epoch [39/51], Iter [350/391] Loss: 0.0725
Epoch [39/51], Iter [360/391] Loss: 0.1367
Epoch [39/51], Iter [370/391] Loss: 0.0982
Epoch [39/51], Iter [380/391] Loss: 0.1323
Epoch [39/51], Iter [390/391] Loss: 0.0914
Epoch [40/51], Iter [10/391] Loss: 0.0781
Epoch [40/51], Iter [20/391] Loss: 0.0856
Epoch [40/51], Iter [30/391] Loss: 0.0871
Epoch [40/51], Iter [40/391] Loss: 0.0708
Epoch [40/51], Iter [50/391] Loss: 0.0908
Epoch [40/51], Iter [60/391] Loss: 0.0578
Epoch [40/51], Iter [70/391] Loss: 0.0853
Epoch [40/51], Iter [80/391] Loss: 0.0746
Epoch [40/51], Iter [90/391] Loss: 0.0888
Epoch [40/51], Iter [100/391] Loss: 0.0930
Epoch [40/51], Iter [110/391] Loss: 0.1332
Epoch [40/51], Iter [120/391] Loss: 0.0697
Epoch [40/51], Iter [130/391] Loss: 0.0657
Epoch [40/51], Iter [140/391] Loss: 0.0761
Epoch [40/51], Iter [150/391] Loss: 0.1170
Epoch [40/51], Iter [160/391] Loss: 0.1164
Epoch [40/51], Iter [170/391] Loss: 0.1119
Epoch [40/51], Iter [180/391] Loss: 0.1009
Epoch [40/51], Iter [190/391] Loss: 0.1151
Epoch [40/51], Iter [200/391] Loss: 0.0843
Epoch [40/51], Iter [210/391] Loss: 0.0726
Epoch [40/51], Iter [220/391] Loss: 0.1025
Epoch [40/51], Iter [230/391] Loss: 0.0978
Epoch [40/51], Iter [240/391] Loss: 0.0930
Epoch [40/51], Iter [250/391] Loss: 0.0980
Epoch [40/51], Iter [260/391] Loss: 0.1010
Epoch [40/51], Iter [270/391] Loss: 0.0983
Epoch [40/51], Iter [280/391] Loss: 0.0848
Epoch [40/51], Iter [290/391] Loss: 0.0975
Epoch [40/51], Iter [300/391] Loss: 0.0923
Epoch [40/51], Iter [310/391] Loss: 0.0863
Epoch [40/51], Iter [320/391] Loss: 0.1211
Epoch [40/51], Iter [330/391] Loss: 0.0900
Epoch [40/51], Iter [340/391] Loss: 0.1041
Epoch [40/51], Iter [350/391] Loss: 0.1209
Epoch [40/51], Iter [360/391] Loss: 0.0874
Epoch [40/51], Iter [370/391] Loss: 0.0978
Epoch [40/51], Iter [380/391] Loss: 0.0891
Epoch [40/51], Iter [390/391] Loss: 0.0641
[Saving Checkpoint]
Epoch [41/51], Iter [10/391] Loss: 0.0821
Epoch [41/51], Iter [20/391] Loss: 0.0642
Epoch [41/51], Iter [30/391] Loss: 0.0958
Epoch [41/51], Iter [40/391] Loss: 0.0689
Epoch [41/51], Iter [50/391] Loss: 0.0676
Epoch [41/51], Iter [60/391] Loss: 0.0687
Epoch [41/51], Iter [70/391] Loss: 0.0638
Epoch [41/51], Iter [80/391] Loss: 0.1094
Epoch [41/51], Iter [90/391] Loss: 0.0890
Epoch [41/51], Iter [100/391] Loss: 0.0975
Epoch [41/51], Iter [110/391] Loss: 0.1131
Epoch [41/51], Iter [120/391] Loss: 0.0677
Epoch [41/51], Iter [130/391] Loss: 0.0794
Epoch [41/51], Iter [140/391] Loss: 0.0903
Epoch [41/51], Iter [150/391] Loss: 0.0792
Epoch [41/51], Iter [160/391] Loss: 0.1037
Epoch [41/51], Iter [170/391] Loss: 0.0813
Epoch [41/51], Iter [180/391] Loss: 0.0836
Epoch [41/51], Iter [190/391] Loss: 0.0930
Epoch [41/51], Iter [200/391] Loss: 0.1148
Epoch [41/51], Iter [210/391] Loss: 0.1027
Epoch [41/51], Iter [220/391] Loss: 0.1145
Epoch [41/51], Iter [230/391] Loss: 0.0978
Epoch [41/51], Iter [240/391] Loss: 0.0683
Epoch [41/51], Iter [250/391] Loss: 0.0698
Epoch [41/51], Iter [260/391] Loss: 0.0901
Epoch [41/51], Iter [270/391] Loss: 0.0759
Epoch [41/51], Iter [280/391] Loss: 0.0850
Epoch [41/51], Iter [290/391] Loss: 0.0841
Epoch [41/51], Iter [300/391] Loss: 0.0847
Epoch [41/51], Iter [310/391] Loss: 0.1150
Epoch [41/51], Iter [320/391] Loss: 0.0939
Epoch [41/51], Iter [330/391] Loss: 0.0988
Epoch [41/51], Iter [340/391] Loss: 0.1053
Epoch [41/51], Iter [350/391] Loss: 0.1038
Epoch [41/51], Iter [360/391] Loss: 0.0782
Epoch [41/51], Iter [370/391] Loss: 0.1027
Epoch [41/51], Iter [380/391] Loss: 0.0762
Epoch [41/51], Iter [390/391] Loss: 0.1062
Epoch [42/51], Iter [10/391] Loss: 0.0880
Epoch [42/51], Iter [20/391] Loss: 0.0766
Epoch [42/51], Iter [30/391] Loss: 0.0974
Epoch [42/51], Iter [40/391] Loss: 0.0845
Epoch [42/51], Iter [50/391] Loss: 0.0737
Epoch [42/51], Iter [60/391] Loss: 0.0799
Epoch [42/51], Iter [70/391] Loss: 0.0660
Epoch [42/51], Iter [80/391] Loss: 0.0712
Epoch [42/51], Iter [90/391] Loss: 0.1257
Epoch [42/51], Iter [100/391] Loss: 0.0664
Epoch [42/51], Iter [110/391] Loss: 0.0887
Epoch [42/51], Iter [120/391] Loss: 0.1064
Epoch [42/51], Iter [130/391] Loss: 0.1134
Epoch [42/51], Iter [140/391] Loss: 0.0807
Epoch [42/51], Iter [150/391] Loss: 0.0916
Epoch [42/51], Iter [160/391] Loss: 0.0804
Epoch [42/51], Iter [170/391] Loss: 0.1139
Epoch [42/51], Iter [180/391] Loss: 0.0840
Epoch [42/51], Iter [190/391] Loss: 0.0702
Epoch [42/51], Iter [200/391] Loss: 0.0871
Epoch [42/51], Iter [210/391] Loss: 0.0723
Epoch [42/51], Iter [220/391] Loss: 0.1185
Epoch [42/51], Iter [230/391] Loss: 0.0960
Epoch [42/51], Iter [240/391] Loss: 0.0947
Epoch [42/51], Iter [250/391] Loss: 0.0913
Epoch [42/51], Iter [260/391] Loss: 0.0626
Epoch [42/51], Iter [270/391] Loss: 0.0799
Epoch [42/51], Iter [280/391] Loss: 0.1000
Epoch [42/51], Iter [290/391] Loss: 0.1141
Epoch [42/51], Iter [300/391] Loss: 0.0966
Epoch [42/51], Iter [310/391] Loss: 0.0763
Epoch [42/51], Iter [320/391] Loss: 0.0876
Epoch [42/51], Iter [330/391] Loss: 0.0905
Epoch [42/51], Iter [340/391] Loss: 0.1095
Epoch [42/51], Iter [350/391] Loss: 0.0837
Epoch [42/51], Iter [360/391] Loss: 0.0877
Epoch [42/51], Iter [370/391] Loss: 0.1167
Epoch [42/51], Iter [380/391] Loss: 0.1087
Epoch [42/51], Iter [390/391] Loss: 0.0995
Epoch [43/51], Iter [10/391] Loss: 0.0718
Epoch [43/51], Iter [20/391] Loss: 0.0677
Epoch [43/51], Iter [30/391] Loss: 0.0592
Epoch [43/51], Iter [40/391] Loss: 0.0931
Epoch [43/51], Iter [50/391] Loss: 0.1032
Epoch [43/51], Iter [60/391] Loss: 0.0857
Epoch [43/51], Iter [70/391] Loss: 0.0972
Epoch [43/51], Iter [80/391] Loss: 0.0954
Epoch [43/51], Iter [90/391] Loss: 0.0729
Epoch [43/51], Iter [100/391] Loss: 0.0776
Epoch [43/51], Iter [110/391] Loss: 0.0924
Epoch [43/51], Iter [120/391] Loss: 0.0895
Epoch [43/51], Iter [130/391] Loss: 0.0676
Epoch [43/51], Iter [140/391] Loss: 0.0838
Epoch [43/51], Iter [150/391] Loss: 0.0617
Epoch [43/51], Iter [160/391] Loss: 0.0957
Epoch [43/51], Iter [170/391] Loss: 0.0834
Epoch [43/51], Iter [180/391] Loss: 0.0839
Epoch [43/51], Iter [190/391] Loss: 0.1268
Epoch [43/51], Iter [200/391] Loss: 0.0878
Epoch [43/51], Iter [210/391] Loss: 0.0997
Epoch [43/51], Iter [220/391] Loss: 0.1185
Epoch [43/51], Iter [230/391] Loss: 0.0835
Epoch [43/51], Iter [240/391] Loss: 0.0917
Epoch [43/51], Iter [250/391] Loss: 0.1060
Epoch [43/51], Iter [260/391] Loss: 0.1189
Epoch [43/51], Iter [270/391] Loss: 0.1139
Epoch [43/51], Iter [280/391] Loss: 0.0659
Epoch [43/51], Iter [290/391] Loss: 0.1100
Epoch [43/51], Iter [300/391] Loss: 0.0962
Epoch [43/51], Iter [310/391] Loss: 0.0814
Epoch [43/51], Iter [320/391] Loss: 0.0777
Epoch [43/51], Iter [330/391] Loss: 0.0965
Epoch [43/51], Iter [340/391] Loss: 0.0897
Epoch [43/51], Iter [350/391] Loss: 0.0774
Epoch [43/51], Iter [360/391] Loss: 0.0758
Epoch [43/51], Iter [370/391] Loss: 0.1336
Epoch [43/51], Iter [380/391] Loss: 0.0787
Epoch [43/51], Iter [390/391] Loss: 0.1017
Epoch [44/51], Iter [10/391] Loss: 0.0752
Epoch [44/51], Iter [20/391] Loss: 0.0730
Epoch [44/51], Iter [30/391] Loss: 0.1005
Epoch [44/51], Iter [40/391] Loss: 0.0529
Epoch [44/51], Iter [50/391] Loss: 0.0658
Epoch [44/51], Iter [60/391] Loss: 0.0906
Epoch [44/51], Iter [70/391] Loss: 0.0771
Epoch [44/51], Iter [80/391] Loss: 0.0884
Epoch [44/51], Iter [90/391] Loss: 0.0720
Epoch [44/51], Iter [100/391] Loss: 0.1040
Epoch [44/51], Iter [110/391] Loss: 0.0863
Epoch [44/51], Iter [120/391] Loss: 0.0919
Epoch [44/51], Iter [130/391] Loss: 0.0569
Epoch [44/51], Iter [140/391] Loss: 0.0777
Epoch [44/51], Iter [150/391] Loss: 0.1053
Epoch [44/51], Iter [160/391] Loss: 0.0775
Epoch [44/51], Iter [170/391] Loss: 0.0731
Epoch [44/51], Iter [180/391] Loss: 0.0585
Epoch [44/51], Iter [190/391] Loss: 0.0713
Epoch [44/51], Iter [200/391] Loss: 0.0828
Epoch [44/51], Iter [210/391] Loss: 0.1038
Epoch [44/51], Iter [220/391] Loss: 0.0942
Epoch [44/51], Iter [230/391] Loss: 0.1366
Epoch [44/51], Iter [240/391] Loss: 0.0659
Epoch [44/51], Iter [250/391] Loss: 0.0608
Epoch [44/51], Iter [260/391] Loss: 0.0829
Epoch [44/51], Iter [270/391] Loss: 0.0931
Epoch [44/51], Iter [280/391] Loss: 0.0960
Epoch [44/51], Iter [290/391] Loss: 0.0892
Epoch [44/51], Iter [300/391] Loss: 0.0600
Epoch [44/51], Iter [310/391] Loss: 0.0905
Epoch [44/51], Iter [320/391] Loss: 0.0688
Epoch [44/51], Iter [330/391] Loss: 0.0771
Epoch [44/51], Iter [340/391] Loss: 0.1243
Epoch [44/51], Iter [350/391] Loss: 0.0848
Epoch [44/51], Iter [360/391] Loss: 0.0862
Epoch [44/51], Iter [370/391] Loss: 0.0971
Epoch [44/51], Iter [380/391] Loss: 0.0832
Epoch [44/51], Iter [390/391] Loss: 0.0755
Epoch [45/51], Iter [10/391] Loss: 0.0620
Epoch [45/51], Iter [20/391] Loss: 0.0922
Epoch [45/51], Iter [30/391] Loss: 0.0824
Epoch [45/51], Iter [40/391] Loss: 0.0720
Epoch [45/51], Iter [50/391] Loss: 0.0928
Epoch [45/51], Iter [60/391] Loss: 0.0821
Epoch [45/51], Iter [70/391] Loss: 0.0600
Epoch [45/51], Iter [80/391] Loss: 0.0623
Epoch [45/51], Iter [90/391] Loss: 0.0909
Epoch [45/51], Iter [100/391] Loss: 0.1001
Epoch [45/51], Iter [110/391] Loss: 0.0959
Epoch [45/51], Iter [120/391] Loss: 0.0632
Epoch [45/51], Iter [130/391] Loss: 0.0751
Epoch [45/51], Iter [140/391] Loss: 0.0962
Epoch [45/51], Iter [150/391] Loss: 0.0778
Epoch [45/51], Iter [160/391] Loss: 0.1104
Epoch [45/51], Iter [170/391] Loss: 0.0875
Epoch [45/51], Iter [180/391] Loss: 0.0669
Epoch [45/51], Iter [190/391] Loss: 0.0685
Epoch [45/51], Iter [200/391] Loss: 0.0841
Epoch [45/51], Iter [210/391] Loss: 0.0975
Epoch [45/51], Iter [220/391] Loss: 0.0946
Epoch [45/51], Iter [230/391] Loss: 0.0774
Epoch [45/51], Iter [240/391] Loss: 0.1085
Epoch [45/51], Iter [250/391] Loss: 0.1069
Epoch [45/51], Iter [260/391] Loss: 0.1332
Epoch [45/51], Iter [270/391] Loss: 0.0776
Epoch [45/51], Iter [280/391] Loss: 0.0871
Epoch [45/51], Iter [290/391] Loss: 0.0657
Epoch [45/51], Iter [300/391] Loss: 0.0673
Epoch [45/51], Iter [310/391] Loss: 0.0895
Epoch [45/51], Iter [320/391] Loss: 0.0843
Epoch [45/51], Iter [330/391] Loss: 0.0681
Epoch [45/51], Iter [340/391] Loss: 0.0881
Epoch [45/51], Iter [350/391] Loss: 0.0834
Epoch [45/51], Iter [360/391] Loss: 0.0790
Epoch [45/51], Iter [370/391] Loss: 0.1095
Epoch [45/51], Iter [380/391] Loss: 0.0640
Epoch [45/51], Iter [390/391] Loss: 0.1253
Epoch [46/51], Iter [10/391] Loss: 0.0951
Epoch [46/51], Iter [20/391] Loss: 0.1054
Epoch [46/51], Iter [30/391] Loss: 0.0818
Epoch [46/51], Iter [40/391] Loss: 0.0872
Epoch [46/51], Iter [50/391] Loss: 0.0937
Epoch [46/51], Iter [60/391] Loss: 0.0645
Epoch [46/51], Iter [70/391] Loss: 0.0654
Epoch [46/51], Iter [80/391] Loss: 0.0749
Epoch [46/51], Iter [90/391] Loss: 0.0704
Epoch [46/51], Iter [100/391] Loss: 0.0745
Epoch [46/51], Iter [110/391] Loss: 0.0856
Epoch [46/51], Iter [120/391] Loss: 0.0704
Epoch [46/51], Iter [130/391] Loss: 0.0765
Epoch [46/51], Iter [140/391] Loss: 0.0703
Epoch [46/51], Iter [150/391] Loss: 0.0809
Epoch [46/51], Iter [160/391] Loss: 0.0995
Epoch [46/51], Iter [170/391] Loss: 0.0581
Epoch [46/51], Iter [180/391] Loss: 0.0838
Epoch [46/51], Iter [190/391] Loss: 0.0620
Epoch [46/51], Iter [200/391] Loss: 0.0833
Epoch [46/51], Iter [210/391] Loss: 0.0706
Epoch [46/51], Iter [220/391] Loss: 0.0815
Epoch [46/51], Iter [230/391] Loss: 0.0668
Epoch [46/51], Iter [240/391] Loss: 0.0415
Epoch [46/51], Iter [250/391] Loss: 0.0902
Epoch [46/51], Iter [260/391] Loss: 0.0830
Epoch [46/51], Iter [270/391] Loss: 0.0904
Epoch [46/51], Iter [280/391] Loss: 0.1052
Epoch [46/51], Iter [290/391] Loss: 0.0901
Epoch [46/51], Iter [300/391] Loss: 0.0957
Epoch [46/51], Iter [310/391] Loss: 0.0800
Epoch [46/51], Iter [320/391] Loss: 0.0900
Epoch [46/51], Iter [330/391] Loss: 0.1180
Epoch [46/51], Iter [340/391] Loss: 0.0801
Epoch [46/51], Iter [350/391] Loss: 0.0718
Epoch [46/51], Iter [360/391] Loss: 0.0693
Epoch [46/51], Iter [370/391] Loss: 0.0705
Epoch [46/51], Iter [380/391] Loss: 0.0836
Epoch [46/51], Iter [390/391] Loss: 0.1091
Epoch [47/51], Iter [10/391] Loss: 0.0994
Epoch [47/51], Iter [20/391] Loss: 0.0722
Epoch [47/51], Iter [30/391] Loss: 0.0730
Epoch [47/51], Iter [40/391] Loss: 0.0735
Epoch [47/51], Iter [50/391] Loss: 0.1254
Epoch [47/51], Iter [60/391] Loss: 0.0730
Epoch [47/51], Iter [70/391] Loss: 0.0861
Epoch [47/51], Iter [80/391] Loss: 0.0666
Epoch [47/51], Iter [90/391] Loss: 0.0595
Epoch [47/51], Iter [100/391] Loss: 0.0748
Epoch [47/51], Iter [110/391] Loss: 0.0685
Epoch [47/51], Iter [120/391] Loss: 0.0882
Epoch [47/51], Iter [130/391] Loss: 0.0885
Epoch [47/51], Iter [140/391] Loss: 0.0734
Epoch [47/51], Iter [150/391] Loss: 0.0844
Epoch [47/51], Iter [160/391] Loss: 0.0848
Epoch [47/51], Iter [170/391] Loss: 0.1000
Epoch [47/51], Iter [180/391] Loss: 0.0701
Epoch [47/51], Iter [190/391] Loss: 0.0651
Epoch [47/51], Iter [200/391] Loss: 0.0688
Epoch [47/51], Iter [210/391] Loss: 0.0816
Epoch [47/51], Iter [220/391] Loss: 0.0539
Epoch [47/51], Iter [230/391] Loss: 0.0917
Epoch [47/51], Iter [240/391] Loss: 0.0774
Epoch [47/51], Iter [250/391] Loss: 0.0641
Epoch [47/51], Iter [260/391] Loss: 0.1136
Epoch [47/51], Iter [270/391] Loss: 0.0764
Epoch [47/51], Iter [280/391] Loss: 0.0938
Epoch [47/51], Iter [290/391] Loss: 0.0732
Epoch [47/51], Iter [300/391] Loss: 0.0718
Epoch [47/51], Iter [310/391] Loss: 0.0789
Epoch [47/51], Iter [320/391] Loss: 0.0906
Epoch [47/51], Iter [330/391] Loss: 0.0887
Epoch [47/51], Iter [340/391] Loss: 0.0704
Epoch [47/51], Iter [350/391] Loss: 0.0608
Epoch [47/51], Iter [360/391] Loss: 0.0710
Epoch [47/51], Iter [370/391] Loss: 0.0722
Epoch [47/51], Iter [380/391] Loss: 0.0927
Epoch [47/51], Iter [390/391] Loss: 0.0596
Epoch [48/51], Iter [10/391] Loss: 0.0495
Epoch [48/51], Iter [20/391] Loss: 0.0654
Epoch [48/51], Iter [30/391] Loss: 0.0679
Epoch [48/51], Iter [40/391] Loss: 0.0678
Epoch [48/51], Iter [50/391] Loss: 0.0670
Epoch [48/51], Iter [60/391] Loss: 0.0589
Epoch [48/51], Iter [70/391] Loss: 0.0779
Epoch [48/51], Iter [80/391] Loss: 0.0830
Epoch [48/51], Iter [90/391] Loss: 0.0697
Epoch [48/51], Iter [100/391] Loss: 0.0817
Epoch [48/51], Iter [110/391] Loss: 0.0636
Epoch [48/51], Iter [120/391] Loss: 0.1021
Epoch [48/51], Iter [130/391] Loss: 0.0630
Epoch [48/51], Iter [140/391] Loss: 0.0693
Epoch [48/51], Iter [150/391] Loss: 0.0789
Epoch [48/51], Iter [160/391] Loss: 0.0528
Epoch [48/51], Iter [170/391] Loss: 0.0764
Epoch [48/51], Iter [180/391] Loss: 0.0822
Epoch [48/51], Iter [190/391] Loss: 0.1140
Epoch [48/51], Iter [200/391] Loss: 0.0874
Epoch [48/51], Iter [210/391] Loss: 0.0860
Epoch [48/51], Iter [220/391] Loss: 0.0675
Epoch [48/51], Iter [230/391] Loss: 0.0556
Epoch [48/51], Iter [240/391] Loss: 0.0699
Epoch [48/51], Iter [250/391] Loss: 0.0764
Epoch [48/51], Iter [260/391] Loss: 0.0731
Epoch [48/51], Iter [270/391] Loss: 0.0638
Epoch [48/51], Iter [280/391] Loss: 0.0461
Epoch [48/51], Iter [290/391] Loss: 0.0883
Epoch [48/51], Iter [300/391] Loss: 0.0831
Epoch [48/51], Iter [310/391] Loss: 0.0695
Epoch [48/51], Iter [320/391] Loss: 0.0529
Epoch [48/51], Iter [330/391] Loss: 0.0736
Epoch [48/51], Iter [340/391] Loss: 0.0796
Epoch [48/51], Iter [350/391] Loss: 0.0871
Epoch [48/51], Iter [360/391] Loss: 0.0807
Epoch [48/51], Iter [370/391] Loss: 0.0762
Epoch [48/51], Iter [380/391] Loss: 0.0798
Epoch [48/51], Iter [390/391] Loss: 0.0835
Epoch [49/51], Iter [10/391] Loss: 0.0969
Epoch [49/51], Iter [20/391] Loss: 0.0811
Epoch [49/51], Iter [30/391] Loss: 0.0880
Epoch [49/51], Iter [40/391] Loss: 0.0739
Epoch [49/51], Iter [50/391] Loss: 0.0986
Epoch [49/51], Iter [60/391] Loss: 0.0626
Epoch [49/51], Iter [70/391] Loss: 0.0675
Epoch [49/51], Iter [80/391] Loss: 0.0680
Epoch [49/51], Iter [90/391] Loss: 0.0678
Epoch [49/51], Iter [100/391] Loss: 0.0617
Epoch [49/51], Iter [110/391] Loss: 0.0682
Epoch [49/51], Iter [120/391] Loss: 0.0820
Epoch [49/51], Iter [130/391] Loss: 0.0591
Epoch [49/51], Iter [140/391] Loss: 0.0736
Epoch [49/51], Iter [150/391] Loss: 0.0652
Epoch [49/51], Iter [160/391] Loss: 0.0715
Epoch [49/51], Iter [170/391] Loss: 0.0687
Epoch [49/51], Iter [180/391] Loss: 0.0747
Epoch [49/51], Iter [190/391] Loss: 0.0717
Epoch [49/51], Iter [200/391] Loss: 0.0649
Epoch [49/51], Iter [210/391] Loss: 0.1071
Epoch [49/51], Iter [220/391] Loss: 0.0684
Epoch [49/51], Iter [230/391] Loss: 0.0707
Epoch [49/51], Iter [240/391] Loss: 0.0612
Epoch [49/51], Iter [250/391] Loss: 0.0949
Epoch [49/51], Iter [260/391] Loss: 0.0651
Epoch [49/51], Iter [270/391] Loss: 0.0666
Epoch [49/51], Iter [280/391] Loss: 0.0699
Epoch [49/51], Iter [290/391] Loss: 0.1054
Epoch [49/51], Iter [300/391] Loss: 0.0946
Epoch [49/51], Iter [310/391] Loss: 0.0995
Epoch [49/51], Iter [320/391] Loss: 0.0898
Epoch [49/51], Iter [330/391] Loss: 0.0836
Epoch [49/51], Iter [340/391] Loss: 0.0786
Epoch [49/51], Iter [350/391] Loss: 0.0838
Epoch [49/51], Iter [360/391] Loss: 0.0590
Epoch [49/51], Iter [370/391] Loss: 0.0884
Epoch [49/51], Iter [380/391] Loss: 0.0871
Epoch [49/51], Iter [390/391] Loss: 0.0821
Epoch [50/51], Iter [10/391] Loss: 0.0777
Epoch [50/51], Iter [20/391] Loss: 0.0629
Epoch [50/51], Iter [30/391] Loss: 0.0710
Epoch [50/51], Iter [40/391] Loss: 0.0637
Epoch [50/51], Iter [50/391] Loss: 0.1032
Epoch [50/51], Iter [60/391] Loss: 0.0807
Epoch [50/51], Iter [70/391] Loss: 0.0771
Epoch [50/51], Iter [80/391] Loss: 0.0739
Epoch [50/51], Iter [90/391] Loss: 0.0816
Epoch [50/51], Iter [100/391] Loss: 0.0709
Epoch [50/51], Iter [110/391] Loss: 0.0807
Epoch [50/51], Iter [120/391] Loss: 0.0712
Epoch [50/51], Iter [130/391] Loss: 0.0683
Epoch [50/51], Iter [140/391] Loss: 0.0592
Epoch [50/51], Iter [150/391] Loss: 0.0591
Epoch [50/51], Iter [160/391] Loss: 0.0801
Epoch [50/51], Iter [170/391] Loss: 0.0703
Epoch [50/51], Iter [180/391] Loss: 0.0846
Epoch [50/51], Iter [190/391] Loss: 0.0721
Epoch [50/51], Iter [200/391] Loss: 0.0738
Epoch [50/51], Iter [210/391] Loss: 0.0980
Epoch [50/51], Iter [220/391] Loss: 0.0712
Epoch [50/51], Iter [230/391] Loss: 0.0714
Epoch [50/51], Iter [240/391] Loss: 0.0817
Epoch [50/51], Iter [250/391] Loss: 0.0688
Epoch [50/51], Iter [260/391] Loss: 0.0457
Epoch [50/51], Iter [270/391] Loss: 0.0832
Epoch [50/51], Iter [280/391] Loss: 0.0722
Epoch [50/51], Iter [290/391] Loss: 0.0653
Epoch [50/51], Iter [300/391] Loss: 0.0625
Epoch [50/51], Iter [310/391] Loss: 0.0684
Epoch [50/51], Iter [320/391] Loss: 0.0694
Epoch [50/51], Iter [330/391] Loss: 0.0622
Epoch [50/51], Iter [340/391] Loss: 0.0738
Epoch [50/51], Iter [350/391] Loss: 0.0714
Epoch [50/51], Iter [360/391] Loss: 0.0872
Epoch [50/51], Iter [370/391] Loss: 0.0530
Epoch [50/51], Iter [380/391] Loss: 0.0878
Epoch [50/51], Iter [390/391] Loss: 0.0725
[Saving Checkpoint]
Epoch [51/51], Iter [10/391] Loss: 0.0700
Epoch [51/51], Iter [20/391] Loss: 0.0770
Epoch [51/51], Iter [30/391] Loss: 0.0662
Epoch [51/51], Iter [40/391] Loss: 0.0988
Epoch [51/51], Iter [50/391] Loss: 0.0493
Epoch [51/51], Iter [60/391] Loss: 0.0694
Epoch [51/51], Iter [70/391] Loss: 0.0904
Epoch [51/51], Iter [80/391] Loss: 0.0753
Epoch [51/51], Iter [90/391] Loss: 0.0767
Epoch [51/51], Iter [100/391] Loss: 0.0561
Epoch [51/51], Iter [110/391] Loss: 0.0628
Epoch [51/51], Iter [120/391] Loss: 0.0699
Epoch [51/51], Iter [130/391] Loss: 0.0761
Epoch [51/51], Iter [140/391] Loss: 0.0580
Epoch [51/51], Iter [150/391] Loss: 0.0586
Epoch [51/51], Iter [160/391] Loss: 0.0981
Epoch [51/51], Iter [170/391] Loss: 0.0696
Epoch [51/51], Iter [180/391] Loss: 0.0856
Epoch [51/51], Iter [190/391] Loss: 0.0777
Epoch [51/51], Iter [200/391] Loss: 0.0939
Epoch [51/51], Iter [210/391] Loss: 0.0864
Epoch [51/51], Iter [220/391] Loss: 0.0869
Epoch [51/51], Iter [230/391] Loss: 0.0715
Epoch [51/51], Iter [240/391] Loss: 0.0823
Epoch [51/51], Iter [250/391] Loss: 0.0584
Epoch [51/51], Iter [260/391] Loss: 0.0812
Epoch [51/51], Iter [270/391] Loss: 0.0911
Epoch [51/51], Iter [280/391] Loss: 0.0695
Epoch [51/51], Iter [290/391] Loss: 0.0850
Epoch [51/51], Iter [300/391] Loss: 0.1013
Epoch [51/51], Iter [310/391] Loss: 0.0917
Epoch [51/51], Iter [320/391] Loss: 0.0629
Epoch [51/51], Iter [330/391] Loss: 0.0872
Epoch [51/51], Iter [340/391] Loss: 0.0771
Epoch [51/51], Iter [350/391] Loss: 0.0831
Epoch [51/51], Iter [360/391] Loss: 0.0800
Epoch [51/51], Iter [370/391] Loss: 0.0697
Epoch [51/51], Iter [380/391] Loss: 0.0711
Epoch [51/51], Iter [390/391] Loss: 0.0674
# | a=0.5 | T=2 | epochs = 51 |
resnet_child_a0dot5_t2_e51 = copy.deepcopy(resnet_child)  # let's save for future reference
test_harness( testloader, resnet_child_a0dot5_t2_e51 )
Accuracy of the model on the test images: 89 %
(tensor(8905, device='cuda:0'), 10000)
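With alpha = 0.5 and T = 2 the distilled child model reaches 89.05 % test accuracy (8,905 of 10,000 images). `test_harness` is defined earlier in the notebook; purely as a reminder of what it computes, here is a minimal sketch of an accuracy-evaluation loop of this kind (an illustrative assumption, not the notebook's exact implementation):

import torch

def test_harness_sketch(testloader, model, device='cuda'):
    # Sketch only: measure top-1 classification accuracy on the held-out test set.
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in testloader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)                    # class logits
            _, predicted = torch.max(outputs, dim=1)   # most probable class per image
            total += labels.size(0)
            correct += (predicted == labels).sum()
    print('Accuracy of the model on the test images: %d %%' % (100 * correct.item() // total))
    return correct, total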
# | a=0.5 | T=5 | epochs = 51 |
resnet_child, optimizer_child = get_new_child_model()
epoch = 0
kd_loss_a0dot5_t5 = partial( knowledge_distillation_loss, alpha=0.5, T=5 )
training_harness( trainloader, optimizer_child, kd_loss_a0dot5_t5, resnet_parent, resnet_child, model_name='DeepResNet_a0dot5_t5_e51' )
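`knowledge_distillation_loss` itself is defined earlier in the notebook; as a reminder of what the `alpha` (weight on the soft targets) and `T` (softmax temperature) arguments control, here is a minimal sketch of a loss of this shape, written against current PyTorch and with an assumed argument order:

import torch.nn.functional as F

def knowledge_distillation_loss_sketch(student_logits, teacher_logits, labels, alpha=0.5, T=5.0):
    # Sketch only: blend a softened-teacher term with the ordinary hard-label term.
    # Soft targets: KL divergence between the temperature-scaled distributions;
    # the T*T factor keeps its gradient magnitude comparable across temperatures.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction='batchmean',
    ) * (T * T)
    # Hard targets: standard cross-entropy against the true labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

The `partial` call above then fixes alpha = 0.5 and T = 5, so `training_harness` only has to supply the logits and labels for each batch.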
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
Epoch [1/51], Iter [10/391] Loss: 1.8004
Epoch [1/51], Iter [20/391] Loss: 1.6407
Epoch [1/51], Iter [30/391] Loss: 1.5258
Epoch [1/51], Iter [40/391] Loss: 1.5474
Epoch [1/51], Iter [50/391] Loss: 1.5558
Epoch [1/51], Iter [60/391] Loss: 1.4236
Epoch [1/51], Iter [70/391] Loss: 1.4652
Epoch [1/51], Iter [80/391] Loss: 1.4942
Epoch [1/51], Iter [90/391] Loss: 1.4445
Epoch [1/51], Iter [100/391] Loss: 1.3261
Epoch [1/51], Iter [110/391] Loss: 1.4052
Epoch [1/51], Iter [120/391] Loss: 1.2929
Epoch [1/51], Iter [130/391] Loss: 1.3664
Epoch [1/51], Iter [140/391] Loss: 1.3333
Epoch [1/51], Iter [150/391] Loss: 1.2884
Epoch [1/51], Iter [160/391] Loss: 1.2442
Epoch [1/51], Iter [170/391] Loss: 1.2056
Epoch [1/51], Iter [180/391] Loss: 1.2325
Epoch [1/51], Iter [190/391] Loss: 1.1925
Epoch [1/51], Iter [200/391] Loss: 1.1303
Epoch [1/51], Iter [210/391] Loss: 1.1765
Epoch [1/51], Iter [220/391] Loss: 1.1579
Epoch [1/51], Iter [230/391] Loss: 1.0622
Epoch [1/51], Iter [240/391] Loss: 1.1931
Epoch [1/51], Iter [250/391] Loss: 1.1930
Epoch [1/51], Iter [260/391] Loss: 1.1333
Epoch [1/51], Iter [270/391] Loss: 1.1524
Epoch [1/51], Iter [280/391] Loss: 1.2232
Epoch [1/51], Iter [290/391] Loss: 1.0864
Epoch [1/51], Iter [300/391] Loss: 1.1302
Epoch [1/51], Iter [310/391] Loss: 1.0605
Epoch [1/51], Iter [320/391] Loss: 1.0510
Epoch [1/51], Iter [330/391] Loss: 1.0267
Epoch [1/51], Iter [340/391] Loss: 1.0921
Epoch [1/51], Iter [350/391] Loss: 1.0572
Epoch [1/51], Iter [360/391] Loss: 1.1050
Epoch [1/51], Iter [370/391] Loss: 1.0218
Epoch [1/51], Iter [380/391] Loss: 0.9805
Epoch [1/51], Iter [390/391] Loss: 1.0036
Epoch [2/51], Iter [10/391] Loss: 0.9304
Epoch [2/51], Iter [20/391] Loss: 1.0494
Epoch [2/51], Iter [30/391] Loss: 0.9020
Epoch [2/51], Iter [40/391] Loss: 1.0087
Epoch [2/51], Iter [50/391] Loss: 0.9322
Epoch [2/51], Iter [60/391] Loss: 0.9640
Epoch [2/51], Iter [70/391] Loss: 0.8782
Epoch [2/51], Iter [80/391] Loss: 0.9900
Epoch [2/51], Iter [90/391] Loss: 0.9124
Epoch [2/51], Iter [100/391] Loss: 0.9748
Epoch [2/51], Iter [110/391] Loss: 0.8836
Epoch [2/51], Iter [120/391] Loss: 0.9461
Epoch [2/51], Iter [130/391] Loss: 0.9287
Epoch [2/51], Iter [140/391] Loss: 1.0680
Epoch [2/51], Iter [150/391] Loss: 0.9083
Epoch [2/51], Iter [160/391] Loss: 0.9402
Epoch [2/51], Iter [170/391] Loss: 0.9733
Epoch [2/51], Iter [180/391] Loss: 0.9172
Epoch [2/51], Iter [190/391] Loss: 0.9105
Epoch [2/51], Iter [200/391] Loss: 0.9409
Epoch [2/51], Iter [210/391] Loss: 0.9081
Epoch [2/51], Iter [220/391] Loss: 0.9213
Epoch [2/51], Iter [230/391] Loss: 0.9608
Epoch [2/51], Iter [240/391] Loss: 0.9703
Epoch [2/51], Iter [250/391] Loss: 1.0025
Epoch [2/51], Iter [260/391] Loss: 0.9035
Epoch [2/51], Iter [270/391] Loss: 0.9551
Epoch [2/51], Iter [280/391] Loss: 0.8459
Epoch [2/51], Iter [290/391] Loss: 0.8504
Epoch [2/51], Iter [300/391] Loss: 0.8001
Epoch [2/51], Iter [310/391] Loss: 0.9529
Epoch [2/51], Iter [320/391] Loss: 0.8869
Epoch [2/51], Iter [330/391] Loss: 0.8521
Epoch [2/51], Iter [340/391] Loss: 0.9126
Epoch [2/51], Iter [350/391] Loss: 0.9181
Epoch [2/51], Iter [360/391] Loss: 0.9368
Epoch [2/51], Iter [370/391] Loss: 0.9599
Epoch [2/51], Iter [380/391] Loss: 0.9264
Epoch [2/51], Iter [390/391] Loss: 0.8423
Epoch [3/51], Iter [10/391] Loss: 0.8124
Epoch [3/51], Iter [20/391] Loss: 0.8400
Epoch [3/51], Iter [30/391] Loss: 0.8276
Epoch [3/51], Iter [40/391] Loss: 0.7876
Epoch [3/51], Iter [50/391] Loss: 0.8915
Epoch [3/51], Iter [60/391] Loss: 0.8406
Epoch [3/51], Iter [70/391] Loss: 0.7579
Epoch [3/51], Iter [80/391] Loss: 0.8519
Epoch [3/51], Iter [90/391] Loss: 0.8030
Epoch [3/51], Iter [100/391] Loss: 0.8006
Epoch [3/51], Iter [110/391] Loss: 0.7492
Epoch [3/51], Iter [120/391] Loss: 0.8071
Epoch [3/51], Iter [130/391] Loss: 0.8329
Epoch [3/51], Iter [140/391] Loss: 0.7483
Epoch [3/51], Iter [150/391] Loss: 0.7999
Epoch [3/51], Iter [160/391] Loss: 0.8035
Epoch [3/51], Iter [170/391] Loss: 0.8110
Epoch [3/51], Iter [180/391] Loss: 0.8088
Epoch [3/51], Iter [190/391] Loss: 0.8538
Epoch [3/51], Iter [200/391] Loss: 0.8506
Epoch [3/51], Iter [210/391] Loss: 0.7737
Epoch [3/51], Iter [220/391] Loss: 0.7496
Epoch [3/51], Iter [230/391] Loss: 0.7925
Epoch [3/51], Iter [240/391] Loss: 0.7569
Epoch [3/51], Iter [250/391] Loss: 0.7555
Epoch [3/51], Iter [260/391] Loss: 0.7911
Epoch [3/51], Iter [270/391] Loss: 0.7860
Epoch [3/51], Iter [280/391] Loss: 0.8145
Epoch [3/51], Iter [290/391] Loss: 0.7787
Epoch [3/51], Iter [300/391] Loss: 0.7687
Epoch [3/51], Iter [310/391] Loss: 0.7212
Epoch [3/51], Iter [320/391] Loss: 0.8365
Epoch [3/51], Iter [330/391] Loss: 0.7840
Epoch [3/51], Iter [340/391] Loss: 0.7940
Epoch [3/51], Iter [350/391] Loss: 0.7758
Epoch [3/51], Iter [360/391] Loss: 0.8076
Epoch [3/51], Iter [370/391] Loss: 0.7654
Epoch [3/51], Iter [380/391] Loss: 0.7331
Epoch [3/51], Iter [390/391] Loss: 0.8030
Epoch [4/51], Iter [10/391] Loss: 0.7613
Epoch [4/51], Iter [20/391] Loss: 0.8505
Epoch [4/51], Iter [30/391] Loss: 0.6940
Epoch [4/51], Iter [40/391] Loss: 0.7825
Epoch [4/51], Iter [50/391] Loss: 0.7362
Epoch [4/51], Iter [60/391] Loss: 0.7282
Epoch [4/51], Iter [70/391] Loss: 0.7520
Epoch [4/51], Iter [80/391] Loss: 0.7404
Epoch [4/51], Iter [90/391] Loss: 0.7314
Epoch [4/51], Iter [100/391] Loss: 0.7295
Epoch [4/51], Iter [110/391] Loss: 0.6939
Epoch [4/51], Iter [120/391] Loss: 0.7031
Epoch [4/51], Iter [130/391] Loss: 0.7070
Epoch [4/51], Iter [140/391] Loss: 0.6866
Epoch [4/51], Iter [150/391] Loss: 0.7948
Epoch [4/51], Iter [160/391] Loss: 0.7821
Epoch [4/51], Iter [170/391] Loss: 0.7160
Epoch [4/51], Iter [180/391] Loss: 0.7279
Epoch [4/51], Iter [190/391] Loss: 0.7360
Epoch [4/51], Iter [200/391] Loss: 0.7444
Epoch [4/51], Iter [210/391] Loss: 0.7522
Epoch [4/51], Iter [220/391] Loss: 0.7612
Epoch [4/51], Iter [230/391] Loss: 0.7401
Epoch [4/51], Iter [240/391] Loss: 0.7699
Epoch [4/51], Iter [250/391] Loss: 0.6994
Epoch [4/51], Iter [260/391] Loss: 0.7737
Epoch [4/51], Iter [270/391] Loss: 0.6919
Epoch [4/51], Iter [280/391] Loss: 0.6886
Epoch [4/51], Iter [290/391] Loss: 0.8158
Epoch [4/51], Iter [300/391] Loss: 0.6830
Epoch [4/51], Iter [310/391] Loss: 0.7069
Epoch [4/51], Iter [320/391] Loss: 0.7421
Epoch [4/51], Iter [330/391] Loss: 0.7483
Epoch [4/51], Iter [340/391] Loss: 0.6640
Epoch [4/51], Iter [350/391] Loss: 0.6586
Epoch [4/51], Iter [360/391] Loss: 0.6322
Epoch [4/51], Iter [370/391] Loss: 0.7230
Epoch [4/51], Iter [380/391] Loss: 0.7280
Epoch [4/51], Iter [390/391] Loss: 0.6882
Epoch [5/51], Iter [10/391] Loss: 0.6394
Epoch [5/51], Iter [20/391] Loss: 0.6611
Epoch [5/51], Iter [30/391] Loss: 0.5971
Epoch [5/51], Iter [40/391] Loss: 0.6176
Epoch [5/51], Iter [50/391] Loss: 0.6337
Epoch [5/51], Iter [60/391] Loss: 0.6889
Epoch [5/51], Iter [70/391] Loss: 0.6551
Epoch [5/51], Iter [80/391] Loss: 0.7052
Epoch [5/51], Iter [90/391] Loss: 0.6034
Epoch [5/51], Iter [100/391] Loss: 0.7508
Epoch [5/51], Iter [110/391] Loss: 0.7073
Epoch [5/51], Iter [120/391] Loss: 0.6643
Epoch [5/51], Iter [130/391] Loss: 0.6021
Epoch [5/51], Iter [140/391] Loss: 0.6717
Epoch [5/51], Iter [150/391] Loss: 0.6248
Epoch [5/51], Iter [160/391] Loss: 0.6743
Epoch [5/51], Iter [170/391] Loss: 0.7150
Epoch [5/51], Iter [180/391] Loss: 0.6622
Epoch [5/51], Iter [190/391] Loss: 0.7340
Epoch [5/51], Iter [200/391] Loss: 0.7021
Epoch [5/51], Iter [210/391] Loss: 0.6067
Epoch [5/51], Iter [220/391] Loss: 0.5999
Epoch [5/51], Iter [230/391] Loss: 0.6976
Epoch [5/51], Iter [240/391] Loss: 0.6515
Epoch [5/51], Iter [250/391] Loss: 0.7138
Epoch [5/51], Iter [260/391] Loss: 0.6212
Epoch [5/51], Iter [270/391] Loss: 0.6325
Epoch [5/51], Iter [280/391] Loss: 0.5804
Epoch [5/51], Iter [290/391] Loss: 0.6333
Epoch [5/51], Iter [300/391] Loss: 0.6167
Epoch [5/51], Iter [310/391] Loss: 0.6944
Epoch [5/51], Iter [320/391] Loss: 0.7113
Epoch [5/51], Iter [330/391] Loss: 0.5858
Epoch [5/51], Iter [340/391] Loss: 0.6447
Epoch [5/51], Iter [350/391] Loss: 0.6498
Epoch [5/51], Iter [360/391] Loss: 0.6384
Epoch [5/51], Iter [370/391] Loss: 0.5867
Epoch [5/51], Iter [380/391] Loss: 0.6385
Epoch [5/51], Iter [390/391] Loss: 0.6356
Epoch [6/51], Iter [10/391] Loss: 0.6759
Epoch [6/51], Iter [20/391] Loss: 0.6061
Epoch [6/51], Iter [30/391] Loss: 0.6297
Epoch [6/51], Iter [40/391] Loss: 0.5729
Epoch [6/51], Iter [50/391] Loss: 0.6079
Epoch [6/51], Iter [60/391] Loss: 0.5775
Epoch [6/51], Iter [70/391] Loss: 0.5846
Epoch [6/51], Iter [80/391] Loss: 0.6574
Epoch [6/51], Iter [90/391] Loss: 0.6727
Epoch [6/51], Iter [100/391] Loss: 0.6811
Epoch [6/51], Iter [110/391] Loss: 0.6734
Epoch [6/51], Iter [120/391] Loss: 0.5935
Epoch [6/51], Iter [130/391] Loss: 0.6479
Epoch [6/51], Iter [140/391] Loss: 0.5735
Epoch [6/51], Iter [150/391] Loss: 0.6394
Epoch [6/51], Iter [160/391] Loss: 0.5592
Epoch [6/51], Iter [170/391] Loss: 0.5870
Epoch [6/51], Iter [180/391] Loss: 0.5723
Epoch [6/51], Iter [190/391] Loss: 0.6468
Epoch [6/51], Iter [200/391] Loss: 0.5686
Epoch [6/51], Iter [210/391] Loss: 0.5798
Epoch [6/51], Iter [220/391] Loss: 0.6130
Epoch [6/51], Iter [230/391] Loss: 0.5903
Epoch [6/51], Iter [240/391] Loss: 0.7285
Epoch [6/51], Iter [250/391] Loss: 0.6211
Epoch [6/51], Iter [260/391] Loss: 0.6916
Epoch [6/51], Iter [270/391] Loss: 0.6733
Epoch [6/51], Iter [280/391] Loss: 0.6567
Epoch [6/51], Iter [290/391] Loss: 0.6007
Epoch [6/51], Iter [300/391] Loss: 0.5948
Epoch [6/51], Iter [310/391] Loss: 0.6168
Epoch [6/51], Iter [320/391] Loss: 0.5753
Epoch [6/51], Iter [330/391] Loss: 0.6362
Epoch [6/51], Iter [340/391] Loss: 0.5879
Epoch [6/51], Iter [350/391] Loss: 0.6512
Epoch [6/51], Iter [360/391] Loss: 0.6076
Epoch [6/51], Iter [370/391] Loss: 0.6471
Epoch [6/51], Iter [380/391] Loss: 0.6544
Epoch [6/51], Iter [390/391] Loss: 0.6393
Epoch [7/51], Iter [10/391] Loss: 0.6528
Epoch [7/51], Iter [20/391] Loss: 0.5855
Epoch [7/51], Iter [30/391] Loss: 0.5900
Epoch [7/51], Iter [40/391] Loss: 0.5847
Epoch [7/51], Iter [50/391] Loss: 0.5858
Epoch [7/51], Iter [60/391] Loss: 0.5687
Epoch [7/51], Iter [70/391] Loss: 0.6166
Epoch [7/51], Iter [80/391] Loss: 0.5327
Epoch [7/51], Iter [90/391] Loss: 0.6377
Epoch [7/51], Iter [100/391] Loss: 0.6221
Epoch [7/51], Iter [110/391] Loss: 0.6182
Epoch [7/51], Iter [120/391] Loss: 0.5637
Epoch [7/51], Iter [130/391] Loss: 0.5613
Epoch [7/51], Iter [140/391] Loss: 0.5727
Epoch [7/51], Iter [150/391] Loss: 0.5772
Epoch [7/51], Iter [160/391] Loss: 0.5809
Epoch [7/51], Iter [170/391] Loss: 0.6012
Epoch [7/51], Iter [180/391] Loss: 0.5522
Epoch [7/51], Iter [190/391] Loss: 0.6110
Epoch [7/51], Iter [200/391] Loss: 0.5256
Epoch [7/51], Iter [210/391] Loss: 0.6099
Epoch [7/51], Iter [220/391] Loss: 0.6405
Epoch [7/51], Iter [230/391] Loss: 0.6354
Epoch [7/51], Iter [240/391] Loss: 0.5775
Epoch [7/51], Iter [250/391] Loss: 0.5457
Epoch [7/51], Iter [260/391] Loss: 0.6122
Epoch [7/51], Iter [270/391] Loss: 0.5589
Epoch [7/51], Iter [280/391] Loss: 0.5659
Epoch [7/51], Iter [290/391] Loss: 0.5983
Epoch [7/51], Iter [300/391] Loss: 0.5814
Epoch [7/51], Iter [310/391] Loss: 0.5781
Epoch [7/51], Iter [320/391] Loss: 0.5717
Epoch [7/51], Iter [330/391] Loss: 0.5915
Epoch [7/51], Iter [340/391] Loss: 0.5656
Epoch [7/51], Iter [350/391] Loss: 0.5577
Epoch [7/51], Iter [360/391] Loss: 0.5813
Epoch [7/51], Iter [370/391] Loss: 0.5603
Epoch [7/51], Iter [380/391] Loss: 0.6234
Epoch [7/51], Iter [390/391] Loss: 0.5450
Epoch [8/51], Iter [10/391] Loss: 0.5897
Epoch [8/51], Iter [20/391] Loss: 0.5261
Epoch [8/51], Iter [30/391] Loss: 0.5404
Epoch [8/51], Iter [40/391] Loss: 0.5808
Epoch [8/51], Iter [50/391] Loss: 0.6252
Epoch [8/51], Iter [60/391] Loss: 0.5417
Epoch [8/51], Iter [70/391] Loss: 0.5999
Epoch [8/51], Iter [80/391] Loss: 0.5092
Epoch [8/51], Iter [90/391] Loss: 0.5566
Epoch [8/51], Iter [100/391] Loss: 0.5735
Epoch [8/51], Iter [110/391] Loss: 0.5263
Epoch [8/51], Iter [120/391] Loss: 0.5249
Epoch [8/51], Iter [130/391] Loss: 0.5426
Epoch [8/51], Iter [140/391] Loss: 0.5562
Epoch [8/51], Iter [150/391] Loss: 0.4836
Epoch [8/51], Iter [160/391] Loss: 0.5594
Epoch [8/51], Iter [170/391] Loss: 0.5612
Epoch [8/51], Iter [180/391] Loss: 0.5546
Epoch [8/51], Iter [190/391] Loss: 0.5583
Epoch [8/51], Iter [200/391] Loss: 0.5735
Epoch [8/51], Iter [210/391] Loss: 0.4999
Epoch [8/51], Iter [220/391] Loss: 0.6376
Epoch [8/51], Iter [230/391] Loss: 0.6111
Epoch [8/51], Iter [240/391] Loss: 0.5077
Epoch [8/51], Iter [250/391] Loss: 0.6337
Epoch [8/51], Iter [260/391] Loss: 0.5904
Epoch [8/51], Iter [270/391] Loss: 0.5892
Epoch [8/51], Iter [280/391] Loss: 0.5394
Epoch [8/51], Iter [290/391] Loss: 0.5693
Epoch [8/51], Iter [300/391] Loss: 0.5563
Epoch [8/51], Iter [310/391] Loss: 0.5243
Epoch [8/51], Iter [320/391] Loss: 0.5362
Epoch [8/51], Iter [330/391] Loss: 0.5806
Epoch [8/51], Iter [340/391] Loss: 0.5943
Epoch [8/51], Iter [350/391] Loss: 0.5713
Epoch [8/51], Iter [360/391] Loss: 0.5455
Epoch [8/51], Iter [370/391] Loss: 0.5085
Epoch [8/51], Iter [380/391] Loss: 0.5338
Epoch [8/51], Iter [390/391] Loss: 0.5534
Epoch [9/51], Iter [10/391] Loss: 0.5344
Epoch [9/51], Iter [20/391] Loss: 0.5801
Epoch [9/51], Iter [30/391] Loss: 0.5363
Epoch [9/51], Iter [40/391] Loss: 0.4994
Epoch [9/51], Iter [50/391] Loss: 0.6056
Epoch [9/51], Iter [60/391] Loss: 0.5141
Epoch [9/51], Iter [70/391] Loss: 0.5977
Epoch [9/51], Iter [80/391] Loss: 0.5797
Epoch [9/51], Iter [90/391] Loss: 0.5959
Epoch [9/51], Iter [100/391] Loss: 0.5530
Epoch [9/51], Iter [110/391] Loss: 0.5404
Epoch [9/51], Iter [120/391] Loss: 0.5543
Epoch [9/51], Iter [130/391] Loss: 0.5701
Epoch [9/51], Iter [140/391] Loss: 0.5477
Epoch [9/51], Iter [150/391] Loss: 0.5384
Epoch [9/51], Iter [160/391] Loss: 0.5590
Epoch [9/51], Iter [170/391] Loss: 0.5769
Epoch [9/51], Iter [180/391] Loss: 0.4945
Epoch [9/51], Iter [190/391] Loss: 0.5831
Epoch [9/51], Iter [200/391] Loss: 0.5778
Epoch [9/51], Iter [210/391] Loss: 0.5287
Epoch [9/51], Iter [220/391] Loss: 0.5248
Epoch [9/51], Iter [230/391] Loss: 0.5148
Epoch [9/51], Iter [240/391] Loss: 0.5287
Epoch [9/51], Iter [250/391] Loss: 0.5535
Epoch [9/51], Iter [260/391] Loss: 0.4960
Epoch [9/51], Iter [270/391] Loss: 0.4733
Epoch [9/51], Iter [280/391] Loss: 0.5061
Epoch [9/51], Iter [290/391] Loss: 0.5322
Epoch [9/51], Iter [300/391] Loss: 0.4811
Epoch [9/51], Iter [310/391] Loss: 0.5693
Epoch [9/51], Iter [320/391] Loss: 0.5880
Epoch [9/51], Iter [330/391] Loss: 0.5719
Epoch [9/51], Iter [340/391] Loss: 0.5827
Epoch [9/51], Iter [350/391] Loss: 0.5233
Epoch [9/51], Iter [360/391] Loss: 0.5386
Epoch [9/51], Iter [370/391] Loss: 0.4758
Epoch [9/51], Iter [380/391] Loss: 0.5035
Epoch [9/51], Iter [390/391] Loss: 0.5060
Epoch [10/51], Iter [10/391] Loss: 0.4535
Epoch [10/51], Iter [20/391] Loss: 0.4965
Epoch [10/51], Iter [30/391] Loss: 0.4970
Epoch [10/51], Iter [40/391] Loss: 0.5035
Epoch [10/51], Iter [50/391] Loss: 0.5395
Epoch [10/51], Iter [60/391] Loss: 0.5793
Epoch [10/51], Iter [70/391] Loss: 0.4590
Epoch [10/51], Iter [80/391] Loss: 0.5009
Epoch [10/51], Iter [90/391] Loss: 0.5132
Epoch [10/51], Iter [100/391] Loss: 0.5268
Epoch [10/51], Iter [110/391] Loss: 0.5539
Epoch [10/51], Iter [120/391] Loss: 0.5358
Epoch [10/51], Iter [130/391] Loss: 0.5149
Epoch [10/51], Iter [140/391] Loss: 0.5246
Epoch [10/51], Iter [150/391] Loss: 0.5153
Epoch [10/51], Iter [160/391] Loss: 0.5174
Epoch [10/51], Iter [170/391] Loss: 0.5515
Epoch [10/51], Iter [180/391] Loss: 0.5476
Epoch [10/51], Iter [190/391] Loss: 0.4733
Epoch [10/51], Iter [200/391] Loss: 0.5321
Epoch [10/51], Iter [210/391] Loss: 0.4687
Epoch [10/51], Iter [220/391] Loss: 0.5491
Epoch [10/51], Iter [230/391] Loss: 0.5412
Epoch [10/51], Iter [240/391] Loss: 0.5061
Epoch [10/51], Iter [250/391] Loss: 0.5231
Epoch [10/51], Iter [260/391] Loss: 0.4811
Epoch [10/51], Iter [270/391] Loss: 0.4943
Epoch [10/51], Iter [280/391] Loss: 0.5326
Epoch [10/51], Iter [290/391] Loss: 0.5094
Epoch [10/51], Iter [300/391] Loss: 0.4920
Epoch [10/51], Iter [310/391] Loss: 0.4854
Epoch [10/51], Iter [320/391] Loss: 0.5494
Epoch [10/51], Iter [330/391] Loss: 0.5459
Epoch [10/51], Iter [340/391] Loss: 0.4999
Epoch [10/51], Iter [350/391] Loss: 0.5132
Epoch [10/51], Iter [360/391] Loss: 0.5042
Epoch [10/51], Iter [370/391] Loss: 0.4977
Epoch [10/51], Iter [380/391] Loss: 0.5150
Epoch [10/51], Iter [390/391] Loss: 0.5433
[Saving Checkpoint]
Epoch [11/51], Iter [10/391] Loss: 0.4999
Epoch [11/51], Iter [20/391] Loss: 0.5286
Epoch [11/51], Iter [30/391] Loss: 0.5477
Epoch [11/51], Iter [40/391] Loss: 0.4959
Epoch [11/51], Iter [50/391] Loss: 0.4980
Epoch [11/51], Iter [60/391] Loss: 0.4974
Epoch [11/51], Iter [70/391] Loss: 0.4681
Epoch [11/51], Iter [80/391] Loss: 0.5031
Epoch [11/51], Iter [90/391] Loss: 0.5530
Epoch [11/51], Iter [100/391] Loss: 0.5036
Epoch [11/51], Iter [110/391] Loss: 0.5102
Epoch [11/51], Iter [120/391] Loss: 0.4986
Epoch [11/51], Iter [130/391] Loss: 0.4879
Epoch [11/51], Iter [140/391] Loss: 0.4760
Epoch [11/51], Iter [150/391] Loss: 0.5145
Epoch [11/51], Iter [160/391] Loss: 0.4798
Epoch [11/51], Iter [170/391] Loss: 0.4955
Epoch [11/51], Iter [180/391] Loss: 0.4782
Epoch [11/51], Iter [190/391] Loss: 0.5027
Epoch [11/51], Iter [200/391] Loss: 0.4947
Epoch [11/51], Iter [210/391] Loss: 0.5192
Epoch [11/51], Iter [220/391] Loss: 0.5333
Epoch [11/51], Iter [230/391] Loss: 0.5381
Epoch [11/51], Iter [240/391] Loss: 0.4878
Epoch [11/51], Iter [250/391] Loss: 0.4914
Epoch [11/51], Iter [260/391] Loss: 0.4579
Epoch [11/51], Iter [270/391] Loss: 0.5193
Epoch [11/51], Iter [280/391] Loss: 0.4635
Epoch [11/51], Iter [290/391] Loss: 0.5118
Epoch [11/51], Iter [300/391] Loss: 0.4568
Epoch [11/51], Iter [310/391] Loss: 0.5237
Epoch [11/51], Iter [320/391] Loss: 0.5059
Epoch [11/51], Iter [330/391] Loss: 0.4782
Epoch [11/51], Iter [340/391] Loss: 0.5348
Epoch [11/51], Iter [350/391] Loss: 0.5121
Epoch [11/51], Iter [360/391] Loss: 0.4672
Epoch [11/51], Iter [370/391] Loss: 0.5349
Epoch [11/51], Iter [380/391] Loss: 0.5345
Epoch [11/51], Iter [390/391] Loss: 0.4930
Epoch [12/51], Iter [10/391] Loss: 0.4967
Epoch [12/51], Iter [20/391] Loss: 0.4741
Epoch [12/51], Iter [30/391] Loss: 0.5321
Epoch [12/51], Iter [40/391] Loss: 0.5046
Epoch [12/51], Iter [50/391] Loss: 0.5130
Epoch [12/51], Iter [60/391] Loss: 0.5296
Epoch [12/51], Iter [70/391] Loss: 0.4755
Epoch [12/51], Iter [80/391] Loss: 0.4484
Epoch [12/51], Iter [90/391] Loss: 0.4958
Epoch [12/51], Iter [100/391] Loss: 0.4853
Epoch [12/51], Iter [110/391] Loss: 0.5128
Epoch [12/51], Iter [120/391] Loss: 0.4712
Epoch [12/51], Iter [130/391] Loss: 0.4558
Epoch [12/51], Iter [140/391] Loss: 0.4764
Epoch [12/51], Iter [150/391] Loss: 0.5123
Epoch [12/51], Iter [160/391] Loss: 0.4433
Epoch [12/51], Iter [170/391] Loss: 0.5295
Epoch [12/51], Iter [180/391] Loss: 0.4956
Epoch [12/51], Iter [190/391] Loss: 0.4578
Epoch [12/51], Iter [200/391] Loss: 0.4757
Epoch [12/51], Iter [210/391] Loss: 0.4975
Epoch [12/51], Iter [220/391] Loss: 0.5004
Epoch [12/51], Iter [230/391] Loss: 0.5514
Epoch [12/51], Iter [240/391] Loss: 0.5282
Epoch [12/51], Iter [250/391] Loss: 0.4917
Epoch [12/51], Iter [260/391] Loss: 0.5338
Epoch [12/51], Iter [270/391] Loss: 0.4651
Epoch [12/51], Iter [280/391] Loss: 0.4961
Epoch [12/51], Iter [290/391] Loss: 0.4851
Epoch [12/51], Iter [300/391] Loss: 0.5055
Epoch [12/51], Iter [310/391] Loss: 0.5127
Epoch [12/51], Iter [320/391] Loss: 0.4811
Epoch [12/51], Iter [330/391] Loss: 0.4853
Epoch [12/51], Iter [340/391] Loss: 0.5623
Epoch [12/51], Iter [350/391] Loss: 0.5545
Epoch [12/51], Iter [360/391] Loss: 0.5402
Epoch [12/51], Iter [370/391] Loss: 0.5143
Epoch [12/51], Iter [380/391] Loss: 0.5188
Epoch [12/51], Iter [390/391] Loss: 0.4769
Epoch [13/51], Iter [10/391] Loss: 0.4587
Epoch [13/51], Iter [20/391] Loss: 0.5133
Epoch [13/51], Iter [30/391] Loss: 0.4893
Epoch [13/51], Iter [40/391] Loss: 0.4721
Epoch [13/51], Iter [50/391] Loss: 0.4405
Epoch [13/51], Iter [60/391] Loss: 0.5104
Epoch [13/51], Iter [70/391] Loss: 0.5099
Epoch [13/51], Iter [80/391] Loss: 0.4723
Epoch [13/51], Iter [90/391] Loss: 0.4715
Epoch [13/51], Iter [100/391] Loss: 0.4930
Epoch [13/51], Iter [110/391] Loss: 0.4411
Epoch [13/51], Iter [120/391] Loss: 0.4577
Epoch [13/51], Iter [130/391] Loss: 0.4569
Epoch [13/51], Iter [140/391] Loss: 0.4638
Epoch [13/51], Iter [150/391] Loss: 0.4527
Epoch [13/51], Iter [160/391] Loss: 0.5176
Epoch [13/51], Iter [170/391] Loss: 0.4951
Epoch [13/51], Iter [180/391] Loss: 0.4618
Epoch [13/51], Iter [190/391] Loss: 0.4447
Epoch [13/51], Iter [200/391] Loss: 0.4633
Epoch [13/51], Iter [210/391] Loss: 0.5015
Epoch [13/51], Iter [220/391] Loss: 0.5086
Epoch [13/51], Iter [230/391] Loss: 0.4483
Epoch [13/51], Iter [240/391] Loss: 0.5101
Epoch [13/51], Iter [250/391] Loss: 0.4658
Epoch [13/51], Iter [260/391] Loss: 0.5215
Epoch [13/51], Iter [270/391] Loss: 0.4763
Epoch [13/51], Iter [280/391] Loss: 0.4543
Epoch [13/51], Iter [290/391] Loss: 0.5202
Epoch [13/51], Iter [300/391] Loss: 0.4337
Epoch [13/51], Iter [310/391] Loss: 0.4187
Epoch [13/51], Iter [320/391] Loss: 0.4388
Epoch [13/51], Iter [330/391] Loss: 0.5398
Epoch [13/51], Iter [340/391] Loss: 0.5269
Epoch [13/51], Iter [350/391] Loss: 0.4610
Epoch [13/51], Iter [360/391] Loss: 0.4697
Epoch [13/51], Iter [370/391] Loss: 0.5167
Epoch [13/51], Iter [380/391] Loss: 0.4762
Epoch [13/51], Iter [390/391] Loss: 0.4390
Epoch [14/51], Iter [10/391] Loss: 0.4934
Epoch [14/51], Iter [20/391] Loss: 0.4585
Epoch [14/51], Iter [30/391] Loss: 0.4259
Epoch [14/51], Iter [40/391] Loss: 0.4653
Epoch [14/51], Iter [50/391] Loss: 0.4839
Epoch [14/51], Iter [60/391] Loss: 0.4796
Epoch [14/51], Iter [70/391] Loss: 0.4586
Epoch [14/51], Iter [80/391] Loss: 0.4868
Epoch [14/51], Iter [90/391] Loss: 0.4548
Epoch [14/51], Iter [100/391] Loss: 0.4889
Epoch [14/51], Iter [110/391] Loss: 0.4676
Epoch [14/51], Iter [120/391] Loss: 0.4601
Epoch [14/51], Iter [130/391] Loss: 0.4704
Epoch [14/51], Iter [140/391] Loss: 0.4685
Epoch [14/51], Iter [150/391] Loss: 0.4864
Epoch [14/51], Iter [160/391] Loss: 0.4539
Epoch [14/51], Iter [170/391] Loss: 0.4673
Epoch [14/51], Iter [180/391] Loss: 0.4814
Epoch [14/51], Iter [190/391] Loss: 0.4218
Epoch [14/51], Iter [200/391] Loss: 0.4323
Epoch [14/51], Iter [210/391] Loss: 0.4484
Epoch [14/51], Iter [220/391] Loss: 0.4765
Epoch [14/51], Iter [230/391] Loss: 0.4643
Epoch [14/51], Iter [240/391] Loss: 0.4399
Epoch [14/51], Iter [250/391] Loss: 0.5008
Epoch [14/51], Iter [260/391] Loss: 0.5044
Epoch [14/51], Iter [270/391] Loss: 0.4366
Epoch [14/51], Iter [280/391] Loss: 0.4598
Epoch [14/51], Iter [290/391] Loss: 0.5120
Epoch [14/51], Iter [300/391] Loss: 0.5226
Epoch [14/51], Iter [310/391] Loss: 0.4890
Epoch [14/51], Iter [320/391] Loss: 0.4626
Epoch [14/51], Iter [330/391] Loss: 0.4712
Epoch [14/51], Iter [340/391] Loss: 0.4642
Epoch [14/51], Iter [350/391] Loss: 0.4823
Epoch [14/51], Iter [360/391] Loss: 0.4670
Epoch [14/51], Iter [370/391] Loss: 0.4861
Epoch [14/51], Iter [380/391] Loss: 0.4947
Epoch [14/51], Iter [390/391] Loss: 0.4407
Epoch [15/51], Iter [10/391] Loss: 0.5042
Epoch [15/51], Iter [20/391] Loss: 0.4815
Epoch [15/51], Iter [30/391] Loss: 0.4230
Epoch [15/51], Iter [40/391] Loss: 0.4573
Epoch [15/51], Iter [50/391] Loss: 0.4518
Epoch [15/51], Iter [60/391] Loss: 0.4658
Epoch [15/51], Iter [70/391] Loss: 0.4644
Epoch [15/51], Iter [80/391] Loss: 0.4847
Epoch [15/51], Iter [90/391] Loss: 0.4734
Epoch [15/51], Iter [100/391] Loss: 0.4777
Epoch [15/51], Iter [110/391] Loss: 0.4325
Epoch [15/51], Iter [120/391] Loss: 0.4949
Epoch [15/51], Iter [130/391] Loss: 0.4588
Epoch [15/51], Iter [140/391] Loss: 0.4893
Epoch [15/51], Iter [150/391] Loss: 0.5088
Epoch [15/51], Iter [160/391] Loss: 0.4403
Epoch [15/51], Iter [170/391] Loss: 0.4276
Epoch [15/51], Iter [180/391] Loss: 0.4903
Epoch [15/51], Iter [190/391] Loss: 0.4065
Epoch [15/51], Iter [200/391] Loss: 0.4451
Epoch [15/51], Iter [210/391] Loss: 0.4686
Epoch [15/51], Iter [220/391] Loss: 0.4644
Epoch [15/51], Iter [230/391] Loss: 0.4829
Epoch [15/51], Iter [240/391] Loss: 0.5080
Epoch [15/51], Iter [250/391] Loss: 0.4124
Epoch [15/51], Iter [260/391] Loss: 0.4685
Epoch [15/51], Iter [270/391] Loss: 0.4468
Epoch [15/51], Iter [280/391] Loss: 0.4791
Epoch [15/51], Iter [290/391] Loss: 0.5011
Epoch [15/51], Iter [300/391] Loss: 0.4952
Epoch [15/51], Iter [310/391] Loss: 0.4670
Epoch [15/51], Iter [320/391] Loss: 0.4646
Epoch [15/51], Iter [330/391] Loss: 0.4528
Epoch [15/51], Iter [340/391] Loss: 0.4436
Epoch [15/51], Iter [350/391] Loss: 0.4477
Epoch [15/51], Iter [360/391] Loss: 0.4571
Epoch [15/51], Iter [370/391] Loss: 0.4341
Epoch [15/51], Iter [380/391] Loss: 0.4627
Epoch [15/51], Iter [390/391] Loss: 0.4984
Epoch [16/51], Iter [10/391] Loss: 0.4786
Epoch [16/51], Iter [20/391] Loss: 0.3925
Epoch [16/51], Iter [30/391] Loss: 0.4201
Epoch [16/51], Iter [40/391] Loss: 0.4652
Epoch [16/51], Iter [50/391] Loss: 0.4450
Epoch [16/51], Iter [60/391] Loss: 0.4490
Epoch [16/51], Iter [70/391] Loss: 0.4024
Epoch [16/51], Iter [80/391] Loss: 0.4963
Epoch [16/51], Iter [90/391] Loss: 0.4353
Epoch [16/51], Iter [100/391] Loss: 0.4358
Epoch [16/51], Iter [110/391] Loss: 0.4736
Epoch [16/51], Iter [120/391] Loss: 0.4275
Epoch [16/51], Iter [130/391] Loss: 0.4225
Epoch [16/51], Iter [140/391] Loss: 0.4241
Epoch [16/51], Iter [150/391] Loss: 0.4205
Epoch [16/51], Iter [160/391] Loss: 0.4667
Epoch [16/51], Iter [170/391] Loss: 0.4589
Epoch [16/51], Iter [180/391] Loss: 0.4665
Epoch [16/51], Iter [190/391] Loss: 0.4836
Epoch [16/51], Iter [200/391] Loss: 0.4653
Epoch [16/51], Iter [210/391] Loss: 0.4299
Epoch [16/51], Iter [220/391] Loss: 0.4518
Epoch [16/51], Iter [230/391] Loss: 0.4721
Epoch [16/51], Iter [240/391] Loss: 0.4277
Epoch [16/51], Iter [250/391] Loss: 0.4790
Epoch [16/51], Iter [260/391] Loss: 0.4294
Epoch [16/51], Iter [270/391] Loss: 0.4821
Epoch [16/51], Iter [280/391] Loss: 0.4252
Epoch [16/51], Iter [290/391] Loss: 0.4419
Epoch [16/51], Iter [300/391] Loss: 0.4416
Epoch [16/51], Iter [310/391] Loss: 0.4485
Epoch [16/51], Iter [320/391] Loss: 0.4338
Epoch [16/51], Iter [330/391] Loss: 0.4662
Epoch [16/51], Iter [340/391] Loss: 0.4815
Epoch [16/51], Iter [350/391] Loss: 0.4358
Epoch [16/51], Iter [360/391] Loss: 0.4539
Epoch [16/51], Iter [370/391] Loss: 0.4614
Epoch [16/51], Iter [380/391] Loss: 0.4783
Epoch [16/51], Iter [390/391] Loss: 0.4369
Epoch [17/51], Iter [10/391] Loss: 0.4403
Epoch [17/51], Iter [20/391] Loss: 0.4136
Epoch [17/51], Iter [30/391] Loss: 0.3939
Epoch [17/51], Iter [40/391] Loss: 0.4296
Epoch [17/51], Iter [50/391] Loss: 0.4163
Epoch [17/51], Iter [60/391] Loss: 0.4355
Epoch [17/51], Iter [70/391] Loss: 0.4407
Epoch [17/51], Iter [80/391] Loss: 0.4458
Epoch [17/51], Iter [90/391] Loss: 0.4190
Epoch [17/51], Iter [100/391] Loss: 0.4497
Epoch [17/51], Iter [110/391] Loss: 0.4818
Epoch [17/51], Iter [120/391] Loss: 0.4221
Epoch [17/51], Iter [130/391] Loss: 0.4698
Epoch [17/51], Iter [140/391] Loss: 0.4227
Epoch [17/51], Iter [150/391] Loss: 0.4241
Epoch [17/51], Iter [160/391] Loss: 0.4930
Epoch [17/51], Iter [170/391] Loss: 0.4854
Epoch [17/51], Iter [180/391] Loss: 0.3921
Epoch [17/51], Iter [190/391] Loss: 0.4564
Epoch [17/51], Iter [200/391] Loss: 0.4399
Epoch [17/51], Iter [210/391] Loss: 0.4561
Epoch [17/51], Iter [220/391] Loss: 0.4194
Epoch [17/51], Iter [230/391] Loss: 0.4119
Epoch [17/51], Iter [240/391] Loss: 0.4824
Epoch [17/51], Iter [250/391] Loss: 0.4770
Epoch [17/51], Iter [260/391] Loss: 0.4709
Epoch [17/51], Iter [270/391] Loss: 0.4729
Epoch [17/51], Iter [280/391] Loss: 0.4323
Epoch [17/51], Iter [290/391] Loss: 0.4565
Epoch [17/51], Iter [300/391] Loss: 0.4787
Epoch [17/51], Iter [310/391] Loss: 0.4818
Epoch [17/51], Iter [320/391] Loss: 0.4603
Epoch [17/51], Iter [330/391] Loss: 0.4399
Epoch [17/51], Iter [340/391] Loss: 0.4403
Epoch [17/51], Iter [350/391] Loss: 0.4580
Epoch [17/51], Iter [360/391] Loss: 0.4183
Epoch [17/51], Iter [370/391] Loss: 0.4427
Epoch [17/51], Iter [380/391] Loss: 0.4266
Epoch [17/51], Iter [390/391] Loss: 0.4004
Epoch [18/51], Iter [10/391] Loss: 0.4516
Epoch [18/51], Iter [20/391] Loss: 0.4306
Epoch [18/51], Iter [30/391] Loss: 0.4629
Epoch [18/51], Iter [40/391] Loss: 0.4270
Epoch [18/51], Iter [50/391] Loss: 0.4380
Epoch [18/51], Iter [60/391] Loss: 0.4206
Epoch [18/51], Iter [70/391] Loss: 0.4317
Epoch [18/51], Iter [80/391] Loss: 0.4468
Epoch [18/51], Iter [90/391] Loss: 0.4483
Epoch [18/51], Iter [100/391] Loss: 0.4079
Epoch [18/51], Iter [110/391] Loss: 0.4193
Epoch [18/51], Iter [120/391] Loss: 0.4176
Epoch [18/51], Iter [130/391] Loss: 0.4356
Epoch [18/51], Iter [140/391] Loss: 0.4581
Epoch [18/51], Iter [150/391] Loss: 0.4684
Epoch [18/51], Iter [160/391] Loss: 0.4550
Epoch [18/51], Iter [170/391] Loss: 0.4801
Epoch [18/51], Iter [180/391] Loss: 0.4227
Epoch [18/51], Iter [190/391] Loss: 0.4249
Epoch [18/51], Iter [200/391] Loss: 0.4374
Epoch [18/51], Iter [210/391] Loss: 0.4308
Epoch [18/51], Iter [220/391] Loss: 0.4524
Epoch [18/51], Iter [230/391] Loss: 0.4434
Epoch [18/51], Iter [240/391] Loss: 0.4482
Epoch [18/51], Iter [250/391] Loss: 0.4256
Epoch [18/51], Iter [260/391] Loss: 0.4702
Epoch [18/51], Iter [270/391] Loss: 0.4192
Epoch [18/51], Iter [280/391] Loss: 0.4429
Epoch [18/51], Iter [290/391] Loss: 0.4697
Epoch [18/51], Iter [300/391] Loss: 0.4242
Epoch [18/51], Iter [310/391] Loss: 0.4102
Epoch [18/51], Iter [320/391] Loss: 0.4090
Epoch [18/51], Iter [330/391] Loss: 0.4298
Epoch [18/51], Iter [340/391] Loss: 0.4336
Epoch [18/51], Iter [350/391] Loss: 0.4507
Epoch [18/51], Iter [360/391] Loss: 0.4145
Epoch [18/51], Iter [370/391] Loss: 0.4605
Epoch [18/51], Iter [380/391] Loss: 0.4599
Epoch [18/51], Iter [390/391] Loss: 0.4179
Epoch [19/51], Iter [10/391] Loss: 0.4235
Epoch [19/51], Iter [20/391] Loss: 0.4259
Epoch [19/51], Iter [30/391] Loss: 0.4220
Epoch [19/51], Iter [40/391] Loss: 0.4498
Epoch [19/51], Iter [50/391] Loss: 0.4168
Epoch [19/51], Iter [60/391] Loss: 0.4125
Epoch [19/51], Iter [70/391] Loss: 0.4581
Epoch [19/51], Iter [80/391] Loss: 0.4595
Epoch [19/51], Iter [90/391] Loss: 0.3964
Epoch [19/51], Iter [100/391] Loss: 0.4295
Epoch [19/51], Iter [110/391] Loss: 0.4092
Epoch [19/51], Iter [120/391] Loss: 0.4267
Epoch [19/51], Iter [130/391] Loss: 0.4506
Epoch [19/51], Iter [140/391] Loss: 0.4217
Epoch [19/51], Iter [150/391] Loss: 0.4186
Epoch [19/51], Iter [160/391] Loss: 0.4787
Epoch [19/51], Iter [170/391] Loss: 0.4037
Epoch [19/51], Iter [180/391] Loss: 0.4129
Epoch [19/51], Iter [190/391] Loss: 0.4129
Epoch [19/51], Iter [200/391] Loss: 0.4579
Epoch [19/51], Iter [210/391] Loss: 0.3868
Epoch [19/51], Iter [220/391] Loss: 0.3863
Epoch [19/51], Iter [230/391] Loss: 0.4142
Epoch [19/51], Iter [240/391] Loss: 0.4361
Epoch [19/51], Iter [250/391] Loss: 0.4225
Epoch [19/51], Iter [260/391] Loss: 0.4089
Epoch [19/51], Iter [270/391] Loss: 0.4259
Epoch [19/51], Iter [280/391] Loss: 0.4120
Epoch [19/51], Iter [290/391] Loss: 0.4091
Epoch [19/51], Iter [300/391] Loss: 0.4014
Epoch [19/51], Iter [310/391] Loss: 0.4188
Epoch [19/51], Iter [320/391] Loss: 0.4337
Epoch [19/51], Iter [330/391] Loss: 0.4204
Epoch [19/51], Iter [340/391] Loss: 0.4829
Epoch [19/51], Iter [350/391] Loss: 0.4367
Epoch [19/51], Iter [360/391] Loss: 0.3908
Epoch [19/51], Iter [370/391] Loss: 0.4466
Epoch [19/51], Iter [380/391] Loss: 0.4273
Epoch [19/51], Iter [390/391] Loss: 0.4291
Epoch [20/51], Iter [10/391] Loss: 0.4462
Epoch [20/51], Iter [20/391] Loss: 0.4462
Epoch [20/51], Iter [30/391] Loss: 0.4325
Epoch [20/51], Iter [40/391] Loss: 0.4042
Epoch [20/51], Iter [50/391] Loss: 0.4398
Epoch [20/51], Iter [60/391] Loss: 0.4305
Epoch [20/51], Iter [70/391] Loss: 0.4342
Epoch [20/51], Iter [80/391] Loss: 0.4128
Epoch [20/51], Iter [90/391] Loss: 0.4237
Epoch [20/51], Iter [100/391] Loss: 0.4568
Epoch [20/51], Iter [110/391] Loss: 0.4530
Epoch [20/51], Iter [120/391] Loss: 0.4128
Epoch [20/51], Iter [130/391] Loss: 0.4105
Epoch [20/51], Iter [140/391] Loss: 0.4402
Epoch [20/51], Iter [150/391] Loss: 0.3932
Epoch [20/51], Iter [160/391] Loss: 0.3935
Epoch [20/51], Iter [170/391] Loss: 0.4298
Epoch [20/51], Iter [180/391] Loss: 0.4214
Epoch [20/51], Iter [190/391] Loss: 0.4126
Epoch [20/51], Iter [200/391] Loss: 0.4114
Epoch [20/51], Iter [210/391] Loss: 0.4262
Epoch [20/51], Iter [220/391] Loss: 0.4719
Epoch [20/51], Iter [230/391] Loss: 0.4400
Epoch [20/51], Iter [240/391] Loss: 0.3844
Epoch [20/51], Iter [250/391] Loss: 0.4132
Epoch [20/51], Iter [260/391] Loss: 0.3965
Epoch [20/51], Iter [270/391] Loss: 0.4348
Epoch [20/51], Iter [280/391] Loss: 0.4445
Epoch [20/51], Iter [290/391] Loss: 0.4153
Epoch [20/51], Iter [300/391] Loss: 0.4322
Epoch [20/51], Iter [310/391] Loss: 0.4605
Epoch [20/51], Iter [320/391] Loss: 0.4253
Epoch [20/51], Iter [330/391] Loss: 0.4566
Epoch [20/51], Iter [340/391] Loss: 0.3964
Epoch [20/51], Iter [350/391] Loss: 0.4114
Epoch [20/51], Iter [360/391] Loss: 0.4174
Epoch [20/51], Iter [370/391] Loss: 0.4262
Epoch [20/51], Iter [380/391] Loss: 0.4250
Epoch [20/51], Iter [390/391] Loss: 0.3955
[Saving Checkpoint]
Epoch [21/51], Iter [10/391] Loss: 0.4099
Epoch [21/51], Iter [20/391] Loss: 0.3747
Epoch [21/51], Iter [30/391] Loss: 0.3932
Epoch [21/51], Iter [40/391] Loss: 0.3760
Epoch [21/51], Iter [50/391] Loss: 0.4354
Epoch [21/51], Iter [60/391] Loss: 0.4097
Epoch [21/51], Iter [70/391] Loss: 0.3906
Epoch [21/51], Iter [80/391] Loss: 0.3991
Epoch [21/51], Iter [90/391] Loss: 0.4270
Epoch [21/51], Iter [100/391] Loss: 0.4056
Epoch [21/51], Iter [110/391] Loss: 0.4063
Epoch [21/51], Iter [120/391] Loss: 0.4249
Epoch [21/51], Iter [130/391] Loss: 0.4033
Epoch [21/51], Iter [140/391] Loss: 0.4162
Epoch [21/51], Iter [150/391] Loss: 0.4020
Epoch [21/51], Iter [160/391] Loss: 0.3811
Epoch [21/51], Iter [170/391] Loss: 0.3894
Epoch [21/51], Iter [180/391] Loss: 0.4326
Epoch [21/51], Iter [190/391] Loss: 0.3756
Epoch [21/51], Iter [200/391] Loss: 0.4254
Epoch [21/51], Iter [210/391] Loss: 0.4119
Epoch [21/51], Iter [220/391] Loss: 0.3895
Epoch [21/51], Iter [230/391] Loss: 0.4225
Epoch [21/51], Iter [240/391] Loss: 0.4308
Epoch [21/51], Iter [250/391] Loss: 0.4426
Epoch [21/51], Iter [260/391] Loss: 0.4011
Epoch [21/51], Iter [270/391] Loss: 0.4263
Epoch [21/51], Iter [280/391] Loss: 0.4050
Epoch [21/51], Iter [290/391] Loss: 0.4229
Epoch [21/51], Iter [300/391] Loss: 0.4769
Epoch [21/51], Iter [310/391] Loss: 0.3976
Epoch [21/51], Iter [320/391] Loss: 0.3663
Epoch [21/51], Iter [330/391] Loss: 0.4138
Epoch [21/51], Iter [340/391] Loss: 0.3992
Epoch [21/51], Iter [350/391] Loss: 0.4007
Epoch [21/51], Iter [360/391] Loss: 0.4301
Epoch [21/51], Iter [370/391] Loss: 0.4117
Epoch [21/51], Iter [380/391] Loss: 0.4279
Epoch [21/51], Iter [390/391] Loss: 0.4030
Epoch [22/51], Iter [10/391] Loss: 0.3927
Epoch [22/51], Iter [20/391] Loss: 0.3970
Epoch [22/51], Iter [30/391] Loss: 0.4129
Epoch [22/51], Iter [40/391] Loss: 0.3764
Epoch [22/51], Iter [50/391] Loss: 0.4381
Epoch [22/51], Iter [60/391] Loss: 0.4247
Epoch [22/51], Iter [70/391] Loss: 0.4340
Epoch [22/51], Iter [80/391] Loss: 0.4213
Epoch [22/51], Iter [90/391] Loss: 0.3818
Epoch [22/51], Iter [100/391] Loss: 0.3877
Epoch [22/51], Iter [110/391] Loss: 0.3792
Epoch [22/51], Iter [120/391] Loss: 0.3702
Epoch [22/51], Iter [130/391] Loss: 0.3705
Epoch [22/51], Iter [140/391] Loss: 0.3944
Epoch [22/51], Iter [150/391] Loss: 0.3978
Epoch [22/51], Iter [160/391] Loss: 0.4229
Epoch [22/51], Iter [170/391] Loss: 0.4160
Epoch [22/51], Iter [180/391] Loss: 0.4007
Epoch [22/51], Iter [190/391] Loss: 0.3651
Epoch [22/51], Iter [200/391] Loss: 0.4099
Epoch [22/51], Iter [210/391] Loss: 0.4241
Epoch [22/51], Iter [220/391] Loss: 0.3959
Epoch [22/51], Iter [230/391] Loss: 0.4253
Epoch [22/51], Iter [240/391] Loss: 0.4062
Epoch [22/51], Iter [250/391] Loss: 0.4266
Epoch [22/51], Iter [260/391] Loss: 0.4408
Epoch [22/51], Iter [270/391] Loss: 0.4100
Epoch [22/51], Iter [280/391] Loss: 0.3804
Epoch [22/51], Iter [290/391] Loss: 0.4637
Epoch [22/51], Iter [300/391] Loss: 0.4257
Epoch [22/51], Iter [310/391] Loss: 0.3901
Epoch [22/51], Iter [320/391] Loss: 0.4411
Epoch [22/51], Iter [330/391] Loss: 0.4351
Epoch [22/51], Iter [340/391] Loss: 0.4001
Epoch [22/51], Iter [350/391] Loss: 0.3941
Epoch [22/51], Iter [360/391] Loss: 0.3896
Epoch [22/51], Iter [370/391] Loss: 0.4138
Epoch [22/51], Iter [380/391] Loss: 0.4111
Epoch [22/51], Iter [390/391] Loss: 0.3730
Epoch [23/51], Iter [10/391] Loss: 0.3873
Epoch [23/51], Iter [20/391] Loss: 0.3977
Epoch [23/51], Iter [30/391] Loss: 0.4102
Epoch [23/51], Iter [40/391] Loss: 0.4198
Epoch [23/51], Iter [50/391] Loss: 0.3832
Epoch [23/51], Iter [60/391] Loss: 0.3954
Epoch [23/51], Iter [70/391] Loss: 0.3884
Epoch [23/51], Iter [80/391] Loss: 0.4203
Epoch [23/51], Iter [90/391] Loss: 0.4106
Epoch [23/51], Iter [100/391] Loss: 0.3727
Epoch [23/51], Iter [110/391] Loss: 0.3889
Epoch [23/51], Iter [120/391] Loss: 0.3792
Epoch [23/51], Iter [130/391] Loss: 0.4149
Epoch [23/51], Iter [140/391] Loss: 0.3588
Epoch [23/51], Iter [150/391] Loss: 0.4025
Epoch [23/51], Iter [160/391] Loss: 0.3969
Epoch [23/51], Iter [170/391] Loss: 0.3933
Epoch [23/51], Iter [180/391] Loss: 0.3940
Epoch [23/51], Iter [190/391] Loss: 0.3971
Epoch [23/51], Iter [200/391] Loss: 0.4111
Epoch [23/51], Iter [210/391] Loss: 0.4188
Epoch [23/51], Iter [220/391] Loss: 0.3861
Epoch [23/51], Iter [230/391] Loss: 0.4127
Epoch [23/51], Iter [240/391] Loss: 0.4081
Epoch [23/51], Iter [250/391] Loss: 0.3757
Epoch [23/51], Iter [260/391] Loss: 0.3716
Epoch [23/51], Iter [270/391] Loss: 0.4081
Epoch [23/51], Iter [280/391] Loss: 0.4383
Epoch [23/51], Iter [290/391] Loss: 0.4167
Epoch [23/51], Iter [300/391] Loss: 0.4191
Epoch [23/51], Iter [310/391] Loss: 0.4476
Epoch [23/51], Iter [320/391] Loss: 0.3869
Epoch [23/51], Iter [330/391] Loss: 0.3958
Epoch [23/51], Iter [340/391] Loss: 0.3921
Epoch [23/51], Iter [350/391] Loss: 0.4143
Epoch [23/51], Iter [360/391] Loss: 0.3956
Epoch [23/51], Iter [370/391] Loss: 0.3940
Epoch [23/51], Iter [380/391] Loss: 0.4140
Epoch [23/51], Iter [390/391] Loss: 0.4315
Epoch [24/51], Iter [10/391] Loss: 0.4197
Epoch [24/51], Iter [20/391] Loss: 0.3830
Epoch [24/51], Iter [30/391] Loss: 0.3615
Epoch [24/51], Iter [40/391] Loss: 0.4061
Epoch [24/51], Iter [50/391] Loss: 0.4257
Epoch [24/51], Iter [60/391] Loss: 0.3892
Epoch [24/51], Iter [70/391] Loss: 0.4066
Epoch [24/51], Iter [80/391] Loss: 0.3964
Epoch [24/51], Iter [90/391] Loss: 0.4290
Epoch [24/51], Iter [100/391] Loss: 0.3971
Epoch [24/51], Iter [110/391] Loss: 0.4073
Epoch [24/51], Iter [120/391] Loss: 0.4157
Epoch [24/51], Iter [130/391] Loss: 0.4200
Epoch [24/51], Iter [140/391] Loss: 0.4111
Epoch [24/51], Iter [150/391] Loss: 0.4080
Epoch [24/51], Iter [160/391] Loss: 0.4288
Epoch [24/51], Iter [170/391] Loss: 0.4037
Epoch [24/51], Iter [180/391] Loss: 0.4321
Epoch [24/51], Iter [190/391] Loss: 0.3510
Epoch [24/51], Iter [200/391] Loss: 0.3807
Epoch [24/51], Iter [210/391] Loss: 0.3741
Epoch [24/51], Iter [220/391] Loss: 0.4055
Epoch [24/51], Iter [230/391] Loss: 0.4383
Epoch [24/51], Iter [240/391] Loss: 0.4512
Epoch [24/51], Iter [250/391] Loss: 0.3777
Epoch [24/51], Iter [260/391] Loss: 0.3900
Epoch [24/51], Iter [270/391] Loss: 0.4098
Epoch [24/51], Iter [280/391] Loss: 0.3906
Epoch [24/51], Iter [290/391] Loss: 0.3926
Epoch [24/51], Iter [300/391] Loss: 0.4350
Epoch [24/51], Iter [310/391] Loss: 0.4048
Epoch [24/51], Iter [320/391] Loss: 0.4045
Epoch [24/51], Iter [330/391] Loss: 0.3800
Epoch [24/51], Iter [340/391] Loss: 0.4377
Epoch [24/51], Iter [350/391] Loss: 0.3889
Epoch [24/51], Iter [360/391] Loss: 0.4261
Epoch [24/51], Iter [370/391] Loss: 0.3866
Epoch [24/51], Iter [380/391] Loss: 0.4286
Epoch [24/51], Iter [390/391] Loss: 0.3821
Epoch [25/51], Iter [10/391] Loss: 0.3740
Epoch [25/51], Iter [20/391] Loss: 0.3767
Epoch [25/51], Iter [30/391] Loss: 0.3854
Epoch [25/51], Iter [40/391] Loss: 0.4102
Epoch [25/51], Iter [50/391] Loss: 0.3914
Epoch [25/51], Iter [60/391] Loss: 0.4093
Epoch [25/51], Iter [70/391] Loss: 0.3902
Epoch [25/51], Iter [80/391] Loss: 0.3926
Epoch [25/51], Iter [90/391] Loss: 0.3925
Epoch [25/51], Iter [100/391] Loss: 0.4057
Epoch [25/51], Iter [110/391] Loss: 0.4036
Epoch [25/51], Iter [120/391] Loss: 0.4028
Epoch [25/51], Iter [130/391] Loss: 0.3757
Epoch [25/51], Iter [140/391] Loss: 0.4045
Epoch [25/51], Iter [150/391] Loss: 0.3902
Epoch [25/51], Iter [160/391] Loss: 0.4167
Epoch [25/51], Iter [170/391] Loss: 0.3678
Epoch [25/51], Iter [180/391] Loss: 0.4392
Epoch [25/51], Iter [190/391] Loss: 0.4152
Epoch [25/51], Iter [200/391] Loss: 0.4044
Epoch [25/51], Iter [210/391] Loss: 0.4580
Epoch [25/51], Iter [220/391] Loss: 0.3700
Epoch [25/51], Iter [230/391] Loss: 0.4314
Epoch [25/51], Iter [240/391] Loss: 0.3747
Epoch [25/51], Iter [250/391] Loss: 0.4432
Epoch [25/51], Iter [260/391] Loss: 0.3871
Epoch [25/51], Iter [270/391] Loss: 0.3712
Epoch [25/51], Iter [280/391] Loss: 0.3942
Epoch [25/51], Iter [290/391] Loss: 0.3900
Epoch [25/51], Iter [300/391] Loss: 0.3843
Epoch [25/51], Iter [310/391] Loss: 0.3953
Epoch [25/51], Iter [320/391] Loss: 0.4011
Epoch [25/51], Iter [330/391] Loss: 0.4304
Epoch [25/51], Iter [340/391] Loss: 0.3900
Epoch [25/51], Iter [350/391] Loss: 0.3764
Epoch [25/51], Iter [360/391] Loss: 0.3842
Epoch [25/51], Iter [370/391] Loss: 0.4188
Epoch [25/51], Iter [380/391] Loss: 0.3877
Epoch [25/51], Iter [390/391] Loss: 0.3926
Epoch [26/51], Iter [10/391] Loss: 0.3642
Epoch [26/51], Iter [20/391] Loss: 0.3570
Epoch [26/51], Iter [30/391] Loss: 0.3760
Epoch [26/51], Iter [40/391] Loss: 0.3715
Epoch [26/51], Iter [50/391] Loss: 0.3899
Epoch [26/51], Iter [60/391] Loss: 0.3684
Epoch [26/51], Iter [70/391] Loss: 0.3911
Epoch [26/51], Iter [80/391] Loss: 0.4306
Epoch [26/51], Iter [90/391] Loss: 0.3722
Epoch [26/51], Iter [100/391] Loss: 0.3762
Epoch [26/51], Iter [110/391] Loss: 0.3905
Epoch [26/51], Iter [120/391] Loss: 0.3752
Epoch [26/51], Iter [130/391] Loss: 0.3979
Epoch [26/51], Iter [140/391] Loss: 0.3876
Epoch [26/51], Iter [150/391] Loss: 0.3709
Epoch [26/51], Iter [160/391] Loss: 0.3702
Epoch [26/51], Iter [170/391] Loss: 0.3888
Epoch [26/51], Iter [180/391] Loss: 0.4047
Epoch [26/51], Iter [190/391] Loss: 0.4134
Epoch [26/51], Iter [200/391] Loss: 0.4039
Epoch [26/51], Iter [210/391] Loss: 0.3675
Epoch [26/51], Iter [220/391] Loss: 0.3849
Epoch [26/51], Iter [230/391] Loss: 0.3936
Epoch [26/51], Iter [240/391] Loss: 0.4158
Epoch [26/51], Iter [250/391] Loss: 0.3907
Epoch [26/51], Iter [260/391] Loss: 0.4465
Epoch [26/51], Iter [270/391] Loss: 0.3758
Epoch [26/51], Iter [280/391] Loss: 0.4136
Epoch [26/51], Iter [290/391] Loss: 0.3917
Epoch [26/51], Iter [300/391] Loss: 0.4079
Epoch [26/51], Iter [310/391] Loss: 0.4211
Epoch [26/51], Iter [320/391] Loss: 0.3798
Epoch [26/51], Iter [330/391] Loss: 0.3505
Epoch [26/51], Iter [340/391] Loss: 0.4250
Epoch [26/51], Iter [350/391] Loss: 0.4226
Epoch [26/51], Iter [360/391] Loss: 0.4059
Epoch [26/51], Iter [370/391] Loss: 0.4240
Epoch [26/51], Iter [380/391] Loss: 0.4125
Epoch [26/51], Iter [390/391] Loss: 0.4075
Epoch [27/51], Iter [10/391] Loss: 0.4045
Epoch [27/51], Iter [20/391] Loss: 0.3620
Epoch [27/51], Iter [30/391] Loss: 0.4121
Epoch [27/51], Iter [40/391] Loss: 0.4046
Epoch [27/51], Iter [50/391] Loss: 0.3969
Epoch [27/51], Iter [60/391] Loss: 0.3864
Epoch [27/51], Iter [70/391] Loss: 0.3630
Epoch [27/51], Iter [80/391] Loss: 0.3842
Epoch [27/51], Iter [90/391] Loss: 0.3872
Epoch [27/51], Iter [100/391] Loss: 0.3710
Epoch [27/51], Iter [110/391] Loss: 0.4166
Epoch [27/51], Iter [120/391] Loss: 0.3820
Epoch [27/51], Iter [130/391] Loss: 0.3978
Epoch [27/51], Iter [140/391] Loss: 0.3747
Epoch [27/51], Iter [150/391] Loss: 0.4073
Epoch [27/51], Iter [160/391] Loss: 0.3781
Epoch [27/51], Iter [170/391] Loss: 0.3871
Epoch [27/51], Iter [180/391] Loss: 0.3947
Epoch [27/51], Iter [190/391] Loss: 0.3852
Epoch [27/51], Iter [200/391] Loss: 0.3875
Epoch [27/51], Iter [210/391] Loss: 0.3724
Epoch [27/51], Iter [220/391] Loss: 0.4078
Epoch [27/51], Iter [230/391] Loss: 0.3757
Epoch [27/51], Iter [240/391] Loss: 0.4042
Epoch [27/51], Iter [250/391] Loss: 0.4087
Epoch [27/51], Iter [260/391] Loss: 0.4025
Epoch [27/51], Iter [270/391] Loss: 0.3911
Epoch [27/51], Iter [280/391] Loss: 0.3897
Epoch [27/51], Iter [290/391] Loss: 0.4151
Epoch [27/51], Iter [300/391] Loss: 0.4065
Epoch [27/51], Iter [310/391] Loss: 0.4073
Epoch [27/51], Iter [320/391] Loss: 0.4022
Epoch [27/51], Iter [330/391] Loss: 0.4216
Epoch [27/51], Iter [340/391] Loss: 0.3980
Epoch [27/51], Iter [350/391] Loss: 0.4119
Epoch [27/51], Iter [360/391] Loss: 0.3760
Epoch [27/51], Iter [370/391] Loss: 0.3895
Epoch [27/51], Iter [380/391] Loss: 0.3816
Epoch [27/51], Iter [390/391] Loss: 0.3872
Epoch [28/51], Iter [10/391] Loss: 0.3867
Epoch [28/51], Iter [20/391] Loss: 0.3784
Epoch [28/51], Iter [30/391] Loss: 0.4158
Epoch [28/51], Iter [40/391] Loss: 0.3962
Epoch [28/51], Iter [50/391] Loss: 0.3817
Epoch [28/51], Iter [60/391] Loss: 0.3807
Epoch [28/51], Iter [70/391] Loss: 0.4225
Epoch [28/51], Iter [80/391] Loss: 0.3917
Epoch [28/51], Iter [90/391] Loss: 0.3326
Epoch [28/51], Iter [100/391] Loss: 0.4124
Epoch [28/51], Iter [110/391] Loss: 0.3804
Epoch [28/51], Iter [120/391] Loss: 0.3871
Epoch [28/51], Iter [130/391] Loss: 0.3648
Epoch [28/51], Iter [140/391] Loss: 0.4007
Epoch [28/51], Iter [150/391] Loss: 0.3650
Epoch [28/51], Iter [160/391] Loss: 0.3989
Epoch [28/51], Iter [170/391] Loss: 0.3555
Epoch [28/51], Iter [180/391] Loss: 0.4057
Epoch [28/51], Iter [190/391] Loss: 0.3854
Epoch [28/51], Iter [200/391] Loss: 0.3943
Epoch [28/51], Iter [210/391] Loss: 0.3770
Epoch [28/51], Iter [220/391] Loss: 0.3880
Epoch [28/51], Iter [230/391] Loss: 0.3744
Epoch [28/51], Iter [240/391] Loss: 0.3889
Epoch [28/51], Iter [250/391] Loss: 0.3835
Epoch [28/51], Iter [260/391] Loss: 0.3870
Epoch [28/51], Iter [270/391] Loss: 0.3865
Epoch [28/51], Iter [280/391] Loss: 0.3753
Epoch [28/51], Iter [290/391] Loss: 0.4073
Epoch [28/51], Iter [300/391] Loss: 0.3739
Epoch [28/51], Iter [310/391] Loss: 0.4100
Epoch [28/51], Iter [320/391] Loss: 0.3895
Epoch [28/51], Iter [330/391] Loss: 0.3767
Epoch [28/51], Iter [340/391] Loss: 0.3821
Epoch [28/51], Iter [350/391] Loss: 0.4149
Epoch [28/51], Iter [360/391] Loss: 0.3767
Epoch [28/51], Iter [370/391] Loss: 0.3910
Epoch [28/51], Iter [380/391] Loss: 0.4254
Epoch [28/51], Iter [390/391] Loss: 0.3879
Epoch [29/51], Iter [10/391] Loss: 0.3625
Epoch [29/51], Iter [20/391] Loss: 0.3787
Epoch [29/51], Iter [30/391] Loss: 0.3996
Epoch [29/51], Iter [40/391] Loss: 0.3664
Epoch [29/51], Iter [50/391] Loss: 0.3613
Epoch [29/51], Iter [60/391] Loss: 0.4025
Epoch [29/51], Iter [70/391] Loss: 0.3842
Epoch [29/51], Iter [80/391] Loss: 0.3793
Epoch [29/51], Iter [90/391] Loss: 0.3605
Epoch [29/51], Iter [100/391] Loss: 0.4003
Epoch [29/51], Iter [110/391] Loss: 0.4242
Epoch [29/51], Iter [120/391] Loss: 0.4074
Epoch [29/51], Iter [130/391] Loss: 0.3580
Epoch [29/51], Iter [140/391] Loss: 0.3875
Epoch [29/51], Iter [150/391] Loss: 0.3527
Epoch [29/51], Iter [160/391] Loss: 0.3778
Epoch [29/51], Iter [170/391] Loss: 0.4127
Epoch [29/51], Iter [180/391] Loss: 0.3851
Epoch [29/51], Iter [190/391] Loss: 0.3687
Epoch [29/51], Iter [200/391] Loss: 0.4320
Epoch [29/51], Iter [210/391] Loss: 0.3836
Epoch [29/51], Iter [220/391] Loss: 0.3731
Epoch [29/51], Iter [230/391] Loss: 0.3717
Epoch [29/51], Iter [240/391] Loss: 0.3952
Epoch [29/51], Iter [250/391] Loss: 0.4053
Epoch [29/51], Iter [260/391] Loss: 0.3678
Epoch [29/51], Iter [270/391] Loss: 0.3558
Epoch [29/51], Iter [280/391] Loss: 0.3497
Epoch [29/51], Iter [290/391] Loss: 0.3827
Epoch [29/51], Iter [300/391] Loss: 0.3923
Epoch [29/51], Iter [310/391] Loss: 0.3657
Epoch [29/51], Iter [320/391] Loss: 0.3484
Epoch [29/51], Iter [330/391] Loss: 0.3514
Epoch [29/51], Iter [340/391] Loss: 0.4066
Epoch [29/51], Iter [350/391] Loss: 0.4215
Epoch [29/51], Iter [360/391] Loss: 0.4237
Epoch [29/51], Iter [370/391] Loss: 0.4247
Epoch [29/51], Iter [380/391] Loss: 0.3753
Epoch [29/51], Iter [390/391] Loss: 0.4109
Epoch [30/51], Iter [10/391] Loss: 0.3855
Epoch [30/51], Iter [20/391] Loss: 0.3526
Epoch [30/51], Iter [30/391] Loss: 0.3965
Epoch [30/51], Iter [40/391] Loss: 0.3616
Epoch [30/51], Iter [50/391] Loss: 0.3699
Epoch [30/51], Iter [60/391] Loss: 0.4013
Epoch [30/51], Iter [70/391] Loss: 0.3844
Epoch [30/51], Iter [80/391] Loss: 0.4125
Epoch [30/51], Iter [90/391] Loss: 0.3881
Epoch [30/51], Iter [100/391] Loss: 0.4157
Epoch [30/51], Iter [110/391] Loss: 0.3825
Epoch [30/51], Iter [120/391] Loss: 0.3897
Epoch [30/51], Iter [130/391] Loss: 0.4172
Epoch [30/51], Iter [140/391] Loss: 0.3968
Epoch [30/51], Iter [150/391] Loss: 0.3834
Epoch [30/51], Iter [160/391] Loss: 0.3967
Epoch [30/51], Iter [170/391] Loss: 0.3850
Epoch [30/51], Iter [180/391] Loss: 0.3492
Epoch [30/51], Iter [190/391] Loss: 0.3772
Epoch [30/51], Iter [200/391] Loss: 0.3708
Epoch [30/51], Iter [210/391] Loss: 0.3511
Epoch [30/51], Iter [220/391] Loss: 0.3788
Epoch [30/51], Iter [230/391] Loss: 0.3597
Epoch [30/51], Iter [240/391] Loss: 0.3759
Epoch [30/51], Iter [250/391] Loss: 0.3893
Epoch [30/51], Iter [260/391] Loss: 0.3934
Epoch [30/51], Iter [270/391] Loss: 0.3732
Epoch [30/51], Iter [280/391] Loss: 0.3845
Epoch [30/51], Iter [290/391] Loss: 0.3934
Epoch [30/51], Iter [300/391] Loss: 0.3827
Epoch [30/51], Iter [310/391] Loss: 0.4145
Epoch [30/51], Iter [320/391] Loss: 0.3722
Epoch [30/51], Iter [330/391] Loss: 0.3691
Epoch [30/51], Iter [340/391] Loss: 0.4139
Epoch [30/51], Iter [350/391] Loss: 0.3580
Epoch [30/51], Iter [360/391] Loss: 0.3733
Epoch [30/51], Iter [370/391] Loss: 0.3933
Epoch [30/51], Iter [380/391] Loss: 0.4014
Epoch [30/51], Iter [390/391] Loss: 0.3966
[Saving Checkpoint]
Epoch [31/51], Iter [10/391] Loss: 0.3876
Epoch [31/51], Iter [20/391] Loss: 0.3503
Epoch [31/51], Iter [30/391] Loss: 0.3473
Epoch [31/51], Iter [40/391] Loss: 0.3483
Epoch [31/51], Iter [50/391] Loss: 0.3673
Epoch [31/51], Iter [60/391] Loss: 0.3542
Epoch [31/51], Iter [70/391] Loss: 0.3726
Epoch [31/51], Iter [80/391] Loss: 0.3831
Epoch [31/51], Iter [90/391] Loss: 0.3646
Epoch [31/51], Iter [100/391] Loss: 0.3727
Epoch [31/51], Iter [110/391] Loss: 0.3925
Epoch [31/51], Iter [120/391] Loss: 0.3731
Epoch [31/51], Iter [130/391] Loss: 0.3899
Epoch [31/51], Iter [140/391] Loss: 0.3632
Epoch [31/51], Iter [150/391] Loss: 0.3874
Epoch [31/51], Iter [160/391] Loss: 0.3555
Epoch [31/51], Iter [170/391] Loss: 0.3922
Epoch [31/51], Iter [180/391] Loss: 0.4039
Epoch [31/51], Iter [190/391] Loss: 0.3573
Epoch [31/51], Iter [200/391] Loss: 0.3980
Epoch [31/51], Iter [210/391] Loss: 0.3699
Epoch [31/51], Iter [220/391] Loss: 0.3861
Epoch [31/51], Iter [230/391] Loss: 0.3569
Epoch [31/51], Iter [240/391] Loss: 0.3576
Epoch [31/51], Iter [250/391] Loss: 0.3757
Epoch [31/51], Iter [260/391] Loss: 0.3710
Epoch [31/51], Iter [270/391] Loss: 0.3641
Epoch [31/51], Iter [280/391] Loss: 0.3682
Epoch [31/51], Iter [290/391] Loss: 0.3615
Epoch [31/51], Iter [300/391] Loss: 0.3550
Epoch [31/51], Iter [310/391] Loss: 0.3613
Epoch [31/51], Iter [320/391] Loss: 0.3775
Epoch [31/51], Iter [330/391] Loss: 0.4165
Epoch [31/51], Iter [340/391] Loss: 0.3667
Epoch [31/51], Iter [350/391] Loss: 0.3697
Epoch [31/51], Iter [360/391] Loss: 0.3941
Epoch [31/51], Iter [370/391] Loss: 0.3718
Epoch [31/51], Iter [380/391] Loss: 0.3798
Epoch [31/51], Iter [390/391] Loss: 0.3864
Epoch [32/51], Iter [10/391] Loss: 0.3441
Epoch [32/51], Iter [20/391] Loss: 0.3713
Epoch [32/51], Iter [30/391] Loss: 0.3821
Epoch [32/51], Iter [40/391] Loss: 0.3409
Epoch [32/51], Iter [50/391] Loss: 0.3569
Epoch [32/51], Iter [60/391] Loss: 0.3983
Epoch [32/51], Iter [70/391] Loss: 0.3807
Epoch [32/51], Iter [80/391] Loss: 0.3563
Epoch [32/51], Iter [90/391] Loss: 0.3937
Epoch [32/51], Iter [100/391] Loss: 0.3916
Epoch [32/51], Iter [110/391] Loss: 0.3298
Epoch [32/51], Iter [120/391] Loss: 0.3805
Epoch [32/51], Iter [130/391] Loss: 0.3659
Epoch [32/51], Iter [140/391] Loss: 0.3743
Epoch [32/51], Iter [150/391] Loss: 0.3746
Epoch [32/51], Iter [160/391] Loss: 0.3782
Epoch [32/51], Iter [170/391] Loss: 0.3476
Epoch [32/51], Iter [180/391] Loss: 0.3860
Epoch [32/51], Iter [190/391] Loss: 0.3501
Epoch [32/51], Iter [200/391] Loss: 0.3539
Epoch [32/51], Iter [210/391] Loss: 0.3798
Epoch [32/51], Iter [220/391] Loss: 0.3268
Epoch [32/51], Iter [230/391] Loss: 0.3529
Epoch [32/51], Iter [240/391] Loss: 0.3999
Epoch [32/51], Iter [250/391] Loss: 0.3667
Epoch [32/51], Iter [260/391] Loss: 0.4027
Epoch [32/51], Iter [270/391] Loss: 0.3780
Epoch [32/51], Iter [280/391] Loss: 0.3444
Epoch [32/51], Iter [290/391] Loss: 0.3626
Epoch [32/51], Iter [300/391] Loss: 0.3735
Epoch [32/51], Iter [310/391] Loss: 0.3930
Epoch [32/51], Iter [320/391] Loss: 0.3726
Epoch [32/51], Iter [330/391] Loss: 0.3787
Epoch [32/51], Iter [340/391] Loss: 0.3924
Epoch [32/51], Iter [350/391] Loss: 0.3722
Epoch [32/51], Iter [360/391] Loss: 0.3752
Epoch [32/51], Iter [370/391] Loss: 0.3436
Epoch [32/51], Iter [380/391] Loss: 0.3762
Epoch [32/51], Iter [390/391] Loss: 0.3600
Epoch [33/51], Iter [10/391] Loss: 0.3681
Epoch [33/51], Iter [20/391] Loss: 0.3733
Epoch [33/51], Iter [30/391] Loss: 0.3428
Epoch [33/51], Iter [40/391] Loss: 0.3817
Epoch [33/51], Iter [50/391] Loss: 0.3436
Epoch [33/51], Iter [60/391] Loss: 0.3558
Epoch [33/51], Iter [70/391] Loss: 0.3611
Epoch [33/51], Iter [80/391] Loss: 0.3839
Epoch [33/51], Iter [90/391] Loss: 0.3587
Epoch [33/51], Iter [100/391] Loss: 0.3655
Epoch [33/51], Iter [110/391] Loss: 0.3865
Epoch [33/51], Iter [120/391] Loss: 0.3731
Epoch [33/51], Iter [130/391] Loss: 0.3956
Epoch [33/51], Iter [140/391] Loss: 0.3567
Epoch [33/51], Iter [150/391] Loss: 0.3527
Epoch [33/51], Iter [160/391] Loss: 0.3504
Epoch [33/51], Iter [170/391] Loss: 0.3557
Epoch [33/51], Iter [180/391] Loss: 0.3502
Epoch [33/51], Iter [190/391] Loss: 0.3477
Epoch [33/51], Iter [200/391] Loss: 0.3652
Epoch [33/51], Iter [210/391] Loss: 0.3394
Epoch [33/51], Iter [220/391] Loss: 0.3509
Epoch [33/51], Iter [230/391] Loss: 0.3665
Epoch [33/51], Iter [240/391] Loss: 0.3528
Epoch [33/51], Iter [250/391] Loss: 0.3619
Epoch [33/51], Iter [260/391] Loss: 0.3894
Epoch [33/51], Iter [270/391] Loss: 0.3833
Epoch [33/51], Iter [280/391] Loss: 0.3711
Epoch [33/51], Iter [290/391] Loss: 0.3567
Epoch [33/51], Iter [300/391] Loss: 0.3647
Epoch [33/51], Iter [310/391] Loss: 0.3745
Epoch [33/51], Iter [320/391] Loss: 0.3332
Epoch [33/51], Iter [330/391] Loss: 0.3579
Epoch [33/51], Iter [340/391] Loss: 0.3945
Epoch [33/51], Iter [350/391] Loss: 0.3810
Epoch [33/51], Iter [360/391] Loss: 0.3523
Epoch [33/51], Iter [370/391] Loss: 0.3713
Epoch [33/51], Iter [380/391] Loss: 0.3746
Epoch [33/51], Iter [390/391] Loss: 0.3376
Epoch [34/51], Iter [10/391] Loss: 0.3549
Epoch [34/51], Iter [20/391] Loss: 0.3483
Epoch [34/51], Iter [30/391] Loss: 0.3545
Epoch [34/51], Iter [40/391] Loss: 0.3803
Epoch [34/51], Iter [50/391] Loss: 0.3746
Epoch [34/51], Iter [60/391] Loss: 0.3328
Epoch [34/51], Iter [70/391] Loss: 0.3690
Epoch [34/51], Iter [80/391] Loss: 0.3540
Epoch [34/51], Iter [90/391] Loss: 0.3405
Epoch [34/51], Iter [100/391] Loss: 0.3614
Epoch [34/51], Iter [110/391] Loss: 0.3886
Epoch [34/51], Iter [120/391] Loss: 0.3598
Epoch [34/51], Iter [130/391] Loss: 0.3369
Epoch [34/51], Iter [140/391] Loss: 0.3441
Epoch [34/51], Iter [150/391] Loss: 0.3479
Epoch [34/51], Iter [160/391] Loss: 0.3565
Epoch [34/51], Iter [170/391] Loss: 0.3349
Epoch [34/51], Iter [180/391] Loss: 0.3819
Epoch [34/51], Iter [190/391] Loss: 0.3626
Epoch [34/51], Iter [200/391] Loss: 0.3470
Epoch [34/51], Iter [210/391] Loss: 0.3566
Epoch [34/51], Iter [220/391] Loss: 0.3558
Epoch [34/51], Iter [230/391] Loss: 0.3556
Epoch [34/51], Iter [240/391] Loss: 0.3674
Epoch [34/51], Iter [250/391] Loss: 0.4096
Epoch [34/51], Iter [260/391] Loss: 0.3886
Epoch [34/51], Iter [270/391] Loss: 0.3622
Epoch [34/51], Iter [280/391] Loss: 0.3672
Epoch [34/51], Iter [290/391] Loss: 0.3685
Epoch [34/51], Iter [300/391] Loss: 0.3539
Epoch [34/51], Iter [310/391] Loss: 0.3531
Epoch [34/51], Iter [320/391] Loss: 0.3966
Epoch [34/51], Iter [330/391] Loss: 0.3928
Epoch [34/51], Iter [340/391] Loss: 0.3712
Epoch [34/51], Iter [350/391] Loss: 0.3862
Epoch [34/51], Iter [360/391] Loss: 0.3723
Epoch [34/51], Iter [370/391] Loss: 0.3922
Epoch [34/51], Iter [380/391] Loss: 0.3599
Epoch [34/51], Iter [390/391] Loss: 0.3408
Epoch [35/51], Iter [10/391] Loss: 0.3447
Epoch [35/51], Iter [20/391] Loss: 0.3605
Epoch [35/51], Iter [30/391] Loss: 0.3541
Epoch [35/51], Iter [40/391] Loss: 0.3528
Epoch [35/51], Iter [50/391] Loss: 0.3513
Epoch [35/51], Iter [60/391] Loss: 0.3748
Epoch [35/51], Iter [70/391] Loss: 0.3607
Epoch [35/51], Iter [80/391] Loss: 0.3595
Epoch [35/51], Iter [90/391] Loss: 0.3591
Epoch [35/51], Iter [100/391] Loss: 0.3750
Epoch [35/51], Iter [110/391] Loss: 0.3655
Epoch [35/51], Iter [120/391] Loss: 0.3404
Epoch [35/51], Iter [130/391] Loss: 0.3417
Epoch [35/51], Iter [140/391] Loss: 0.3553
Epoch [35/51], Iter [150/391] Loss: 0.3802
Epoch [35/51], Iter [160/391] Loss: 0.3644
Epoch [35/51], Iter [170/391] Loss: 0.3804
Epoch [35/51], Iter [180/391] Loss: 0.3366
Epoch [35/51], Iter [190/391] Loss: 0.3552
Epoch [35/51], Iter [200/391] Loss: 0.3677
Epoch [35/51], Iter [210/391] Loss: 0.3391
Epoch [35/51], Iter [220/391] Loss: 0.3708
Epoch [35/51], Iter [230/391] Loss: 0.3750
Epoch [35/51], Iter [240/391] Loss: 0.3623
Epoch [35/51], Iter [250/391] Loss: 0.3703
Epoch [35/51], Iter [260/391] Loss: 0.3366
Epoch [35/51], Iter [270/391] Loss: 0.3342
Epoch [35/51], Iter [280/391] Loss: 0.3669
Epoch [35/51], Iter [290/391] Loss: 0.3678
Epoch [35/51], Iter [300/391] Loss: 0.3638
Epoch [35/51], Iter [310/391] Loss: 0.4078
Epoch [35/51], Iter [320/391] Loss: 0.3774
Epoch [35/51], Iter [330/391] Loss: 0.3723
Epoch [35/51], Iter [340/391] Loss: 0.3597
Epoch [35/51], Iter [350/391] Loss: 0.3786
Epoch [35/51], Iter [360/391] Loss: 0.3545
Epoch [35/51], Iter [370/391] Loss: 0.3670
Epoch [35/51], Iter [380/391] Loss: 0.3807
Epoch [35/51], Iter [390/391] Loss: 0.3666
Epoch [36/51], Iter [10/391] Loss: 0.3354
Epoch [36/51], Iter [20/391] Loss: 0.3324
Epoch [36/51], Iter [30/391] Loss: 0.3522
Epoch [36/51], Iter [40/391] Loss: 0.3508
Epoch [36/51], Iter [50/391] Loss: 0.3569
Epoch [36/51], Iter [60/391] Loss: 0.3355
Epoch [36/51], Iter [70/391] Loss: 0.3507
Epoch [36/51], Iter [80/391] Loss: 0.3650
Epoch [36/51], Iter [90/391] Loss: 0.3450
Epoch [36/51], Iter [100/391] Loss: 0.3553
Epoch [36/51], Iter [110/391] Loss: 0.3532
Epoch [36/51], Iter [120/391] Loss: 0.3864
Epoch [36/51], Iter [130/391] Loss: 0.3438
Epoch [36/51], Iter [140/391] Loss: 0.3427
Epoch [36/51], Iter [150/391] Loss: 0.3548
Epoch [36/51], Iter [160/391] Loss: 0.3449
Epoch [36/51], Iter [170/391] Loss: 0.3582
Epoch [36/51], Iter [180/391] Loss: 0.3507
Epoch [36/51], Iter [190/391] Loss: 0.3856
Epoch [36/51], Iter [200/391] Loss: 0.3366
Epoch [36/51], Iter [210/391] Loss: 0.3462
Epoch [36/51], Iter [220/391] Loss: 0.3748
Epoch [36/51], Iter [230/391] Loss: 0.3580
Epoch [36/51], Iter [240/391] Loss: 0.3633
Epoch [36/51], Iter [250/391] Loss: 0.3659
Epoch [36/51], Iter [260/391] Loss: 0.3747
Epoch [36/51], Iter [270/391] Loss: 0.3558
Epoch [36/51], Iter [280/391] Loss: 0.3640
Epoch [36/51], Iter [290/391] Loss: 0.3560
Epoch [36/51], Iter [300/391] Loss: 0.3581
Epoch [36/51], Iter [310/391] Loss: 0.3542
Epoch [36/51], Iter [320/391] Loss: 0.3746
Epoch [36/51], Iter [330/391] Loss: 0.3694
Epoch [36/51], Iter [340/391] Loss: 0.3805
Epoch [36/51], Iter [350/391] Loss: 0.3517
Epoch [36/51], Iter [360/391] Loss: 0.3516
Epoch [36/51], Iter [370/391] Loss: 0.3824
Epoch [36/51], Iter [380/391] Loss: 0.3964
Epoch [36/51], Iter [390/391] Loss: 0.3699
Epoch [37/51], Iter [10/391] Loss: 0.3512
Epoch [37/51], Iter [20/391] Loss: 0.3535
Epoch [37/51], Iter [30/391] Loss: 0.3368
Epoch [37/51], Iter [40/391] Loss: 0.3474
Epoch [37/51], Iter [50/391] Loss: 0.3352
Epoch [37/51], Iter [60/391] Loss: 0.3177
Epoch [37/51], Iter [70/391] Loss: 0.3827
Epoch [37/51], Iter [80/391] Loss: 0.3348
Epoch [37/51], Iter [90/391] Loss: 0.3684
Epoch [37/51], Iter [100/391] Loss: 0.3669
Epoch [37/51], Iter [110/391] Loss: 0.3625
Epoch [37/51], Iter [120/391] Loss: 0.3401
Epoch [37/51], Iter [130/391] Loss: 0.3390
Epoch [37/51], Iter [140/391] Loss: 0.3520
Epoch [37/51], Iter [150/391] Loss: 0.3307
Epoch [37/51], Iter [160/391] Loss: 0.3765
Epoch [37/51], Iter [170/391] Loss: 0.4031
Epoch [37/51], Iter [180/391] Loss: 0.3663
Epoch [37/51], Iter [190/391] Loss: 0.3763
Epoch [37/51], Iter [200/391] Loss: 0.3676
Epoch [37/51], Iter [210/391] Loss: 0.3263
Epoch [37/51], Iter [220/391] Loss: 0.3779
Epoch [37/51], Iter [230/391] Loss: 0.3822
Epoch [37/51], Iter [240/391] Loss: 0.3790
Epoch [37/51], Iter [250/391] Loss: 0.3636
Epoch [37/51], Iter [260/391] Loss: 0.3966
Epoch [37/51], Iter [270/391] Loss: 0.3595
Epoch [37/51], Iter [280/391] Loss: 0.3403
Epoch [37/51], Iter [290/391] Loss: 0.3553
Epoch [37/51], Iter [300/391] Loss: 0.3897
Epoch [37/51], Iter [310/391] Loss: 0.3392
Epoch [37/51], Iter [320/391] Loss: 0.3688
Epoch [37/51], Iter [330/391] Loss: 0.3268
Epoch [37/51], Iter [340/391] Loss: 0.3581
Epoch [37/51], Iter [350/391] Loss: 0.3703
Epoch [37/51], Iter [360/391] Loss: 0.3451
Epoch [37/51], Iter [370/391] Loss: 0.3647
Epoch [37/51], Iter [380/391] Loss: 0.3706
Epoch [37/51], Iter [390/391] Loss: 0.3761
Epoch [38/51], Iter [10/391] Loss: 0.3380
Epoch [38/51], Iter [20/391] Loss: 0.3677
Epoch [38/51], Iter [30/391] Loss: 0.3525
Epoch [38/51], Iter [40/391] Loss: 0.3663
Epoch [38/51], Iter [50/391] Loss: 0.3355
Epoch [38/51], Iter [60/391] Loss: 0.3817
Epoch [38/51], Iter [70/391] Loss: 0.3369
Epoch [38/51], Iter [80/391] Loss: 0.3699
Epoch [38/51], Iter [90/391] Loss: 0.3342
Epoch [38/51], Iter [100/391] Loss: 0.3628
Epoch [38/51], Iter [110/391] Loss: 0.3363
Epoch [38/51], Iter [120/391] Loss: 0.3624
Epoch [38/51], Iter [130/391] Loss: 0.3297
Epoch [38/51], Iter [140/391] Loss: 0.3537
Epoch [38/51], Iter [150/391] Loss: 0.3621
Epoch [38/51], Iter [160/391] Loss: 0.3158
Epoch [38/51], Iter [170/391] Loss: 0.3413
Epoch [38/51], Iter [180/391] Loss: 0.3427
Epoch [38/51], Iter [190/391] Loss: 0.3686
Epoch [38/51], Iter [200/391] Loss: 0.3592
Epoch [38/51], Iter [210/391] Loss: 0.3702
Epoch [38/51], Iter [220/391] Loss: 0.3756
Epoch [38/51], Iter [230/391] Loss: 0.3608
Epoch [38/51], Iter [240/391] Loss: 0.3466
Epoch [38/51], Iter [250/391] Loss: 0.3724
Epoch [38/51], Iter [260/391] Loss: 0.3617
Epoch [38/51], Iter [270/391] Loss: 0.3413
Epoch [38/51], Iter [280/391] Loss: 0.3321
Epoch [38/51], Iter [290/391] Loss: 0.3476
Epoch [38/51], Iter [300/391] Loss: 0.3463
Epoch [38/51], Iter [310/391] Loss: 0.3300
Epoch [38/51], Iter [320/391] Loss: 0.3310
Epoch [38/51], Iter [330/391] Loss: 0.3886
Epoch [38/51], Iter [340/391] Loss: 0.3613
Epoch [38/51], Iter [350/391] Loss: 0.3867
Epoch [38/51], Iter [360/391] Loss: 0.3566
Epoch [38/51], Iter [370/391] Loss: 0.3589
Epoch [38/51], Iter [380/391] Loss: 0.3422
Epoch [38/51], Iter [390/391] Loss: 0.3625
Epoch [39/51], Iter [10/391] Loss: 0.3532
Epoch [39/51], Iter [20/391] Loss: 0.3548
Epoch [39/51], Iter [30/391] Loss: 0.3239
Epoch [39/51], Iter [40/391] Loss: 0.3418
Epoch [39/51], Iter [50/391] Loss: 0.3398
Epoch [39/51], Iter [60/391] Loss: 0.3509
Epoch [39/51], Iter [70/391] Loss: 0.3554
Epoch [39/51], Iter [80/391] Loss: 0.3460
Epoch [39/51], Iter [90/391] Loss: 0.3452
Epoch [39/51], Iter [100/391] Loss: 0.3636
Epoch [39/51], Iter [110/391] Loss: 0.3406
Epoch [39/51], Iter [120/391] Loss: 0.3422
Epoch [39/51], Iter [130/391] Loss: 0.3590
Epoch [39/51], Iter [140/391] Loss: 0.3936
Epoch [39/51], Iter [150/391] Loss: 0.3372
Epoch [39/51], Iter [160/391] Loss: 0.3619
Epoch [39/51], Iter [170/391] Loss: 0.3776
Epoch [39/51], Iter [180/391] Loss: 0.3314
Epoch [39/51], Iter [190/391] Loss: 0.3635
Epoch [39/51], Iter [200/391] Loss: 0.3651
Epoch [39/51], Iter [210/391] Loss: 0.3529
Epoch [39/51], Iter [220/391] Loss: 0.3429
Epoch [39/51], Iter [230/391] Loss: 0.3416
Epoch [39/51], Iter [240/391] Loss: 0.3675
Epoch [39/51], Iter [250/391] Loss: 0.3377
Epoch [39/51], Iter [260/391] Loss: 0.3640
Epoch [39/51], Iter [270/391] Loss: 0.3554
Epoch [39/51], Iter [280/391] Loss: 0.3682
Epoch [39/51], Iter [290/391] Loss: 0.3554
Epoch [39/51], Iter [300/391] Loss: 0.3480
Epoch [39/51], Iter [310/391] Loss: 0.3320
Epoch [39/51], Iter [320/391] Loss: 0.3399
Epoch [39/51], Iter [330/391] Loss: 0.3538
Epoch [39/51], Iter [340/391] Loss: 0.3651
Epoch [39/51], Iter [350/391] Loss: 0.3509
Epoch [39/51], Iter [360/391] Loss: 0.3303
Epoch [39/51], Iter [370/391] Loss: 0.3540
Epoch [39/51], Iter [380/391] Loss: 0.3645
Epoch [39/51], Iter [390/391] Loss: 0.3662
Epoch [40/51], Iter [10/391] Loss: 0.3187
Epoch [40/51], Iter [20/391] Loss: 0.3274
Epoch [40/51], Iter [30/391] Loss: 0.3496
Epoch [40/51], Iter [40/391] Loss: 0.3350
Epoch [40/51], Iter [50/391] Loss: 0.3436
Epoch [40/51], Iter [60/391] Loss: 0.3582
Epoch [40/51], Iter [70/391] Loss: 0.3548
Epoch [40/51], Iter [80/391] Loss: 0.3282
Epoch [40/51], Iter [90/391] Loss: 0.3329
Epoch [40/51], Iter [100/391] Loss: 0.3280
Epoch [40/51], Iter [110/391] Loss: 0.3550
Epoch [40/51], Iter [120/391] Loss: 0.3631
Epoch [40/51], Iter [130/391] Loss: 0.3521
Epoch [40/51], Iter [140/391] Loss: 0.3248
Epoch [40/51], Iter [150/391] Loss: 0.3414
Epoch [40/51], Iter [160/391] Loss: 0.3376
Epoch [40/51], Iter [170/391] Loss: 0.3315
Epoch [40/51], Iter [180/391] Loss: 0.3464
Epoch [40/51], Iter [190/391] Loss: 0.3491
Epoch [40/51], Iter [200/391] Loss: 0.3440
Epoch [40/51], Iter [210/391] Loss: 0.3446
Epoch [40/51], Iter [220/391] Loss: 0.3604
Epoch [40/51], Iter [230/391] Loss: 0.3386
Epoch [40/51], Iter [240/391] Loss: 0.3348
Epoch [40/51], Iter [250/391] Loss: 0.3295
Epoch [40/51], Iter [260/391] Loss: 0.3312
Epoch [40/51], Iter [270/391] Loss: 0.3798
Epoch [40/51], Iter [280/391] Loss: 0.3306
Epoch [40/51], Iter [290/391] Loss: 0.3150
Epoch [40/51], Iter [300/391] Loss: 0.3815
Epoch [40/51], Iter [310/391] Loss: 0.3442
Epoch [40/51], Iter [320/391] Loss: 0.3441
Epoch [40/51], Iter [330/391] Loss: 0.3574
Epoch [40/51], Iter [340/391] Loss: 0.3524
Epoch [40/51], Iter [350/391] Loss: 0.3690
Epoch [40/51], Iter [360/391] Loss: 0.3478
Epoch [40/51], Iter [370/391] Loss: 0.3381
Epoch [40/51], Iter [380/391] Loss: 0.3504
Epoch [40/51], Iter [390/391] Loss: 0.3697
[Saving Checkpoint]
Epoch [41/51], Iter [10/391] Loss: 0.3521
Epoch [41/51], Iter [20/391] Loss: 0.3411
Epoch [41/51], Iter [30/391] Loss: 0.3372
Epoch [41/51], Iter [40/391] Loss: 0.3329
Epoch [41/51], Iter [50/391] Loss: 0.2948
Epoch [41/51], Iter [60/391] Loss: 0.3573
Epoch [41/51], Iter [70/391] Loss: 0.3410
Epoch [41/51], Iter [80/391] Loss: 0.3455
Epoch [41/51], Iter [90/391] Loss: 0.3569
Epoch [41/51], Iter [100/391] Loss: 0.3220
Epoch [41/51], Iter [110/391] Loss: 0.3402
Epoch [41/51], Iter [120/391] Loss: 0.3415
Epoch [41/51], Iter [130/391] Loss: 0.3236
Epoch [41/51], Iter [140/391] Loss: 0.3567
Epoch [41/51], Iter [150/391] Loss: 0.3559
Epoch [41/51], Iter [160/391] Loss: 0.3368
Epoch [41/51], Iter [170/391] Loss: 0.3674
Epoch [41/51], Iter [180/391] Loss: 0.3383
Epoch [41/51], Iter [190/391] Loss: 0.3576
Epoch [41/51], Iter [200/391] Loss: 0.3523
Epoch [41/51], Iter [210/391] Loss: 0.3352
Epoch [41/51], Iter [220/391] Loss: 0.3330
Epoch [41/51], Iter [230/391] Loss: 0.3406
Epoch [41/51], Iter [240/391] Loss: 0.3445
Epoch [41/51], Iter [250/391] Loss: 0.3769
Epoch [41/51], Iter [260/391] Loss: 0.3499
Epoch [41/51], Iter [270/391] Loss: 0.3379
Epoch [41/51], Iter [280/391] Loss: 0.3199
Epoch [41/51], Iter [290/391] Loss: 0.3680
Epoch [41/51], Iter [300/391] Loss: 0.3332
Epoch [41/51], Iter [310/391] Loss: 0.3235
Epoch [41/51], Iter [320/391] Loss: 0.3227
Epoch [41/51], Iter [330/391] Loss: 0.3681
Epoch [41/51], Iter [340/391] Loss: 0.3554
Epoch [41/51], Iter [350/391] Loss: 0.3672
Epoch [41/51], Iter [360/391] Loss: 0.3533
Epoch [41/51], Iter [370/391] Loss: 0.3330
Epoch [41/51], Iter [380/391] Loss: 0.3565
Epoch [41/51], Iter [390/391] Loss: 0.3462
Epoch [42/51], Iter [10/391] Loss: 0.3168
Epoch [42/51], Iter [20/391] Loss: 0.3356
Epoch [42/51], Iter [30/391] Loss: 0.3429
Epoch [42/51], Iter [40/391] Loss: 0.3257
Epoch [42/51], Iter [50/391] Loss: 0.3628
Epoch [42/51], Iter [60/391] Loss: 0.3272
Epoch [42/51], Iter [70/391] Loss: 0.3369
Epoch [42/51], Iter [80/391] Loss: 0.3540
Epoch [42/51], Iter [90/391] Loss: 0.3598
Epoch [42/51], Iter [100/391] Loss: 0.3546
Epoch [42/51], Iter [110/391] Loss: 0.3154
Epoch [42/51], Iter [120/391] Loss: 0.3523
Epoch [42/51], Iter [130/391] Loss: 0.3166
Epoch [42/51], Iter [140/391] Loss: 0.3569
Epoch [42/51], Iter [150/391] Loss: 0.3241
Epoch [42/51], Iter [160/391] Loss: 0.3381
Epoch [42/51], Iter [170/391] Loss: 0.3958
Epoch [42/51], Iter [180/391] Loss: 0.3433
Epoch [42/51], Iter [190/391] Loss: 0.3475
Epoch [42/51], Iter [200/391] Loss: 0.3288
Epoch [42/51], Iter [210/391] Loss: 0.3810
Epoch [42/51], Iter [220/391] Loss: 0.3763
Epoch [42/51], Iter [230/391] Loss: 0.3505
Epoch [42/51], Iter [240/391] Loss: 0.3495
Epoch [42/51], Iter [250/391] Loss: 0.3236
Epoch [42/51], Iter [260/391] Loss: 0.3302
Epoch [42/51], Iter [270/391] Loss: 0.3449
Epoch [42/51], Iter [280/391] Loss: 0.3435
Epoch [42/51], Iter [290/391] Loss: 0.3307
Epoch [42/51], Iter [300/391] Loss: 0.3726
Epoch [42/51], Iter [310/391] Loss: 0.3439
Epoch [42/51], Iter [320/391] Loss: 0.3219
Epoch [42/51], Iter [330/391] Loss: 0.3981
Epoch [42/51], Iter [340/391] Loss: 0.3393
Epoch [42/51], Iter [350/391] Loss: 0.3681
Epoch [42/51], Iter [360/391] Loss: 0.3557
Epoch [42/51], Iter [370/391] Loss: 0.3465
Epoch [42/51], Iter [380/391] Loss: 0.3631
Epoch [42/51], Iter [390/391] Loss: 0.3455
Epoch [43/51], Iter [10/391] Loss: 0.3447
Epoch [43/51], Iter [20/391] Loss: 0.3490
Epoch [43/51], Iter [30/391] Loss: 0.3518
Epoch [43/51], Iter [40/391] Loss: 0.3253
Epoch [43/51], Iter [50/391] Loss: 0.3487
Epoch [43/51], Iter [60/391] Loss: 0.3142
Epoch [43/51], Iter [70/391] Loss: 0.3402
Epoch [43/51], Iter [80/391] Loss: 0.3356
Epoch [43/51], Iter [90/391] Loss: 0.3504
Epoch [43/51], Iter [100/391] Loss: 0.3460
Epoch [43/51], Iter [110/391] Loss: 0.3404
Epoch [43/51], Iter [120/391] Loss: 0.3173
Epoch [43/51], Iter [130/391] Loss: 0.3421
Epoch [43/51], Iter [140/391] Loss: 0.3494
Epoch [43/51], Iter [150/391] Loss: 0.3432
Epoch [43/51], Iter [160/391] Loss: 0.3306
Epoch [43/51], Iter [170/391] Loss: 0.3181
Epoch [43/51], Iter [180/391] Loss: 0.3379
Epoch [43/51], Iter [190/391] Loss: 0.3520
Epoch [43/51], Iter [200/391] Loss: 0.3569
Epoch [43/51], Iter [210/391] Loss: 0.3355
Epoch [43/51], Iter [220/391] Loss: 0.3543
Epoch [43/51], Iter [230/391] Loss: 0.3466
Epoch [43/51], Iter [240/391] Loss: 0.3351
Epoch [43/51], Iter [250/391] Loss: 0.3605
Epoch [43/51], Iter [260/391] Loss: 0.3449
Epoch [43/51], Iter [270/391] Loss: 0.3497
Epoch [43/51], Iter [280/391] Loss: 0.3221
Epoch [43/51], Iter [290/391] Loss: 0.3339
Epoch [43/51], Iter [300/391] Loss: 0.3601
Epoch [43/51], Iter [310/391] Loss: 0.3412
Epoch [43/51], Iter [320/391] Loss: 0.3268
Epoch [43/51], Iter [330/391] Loss: 0.3312
Epoch [43/51], Iter [340/391] Loss: 0.3573
Epoch [43/51], Iter [350/391] Loss: 0.3557
Epoch [43/51], Iter [360/391] Loss: 0.3514
Epoch [43/51], Iter [370/391] Loss: 0.3537
Epoch [43/51], Iter [380/391] Loss: 0.3605
Epoch [43/51], Iter [390/391] Loss: 0.3534
Epoch [44/51], Iter [10/391] Loss: 0.3208
Epoch [44/51], Iter [20/391] Loss: 0.3320
Epoch [44/51], Iter [30/391] Loss: 0.3621
Epoch [44/51], Iter [40/391] Loss: 0.3330
Epoch [44/51], Iter [50/391] Loss: 0.3368
Epoch [44/51], Iter [60/391] Loss: 0.3183
Epoch [44/51], Iter [70/391] Loss: 0.3569
Epoch [44/51], Iter [80/391] Loss: 0.3447
Epoch [44/51], Iter [90/391] Loss: 0.3317
Epoch [44/51], Iter [100/391] Loss: 0.3225
Epoch [44/51], Iter [110/391] Loss: 0.3296
Epoch [44/51], Iter [120/391] Loss: 0.3250
Epoch [44/51], Iter [130/391] Loss: 0.3365
Epoch [44/51], Iter [140/391] Loss: 0.3331
Epoch [44/51], Iter [150/391] Loss: 0.3305
Epoch [44/51], Iter [160/391] Loss: 0.3414
Epoch [44/51], Iter [170/391] Loss: 0.3391
Epoch [44/51], Iter [180/391] Loss: 0.3431
Epoch [44/51], Iter [190/391] Loss: 0.3411
Epoch [44/51], Iter [200/391] Loss: 0.3320
Epoch [44/51], Iter [210/391] Loss: 0.3538
Epoch [44/51], Iter [220/391] Loss: 0.3298
Epoch [44/51], Iter [230/391] Loss: 0.3207
Epoch [44/51], Iter [240/391] Loss: 0.3659
Epoch [44/51], Iter [250/391] Loss: 0.3516
Epoch [44/51], Iter [260/391] Loss: 0.3524
Epoch [44/51], Iter [270/391] Loss: 0.3465
Epoch [44/51], Iter [280/391] Loss: 0.3633
Epoch [44/51], Iter [290/391] Loss: 0.3467
Epoch [44/51], Iter [300/391] Loss: 0.3645
Epoch [44/51], Iter [310/391] Loss: 0.3558
Epoch [44/51], Iter [320/391] Loss: 0.3425
Epoch [44/51], Iter [330/391] Loss: 0.3443
Epoch [44/51], Iter [340/391] Loss: 0.3440
Epoch [44/51], Iter [350/391] Loss: 0.3513
Epoch [44/51], Iter [360/391] Loss: 0.3576
Epoch [44/51], Iter [370/391] Loss: 0.3517
Epoch [44/51], Iter [380/391] Loss: 0.3455
Epoch [44/51], Iter [390/391] Loss: 0.3370
Epoch [45/51], Iter [10/391] Loss: 0.3363
Epoch [45/51], Iter [20/391] Loss: 0.3506
Epoch [45/51], Iter [30/391] Loss: 0.3106
Epoch [45/51], Iter [40/391] Loss: 0.3334
Epoch [45/51], Iter [50/391] Loss: 0.3623
Epoch [45/51], Iter [60/391] Loss: 0.3629
Epoch [45/51], Iter [70/391] Loss: 0.3553
Epoch [45/51], Iter [80/391] Loss: 0.3041
Epoch [45/51], Iter [90/391] Loss: 0.3187
Epoch [45/51], Iter [100/391] Loss: 0.3367
Epoch [45/51], Iter [110/391] Loss: 0.3205
Epoch [45/51], Iter [120/391] Loss: 0.3441
Epoch [45/51], Iter [130/391] Loss: 0.3272
Epoch [45/51], Iter [140/391] Loss: 0.3471
Epoch [45/51], Iter [150/391] Loss: 0.3270
Epoch [45/51], Iter [160/391] Loss: 0.3410
Epoch [45/51], Iter [170/391] Loss: 0.3380
Epoch [45/51], Iter [180/391] Loss: 0.3258
Epoch [45/51], Iter [190/391] Loss: 0.3196
Epoch [45/51], Iter [200/391] Loss: 0.3282
Epoch [45/51], Iter [210/391] Loss: 0.3379
Epoch [45/51], Iter [220/391] Loss: 0.3356
Epoch [45/51], Iter [230/391] Loss: 0.3443
Epoch [45/51], Iter [240/391] Loss: 0.3494
Epoch [45/51], Iter [250/391] Loss: 0.3371
Epoch [45/51], Iter [260/391] Loss: 0.3155
Epoch [45/51], Iter [270/391] Loss: 0.3310
Epoch [45/51], Iter [280/391] Loss: 0.3658
Epoch [45/51], Iter [290/391] Loss: 0.3271
Epoch [45/51], Iter [300/391] Loss: 0.3225
Epoch [45/51], Iter [310/391] Loss: 0.3182
Epoch [45/51], Iter [320/391] Loss: 0.3241
Epoch [45/51], Iter [330/391] Loss: 0.3506
Epoch [45/51], Iter [340/391] Loss: 0.3652
Epoch [45/51], Iter [350/391] Loss: 0.3416
Epoch [45/51], Iter [360/391] Loss: 0.3494
Epoch [45/51], Iter [370/391] Loss: 0.3444
Epoch [45/51], Iter [380/391] Loss: 0.3319
Epoch [45/51], Iter [390/391] Loss: 0.3296
Epoch [46/51], Iter [10/391] Loss: 0.3380
Epoch [46/51], Iter [20/391] Loss: 0.3263
Epoch [46/51], Iter [30/391] Loss: 0.3225
Epoch [46/51], Iter [40/391] Loss: 0.3503
Epoch [46/51], Iter [50/391] Loss: 0.3219
Epoch [46/51], Iter [60/391] Loss: 0.3313
Epoch [46/51], Iter [70/391] Loss: 0.3296
Epoch [46/51], Iter [80/391] Loss: 0.3239
Epoch [46/51], Iter [90/391] Loss: 0.3354
Epoch [46/51], Iter [100/391] Loss: 0.3374
Epoch [46/51], Iter [110/391] Loss: 0.3151
Epoch [46/51], Iter [120/391] Loss: 0.3240
Epoch [46/51], Iter [130/391] Loss: 0.3254
Epoch [46/51], Iter [140/391] Loss: 0.3476
Epoch [46/51], Iter [150/391] Loss: 0.3462
Epoch [46/51], Iter [160/391] Loss: 0.3511
Epoch [46/51], Iter [170/391] Loss: 0.3318
Epoch [46/51], Iter [180/391] Loss: 0.3275
Epoch [46/51], Iter [190/391] Loss: 0.3397
Epoch [46/51], Iter [200/391] Loss: 0.3345
Epoch [46/51], Iter [210/391] Loss: 0.3652
Epoch [46/51], Iter [220/391] Loss: 0.3355
Epoch [46/51], Iter [230/391] Loss: 0.3367
Epoch [46/51], Iter [240/391] Loss: 0.3439
Epoch [46/51], Iter [250/391] Loss: 0.3457
Epoch [46/51], Iter [260/391] Loss: 0.3387
Epoch [46/51], Iter [270/391] Loss: 0.3282
Epoch [46/51], Iter [280/391] Loss: 0.3418
Epoch [46/51], Iter [290/391] Loss: 0.3535
Epoch [46/51], Iter [300/391] Loss: 0.3547
Epoch [46/51], Iter [310/391] Loss: 0.3491
Epoch [46/51], Iter [320/391] Loss: 0.3242
Epoch [46/51], Iter [330/391] Loss: 0.3425
Epoch [46/51], Iter [340/391] Loss: 0.3385
Epoch [46/51], Iter [350/391] Loss: 0.3371
Epoch [46/51], Iter [360/391] Loss: 0.3423
Epoch [46/51], Iter [370/391] Loss: 0.3398
Epoch [46/51], Iter [380/391] Loss: 0.3535
Epoch [46/51], Iter [390/391] Loss: 0.3535
Epoch [47/51], Iter [10/391] Loss: 0.3267
Epoch [47/51], Iter [20/391] Loss: 0.3309
Epoch [47/51], Iter [30/391] Loss: 0.3442
Epoch [47/51], Iter [40/391] Loss: 0.3225
Epoch [47/51], Iter [50/391] Loss: 0.3188
Epoch [47/51], Iter [60/391] Loss: 0.3463
Epoch [47/51], Iter [70/391] Loss: 0.3330
Epoch [47/51], Iter [80/391] Loss: 0.3096
Epoch [47/51], Iter [90/391] Loss: 0.3406
Epoch [47/51], Iter [100/391] Loss: 0.3091
Epoch [47/51], Iter [110/391] Loss: 0.3302
Epoch [47/51], Iter [120/391] Loss: 0.3216
Epoch [47/51], Iter [130/391] Loss: 0.3429
Epoch [47/51], Iter [140/391] Loss: 0.3517
Epoch [47/51], Iter [150/391] Loss: 0.3232
Epoch [47/51], Iter [160/391] Loss: 0.3244
Epoch [47/51], Iter [170/391] Loss: 0.3297
Epoch [47/51], Iter [180/391] Loss: 0.3622
Epoch [47/51], Iter [190/391] Loss: 0.3201
Epoch [47/51], Iter [200/391] Loss: 0.3370
Epoch [47/51], Iter [210/391] Loss: 0.3380
Epoch [47/51], Iter [220/391] Loss: 0.3328
Epoch [47/51], Iter [230/391] Loss: 0.3191
Epoch [47/51], Iter [240/391] Loss: 0.3565
Epoch [47/51], Iter [250/391] Loss: 0.3416
Epoch [47/51], Iter [260/391] Loss: 0.3377
Epoch [47/51], Iter [270/391] Loss: 0.3317
Epoch [47/51], Iter [280/391] Loss: 0.3409
Epoch [47/51], Iter [290/391] Loss: 0.3204
Epoch [47/51], Iter [300/391] Loss: 0.3294
Epoch [47/51], Iter [310/391] Loss: 0.3400
Epoch [47/51], Iter [320/391] Loss: 0.3452
Epoch [47/51], Iter [330/391] Loss: 0.3367
Epoch [47/51], Iter [340/391] Loss: 0.3560
Epoch [47/51], Iter [350/391] Loss: 0.3331
Epoch [47/51], Iter [360/391] Loss: 0.3370
Epoch [47/51], Iter [370/391] Loss: 0.3318
Epoch [47/51], Iter [380/391] Loss: 0.3535
Epoch [47/51], Iter [390/391] Loss: 0.3428
Epoch [48/51], Iter [10/391] Loss: 0.3465
Epoch [48/51], Iter [20/391] Loss: 0.3345
Epoch [48/51], Iter [30/391] Loss: 0.3346
Epoch [48/51], Iter [40/391] Loss: 0.3397
Epoch [48/51], Iter [50/391] Loss: 0.3235
Epoch [48/51], Iter [60/391] Loss: 0.3351
Epoch [48/51], Iter [70/391] Loss: 0.3255
Epoch [48/51], Iter [80/391] Loss: 0.3220
Epoch [48/51], Iter [90/391] Loss: 0.3327
Epoch [48/51], Iter [100/391] Loss: 0.3459
Epoch [48/51], Iter [110/391] Loss: 0.3179
Epoch [48/51], Iter [120/391] Loss: 0.3255
Epoch [48/51], Iter [130/391] Loss: 0.3354
Epoch [48/51], Iter [140/391] Loss: 0.3119
Epoch [48/51], Iter [150/391] Loss: 0.3399
Epoch [48/51], Iter [160/391] Loss: 0.3520
Epoch [48/51], Iter [170/391] Loss: 0.3417
Epoch [48/51], Iter [180/391] Loss: 0.3220
Epoch [48/51], Iter [190/391] Loss: 0.3451
Epoch [48/51], Iter [200/391] Loss: 0.3316
Epoch [48/51], Iter [210/391] Loss: 0.3325
Epoch [48/51], Iter [220/391] Loss: 0.3129
Epoch [48/51], Iter [230/391] Loss: 0.3628
Epoch [48/51], Iter [240/391] Loss: 0.3324
Epoch [48/51], Iter [250/391] Loss: 0.3625
Epoch [48/51], Iter [260/391] Loss: 0.3311
Epoch [48/51], Iter [270/391] Loss: 0.3359
Epoch [48/51], Iter [280/391] Loss: 0.3506
Epoch [48/51], Iter [290/391] Loss: 0.3201
Epoch [48/51], Iter [300/391] Loss: 0.3565
Epoch [48/51], Iter [310/391] Loss: 0.3659
Epoch [48/51], Iter [320/391] Loss: 0.3305
Epoch [48/51], Iter [330/391] Loss: 0.3259
Epoch [48/51], Iter [340/391] Loss: 0.3334
Epoch [48/51], Iter [350/391] Loss: 0.3506
Epoch [48/51], Iter [360/391] Loss: 0.3567
Epoch [48/51], Iter [370/391] Loss: 0.3338
Epoch [48/51], Iter [380/391] Loss: 0.3337
Epoch [48/51], Iter [390/391] Loss: 0.3509
Epoch [49/51], Iter [10/391] Loss: 0.3398
Epoch [49/51], Iter [20/391] Loss: 0.3209
Epoch [49/51], Iter [30/391] Loss: 0.3164
Epoch [49/51], Iter [40/391] Loss: 0.3214
Epoch [49/51], Iter [50/391] Loss: 0.3452
Epoch [49/51], Iter [60/391] Loss: 0.3114
Epoch [49/51], Iter [70/391] Loss: 0.3293
Epoch [49/51], Iter [80/391] Loss: 0.2986
Epoch [49/51], Iter [90/391] Loss: 0.3420
Epoch [49/51], Iter [100/391] Loss: 0.3134
Epoch [49/51], Iter [110/391] Loss: 0.3531
Epoch [49/51], Iter [120/391] Loss: 0.3603
Epoch [49/51], Iter [130/391] Loss: 0.3198
Epoch [49/51], Iter [140/391] Loss: 0.3239
Epoch [49/51], Iter [150/391] Loss: 0.3122
Epoch [49/51], Iter [160/391] Loss: 0.3149
Epoch [49/51], Iter [170/391] Loss: 0.3523
Epoch [49/51], Iter [180/391] Loss: 0.3373
Epoch [49/51], Iter [190/391] Loss: 0.3445
Epoch [49/51], Iter [200/391] Loss: 0.3500
Epoch [49/51], Iter [210/391] Loss: 0.3338
Epoch [49/51], Iter [220/391] Loss: 0.3237
Epoch [49/51], Iter [230/391] Loss: 0.3297
Epoch [49/51], Iter [240/391] Loss: 0.3502
Epoch [49/51], Iter [250/391] Loss: 0.3498
Epoch [49/51], Iter [260/391] Loss: 0.3192
Epoch [49/51], Iter [270/391] Loss: 0.3288
Epoch [49/51], Iter [280/391] Loss: 0.3367
Epoch [49/51], Iter [290/391] Loss: 0.3383
Epoch [49/51], Iter [300/391] Loss: 0.3228
Epoch [49/51], Iter [310/391] Loss: 0.3317
Epoch [49/51], Iter [320/391] Loss: 0.3236
Epoch [49/51], Iter [330/391] Loss: 0.3505
Epoch [49/51], Iter [340/391] Loss: 0.3434
Epoch [49/51], Iter [350/391] Loss: 0.3348
Epoch [49/51], Iter [360/391] Loss: 0.3427
Epoch [49/51], Iter [370/391] Loss: 0.2999
Epoch [49/51], Iter [380/391] Loss: 0.3730
Epoch [49/51], Iter [390/391] Loss: 0.3583
Epoch [50/51], Iter [10/391] Loss: 0.3345
Epoch [50/51], Iter [20/391] Loss: 0.3385
Epoch [50/51], Iter [30/391] Loss: 0.3236
Epoch [50/51], Iter [40/391] Loss: 0.3264
Epoch [50/51], Iter [50/391] Loss: 0.3552
Epoch [50/51], Iter [60/391] Loss: 0.3248
Epoch [50/51], Iter [70/391] Loss: 0.3290
Epoch [50/51], Iter [80/391] Loss: 0.3290
Epoch [50/51], Iter [90/391] Loss: 0.3214
Epoch [50/51], Iter [100/391] Loss: 0.3131
Epoch [50/51], Iter [110/391] Loss: 0.3209
Epoch [50/51], Iter [120/391] Loss: 0.3460
Epoch [50/51], Iter [130/391] Loss: 0.3179
Epoch [50/51], Iter [140/391] Loss: 0.2969
Epoch [50/51], Iter [150/391] Loss: 0.3070
Epoch [50/51], Iter [160/391] Loss: 0.3474
Epoch [50/51], Iter [170/391] Loss: 0.3218
Epoch [50/51], Iter [180/391] Loss: 0.3248
Epoch [50/51], Iter [190/391] Loss: 0.3223
Epoch [50/51], Iter [200/391] Loss: 0.3274
Epoch [50/51], Iter [210/391] Loss: 0.3078
Epoch [50/51], Iter [220/391] Loss: 0.3260
Epoch [50/51], Iter [230/391] Loss: 0.3333
Epoch [50/51], Iter [240/391] Loss: 0.3449
Epoch [50/51], Iter [250/391] Loss: 0.3226
Epoch [50/51], Iter [260/391] Loss: 0.3431
Epoch [50/51], Iter [270/391] Loss: 0.3091
Epoch [50/51], Iter [280/391] Loss: 0.3117
Epoch [50/51], Iter [290/391] Loss: 0.3493
Epoch [50/51], Iter [300/391] Loss: 0.3428
Epoch [50/51], Iter [310/391] Loss: 0.3365
Epoch [50/51], Iter [320/391] Loss: 0.3226
Epoch [50/51], Iter [330/391] Loss: 0.3242
Epoch [50/51], Iter [340/391] Loss: 0.3368
Epoch [50/51], Iter [350/391] Loss: 0.3230
Epoch [50/51], Iter [360/391] Loss: 0.3222
Epoch [50/51], Iter [370/391] Loss: 0.3447
Epoch [50/51], Iter [380/391] Loss: 0.3346
Epoch [50/51], Iter [390/391] Loss: 0.3459
[Saving Checkpoint]
Epoch [51/51], Iter [10/391] Loss: 0.3192
Epoch [51/51], Iter [20/391] Loss: 0.3334
Epoch [51/51], Iter [30/391] Loss: 0.3708
Epoch [51/51], Iter [40/391] Loss: 0.3178
Epoch [51/51], Iter [50/391] Loss: 0.3245
Epoch [51/51], Iter [60/391] Loss: 0.3241
Epoch [51/51], Iter [70/391] Loss: 0.3283
Epoch [51/51], Iter [80/391] Loss: 0.3404
Epoch [51/51], Iter [90/391] Loss: 0.3216
Epoch [51/51], Iter [100/391] Loss: 0.3238
Epoch [51/51], Iter [110/391] Loss: 0.3266
Epoch [51/51], Iter [120/391] Loss: 0.3396
Epoch [51/51], Iter [130/391] Loss: 0.3199
Epoch [51/51], Iter [140/391] Loss: 0.3266
Epoch [51/51], Iter [150/391] Loss: 0.3363
Epoch [51/51], Iter [160/391] Loss: 0.3405
Epoch [51/51], Iter [170/391] Loss: 0.3583
Epoch [51/51], Iter [180/391] Loss: 0.3397
Epoch [51/51], Iter [190/391] Loss: 0.3265
Epoch [51/51], Iter [200/391] Loss: 0.3099
Epoch [51/51], Iter [210/391] Loss: 0.3411
Epoch [51/51], Iter [220/391] Loss: 0.3268
Epoch [51/51], Iter [230/391] Loss: 0.3229
Epoch [51/51], Iter [240/391] Loss: 0.3204
Epoch [51/51], Iter [250/391] Loss: 0.3212
Epoch [51/51], Iter [260/391] Loss: 0.3325
Epoch [51/51], Iter [270/391] Loss: 0.3254
Epoch [51/51], Iter [280/391] Loss: 0.3092
Epoch [51/51], Iter [290/391] Loss: 0.3174
Epoch [51/51], Iter [300/391] Loss: 0.3344
Epoch [51/51], Iter [310/391] Loss: 0.3421
Epoch [51/51], Iter [320/391] Loss: 0.3528
Epoch [51/51], Iter [330/391] Loss: 0.3421
Epoch [51/51], Iter [340/391] Loss: 0.3371
Epoch [51/51], Iter [350/391] Loss: 0.3601
Epoch [51/51], Iter [360/391] Loss: 0.3245
Epoch [51/51], Iter [370/391] Loss: 0.3311
Epoch [51/51], Iter [380/391] Loss: 0.3458
Epoch [51/51], Iter [390/391] Loss: 0.3325
# | a=0.5 | T=5 | epochs = 51 |
resnet_child_a0dot5_t5_e51 = copy.deepcopy(resnet_child)  # let's save for future reference
test_harness( testloader, resnet_child_a0dot5_t5_e51 )
Accuracy of the model on the test images: 90 %
(tensor(9015, device='cuda:0'), 10000)
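The pair printed by test_harness can be read as (correctly classified test images, total test images), i.e. 9015/10000 ≈ 90.2 %, consistent with the 90 % figure above. The next cell repeats the distillation run with the same alpha = 0.5 but a higher temperature, T = 10, again binding the hyperparameters into the loss with partial so that the same training_harness can be reused unchanged across the (alpha, T) grid. For reference, the sketch below shows what a knowledge_distillation_loss with this (alpha, T) signature typically looks like; the actual definition used by these cells appears earlier in the notebook, and the argument names student_logits, teacher_logits and labels here are only illustrative.

import torch.nn.functional as F

def knowledge_distillation_loss_sketch(student_logits, teacher_logits, labels, alpha=0.5, T=5.0):
    # Soften both the teacher and the student distributions with the temperature T.
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)
    # KL divergence between the softened teacher and student distributions.
    # The T**2 factor keeps the soft-target gradients on the same scale as the
    # hard-label gradients (reduction='batchmean' assumes a recent PyTorch;
    # the notebook itself ran on an older version).
    soft_loss = F.kl_div(log_soft_student, soft_targets, reduction='batchmean') * (T ** 2)
    # Ordinary cross-entropy against the true labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss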
# | a=0.5 | T=10 | epochs = 51 |
resnet_child, optimizer_child = get_new_child_model()
epoch = 0
kd_loss_a0dot5_t10 = partial( knowledge_distillation_loss, alpha=0.5, T=10 )
training_harness( trainloader, optimizer_child, kd_loss_a0dot5_t10, resnet_parent, resnet_child, model_name='DeepResNet_a0dot5_t10_e51' )
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
Epoch [1/51], Iter [10/391] Loss: 1.8819
Epoch [1/51], Iter [20/391] Loss: 1.6865
Epoch [1/51], Iter [30/391] Loss: 1.6079
Epoch [1/51], Iter [40/391] Loss: 1.5598
Epoch [1/51], Iter [50/391] Loss: 1.5404
Epoch [1/51], Iter [60/391] Loss: 1.4628
Epoch [1/51], Iter [70/391] Loss: 1.5066
Epoch [1/51], Iter [80/391] Loss: 1.4717
Epoch [1/51], Iter [90/391] Loss: 1.4877
Epoch [1/51], Iter [100/391] Loss: 1.4412
Epoch [1/51], Iter [110/391] Loss: 1.4022
Epoch [1/51], Iter [120/391] Loss: 1.3695
Epoch [1/51], Iter [130/391] Loss: 1.4287
Epoch [1/51], Iter [140/391] Loss: 1.4179
Epoch [1/51], Iter [150/391] Loss: 1.4064
Epoch [1/51], Iter [160/391] Loss: 1.2951
Epoch [1/51], Iter [170/391] Loss: 1.3249
Epoch [1/51], Iter [180/391] Loss: 1.2895
Epoch [1/51], Iter [190/391] Loss: 1.3655
Epoch [1/51], Iter [200/391] Loss: 1.3520
Epoch [1/51], Iter [210/391] Loss: 1.3384
Epoch [1/51], Iter [220/391] Loss: 1.2554
Epoch [1/51], Iter [230/391] Loss: 1.3024
Epoch [1/51], Iter [240/391] Loss: 1.2640
Epoch [1/51], Iter [250/391] Loss: 1.2482
Epoch [1/51], Iter [260/391] Loss: 1.3328
Epoch [1/51], Iter [270/391] Loss: 1.2456
Epoch [1/51], Iter [280/391] Loss: 1.2831
Epoch [1/51], Iter [290/391] Loss: 1.1681
Epoch [1/51], Iter [300/391] Loss: 1.2373
Epoch [1/51], Iter [310/391] Loss: 1.1831
Epoch [1/51], Iter [320/391] Loss: 1.2285
Epoch [1/51], Iter [330/391] Loss: 1.2402
Epoch [1/51], Iter [340/391] Loss: 1.2443
Epoch [1/51], Iter [350/391] Loss: 1.2369
Epoch [1/51], Iter [360/391] Loss: 1.2271
Epoch [1/51], Iter [370/391] Loss: 1.1763
Epoch [1/51], Iter [380/391] Loss: 1.2245
Epoch [1/51], Iter [390/391] Loss: 1.1468
Epoch [2/51], Iter [10/391] Loss: 1.1368
Epoch [2/51], Iter [20/391] Loss: 1.2007
Epoch [2/51], Iter [30/391] Loss: 1.1727
Epoch [2/51], Iter [40/391] Loss: 1.1285
Epoch [2/51], Iter [50/391] Loss: 1.1609
Epoch [2/51], Iter [60/391] Loss: 1.0801
Epoch [2/51], Iter [70/391] Loss: 1.1043
Epoch [2/51], Iter [80/391] Loss: 1.2126
Epoch [2/51], Iter [90/391] Loss: 1.1580
Epoch [2/51], Iter [100/391] Loss: 1.1604
Epoch [2/51], Iter [110/391] Loss: 1.0968
Epoch [2/51], Iter [120/391] Loss: 1.1374
Epoch [2/51], Iter [130/391] Loss: 1.0879
Epoch [2/51], Iter [140/391] Loss: 1.1229
Epoch [2/51], Iter [150/391] Loss: 1.0808
Epoch [2/51], Iter [160/391] Loss: 1.1385
Epoch [2/51], Iter [170/391] Loss: 1.0984
Epoch [2/51], Iter [180/391] Loss: 1.0826
Epoch [2/51], Iter [190/391] Loss: 1.0965
Epoch [2/51], Iter [200/391] Loss: 1.1265
Epoch [2/51], Iter [210/391] Loss: 1.1042
Epoch [2/51], Iter [220/391] Loss: 1.1422
Epoch [2/51], Iter [230/391] Loss: 1.0307
Epoch [2/51], Iter [240/391] Loss: 1.0886
Epoch [2/51], Iter [250/391] Loss: 1.0844
Epoch [2/51], Iter [260/391] Loss: 1.0910
Epoch [2/51], Iter [270/391] Loss: 1.1454
Epoch [2/51], Iter [280/391] Loss: 1.0737
Epoch [2/51], Iter [290/391] Loss: 1.1018
Epoch [2/51], Iter [300/391] Loss: 1.0557
Epoch [2/51], Iter [310/391] Loss: 1.1311
Epoch [2/51], Iter [320/391] Loss: 1.0719
Epoch [2/51], Iter [330/391] Loss: 1.0842
Epoch [2/51], Iter [340/391] Loss: 1.0142
Epoch [2/51], Iter [350/391] Loss: 1.0449
Epoch [2/51], Iter [360/391] Loss: 1.0506
Epoch [2/51], Iter [370/391] Loss: 1.0085
Epoch [2/51], Iter [380/391] Loss: 1.1316
Epoch [2/51], Iter [390/391] Loss: 1.0639
Epoch [3/51], Iter [10/391] Loss: 0.9809
Epoch [3/51], Iter [20/391] Loss: 1.0473
Epoch [3/51], Iter [30/391] Loss: 1.0437
Epoch [3/51], Iter [40/391] Loss: 1.0139
Epoch [3/51], Iter [50/391] Loss: 1.0409
Epoch [3/51], Iter [60/391] Loss: 1.0391
Epoch [3/51], Iter [70/391] Loss: 0.9891
Epoch [3/51], Iter [80/391] Loss: 1.1098
Epoch [3/51], Iter [90/391] Loss: 1.0107
Epoch [3/51], Iter [100/391] Loss: 1.0218
Epoch [3/51], Iter [110/391] Loss: 1.0263
Epoch [3/51], Iter [120/391] Loss: 0.9852
Epoch [3/51], Iter [130/391] Loss: 1.0231
Epoch [3/51], Iter [140/391] Loss: 1.0426
Epoch [3/51], Iter [150/391] Loss: 0.9997
Epoch [3/51], Iter [160/391] Loss: 1.0425
Epoch [3/51], Iter [170/391] Loss: 1.0019
Epoch [3/51], Iter [180/391] Loss: 1.0180
Epoch [3/51], Iter [190/391] Loss: 1.0220
Epoch [3/51], Iter [200/391] Loss: 0.9910
Epoch [3/51], Iter [210/391] Loss: 0.9972
Epoch [3/51], Iter [220/391] Loss: 1.0314
Epoch [3/51], Iter [230/391] Loss: 1.0702
Epoch [3/51], Iter [240/391] Loss: 0.9746
Epoch [3/51], Iter [250/391] Loss: 1.0076
Epoch [3/51], Iter [260/391] Loss: 0.9791
Epoch [3/51], Iter [270/391] Loss: 1.0211
Epoch [3/51], Iter [280/391] Loss: 0.9950
Epoch [3/51], Iter [290/391] Loss: 0.9303
Epoch [3/51], Iter [300/391] Loss: 0.9524
Epoch [3/51], Iter [310/391] Loss: 0.9954
Epoch [3/51], Iter [320/391] Loss: 0.9927
Epoch [3/51], Iter [330/391] Loss: 0.9744
Epoch [3/51], Iter [340/391] Loss: 1.0512
Epoch [3/51], Iter [350/391] Loss: 1.0310
Epoch [3/51], Iter [360/391] Loss: 0.9865
Epoch [3/51], Iter [370/391] Loss: 0.9968
Epoch [3/51], Iter [380/391] Loss: 0.9971
Epoch [3/51], Iter [390/391] Loss: 0.9521
Epoch [4/51], Iter [10/391] Loss: 0.9498
Epoch [4/51], Iter [20/391] Loss: 0.9587
Epoch [4/51], Iter [30/391] Loss: 0.9609
Epoch [4/51], Iter [40/391] Loss: 0.9901
Epoch [4/51], Iter [50/391] Loss: 0.9436
Epoch [4/51], Iter [60/391] Loss: 0.9819
Epoch [4/51], Iter [70/391] Loss: 0.9536
Epoch [4/51], Iter [80/391] Loss: 0.9905
Epoch [4/51], Iter [90/391] Loss: 1.0021
Epoch [4/51], Iter [100/391] Loss: 0.9487
Epoch [4/51], Iter [110/391] Loss: 0.9602
Epoch [4/51], Iter [120/391] Loss: 0.9613
Epoch [4/51], Iter [130/391] Loss: 0.9534
Epoch [4/51], Iter [140/391] Loss: 0.9589
Epoch [4/51], Iter [150/391] Loss: 0.9676
Epoch [4/51], Iter [160/391] Loss: 1.0048
Epoch [4/51], Iter [170/391] Loss: 0.9421
Epoch [4/51], Iter [180/391] Loss: 0.9765
Epoch [4/51], Iter [190/391] Loss: 0.9438
Epoch [4/51], Iter [200/391] Loss: 0.9634
Epoch [4/51], Iter [210/391] Loss: 0.9419
Epoch [4/51], Iter [220/391] Loss: 0.9379
Epoch [4/51], Iter [230/391] Loss: 0.9060
Epoch [4/51], Iter [240/391] Loss: 0.9591
Epoch [4/51], Iter [250/391] Loss: 0.9474
Epoch [4/51], Iter [260/391] Loss: 0.9186
Epoch [4/51], Iter [270/391] Loss: 0.9446
Epoch [4/51], Iter [280/391] Loss: 0.9179
Epoch [4/51], Iter [290/391] Loss: 0.9663
Epoch [4/51], Iter [300/391] Loss: 0.9461
Epoch [4/51], Iter [310/391] Loss: 0.9843
Epoch [4/51], Iter [320/391] Loss: 0.9554
Epoch [4/51], Iter [330/391] Loss: 0.9781
Epoch [4/51], Iter [340/391] Loss: 0.9437
Epoch [4/51], Iter [350/391] Loss: 0.9407
Epoch [4/51], Iter [360/391] Loss: 0.9587
Epoch [4/51], Iter [370/391] Loss: 0.9271
Epoch [4/51], Iter [380/391] Loss: 0.8959
Epoch [4/51], Iter [390/391] Loss: 0.9098
Epoch [5/51], Iter [10/391] Loss: 0.9081
Epoch [5/51], Iter [20/391] Loss: 0.8555
Epoch [5/51], Iter [30/391] Loss: 0.8896
Epoch [5/51], Iter [40/391] Loss: 0.9346
Epoch [5/51], Iter [50/391] Loss: 0.9089
Epoch [5/51], Iter [60/391] Loss: 0.9597
Epoch [5/51], Iter [70/391] Loss: 0.9646
Epoch [5/51], Iter [80/391] Loss: 0.9453
Epoch [5/51], Iter [90/391] Loss: 0.9203
Epoch [5/51], Iter [100/391] Loss: 0.9501
Epoch [5/51], Iter [110/391] Loss: 0.9683
Epoch [5/51], Iter [120/391] Loss: 0.9247
Epoch [5/51], Iter [130/391] Loss: 0.9082
Epoch [5/51], Iter [140/391] Loss: 0.9293
Epoch [5/51], Iter [150/391] Loss: 0.8950
Epoch [5/51], Iter [160/391] Loss: 0.8957
Epoch [5/51], Iter [170/391] Loss: 0.9204
Epoch [5/51], Iter [180/391] Loss: 0.9175
Epoch [5/51], Iter [190/391] Loss: 0.9247
Epoch [5/51], Iter [200/391] Loss: 0.9090
Epoch [5/51], Iter [210/391] Loss: 0.8920
Epoch [5/51], Iter [220/391] Loss: 0.8913
Epoch [5/51], Iter [230/391] Loss: 0.9500
Epoch [5/51], Iter [240/391] Loss: 0.8747
Epoch [5/51], Iter [250/391] Loss: 0.9251
Epoch [5/51], Iter [260/391] Loss: 0.9274
Epoch [5/51], Iter [270/391] Loss: 0.9196
Epoch [5/51], Iter [280/391] Loss: 0.9487
Epoch [5/51], Iter [290/391] Loss: 0.9221
Epoch [5/51], Iter [300/391] Loss: 0.9260
Epoch [5/51], Iter [310/391] Loss: 0.8357
Epoch [5/51], Iter [320/391] Loss: 0.9099
Epoch [5/51], Iter [330/391] Loss: 0.9400
Epoch [5/51], Iter [340/391] Loss: 0.9446
Epoch [5/51], Iter [350/391] Loss: 0.8822
Epoch [5/51], Iter [360/391] Loss: 0.8786
Epoch [5/51], Iter [370/391] Loss: 0.8681
Epoch [5/51], Iter [380/391] Loss: 0.9013
Epoch [5/51], Iter [390/391] Loss: 0.9148
Epoch [6/51], Iter [10/391] Loss: 0.8798
Epoch [6/51], Iter [20/391] Loss: 0.8722
Epoch [6/51], Iter [30/391] Loss: 0.8744
Epoch [6/51], Iter [40/391] Loss: 0.8828
Epoch [6/51], Iter [50/391] Loss: 0.8995
Epoch [6/51], Iter [60/391] Loss: 0.8891
Epoch [6/51], Iter [70/391] Loss: 0.8956
Epoch [6/51], Iter [80/391] Loss: 0.9315
Epoch [6/51], Iter [90/391] Loss: 0.8968
Epoch [6/51], Iter [100/391] Loss: 0.9220
Epoch [6/51], Iter [110/391] Loss: 0.9100
Epoch [6/51], Iter [120/391] Loss: 0.8589
Epoch [6/51], Iter [130/391] Loss: 0.8818
Epoch [6/51], Iter [140/391] Loss: 0.8387
Epoch [6/51], Iter [150/391] Loss: 0.9155
Epoch [6/51], Iter [160/391] Loss: 0.9080
Epoch [6/51], Iter [170/391] Loss: 0.8736
Epoch [6/51], Iter [180/391] Loss: 0.8562
Epoch [6/51], Iter [190/391] Loss: 0.9348
Epoch [6/51], Iter [200/391] Loss: 0.8771
Epoch [6/51], Iter [210/391] Loss: 0.9237
Epoch [6/51], Iter [220/391] Loss: 0.8577
Epoch [6/51], Iter [230/391] Loss: 0.8723
Epoch [6/51], Iter [240/391] Loss: 0.8919
Epoch [6/51], Iter [250/391] Loss: 0.9108
Epoch [6/51], Iter [260/391] Loss: 0.8712
Epoch [6/51], Iter [270/391] Loss: 0.8375
Epoch [6/51], Iter [280/391] Loss: 0.9185
Epoch [6/51], Iter [290/391] Loss: 0.8760
Epoch [6/51], Iter [300/391] Loss: 0.8798
Epoch [6/51], Iter [310/391] Loss: 0.9002
Epoch [6/51], Iter [320/391] Loss: 0.8909
Epoch [6/51], Iter [330/391] Loss: 0.8191
Epoch [6/51], Iter [340/391] Loss: 0.8848
Epoch [6/51], Iter [350/391] Loss: 0.9246
Epoch [6/51], Iter [360/391] Loss: 0.9149
Epoch [6/51], Iter [370/391] Loss: 0.8608
Epoch [6/51], Iter [380/391] Loss: 0.8432
Epoch [6/51], Iter [390/391] Loss: 0.8886
Epoch [7/51], Iter [10/391] Loss: 0.8832
Epoch [7/51], Iter [20/391] Loss: 0.8980
Epoch [7/51], Iter [30/391] Loss: 0.8979
Epoch [7/51], Iter [40/391] Loss: 0.8510
Epoch [7/51], Iter [50/391] Loss: 0.8405
Epoch [7/51], Iter [60/391] Loss: 0.8695
Epoch [7/51], Iter [70/391] Loss: 0.8686
Epoch [7/51], Iter [80/391] Loss: 0.8319
Epoch [7/51], Iter [90/391] Loss: 0.8917
Epoch [7/51], Iter [100/391] Loss: 0.9173
Epoch [7/51], Iter [110/391] Loss: 0.8666
Epoch [7/51], Iter [120/391] Loss: 0.8876
Epoch [7/51], Iter [130/391] Loss: 0.8778
Epoch [7/51], Iter [140/391] Loss: 0.8350
Epoch [7/51], Iter [150/391] Loss: 0.8492
Epoch [7/51], Iter [160/391] Loss: 0.8412
Epoch [7/51], Iter [170/391] Loss: 0.8442
Epoch [7/51], Iter [180/391] Loss: 0.8415
Epoch [7/51], Iter [190/391] Loss: 0.8703
Epoch [7/51], Iter [200/391] Loss: 0.9086
Epoch [7/51], Iter [210/391] Loss: 0.8762
Epoch [7/51], Iter [220/391] Loss: 0.9089
Epoch [7/51], Iter [230/391] Loss: 0.8112
Epoch [7/51], Iter [240/391] Loss: 0.8784
Epoch [7/51], Iter [250/391] Loss: 0.8280
Epoch [7/51], Iter [260/391] Loss: 0.8547
Epoch [7/51], Iter [270/391] Loss: 0.8498
Epoch [7/51], Iter [280/391] Loss: 0.8544
Epoch [7/51], Iter [290/391] Loss: 0.8242
Epoch [7/51], Iter [300/391] Loss: 0.8709
Epoch [7/51], Iter [310/391] Loss: 0.8695
Epoch [7/51], Iter [320/391] Loss: 0.8517
Epoch [7/51], Iter [330/391] Loss: 0.8839
Epoch [7/51], Iter [340/391] Loss: 0.8452
Epoch [7/51], Iter [350/391] Loss: 0.8318
Epoch [7/51], Iter [360/391] Loss: 0.8200
Epoch [7/51], Iter [370/391] Loss: 0.8146
Epoch [7/51], Iter [380/391] Loss: 0.8281
Epoch [7/51], Iter [390/391] Loss: 0.8600
Epoch [8/51], Iter [10/391] Loss: 0.8209
Epoch [8/51], Iter [20/391] Loss: 0.8617
Epoch [8/51], Iter [30/391] Loss: 0.8753
Epoch [8/51], Iter [40/391] Loss: 0.8462
Epoch [8/51], Iter [50/391] Loss: 0.8522
Epoch [8/51], Iter [60/391] Loss: 0.8509
Epoch [8/51], Iter [70/391] Loss: 0.8094
Epoch [8/51], Iter [80/391] Loss: 0.8397
Epoch [8/51], Iter [90/391] Loss: 0.8105
Epoch [8/51], Iter [100/391] Loss: 0.8635
Epoch [8/51], Iter [110/391] Loss: 0.8805
Epoch [8/51], Iter [120/391] Loss: 0.8397
Epoch [8/51], Iter [130/391] Loss: 0.8318
Epoch [8/51], Iter [140/391] Loss: 0.7935
Epoch [8/51], Iter [150/391] Loss: 0.8326
Epoch [8/51], Iter [160/391] Loss: 0.8270
Epoch [8/51], Iter [170/391] Loss: 0.8325
Epoch [8/51], Iter [390/391] Loss: 0.8619
Epoch [9/51], Iter [390/391] Loss: 0.8033
Epoch [10/51], Iter [390/391] Loss: 0.7996
[Saving Checkpoint]
Epoch [11/51], Iter [390/391] Loss: 0.8060
Epoch [12/51], Iter [390/391] Loss: 0.7698
Epoch [13/51], Iter [390/391] Loss: 0.7678
Epoch [14/51], Iter [390/391] Loss: 0.7880
Epoch [15/51], Iter [390/391] Loss: 0.7565
Epoch [16/51], Iter [390/391] Loss: 0.7455
Epoch [17/51], Iter [390/391] Loss: 0.7197
Epoch [18/51], Iter [390/391] Loss: 0.7284
Epoch [19/51], Iter [390/391] Loss: 0.7147
Epoch [20/51], Iter [390/391] Loss: 0.7341
[Saving Checkpoint]
Epoch [21/51], Iter [390/391] Loss: 0.7215
Epoch [22/51], Iter [390/391] Loss: 0.7326
Epoch [23/51], Iter [390/391] Loss: 0.7111
Epoch [24/51], Iter [390/391] Loss: 0.7054
Epoch [25/51], Iter [390/391] Loss: 0.7054
Epoch [26/51], Iter [390/391] Loss: 0.6956
Epoch [27/51], Iter [390/391] Loss: 0.6893
Epoch [28/51], Iter [390/391] Loss: 0.7080
Epoch [29/51], Iter [390/391] Loss: 0.6951
Epoch [30/51], Iter [390/391] Loss: 0.6730
[Saving Checkpoint]
Epoch [31/51], Iter [390/391] Loss: 0.7020
Epoch [32/51], Iter [390/391] Loss: 0.6972
Epoch [33/51], Iter [390/391] Loss: 0.6705
Epoch [34/51], Iter [390/391] Loss: 0.6801
Epoch [35/51], Iter [390/391] Loss: 0.6630
Epoch [36/51], Iter [390/391] Loss: 0.6827
Epoch [37/51], Iter [390/391] Loss: 0.6808
Epoch [38/51], Iter [390/391] Loss: 0.6759
Epoch [39/51], Iter [390/391] Loss: 0.6902
Epoch [40/51], Iter [390/391] Loss: 0.6550
[Saving Checkpoint]
Epoch [41/51], Iter [390/391] Loss: 0.6698
Epoch [42/51], Iter [390/391] Loss: 0.6566
Epoch [43/51], Iter [390/391] Loss: 0.6903
Epoch [44/51], Iter [390/391] Loss: 0.6681
Epoch [45/51], Iter [390/391] Loss: 0.6607
Epoch [46/51], Iter [330/391] Loss: 0.6716
Epoch [46/51], Iter [340/391] Loss: 0.6892
Epoch [46/51], Iter [350/391] Loss: 0.6518
Epoch [46/51], Iter [360/391] Loss: 0.6560
Epoch [46/51], Iter [370/391] Loss: 0.7027
Epoch [46/51], Iter [380/391] Loss: 0.6763
Epoch [46/51], Iter [390/391] Loss: 0.6775
Epoch [47/51], Iter [10/391] Loss: 0.6486
Epoch [47/51], Iter [20/391] Loss: 0.6637
Epoch [47/51], Iter [30/391] Loss: 0.6511
Epoch [47/51], Iter [40/391] Loss: 0.6674
Epoch [47/51], Iter [50/391] Loss: 0.6815
Epoch [47/51], Iter [60/391] Loss: 0.6724
Epoch [47/51], Iter [70/391] Loss: 0.6611
Epoch [47/51], Iter [80/391] Loss: 0.6453
Epoch [47/51], Iter [90/391] Loss: 0.6508
Epoch [47/51], Iter [100/391] Loss: 0.6563
Epoch [47/51], Iter [110/391] Loss: 0.6517
Epoch [47/51], Iter [120/391] Loss: 0.6612
Epoch [47/51], Iter [130/391] Loss: 0.6757
Epoch [47/51], Iter [140/391] Loss: 0.6513
Epoch [47/51], Iter [150/391] Loss: 0.6738
Epoch [47/51], Iter [160/391] Loss: 0.6708
Epoch [47/51], Iter [170/391] Loss: 0.6769
Epoch [47/51], Iter [180/391] Loss: 0.6671
Epoch [47/51], Iter [190/391] Loss: 0.6684
Epoch [47/51], Iter [200/391] Loss: 0.6679
Epoch [47/51], Iter [210/391] Loss: 0.6504
Epoch [47/51], Iter [220/391] Loss: 0.6595
Epoch [47/51], Iter [230/391] Loss: 0.6627
Epoch [47/51], Iter [240/391] Loss: 0.6771
Epoch [47/51], Iter [250/391] Loss: 0.6622
Epoch [47/51], Iter [260/391] Loss: 0.6718
Epoch [47/51], Iter [270/391] Loss: 0.6793
Epoch [47/51], Iter [280/391] Loss: 0.6843
Epoch [47/51], Iter [290/391] Loss: 0.6551
Epoch [47/51], Iter [300/391] Loss: 0.6613
Epoch [47/51], Iter [310/391] Loss: 0.6647
Epoch [47/51], Iter [320/391] Loss: 0.6608
Epoch [47/51], Iter [330/391] Loss: 0.6658
Epoch [47/51], Iter [340/391] Loss: 0.6566
Epoch [47/51], Iter [350/391] Loss: 0.6761
Epoch [47/51], Iter [360/391] Loss: 0.6766
Epoch [47/51], Iter [370/391] Loss: 0.6723
Epoch [47/51], Iter [380/391] Loss: 0.6947
Epoch [47/51], Iter [390/391] Loss: 0.6848
Epoch [48/51], Iter [10/391] Loss: 0.6739
Epoch [48/51], Iter [20/391] Loss: 0.6610
Epoch [48/51], Iter [30/391] Loss: 0.6483
Epoch [48/51], Iter [40/391] Loss: 0.6768
Epoch [48/51], Iter [50/391] Loss: 0.6645
Epoch [48/51], Iter [60/391] Loss: 0.6744
Epoch [48/51], Iter [70/391] Loss: 0.6673
Epoch [48/51], Iter [80/391] Loss: 0.6702
Epoch [48/51], Iter [90/391] Loss: 0.6541
Epoch [48/51], Iter [100/391] Loss: 0.6606
Epoch [48/51], Iter [110/391] Loss: 0.6656
Epoch [48/51], Iter [120/391] Loss: 0.6667
Epoch [48/51], Iter [130/391] Loss: 0.6741
Epoch [48/51], Iter [140/391] Loss: 0.6757
Epoch [48/51], Iter [150/391] Loss: 0.6781
Epoch [48/51], Iter [160/391] Loss: 0.6454
Epoch [48/51], Iter [170/391] Loss: 0.6713
Epoch [48/51], Iter [180/391] Loss: 0.6683
Epoch [48/51], Iter [190/391] Loss: 0.6780
Epoch [48/51], Iter [200/391] Loss: 0.6513
Epoch [48/51], Iter [210/391] Loss: 0.6631
Epoch [48/51], Iter [220/391] Loss: 0.6715
Epoch [48/51], Iter [230/391] Loss: 0.6749
Epoch [48/51], Iter [240/391] Loss: 0.6719
Epoch [48/51], Iter [250/391] Loss: 0.6602
Epoch [48/51], Iter [260/391] Loss: 0.6463
Epoch [48/51], Iter [270/391] Loss: 0.6607
Epoch [48/51], Iter [280/391] Loss: 0.6609
Epoch [48/51], Iter [290/391] Loss: 0.6687
Epoch [48/51], Iter [300/391] Loss: 0.6626
Epoch [48/51], Iter [310/391] Loss: 0.6655
Epoch [48/51], Iter [320/391] Loss: 0.6787
Epoch [48/51], Iter [330/391] Loss: 0.6667
Epoch [48/51], Iter [340/391] Loss: 0.6673
Epoch [48/51], Iter [350/391] Loss: 0.6621
Epoch [48/51], Iter [360/391] Loss: 0.6666
Epoch [48/51], Iter [370/391] Loss: 0.6433
Epoch [48/51], Iter [380/391] Loss: 0.6718
Epoch [48/51], Iter [390/391] Loss: 0.6612
Epoch [49/51], Iter [10/391] Loss: 0.6637
Epoch [49/51], Iter [20/391] Loss: 0.6651
Epoch [49/51], Iter [30/391] Loss: 0.6590
Epoch [49/51], Iter [40/391] Loss: 0.6541
Epoch [49/51], Iter [50/391] Loss: 0.6503
Epoch [49/51], Iter [60/391] Loss: 0.6587
Epoch [49/51], Iter [70/391] Loss: 0.6482
Epoch [49/51], Iter [80/391] Loss: 0.6828
Epoch [49/51], Iter [90/391] Loss: 0.6633
Epoch [49/51], Iter [100/391] Loss: 0.6567
Epoch [49/51], Iter [110/391] Loss: 0.6495
Epoch [49/51], Iter [120/391] Loss: 0.6569
Epoch [49/51], Iter [130/391] Loss: 0.6556
Epoch [49/51], Iter [140/391] Loss: 0.6733
Epoch [49/51], Iter [150/391] Loss: 0.6708
Epoch [49/51], Iter [160/391] Loss: 0.6547
Epoch [49/51], Iter [170/391] Loss: 0.6577
Epoch [49/51], Iter [180/391] Loss: 0.6815
Epoch [49/51], Iter [190/391] Loss: 0.6687
Epoch [49/51], Iter [200/391] Loss: 0.6711
Epoch [49/51], Iter [210/391] Loss: 0.6636
Epoch [49/51], Iter [220/391] Loss: 0.6662
Epoch [49/51], Iter [230/391] Loss: 0.6679
Epoch [49/51], Iter [240/391] Loss: 0.6583
Epoch [49/51], Iter [250/391] Loss: 0.6701
Epoch [49/51], Iter [260/391] Loss: 0.6755
Epoch [49/51], Iter [270/391] Loss: 0.6659
Epoch [49/51], Iter [280/391] Loss: 0.6630
Epoch [49/51], Iter [290/391] Loss: 0.6671
Epoch [49/51], Iter [300/391] Loss: 0.6662
Epoch [49/51], Iter [310/391] Loss: 0.6669
Epoch [49/51], Iter [320/391] Loss: 0.6773
Epoch [49/51], Iter [330/391] Loss: 0.6725
Epoch [49/51], Iter [340/391] Loss: 0.6529
Epoch [49/51], Iter [350/391] Loss: 0.6563
Epoch [49/51], Iter [360/391] Loss: 0.6600
Epoch [49/51], Iter [370/391] Loss: 0.6624
Epoch [49/51], Iter [380/391] Loss: 0.6830
Epoch [49/51], Iter [390/391] Loss: 0.6682
Epoch [50/51], Iter [10/391] Loss: 0.6525
Epoch [50/51], Iter [20/391] Loss: 0.6743
Epoch [50/51], Iter [30/391] Loss: 0.6572
Epoch [50/51], Iter [40/391] Loss: 0.6635
Epoch [50/51], Iter [50/391] Loss: 0.6610
Epoch [50/51], Iter [60/391] Loss: 0.6425
Epoch [50/51], Iter [70/391] Loss: 0.6637
Epoch [50/51], Iter [80/391] Loss: 0.6686
Epoch [50/51], Iter [90/391] Loss: 0.6740
Epoch [50/51], Iter [100/391] Loss: 0.6743
Epoch [50/51], Iter [110/391] Loss: 0.6562
Epoch [50/51], Iter [120/391] Loss: 0.6624
Epoch [50/51], Iter [130/391] Loss: 0.6465
Epoch [50/51], Iter [140/391] Loss: 0.6670
Epoch [50/51], Iter [150/391] Loss: 0.6652
Epoch [50/51], Iter [160/391] Loss: 0.6741
Epoch [50/51], Iter [170/391] Loss: 0.6683
Epoch [50/51], Iter [180/391] Loss: 0.6602
Epoch [50/51], Iter [190/391] Loss: 0.6467
Epoch [50/51], Iter [200/391] Loss: 0.6699
Epoch [50/51], Iter [210/391] Loss: 0.6655
Epoch [50/51], Iter [220/391] Loss: 0.6569
Epoch [50/51], Iter [230/391] Loss: 0.6560
Epoch [50/51], Iter [240/391] Loss: 0.6418
Epoch [50/51], Iter [250/391] Loss: 0.6554
Epoch [50/51], Iter [260/391] Loss: 0.6468
Epoch [50/51], Iter [270/391] Loss: 0.6600
Epoch [50/51], Iter [280/391] Loss: 0.6693
Epoch [50/51], Iter [290/391] Loss: 0.6876
Epoch [50/51], Iter [300/391] Loss: 0.6609
Epoch [50/51], Iter [310/391] Loss: 0.6449
Epoch [50/51], Iter [320/391] Loss: 0.6769
Epoch [50/51], Iter [330/391] Loss: 0.6559
Epoch [50/51], Iter [340/391] Loss: 0.6583
Epoch [50/51], Iter [350/391] Loss: 0.6676
Epoch [50/51], Iter [360/391] Loss: 0.6937
Epoch [50/51], Iter [370/391] Loss: 0.6485
Epoch [50/51], Iter [380/391] Loss: 0.6556
Epoch [50/51], Iter [390/391] Loss: 0.6691
[Saving Checkpoint]
Epoch [51/51], Iter [10/391] Loss: 0.6332
Epoch [51/51], Iter [20/391] Loss: 0.6589
Epoch [51/51], Iter [30/391] Loss: 0.6537
Epoch [51/51], Iter [40/391] Loss: 0.6646
Epoch [51/51], Iter [50/391] Loss: 0.6446
Epoch [51/51], Iter [60/391] Loss: 0.6655
Epoch [51/51], Iter [70/391] Loss: 0.6476
Epoch [51/51], Iter [80/391] Loss: 0.6595
Epoch [51/51], Iter [90/391] Loss: 0.6472
Epoch [51/51], Iter [100/391] Loss: 0.6783
Epoch [51/51], Iter [110/391] Loss: 0.6492
Epoch [51/51], Iter [120/391] Loss: 0.6796
Epoch [51/51], Iter [130/391] Loss: 0.6475
Epoch [51/51], Iter [140/391] Loss: 0.6615
Epoch [51/51], Iter [150/391] Loss: 0.6648
Epoch [51/51], Iter [160/391] Loss: 0.6714
Epoch [51/51], Iter [170/391] Loss: 0.6622
Epoch [51/51], Iter [180/391] Loss: 0.6631
Epoch [51/51], Iter [190/391] Loss: 0.6707
Epoch [51/51], Iter [200/391] Loss: 0.6643
Epoch [51/51], Iter [210/391] Loss: 0.6719
Epoch [51/51], Iter [220/391] Loss: 0.6547
Epoch [51/51], Iter [230/391] Loss: 0.6677
Epoch [51/51], Iter [240/391] Loss: 0.6716
Epoch [51/51], Iter [250/391] Loss: 0.6699
Epoch [51/51], Iter [260/391] Loss: 0.6702
Epoch [51/51], Iter [270/391] Loss: 0.6583
Epoch [51/51], Iter [280/391] Loss: 0.6515
Epoch [51/51], Iter [290/391] Loss: 0.6765
Epoch [51/51], Iter [300/391] Loss: 0.6672
Epoch [51/51], Iter [310/391] Loss: 0.6714
Epoch [51/51], Iter [320/391] Loss: 0.6974
Epoch [51/51], Iter [330/391] Loss: 0.6579
Epoch [51/51], Iter [340/391] Loss: 0.6708
Epoch [51/51], Iter [350/391] Loss: 0.6677
Epoch [51/51], Iter [360/391] Loss: 0.6777
Epoch [51/51], Iter [370/391] Loss: 0.6613
Epoch [51/51], Iter [380/391] Loss: 0.6561
Epoch [51/51], Iter [390/391] Loss: 0.6816
# | a=0.5 | T=10 | epochs = 51 |
resnet_child_a0dot5_t10_e51 = copy.deepcopy(resnet_child)  # keep a copy of the trained student (a=0.5, T=10, 51 epochs) for later comparison
test_harness( testloader, resnet_child_a0dot5_t10_e51 )
Accuracy of the model on the test images: 91 %
(tensor(9107, device='cuda:0'), 10000)
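The 91 % figure above is simply the share of the 10,000 test images that the distilled student classifies correctly (9,107 of them). As a point of reference, here is a minimal sketch of the kind of evaluation loop a routine like test_harness could wrap; the real test_harness is defined earlier in the notebook, and the names below (evaluate, device) are illustrative only.
import torch

def evaluate(testloader, model, device='cuda'):
    # Hypothetical stand-in for the notebook's test_harness: put the network in
    # inference mode, disable gradient tracking, and count correct predictions.
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in testloader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)              # raw class logits
            predicted = outputs.argmax(dim=1)    # highest-scoring class per image
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    print('Accuracy of the model on the test images: %d %%' % (100 * correct // total))
    return correct, total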
# | a=0.5 | T=15 | epochs = 51 |
resnet_child, optimizer_child = get_new_child_model()  # fresh student network and optimizer
epoch = 0
kd_loss_a0dot5_t15 = partial( knowledge_distillation_loss, alpha=0.5, T=15 )  # same alpha as before, higher temperature
training_harness( trainloader, optimizer_child, kd_loss_a0dot5_t15, resnet_parent, resnet_child, model_name='DeepResNet_a0dot5_t15_e51' )
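Before the log of the alpha=0.5, T=15 run, it is worth recalling what a loss with this call signature computes. The sketch below is a hedged reconstruction of a standard distillation loss taking the same keyword arguments (alpha, T) that partial binds above; the positional argument order (student logits, teacher logits, hard labels) is an assumption, and the knowledge_distillation_loss defined earlier in the notebook remains the authoritative version.
import torch.nn.functional as F

def kd_loss_sketch(student_logits, teacher_logits, labels, alpha=0.5, T=1.0):
    # Assumed form only -- not necessarily identical to knowledge_distillation_loss.
    # Soft-target term: KL divergence between the temperature-softened teacher and
    # student distributions, scaled by T^2 so its gradient magnitude stays
    # comparable as the temperature grows.
    soft_term = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                         F.softmax(teacher_logits / T, dim=1),
                         reduction='batchmean') * (T * T)
    # Hard-target term: ordinary cross-entropy against the true labels.
    hard_term = F.cross_entropy(student_logits, labels)
    # alpha trades off imitating the teacher against fitting the hard labels.
    return alpha * soft_term + (1.0 - alpha) * hard_term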
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
Epoch [1/51], Iter [10/391] Loss: 2.1409
Epoch [1/51], Iter [20/391] Loss: 1.6133
Epoch [1/51], Iter [30/391] Loss: 1.6076
Epoch [1/51], Iter [40/391] Loss: 1.5616
Epoch [1/51], Iter [50/391] Loss: 1.4655
Epoch [1/51], Iter [60/391] Loss: 1.4663
Epoch [1/51], Iter [70/391] Loss: 1.4788
Epoch [1/51], Iter [80/391] Loss: 1.4604
Epoch [1/51], Iter [90/391] Loss: 1.4408
Epoch [1/51], Iter [100/391] Loss: 1.4202
Epoch [1/51], Iter [110/391] Loss: 1.4478
Epoch [1/51], Iter [120/391] Loss: 1.4102
Epoch [1/51], Iter [130/391] Loss: 1.4545
Epoch [1/51], Iter [140/391] Loss: 1.3439
Epoch [1/51], Iter [150/391] Loss: 1.3123
Epoch [1/51], Iter [160/391] Loss: 1.3852
Epoch [1/51], Iter [170/391] Loss: 1.3838
Epoch [1/51], Iter [180/391] Loss: 1.3434
Epoch [1/51], Iter [190/391] Loss: 1.3043
Epoch [1/51], Iter [200/391] Loss: 1.3718
Epoch [1/51], Iter [210/391] Loss: 1.2752
Epoch [1/51], Iter [220/391] Loss: 1.3457
Epoch [1/51], Iter [230/391] Loss: 1.3353
Epoch [1/51], Iter [240/391] Loss: 1.2928
Epoch [1/51], Iter [250/391] Loss: 1.3006
Epoch [1/51], Iter [260/391] Loss: 1.2815
Epoch [1/51], Iter [270/391] Loss: 1.3163
Epoch [1/51], Iter [280/391] Loss: 1.2974
Epoch [1/51], Iter [290/391] Loss: 1.3324
Epoch [1/51], Iter [300/391] Loss: 1.2652
Epoch [1/51], Iter [310/391] Loss: 1.2991
Epoch [1/51], Iter [320/391] Loss: 1.2920
Epoch [1/51], Iter [330/391] Loss: 1.2723
Epoch [1/51], Iter [340/391] Loss: 1.2517
Epoch [1/51], Iter [350/391] Loss: 1.2640
Epoch [1/51], Iter [360/391] Loss: 1.2431
Epoch [1/51], Iter [370/391] Loss: 1.2519
Epoch [1/51], Iter [380/391] Loss: 1.2728
Epoch [1/51], Iter [390/391] Loss: 1.2386
Epoch [2/51], Iter [10/391] Loss: 1.2750
Epoch [2/51], Iter [20/391] Loss: 1.2121
Epoch [2/51], Iter [30/391] Loss: 1.2339
Epoch [2/51], Iter [40/391] Loss: 1.2276
Epoch [2/51], Iter [50/391] Loss: 1.2408
Epoch [2/51], Iter [60/391] Loss: 1.2552
Epoch [2/51], Iter [70/391] Loss: 1.1968
Epoch [2/51], Iter [80/391] Loss: 1.2383
Epoch [2/51], Iter [90/391] Loss: 1.1876
Epoch [2/51], Iter [100/391] Loss: 1.2257
Epoch [2/51], Iter [110/391] Loss: 1.2076
Epoch [2/51], Iter [120/391] Loss: 1.1890
Epoch [2/51], Iter [130/391] Loss: 1.2064
Epoch [2/51], Iter [140/391] Loss: 1.1766
Epoch [2/51], Iter [150/391] Loss: 1.2048
Epoch [2/51], Iter [160/391] Loss: 1.2257
Epoch [2/51], Iter [170/391] Loss: 1.2234
Epoch [2/51], Iter [180/391] Loss: 1.1941
Epoch [2/51], Iter [190/391] Loss: 1.1558
Epoch [2/51], Iter [200/391] Loss: 1.2200
Epoch [2/51], Iter [210/391] Loss: 1.1983
Epoch [2/51], Iter [220/391] Loss: 1.1979
Epoch [2/51], Iter [230/391] Loss: 1.1497
Epoch [2/51], Iter [240/391] Loss: 1.1297
Epoch [2/51], Iter [250/391] Loss: 1.1834
Epoch [2/51], Iter [260/391] Loss: 1.2193
Epoch [2/51], Iter [270/391] Loss: 1.1636
Epoch [2/51], Iter [280/391] Loss: 1.1993
Epoch [2/51], Iter [290/391] Loss: 1.1465
Epoch [2/51], Iter [300/391] Loss: 1.1945
Epoch [2/51], Iter [310/391] Loss: 1.1491
Epoch [2/51], Iter [320/391] Loss: 1.1728
Epoch [2/51], Iter [330/391] Loss: 1.1755
Epoch [2/51], Iter [340/391] Loss: 1.1874
Epoch [2/51], Iter [350/391] Loss: 1.1630
Epoch [2/51], Iter [360/391] Loss: 1.1581
Epoch [2/51], Iter [370/391] Loss: 1.1776
Epoch [2/51], Iter [380/391] Loss: 1.1566
Epoch [2/51], Iter [390/391] Loss: 1.1980
Epoch [3/51], Iter [10/391] Loss: 1.1560
Epoch [3/51], Iter [20/391] Loss: 1.1749
Epoch [3/51], Iter [30/391] Loss: 1.1079
Epoch [3/51], Iter [40/391] Loss: 1.1339
Epoch [3/51], Iter [50/391] Loss: 1.0868
Epoch [3/51], Iter [60/391] Loss: 1.0967
Epoch [3/51], Iter [70/391] Loss: 1.1230
Epoch [3/51], Iter [80/391] Loss: 1.1415
Epoch [3/51], Iter [90/391] Loss: 1.1445
Epoch [3/51], Iter [100/391] Loss: 1.1081
Epoch [3/51], Iter [110/391] Loss: 1.1012
Epoch [3/51], Iter [120/391] Loss: 1.0841
Epoch [3/51], Iter [130/391] Loss: 1.1187
Epoch [3/51], Iter [140/391] Loss: 1.1381
Epoch [3/51], Iter [150/391] Loss: 1.1086
Epoch [3/51], Iter [160/391] Loss: 1.0820
Epoch [3/51], Iter [170/391] Loss: 1.1241
Epoch [3/51], Iter [180/391] Loss: 1.1392
Epoch [3/51], Iter [190/391] Loss: 1.0737
Epoch [3/51], Iter [200/391] Loss: 1.1029
Epoch [3/51], Iter [210/391] Loss: 1.1390
Epoch [3/51], Iter [220/391] Loss: 1.0854
Epoch [3/51], Iter [230/391] Loss: 1.1193
Epoch [3/51], Iter [240/391] Loss: 1.1161
Epoch [3/51], Iter [250/391] Loss: 1.0710
Epoch [3/51], Iter [260/391] Loss: 1.0439
Epoch [3/51], Iter [270/391] Loss: 1.0719
Epoch [3/51], Iter [280/391] Loss: 1.1309
Epoch [3/51], Iter [290/391] Loss: 1.0727
Epoch [3/51], Iter [300/391] Loss: 1.0892
Epoch [3/51], Iter [310/391] Loss: 1.1311
Epoch [3/51], Iter [320/391] Loss: 1.0864
Epoch [3/51], Iter [330/391] Loss: 1.1042
Epoch [3/51], Iter [340/391] Loss: 1.1347
Epoch [3/51], Iter [350/391] Loss: 1.1077
Epoch [3/51], Iter [360/391] Loss: 1.0625
Epoch [3/51], Iter [370/391] Loss: 1.0629
Epoch [3/51], Iter [380/391] Loss: 1.0634
Epoch [3/51], Iter [390/391] Loss: 1.0611
Epoch [4/51], Iter [10/391] Loss: 1.0616
Epoch [4/51], Iter [20/391] Loss: 1.0677
Epoch [4/51], Iter [30/391] Loss: 1.0775
Epoch [4/51], Iter [40/391] Loss: 1.0767
Epoch [4/51], Iter [50/391] Loss: 1.1318
Epoch [4/51], Iter [60/391] Loss: 1.0472
Epoch [4/51], Iter [70/391] Loss: 1.0854
Epoch [4/51], Iter [80/391] Loss: 1.0704
Epoch [4/51], Iter [90/391] Loss: 1.0797
Epoch [4/51], Iter [100/391] Loss: 1.0553
Epoch [4/51], Iter [110/391] Loss: 1.0638
Epoch [4/51], Iter [120/391] Loss: 1.0562
Epoch [4/51], Iter [130/391] Loss: 1.0613
Epoch [4/51], Iter [140/391] Loss: 1.0732
Epoch [4/51], Iter [150/391] Loss: 1.0997
Epoch [4/51], Iter [160/391] Loss: 1.0444
Epoch [4/51], Iter [170/391] Loss: 1.0772
Epoch [4/51], Iter [180/391] Loss: 1.1086
Epoch [4/51], Iter [190/391] Loss: 1.0865
Epoch [4/51], Iter [200/391] Loss: 1.0881
Epoch [4/51], Iter [210/391] Loss: 1.0557
Epoch [4/51], Iter [220/391] Loss: 1.0910
Epoch [4/51], Iter [230/391] Loss: 1.0480
Epoch [4/51], Iter [240/391] Loss: 1.0807
Epoch [4/51], Iter [250/391] Loss: 1.0708
Epoch [4/51], Iter [260/391] Loss: 1.0506
Epoch [4/51], Iter [270/391] Loss: 1.0291
Epoch [4/51], Iter [280/391] Loss: 1.0728
Epoch [4/51], Iter [290/391] Loss: 1.0580
Epoch [4/51], Iter [300/391] Loss: 1.0344
Epoch [4/51], Iter [310/391] Loss: 1.0694
Epoch [4/51], Iter [320/391] Loss: 1.0615
Epoch [4/51], Iter [330/391] Loss: 1.0411
Epoch [4/51], Iter [340/391] Loss: 1.0721
Epoch [4/51], Iter [350/391] Loss: 1.0708
Epoch [4/51], Iter [360/391] Loss: 1.0810
Epoch [4/51], Iter [370/391] Loss: 1.0897
Epoch [4/51], Iter [380/391] Loss: 1.0503
Epoch [4/51], Iter [390/391] Loss: 1.0234
Epoch [5/51], Iter [10/391] Loss: 1.0285
Epoch [5/51], Iter [20/391] Loss: 1.0402
Epoch [5/51], Iter [30/391] Loss: 1.0536
Epoch [5/51], Iter [40/391] Loss: 1.0323
Epoch [5/51], Iter [50/391] Loss: 1.0422
Epoch [5/51], Iter [60/391] Loss: 1.0312
Epoch [5/51], Iter [70/391] Loss: 1.0002
Epoch [5/51], Iter [80/391] Loss: 1.0254
Epoch [5/51], Iter [90/391] Loss: 1.0241
Epoch [5/51], Iter [100/391] Loss: 1.0066
Epoch [5/51], Iter [110/391] Loss: 1.0055
Epoch [5/51], Iter [120/391] Loss: 1.0385
Epoch [5/51], Iter [130/391] Loss: 1.0251
Epoch [5/51], Iter [140/391] Loss: 1.0245
Epoch [5/51], Iter [150/391] Loss: 1.0385
Epoch [5/51], Iter [160/391] Loss: 1.0324
Epoch [5/51], Iter [170/391] Loss: 1.0697
Epoch [5/51], Iter [180/391] Loss: 1.0443
Epoch [5/51], Iter [190/391] Loss: 1.0237
Epoch [5/51], Iter [200/391] Loss: 1.0157
Epoch [5/51], Iter [210/391] Loss: 1.0121
Epoch [5/51], Iter [220/391] Loss: 1.0177
Epoch [5/51], Iter [230/391] Loss: 1.0202
Epoch [5/51], Iter [240/391] Loss: 1.0239
Epoch [5/51], Iter [250/391] Loss: 1.0524
Epoch [5/51], Iter [260/391] Loss: 1.0071
Epoch [5/51], Iter [270/391] Loss: 1.0150
Epoch [5/51], Iter [280/391] Loss: 1.0196
Epoch [5/51], Iter [290/391] Loss: 1.0409
Epoch [5/51], Iter [300/391] Loss: 1.0473
Epoch [5/51], Iter [310/391] Loss: 0.9981
Epoch [5/51], Iter [320/391] Loss: 1.0282
Epoch [5/51], Iter [330/391] Loss: 1.0399
Epoch [5/51], Iter [340/391] Loss: 1.0444
Epoch [5/51], Iter [350/391] Loss: 1.0215
Epoch [5/51], Iter [360/391] Loss: 0.9955
Epoch [5/51], Iter [370/391] Loss: 1.0148
Epoch [5/51], Iter [380/391] Loss: 1.0315
Epoch [5/51], Iter [390/391] Loss: 1.0473
Epoch [6/51], Iter [10/391] Loss: 1.0241
Epoch [6/51], Iter [20/391] Loss: 1.0162
Epoch [6/51], Iter [30/391] Loss: 1.0397
Epoch [6/51], Iter [40/391] Loss: 0.9964
Epoch [6/51], Iter [50/391] Loss: 1.0393
Epoch [6/51], Iter [60/391] Loss: 1.0017
Epoch [6/51], Iter [70/391] Loss: 1.0432
Epoch [6/51], Iter [80/391] Loss: 1.0214
Epoch [6/51], Iter [90/391] Loss: 0.9969
Epoch [6/51], Iter [100/391] Loss: 0.9921
Epoch [6/51], Iter [110/391] Loss: 0.9918
Epoch [6/51], Iter [120/391] Loss: 1.0104
Epoch [6/51], Iter [130/391] Loss: 1.0133
Epoch [6/51], Iter [140/391] Loss: 0.9822
Epoch [6/51], Iter [150/391] Loss: 0.9800
Epoch [6/51], Iter [160/391] Loss: 1.0041
Epoch [6/51], Iter [170/391] Loss: 1.0233
Epoch [6/51], Iter [180/391] Loss: 0.9914
Epoch [6/51], Iter [190/391] Loss: 0.9901
Epoch [6/51], Iter [200/391] Loss: 1.0049
Epoch [6/51], Iter [210/391] Loss: 0.9823
Epoch [6/51], Iter [220/391] Loss: 0.9682
Epoch [6/51], Iter [230/391] Loss: 0.9750
Epoch [6/51], Iter [240/391] Loss: 1.0134
Epoch [6/51], Iter [250/391] Loss: 0.9730
Epoch [6/51], Iter [260/391] Loss: 0.9688
Epoch [6/51], Iter [270/391] Loss: 0.9757
Epoch [6/51], Iter [280/391] Loss: 1.0085
Epoch [6/51], Iter [290/391] Loss: 0.9933
Epoch [6/51], Iter [300/391] Loss: 1.0230
Epoch [6/51], Iter [310/391] Loss: 0.9959
Epoch [6/51], Iter [320/391] Loss: 0.9889
Epoch [6/51], Iter [330/391] Loss: 0.9920
Epoch [6/51], Iter [340/391] Loss: 0.9833
Epoch [6/51], Iter [350/391] Loss: 0.9901
Epoch [6/51], Iter [360/391] Loss: 1.0135
Epoch [6/51], Iter [370/391] Loss: 1.0121
Epoch [6/51], Iter [380/391] Loss: 1.0149
Epoch [6/51], Iter [390/391] Loss: 1.0005
Epoch [7/51], Iter [10/391] Loss: 1.0183
Epoch [7/51], Iter [20/391] Loss: 0.9996
Epoch [7/51], Iter [30/391] Loss: 0.9732
Epoch [7/51], Iter [40/391] Loss: 0.9952
Epoch [7/51], Iter [50/391] Loss: 0.9619
Epoch [7/51], Iter [60/391] Loss: 0.9656
Epoch [7/51], Iter [70/391] Loss: 0.9763
Epoch [7/51], Iter [80/391] Loss: 1.0189
Epoch [7/51], Iter [90/391] Loss: 0.9923
Epoch [7/51], Iter [100/391] Loss: 0.9711
Epoch [7/51], Iter [110/391] Loss: 0.9574
Epoch [7/51], Iter [120/391] Loss: 0.9739
Epoch [7/51], Iter [130/391] Loss: 0.9597
Epoch [7/51], Iter [140/391] Loss: 0.9784
Epoch [7/51], Iter [150/391] Loss: 0.9719
Epoch [7/51], Iter [160/391] Loss: 1.0224
Epoch [7/51], Iter [170/391] Loss: 0.9556
Epoch [7/51], Iter [180/391] Loss: 0.9791
Epoch [7/51], Iter [190/391] Loss: 0.9864
Epoch [7/51], Iter [200/391] Loss: 1.0168
Epoch [7/51], Iter [210/391] Loss: 0.9348
Epoch [7/51], Iter [220/391] Loss: 0.9724
Epoch [7/51], Iter [230/391] Loss: 1.0069
Epoch [7/51], Iter [240/391] Loss: 1.0136
Epoch [7/51], Iter [250/391] Loss: 0.9922
Epoch [7/51], Iter [260/391] Loss: 0.9705
Epoch [7/51], Iter [270/391] Loss: 0.9939
Epoch [7/51], Iter [280/391] Loss: 0.9694
Epoch [7/51], Iter [290/391] Loss: 0.9913
Epoch [7/51], Iter [300/391] Loss: 0.9668
Epoch [7/51], Iter [310/391] Loss: 0.9579
Epoch [7/51], Iter [320/391] Loss: 0.9747
Epoch [7/51], Iter [330/391] Loss: 0.9978
Epoch [7/51], Iter [340/391] Loss: 0.9395
Epoch [7/51], Iter [350/391] Loss: 0.9883
Epoch [7/51], Iter [360/391] Loss: 0.9682
Epoch [7/51], Iter [370/391] Loss: 0.9684
Epoch [7/51], Iter [380/391] Loss: 0.9420
Epoch [7/51], Iter [390/391] Loss: 0.9749
Epoch [8/51], Iter [10/391] Loss: 0.9890
Epoch [8/51], Iter [20/391] Loss: 0.9898
Epoch [8/51], Iter [30/391] Loss: 0.9685
Epoch [8/51], Iter [40/391] Loss: 0.9661
Epoch [8/51], Iter [50/391] Loss: 0.9756
Epoch [8/51], Iter [60/391] Loss: 0.9693
Epoch [8/51], Iter [70/391] Loss: 0.9479
Epoch [8/51], Iter [80/391] Loss: 0.9761
Epoch [8/51], Iter [90/391] Loss: 1.0002
Epoch [8/51], Iter [100/391] Loss: 0.9737
Epoch [8/51], Iter [110/391] Loss: 0.9716
Epoch [8/51], Iter [120/391] Loss: 0.9397
Epoch [8/51], Iter [130/391] Loss: 0.9459
Epoch [8/51], Iter [140/391] Loss: 0.9710
Epoch [8/51], Iter [150/391] Loss: 0.9423
Epoch [8/51], Iter [160/391] Loss: 0.9760
Epoch [8/51], Iter [170/391] Loss: 0.9888
Epoch [8/51], Iter [180/391] Loss: 0.9582
Epoch [8/51], Iter [190/391] Loss: 0.9517
Epoch [8/51], Iter [200/391] Loss: 0.9712
Epoch [8/51], Iter [210/391] Loss: 0.9747
Epoch [8/51], Iter [220/391] Loss: 0.9519
Epoch [8/51], Iter [230/391] Loss: 0.9511
Epoch [8/51], Iter [240/391] Loss: 0.9665
Epoch [8/51], Iter [250/391] Loss: 0.9607
Epoch [8/51], Iter [260/391] Loss: 0.9801
Epoch [8/51], Iter [270/391] Loss: 0.9377
Epoch [8/51], Iter [280/391] Loss: 1.0028
Epoch [8/51], Iter [290/391] Loss: 0.9541
Epoch [8/51], Iter [300/391] Loss: 0.9612
Epoch [8/51], Iter [310/391] Loss: 0.9650
Epoch [8/51], Iter [320/391] Loss: 1.0036
Epoch [8/51], Iter [330/391] Loss: 0.9649
Epoch [8/51], Iter [340/391] Loss: 0.9801
Epoch [8/51], Iter [350/391] Loss: 0.9619
Epoch [8/51], Iter [360/391] Loss: 0.9826
Epoch [8/51], Iter [370/391] Loss: 0.9603
Epoch [8/51], Iter [380/391] Loss: 0.9832
Epoch [8/51], Iter [390/391] Loss: 0.9756
Epoch [9/51], Iter [10/391] Loss: 0.9543
Epoch [9/51], Iter [20/391] Loss: 0.9495
Epoch [9/51], Iter [30/391] Loss: 0.9466
Epoch [9/51], Iter [40/391] Loss: 0.9569
Epoch [9/51], Iter [50/391] Loss: 0.9394
Epoch [9/51], Iter [60/391] Loss: 0.9335
Epoch [9/51], Iter [70/391] Loss: 0.9390
Epoch [9/51], Iter [80/391] Loss: 0.9566
Epoch [9/51], Iter [90/391] Loss: 0.9810
Epoch [9/51], Iter [100/391] Loss: 0.9595
Epoch [9/51], Iter [110/391] Loss: 0.9394
Epoch [9/51], Iter [120/391] Loss: 0.9719
Epoch [9/51], Iter [130/391] Loss: 0.9418
Epoch [9/51], Iter [140/391] Loss: 0.9517
Epoch [9/51], Iter [150/391] Loss: 0.9704
Epoch [9/51], Iter [160/391] Loss: 0.9470
Epoch [9/51], Iter [170/391] Loss: 0.9474
Epoch [9/51], Iter [180/391] Loss: 0.9804
Epoch [9/51], Iter [190/391] Loss: 0.9605
Epoch [9/51], Iter [200/391] Loss: 0.9597
Epoch [9/51], Iter [210/391] Loss: 0.9716
Epoch [9/51], Iter [220/391] Loss: 0.9493
Epoch [9/51], Iter [230/391] Loss: 0.9655
Epoch [9/51], Iter [240/391] Loss: 0.9242
Epoch [9/51], Iter [250/391] Loss: 0.9602
Epoch [9/51], Iter [260/391] Loss: 0.9588
Epoch [9/51], Iter [270/391] Loss: 0.9387
Epoch [9/51], Iter [280/391] Loss: 0.9359
Epoch [9/51], Iter [290/391] Loss: 0.9872
Epoch [9/51], Iter [300/391] Loss: 0.9620
Epoch [9/51], Iter [310/391] Loss: 0.9211
Epoch [9/51], Iter [320/391] Loss: 0.9429
Epoch [9/51], Iter [330/391] Loss: 0.9105
Epoch [9/51], Iter [340/391] Loss: 0.9478
Epoch [9/51], Iter [350/391] Loss: 0.9493
Epoch [9/51], Iter [360/391] Loss: 0.9914
Epoch [9/51], Iter [370/391] Loss: 0.9483
Epoch [9/51], Iter [380/391] Loss: 0.9936
Epoch [9/51], Iter [390/391] Loss: 0.9311
Epoch [10/51], Iter [10/391] Loss: 0.9538
Epoch [10/51], Iter [20/391] Loss: 0.9722
Epoch [10/51], Iter [30/391] Loss: 0.9330
Epoch [10/51], Iter [40/391] Loss: 0.9439
Epoch [10/51], Iter [50/391] Loss: 0.9618
Epoch [10/51], Iter [60/391] Loss: 0.9484
Epoch [10/51], Iter [70/391] Loss: 0.9477
Epoch [10/51], Iter [80/391] Loss: 0.9564
Epoch [10/51], Iter [90/391] Loss: 0.9701
Epoch [10/51], Iter [100/391] Loss: 0.9425
Epoch [10/51], Iter [110/391] Loss: 0.9151
Epoch [10/51], Iter [120/391] Loss: 0.9311
Epoch [10/51], Iter [130/391] Loss: 0.9572
Epoch [10/51], Iter [140/391] Loss: 0.9295
Epoch [10/51], Iter [150/391] Loss: 0.9507
Epoch [10/51], Iter [160/391] Loss: 0.9675
Epoch [10/51], Iter [170/391] Loss: 0.9240
Epoch [10/51], Iter [180/391] Loss: 0.9446
Epoch [10/51], Iter [190/391] Loss: 0.9293
Epoch [10/51], Iter [200/391] Loss: 0.9413
Epoch [10/51], Iter [210/391] Loss: 0.9369
Epoch [10/51], Iter [220/391] Loss: 0.9323
Epoch [10/51], Iter [230/391] Loss: 0.9561
Epoch [10/51], Iter [240/391] Loss: 0.9255
Epoch [10/51], Iter [250/391] Loss: 0.9888
Epoch [10/51], Iter [260/391] Loss: 0.9607
Epoch [10/51], Iter [270/391] Loss: 0.9568
Epoch [10/51], Iter [280/391] Loss: 0.9497
Epoch [10/51], Iter [290/391] Loss: 0.9504
Epoch [10/51], Iter [300/391] Loss: 0.9479
Epoch [10/51], Iter [310/391] Loss: 0.9618
Epoch [10/51], Iter [320/391] Loss: 0.9535
Epoch [10/51], Iter [330/391] Loss: 0.9340
Epoch [10/51], Iter [340/391] Loss: 0.9349
Epoch [10/51], Iter [350/391] Loss: 0.9484
Epoch [10/51], Iter [360/391] Loss: 0.9697
Epoch [10/51], Iter [370/391] Loss: 0.9081
Epoch [10/51], Iter [380/391] Loss: 0.9413
Epoch [10/51], Iter [390/391] Loss: 0.9237
[Saving Checkpoint]
Epoch [11/51], Iter [10/391] Loss: 0.9323
Epoch [11/51], Iter [20/391] Loss: 0.9333
Epoch [11/51], Iter [30/391] Loss: 0.9462
Epoch [11/51], Iter [40/391] Loss: 0.9579
Epoch [11/51], Iter [50/391] Loss: 0.9486
Epoch [11/51], Iter [60/391] Loss: 0.9380
Epoch [11/51], Iter [70/391] Loss: 0.9405
Epoch [11/51], Iter [80/391] Loss: 0.9282
Epoch [11/51], Iter [90/391] Loss: 0.9451
Epoch [11/51], Iter [100/391] Loss: 0.9148
Epoch [11/51], Iter [110/391] Loss: 0.9220
Epoch [11/51], Iter [120/391] Loss: 0.9432
Epoch [11/51], Iter [130/391] Loss: 0.9136
Epoch [11/51], Iter [140/391] Loss: 0.9001
Epoch [11/51], Iter [150/391] Loss: 0.9104
Epoch [11/51], Iter [160/391] Loss: 0.9432
Epoch [11/51], Iter [170/391] Loss: 0.9096
Epoch [11/51], Iter [180/391] Loss: 0.9265
Epoch [11/51], Iter [190/391] Loss: 0.9228
Epoch [11/51], Iter [200/391] Loss: 0.9455
Epoch [11/51], Iter [210/391] Loss: 0.9403
Epoch [11/51], Iter [220/391] Loss: 0.9305
Epoch [11/51], Iter [230/391] Loss: 0.9221
Epoch [11/51], Iter [240/391] Loss: 0.9435
Epoch [11/51], Iter [250/391] Loss: 0.9303
Epoch [11/51], Iter [260/391] Loss: 0.9468
Epoch [11/51], Iter [270/391] Loss: 0.9110
Epoch [11/51], Iter [280/391] Loss: 0.9023
Epoch [11/51], Iter [290/391] Loss: 0.9263
Epoch [11/51], Iter [300/391] Loss: 0.9150
Epoch [11/51], Iter [310/391] Loss: 0.9425
Epoch [11/51], Iter [320/391] Loss: 0.9290
Epoch [11/51], Iter [330/391] Loss: 0.9497
Epoch [11/51], Iter [340/391] Loss: 0.9209
Epoch [11/51], Iter [350/391] Loss: 0.9307
Epoch [11/51], Iter [360/391] Loss: 0.9322
Epoch [11/51], Iter [370/391] Loss: 0.9365
Epoch [11/51], Iter [380/391] Loss: 0.9157
Epoch [11/51], Iter [390/391] Loss: 0.9109
Epoch [12/51], Iter [10/391] Loss: 0.9483
Epoch [12/51], Iter [20/391] Loss: 0.9051
Epoch [12/51], Iter [30/391] Loss: 0.9131
Epoch [12/51], Iter [40/391] Loss: 0.9053
Epoch [12/51], Iter [50/391] Loss: 0.9105
Epoch [12/51], Iter [60/391] Loss: 0.9152
Epoch [12/51], Iter [70/391] Loss: 0.9286
Epoch [12/51], Iter [80/391] Loss: 0.9355
Epoch [12/51], Iter [90/391] Loss: 0.8959
Epoch [12/51], Iter [100/391] Loss: 0.8922
Epoch [12/51], Iter [110/391] Loss: 0.9215
Epoch [12/51], Iter [120/391] Loss: 0.9432
Epoch [12/51], Iter [130/391] Loss: 0.9336
Epoch [12/51], Iter [140/391] Loss: 0.9429
Epoch [12/51], Iter [150/391] Loss: 0.8962
Epoch [12/51], Iter [160/391] Loss: 0.9178
Epoch [12/51], Iter [170/391] Loss: 0.9327
Epoch [12/51], Iter [180/391] Loss: 0.9190
Epoch [12/51], Iter [190/391] Loss: 0.9173
Epoch [12/51], Iter [200/391] Loss: 0.8955
Epoch [12/51], Iter [210/391] Loss: 0.9404
Epoch [12/51], Iter [220/391] Loss: 0.9604
Epoch [12/51], Iter [230/391] Loss: 0.9103
Epoch [12/51], Iter [240/391] Loss: 0.9296
Epoch [12/51], Iter [250/391] Loss: 0.9336
Epoch [12/51], Iter [260/391] Loss: 0.9096
Epoch [12/51], Iter [270/391] Loss: 0.9103
Epoch [12/51], Iter [280/391] Loss: 0.9267
Epoch [12/51], Iter [290/391] Loss: 0.9126
Epoch [12/51], Iter [300/391] Loss: 0.9296
Epoch [12/51], Iter [310/391] Loss: 0.8983
Epoch [12/51], Iter [320/391] Loss: 0.9159
Epoch [12/51], Iter [330/391] Loss: 0.9168
Epoch [12/51], Iter [340/391] Loss: 0.9247
Epoch [12/51], Iter [350/391] Loss: 0.9186
Epoch [12/51], Iter [360/391] Loss: 0.9075
Epoch [12/51], Iter [370/391] Loss: 0.9103
Epoch [12/51], Iter [380/391] Loss: 0.9297
Epoch [12/51], Iter [390/391] Loss: 0.9421
Epoch [13/51], Iter [10/391] Loss: 0.9034
Epoch [13/51], Iter [20/391] Loss: 0.9033
Epoch [13/51], Iter [30/391] Loss: 0.8976
Epoch [13/51], Iter [40/391] Loss: 0.8980
Epoch [13/51], Iter [50/391] Loss: 0.9002
Epoch [13/51], Iter [60/391] Loss: 0.8897
Epoch [13/51], Iter [70/391] Loss: 0.8774
Epoch [13/51], Iter [80/391] Loss: 0.9104
Epoch [13/51], Iter [90/391] Loss: 0.9477
Epoch [13/51], Iter [100/391] Loss: 0.9170
Epoch [13/51], Iter [110/391] Loss: 0.8778
Epoch [13/51], Iter [120/391] Loss: 0.8953
Epoch [13/51], Iter [130/391] Loss: 0.9370
Epoch [13/51], Iter [140/391] Loss: 0.9338
Epoch [13/51], Iter [150/391] Loss: 0.9149
Epoch [13/51], Iter [160/391] Loss: 0.9027
Epoch [13/51], Iter [170/391] Loss: 0.8688
Epoch [13/51], Iter [180/391] Loss: 0.8941
Epoch [13/51], Iter [190/391] Loss: 0.9366
Epoch [13/51], Iter [200/391] Loss: 0.9140
Epoch [13/51], Iter [210/391] Loss: 0.9043
Epoch [13/51], Iter [220/391] Loss: 0.9045
Epoch [13/51], Iter [230/391] Loss: 0.9156
Epoch [13/51], Iter [240/391] Loss: 0.9224
Epoch [13/51], Iter [250/391] Loss: 0.9147
Epoch [13/51], Iter [260/391] Loss: 0.9112
Epoch [13/51], Iter [270/391] Loss: 0.9014
Epoch [13/51], Iter [280/391] Loss: 0.9220
Epoch [13/51], Iter [290/391] Loss: 0.9449
Epoch [13/51], Iter [300/391] Loss: 0.9206
Epoch [13/51], Iter [310/391] Loss: 0.8973
Epoch [13/51], Iter [320/391] Loss: 0.9062
Epoch [13/51], Iter [330/391] Loss: 0.9177
Epoch [13/51], Iter [340/391] Loss: 0.9124
Epoch [13/51], Iter [350/391] Loss: 0.9013
Epoch [13/51], Iter [360/391] Loss: 0.8909
Epoch [13/51], Iter [370/391] Loss: 0.9238
Epoch [13/51], Iter [380/391] Loss: 0.9033
Epoch [13/51], Iter [390/391] Loss: 0.9218
Epoch [14/51], Iter [10/391] Loss: 0.9131
Epoch [14/51], Iter [20/391] Loss: 0.8883
Epoch [14/51], Iter [30/391] Loss: 0.9113
Epoch [14/51], Iter [40/391] Loss: 0.8860
Epoch [14/51], Iter [50/391] Loss: 0.9082
Epoch [14/51], Iter [60/391] Loss: 0.8932
Epoch [14/51], Iter [70/391] Loss: 0.8816
Epoch [14/51], Iter [80/391] Loss: 0.9056
Epoch [14/51], Iter [90/391] Loss: 0.8992
Epoch [14/51], Iter [100/391] Loss: 0.9158
Epoch [14/51], Iter [110/391] Loss: 0.9036
Epoch [14/51], Iter [120/391] Loss: 0.8848
Epoch [14/51], Iter [130/391] Loss: 0.9023
Epoch [14/51], Iter [140/391] Loss: 0.9049
Epoch [14/51], Iter [150/391] Loss: 0.9096
Epoch [14/51], Iter [160/391] Loss: 0.8901
Epoch [14/51], Iter [170/391] Loss: 0.9080
Epoch [14/51], Iter [180/391] Loss: 0.9115
Epoch [14/51], Iter [190/391] Loss: 0.8925
Epoch [14/51], Iter [200/391] Loss: 0.9050
Epoch [14/51], Iter [210/391] Loss: 0.8988
Epoch [14/51], Iter [220/391] Loss: 0.9225
Epoch [14/51], Iter [230/391] Loss: 0.9173
Epoch [14/51], Iter [240/391] Loss: 0.9302
Epoch [14/51], Iter [250/391] Loss: 0.9176
Epoch [14/51], Iter [260/391] Loss: 0.8920
Epoch [14/51], Iter [270/391] Loss: 0.9098
Epoch [14/51], Iter [280/391] Loss: 0.8941
Epoch [14/51], Iter [290/391] Loss: 0.8983
Epoch [14/51], Iter [300/391] Loss: 0.9056
Epoch [14/51], Iter [310/391] Loss: 0.8880
Epoch [14/51], Iter [320/391] Loss: 0.9172
Epoch [14/51], Iter [330/391] Loss: 0.8871
Epoch [14/51], Iter [340/391] Loss: 0.8958
Epoch [14/51], Iter [350/391] Loss: 0.8959
Epoch [14/51], Iter [360/391] Loss: 0.9383
Epoch [14/51], Iter [370/391] Loss: 0.8854
Epoch [14/51], Iter [380/391] Loss: 0.8831
Epoch [14/51], Iter [390/391] Loss: 0.9299
Epoch [15/51], Iter [10/391] Loss: 0.8883
Epoch [15/51], Iter [20/391] Loss: 0.9066
Epoch [15/51], Iter [30/391] Loss: 0.9044
Epoch [15/51], Iter [40/391] Loss: 0.9064
Epoch [15/51], Iter [50/391] Loss: 0.8911
Epoch [15/51], Iter [60/391] Loss: 0.8828
Epoch [15/51], Iter [70/391] Loss: 0.8851
Epoch [15/51], Iter [80/391] Loss: 0.8862
Epoch [15/51], Iter [90/391] Loss: 0.9026
Epoch [15/51], Iter [100/391] Loss: 0.9161
Epoch [15/51], Iter [110/391] Loss: 0.8914
Epoch [15/51], Iter [120/391] Loss: 0.9166
Epoch [15/51], Iter [130/391] Loss: 0.9190
Epoch [15/51], Iter [140/391] Loss: 0.9037
Epoch [15/51], Iter [150/391] Loss: 0.8984
Epoch [15/51], Iter [160/391] Loss: 0.9209
Epoch [15/51], Iter [170/391] Loss: 0.9095
Epoch [15/51], Iter [180/391] Loss: 0.8952
Epoch [15/51], Iter [190/391] Loss: 0.8847
Epoch [15/51], Iter [200/391] Loss: 0.9206
Epoch [15/51], Iter [210/391] Loss: 0.9105
Epoch [15/51], Iter [220/391] Loss: 0.8925
Epoch [15/51], Iter [230/391] Loss: 0.8923
Epoch [15/51], Iter [240/391] Loss: 0.8947
Epoch [15/51], Iter [250/391] Loss: 0.9037
Epoch [15/51], Iter [260/391] Loss: 0.9201
Epoch [15/51], Iter [270/391] Loss: 0.9140
Epoch [15/51], Iter [280/391] Loss: 0.8934
Epoch [15/51], Iter [290/391] Loss: 0.8981
Epoch [15/51], Iter [300/391] Loss: 0.9056
Epoch [15/51], Iter [310/391] Loss: 0.8851
Epoch [15/51], Iter [320/391] Loss: 0.8991
Epoch [15/51], Iter [330/391] Loss: 0.8967
Epoch [15/51], Iter [340/391] Loss: 0.8798
Epoch [15/51], Iter [350/391] Loss: 0.9035
Epoch [15/51], Iter [360/391] Loss: 0.9049
Epoch [15/51], Iter [370/391] Loss: 0.9203
Epoch [15/51], Iter [380/391] Loss: 0.9013
Epoch [15/51], Iter [390/391] Loss: 0.8889
Epoch [16/51], Iter [10/391] Loss: 0.8966
Epoch [16/51], Iter [20/391] Loss: 0.8919
Epoch [16/51], Iter [30/391] Loss: 0.8801
Epoch [16/51], Iter [40/391] Loss: 0.9179
Epoch [16/51], Iter [50/391] Loss: 0.8760
Epoch [16/51], Iter [60/391] Loss: 0.8813
Epoch [16/51], Iter [70/391] Loss: 0.9016
Epoch [16/51], Iter [80/391] Loss: 0.8934
Epoch [16/51], Iter [90/391] Loss: 0.9243
Epoch [16/51], Iter [100/391] Loss: 0.8938
Epoch [16/51], Iter [110/391] Loss: 0.8792
Epoch [16/51], Iter [120/391] Loss: 0.8813
Epoch [16/51], Iter [130/391] Loss: 0.8712
Epoch [16/51], Iter [140/391] Loss: 0.8860
Epoch [16/51], Iter [150/391] Loss: 0.9070
Epoch [16/51], Iter [160/391] Loss: 0.8900
Epoch [16/51], Iter [170/391] Loss: 0.8908
Epoch [16/51], Iter [180/391] Loss: 0.9125
Epoch [16/51], Iter [190/391] Loss: 0.8883
Epoch [16/51], Iter [200/391] Loss: 0.8937
Epoch [16/51], Iter [210/391] Loss: 0.8788
Epoch [16/51], Iter [220/391] Loss: 0.8865
Epoch [16/51], Iter [230/391] Loss: 0.8942
Epoch [16/51], Iter [240/391] Loss: 0.8655
Epoch [16/51], Iter [250/391] Loss: 0.8729
Epoch [16/51], Iter [260/391] Loss: 0.8920
Epoch [16/51], Iter [270/391] Loss: 0.9094
Epoch [16/51], Iter [280/391] Loss: 0.8991
Epoch [16/51], Iter [290/391] Loss: 0.9128
Epoch [16/51], Iter [300/391] Loss: 0.9124
Epoch [16/51], Iter [310/391] Loss: 0.8895
Epoch [16/51], Iter [320/391] Loss: 0.8772
Epoch [16/51], Iter [330/391] Loss: 0.8736
Epoch [16/51], Iter [340/391] Loss: 0.9133
Epoch [16/51], Iter [350/391] Loss: 0.8783
Epoch [16/51], Iter [360/391] Loss: 0.8794
Epoch [16/51], Iter [370/391] Loss: 0.8936
Epoch [16/51], Iter [380/391] Loss: 0.8638
Epoch [16/51], Iter [390/391] Loss: 0.8853
Epoch [17/51], Iter [10/391] Loss: 0.8852
Epoch [17/51], Iter [20/391] Loss: 0.8912
Epoch [17/51], Iter [30/391] Loss: 0.8736
Epoch [17/51], Iter [40/391] Loss: 0.8897
Epoch [17/51], Iter [50/391] Loss: 0.8839
Epoch [17/51], Iter [60/391] Loss: 0.9189
Epoch [17/51], Iter [70/391] Loss: 0.8835
Epoch [17/51], Iter [80/391] Loss: 0.8791
Epoch [17/51], Iter [90/391] Loss: 0.8930
Epoch [17/51], Iter [100/391] Loss: 0.8685
Epoch [17/51], Iter [110/391] Loss: 0.8933
Epoch [17/51], Iter [120/391] Loss: 0.8903
Epoch [17/51], Iter [130/391] Loss: 0.8914
Epoch [17/51], Iter [140/391] Loss: 0.8900
Epoch [17/51], Iter [150/391] Loss: 0.8767
Epoch [17/51], Iter [160/391] Loss: 0.8713
Epoch [17/51], Iter [170/391] Loss: 0.8887
Epoch [17/51], Iter [180/391] Loss: 0.9181
Epoch [17/51], Iter [190/391] Loss: 0.8922
Epoch [17/51], Iter [200/391] Loss: 0.8809
Epoch [17/51], Iter [210/391] Loss: 0.9111
Epoch [17/51], Iter [220/391] Loss: 0.8757
Epoch [17/51], Iter [230/391] Loss: 0.8939
Epoch [17/51], Iter [240/391] Loss: 0.8861
Epoch [17/51], Iter [250/391] Loss: 0.9037
Epoch [17/51], Iter [260/391] Loss: 0.8907
Epoch [17/51], Iter [270/391] Loss: 0.8839
Epoch [17/51], Iter [280/391] Loss: 0.8909
Epoch [17/51], Iter [290/391] Loss: 0.8909
Epoch [17/51], Iter [300/391] Loss: 0.8916
Epoch [17/51], Iter [310/391] Loss: 0.8756
Epoch [17/51], Iter [320/391] Loss: 0.9203
Epoch [17/51], Iter [330/391] Loss: 0.8906
Epoch [17/51], Iter [340/391] Loss: 0.8849
Epoch [17/51], Iter [350/391] Loss: 0.8930
Epoch [17/51], Iter [360/391] Loss: 0.8757
Epoch [17/51], Iter [370/391] Loss: 0.8865
Epoch [17/51], Iter [380/391] Loss: 0.8991
Epoch [17/51], Iter [390/391] Loss: 0.8976
Epoch [18/51], Iter [10/391] Loss: 0.8884
Epoch [18/51], Iter [20/391] Loss: 0.8511
Epoch [18/51], Iter [30/391] Loss: 0.8926
Epoch [18/51], Iter [40/391] Loss: 0.8952
Epoch [18/51], Iter [50/391] Loss: 0.8918
Epoch [18/51], Iter [60/391] Loss: 0.8873
Epoch [18/51], Iter [70/391] Loss: 0.8631
Epoch [18/51], Iter [80/391] Loss: 0.8766
Epoch [18/51], Iter [90/391] Loss: 0.8986
Epoch [18/51], Iter [100/391] Loss: 0.8853
Epoch [18/51], Iter [110/391] Loss: 0.9129
Epoch [18/51], Iter [120/391] Loss: 0.8940
Epoch [18/51], Iter [130/391] Loss: 0.8909
Epoch [18/51], Iter [140/391] Loss: 0.8890
Epoch [18/51], Iter [150/391] Loss: 0.8932
Epoch [18/51], Iter [160/391] Loss: 0.8918
Epoch [18/51], Iter [170/391] Loss: 0.8880
Epoch [18/51], Iter [180/391] Loss: 0.9021
Epoch [18/51], Iter [190/391] Loss: 0.8760
Epoch [18/51], Iter [200/391] Loss: 0.8983
Epoch [18/51], Iter [210/391] Loss: 0.8905
Epoch [18/51], Iter [220/391] Loss: 0.8705
Epoch [18/51], Iter [230/391] Loss: 0.8934
Epoch [18/51], Iter [240/391] Loss: 0.8966
Epoch [18/51], Iter [250/391] Loss: 0.9060
Epoch [18/51], Iter [260/391] Loss: 0.8880
Epoch [18/51], Iter [270/391] Loss: 0.8801
Epoch [18/51], Iter [280/391] Loss: 0.8818
Epoch [18/51], Iter [290/391] Loss: 0.9079
Epoch [18/51], Iter [300/391] Loss: 0.8863
Epoch [18/51], Iter [310/391] Loss: 0.8526
Epoch [18/51], Iter [320/391] Loss: 0.8994
Epoch [18/51], Iter [330/391] Loss: 0.9053
Epoch [18/51], Iter [340/391] Loss: 0.8772
Epoch [18/51], Iter [350/391] Loss: 0.8859
Epoch [18/51], Iter [360/391] Loss: 0.8994
Epoch [18/51], Iter [370/391] Loss: 0.8947
Epoch [18/51], Iter [380/391] Loss: 0.8685
Epoch [18/51], Iter [390/391] Loss: 0.8752
Epoch [19/51], Iter [10/391] Loss: 0.8652
Epoch [19/51], Iter [20/391] Loss: 0.8893
Epoch [19/51], Iter [30/391] Loss: 0.8794
Epoch [19/51], Iter [40/391] Loss: 0.8785
Epoch [19/51], Iter [50/391] Loss: 0.8763
Epoch [19/51], Iter [60/391] Loss: 0.8766
Epoch [19/51], Iter [70/391] Loss: 0.8597
Epoch [19/51], Iter [80/391] Loss: 0.8725
Epoch [19/51], Iter [90/391] Loss: 0.9241
Epoch [19/51], Iter [100/391] Loss: 0.9011
Epoch [19/51], Iter [110/391] Loss: 0.8762
Epoch [19/51], Iter [120/391] Loss: 0.8860
Epoch [19/51], Iter [130/391] Loss: 0.8809
Epoch [19/51], Iter [140/391] Loss: 0.8677
Epoch [19/51], Iter [150/391] Loss: 0.8912
Epoch [19/51], Iter [160/391] Loss: 0.8727
Epoch [19/51], Iter [170/391] Loss: 0.9046
Epoch [19/51], Iter [180/391] Loss: 0.9146
Epoch [19/51], Iter [190/391] Loss: 0.8982
Epoch [19/51], Iter [200/391] Loss: 0.8648
Epoch [19/51], Iter [210/391] Loss: 0.9056
Epoch [19/51], Iter [220/391] Loss: 0.8557
Epoch [19/51], Iter [230/391] Loss: 0.8772
Epoch [19/51], Iter [240/391] Loss: 0.9103
Epoch [19/51], Iter [250/391] Loss: 0.8749
Epoch [19/51], Iter [260/391] Loss: 0.8798
Epoch [19/51], Iter [270/391] Loss: 0.8940
Epoch [19/51], Iter [280/391] Loss: 0.8775
Epoch [19/51], Iter [290/391] Loss: 0.8692
Epoch [19/51], Iter [300/391] Loss: 0.8840
Epoch [19/51], Iter [310/391] Loss: 0.8803
Epoch [19/51], Iter [320/391] Loss: 0.9035
Epoch [19/51], Iter [330/391] Loss: 0.8743
Epoch [19/51], Iter [340/391] Loss: 0.8695
Epoch [19/51], Iter [350/391] Loss: 0.8865
Epoch [19/51], Iter [360/391] Loss: 0.8858
Epoch [19/51], Iter [370/391] Loss: 0.8840
Epoch [19/51], Iter [380/391] Loss: 0.8667
Epoch [19/51], Iter [390/391] Loss: 0.8867
Epoch [20/51], Iter [10/391] Loss: 0.8713
Epoch [20/51], Iter [20/391] Loss: 0.8754
Epoch [20/51], Iter [30/391] Loss: 0.8694
Epoch [20/51], Iter [40/391] Loss: 0.8769
Epoch [20/51], Iter [50/391] Loss: 0.8738
Epoch [20/51], Iter [60/391] Loss: 0.8685
Epoch [20/51], Iter [70/391] Loss: 0.8776
Epoch [20/51], Iter [80/391] Loss: 0.8791
Epoch [20/51], Iter [90/391] Loss: 0.8686
Epoch [20/51], Iter [100/391] Loss: 0.8808
Epoch [20/51], Iter [110/391] Loss: 0.8756
Epoch [20/51], Iter [120/391] Loss: 0.8690
Epoch [20/51], Iter [130/391] Loss: 0.8713
Epoch [20/51], Iter [140/391] Loss: 0.8724
Epoch [20/51], Iter [150/391] Loss: 0.8660
Epoch [20/51], Iter [160/391] Loss: 0.8788
Epoch [20/51], Iter [170/391] Loss: 0.8507
Epoch [20/51], Iter [180/391] Loss: 0.8688
Epoch [20/51], Iter [190/391] Loss: 0.8600
Epoch [20/51], Iter [200/391] Loss: 0.9056
Epoch [20/51], Iter [210/391] Loss: 0.8770
Epoch [20/51], Iter [220/391] Loss: 0.8773
Epoch [20/51], Iter [230/391] Loss: 0.8621
Epoch [20/51], Iter [240/391] Loss: 0.8961
Epoch [20/51], Iter [250/391] Loss: 0.8687
Epoch [20/51], Iter [260/391] Loss: 0.8781
Epoch [20/51], Iter [270/391] Loss: 0.8837
Epoch [20/51], Iter [280/391] Loss: 0.8624
Epoch [20/51], Iter [290/391] Loss: 0.8750
Epoch [20/51], Iter [300/391] Loss: 0.8546
Epoch [20/51], Iter [310/391] Loss: 0.8593
Epoch [20/51], Iter [320/391] Loss: 0.8586
Epoch [20/51], Iter [330/391] Loss: 0.8704
Epoch [20/51], Iter [340/391] Loss: 0.8775
Epoch [20/51], Iter [350/391] Loss: 0.8608
Epoch [20/51], Iter [360/391] Loss: 0.9000
Epoch [20/51], Iter [370/391] Loss: 0.8759
Epoch [20/51], Iter [380/391] Loss: 0.9037
Epoch [20/51], Iter [390/391] Loss: 0.8760
[Saving Checkpoint]
Epoch [21/51], Iter [10/391] Loss: 0.8662
Epoch [21/51], Iter [20/391] Loss: 0.8521
Epoch [21/51], Iter [30/391] Loss: 0.8574
Epoch [21/51], Iter [40/391] Loss: 0.8821
Epoch [21/51], Iter [50/391] Loss: 0.8654
Epoch [21/51], Iter [60/391] Loss: 0.8712
Epoch [21/51], Iter [70/391] Loss: 0.8532
Epoch [21/51], Iter [80/391] Loss: 0.8672
Epoch [21/51], Iter [90/391] Loss: 0.8635
Epoch [21/51], Iter [100/391] Loss: 0.8398
Epoch [21/51], Iter [110/391] Loss: 0.8716
Epoch [21/51], Iter [120/391] Loss: 0.8837
Epoch [21/51], Iter [130/391] Loss: 0.8815
Epoch [21/51], Iter [140/391] Loss: 0.8651
Epoch [21/51], Iter [150/391] Loss: 0.8748
Epoch [21/51], Iter [160/391] Loss: 0.8681
Epoch [21/51], Iter [170/391] Loss: 0.8758
Epoch [21/51], Iter [180/391] Loss: 0.8664
Epoch [21/51], Iter [190/391] Loss: 0.8809
Epoch [21/51], Iter [200/391] Loss: 0.8719
Epoch [21/51], Iter [210/391] Loss: 0.8606
Epoch [21/51], Iter [220/391] Loss: 0.8635
Epoch [21/51], Iter [230/391] Loss: 0.8785
Epoch [21/51], Iter [240/391] Loss: 0.8631
Epoch [21/51], Iter [250/391] Loss: 0.8600
Epoch [21/51], Iter [260/391] Loss: 0.8841
Epoch [21/51], Iter [270/391] Loss: 0.8633
Epoch [21/51], Iter [280/391] Loss: 0.8585
Epoch [21/51], Iter [290/391] Loss: 0.8705
Epoch [21/51], Iter [300/391] Loss: 0.8889
Epoch [21/51], Iter [310/391] Loss: 0.8988
Epoch [21/51], Iter [320/391] Loss: 0.8751
Epoch [21/51], Iter [330/391] Loss: 0.8547
Epoch [21/51], Iter [340/391] Loss: 0.8879
Epoch [21/51], Iter [350/391] Loss: 0.8693
Epoch [21/51], Iter [360/391] Loss: 0.8617
Epoch [21/51], Iter [370/391] Loss: 0.8641
Epoch [21/51], Iter [380/391] Loss: 0.8653
Epoch [21/51], Iter [390/391] Loss: 0.8578
Epoch [22/51], Iter [10/391] Loss: 0.8623
Epoch [22/51], Iter [20/391] Loss: 0.8620
Epoch [22/51], Iter [30/391] Loss: 0.8524
Epoch [22/51], Iter [40/391] Loss: 0.8746
Epoch [22/51], Iter [50/391] Loss: 0.8869
Epoch [22/51], Iter [60/391] Loss: 0.8797
Epoch [22/51], Iter [70/391] Loss: 0.8908
Epoch [22/51], Iter [80/391] Loss: 0.8646
Epoch [22/51], Iter [90/391] Loss: 0.8639
Epoch [22/51], Iter [100/391] Loss: 0.8624
Epoch [22/51], Iter [110/391] Loss: 0.8891
Epoch [22/51], Iter [120/391] Loss: 0.8509
Epoch [22/51], Iter [130/391] Loss: 0.8720
Epoch [22/51], Iter [140/391] Loss: 0.8976
Epoch [22/51], Iter [150/391] Loss: 0.8651
Epoch [22/51], Iter [160/391] Loss: 0.8521
Epoch [22/51], Iter [170/391] Loss: 0.8435
Epoch [22/51], Iter [180/391] Loss: 0.8738
Epoch [22/51], Iter [190/391] Loss: 0.8688
Epoch [22/51], Iter [200/391] Loss: 0.8574
Epoch [22/51], Iter [210/391] Loss: 0.8476
Epoch [22/51], Iter [220/391] Loss: 0.8556
Epoch [22/51], Iter [230/391] Loss: 0.8770
Epoch [22/51], Iter [240/391] Loss: 0.8787
Epoch [22/51], Iter [250/391] Loss: 0.8641
Epoch [22/51], Iter [260/391] Loss: 0.8548
Epoch [22/51], Iter [270/391] Loss: 0.8567
Epoch [22/51], Iter [280/391] Loss: 0.8708
Epoch [22/51], Iter [290/391] Loss: 0.8553
Epoch [22/51], Iter [300/391] Loss: 0.8827
Epoch [22/51], Iter [310/391] Loss: 0.8541
Epoch [22/51], Iter [320/391] Loss: 0.8538
Epoch [22/51], Iter [330/391] Loss: 0.8544
Epoch [22/51], Iter [340/391] Loss: 0.8542
Epoch [22/51], Iter [350/391] Loss: 0.8697
Epoch [22/51], Iter [360/391] Loss: 0.8589
Epoch [22/51], Iter [370/391] Loss: 0.8800
Epoch [22/51], Iter [380/391] Loss: 0.8638
Epoch [22/51], Iter [390/391] Loss: 0.8968
Epoch [23/51], Iter [10/391] Loss: 0.8553
Epoch [23/51], Iter [20/391] Loss: 0.8770
Epoch [23/51], Iter [30/391] Loss: 0.8528
Epoch [23/51], Iter [40/391] Loss: 0.8666
Epoch [23/51], Iter [50/391] Loss: 0.8732
Epoch [23/51], Iter [60/391] Loss: 0.8785
Epoch [23/51], Iter [70/391] Loss: 0.8853
Epoch [23/51], Iter [80/391] Loss: 0.8762
Epoch [23/51], Iter [90/391] Loss: 0.8627
Epoch [23/51], Iter [100/391] Loss: 0.8764
Epoch [23/51], Iter [110/391] Loss: 0.8615
Epoch [23/51], Iter [120/391] Loss: 0.8664
Epoch [23/51], Iter [130/391] Loss: 0.8775
Epoch [23/51], Iter [140/391] Loss: 0.8741
Epoch [23/51], Iter [150/391] Loss: 0.8468
Epoch [23/51], Iter [160/391] Loss: 0.8661
Epoch [23/51], Iter [170/391] Loss: 0.8456
Epoch [23/51], Iter [180/391] Loss: 0.8599
Epoch [23/51], Iter [190/391] Loss: 0.8683
Epoch [23/51], Iter [200/391] Loss: 0.8708
Epoch [23/51], Iter [210/391] Loss: 0.8540
Epoch [23/51], Iter [220/391] Loss: 0.8597
Epoch [23/51], Iter [230/391] Loss: 0.8558
Epoch [23/51], Iter [240/391] Loss: 0.8804
Epoch [23/51], Iter [250/391] Loss: 0.8654
Epoch [23/51], Iter [260/391] Loss: 0.8445
Epoch [23/51], Iter [270/391] Loss: 0.8924
Epoch [23/51], Iter [280/391] Loss: 0.8732
Epoch [23/51], Iter [290/391] Loss: 0.8829
Epoch [23/51], Iter [300/391] Loss: 0.8661
Epoch [23/51], Iter [310/391] Loss: 0.8568
Epoch [23/51], Iter [320/391] Loss: 0.8682
Epoch [23/51], Iter [330/391] Loss: 0.8813
Epoch [23/51], Iter [340/391] Loss: 0.8573
Epoch [23/51], Iter [350/391] Loss: 0.8675
Epoch [23/51], Iter [360/391] Loss: 0.8518
Epoch [23/51], Iter [370/391] Loss: 0.8767
Epoch [23/51], Iter [380/391] Loss: 0.9056
Epoch [23/51], Iter [390/391] Loss: 0.8901
Epoch [24/51], Iter [10/391] Loss: 0.8678
Epoch [24/51], Iter [20/391] Loss: 0.8763
Epoch [24/51], Iter [30/391] Loss: 0.8765
Epoch [24/51], Iter [40/391] Loss: 0.8898
Epoch [24/51], Iter [50/391] Loss: 0.8374
Epoch [24/51], Iter [60/391] Loss: 0.8626
Epoch [24/51], Iter [70/391] Loss: 0.8561
Epoch [24/51], Iter [80/391] Loss: 0.8618
Epoch [24/51], Iter [90/391] Loss: 0.8570
Epoch [24/51], Iter [100/391] Loss: 0.8862
Epoch [24/51], Iter [110/391] Loss: 0.8619
Epoch [24/51], Iter [120/391] Loss: 0.8704
Epoch [24/51], Iter [130/391] Loss: 0.8717
Epoch [24/51], Iter [140/391] Loss: 0.8520
Epoch [24/51], Iter [150/391] Loss: 0.8586
Epoch [24/51], Iter [160/391] Loss: 0.8689
Epoch [24/51], Iter [170/391] Loss: 0.8536
Epoch [24/51], Iter [180/391] Loss: 0.8565
Epoch [24/51], Iter [190/391] Loss: 0.8552
Epoch [24/51], Iter [200/391] Loss: 0.8723
Epoch [24/51], Iter [210/391] Loss: 0.8639
Epoch [24/51], Iter [220/391] Loss: 0.8518
Epoch [24/51], Iter [230/391] Loss: 0.8469
Epoch [24/51], Iter [240/391] Loss: 0.8657
Epoch [24/51], Iter [250/391] Loss: 0.8555
Epoch [24/51], Iter [260/391] Loss: 0.8627
Epoch [24/51], Iter [270/391] Loss: 0.8529
Epoch [24/51], Iter [280/391] Loss: 0.8779
Epoch [24/51], Iter [290/391] Loss: 0.8954
Epoch [24/51], Iter [300/391] Loss: 0.8629
Epoch [24/51], Iter [310/391] Loss: 0.8838
Epoch [24/51], Iter [320/391] Loss: 0.8947
Epoch [24/51], Iter [330/391] Loss: 0.8529
Epoch [24/51], Iter [340/391] Loss: 0.8543
Epoch [24/51], Iter [350/391] Loss: 0.8434
Epoch [24/51], Iter [360/391] Loss: 0.8559
Epoch [24/51], Iter [370/391] Loss: 0.8444
Epoch [24/51], Iter [380/391] Loss: 0.8736
Epoch [24/51], Iter [390/391] Loss: 0.8614
Epoch [25/51], Iter [10/391] Loss: 0.8591
Epoch [25/51], Iter [20/391] Loss: 0.8263
Epoch [25/51], Iter [30/391] Loss: 0.8450
Epoch [25/51], Iter [40/391] Loss: 0.8351
Epoch [25/51], Iter [50/391] Loss: 0.8428
Epoch [25/51], Iter [60/391] Loss: 0.8414
Epoch [25/51], Iter [70/391] Loss: 0.8605
Epoch [25/51], Iter [80/391] Loss: 0.8752
Epoch [25/51], Iter [90/391] Loss: 0.8528
Epoch [25/51], Iter [100/391] Loss: 0.8394
Epoch [25/51], Iter [110/391] Loss: 0.8553
Epoch [25/51], Iter [120/391] Loss: 0.8334
Epoch [25/51], Iter [130/391] Loss: 0.8559
Epoch [25/51], Iter [140/391] Loss: 0.8758
Epoch [25/51], Iter [150/391] Loss: 0.8516
Epoch [25/51], Iter [160/391] Loss: 0.8656
Epoch [25/51], Iter [170/391] Loss: 0.8760
Epoch [25/51], Iter [180/391] Loss: 0.8674
Epoch [25/51], Iter [190/391] Loss: 0.8585
Epoch [25/51], Iter [200/391] Loss: 0.8606
Epoch [25/51], Iter [210/391] Loss: 0.8572
Epoch [25/51], Iter [220/391] Loss: 0.8547
Epoch [25/51], Iter [230/391] Loss: 0.8555
Epoch [25/51], Iter [240/391] Loss: 0.8822
Epoch [25/51], Iter [250/391] Loss: 0.8629
Epoch [25/51], Iter [260/391] Loss: 0.8650
Epoch [25/51], Iter [270/391] Loss: 0.8360
Epoch [25/51], Iter [280/391] Loss: 0.8729
Epoch [25/51], Iter [290/391] Loss: 0.8696
Epoch [25/51], Iter [300/391] Loss: 0.8488
Epoch [25/51], Iter [310/391] Loss: 0.8439
Epoch [25/51], Iter [320/391] Loss: 0.8833
Epoch [25/51], Iter [330/391] Loss: 0.8633
Epoch [25/51], Iter [340/391] Loss: 0.8513
Epoch [25/51], Iter [350/391] Loss: 0.8645
Epoch [25/51], Iter [360/391] Loss: 0.8347
Epoch [25/51], Iter [370/391] Loss: 0.8484
Epoch [25/51], Iter [380/391] Loss: 0.8596
Epoch [25/51], Iter [390/391] Loss: 0.8796
Epoch [26/51], Iter [10/391] Loss: 0.8735
Epoch [26/51], Iter [20/391] Loss: 0.8498
Epoch [26/51], Iter [30/391] Loss: 0.8757
Epoch [26/51], Iter [40/391] Loss: 0.8771
Epoch [26/51], Iter [50/391] Loss: 0.8583
Epoch [26/51], Iter [60/391] Loss: 0.8684
Epoch [26/51], Iter [70/391] Loss: 0.8678
Epoch [26/51], Iter [80/391] Loss: 0.8428
Epoch [26/51], Iter [90/391] Loss: 0.8563
Epoch [26/51], Iter [100/391] Loss: 0.8586
Epoch [26/51], Iter [110/391] Loss: 0.8425
Epoch [26/51], Iter [120/391] Loss: 0.8549
Epoch [26/51], Iter [130/391] Loss: 0.8474
Epoch [26/51], Iter [140/391] Loss: 0.8471
Epoch [26/51], Iter [150/391] Loss: 0.8618
Epoch [26/51], Iter [160/391] Loss: 0.8377
Epoch [26/51], Iter [170/391] Loss: 0.8408
Epoch [26/51], Iter [180/391] Loss: 0.8662
Epoch [26/51], Iter [190/391] Loss: 0.8667
Epoch [26/51], Iter [200/391] Loss: 0.8604
Epoch [26/51], Iter [210/391] Loss: 0.8563
Epoch [26/51], Iter [220/391] Loss: 0.8692
Epoch [26/51], Iter [230/391] Loss: 0.8592
Epoch [26/51], Iter [240/391] Loss: 0.8570
Epoch [26/51], Iter [250/391] Loss: 0.8492
Epoch [26/51], Iter [260/391] Loss: 0.8498
Epoch [26/51], Iter [270/391] Loss: 0.8473
Epoch [26/51], Iter [280/391] Loss: 0.8391
Epoch [26/51], Iter [290/391] Loss: 0.8499
Epoch [26/51], Iter [300/391] Loss: 0.8674
Epoch [26/51], Iter [310/391] Loss: 0.8873
Epoch [26/51], Iter [320/391] Loss: 0.8474
Epoch [26/51], Iter [330/391] Loss: 0.8752
Epoch [26/51], Iter [340/391] Loss: 0.8575
Epoch [26/51], Iter [350/391] Loss: 0.8665
Epoch [26/51], Iter [360/391] Loss: 0.8434
Epoch [26/51], Iter [370/391] Loss: 0.8622
Epoch [26/51], Iter [380/391] Loss: 0.8479
Epoch [26/51], Iter [390/391] Loss: 0.8487
Epoch [27/51], Iter [10/391] Loss: 0.8460
Epoch [27/51], Iter [20/391] Loss: 0.8463
Epoch [27/51], Iter [30/391] Loss: 0.8421
Epoch [27/51], Iter [40/391] Loss: 0.8458
Epoch [27/51], Iter [50/391] Loss: 0.8542
Epoch [27/51], Iter [60/391] Loss: 0.8566
Epoch [27/51], Iter [70/391] Loss: 0.8531
Epoch [27/51], Iter [80/391] Loss: 0.8431
Epoch [27/51], Iter [90/391] Loss: 0.8503
Epoch [27/51], Iter [100/391] Loss: 0.8514
Epoch [27/51], Iter [110/391] Loss: 0.8614
Epoch [27/51], Iter [120/391] Loss: 0.8666
Epoch [27/51], Iter [130/391] Loss: 0.8444
Epoch [27/51], Iter [140/391] Loss: 0.8636
Epoch [27/51], Iter [150/391] Loss: 0.8577
Epoch [27/51], Iter [160/391] Loss: 0.8616
Epoch [27/51], Iter [170/391] Loss: 0.8579
Epoch [27/51], Iter [180/391] Loss: 0.8643
Epoch [27/51], Iter [190/391] Loss: 0.8598
Epoch [27/51], Iter [200/391] Loss: 0.8423
Epoch [27/51], Iter [210/391] Loss: 0.8515
Epoch [27/51], Iter [220/391] Loss: 0.8619
Epoch [27/51], Iter [230/391] Loss: 0.8473
Epoch [27/51], Iter [240/391] Loss: 0.8438
Epoch [27/51], Iter [250/391] Loss: 0.8396
Epoch [27/51], Iter [260/391] Loss: 0.8447
Epoch [27/51], Iter [270/391] Loss: 0.8578
Epoch [27/51], Iter [280/391] Loss: 0.8542
Epoch [27/51], Iter [290/391] Loss: 0.8634
Epoch [27/51], Iter [300/391] Loss: 0.8606
Epoch [27/51], Iter [310/391] Loss: 0.8664
Epoch [27/51], Iter [320/391] Loss: 0.8418
Epoch [27/51], Iter [330/391] Loss: 0.8705
Epoch [27/51], Iter [340/391] Loss: 0.8656
Epoch [27/51], Iter [350/391] Loss: 0.8537
Epoch [27/51], Iter [360/391] Loss: 0.8437
Epoch [27/51], Iter [370/391] Loss: 0.8697
Epoch [27/51], Iter [380/391] Loss: 0.8512
Epoch [27/51], Iter [390/391] Loss: 0.8477
Epoch [28/51], Iter [10/391] Loss: 0.8466
Epoch [28/51], Iter [20/391] Loss: 0.8549
Epoch [28/51], Iter [30/391] Loss: 0.8595
Epoch [28/51], Iter [40/391] Loss: 0.8476
Epoch [28/51], Iter [50/391] Loss: 0.8521
Epoch [28/51], Iter [60/391] Loss: 0.8527
Epoch [28/51], Iter [70/391] Loss: 0.8447
Epoch [28/51], Iter [80/391] Loss: 0.8572
Epoch [28/51], Iter [90/391] Loss: 0.8575
Epoch [28/51], Iter [100/391] Loss: 0.8521
Epoch [28/51], Iter [110/391] Loss: 0.8773
Epoch [28/51], Iter [120/391] Loss: 0.8559
Epoch [28/51], Iter [130/391] Loss: 0.8376
Epoch [28/51], Iter [140/391] Loss: 0.8518
Epoch [28/51], Iter [150/391] Loss: 0.8563
Epoch [28/51], Iter [160/391] Loss: 0.8493
Epoch [28/51], Iter [170/391] Loss: 0.8746
Epoch [28/51], Iter [180/391] Loss: 0.8712
Epoch [28/51], Iter [190/391] Loss: 0.8593
Epoch [28/51], Iter [200/391] Loss: 0.8472
Epoch [28/51], Iter [210/391] Loss: 0.8598
Epoch [28/51], Iter [220/391] Loss: 0.8502
Epoch [28/51], Iter [230/391] Loss: 0.8536
Epoch [28/51], Iter [240/391] Loss: 0.8669
Epoch [28/51], Iter [250/391] Loss: 0.8549
Epoch [28/51], Iter [260/391] Loss: 0.8462
Epoch [28/51], Iter [270/391] Loss: 0.8668
Epoch [28/51], Iter [280/391] Loss: 0.8591
Epoch [28/51], Iter [290/391] Loss: 0.8648
Epoch [28/51], Iter [300/391] Loss: 0.8497
Epoch [28/51], Iter [310/391] Loss: 0.8546
Epoch [28/51], Iter [320/391] Loss: 0.8549
Epoch [28/51], Iter [330/391] Loss: 0.8437
Epoch [28/51], Iter [340/391] Loss: 0.8574
Epoch [28/51], Iter [350/391] Loss: 0.8603
Epoch [28/51], Iter [360/391] Loss: 0.8442
Epoch [28/51], Iter [370/391] Loss: 0.8521
Epoch [28/51], Iter [380/391] Loss: 0.8508
Epoch [28/51], Iter [390/391] Loss: 0.8775
Epoch [29/51], Iter [10/391] Loss: 0.8619
Epoch [29/51], Iter [20/391] Loss: 0.8558
Epoch [29/51], Iter [30/391] Loss: 0.8454
Epoch [29/51], Iter [40/391] Loss: 0.8581
Epoch [29/51], Iter [50/391] Loss: 0.8442
Epoch [29/51], Iter [60/391] Loss: 0.8529
Epoch [29/51], Iter [70/391] Loss: 0.8369
Epoch [29/51], Iter [80/391] Loss: 0.8446
Epoch [29/51], Iter [90/391] Loss: 0.8423
Epoch [29/51], Iter [100/391] Loss: 0.8393
Epoch [29/51], Iter [110/391] Loss: 0.8487
Epoch [29/51], Iter [120/391] Loss: 0.8464
Epoch [29/51], Iter [130/391] Loss: 0.8477
Epoch [29/51], Iter [140/391] Loss: 0.8652
Epoch [29/51], Iter [150/391] Loss: 0.8381
Epoch [29/51], Iter [160/391] Loss: 0.8378
Epoch [29/51], Iter [170/391] Loss: 0.8543
Epoch [29/51], Iter [180/391] Loss: 0.8453
Epoch [29/51], Iter [190/391] Loss: 0.8404
Epoch [29/51], Iter [200/391] Loss: 0.8390
Epoch [29/51], Iter [210/391] Loss: 0.8576
Epoch [29/51], Iter [220/391] Loss: 0.8501
Epoch [29/51], Iter [230/391] Loss: 0.8571
Epoch [29/51], Iter [240/391] Loss: 0.8322
Epoch [29/51], Iter [250/391] Loss: 0.8629
Epoch [29/51], Iter [260/391] Loss: 0.8533
Epoch [29/51], Iter [270/391] Loss: 0.8681
Epoch [29/51], Iter [280/391] Loss: 0.8353
Epoch [29/51], Iter [290/391] Loss: 0.8447
Epoch [29/51], Iter [300/391] Loss: 0.8584
Epoch [29/51], Iter [310/391] Loss: 0.8427
Epoch [29/51], Iter [320/391] Loss: 0.8619
Epoch [29/51], Iter [330/391] Loss: 0.8427
Epoch [29/51], Iter [340/391] Loss: 0.8604
Epoch [29/51], Iter [350/391] Loss: 0.8720
Epoch [29/51], Iter [360/391] Loss: 0.8443
Epoch [29/51], Iter [370/391] Loss: 0.8579
Epoch [29/51], Iter [380/391] Loss: 0.8435
Epoch [29/51], Iter [390/391] Loss: 0.8371
Epoch [30/51], Iter [10/391] Loss: 0.8546
Epoch [30/51], Iter [20/391] Loss: 0.8525
Epoch [30/51], Iter [30/391] Loss: 0.8475
Epoch [30/51], Iter [40/391] Loss: 0.8514
Epoch [30/51], Iter [50/391] Loss: 0.8317
Epoch [30/51], Iter [60/391] Loss: 0.8322
Epoch [30/51], Iter [70/391] Loss: 0.8484
Epoch [30/51], Iter [80/391] Loss: 0.8616
Epoch [30/51], Iter [90/391] Loss: 0.8502
Epoch [30/51], Iter [100/391] Loss: 0.8468
Epoch [30/51], Iter [110/391] Loss: 0.8464
Epoch [30/51], Iter [120/391] Loss: 0.8485
Epoch [30/51], Iter [130/391] Loss: 0.8552
Epoch [30/51], Iter [140/391] Loss: 0.8660
Epoch [30/51], Iter [150/391] Loss: 0.8390
Epoch [30/51], Iter [160/391] Loss: 0.8452
Epoch [30/51], Iter [170/391] Loss: 0.8246
Epoch [30/51], Iter [180/391] Loss: 0.8530
Epoch [30/51], Iter [190/391] Loss: 0.8480
Epoch [30/51], Iter [200/391] Loss: 0.8563
Epoch [30/51], Iter [210/391] Loss: 0.8510
Epoch [30/51], Iter [220/391] Loss: 0.8407
Epoch [30/51], Iter [230/391] Loss: 0.8512
Epoch [30/51], Iter [240/391] Loss: 0.8432
Epoch [30/51], Iter [250/391] Loss: 0.8504
Epoch [30/51], Iter [260/391] Loss: 0.8312
Epoch [30/51], Iter [270/391] Loss: 0.8428
Epoch [30/51], Iter [280/391] Loss: 0.8544
Epoch [30/51], Iter [290/391] Loss: 0.8681
Epoch [30/51], Iter [300/391] Loss: 0.8686
Epoch [30/51], Iter [310/391] Loss: 0.8499
Epoch [30/51], Iter [320/391] Loss: 0.8557
Epoch [30/51], Iter [330/391] Loss: 0.8440
Epoch [30/51], Iter [340/391] Loss: 0.8619
Epoch [30/51], Iter [350/391] Loss: 0.8458
Epoch [30/51], Iter [360/391] Loss: 0.8347
Epoch [30/51], Iter [370/391] Loss: 0.8382
Epoch [30/51], Iter [380/391] Loss: 0.8591
Epoch [30/51], Iter [390/391] Loss: 0.8477
[Saving Checkpoint]
Epoch [31/51], Iter [10/391] Loss: 0.8522
Epoch [31/51], Iter [20/391] Loss: 0.8322
Epoch [31/51], Iter [30/391] Loss: 0.8368
Epoch [31/51], Iter [40/391] Loss: 0.8494
Epoch [31/51], Iter [50/391] Loss: 0.8543
Epoch [31/51], Iter [60/391] Loss: 0.8493
Epoch [31/51], Iter [70/391] Loss: 0.8488
Epoch [31/51], Iter [80/391] Loss: 0.8331
Epoch [31/51], Iter [90/391] Loss: 0.8494
Epoch [31/51], Iter [100/391] Loss: 0.8395
Epoch [31/51], Iter [110/391] Loss: 0.8368
Epoch [31/51], Iter [120/391] Loss: 0.8564
Epoch [31/51], Iter [130/391] Loss: 0.8486
Epoch [31/51], Iter [140/391] Loss: 0.8294
Epoch [31/51], Iter [150/391] Loss: 0.8401
Epoch [31/51], Iter [160/391] Loss: 0.8649
Epoch [31/51], Iter [170/391] Loss: 0.8380
Epoch [31/51], Iter [180/391] Loss: 0.8408
Epoch [31/51], Iter [190/391] Loss: 0.8357
Epoch [31/51], Iter [200/391] Loss: 0.8365
Epoch [31/51], Iter [210/391] Loss: 0.8529
Epoch [31/51], Iter [220/391] Loss: 0.8429
Epoch [31/51], Iter [230/391] Loss: 0.8494
Epoch [31/51], Iter [240/391] Loss: 0.8453
Epoch [31/51], Iter [250/391] Loss: 0.8549
Epoch [31/51], Iter [260/391] Loss: 0.8726
Epoch [31/51], Iter [270/391] Loss: 0.8632
Epoch [31/51], Iter [280/391] Loss: 0.8383
Epoch [31/51], Iter [290/391] Loss: 0.8360
Epoch [31/51], Iter [300/391] Loss: 0.8432
Epoch [31/51], Iter [310/391] Loss: 0.8493
Epoch [31/51], Iter [320/391] Loss: 0.8462
Epoch [31/51], Iter [330/391] Loss: 0.8561
Epoch [31/51], Iter [340/391] Loss: 0.8462
Epoch [31/51], Iter [350/391] Loss: 0.8439
Epoch [31/51], Iter [360/391] Loss: 0.8361
Epoch [31/51], Iter [370/391] Loss: 0.8339
Epoch [31/51], Iter [380/391] Loss: 0.8312
Epoch [31/51], Iter [390/391] Loss: 0.8291
Epoch [32/51], Iter [10/391] Loss: 0.8577
Epoch [32/51], Iter [20/391] Loss: 0.8478
Epoch [32/51], Iter [30/391] Loss: 0.8309
Epoch [32/51], Iter [40/391] Loss: 0.8383
Epoch [32/51], Iter [50/391] Loss: 0.8275
Epoch [32/51], Iter [60/391] Loss: 0.8474
Epoch [32/51], Iter [70/391] Loss: 0.8378
Epoch [32/51], Iter [80/391] Loss: 0.8563
Epoch [32/51], Iter [90/391] Loss: 0.8395
Epoch [32/51], Iter [100/391] Loss: 0.8394
Epoch [32/51], Iter [110/391] Loss: 0.8473
Epoch [32/51], Iter [120/391] Loss: 0.8354
Epoch [32/51], Iter [130/391] Loss: 0.8524
Epoch [32/51], Iter [140/391] Loss: 0.8407
Epoch [32/51], Iter [150/391] Loss: 0.8362
Epoch [32/51], Iter [160/391] Loss: 0.8271
Epoch [32/51], Iter [170/391] Loss: 0.8259
Epoch [32/51], Iter [180/391] Loss: 0.8448
Epoch [32/51], Iter [190/391] Loss: 0.8503
Epoch [32/51], Iter [200/391] Loss: 0.8351
Epoch [32/51], Iter [210/391] Loss: 0.8367
Epoch [32/51], Iter [220/391] Loss: 0.8478
Epoch [32/51], Iter [230/391] Loss: 0.8464
Epoch [32/51], Iter [240/391] Loss: 0.8278
Epoch [32/51], Iter [250/391] Loss: 0.8462
Epoch [32/51], Iter [260/391] Loss: 0.8495
Epoch [32/51], Iter [270/391] Loss: 0.8394
Epoch [32/51], Iter [280/391] Loss: 0.8307
Epoch [32/51], Iter [290/391] Loss: 0.8372
Epoch [32/51], Iter [300/391] Loss: 0.8461
Epoch [32/51], Iter [310/391] Loss: 0.8483
Epoch [32/51], Iter [320/391] Loss: 0.8552
Epoch [32/51], Iter [330/391] Loss: 0.8437
Epoch [32/51], Iter [340/391] Loss: 0.8477
Epoch [32/51], Iter [350/391] Loss: 0.8507
Epoch [32/51], Iter [360/391] Loss: 0.8504
Epoch [32/51], Iter [370/391] Loss: 0.8311
Epoch [32/51], Iter [380/391] Loss: 0.8450
Epoch [32/51], Iter [390/391] Loss: 0.8740
Epoch [33/51], Iter [10/391] Loss: 0.8582
Epoch [33/51], Iter [20/391] Loss: 0.8298
Epoch [33/51], Iter [30/391] Loss: 0.8469
Epoch [33/51], Iter [40/391] Loss: 0.8301
Epoch [33/51], Iter [50/391] Loss: 0.8574
Epoch [33/51], Iter [60/391] Loss: 0.8613
Epoch [33/51], Iter [70/391] Loss: 0.8487
Epoch [33/51], Iter [80/391] Loss: 0.8341
Epoch [33/51], Iter [90/391] Loss: 0.8282
Epoch [33/51], Iter [100/391] Loss: 0.8358
Epoch [33/51], Iter [110/391] Loss: 0.8346
Epoch [33/51], Iter [120/391] Loss: 0.8409
Epoch [33/51], Iter [130/391] Loss: 0.8393
Epoch [33/51], Iter [140/391] Loss: 0.8551
Epoch [33/51], Iter [150/391] Loss: 0.8293
Epoch [33/51], Iter [160/391] Loss: 0.8423
Epoch [33/51], Iter [170/391] Loss: 0.8485
Epoch [33/51], Iter [180/391] Loss: 0.8489
Epoch [33/51], Iter [190/391] Loss: 0.8385
Epoch [33/51], Iter [200/391] Loss: 0.8465
Epoch [33/51], Iter [210/391] Loss: 0.8412
Epoch [33/51], Iter [220/391] Loss: 0.8503
Epoch [33/51], Iter [230/391] Loss: 0.8440
Epoch [33/51], Iter [240/391] Loss: 0.8453
Epoch [33/51], Iter [250/391] Loss: 0.8444
Epoch [33/51], Iter [260/391] Loss: 0.8379
Epoch [33/51], Iter [270/391] Loss: 0.8391
Epoch [33/51], Iter [280/391] Loss: 0.8472
Epoch [33/51], Iter [290/391] Loss: 0.8503
Epoch [33/51], Iter [300/391] Loss: 0.8407
Epoch [33/51], Iter [310/391] Loss: 0.8665
Epoch [33/51], Iter [320/391] Loss: 0.8366
Epoch [33/51], Iter [330/391] Loss: 0.8369
Epoch [33/51], Iter [340/391] Loss: 0.8278
Epoch [33/51], Iter [350/391] Loss: 0.8699
Epoch [33/51], Iter [360/391] Loss: 0.8317
Epoch [33/51], Iter [370/391] Loss: 0.8526
Epoch [33/51], Iter [380/391] Loss: 0.8679
Epoch [33/51], Iter [390/391] Loss: 0.8429
Epoch [34/51], Iter [10/391] Loss: 0.8324
Epoch [34/51], Iter [20/391] Loss: 0.8338
Epoch [34/51], Iter [30/391] Loss: 0.8338
Epoch [34/51], Iter [40/391] Loss: 0.8388
Epoch [34/51], Iter [50/391] Loss: 0.8388
Epoch [34/51], Iter [60/391] Loss: 0.8417
Epoch [34/51], Iter [70/391] Loss: 0.8234
Epoch [34/51], Iter [80/391] Loss: 0.8291
Epoch [34/51], Iter [90/391] Loss: 0.8580
Epoch [34/51], Iter [100/391] Loss: 0.8466
Epoch [34/51], Iter [110/391] Loss: 0.8379
Epoch [34/51], Iter [120/391] Loss: 0.8467
Epoch [34/51], Iter [130/391] Loss: 0.8399
Epoch [34/51], Iter [140/391] Loss: 0.8507
Epoch [34/51], Iter [150/391] Loss: 0.8328
Epoch [34/51], Iter [160/391] Loss: 0.8459
Epoch [34/51], Iter [170/391] Loss: 0.8284
Epoch [34/51], Iter [180/391] Loss: 0.8408
Epoch [34/51], Iter [190/391] Loss: 0.8436
Epoch [34/51], Iter [200/391] Loss: 0.8588
Epoch [34/51], Iter [210/391] Loss: 0.8376
Epoch [34/51], Iter [220/391] Loss: 0.8378
Epoch [34/51], Iter [230/391] Loss: 0.8196
Epoch [34/51], Iter [240/391] Loss: 0.8535
Epoch [34/51], Iter [250/391] Loss: 0.8354
Epoch [34/51], Iter [260/391] Loss: 0.8505
Epoch [34/51], Iter [270/391] Loss: 0.8613
Epoch [34/51], Iter [280/391] Loss: 0.8338
Epoch [34/51], Iter [290/391] Loss: 0.8386
Epoch [34/51], Iter [300/391] Loss: 0.8385
Epoch [34/51], Iter [310/391] Loss: 0.8329
Epoch [34/51], Iter [320/391] Loss: 0.8432
Epoch [34/51], Iter [330/391] Loss: 0.8560
Epoch [34/51], Iter [340/391] Loss: 0.8391
Epoch [34/51], Iter [350/391] Loss: 0.8467
Epoch [34/51], Iter [360/391] Loss: 0.8321
Epoch [34/51], Iter [370/391] Loss: 0.8232
Epoch [34/51], Iter [380/391] Loss: 0.8372
Epoch [34/51], Iter [390/391] Loss: 0.8629
Epoch [35/51], Iter [10/391] Loss: 0.8485
Epoch [35/51], Iter [20/391] Loss: 0.8272
Epoch [35/51], Iter [30/391] Loss: 0.8414
Epoch [35/51], Iter [40/391] Loss: 0.8409
Epoch [35/51], Iter [50/391] Loss: 0.8440
Epoch [35/51], Iter [60/391] Loss: 0.8270
Epoch [35/51], Iter [70/391] Loss: 0.8459
Epoch [35/51], Iter [80/391] Loss: 0.8527
Epoch [35/51], Iter [90/391] Loss: 0.8489
Epoch [35/51], Iter [100/391] Loss: 0.8452
Epoch [35/51], Iter [110/391] Loss: 0.8476
Epoch [35/51], Iter [120/391] Loss: 0.8440
Epoch [35/51], Iter [130/391] Loss: 0.8260
Epoch [35/51], Iter [140/391] Loss: 0.8413
Epoch [35/51], Iter [150/391] Loss: 0.8282
Epoch [35/51], Iter [160/391] Loss: 0.8400
Epoch [35/51], Iter [170/391] Loss: 0.8406
Epoch [35/51], Iter [180/391] Loss: 0.8196
Epoch [35/51], Iter [190/391] Loss: 0.8543
Epoch [35/51], Iter [200/391] Loss: 0.8540
Epoch [35/51], Iter [210/391] Loss: 0.8290
Epoch [35/51], Iter [220/391] Loss: 0.8393
Epoch [35/51], Iter [230/391] Loss: 0.8503
Epoch [35/51], Iter [240/391] Loss: 0.8252
Epoch [35/51], Iter [250/391] Loss: 0.8406
Epoch [35/51], Iter [260/391] Loss: 0.8321
Epoch [35/51], Iter [270/391] Loss: 0.8304
Epoch [35/51], Iter [280/391] Loss: 0.8532
Epoch [35/51], Iter [290/391] Loss: 0.8258
Epoch [35/51], Iter [300/391] Loss: 0.8348
Epoch [35/51], Iter [310/391] Loss: 0.8252
Epoch [35/51], Iter [320/391] Loss: 0.8674
Epoch [35/51], Iter [330/391] Loss: 0.8511
Epoch [35/51], Iter [340/391] Loss: 0.8601
Epoch [35/51], Iter [350/391] Loss: 0.8262
Epoch [35/51], Iter [360/391] Loss: 0.8499
Epoch [35/51], Iter [370/391] Loss: 0.8479
Epoch [35/51], Iter [380/391] Loss: 0.8527
Epoch [35/51], Iter [390/391] Loss: 0.8376
Epoch [36/51], Iter [10/391] Loss: 0.8267
Epoch [36/51], Iter [20/391] Loss: 0.8286
Epoch [36/51], Iter [30/391] Loss: 0.8309
Epoch [36/51], Iter [40/391] Loss: 0.8497
Epoch [36/51], Iter [50/391] Loss: 0.8113
Epoch [36/51], Iter [60/391] Loss: 0.8264
Epoch [36/51], Iter [70/391] Loss: 0.8274
Epoch [36/51], Iter [80/391] Loss: 0.8472
Epoch [36/51], Iter [90/391] Loss: 0.8238
Epoch [36/51], Iter [100/391] Loss: 0.8304
Epoch [36/51], Iter [110/391] Loss: 0.8620
Epoch [36/51], Iter [120/391] Loss: 0.8413
Epoch [36/51], Iter [130/391] Loss: 0.8338
Epoch [36/51], Iter [140/391] Loss: 0.8215
Epoch [36/51], Iter [150/391] Loss: 0.8504
Epoch [36/51], Iter [160/391] Loss: 0.8473
Epoch [36/51], Iter [170/391] Loss: 0.8302
Epoch [36/51], Iter [180/391] Loss: 0.8459
Epoch [36/51], Iter [190/391] Loss: 0.8558
Epoch [36/51], Iter [200/391] Loss: 0.8453
Epoch [36/51], Iter [210/391] Loss: 0.8390
Epoch [36/51], Iter [220/391] Loss: 0.8510
Epoch [36/51], Iter [230/391] Loss: 0.8412
Epoch [36/51], Iter [240/391] Loss: 0.8376
Epoch [36/51], Iter [250/391] Loss: 0.8441
Epoch [36/51], Iter [260/391] Loss: 0.8557
Epoch [36/51], Iter [270/391] Loss: 0.8238
Epoch [36/51], Iter [280/391] Loss: 0.8296
Epoch [36/51], Iter [290/391] Loss: 0.8518
Epoch [36/51], Iter [300/391] Loss: 0.8443
Epoch [36/51], Iter [310/391] Loss: 0.8303
Epoch [36/51], Iter [320/391] Loss: 0.8493
Epoch [36/51], Iter [330/391] Loss: 0.8366
Epoch [36/51], Iter [340/391] Loss: 0.8390
Epoch [36/51], Iter [350/391] Loss: 0.8447
Epoch [36/51], Iter [360/391] Loss: 0.8503
Epoch [36/51], Iter [370/391] Loss: 0.8176
Epoch [36/51], Iter [380/391] Loss: 0.8371
Epoch [36/51], Iter [390/391] Loss: 0.8324
Epoch [37/51], Iter [10/391] Loss: 0.8456
Epoch [37/51], Iter [20/391] Loss: 0.8209
Epoch [37/51], Iter [30/391] Loss: 0.8408
Epoch [37/51], Iter [40/391] Loss: 0.8479
Epoch [37/51], Iter [50/391] Loss: 0.8312
Epoch [37/51], Iter [60/391] Loss: 0.8520
Epoch [37/51], Iter [70/391] Loss: 0.8451
Epoch [37/51], Iter [80/391] Loss: 0.8295
Epoch [37/51], Iter [90/391] Loss: 0.8350
Epoch [37/51], Iter [100/391] Loss: 0.8388
Epoch [37/51], Iter [110/391] Loss: 0.8236
Epoch [37/51], Iter [120/391] Loss: 0.8336
Epoch [37/51], Iter [130/391] Loss: 0.8280
Epoch [37/51], Iter [140/391] Loss: 0.8513
Epoch [37/51], Iter [150/391] Loss: 0.8412
Epoch [37/51], Iter [160/391] Loss: 0.8133
Epoch [37/51], Iter [170/391] Loss: 0.8332
Epoch [37/51], Iter [180/391] Loss: 0.8412
Epoch [37/51], Iter [190/391] Loss: 0.8298
Epoch [37/51], Iter [200/391] Loss: 0.8393
Epoch [37/51], Iter [210/391] Loss: 0.8581
Epoch [37/51], Iter [220/391] Loss: 0.8375
Epoch [37/51], Iter [230/391] Loss: 0.8510
Epoch [37/51], Iter [240/391] Loss: 0.8216
Epoch [37/51], Iter [250/391] Loss: 0.8301
Epoch [37/51], Iter [260/391] Loss: 0.8455
Epoch [37/51], Iter [270/391] Loss: 0.8302
Epoch [37/51], Iter [280/391] Loss: 0.8401
Epoch [37/51], Iter [290/391] Loss: 0.8294
Epoch [37/51], Iter [300/391] Loss: 0.8245
Epoch [37/51], Iter [310/391] Loss: 0.8201
Epoch [37/51], Iter [320/391] Loss: 0.8410
Epoch [37/51], Iter [330/391] Loss: 0.8460
Epoch [37/51], Iter [340/391] Loss: 0.8287
Epoch [37/51], Iter [350/391] Loss: 0.8190
Epoch [37/51], Iter [360/391] Loss: 0.8430
Epoch [37/51], Iter [370/391] Loss: 0.8441
Epoch [37/51], Iter [380/391] Loss: 0.8364
Epoch [37/51], Iter [390/391] Loss: 0.8344
Epoch [38/51], Iter [10/391] Loss: 0.8208
Epoch [38/51], Iter [20/391] Loss: 0.8603
Epoch [38/51], Iter [30/391] Loss: 0.8245
Epoch [38/51], Iter [40/391] Loss: 0.8381
Epoch [38/51], Iter [50/391] Loss: 0.8321
Epoch [38/51], Iter [60/391] Loss: 0.8368
Epoch [38/51], Iter [70/391] Loss: 0.8503
Epoch [38/51], Iter [80/391] Loss: 0.8360
Epoch [38/51], Iter [90/391] Loss: 0.8420
Epoch [38/51], Iter [100/391] Loss: 0.8401
Epoch [38/51], Iter [110/391] Loss: 0.8293
Epoch [38/51], Iter [120/391] Loss: 0.8512
Epoch [38/51], Iter [130/391] Loss: 0.8360
Epoch [38/51], Iter [140/391] Loss: 0.8252
Epoch [38/51], Iter [150/391] Loss: 0.8271
Epoch [38/51], Iter [160/391] Loss: 0.8353
Epoch [38/51], Iter [170/391] Loss: 0.8277
Epoch [38/51], Iter [180/391] Loss: 0.8186
Epoch [38/51], Iter [190/391] Loss: 0.8363
Epoch [38/51], Iter [200/391] Loss: 0.8602
Epoch [38/51], Iter [210/391] Loss: 0.8225
Epoch [38/51], Iter [220/391] Loss: 0.8249
Epoch [38/51], Iter [230/391] Loss: 0.8343
Epoch [38/51], Iter [240/391] Loss: 0.8417
Epoch [38/51], Iter [250/391] Loss: 0.8259
Epoch [38/51], Iter [260/391] Loss: 0.8482
Epoch [38/51], Iter [270/391] Loss: 0.8203
Epoch [38/51], Iter [280/391] Loss: 0.8194
Epoch [38/51], Iter [290/391] Loss: 0.8305
Epoch [38/51], Iter [300/391] Loss: 0.8377
Epoch [38/51], Iter [310/391] Loss: 0.8357
Epoch [38/51], Iter [320/391] Loss: 0.8517
Epoch [38/51], Iter [330/391] Loss: 0.8297
Epoch [38/51], Iter [340/391] Loss: 0.8596
Epoch [38/51], Iter [350/391] Loss: 0.8234
Epoch [38/51], Iter [360/391] Loss: 0.8245
Epoch [38/51], Iter [370/391] Loss: 0.8504
Epoch [38/51], Iter [380/391] Loss: 0.8311
Epoch [38/51], Iter [390/391] Loss: 0.8362
Epoch [39/51], Iter [10/391] Loss: 0.8354
Epoch [39/51], Iter [20/391] Loss: 0.8254
Epoch [39/51], Iter [30/391] Loss: 0.8317
Epoch [39/51], Iter [40/391] Loss: 0.8378
Epoch [39/51], Iter [50/391] Loss: 0.8108
Epoch [39/51], Iter [60/391] Loss: 0.8310
Epoch [39/51], Iter [70/391] Loss: 0.8243
Epoch [39/51], Iter [80/391] Loss: 0.8260
Epoch [39/51], Iter [90/391] Loss: 0.8294
Epoch [39/51], Iter [100/391] Loss: 0.8294
Epoch [39/51], Iter [110/391] Loss: 0.8172
Epoch [39/51], Iter [120/391] Loss: 0.8400
Epoch [39/51], Iter [130/391] Loss: 0.8402
Epoch [39/51], Iter [140/391] Loss: 0.8337
Epoch [39/51], Iter [150/391] Loss: 0.8284
Epoch [39/51], Iter [160/391] Loss: 0.8274
Epoch [39/51], Iter [170/391] Loss: 0.8285
Epoch [39/51], Iter [180/391] Loss: 0.8318
Epoch [39/51], Iter [190/391] Loss: 0.8299
Epoch [39/51], Iter [200/391] Loss: 0.8244
Epoch [39/51], Iter [210/391] Loss: 0.8391
Epoch [39/51], Iter [220/391] Loss: 0.8330
Epoch [39/51], Iter [230/391] Loss: 0.8359
Epoch [39/51], Iter [240/391] Loss: 0.8351
Epoch [39/51], Iter [250/391] Loss: 0.8334
Epoch [39/51], Iter [260/391] Loss: 0.8174
Epoch [39/51], Iter [270/391] Loss: 0.8453
Epoch [39/51], Iter [280/391] Loss: 0.8355
Epoch [39/51], Iter [290/391] Loss: 0.8380
Epoch [39/51], Iter [300/391] Loss: 0.8353
Epoch [39/51], Iter [310/391] Loss: 0.8567
Epoch [39/51], Iter [320/391] Loss: 0.8247
Epoch [39/51], Iter [330/391] Loss: 0.8329
Epoch [39/51], Iter [340/391] Loss: 0.8290
Epoch [39/51], Iter [350/391] Loss: 0.8484
Epoch [39/51], Iter [360/391] Loss: 0.8320
Epoch [39/51], Iter [370/391] Loss: 0.8229
Epoch [39/51], Iter [380/391] Loss: 0.8348
Epoch [39/51], Iter [390/391] Loss: 0.8282
Epoch [40/51], Iter [10/391] Loss: 0.8257
Epoch [40/51], Iter [20/391] Loss: 0.8419
Epoch [40/51], Iter [30/391] Loss: 0.8316
Epoch [40/51], Iter [40/391] Loss: 0.8360
Epoch [40/51], Iter [50/391] Loss: 0.8401
Epoch [40/51], Iter [60/391] Loss: 0.8247
Epoch [40/51], Iter [70/391] Loss: 0.8434
Epoch [40/51], Iter [80/391] Loss: 0.8110
Epoch [40/51], Iter [90/391] Loss: 0.8381
Epoch [40/51], Iter [100/391] Loss: 0.8313
Epoch [40/51], Iter [110/391] Loss: 0.8293
Epoch [40/51], Iter [120/391] Loss: 0.8214
Epoch [40/51], Iter [130/391] Loss: 0.8189
Epoch [40/51], Iter [140/391] Loss: 0.8220
Epoch [40/51], Iter [150/391] Loss: 0.8359
Epoch [40/51], Iter [160/391] Loss: 0.8229
Epoch [40/51], Iter [170/391] Loss: 0.8275
Epoch [40/51], Iter [180/391] Loss: 0.8182
Epoch [40/51], Iter [190/391] Loss: 0.8271
Epoch [40/51], Iter [200/391] Loss: 0.8227
Epoch [40/51], Iter [210/391] Loss: 0.8359
Epoch [40/51], Iter [220/391] Loss: 0.8261
Epoch [40/51], Iter [230/391] Loss: 0.8353
Epoch [40/51], Iter [240/391] Loss: 0.8248
Epoch [40/51], Iter [250/391] Loss: 0.8250
Epoch [40/51], Iter [260/391] Loss: 0.8340
Epoch [40/51], Iter [270/391] Loss: 0.8269
Epoch [40/51], Iter [280/391] Loss: 0.8358
Epoch [40/51], Iter [290/391] Loss: 0.8373
Epoch [40/51], Iter [300/391] Loss: 0.8315
Epoch [40/51], Iter [310/391] Loss: 0.8301
Epoch [40/51], Iter [320/391] Loss: 0.8219
Epoch [40/51], Iter [330/391] Loss: 0.8335
Epoch [40/51], Iter [340/391] Loss: 0.8503
Epoch [40/51], Iter [350/391] Loss: 0.8273
Epoch [40/51], Iter [360/391] Loss: 0.8305
Epoch [40/51], Iter [370/391] Loss: 0.8179
Epoch [40/51], Iter [380/391] Loss: 0.8324
Epoch [40/51], Iter [390/391] Loss: 0.8211
[Saving Checkpoint]
Epoch [41/51], Iter [10/391] Loss: 0.8326
Epoch [41/51], Iter [20/391] Loss: 0.8302
Epoch [41/51], Iter [30/391] Loss: 0.8289
Epoch [41/51], Iter [40/391] Loss: 0.8170
Epoch [41/51], Iter [50/391] Loss: 0.8379
Epoch [41/51], Iter [60/391] Loss: 0.8294
Epoch [41/51], Iter [70/391] Loss: 0.8291
Epoch [41/51], Iter [80/391] Loss: 0.8308
Epoch [41/51], Iter [90/391] Loss: 0.8345
Epoch [41/51], Iter [100/391] Loss: 0.8377
Epoch [41/51], Iter [110/391] Loss: 0.8347
Epoch [41/51], Iter [120/391] Loss: 0.8204
Epoch [41/51], Iter [130/391] Loss: 0.8324
Epoch [41/51], Iter [140/391] Loss: 0.8338
Epoch [41/51], Iter [150/391] Loss: 0.8601
Epoch [41/51], Iter [160/391] Loss: 0.8185
Epoch [41/51], Iter [170/391] Loss: 0.8323
Epoch [41/51], Iter [180/391] Loss: 0.8257
Epoch [41/51], Iter [190/391] Loss: 0.8264
Epoch [41/51], Iter [200/391] Loss: 0.8116
Epoch [41/51], Iter [210/391] Loss: 0.8235
Epoch [41/51], Iter [220/391] Loss: 0.8310
Epoch [41/51], Iter [230/391] Loss: 0.8155
Epoch [41/51], Iter [240/391] Loss: 0.8249
Epoch [41/51], Iter [250/391] Loss: 0.8329
Epoch [41/51], Iter [260/391] Loss: 0.8300
Epoch [41/51], Iter [270/391] Loss: 0.8378
Epoch [41/51], Iter [280/391] Loss: 0.8261
Epoch [41/51], Iter [290/391] Loss: 0.8274
Epoch [41/51], Iter [300/391] Loss: 0.8398
Epoch [41/51], Iter [310/391] Loss: 0.8203
Epoch [41/51], Iter [320/391] Loss: 0.8437
Epoch [41/51], Iter [330/391] Loss: 0.8457
Epoch [41/51], Iter [340/391] Loss: 0.8260
Epoch [41/51], Iter [350/391] Loss: 0.8241
Epoch [41/51], Iter [360/391] Loss: 0.8350
Epoch [41/51], Iter [370/391] Loss: 0.8276
Epoch [41/51], Iter [380/391] Loss: 0.8334
Epoch [41/51], Iter [390/391] Loss: 0.8125
Epoch [42/51], Iter [10/391] Loss: 0.8359
Epoch [42/51], Iter [20/391] Loss: 0.8210
Epoch [42/51], Iter [30/391] Loss: 0.8361
Epoch [42/51], Iter [40/391] Loss: 0.8284
Epoch [42/51], Iter [50/391] Loss: 0.8208
Epoch [42/51], Iter [60/391] Loss: 0.8184
Epoch [42/51], Iter [70/391] Loss: 0.8385
Epoch [42/51], Iter [80/391] Loss: 0.8246
Epoch [42/51], Iter [90/391] Loss: 0.8365
Epoch [42/51], Iter [100/391] Loss: 0.8331
Epoch [42/51], Iter [110/391] Loss: 0.8274
Epoch [42/51], Iter [120/391] Loss: 0.8274
Epoch [42/51], Iter [130/391] Loss: 0.8299
Epoch [42/51], Iter [140/391] Loss: 0.8209
Epoch [42/51], Iter [150/391] Loss: 0.8225
Epoch [42/51], Iter [160/391] Loss: 0.8427
Epoch [42/51], Iter [170/391] Loss: 0.8199
Epoch [42/51], Iter [180/391] Loss: 0.8304
Epoch [42/51], Iter [190/391] Loss: 0.8211
Epoch [42/51], Iter [200/391] Loss: 0.8203
Epoch [42/51], Iter [210/391] Loss: 0.8570
Epoch [42/51], Iter [220/391] Loss: 0.8295
Epoch [42/51], Iter [230/391] Loss: 0.8182
Epoch [42/51], Iter [240/391] Loss: 0.8264
Epoch [42/51], Iter [250/391] Loss: 0.8245
Epoch [42/51], Iter [260/391] Loss: 0.8195
Epoch [42/51], Iter [270/391] Loss: 0.8357
Epoch [42/51], Iter [280/391] Loss: 0.8374
Epoch [42/51], Iter [290/391] Loss: 0.8399
Epoch [42/51], Iter [300/391] Loss: 0.8269
Epoch [42/51], Iter [310/391] Loss: 0.8178
Epoch [42/51], Iter [320/391] Loss: 0.8308
Epoch [42/51], Iter [330/391] Loss: 0.8325
Epoch [42/51], Iter [340/391] Loss: 0.8230
Epoch [42/51], Iter [350/391] Loss: 0.8319
Epoch [42/51], Iter [360/391] Loss: 0.8383
Epoch [42/51], Iter [370/391] Loss: 0.8307
Epoch [42/51], Iter [380/391] Loss: 0.8239
Epoch [42/51], Iter [390/391] Loss: 0.8323
Epoch [43/51], Iter [10/391] Loss: 0.8250
Epoch [43/51], Iter [20/391] Loss: 0.8107
Epoch [43/51], Iter [30/391] Loss: 0.8302
Epoch [43/51], Iter [40/391] Loss: 0.8308
Epoch [43/51], Iter [50/391] Loss: 0.8222
Epoch [43/51], Iter [60/391] Loss: 0.8289
Epoch [43/51], Iter [70/391] Loss: 0.8271
Epoch [43/51], Iter [80/391] Loss: 0.8261
Epoch [43/51], Iter [90/391] Loss: 0.8168
Epoch [43/51], Iter [100/391] Loss: 0.8301
Epoch [43/51], Iter [110/391] Loss: 0.8198
Epoch [43/51], Iter [120/391] Loss: 0.8330
Epoch [43/51], Iter [130/391] Loss: 0.8318
Epoch [43/51], Iter [140/391] Loss: 0.8396
Epoch [43/51], Iter [150/391] Loss: 0.8289
Epoch [43/51], Iter [160/391] Loss: 0.8276
Epoch [43/51], Iter [170/391] Loss: 0.8215
Epoch [43/51], Iter [180/391] Loss: 0.8262
Epoch [43/51], Iter [190/391] Loss: 0.8263
Epoch [43/51], Iter [200/391] Loss: 0.8267
Epoch [43/51], Iter [210/391] Loss: 0.8339
Epoch [43/51], Iter [220/391] Loss: 0.8313
Epoch [43/51], Iter [230/391] Loss: 0.8429
Epoch [43/51], Iter [240/391] Loss: 0.8122
Epoch [43/51], Iter [250/391] Loss: 0.8254
Epoch [43/51], Iter [260/391] Loss: 0.8393
Epoch [43/51], Iter [270/391] Loss: 0.8204
Epoch [43/51], Iter [280/391] Loss: 0.8297
Epoch [43/51], Iter [290/391] Loss: 0.8334
Epoch [43/51], Iter [300/391] Loss: 0.8400
Epoch [43/51], Iter [310/391] Loss: 0.8116
Epoch [43/51], Iter [320/391] Loss: 0.8240
Epoch [43/51], Iter [330/391] Loss: 0.8259
Epoch [43/51], Iter [340/391] Loss: 0.8453
Epoch [43/51], Iter [350/391] Loss: 0.8316
Epoch [43/51], Iter [360/391] Loss: 0.8335
Epoch [43/51], Iter [370/391] Loss: 0.8167
Epoch [43/51], Iter [380/391] Loss: 0.8514
Epoch [43/51], Iter [390/391] Loss: 0.8400
Epoch [44/51], Iter [10/391] Loss: 0.8350
Epoch [44/51], Iter [20/391] Loss: 0.8336
Epoch [44/51], Iter [30/391] Loss: 0.8311
Epoch [44/51], Iter [40/391] Loss: 0.8369
Epoch [44/51], Iter [50/391] Loss: 0.8249
Epoch [44/51], Iter [60/391] Loss: 0.8151
Epoch [44/51], Iter [70/391] Loss: 0.8255
Epoch [44/51], Iter [80/391] Loss: 0.8314
Epoch [44/51], Iter [90/391] Loss: 0.8309
Epoch [44/51], Iter [100/391] Loss: 0.8305
Epoch [44/51], Iter [110/391] Loss: 0.8117
Epoch [44/51], Iter [120/391] Loss: 0.8313
Epoch [44/51], Iter [130/391] Loss: 0.8285
Epoch [44/51], Iter [140/391] Loss: 0.8344
Epoch [44/51], Iter [150/391] Loss: 0.8384
Epoch [44/51], Iter [160/391] Loss: 0.8237
Epoch [44/51], Iter [170/391] Loss: 0.8206
Epoch [44/51], Iter [180/391] Loss: 0.8326
Epoch [44/51], Iter [190/391] Loss: 0.8315
Epoch [44/51], Iter [200/391] Loss: 0.8280
Epoch [44/51], Iter [210/391] Loss: 0.8117
Epoch [44/51], Iter [220/391] Loss: 0.8262
Epoch [44/51], Iter [230/391] Loss: 0.8344
Epoch [44/51], Iter [240/391] Loss: 0.8209
Epoch [44/51], Iter [250/391] Loss: 0.8320
Epoch [44/51], Iter [260/391] Loss: 0.8115
Epoch [44/51], Iter [270/391] Loss: 0.8272
Epoch [44/51], Iter [280/391] Loss: 0.8141
Epoch [44/51], Iter [290/391] Loss: 0.8578
Epoch [44/51], Iter [300/391] Loss: 0.8672
Epoch [44/51], Iter [310/391] Loss: 0.8188
Epoch [44/51], Iter [320/391] Loss: 0.8222
Epoch [44/51], Iter [330/391] Loss: 0.8095
Epoch [44/51], Iter [340/391] Loss: 0.8335
Epoch [44/51], Iter [350/391] Loss: 0.8260
Epoch [44/51], Iter [360/391] Loss: 0.8172
Epoch [44/51], Iter [370/391] Loss: 0.8266
Epoch [44/51], Iter [380/391] Loss: 0.8409
Epoch [44/51], Iter [390/391] Loss: 0.8251
Epoch [45/51], Iter [10/391] Loss: 0.8237
Epoch [45/51], Iter [20/391] Loss: 0.8187
Epoch [45/51], Iter [30/391] Loss: 0.8266
Epoch [45/51], Iter [40/391] Loss: 0.8175
Epoch [45/51], Iter [50/391] Loss: 0.8231
Epoch [45/51], Iter [60/391] Loss: 0.8137
Epoch [45/51], Iter [70/391] Loss: 0.8220
Epoch [45/51], Iter [80/391] Loss: 0.8104
Epoch [45/51], Iter [90/391] Loss: 0.8314
Epoch [45/51], Iter [100/391] Loss: 0.8159
Epoch [45/51], Iter [110/391] Loss: 0.8366
Epoch [45/51], Iter [120/391] Loss: 0.8260
Epoch [45/51], Iter [130/391] Loss: 0.8312
Epoch [45/51], Iter [140/391] Loss: 0.8274
Epoch [45/51], Iter [150/391] Loss: 0.8297
Epoch [45/51], Iter [160/391] Loss: 0.8523
Epoch [45/51], Iter [170/391] Loss: 0.8311
Epoch [45/51], Iter [180/391] Loss: 0.8221
Epoch [45/51], Iter [190/391] Loss: 0.8335
Epoch [45/51], Iter [200/391] Loss: 0.8450
Epoch [45/51], Iter [210/391] Loss: 0.8272
Epoch [45/51], Iter [220/391] Loss: 0.8247
Epoch [45/51], Iter [230/391] Loss: 0.8223
Epoch [45/51], Iter [240/391] Loss: 0.8311
Epoch [45/51], Iter [250/391] Loss: 0.8250
Epoch [45/51], Iter [260/391] Loss: 0.8321
Epoch [45/51], Iter [270/391] Loss: 0.8180
Epoch [45/51], Iter [280/391] Loss: 0.8324
Epoch [45/51], Iter [290/391] Loss: 0.8191
Epoch [45/51], Iter [300/391] Loss: 0.8279
Epoch [45/51], Iter [310/391] Loss: 0.8378
Epoch [45/51], Iter [320/391] Loss: 0.8200
Epoch [45/51], Iter [330/391] Loss: 0.8358
Epoch [45/51], Iter [340/391] Loss: 0.8350
Epoch [45/51], Iter [350/391] Loss: 0.8195
Epoch [45/51], Iter [360/391] Loss: 0.8199
Epoch [45/51], Iter [370/391] Loss: 0.8213
Epoch [45/51], Iter [380/391] Loss: 0.8281
Epoch [45/51], Iter [390/391] Loss: 0.8117
Epoch [46/51], Iter [10/391] Loss: 0.8271
Epoch [46/51], Iter [20/391] Loss: 0.8363
Epoch [46/51], Iter [30/391] Loss: 0.8308
Epoch [46/51], Iter [40/391] Loss: 0.8148
Epoch [46/51], Iter [50/391] Loss: 0.8189
Epoch [46/51], Iter [60/391] Loss: 0.8201
Epoch [46/51], Iter [70/391] Loss: 0.8229
Epoch [46/51], Iter [80/391] Loss: 0.8239
Epoch [46/51], Iter [90/391] Loss: 0.8172
Epoch [46/51], Iter [100/391] Loss: 0.8236
Epoch [46/51], Iter [110/391] Loss: 0.8274
Epoch [46/51], Iter [120/391] Loss: 0.8163
Epoch [46/51], Iter [130/391] Loss: 0.8260
Epoch [46/51], Iter [140/391] Loss: 0.8204
Epoch [46/51], Iter [150/391] Loss: 0.8220
Epoch [46/51], Iter [160/391] Loss: 0.8266
Epoch [46/51], Iter [170/391] Loss: 0.8402
Epoch [46/51], Iter [180/391] Loss: 0.8149
Epoch [46/51], Iter [190/391] Loss: 0.8238
Epoch [46/51], Iter [200/391] Loss: 0.8113
Epoch [46/51], Iter [210/391] Loss: 0.8395
Epoch [46/51], Iter [220/391] Loss: 0.8190
Epoch [46/51], Iter [230/391] Loss: 0.8200
Epoch [46/51], Iter [240/391] Loss: 0.8232
Epoch [46/51], Iter [250/391] Loss: 0.8339
Epoch [46/51], Iter [260/391] Loss: 0.8191
Epoch [46/51], Iter [270/391] Loss: 0.8413
Epoch [46/51], Iter [280/391] Loss: 0.8155
Epoch [46/51], Iter [290/391] Loss: 0.8132
Epoch [46/51], Iter [300/391] Loss: 0.8317
Epoch [46/51], Iter [310/391] Loss: 0.8219
Epoch [46/51], Iter [320/391] Loss: 0.8409
Epoch [46/51], Iter [330/391] Loss: 0.8359
Epoch [46/51], Iter [340/391] Loss: 0.8234
Epoch [46/51], Iter [350/391] Loss: 0.8146
Epoch [46/51], Iter [360/391] Loss: 0.8283
Epoch [46/51], Iter [370/391] Loss: 0.8255
Epoch [46/51], Iter [380/391] Loss: 0.8294
Epoch [46/51], Iter [390/391] Loss: 0.8214
Epoch [47/51], Iter [10/391] Loss: 0.8060
Epoch [47/51], Iter [20/391] Loss: 0.8191
Epoch [47/51], Iter [30/391] Loss: 0.8209
Epoch [47/51], Iter [40/391] Loss: 0.8253
Epoch [47/51], Iter [50/391] Loss: 0.8228
Epoch [47/51], Iter [60/391] Loss: 0.8253
Epoch [47/51], Iter [70/391] Loss: 0.8195
Epoch [47/51], Iter [80/391] Loss: 0.8420
Epoch [47/51], Iter [90/391] Loss: 0.8139
Epoch [47/51], Iter [100/391] Loss: 0.8384
Epoch [47/51], Iter [110/391] Loss: 0.8204
Epoch [47/51], Iter [120/391] Loss: 0.8057
Epoch [47/51], Iter [130/391] Loss: 0.8244
Epoch [47/51], Iter [140/391] Loss: 0.8085
Epoch [47/51], Iter [150/391] Loss: 0.8095
Epoch [47/51], Iter [160/391] Loss: 0.8321
Epoch [47/51], Iter [170/391] Loss: 0.8236
Epoch [47/51], Iter [180/391] Loss: 0.8192
Epoch [47/51], Iter [190/391] Loss: 0.8344
Epoch [47/51], Iter [200/391] Loss: 0.8202
Epoch [47/51], Iter [210/391] Loss: 0.8442
Epoch [47/51], Iter [220/391] Loss: 0.8134
Epoch [47/51], Iter [230/391] Loss: 0.8263
Epoch [47/51], Iter [240/391] Loss: 0.8189
Epoch [47/51], Iter [250/391] Loss: 0.8376
Epoch [47/51], Iter [260/391] Loss: 0.8220
Epoch [47/51], Iter [270/391] Loss: 0.8282
Epoch [47/51], Iter [280/391] Loss: 0.8218
Epoch [47/51], Iter [290/391] Loss: 0.8481
Epoch [47/51], Iter [300/391] Loss: 0.8098
Epoch [47/51], Iter [310/391] Loss: 0.8266
Epoch [47/51], Iter [320/391] Loss: 0.8176
Epoch [47/51], Iter [330/391] Loss: 0.8215
Epoch [47/51], Iter [340/391] Loss: 0.8170
Epoch [47/51], Iter [350/391] Loss: 0.8212
Epoch [47/51], Iter [360/391] Loss: 0.8311
Epoch [47/51], Iter [370/391] Loss: 0.8253
Epoch [47/51], Iter [380/391] Loss: 0.8182
Epoch [47/51], Iter [390/391] Loss: 0.8224
Epoch [48/51], Iter [10/391] Loss: 0.8063
Epoch [48/51], Iter [20/391] Loss: 0.8254
Epoch [48/51], Iter [30/391] Loss: 0.8167
Epoch [48/51], Iter [40/391] Loss: 0.8223
Epoch [48/51], Iter [50/391] Loss: 0.8100
Epoch [48/51], Iter [60/391] Loss: 0.8117
Epoch [48/51], Iter [70/391] Loss: 0.8233
Epoch [48/51], Iter [80/391] Loss: 0.8279
Epoch [48/51], Iter [90/391] Loss: 0.8191
Epoch [48/51], Iter [100/391] Loss: 0.8333
Epoch [48/51], Iter [110/391] Loss: 0.8249
Epoch [48/51], Iter [120/391] Loss: 0.8220
Epoch [48/51], Iter [130/391] Loss: 0.8225
Epoch [48/51], Iter [140/391] Loss: 0.8150
Epoch [48/51], Iter [150/391] Loss: 0.8246
Epoch [48/51], Iter [160/391] Loss: 0.8184
Epoch [48/51], Iter [170/391] Loss: 0.8206
Epoch [48/51], Iter [180/391] Loss: 0.8294
Epoch [48/51], Iter [190/391] Loss: 0.8127
Epoch [48/51], Iter [200/391] Loss: 0.8168
Epoch [48/51], Iter [210/391] Loss: 0.8206
Epoch [48/51], Iter [220/391] Loss: 0.8278
Epoch [48/51], Iter [230/391] Loss: 0.8201
Epoch [48/51], Iter [240/391] Loss: 0.8245
Epoch [48/51], Iter [250/391] Loss: 0.8157
Epoch [48/51], Iter [260/391] Loss: 0.8231
Epoch [48/51], Iter [270/391] Loss: 0.8313
Epoch [48/51], Iter [280/391] Loss: 0.8196
Epoch [48/51], Iter [290/391] Loss: 0.8216
Epoch [48/51], Iter [300/391] Loss: 0.8279
Epoch [48/51], Iter [310/391] Loss: 0.8243
Epoch [48/51], Iter [320/391] Loss: 0.8108
Epoch [48/51], Iter [330/391] Loss: 0.8184
Epoch [48/51], Iter [340/391] Loss: 0.8277
Epoch [48/51], Iter [350/391] Loss: 0.8277
Epoch [48/51], Iter [360/391] Loss: 0.8297
Epoch [48/51], Iter [370/391] Loss: 0.8248
Epoch [48/51], Iter [380/391] Loss: 0.8127
Epoch [48/51], Iter [390/391] Loss: 0.8242
Epoch [49/51], Iter [10/391] Loss: 0.8137
Epoch [49/51], Iter [20/391] Loss: 0.8304
Epoch [49/51], Iter [30/391] Loss: 0.8208
Epoch [49/51], Iter [40/391] Loss: 0.8200
Epoch [49/51], Iter [50/391] Loss: 0.8131
Epoch [49/51], Iter [60/391] Loss: 0.8273
Epoch [49/51], Iter [70/391] Loss: 0.8168
Epoch [49/51], Iter [80/391] Loss: 0.8134
Epoch [49/51], Iter [90/391] Loss: 0.8287
Epoch [49/51], Iter [100/391] Loss: 0.8381
Epoch [49/51], Iter [110/391] Loss: 0.8189
Epoch [49/51], Iter [120/391] Loss: 0.8212
Epoch [49/51], Iter [130/391] Loss: 0.8362
Epoch [49/51], Iter [140/391] Loss: 0.8233
Epoch [49/51], Iter [150/391] Loss: 0.8280
Epoch [49/51], Iter [160/391] Loss: 0.8173
Epoch [49/51], Iter [170/391] Loss: 0.8196
Epoch [49/51], Iter [180/391] Loss: 0.8187
Epoch [49/51], Iter [190/391] Loss: 0.8065
Epoch [49/51], Iter [200/391] Loss: 0.8151
Epoch [49/51], Iter [210/391] Loss: 0.8145
Epoch [49/51], Iter [220/391] Loss: 0.8171
Epoch [49/51], Iter [230/391] Loss: 0.8379
Epoch [49/51], Iter [240/391] Loss: 0.8327
Epoch [49/51], Iter [250/391] Loss: 0.8151
Epoch [49/51], Iter [260/391] Loss: 0.8382
Epoch [49/51], Iter [270/391] Loss: 0.8281
Epoch [49/51], Iter [280/391] Loss: 0.8269
Epoch [49/51], Iter [290/391] Loss: 0.8228
Epoch [49/51], Iter [300/391] Loss: 0.8233
Epoch [49/51], Iter [310/391] Loss: 0.8308
Epoch [49/51], Iter [320/391] Loss: 0.8193
Epoch [49/51], Iter [330/391] Loss: 0.8172
Epoch [49/51], Iter [340/391] Loss: 0.8202
Epoch [49/51], Iter [350/391] Loss: 0.8158
Epoch [49/51], Iter [360/391] Loss: 0.8406
Epoch [49/51], Iter [370/391] Loss: 0.8216
Epoch [49/51], Iter [380/391] Loss: 0.8255
Epoch [49/51], Iter [390/391] Loss: 0.8445
Epoch [50/51], Iter [10/391] Loss: 0.8127
Epoch [50/51], Iter [20/391] Loss: 0.8207
Epoch [50/51], Iter [30/391] Loss: 0.8080
Epoch [50/51], Iter [40/391] Loss: 0.8171
Epoch [50/51], Iter [50/391] Loss: 0.8255
Epoch [50/51], Iter [60/391] Loss: 0.8143
Epoch [50/51], Iter [70/391] Loss: 0.8184
Epoch [50/51], Iter [80/391] Loss: 0.8185
Epoch [50/51], Iter [90/391] Loss: 0.8281
Epoch [50/51], Iter [100/391] Loss: 0.8128
Epoch [50/51], Iter [110/391] Loss: 0.8266
Epoch [50/51], Iter [120/391] Loss: 0.8378
Epoch [50/51], Iter [130/391] Loss: 0.8129
Epoch [50/51], Iter [140/391] Loss: 0.8131
Epoch [50/51], Iter [150/391] Loss: 0.8328
Epoch [50/51], Iter [160/391] Loss: 0.8263
Epoch [50/51], Iter [170/391] Loss: 0.8243
Epoch [50/51], Iter [180/391] Loss: 0.8162
Epoch [50/51], Iter [190/391] Loss: 0.8403
Epoch [50/51], Iter [200/391] Loss: 0.8178
Epoch [50/51], Iter [210/391] Loss: 0.8081
Epoch [50/51], Iter [220/391] Loss: 0.8243
Epoch [50/51], Iter [230/391] Loss: 0.8093
Epoch [50/51], Iter [240/391] Loss: 0.8218
Epoch [50/51], Iter [250/391] Loss: 0.8185
Epoch [50/51], Iter [260/391] Loss: 0.8214
Epoch [50/51], Iter [270/391] Loss: 0.8238
Epoch [50/51], Iter [280/391] Loss: 0.8173
Epoch [50/51], Iter [290/391] Loss: 0.8018
Epoch [50/51], Iter [300/391] Loss: 0.8336
Epoch [50/51], Iter [310/391] Loss: 0.8192
Epoch [50/51], Iter [320/391] Loss: 0.8292
Epoch [50/51], Iter [330/391] Loss: 0.8115
Epoch [50/51], Iter [340/391] Loss: 0.8176
Epoch [50/51], Iter [350/391] Loss: 0.8216
Epoch [50/51], Iter [360/391] Loss: 0.8143
Epoch [50/51], Iter [370/391] Loss: 0.8142
Epoch [50/51], Iter [380/391] Loss: 0.8102
Epoch [50/51], Iter [390/391] Loss: 0.8169
[Saving Checkpoint]
Epoch [51/51], Iter [10/391] Loss: 0.8377
Epoch [51/51], Iter [20/391] Loss: 0.8255
Epoch [51/51], Iter [30/391] Loss: 0.8190
Epoch [51/51], Iter [40/391] Loss: 0.8202
Epoch [51/51], Iter [50/391] Loss: 0.8172
Epoch [51/51], Iter [60/391] Loss: 0.8203
Epoch [51/51], Iter [70/391] Loss: 0.8158
Epoch [51/51], Iter [80/391] Loss: 0.8228
Epoch [51/51], Iter [90/391] Loss: 0.8193
Epoch [51/51], Iter [100/391] Loss: 0.8100
Epoch [51/51], Iter [110/391] Loss: 0.8250
Epoch [51/51], Iter [120/391] Loss: 0.8218
Epoch [51/51], Iter [130/391] Loss: 0.8119
Epoch [51/51], Iter [140/391] Loss: 0.8087
Epoch [51/51], Iter [150/391] Loss: 0.8064
Epoch [51/51], Iter [160/391] Loss: 0.8167
Epoch [51/51], Iter [170/391] Loss: 0.8435
Epoch [51/51], Iter [180/391] Loss: 0.8138
Epoch [51/51], Iter [190/391] Loss: 0.8223
Epoch [51/51], Iter [200/391] Loss: 0.8235
Epoch [51/51], Iter [210/391] Loss: 0.8345
Epoch [51/51], Iter [220/391] Loss: 0.8272
Epoch [51/51], Iter [230/391] Loss: 0.8327
Epoch [51/51], Iter [240/391] Loss: 0.8202
Epoch [51/51], Iter [250/391] Loss: 0.8129
Epoch [51/51], Iter [260/391] Loss: 0.8255
Epoch [51/51], Iter [270/391] Loss: 0.8170
Epoch [51/51], Iter [280/391] Loss: 0.8150
Epoch [51/51], Iter [290/391] Loss: 0.8282
Epoch [51/51], Iter [300/391] Loss: 0.8103
Epoch [51/51], Iter [310/391] Loss: 0.8139
Epoch [51/51], Iter [320/391] Loss: 0.8258
Epoch [51/51], Iter [330/391] Loss: 0.8245
Epoch [51/51], Iter [340/391] Loss: 0.8142
Epoch [51/51], Iter [350/391] Loss: 0.8277
Epoch [51/51], Iter [360/391] Loss: 0.8232
Epoch [51/51], Iter [370/391] Loss: 0.8220
Epoch [51/51], Iter [380/391] Loss: 0.8367
Epoch [51/51], Iter [390/391] Loss: 0.8106
# | a=0.5 | T=15 | epochs = 51 |
resnet_child_a0dot5_t15_e51 = copy.deepcopy(resnet_child) #let's save for future reference
test_harness( testloader, resnet_child_a0dot5_t15_e51 )
Accuracy of the model on the test images: 89 %
(tensor(8958, device='cuda:0'), 10000)
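For reference, the `(correct, total)` tuple printed above comes from the notebook's `test_harness` routine, which is defined in an earlier cell. The following is only a minimal illustrative sketch of what such an evaluation loop looks like (the actual implementation may differ; `test_harness_sketch` and its arguments are stand-ins):
# Illustrative sketch only -- the notebook's actual test_harness is defined in an
# earlier cell; this stand-in shows the shape of the computation that produces the
# "Accuracy ... : 89 %" line and the (correct, total) tuple above.
import torch

def test_harness_sketch(testloader, model, device='cuda:0'):
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in testloader:
            images, labels = images.to(device), labels.to(device)
            logits = model(images)              # raw class scores, shape [batch, 10]
            predictions = logits.argmax(dim=1)  # most probable class per image
            correct += (predictions == labels).sum()
            total += labels.size(0)
    print('Accuracy of the model on the test images: %d %%' % (100 * int(correct) // total))
    return correct, total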
Conclusion and Findings
$ \large{ Parent \ (teacher) \ model } $
$\large{ Temp }$ | $\large{ Test \ Accy }$ |
---|---|
$\large{1}$ | $\large{\color{red}{91.43\%}}$ |
$ \large{ Let \ \alpha = 0 } \to\, \large{ C = C_{hard} } $
$\large{\alpha}$ | $\large{ Temp }$ | $\large{ Test \ Accy }$ |
---|---|---|
$\large{\color{blue}{0}}$ | $\large{\color{blue}{1}}$ | $\large{\color{blue}{88.59\%}}$ |
$ \large{ Let \ \alpha = 0.5 } \to\, \large{ C = \frac{1}{2} \, C_{soft} \, T^2 + \frac{1}{2} \, C_{hard} } $
$\large{\alpha}$ | $\large{ Temp }$ | $\large{ Test \ Accy }$ |
---|---|---|
$\large{0.5}$ | $\large{2}$ | $\large{ 89.05\%}$ |
$\large{0.5}$ | $\large{5}$ | $\large{ 90.15\%}$ |
$\large{\color{blue}{0.5} }$ | $\large{\color{blue}{10}}$ | $\large{ \color{blue}{91.07\%}}$ |
$\large{0.5}$ | $\large{15}$ | $\large{ 89.58\%}$ |
$ \large{ Let \ \alpha = 1 } \to\, \large{ C = C_{soft} \, T^2 }$
$\large{\alpha}$ | $\large{ Temp }$ | $\large{ Test \ Accy }$ |
---|---|---|
$\large{1}$ | $\large{2}$ | $\large{ 88.27\%}$ |
$\color{blue}{\large{1}}$ | $\color{blue}{\large{5}}$ | $\color{blue}{\large{ 90.55\%}}$ |
$\large{1}$ | $\large{10}$ | $\large{ 89.87\%}$ |
$\large{1}$ | $\large{15}$ | $\large{ 89.10\%}$ |
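The table headers above all follow the combined objective $C = \alpha \, T^2 \, C_{soft} + (1-\alpha) \, C_{hard}$. As an illustration of how that combination is typically computed per batch, here is a minimal sketch; it is not the notebook's actual training cell (which appears in an earlier section), and the function and argument names are illustrative only:
# Minimal sketch of the objective C = alpha*T^2*C_soft + (1-alpha)*C_hard
# used in the tables above. Illustrative only; names are not from the notebook.
import torch
import torch.nn.functional as F

def distillation_criterion(student_logits, teacher_logits, hard_labels, T=10.0, alpha=0.5):
    # C_soft: divergence between the temperature-softened teacher and student
    # distributions, scaled by T^2 so its gradient magnitude matches the hard term.
    c_soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction='batchmean',
    ) * (T * T)
    # C_hard: ordinary cross-entropy against the ground-truth labels.
    c_hard = F.cross_entropy(student_logits, hard_labels)
    return alpha * c_soft + (1 - alpha) * c_hard
Setting $\alpha=0$ recovers the hard-label baseline, $\alpha=1$ trains on the softened teacher outputs alone, and $\alpha=0.5$ with $T=10$ corresponds to the best-performing row above.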
$\textbf{Key Takeaways}$
As shown above, the ResCNN parent (teacher) network achieved a test-set accuracy of $\textbf{91.43\%}$. Given that CIFAR-10 is considerably harder than MNIST, this is a strong result. The issue, however, is that the parent model is cumbersome. It contains 12 residual blocks (not counting the input convolutional layer), arranged as follows: 4 blocks of 160 channels, 4 blocks of 320 channels, and 4 blocks of 640 channels. Each block involves two convolution operations and two non-linear activations (ReLU). In total, the model has roughly $\textbf{38.54 million}$ trainable parameters.
Now the question becomes: can we transfer or “distill” this knowledge to a simpler, computationally lighter model? The model we choose to implement is architecturally similar to the parent model; however, it has fewer layers and fewer channels per layer. This child (student) model has $\textbf{4.4 million}$ trainable parameters; in other words, roughly $\textbf{88.5\% fewer}$ trainable parameters than the parent model!
As it turns out, this “light” model is $\textit{very}$ effective. We ran a suite of benchmarks with $\alpha=0$, $\alpha=\frac{1}{2}$, and $\alpha=1$. In the first benchmark we set $\alpha=0$ as a baseline: the child model learns $\textit{only}$ from the hard labels. Under test conditions, the model scored $\textbf{88.59\%}$ correct, or $\textbf{2.84\%}$ worse than the “cumbersome” parent model.
Is a difference of $\textbf{2.84\%}$ statistically significant?
Another way of asking this question is the following: what is the probability of observing at least 9,143 correct predictions out of 10,000, i.e. $P(X \geq 9{,}143)$ where $X \sim \mathrm{Binomial}(n=10{,}000,\ p=0.8859)$?
Here $n=10{,}000$ because the CIFAR-10 test set contains 10,000 images, and $p=0.8859$ because that is the test accuracy of the child (student) model trained on hard labels alone.
This probability is essentially $\textbf{0}$. Thus, we conclude that the $\textbf{2.84\%}$ performance difference between the parent and the child is statistically significant.
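As a sanity check on this tail-probability argument, the exact binomial tail can be evaluated directly. A minimal sketch using `scipy.stats` (not part of the original notebook; scipy is assumed to be available) is:
# P(X >= 9143) for X ~ Binomial(n=10000, p=0.8859): the chance that a model whose
# true accuracy is 88.59% matches the parent's 9,143 correct answers by luck alone.
from scipy.stats import binom

tail = binom.sf(9143 - 1, 10000, 0.8859)  # sf(k) = P(X > k), so this is P(X >= 9143)
print(tail)                               # astronomically small, effectively zero
The same call with the figures from the $\alpha=0.5$ versus $\alpha=1$ comparison discussed further below gives the analogous tail probability for that case.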
Next, we train the child model using only the output of the parent model; in other words, $\alpha\textbf{ = 1}$. Given the results (tabulated above), we see that the knowledge distillation process is not only possible but works very well. In fact, if we set $\textbf{T = 5}$, the model scores $\textbf{90.55\%}$, a statistically significant $\textbf{1.96\%}$ better than when the model was trained on only the hard-labeled data.
$\textbf{Thus, using the parent (teacher) output and setting a higher temperature significantly boosted the child (student) model’s performance!}$
Finally, we experiment with a linear combination of both the parent (teacher) output and the hard-labeled data as the training signal for our child (student) model, setting $\alpha=\textbf{0.5}$ and $\textbf{T=10}$. With this configuration we obtain a test-set accuracy of $\textbf{91.07\%}$. Is this a statistically significant improvement over our previous best result, $\textbf{90.55\%}$, obtained using only the teacher output? Given that there is only a $\textbf{3.83\%}$ chance of reaching $\textbf{91.07\%}$ by luck under that baseline, we conclude that it is.
We therefore conclude that the best performance was obtained using a linear combination of parent outputs, softened with $\textbf{T=10}$, and hard labels. In the end, the simpler child model scored $\textbf{91.07\%}$, only $\textbf{0.36\%}$ below the parent. Using this enhanced training technique, the child (student) model performs nearly as well as the “cumbersome” parent (teacher) model.
$\LARGE{\to}$ We conclude that the knowledge “distillation” process is effective at transferring knowledge. Although the child (student) model can learn directly from the teacher, without any hard-labeled data, the process is most effective when using a linear combination of the parent (teacher) output and hard-labeled data.
$\LARGE{\to}$ The knowledge “distillation” process, as described by Hinton et al. (2015), boosted the simpler child (student) model’s performance by $\textbf{2.48\%}$ compared to training the student model on the hard targets alone.
$\LARGE{\to}$ In the case where we have a pre-trained parent (teacher) but no labeled data, we can still use the distillation process to train a child (student) model. This is important given that labeled datasets are often hard to come by and are not usually made available to the public.
Auxiliary Information | Model Summary
#Reference: https://gist.github.com/HTLife/b6640af9d6e7d765411f8aa9aa94b837
##
from collections import OrderedDict
import torch as th
import torch.nn as nn
from torch.autograd import Variable

def summary(input_size, model):
    def register_hook(module):
        def hook(module, input, output):
            # record input/output shapes and parameter count for this layer
            class_name = str(module.__class__).split('.')[-1].split("'")[0]
            module_idx = len(summary)
            m_key = '%s-%i' % (class_name, module_idx + 1)
            summary[m_key] = OrderedDict()
            summary[m_key]['input_shape'] = list(input[0].size())
            summary[m_key]['input_shape'][0] = -1
            summary[m_key]['output_shape'] = list(output.size())
            summary[m_key]['output_shape'][0] = -1
            params = 0
            if hasattr(module, 'weight'):
                params += th.prod(th.LongTensor(list(module.weight.size())))
                if module.weight.requires_grad:
                    summary[m_key]['trainable'] = True
                else:
                    summary[m_key]['trainable'] = False
            #if hasattr(module, 'bias'):
            #    params += th.prod(th.LongTensor(list(module.bias.size())))
            summary[m_key]['nb_params'] = params

        # only hook leaf modules (skip containers and the model itself)
        if not isinstance(module, nn.Sequential) and \
           not isinstance(module, nn.ModuleList) and \
           not (module == model):
            hooks.append(module.register_forward_hook(hook))

    dtype = th.cuda.FloatTensor
    # check if there are multiple inputs to the network
    if isinstance(input_size[0], (list, tuple)):
        x = [Variable(th.rand(1, *in_size)).type(dtype) for in_size in input_size]
    else:
        x = Variable(th.rand(1, *input_size)).type(dtype)
    print(x.shape)
    print(type(x[0]))

    # create properties
    summary = OrderedDict()
    hooks = []
    # register hook
    model.apply(register_hook)
    # make a forward pass
    model(x)
    # remove these hooks
    for h in hooks:
        h.remove()

    print('----------------------------------------------------------------')
    line_new = '{:>20} {:>25} {:>15}'.format('Layer (type)', 'Output Shape', 'Param #')
    print(line_new)
    print('================================================================')
    total_params = 0
    trainable_params = 0
    for layer in summary:
        ## input_shape, output_shape, trainable, nb_params
        line_new = '{:>20} {:>25} {:>15}'.format(str(layer), str(summary[layer]['output_shape']), str(summary[layer]['nb_params']))
        total_params += summary[layer]['nb_params']
        if 'trainable' in summary[layer]:
            if summary[layer]['trainable'] == True:
                trainable_params += summary[layer]['nb_params']
        print(line_new)
    print('================================================================')
    print('Total params: ' + str(total_params))
    print('Trainable params: ' + str(trainable_params))
    print('Non-trainable params: ' + str(total_params - trainable_params))
    print('----------------------------------------------------------------')
    return summary
$\textbf{ Parent Summary }$
summary( [3, 32, 32], resnet_parent )
torch.Size([1, 3, 32, 32])
<class 'torch.Tensor'>
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 16, 32, 32] tensor(432)
BatchNorm2d-2 [-1, 16, 32, 32] tensor(16)
ReLU-3 [-1, 16, 32, 32] 0
Conv2d-4 [-1, 160, 32, 32] tensor(23040)
BatchNorm2d-5 [-1, 160, 32, 32] tensor(160)
ReLU-6 [-1, 160, 32, 32] 0
Conv2d-7 [-1, 160, 32, 32] tensor(2.3040e+05)
BatchNorm2d-8 [-1, 160, 32, 32] tensor(160)
Conv2d-9 [-1, 160, 32, 32] tensor(23040)
BatchNorm2d-10 [-1, 160, 32, 32] tensor(160)
ReLU-11 [-1, 160, 32, 32] 0
ResidualBlock-12 [-1, 160, 32, 32] 0
Conv2d-13 [-1, 160, 32, 32] tensor(2.3040e+05)
BatchNorm2d-14 [-1, 160, 32, 32] tensor(160)
ReLU-15 [-1, 160, 32, 32] 0
Conv2d-16 [-1, 160, 32, 32] tensor(2.3040e+05)
BatchNorm2d-17 [-1, 160, 32, 32] tensor(160)
ReLU-18 [-1, 160, 32, 32] 0
ResidualBlock-19 [-1, 160, 32, 32] 0
Conv2d-20 [-1, 160, 32, 32] tensor(2.3040e+05)
BatchNorm2d-21 [-1, 160, 32, 32] tensor(160)
ReLU-22 [-1, 160, 32, 32] 0
Conv2d-23 [-1, 160, 32, 32] tensor(2.3040e+05)
BatchNorm2d-24 [-1, 160, 32, 32] tensor(160)
ReLU-25 [-1, 160, 32, 32] 0
ResidualBlock-26 [-1, 160, 32, 32] 0
Conv2d-27 [-1, 160, 32, 32] tensor(2.3040e+05)
BatchNorm2d-28 [-1, 160, 32, 32] tensor(160)
ReLU-29 [-1, 160, 32, 32] 0
Conv2d-30 [-1, 160, 32, 32] tensor(2.3040e+05)
BatchNorm2d-31 [-1, 160, 32, 32] tensor(160)
ReLU-32 [-1, 160, 32, 32] 0
ResidualBlock-33 [-1, 160, 32, 32] 0
Conv2d-34 [-1, 320, 16, 16] tensor(4.6080e+05)
BatchNorm2d-35 [-1, 320, 16, 16] tensor(320)
ReLU-36 [-1, 320, 16, 16] 0
Conv2d-37 [-1, 320, 16, 16] tensor(9.2160e+05)
BatchNorm2d-38 [-1, 320, 16, 16] tensor(320)
Conv2d-39 [-1, 320, 16, 16] tensor(4.6080e+05)
BatchNorm2d-40 [-1, 320, 16, 16] tensor(320)
ReLU-41 [-1, 320, 16, 16] 0
ResidualBlock-42 [-1, 320, 16, 16] 0
Conv2d-43 [-1, 320, 16, 16] tensor(9.2160e+05)
BatchNorm2d-44 [-1, 320, 16, 16] tensor(320)
ReLU-45 [-1, 320, 16, 16] 0
Conv2d-46 [-1, 320, 16, 16] tensor(9.2160e+05)
BatchNorm2d-47 [-1, 320, 16, 16] tensor(320)
ReLU-48 [-1, 320, 16, 16] 0
ResidualBlock-49 [-1, 320, 16, 16] 0
Conv2d-50 [-1, 320, 16, 16] tensor(9.2160e+05)
BatchNorm2d-51 [-1, 320, 16, 16] tensor(320)
ReLU-52 [-1, 320, 16, 16] 0
Conv2d-53 [-1, 320, 16, 16] tensor(9.2160e+05)
BatchNorm2d-54 [-1, 320, 16, 16] tensor(320)
ReLU-55 [-1, 320, 16, 16] 0
ResidualBlock-56 [-1, 320, 16, 16] 0
Conv2d-57 [-1, 320, 16, 16] tensor(9.2160e+05)
BatchNorm2d-58 [-1, 320, 16, 16] tensor(320)
ReLU-59 [-1, 320, 16, 16] 0
Conv2d-60 [-1, 320, 16, 16] tensor(9.2160e+05)
BatchNorm2d-61 [-1, 320, 16, 16] tensor(320)
ReLU-62 [-1, 320, 16, 16] 0
ResidualBlock-63 [-1, 320, 16, 16] 0
Conv2d-64 [-1, 640, 8, 8] tensor(1.8432e+06)
BatchNorm2d-65 [-1, 640, 8, 8] tensor(640)
ReLU-66 [-1, 640, 8, 8] 0
Conv2d-67 [-1, 640, 8, 8] tensor(3.6864e+06)
BatchNorm2d-68 [-1, 640, 8, 8] tensor(640)
Conv2d-69 [-1, 640, 8, 8] tensor(1.8432e+06)
BatchNorm2d-70 [-1, 640, 8, 8] tensor(640)
ReLU-71 [-1, 640, 8, 8] 0
ResidualBlock-72 [-1, 640, 8, 8] 0
Conv2d-73 [-1, 640, 8, 8] tensor(3.6864e+06)
BatchNorm2d-74 [-1, 640, 8, 8] tensor(640)
ReLU-75 [-1, 640, 8, 8] 0
Conv2d-76 [-1, 640, 8, 8] tensor(3.6864e+06)
BatchNorm2d-77 [-1, 640, 8, 8] tensor(640)
ReLU-78 [-1, 640, 8, 8] 0
ResidualBlock-79 [-1, 640, 8, 8] 0
Conv2d-80 [-1, 640, 8, 8] tensor(3.6864e+06)
BatchNorm2d-81 [-1, 640, 8, 8] tensor(640)
ReLU-82 [-1, 640, 8, 8] 0
Conv2d-83 [-1, 640, 8, 8] tensor(3.6864e+06)
BatchNorm2d-84 [-1, 640, 8, 8] tensor(640)
ReLU-85 [-1, 640, 8, 8] 0
ResidualBlock-86 [-1, 640, 8, 8] 0
Conv2d-87 [-1, 640, 8, 8] tensor(3.6864e+06)
BatchNorm2d-88 [-1, 640, 8, 8] tensor(640)
ReLU-89 [-1, 640, 8, 8] 0
Conv2d-90 [-1, 640, 8, 8] tensor(3.6864e+06)
BatchNorm2d-91 [-1, 640, 8, 8] tensor(640)
ReLU-92 [-1, 640, 8, 8] 0
ResidualBlock-93 [-1, 640, 8, 8] 0
AvgPool2d-94 [-1, 640, 1, 1] 0
Linear-95 [-1, 10] tensor(6400)
WideResNet-96 [-1, 10] 0
================================================================
Total params: tensor(3.8540e+07)
Trainable params: tensor(3.8540e+07)
Non-trainable params: tensor(0)
----------------------------------------------------------------
OrderedDict([('Conv2d-1',
OrderedDict([('input_shape', [-1, 3, 32, 32]),
('output_shape', [-1, 16, 32, 32]),
('trainable', True),
('nb_params', tensor(432))])),
('BatchNorm2d-2',
OrderedDict([('input_shape', [-1, 16, 32, 32]),
('output_shape', [-1, 16, 32, 32]),
('trainable', True),
('nb_params', tensor(16))])),
('ReLU-3',
OrderedDict([('input_shape', [-1, 16, 32, 32]),
('output_shape', [-1, 16, 32, 32]),
('nb_params', 0)])),
('Conv2d-4',
OrderedDict([('input_shape', [-1, 16, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(23040))])),
('BatchNorm2d-5',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-6',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-7',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(2.3040e+05))])),
('BatchNorm2d-8',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('Conv2d-9',
OrderedDict([('input_shape', [-1, 16, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(23040))])),
('BatchNorm2d-10',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-11',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('ResidualBlock-12',
OrderedDict([('input_shape', [-1, 16, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-13',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(2.3040e+05))])),
('BatchNorm2d-14',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-15',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-16',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(2.3040e+05))])),
('BatchNorm2d-17',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-18',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('ResidualBlock-19',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-20',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(2.3040e+05))])),
('BatchNorm2d-21',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-22',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-23',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(2.3040e+05))])),
('BatchNorm2d-24',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-25',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('ResidualBlock-26',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-27',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(2.3040e+05))])),
('BatchNorm2d-28',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-29',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-30',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(2.3040e+05))])),
('BatchNorm2d-31',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-32',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('ResidualBlock-33',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-34',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(4.6080e+05))])),
('BatchNorm2d-35',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-36',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-37',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(9.2160e+05))])),
('BatchNorm2d-38',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('Conv2d-39',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(4.6080e+05))])),
('BatchNorm2d-40',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-41',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('ResidualBlock-42',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-43',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(9.2160e+05))])),
('BatchNorm2d-44',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-45',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-46',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(9.2160e+05))])),
('BatchNorm2d-47',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-48',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('ResidualBlock-49',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-50',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(9.2160e+05))])),
('BatchNorm2d-51',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-52',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-53',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(9.2160e+05))])),
('BatchNorm2d-54',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-55',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('ResidualBlock-56',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-57',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(9.2160e+05))])),
('BatchNorm2d-58',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-59',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-60',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(9.2160e+05))])),
('BatchNorm2d-61',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-62',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('ResidualBlock-63',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-64',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(1.8432e+06))])),
('BatchNorm2d-65',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(640))])),
('ReLU-66',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('Conv2d-67',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(3.6864e+06))])),
('BatchNorm2d-68',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(640))])),
('Conv2d-69',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(1.8432e+06))])),
('BatchNorm2d-70',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(640))])),
('ReLU-71',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('ResidualBlock-72',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('Conv2d-73',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(3.6864e+06))])),
('BatchNorm2d-74',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(640))])),
('ReLU-75',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('Conv2d-76',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(3.6864e+06))])),
('BatchNorm2d-77',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(640))])),
('ReLU-78',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('ResidualBlock-79',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('Conv2d-80',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(3.6864e+06))])),
('BatchNorm2d-81',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(640))])),
('ReLU-82',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('Conv2d-83',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(3.6864e+06))])),
('BatchNorm2d-84',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(640))])),
('ReLU-85',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('ResidualBlock-86',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('Conv2d-87',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(3.6864e+06))])),
('BatchNorm2d-88',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(640))])),
('ReLU-89',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('Conv2d-90',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(3.6864e+06))])),
('BatchNorm2d-91',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('trainable', True),
('nb_params', tensor(640))])),
('ReLU-92',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('ResidualBlock-93',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 8, 8]),
('nb_params', 0)])),
('AvgPool2d-94',
OrderedDict([('input_shape', [-1, 640, 8, 8]),
('output_shape', [-1, 640, 1, 1]),
('nb_params', 0)])),
('Linear-95',
OrderedDict([('input_shape', [-1, 640]),
('output_shape', [-1, 10]),
('trainable', True),
('nb_params', tensor(6400))])),
('WideResNet-96',
OrderedDict([('input_shape', [-1, 3, 32, 32]),
('output_shape', [-1, 10]),
('nb_params', 0)]))])
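As a sanity check on the parent summary above, the per-layer parameter counts follow directly from the layer shapes: a bias-free 3x3 convolution with $C_{in}$ input channels and $C_{out}$ output channels contributes $C_{out} \cdot C_{in} \cdot 3 \cdot 3$ weights. A minimal sketch (not part of the notebook code) that reproduces three of the Conv2d entries:

def conv3x3_params(in_channels, out_channels, kernel_size=3):
    # Weights only (bias-free convolution): out_channels * in_channels * k * k
    return out_channels * in_channels * kernel_size ** 2

print(conv3x3_params(3, 16))     # 432      -> Conv2d-1
print(conv3x3_params(16, 160))   # 23040    -> Conv2d-4
print(conv3x3_params(640, 640))  # 3686400  -> Conv2d-73 (3.6864e+06)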
$\textbf{ Child Summary }$
summary( [3, 32, 32], get_new_child_model()[0] )
ENABLING GPU ACCELERATION || GeForce GTX 1080 Ti
torch.Size([1, 3, 32, 32])
<class 'torch.Tensor'>
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 16, 32, 32] tensor(432)
BatchNorm2d-2 [-1, 16, 32, 32] tensor(16)
ReLU-3 [-1, 16, 32, 32] 0
Conv2d-4 [-1, 160, 32, 32] tensor(23040)
BatchNorm2d-5 [-1, 160, 32, 32] tensor(160)
ReLU-6 [-1, 160, 32, 32] 0
Conv2d-7 [-1, 160, 32, 32] tensor(2.3040e+05)
BatchNorm2d-8 [-1, 160, 32, 32] tensor(160)
Conv2d-9 [-1, 160, 32, 32] tensor(23040)
BatchNorm2d-10 [-1, 160, 32, 32] tensor(160)
ReLU-11 [-1, 160, 32, 32] 0
ResidualBlock-12 [-1, 160, 32, 32] 0
Conv2d-13 [-1, 160, 32, 32] tensor(2.3040e+05)
BatchNorm2d-14 [-1, 160, 32, 32] tensor(160)
ReLU-15 [-1, 160, 32, 32] 0
Conv2d-16 [-1, 160, 32, 32] tensor(2.3040e+05)
BatchNorm2d-17 [-1, 160, 32, 32] tensor(160)
ReLU-18 [-1, 160, 32, 32] 0
ResidualBlock-19 [-1, 160, 32, 32] 0
Conv2d-20 [-1, 320, 16, 16] tensor(4.6080e+05)
BatchNorm2d-21 [-1, 320, 16, 16] tensor(320)
ReLU-22 [-1, 320, 16, 16] 0
Conv2d-23 [-1, 320, 16, 16] tensor(9.2160e+05)
BatchNorm2d-24 [-1, 320, 16, 16] tensor(320)
Conv2d-25 [-1, 320, 16, 16] tensor(4.6080e+05)
BatchNorm2d-26 [-1, 320, 16, 16] tensor(320)
ReLU-27 [-1, 320, 16, 16] 0
ResidualBlock-28 [-1, 320, 16, 16] 0
Conv2d-29 [-1, 320, 16, 16] tensor(9.2160e+05)
BatchNorm2d-30 [-1, 320, 16, 16] tensor(320)
ReLU-31 [-1, 320, 16, 16] 0
Conv2d-32 [-1, 320, 16, 16] tensor(9.2160e+05)
BatchNorm2d-33 [-1, 320, 16, 16] tensor(320)
ReLU-34 [-1, 320, 16, 16] 0
ResidualBlock-35 [-1, 320, 16, 16] 0
AvgPool2d-36 [-1, 320, 1, 1] 0
Linear-37 [-1, 10] tensor(3200)
ResNetChild-38 [-1, 10] 0
================================================================
Total params: tensor(4.4297e+06)
Trainable params: tensor(4.4297e+06)
Non-trainable params: tensor(0)
----------------------------------------------------------------
OrderedDict([('Conv2d-1',
OrderedDict([('input_shape', [-1, 3, 32, 32]),
('output_shape', [-1, 16, 32, 32]),
('trainable', True),
('nb_params', tensor(432))])),
('BatchNorm2d-2',
OrderedDict([('input_shape', [-1, 16, 32, 32]),
('output_shape', [-1, 16, 32, 32]),
('trainable', True),
('nb_params', tensor(16))])),
('ReLU-3',
OrderedDict([('input_shape', [-1, 16, 32, 32]),
('output_shape', [-1, 16, 32, 32]),
('nb_params', 0)])),
('Conv2d-4',
OrderedDict([('input_shape', [-1, 16, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(23040))])),
('BatchNorm2d-5',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-6',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-7',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(2.3040e+05))])),
('BatchNorm2d-8',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('Conv2d-9',
OrderedDict([('input_shape', [-1, 16, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(23040))])),
('BatchNorm2d-10',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-11',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('ResidualBlock-12',
OrderedDict([('input_shape', [-1, 16, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-13',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(2.3040e+05))])),
('BatchNorm2d-14',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-15',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-16',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(2.3040e+05))])),
('BatchNorm2d-17',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('trainable', True),
('nb_params', tensor(160))])),
('ReLU-18',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('ResidualBlock-19',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 160, 32, 32]),
('nb_params', 0)])),
('Conv2d-20',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(4.6080e+05))])),
('BatchNorm2d-21',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-22',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-23',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(9.2160e+05))])),
('BatchNorm2d-24',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('Conv2d-25',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(4.6080e+05))])),
('BatchNorm2d-26',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-27',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('ResidualBlock-28',
OrderedDict([('input_shape', [-1, 160, 32, 32]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-29',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(9.2160e+05))])),
('BatchNorm2d-30',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-31',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('Conv2d-32',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(9.2160e+05))])),
('BatchNorm2d-33',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('trainable', True),
('nb_params', tensor(320))])),
('ReLU-34',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('ResidualBlock-35',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 16, 16]),
('nb_params', 0)])),
('AvgPool2d-36',
OrderedDict([('input_shape', [-1, 320, 16, 16]),
('output_shape', [-1, 320, 1, 1]),
('nb_params', 0)])),
('Linear-37',
OrderedDict([('input_shape', [-1, 320]),
('output_shape', [-1, 10]),
('trainable', True),
('nb_params', tensor(3200))])),
('ResNetChild-38',
OrderedDict([('input_shape', [-1, 3, 32, 32]),
('output_shape', [-1, 10]),
('nb_params', 0)]))])
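Taken together, the two summaries quantify the size gap between the models: the parent WideResNet has about 3.854e+07 trainable parameters, while the child (ResNetChild-38) has about 4.430e+06, roughly a nine-fold reduction. A minimal sketch of how these totals could be verified directly from the models, assuming, as in the call above, that get_new_child_model() returns the child module as its first element:

import torch

def count_trainable_params(model: torch.nn.Module) -> int:
    # Sum the element counts of all parameters that require gradients.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Hypothetical usage mirroring the notebook's own calls:
# child = get_new_child_model()[0]
# print(count_trainable_params(child))  # should agree with the ~4.4297e+06 reported above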