Author: Dr. Vaibhav Kumar | Compiled by: VK | Source: Analytics India Magazine

There are many popular variants of artificial neural networks for both supervised and unsupervised learning problems. The autoencoder is one such variant, mainly used for unsupervised learning problems.

When they have multiple hidden layers in the architecture, they are called deep autoencoders. These models can be applied to a variety of applications including image reconstruction.

In image reconstruction, they learn a representation of the input image and reconstruct a new image that matches the original input. Image reconstruction has many important applications, especially in the medical field, where clean, noise-free images must be recovered from incomplete or noisy ones.

In this article, we will demonstrate the implementation of a deep autoencoder for image reconstruction in PyTorch. The model will be trained on the MNIST handwritten digits and will reconstruct the digit images after learning a representation of the input images.

Autoencoder

Autoencoders are variants of artificial neural networks commonly used to learn efficient data encodings in an unsupervised manner.

They usually learn within a representation learning scheme, in which they learn an encoding for a set of data. By learning a representation of the input data, the network can reconstruct that input closely. The basic structure of the autoencoder is shown below.

The architecture typically consists of an input layer, an output layer, and one or more hidden layers that connect the input and output layers. The output layer has the same number of nodes as the input layer because it reconstructs the input.

In its simplest form there is only one hidden layer, whereas a deep autoencoder has multiple hidden layers. This added depth can reduce the computational cost of representing certain functions and the amount of training data needed to learn them. Application areas include anomaly detection, image processing, information retrieval, and drug discovery.
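
As a rough illustration of the structure described above, the following sketch builds an autoencoder with a single hidden layer and checks that the output has the same shape as the input. The class name TinyAutoencoder and the hidden size of 32 are assumptions chosen for illustration; they are not part of the model built later in this article.

```python
import torch
import torch.nn as nn


class TinyAutoencoder(nn.Module):
    """Hypothetical single-hidden-layer autoencoder (illustration only)."""

    def __init__(self, n_inputs=784, n_hidden=32):
        super().__init__()
        self.encoder = nn.Linear(n_inputs, n_hidden)  # compress the input
        self.decoder = nn.Linear(n_hidden, n_inputs)  # reconstruct the input

    def forward(self, x):
        code = torch.relu(self.encoder(x))
        return self.decoder(code)


tiny = TinyAutoencoder()
batch = torch.rand(4, 784)      # four random stand-ins for flattened images
reconstruction = tiny(batch)
print(batch.shape, reconstruction.shape)
```

Because the output layer has as many nodes as the input layer, both printed shapes are identical.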

Implementing a deep autoencoder in PyTorch

First, we’ll import all the necessary libraries.

import os
import torch 
import torchvision
import torch.nn as nn
import torchvision.transforms as transforms
import torch.optim as optim
import matplotlib.pyplot as plt
import torch.nn.functional as F
from torchvision import datasets
from torch.utils.data import DataLoader
from torchvision.utils import save_image
from PIL import Image

Now we will define the values of the hyperparameters.

Epochs = 100
Lr_Rate = 1e-3
Batch_Size = 128

The following transform converts each image to a tensor and normalizes it, as required by the PyTorch model.

transform = transforms.Compose([
    transforms.ToTensor(),                 # scale pixel values to [0, 1]
    transforms.Normalize((0.5,), (0.5,))   # then shift and scale to [-1, 1]
])
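
To make the effect of the normalization step concrete, here is a small sketch in plain Python. It assumes pixel values already scaled to [0, 1] (as ToTensor produces) and applies the same arithmetic that Normalize uses with a mean of 0.5 and a standard deviation of 0.5. The helper names normalize and denormalize are hypothetical and only serve this illustration.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Apply (pixel - mean) / std, the per-channel formula used by Normalize."""
    return (pixel - mean) / std


def denormalize(value, mean=0.5, std=0.5):
    """Invert the normalization, e.g. before displaying an image."""
    return value * std + mean


pixels = [0.0, 0.25, 0.5, 1.0]
normalized = [normalize(p) for p in pixels]
restored = [denormalize(v) for v in normalized]

print(normalized)  # [-1.0, -0.5, 0.0, 1.0]
print(restored)    # [0.0, 0.25, 0.5, 1.0]
```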

Using the code snippet below, we will download the MNIST handwritten digit dataset and prepare it for further processing.

train_set = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_set = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# DataLoader expects the lowercase keyword batch_size
train_loader = DataLoader(train_set, batch_size=Batch_Size, shuffle=True)
test_loader = DataLoader(test_set, batch_size=Batch_Size, shuffle=True)
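
The sketch below shows how a DataLoader splits a dataset into batches. To avoid downloading anything, it uses random tensors shaped like MNIST images instead of the real data; the names fake_images, fake_labels, fake_set and fake_loader and the dataset size of 300 are assumptions made for this example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 300 random stand-ins for 1x28x28 MNIST images, with random digit labels
fake_images = torch.rand(300, 1, 28, 28)
fake_labels = torch.randint(0, 10, (300,))
fake_set = TensorDataset(fake_images, fake_labels)

fake_loader = DataLoader(fake_set, batch_size=128, shuffle=True)

# Collect the shape of every batch the loader yields in one pass
batch_shapes = [tuple(images.shape) for images, _ in fake_loader]
print(len(fake_loader))  # 3
print(batch_shapes)
```

With 300 samples and a batch size of 128, the loader yields two full batches and a final batch of 44 samples.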

Let’s look at some information about training data and its classes.

print(train_set)

print(train_set.classes)

In the next step, we will define an Autoencoder class for defining the model.

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()

        # encoder
        self.enc1 = nn.Linear(in_features=784, out_features=256) # Input image (28*28 = 784)
        self.enc2 = nn.Linear(in_features=256, out_features=128)
        self.enc3 = nn.Linear(in_features=128, out_features=64)
        self.enc4 = nn.Linear(in_features=64, out_features=32)
        self.enc5 = nn.Linear(in_features=32, out_features=16)

        # decoder
        self.dec1 = nn.Linear(in_features=16, out_features=32)
        self.dec2 = nn.Linear(in_features=32, out_features=64)
        self.dec3 = nn.Linear(in_features=64, out_features=128)
        self.dec4 = nn.Linear(in_features=128, out_features=256)
        self.dec5 = nn.Linear(in_features=256, out_features=784) # Output image (28*28 = 784)

    def forward(self, x):
        # encoder: 784 -> 16
        x = F.relu(self.enc1(x))
        x = F.relu(self.enc2(x))
        x = F.relu(self.enc3(x))
        x = F.relu(self.enc4(x))
        x = F.relu(self.enc5(x))

        # decoder: 16 -> 784
        x = F.relu(self.dec1(x))
        x = F.relu(self.dec2(x))
        x = F.relu(self.dec3(x))
        x = F.relu(self.dec4(x))
        x = F.relu(self.dec5(x))  # final ReLU keeps outputs non-negative
        return x
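
As a quick sanity check on the architecture, the following plain-Python sketch counts the trainable parameters implied by the layer sizes above. It assumes each linear layer has one weight per input-output pair plus one bias per output, which is the default for nn.Linear. The variable names are chosen for this example.

```python
# Layer widths from the input, through the 16-unit bottleneck, to the output
layer_sizes = [784, 256, 128, 64, 32, 16, 32, 64, 128, 256, 784]

# Each linear layer contributes in_features * out_features weights
# plus out_features biases
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

# Ratio between the input size and the bottleneck size
compression_ratio = layer_sizes[0] / min(layer_sizes)

print(total_params)       # 490208
print(compression_ratio)  # 49.0
```

The bottleneck therefore describes each 784-pixel image with 16 values, a 49-fold reduction.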

Now, we will create the Autoencoder model as an object of the Autoencoder class defined above.

model = Autoencoder()
print(model)

Now, we will define the loss function and the optimizer.

criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=Lr_Rate)  # 'model' is the Autoencoder created above
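
The mean squared error used here is the average of the squared element-wise differences between the reconstruction and the target. The sketch below compares nn.MSELoss (with its default mean reduction) against the same quantity computed by hand; the small tensors are made-up values for illustration.

```python
import torch
import torch.nn as nn

target = torch.tensor([[0.0, 1.0], [1.0, 0.0]])
output = torch.tensor([[0.5, 1.0], [1.0, 1.0]])

loss_builtin = nn.MSELoss()(output, target)
loss_manual = ((output - target) ** 2).mean()  # (0.25 + 0 + 0 + 1) / 4

print(loss_builtin.item(), loss_manual.item())  # 0.3125 0.3125
```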

The following function returns the CUDA device if a GPU is available and falls back to the CPU otherwise.

def get_device():
    if torch.cuda.is_available():
        device = 'cuda:0'
    else:
        device = 'cpu'
    return device

The following function creates a directory to hold the results.

def make_dir():
    image_dir = 'MNIST_Out_Images'
    if not os.path.exists(image_dir):
        os.makedirs(image_dir)

The following function saves the reconstructed images generated by the model.

def save_decod_img(img, epoch):
    # reshape the flat 784-value outputs back into 1x28x28 images
    img = img.view(img.size(0), 1, 28, 28)
    save_image(img, './MNIST_Out_Images/Autoencoder_image{}.png'.format(epoch))
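
The reshaping step can be checked in isolation. This sketch, which uses a made-up batch of two flattened images, shows that view turns a (batch, 784) tensor into (batch, 1, 28, 28) and that flattening it again recovers the original values.

```python
import torch

# Two fake flattened images with easily recognizable values
flat = torch.arange(2 * 784, dtype=torch.float32).view(2, 784)

# Same reshape as in the saving function: (batch, 784) -> (batch, 1, 28, 28)
images = flat.view(flat.size(0), 1, 28, 28)

# Flattening again, as done before feeding images to the model
round_trip = images.view(images.size(0), -1)

print(images.shape)      # torch.Size([2, 1, 28, 28])
print(round_trip.shape)  # torch.Size([2, 784])
```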

The following function is called to train the model.

def training(model, train_loader, Epochs):
    train_loss = []
    for epoch in range(Epochs):
        running_loss = 0.0
        for data in train_loader:
            img, _ = data
            img = img.to(device)
            img = img.view(img.size(0), -1)
            optimizer.zero_grad()
            outputs = model(img)
            loss = criterion(outputs, img)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()

        loss = running_loss / len(train_loader)
        train_loss.append(loss)
        print('Epoch {} of {}, Train Loss: {:.3f}'.format(
            epoch+1, Epochs, loss))

        if epoch % 5 == 0:
            save_decod_img(outputs.cpu().data, epoch)

    return train_loss
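
The core of the training loop is the sequence zero_grad, forward pass, backward pass, and optimizer step. The sketch below runs that sequence once on a toy linear model with random data, so it executes in a fraction of a second. The names toy_model, toy_optimizer, toy_criterion and toy_batch and the layer size of 8 are assumptions for this illustration, not part of the article's model.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

toy_model = nn.Linear(8, 8)                       # stand-in for the autoencoder
toy_optimizer = optim.Adam(toy_model.parameters(), lr=1e-3)
toy_criterion = nn.MSELoss()
toy_batch = torch.rand(16, 8)                     # stand-in for flattened images

weights_before = toy_model.weight.detach().clone()

toy_optimizer.zero_grad()                         # clear old gradients
toy_outputs = toy_model(toy_batch)                # forward pass
toy_loss = toy_criterion(toy_outputs, toy_batch)  # the input is its own target
toy_loss.backward()                               # compute gradients
toy_optimizer.step()                              # update the weights

weights_after = toy_model.weight.detach().clone()
print(toy_loss.item())
print(torch.equal(weights_before, weights_after))  # False: the weights moved
```

Note that the target passed to the loss is the input batch itself, which is what makes this a reconstruction objective.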

The following function tests image reconstruction with the trained model on one batch of test images.

def test_image_reconstruct(model, test_loader):
    for batch in test_loader:
        img, _ = batch
        img = img.to(device)
        img = img.view(img.size(0), -1)
        outputs = model(img)
        # reshape the flat outputs back into 1x28x28 images
        outputs = outputs.view(outputs.size(0), 1, 28, 28).cpu().data
        save_image(outputs, 'MNIST_reconstruction.png')
        break  # only the first batch is needed
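
Beyond saving one batch of reconstructions, it can be useful to measure the reconstruction error numerically. The following is a minimal sketch of such a check, not part of the original code: the helper average_reconstruction_loss is hypothetical, and it uses model.eval() and torch.no_grad() so that no gradients are tracked during evaluation.

```python
import torch
import torch.nn as nn


def average_reconstruction_loss(model, loader, device='cpu'):
    """Hypothetical helper: mean MSE between inputs and reconstructions."""
    criterion = nn.MSELoss()
    model.eval()
    total, batches = 0.0, 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for img, _ in loader:
            img = img.to(device).view(img.size(0), -1)
            total += criterion(model(img), img).item()
            batches += 1
    return total / batches


# Usage with stand-ins: an identity "model" reconstructs its input perfectly
demo_loader = [(torch.rand(4, 1, 28, 28), torch.zeros(4))]
demo_loss = average_reconstruction_loss(nn.Identity(), demo_loader)
print(demo_loss)  # 0.0
```

With the trained autoencoder and the real test loader in place of the stand-ins, the returned value gives a single number to compare across training runs.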

Before training, the model is moved to the selected device (the GPU if CUDA is available), and the directory for the resulting images is created using the functions defined above.

device = get_device()
model.to(device)
make_dir()

Now, the model will be trained.

train_loss = training(model, train_loader, Epochs)

After successful training, we will visualize the loss during training.

plt.figure()
plt.plot(train_loss)
plt.title('Train Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.savefig('deep_ae_mnist_loss.png')

We will visualize some of the images saved during the training.

Image.open('/content/MNIST_Out_Images/Autoencoder_image0.png')

Image.open('/content/MNIST_Out_Images/Autoencoder_image50.png')

Image.open('/content/MNIST_Out_Images/Autoencoder_image95.png')
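
The file names opened above follow from the saving rule in the training function, which writes an image whenever the zero-based epoch index is divisible by 5. This short sketch lists the epochs and file names that rule produces for 100 epochs; the variable names are chosen for this example.

```python
epochs = 100

# Same condition as in the training loop: epoch % 5 == 0
saved_epochs = [epoch for epoch in range(epochs) if epoch % 5 == 0]
filenames = ['Autoencoder_image{}.png'.format(e) for e in saved_epochs]

print(len(saved_epochs))  # 20
print(filenames[0])       # Autoencoder_image0.png
print(filenames[-1])      # Autoencoder_image95.png
```

This is why the last saved image is numbered 95 rather than 100.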

In the last step, we will test our autoencoder model to reconstruct the image.

test_image_reconstruct(model, test_loader)

Image.open('/content/MNIST_reconstruction.png')

We can see that the autoencoder begins to reconstruct the images from the very start of training. After the first epoch the reconstruction quality is poor, and it does not improve noticeably until around epoch 50.

After the full training run, the images saved at epoch 95 and the test reconstructions match the original input images very well.

Judging by the loss values, the number of epochs can reasonably be set to 100 or 200.

With longer training, clearer reconstructions can be expected. Even so, this demonstration shows how to implement a deep autoencoder for image reconstruction in PyTorch.

References:

  1. Sovit Ranjan Rath, “Implementing Deep Autoencoder in PyTorch”
  2. Abien Fred Agarap, “Implementing an Autoencoder in PyTorch”
  3. Reyhane Askari, “Auto Encoders”

The original link: analyticsindiamag.com/hands-on-gu…