Prepare your own data set for training (based on the Cats vs. Dogs data set)

In the previous posts we used the FashionMNIST data set that PyTorch prepares for us for training and testing. This post describes how to prepare a data set ourselves so that we can handle more scenarios.

This time we use the Cats vs. Dogs data set. Before we begin, we need to organize the data into the following directory structure:

datas
├── train
│   ├── cats
│   │   ├── cat1000.jpg
│   │   ├── cat1001.jpg
│   │   └── ...
│   └── dogs
│       ├── dog1000.jpg
│       ├── dog1001.jpg
│       └── ...
└── valid
    ├── cats
    │   ├── cat0.jpg
    │   ├── cat1.jpg
    │   └── ...
    └── dogs
        ├── dog0.jpg
        ├── dog1.jpg
        └── ...

The train set contains 23,000 images, and the valid set contains 2,000 images used to verify the network's performance.
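
If your raw images all sit in a single folder, a short script can sort them into this layout. The sketch below is only one way to do it, and it makes a few assumptions you may need to adjust for your own copy of the data: the raw images live in a folder called ./raw, their file names start with "cat" or "dog", and the first 1000 images of each class are taken as the validation set.

import os
import shutil

RAW_DIR = "./raw"        # assumed location of the original, unsorted images
DATA_DIR = "./datas"
VALID_PER_CLASS = 1000   # 2 x 1000 = 2000 validation images in total

# Create datas/train/cats, datas/train/dogs, datas/valid/cats and datas/valid/dogs
for split in ["train", "valid"]:
    for cls in ["cats", "dogs"]:
        os.makedirs(os.path.join(DATA_DIR, split, cls), exist_ok=True)

# Copy every image into the folder for its class; the first 1000 per class go to valid
for cls, prefix in [("cats", "cat"), ("dogs", "dog")]:
    files = sorted(f for f in os.listdir(RAW_DIR) if f.startswith(prefix))
    for i, name in enumerate(files):
        split = "valid" if i < VALID_PER_CLASS else "train"
        shutil.copy(os.path.join(RAW_DIR, name),
                    os.path.join(DATA_DIR, split, cls, name))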

1. Using the implicit dictionary form (dictionary comprehensions): the code is concise, but not as easy to understand

import torch as t
import torchvision as tv
import os

data_dir = "./datas"

BATCH_SIZE = 100

EPOCH = 10

transform = {
    x:tv.transforms.Compose(
        # transforms.Resize resizes the image to 64 x 64
        [tv.transforms.Resize([64, 64]), tv.transforms.ToTensor()]
    )
    for x in ["train", "valid"]
}

datasets = {
    x:tv.datasets.ImageFolder(root = os.path.join(data_dir,x),transform=transform[x])
    for x in ["train"."valid"]
}

dataloader = {
    x:t.utils.data.DataLoader(dataset= datasets[x],
        batch_size=BATCH_SIZE,
        shuffle=True
    ) 
    for x in ["train"."valid"]
}

b_x,b_y = next(iter(dataloader["train"]))

print(b_x.shape,b_y.shape)

index_classes = datasets["train"].class_to_idx

print(index_classes)
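
The class_to_idx attribute maps each class name (i.e. each subfolder name) to the integer label that ImageFolder assigned to it. If you later need to go the other way, from a predicted label back to a class name, the dictionary can simply be inverted. A minimal sketch, using the index_classes variable defined above:

# Invert the class -> index mapping so that a class name can be looked up by label
classes_index = {idx: name for name, idx in index_classes.items()}

print(classes_index[0])  # 'cats'
print(classes_index[1])  # 'dogs'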

2. Using the explicit dictionary form: a little more code, but easier to understand

import torch as t
import torchvision as tv
import os


data_dir = "./datas"

BATCH_SIZE = 100

EPOCH = 10

transform = {
    "train":tv.transforms.Compose(
        [tv.transforms.Resize([64, 64]), tv.transforms.ToTensor()]
    ),
    "valid":tv.transforms.Compose(
        [tv.transforms.Resize([64, 64]), tv.transforms.ToTensor()]
    ),
}

datasets = {
    "train":tv.datasets.ImageFolder(root = os.path.join(data_dir,"train"),transform=transform["train"]),
    "vaild":tv.datasets.ImageFolder(root = os.path.join(data_dir,"vaild"),transform=transform["vaild"]),
}
dataloader = {
    "train":t.utils.data.DataLoader(dataset= datasets["train"],
        batch_size=BATCH_SIZE,
        shuffle=True
    ),
    "valid":t.utils.data.DataLoader(dataset= datasets["valid"],
        batch_size=BATCH_SIZE,
        shuffle=True
    )
}

b_x,b_y = next(iter(dataloader["train"]))

print(b_x.shape,b_y.shape)

index_classes = datasets["train"].class_to_idx

print(index_classes)

The output

torch.Size([100, 3, 64, 64]) torch.Size([100])
{'cats': 0, 'dogs': 1}
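
With the data prepared this way, the dataloader can be consumed by an ordinary training loop. The sketch below only illustrates how the batches would be used: the small fully connected network is a hypothetical placeholder, not part of this post, and you would substitute your own model.

import torch as t
import torch.nn as nn

# Placeholder model: flatten the 3 x 64 x 64 image and classify it into 2 classes
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 128),
    nn.ReLU(),
    nn.Linear(128, 2),
)

optimizer = t.optim.Adam(model.parameters(), lr=1e-3)
loss_func = nn.CrossEntropyLoss()

for epoch in range(EPOCH):
    for step, (b_x, b_y) in enumerate(dataloader["train"]):
        output = model(b_x)              # forward pass on one batch of images
        loss = loss_func(output, b_y)    # compare predictions with the labels
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print("epoch:", epoch, "loss:", loss.item())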