from pathlib import Path
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torch import nn
import torch
import torchvision.transforms as tvtfms
from PIL import Image
import pandas as pd
import plotly.express as px
import plotly.io as pio
from matplotlib import pyplot as plt
Introduction
This is the third part of a series of posts dedicated to image classification of components that are parts of telecommunication structures.
In Part 1 we used all the magic of fastai directly, blindly, as introduced in the first lessons of Practical Deep Learning for Coders.
In Part 2 the idea was to apply the lessons from the course Walk with fastai, the missing pieces for success. That meant getting rid of the fastai magic except for the training, and using raw PyTorch for the dataset and dataloader creation and model setup.
Here we are going to use miniai, which is a simple and flexible framework that is being developed in Part 2 of Practical Deep Learning for Coders 2022. To install the framework go to: https://github.com/fastai/course22p2.
Import Libraries
Image Data
There are 514 images (training + validation) in 8 relatively “easy” to distinguish categories (components). Plus there are 20 images for testing.
- Base plate
- Grounding bar
- Identification
- Ladder
- Light
- Lightning rod
- Platform
- Transmission lines
You can download the pictures here.
There are three folders: one for the training set (train), one for the validation set (valid) and one for the test images (test).
path = Path("photos")
[folder.stem for folder in path.iterdir()]
['test', 'train', 'valid']
train_path = path / "train"
valid_path = path / "valid"
test_path = path / "test"
And in each folder there is one folder for each label.
labels = [folder.stem for folder in train_path.iterdir()]
number_of_labels = len(labels)
print(labels)
['base_plate', 'grounding_bar', 'identification', 'ladder', 'light', 'lightning_rod', 'platform', 'transmission_lines']
int_to_label = {k:v for k,v in enumerate(labels)}
label_to_int = {k:v for v,k in int_to_label.items()}
print(label_to_int)
{'base_plate': 0, 'grounding_bar': 1, 'identification': 2, 'ladder': 3, 'light': 4, 'lightning_rod': 5, 'platform': 6, 'transmission_lines': 7}
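One detail worth keeping in mind: `Path.iterdir()` yields entries in arbitrary order, so the integer assigned to each label could change between machines. A minimal sketch of a deterministic variant follows; the helper name `build_label_maps` and the demo folder names are only illustrative, not part of the original code.

```python
import tempfile
from pathlib import Path

def build_label_maps(train_path: Path):
    """Return (labels, int_to_label, label_to_int) with a deterministic order."""
    # Sort the folder names: iterdir() makes no guarantee about ordering.
    labels = sorted(folder.name for folder in train_path.iterdir() if folder.is_dir())
    int_to_label = dict(enumerate(labels))
    label_to_int = {label: idx for idx, label in int_to_label.items()}
    return labels, int_to_label, label_to_int

# Demo on a throwaway directory (folder names here are only for illustration).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["platform", "base_plate", "ladder"]:
        (root / name).mkdir()
    labels, int_to_label, label_to_int = build_label_maps(root)

print(labels)
print(label_to_int)
```

Sorting matters mostly when a saved model is reused later: the mapping used at inference time has to match the one used during training.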
Creating PyTorch Dataset
Dataset Class
This class is based on the amazing Walk with fastai course, whose Using Raw PyTorch lesson has the following note in the Dataset code:
This example is highly based on the work of Sylvain Gugger for the Accelerate notebook example which can be found here: https://github.com/huggingface/notebooks/blob/main/examples/accelerate_examples/simple_cv_example.ipynb
class TowerPartsDataset(Dataset):
    def __init__(self, path:Path, transforms:nn.Sequential, label_to_int:dict):
        self.transforms = transforms
        # Keep the mapping on the instance instead of relying on a global variable
        self.label_to_int = label_to_int
        self.paths = [f for folder in path.iterdir() for f in folder.iterdir()]
        self.to_tensor = tvtfms.ToTensor()

    def __len__(self):
        return len(self.paths)

    def apply_x_transforms(self, filepath):
        image = Image.open(filepath)#.convert("RGB")
        tensor_image = self.to_tensor(image)
        return self.transforms(tensor_image)

    def apply_y_transforms(self, filepath):
        # The label is the name of the folder that contains the image
        label = filepath.parent.name
        return self.label_to_int[label]

    def __getitem__(self, index):
        filepath = self.paths[index]
        x = self.apply_x_transforms(filepath)
        y = self.apply_y_transforms(filepath)
        return (x, y)
Item Transforms
from fastai.vision.data import imagenet_stats
print(imagenet_stats)
item_tfms = nn.Sequential(
    tvtfms.Resize((224, 224)),
    tvtfms.Normalize(*imagenet_stats)
)
([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
item_tfms[0](Image.open(train_path / 'base_plate/CIMG4695.jpg'))
Train and validation datasets
train_dataset = TowerPartsDataset(
    train_path,
    item_tfms,
    label_to_int
)
valid_dataset = TowerPartsDataset(
    valid_path,
    item_tfms,
    label_to_int
)
x, y = train_dataset[0]
x.shape, y
(torch.Size([3, 224, 224]), 0)
DataLoader
batch_size = 64
train_dataloader = DataLoader(
    train_dataset,
    shuffle=True,
    drop_last=True,
    batch_size=batch_size
)
valid_dataloader = DataLoader(
    valid_dataset,
    batch_size=batch_size * 2
)
miniai Dataloaders
from miniai.datasets import DataLoaders, show_images
dls = DataLoaders(train_dataloader, valid_dataloader)
dt = dls.train
xb, yb = next(iter(dt))
xb.shape, yb[:10]
(torch.Size([64, 3, 224, 224]), tensor([6, 7, 0, 0, 2, 6, 6, 6, 3, 5]))
%%time
for _ in train_dataloader: pass
CPU times: user 6min 35s, sys: 49.7 s, total: 7min 25s
Wall time: 1min 25s
from operator import itemgetter
# To avoid warning of clipping input data, a sigmoid is applied
xbt = xb[:16].sigmoid()
ybt = yb[:16]
titles = itemgetter(*ybt.tolist())(int_to_label)
show_images(xbt, imsize=2.25, titles=titles)
As we can see, some pictures contain at least two components that match our labels. So maybe the correct approach would be a multi-label classification.
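To make the multi-label idea concrete: instead of a single integer per picture, each picture would get one 0/1 target per label, typically paired with a sigmoid-based loss such as `nn.BCEWithLogitsLoss` rather than cross-entropy. The sketch below only shows the target encoding; `multi_hot` and the example annotation are hypothetical, since the dataset in this post has one label per picture.

```python
def multi_hot(present_labels, label_to_int):
    """Encode the set of labels present in a picture as a 0/1 vector."""
    target = [0.0] * len(label_to_int)
    for label in present_labels:
        target[label_to_int[label]] = 1.0
    return target

label_to_int = {'base_plate': 0, 'grounding_bar': 1, 'identification': 2, 'ladder': 3,
                'light': 4, 'lightning_rod': 5, 'platform': 6, 'transmission_lines': 7}

# Hypothetical annotation: one picture showing both a ladder and a platform.
target = multi_hot(["ladder", "platform"], label_to_int)
print(target)
```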
Customizing a PyTorch Model
When a model is loaded into a fastai learner, it is customized by changing the last two children (the head).
What we are going to do here is change only what is essential for our problem, that is, the final linear layer, so that it has 8 output features (the number of labels we need to classify).
For more detail on this I encourage you to take Walk with fastai, the missing pieces for success.
Unlike Part 1 and Part 2, here we are going to try resnet18.
from torchvision.models import resnet18
model = resnet18(pretrained=True)
Customizing the last linear layer
We need an output size equal to the number of labels we are trying to predict.
model_child = list(model.children())
model_child[-2:]
[AdaptiveAvgPool2d(output_size=(1, 1)),
Linear(in_features=512, out_features=1000, bias=True)]
We can access the original final layer with model.fc
model.fc
Linear(in_features=512, out_features=1000, bias=True)
It has the 1,000 output features ResNet was trained for. We need to change it to the 8 features (labels, classes) of the problem we are dealing with here:
model.fc = nn.Linear(512, out_features=number_of_labels, bias=True)
model.fc
Linear(in_features=512, out_features=8, bias=True)
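The same replacement can be written without hard-coding 512, by reading the input width from the layer being replaced. This is a minimal sketch: `TinyNet` is a hypothetical stand-in with an `fc` attribute (so the example runs without downloading weights), and `replace_head` is an illustrative helper, not a torchvision or miniai function.

```python
import torch
from torch import nn

class TinyNet(nn.Module):
    """Hypothetical stand-in for a ResNet: a feature extractor followed by `fc`."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 512), nn.ReLU())
        self.fc = nn.Linear(512, 1000)

    def forward(self, x):
        return self.fc(self.features(x))

def replace_head(model: nn.Module, n_classes: int) -> nn.Module:
    # Read the input width from the existing layer instead of hard-coding it.
    model.fc = nn.Linear(model.fc.in_features, n_classes)
    return model

net = replace_head(TinyNet(), n_classes=8)
out = net(torch.randn(2, 3, 8, 8))
print(out.shape)
```

Reading `in_features` from the old layer means the same helper would also work for a backbone whose final layer has a different width.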
Callbacks
For more info about miniai and its callback system:
- Fastai Course Part 2 2022: Understanding CallBacks by Francesco Pochetti
- Redesign your Training Loop with CallBacks by Dien-Hoa Truong
from miniai.learner import MetricsCB, DeviceCB, ProgressCB, TrainLearner
from torcheval.metrics import MulticlassAccuracy
metrics = MetricsCB(accuracy=MulticlassAccuracy())
cbs = [DeviceCB(), metrics, ProgressCB(plot=True)]
The Optimizer
Let’s use the same optimizer as fastai’s default: AdamW
from miniai.activations import set_seed
from miniai.sgd import BatchSchedCB, RecorderCB
from torch.optim import AdamW
from functools import partial
from torch.optim import lr_scheduler
import torch.nn.functional as F
Record the scheduler’s parameters
def _lr(cb): return cb.pg['lr']
def _beta1(cb): return cb.pg['betas'][0]
rec = RecorderCB(lr=_lr, mom=_beta1)
Training with miniai
set_seed(42)
lr, epochs = 0.001, 4
Custom training loop
miniai allows us to easily modify the training loop, in this case by subclassing TrainLearner. We are going to store the validation loss for each sample of the validation dataset (without reducing it to a single value). Since the validation set has 120 pictures and the validation batch size is 64 x 2 = 128, the whole validation set fits in a single batch, so storing the last loss as valid_loss is enough to later use those values and plot the top losses as we do with fastai.
Predictions are stored in learn.preds, as set in miniai's TrainLearner.
class TrainValidLossTrack(TrainLearner):
    def get_loss(self):
        self.loss = self.loss_func(self.preds, self.batch[1])
        # Store loss without reduction in the Learner
        self.valid_loss = self.loss_func(self.preds, self.batch[1], reduction="none")
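To illustrate what `reduction="none"` changes, here is a minimal pure-Python sketch of cross-entropy that returns one value per sample; averaging those values gives the single number a mean-reduced loss would report. The function name and the toy logits are illustrative only, not taken from the training run.

```python
import math

def cross_entropy_per_sample(logits, targets):
    """Unreduced cross-entropy: one loss value per (logits row, target) pair."""
    losses = []
    for row, target in zip(logits, targets):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        losses.append(log_sum_exp - row[target])
    return losses

# Toy batch: 3 samples, 3 classes.
logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0], [1.0, 1.0, 1.0]]
targets = [0, 0, 2]
per_sample = cross_entropy_per_sample(logits, targets)
mean_loss = sum(per_sample) / len(per_sample)
print(per_sample, mean_loss)
```

The second sample is confidently wrong, so its individual loss is the largest; that is exactly the information a single averaged value hides.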
tmax = epochs * len(dls.train)
sched = partial(lr_scheduler.OneCycleLR, max_lr=lr, total_steps=tmax)
xtra = [BatchSchedCB(sched), rec]
learn = TrainValidLossTrack(model, dls, F.cross_entropy, lr=lr, cbs=cbs+xtra, opt_func=AdamW)
learn.fit(epochs)
accuracy | loss | epoch | train |
---|---|---|---|
0.401 | 1.696 | 0 | train |
0.850 | 0.442 | 0 | eval |
0.977 | 0.103 | 1 | train |
0.842 | 0.638 | 1 | eval |
1.000 | 0.013 | 2 | train |
0.875 | 0.621 | 2 | eval |
1.000 | 0.007 | 3 | train |
0.925 | 0.326 | 3 | eval |
Good news, it trained better than (I) expected! It looks like it overfit a little, even though we only trained for 4 epochs and used a smaller model (resnet18) than in Parts 1 and 2 (resnet34).
How the parameters were changed by the scheduler
import matplotlib.pyplot as plt
rec.plot()
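For readers without the plot at hand, the shape of the learning-rate curve can be approximated in plain Python. This is a simplified sketch, assuming what I understand to be the `OneCycleLR` defaults (`pct_start=0.3`, `div_factor=25`, `final_div_factor=1e4`, cosine annealing, two phases); it is not PyTorch's implementation. The 6 batches per epoch are also an assumption, derived from 514 - 120 = 394 training images with batch size 64 and `drop_last=True`.

```python
import math

def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at `step` for a two-phase, cosine-annealed one-cycle policy."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_end = pct_start * total_steps - 1   # last (fractional) step of the warm-up
    last_step = total_steps - 1

    def cos_anneal(start, end, pct):
        return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)

    if step <= warmup_end:
        return cos_anneal(initial_lr, max_lr, step / warmup_end)
    pct = (step - warmup_end) / (last_step - warmup_end)
    return cos_anneal(max_lr, min_lr, pct)

epochs, batches_per_epoch = 4, 6   # assumed: 394 training images // 64
tmax = epochs * batches_per_epoch
schedule = [one_cycle_lr(s, tmax, max_lr=0.001) for s in range(tmax)]
peak_step = max(range(tmax), key=schedule.__getitem__)
print(peak_step, schedule[0], schedule[-1])
```

Under these assumptions the rate warms up from `max_lr / 25` during roughly the first 30% of the steps, then decays towards a value several orders of magnitude smaller by the last step.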
Classification Interpretation
learn.preds.shape, learn.valid_loss.shape
(torch.Size([120, 8]), torch.Size([120]))
actuals = [label_to_int[l.parts[-2]] for l in valid_dataset.paths]
Confusion Matrix
confusion_matrix = [[0]*number_of_labels for n in range(number_of_labels)]
act_pred = list(zip(actuals, learn.preds.argmax(1).tolist()))
for act, pred in act_pred:
    confusion_matrix[act][pred] += 1
x = y = [x for x in labels]
fig = px.imshow(confusion_matrix, labels=dict(x="Predicted", y="Actual", color="# Pics"),
                text_auto=True, x=x, y=x,
                color_continuous_scale='blues', title='Confusion Matrix')
fig.show()
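Once the matrix exists as a list of lists (rows = actual, columns = predicted), summary numbers can be read straight from it. A minimal sketch with a toy 3-class matrix follows; the helper names and the numbers are illustrative, not the results of this post.

```python
def per_class_recall(matrix):
    """Recall per class: diagonal count divided by the row (actual) total."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy 3-class matrix (rows = actual, columns = predicted); not the post's data.
toy = [[5, 0, 0],
       [1, 3, 1],
       [0, 0, 4]]
recall = per_class_recall(toy)
accuracy = overall_accuracy(toy)
print(recall, accuracy)
```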
Plot Top Losses
sorted_losses_idcs = learn.valid_loss.argsort(descending=True).cpu()
dv = dls.valid
xb, yb = next(iter(dv))
xbv = xb[sorted_losses_idcs[:12]].sigmoid()
act_pred_plot = torch.tensor(act_pred)[sorted_losses_idcs[:12]].tolist()
titles = [f"{int_to_label[act]} / \n {int_to_label[pred]}" for act, pred in act_pred_plot][:12]
show_images(xbv, imsize=3.0, title='Actual / Predicted', titles=titles)
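The selection logic behind the plot can be sketched without tensors: sort the sample indices by loss, largest first, and look up the matching (actual, predicted) pair. Note that this indexing only lines up because the validation DataLoader is not shuffled, so position i in the losses corresponds to position i in `valid_dataset.paths`. The helper name and the values below are illustrative only.

```python
def top_losses(losses, act_pred, k):
    """Return (index, loss, actual, predicted) for the k largest losses."""
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return [(i, losses[i], *act_pred[i]) for i in order[:k]]

# Toy values for illustration only.
losses = [0.02, 2.91, 0.32, 1.10]
act_pred = [(0, 0), (0, 2), (1, 1), (3, 5)]
worst = top_losses(losses, act_pred, k=2)
print(worst)
```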
Save the model
mdl_path = Path('models')
torch.save(learn.model, mdl_path/'exported_resnet18_pytorch-miniai.pth')
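An alternative that the PyTorch saving and loading tutorial recommends is to save only the `state_dict` and rebuild the architecture before loading. The sketch below uses a hypothetical small model (`make_model`) and a temporary file in place of resnet18 and the `models` folder. It is also worth knowing that, depending on the PyTorch version, `torch.load` may refuse to unpickle a whole model object unless `weights_only=False` is passed, which is another reason to prefer the `state_dict` route.

```python
import tempfile
from pathlib import Path

import torch
from torch import nn

def make_model() -> nn.Module:
    # Hypothetical small model; in this post it would be resnet18 with an 8-way `fc`.
    return nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 8))

model = make_model()
with tempfile.TemporaryDirectory() as tmp:
    weights_path = Path(tmp) / "weights.pth"
    torch.save(model.state_dict(), weights_path)      # parameters only

    restored = make_model()                            # rebuild the architecture first
    restored.load_state_dict(torch.load(weights_path))
restored.eval()

x = torch.randn(3, 4)
same = torch.allclose(model(x), restored(x))
print(same)
```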
GPU Inference
loaded_model = torch.load(mdl_path/'exported_resnet18_pytorch-miniai.pth')
model evaluation
Remember that you must call model.eval() to set dropout and batch normalization layers to evaluation mode before running inference.
https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-and-loading-models
loaded_model.eval();
Test dataset
test_files = list(test_path.iterdir())
print(test_files)
[Path('photos/test/archipielago-los-roques-078.jpg'), Path('photos/test/archipielago-los-roques-313.jpg'), Path('photos/test/cantv-mata-palo-350.jpg'), Path('photos/test/CIMG8119.jpg'), Path('photos/test/DSC00191.jpg'), Path('photos/test/DSC01399.jpg'), Path('photos/test/DSC01537.jpg'), Path('photos/test/DSC01628.jpg'), Path('photos/test/DSC01657.jpg'), Path('photos/test/DSC01723.jpg'), Path('photos/test/DSC01734.jpg'), Path('photos/test/DSC01955.jpg'), Path('photos/test/DSC01956.jpg'), Path('photos/test/DSC04892.jpg'), Path('photos/test/DSC05105.jpg'), Path('photos/test/DSC05130.jpg'), Path('photos/test/DSC09446.jpg'), Path('photos/test/DSC09524.jpg'), Path('photos/test/DSC09685.jpg'), Path('photos/test/DSC09908.jpg')]
show_images([item_tfms[0](Image.open(f)) for f in test_files], imsize=2.5)
to_tensor = tvtfms.ToTensor()
def get_predictions(im_files:list, model:"torchvision.models"):
    tensor_images = []
    for file in im_files:
        tensor_images.append(item_tfms(to_tensor(Image.open(file)).cuda()))
    return model(torch.stack(tensor_images)).argmax(1)
print([int_to_label[pred.item()] for pred in get_predictions(test_files, loaded_model)])
['grounding_bar', 'ladder', 'light', 'platform', 'base_plate', 'light', 'platform', 'grounding_bar', 'base_plate', 'base_plate', 'ladder', 'light', 'lightning_rod', 'ladder', 'identification', 'ladder', 'lightning_rod', 'ladder', 'light', 'identification']
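For inference it is common to wrap the forward pass in `torch.no_grad()` so no gradient graph is built, and a softmax over the logits gives a rough confidence for each prediction. The sketch below is device-agnostic and uses a hypothetical stand-in model with random inputs; `predict_with_confidence` is an illustrative helper, not part of miniai.

```python
import torch
from torch import nn

def predict_with_confidence(model: nn.Module, batch: torch.Tensor):
    """Return (predicted class indices, softmax confidence) without tracking gradients."""
    model.eval()
    with torch.no_grad():
        probs = model(batch).softmax(dim=1)
    confidence, preds = probs.max(dim=1)
    return preds, confidence

device = "cuda" if torch.cuda.is_available() else "cpu"
# Hypothetical stand-in model and random inputs, just to exercise the function.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 4 * 4, 8)).to(device)
preds, confidence = predict_with_confidence(toy_model, torch.randn(5, 3, 4, 4, device=device))
print(preds.tolist(), confidence.tolist())
```

A low confidence on a test picture could be a hint that it shows more than one component, which ties back to the multi-label idea.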
Almost all the images in the test set are predicted correctly. That's OK for the purpose of this post. But as said before, the better approach would probably be multi-label classification.
Conclusions
- It was great to be able to train to more than 90% accuracy using PyTorch and miniai, starting with a pretrained resnet18 model.
- We implemented Confusion Matrix and Plot Top Losses from scratch, techniques available in fastai’s Classification Interpretation.
- We also did batch inference on the GPU.