PyTorch offers a wide range of pretrained models, including ResNet, VGG, and DenseNet, that can be used for transfer learning.
These models have already been trained on large datasets such as ImageNet; most torchvision weights are trained on the ImageNet-1k subset, which contains roughly 1.28 million training images across 1,000 categories, while the full ImageNet database spans over 14 million images across 21,841 categories.
You can leverage these models to fine-tune them on your own dataset, saving you time and computational resources.
Pretrained models are particularly useful for image classification tasks, where they can be used as a starting point for your own model.
Preparing the Dataset
To prepare a dataset for fine-tuning a pretrained model, you need to download a dataset and process it for training. This involves applying a preprocessing function over the entire dataset using the 🤗 Datasets map method.
You can create a smaller subset of the full dataset to fine-tune on to reduce the time it takes. For example, the Yelp Reviews dataset can be used, which contains a large number of text reviews that can be processed using a tokenizer and padding/truncation strategy.
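As a minimal sketch of this workflow (assuming the 🤗 Hub's yelp_review_full dataset and a BERT tokenizer, since the article doesn't pin down either), the preprocessing might look like this:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("yelp_review_full")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize_function(examples):
    # Pad/truncate every review to the model's maximum input length.
    return tokenizer(examples["text"], padding="max_length", truncation=True)

# Apply the preprocessing function over the entire dataset with map().
tokenized_datasets = dataset.map(tokenize_function, batched=True)

# Optionally fine-tune on a smaller subset to reduce the time it takes.
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(1000))
```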
The Chessman image dataset from Kaggle is another example of a smaller dataset that can be used for fine-tuning. This dataset contains 556 images distributed over 6 classes, with a significant imbalance in the number of images per class.
Here's a breakdown of the classes and the number of images in each class:
- Bishop: 87
- King: 76
- Knight: 106
- Pawn: 107
- Queen: 78
- Rook: 102
To create the datasets for training, you'll need to split the dataset into training and validation sets. This can be done by creating subsets of the data, as shown in the code snippet below.
The dataset preparation involves defining constants for the data root directory path, validation split ratio, image size for resizing, batch size, and number of parallel processes for data preparation. This is typically done in a file called datasets.py.
To use a pretrained model from torchvision.models, you'll need to prepare a specific transform for your images. This can be done using the create_dataloaders function from the data_setup.py script, which calculates the means and standard deviations across a subset of images.
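The exact create_dataloaders implementation isn't reproduced here, but a rough sketch of the overall flow, with an assumed data directory and the standard ImageNet normalization statistics in place of the ones computed by the script, could look like this:

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

ROOT_DIR = "input/chessman_dataset"  # hypothetical data root directory
VALID_SPLIT = 0.15                   # fraction of images held out for validation
IMAGE_SIZE = 224                     # resize target for torchvision pretrained models
BATCH_SIZE = 32
NUM_WORKERS = 4                      # parallel processes for data loading

# Standard ImageNet statistics; the article's data_setup.py computes its own
# means and standard deviations from a subset of images instead.
transform = transforms.Compose([
    transforms.Resize((IMAGE_SIZE, IMAGE_SIZE)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

dataset = datasets.ImageFolder(root=ROOT_DIR, transform=transform)
valid_size = int(VALID_SPLIT * len(dataset))
train_size = len(dataset) - valid_size
train_dataset, valid_dataset = torch.utils.data.random_split(
    dataset, [train_size, valid_size],
    generator=torch.Generator().manual_seed(42))

train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE,
                          shuffle=True, num_workers=NUM_WORKERS)
valid_loader = DataLoader(valid_dataset, batch_size=BATCH_SIZE,
                          shuffle=False, num_workers=NUM_WORKERS)
```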
Training with PyTorch
Training with PyTorch can be a straightforward process. You can create a Trainer object with your model, training arguments, training and test datasets, and evaluation function, then fine-tune your model by calling train().
The Trainer class takes care of the training loop and lets you fine-tune a model in a single line of code. However, if you prefer to write your own training loop in native PyTorch, you'll need to manually postprocess the tokenized dataset to prepare it for training.
Here's a step-by-step guide to preparing the dataset for training:
1. Remove the text column because the model does not accept raw text as an input.
2. Rename the label column to labels because the model expects the argument to be named labels.
3. Set the format of the dataset to return PyTorch tensors instead of lists.
Once you've prepared the dataset, you can create a smaller subset of the dataset to speed up the fine-tuning process.
To keep track of your training progress, you can use the tqdm library to add a progress bar over the number of training steps. This will help you monitor your model's performance during training.
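Here is a minimal sketch of such a loop, assuming a 🤗 Transformers model and a DataLoader named train_dataloader built from the postprocessed dataset (both are covered later in this article); the learning rate and epoch count are only illustrative:

```python
from tqdm.auto import tqdm
import torch
from torch.optim import AdamW

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

optimizer = AdamW(model.parameters(), lr=5e-5)   # illustrative learning rate

num_epochs = 3
num_training_steps = num_epochs * len(train_dataloader)
progress_bar = tqdm(range(num_training_steps))   # progress bar over training steps

model.train()
for epoch in range(num_epochs):
    for batch in train_dataloader:
        batch = {k: v.to(device) for k, v in batch.items()}
        outputs = model(**batch)   # 🤗 Transformers models compute the loss when labels are passed
        loss = outputs.loss
        loss.backward()

        optimizer.step()
        optimizer.zero_grad()
        progress_bar.update(1)
```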
The training script is the final piece of the puzzle before you start training. You'll need to write the training script in the train.py file, which will include the imports and building of the argument parser. The argument parser will have flags such as --epochs, --pretrained, and --learning-rate, which you can use to control the training process.
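A sketch of that argument parser might look like the following; the default values here are assumptions, not the article's:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, default=10,
                    help="number of epochs to train for")
parser.add_argument("--pretrained", action="store_true",
                    help="use pretrained ImageNet weights")
parser.add_argument("--learning-rate", type=float, default=0.001,
                    help="learning rate for the optimizer")
args = parser.parse_args()
```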
Here's an overview of the training process:
- Scheduling the learning rate
- Saving the best model
PyTorch provides a range of features that make it easy to train models, including dynamic computational graphs, tensor computation, automatic differentiation, and neural network building blocks. With PyTorch, you can easily prototype and experiment with different models and architectures.
Here's a summary of PyTorch's features:
- Dynamic Computational Graph: PyTorch builds the computation graph on the fly as operations are executed, rather than requiring it to be defined up front.
- Tensor Computation: PyTorch provides a powerful, NumPy-like tensor library with GPU acceleration.
- Automatic Differentiation: PyTorch can compute and handle gradients automatically, even for custom operations over tensors.
- Neural Network Building Blocks: the torch.nn module provides layers, losses, and other components for building neural networks.
- Dynamic Neural Networks: the structure of a network can change from one forward pass to the next.
Using Pretrained Models
Pretrained models can significantly reduce training time and improve performance on deep learning tasks. You can find pretrained models in various places, including the PyTorch domain libraries, the HuggingFace Hub, the timm (PyTorch Image Models) library, and Papers with Code.
The PyTorch domain libraries, such as torchvision, torchtext, and torchaudio, come with pretrained models of some form that work right within PyTorch. You can access these models by importing the relevant libraries and exploring their documentation.
HuggingFace Hub offers a series of pretrained models on many different domains, including vision, text, and audio, from organizations around the world. You can find plenty of different datasets too.
Some popular pretrained models include ResNets, VGG, EfficientNets, Vision Transformers (ViTs), and ConvNeXt. You can find these models in torchvision.models and use them as a starting point for your own tasks.
Here are some common architecture backbones and their corresponding code in torchvision.models:
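- ResNet: torchvision.models.resnet18(), torchvision.models.resnet50()
- VGG: torchvision.models.vgg16()
- EfficientNet: torchvision.models.efficientnet_b0() through torchvision.models.efficientnet_b7()
- Vision Transformer (ViT): torchvision.models.vit_b_16()
- ConvNeXt: torchvision.models.convnext_tiny(), torchvision.models.convnext_small()

This is only a representative selection; torchvision.models includes many more variants of each architecture.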
By starting from a pretrained model, you significantly reduce the training time needed for a new task and improve performance on that task, even with a limited amount of task-specific data.
Implementing a Model
To start implementing a model, you'll need to choose a pre-trained model. In the case of the EfficientNet_B0 model, it's been trained on millions of images and has achieved ~77.7% accuracy across ImageNet's 1000 classes.
The pre-trained model can be used as a starting point for your own image classification task, such as classifying pizza, steak, and sushi images. To set up the EfficientNet_B0 model, you can use the torchvision.models.efficientnet_b0() function.
Here's a breakdown of the pre-trained model's architecture:
- Features: a collection of convolutional layers and other activation layers to learn a base representation of vision data.
- Avgpool: takes the average of the output of the features layer(s) and turns it into a feature vector.
- Classifier: turns the feature vector into a vector with the same dimensionality as the number of required output classes.
This pre-trained model has already been trained on a large dataset, so you can leverage its knowledge to improve your own model's performance.
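As a sketch (assuming torchvision 0.13+ and the three-class pizza/steak/sushi setup mentioned above), setting up EfficientNet_B0 as a frozen feature extractor might look like this:

```python
import torch
import torchvision
from torch import nn

weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT  # pretrained ImageNet weights
model = torchvision.models.efficientnet_b0(weights=weights)

# Freeze the "features" section so only the new classifier head is trained.
for param in model.features.parameters():
    param.requires_grad = False

# Replace the classifier head so its output matches the number of target classes.
model.classifier = nn.Sequential(
    nn.Dropout(p=0.2, inplace=True),
    nn.Linear(in_features=1280, out_features=3),  # 3 classes: pizza, steak, sushi
)
```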
Training Hyperparameters
You can create a TrainingArguments class to store hyperparameters and flags for training options. This class is where you can tune hyperparameters to find the optimal settings.
To save checkpoints from your training, you'll need to specify a location in the TrainingArguments class.
The Trainer object requires several inputs, including your model, training arguments, training and test datasets, and an evaluation function.
You can create a Trainer object by calling the Trainer class with these inputs, and then fine-tune your model by calling the train() method.
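Putting those pieces together, a minimal sketch might look like the following, assuming the tokenized Yelp subsets and the compute_metrics function discussed elsewhere in this article, and a recent transformers release where the argument is named eval_strategy:

```python
from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer

# Assumed: a 5-class sequence classification head for the Yelp review stars.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)

training_args = TrainingArguments(output_dir="test_trainer", eval_strategy="epoch")

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
    compute_metrics=compute_metrics,
)

trainer.train()
```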
Train in Native PyTorch
You can fine-tune a model in a single line of code using Trainer, which takes care of the entire training loop.
If you prefer more control over the training process, you can instead fine-tune a 🤗 Transformers model in native PyTorch by writing the training loop yourself.
To manually postprocess tokenized_dataset, you'll need to remove the text column because the model doesn't accept raw text as an input.
Remove the text column by running the following command: tokenized_datasets = tokenized_datasets.remove_columns(["text"])
Rename the label column to labels because the model expects the argument to be named labels.
Rename the label column by running the following command: tokenized_datasets = tokenized_datasets.rename_column("label", "labels")
Set the format of the dataset to return PyTorch tensors instead of lists.
Set the format of the dataset by running the following command: tokenized_datasets.set_format("torch")
Here's a quick summary of the steps to postprocess tokenized_dataset:
- Remove the text column: tokenized_datasets = tokenized_datasets.remove_columns(["text"])
- Rename the label column: tokenized_datasets = tokenized_datasets.rename_column("label", "labels")
- Set the format to PyTorch tensors: tokenized_datasets.set_format("torch")
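Collected together, and with an assumed next step of wrapping the subsets in DataLoaders for a native training loop, the postprocessing might look like this:

```python
from torch.utils.data import DataLoader

# The three postprocessing steps from above.
tokenized_datasets = tokenized_datasets.remove_columns(["text"])
tokenized_datasets = tokenized_datasets.rename_column("label", "labels")
tokenized_datasets.set_format("torch")

# Smaller subsets to speed up fine-tuning, then DataLoaders for the manual loop.
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(1000))

train_dataloader = DataLoader(small_train_dataset, shuffle=True, batch_size=8)
eval_dataloader = DataLoader(small_eval_dataset, batch_size=8)
```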
ConvNet as Feature Extractor
We can use a ConvNet as a feature extractor by freezing all of the network except the final layer. This means we set requires_grad=False to freeze the parameters, so the gradients are not computed in backward(). On CPU, this takes about half the time of fine-tuning the entire network.
Freezing the parameters allows us to reuse the knowledge the ConvNet has learned from another problem, which is a key concept in transfer learning.
To freeze the parameters, we need to set requires_grad=False for the entire network except the final layer. This will significantly reduce the training time and computational resources required for the new task.
Here's a summary of the steps to use a ConvNet as a feature extractor:
- Freeze all the network except the final layer
- Set requires_grad=False to freeze the parameters
- Reuse the knowledge the ConvNet has learned from another problem
By using a ConvNet as a feature extractor, we can leverage the knowledge it has learned from another problem and apply it to our new task, significantly reducing the training time and computational resources required.
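A minimal sketch of this setup, assuming a ResNet-18 backbone and a two-class problem (both the backbone and the class count are assumptions):

```python
import torch
import torchvision
from torch import nn, optim

# Load a pretrained backbone and freeze every parameter.
model_conv = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)
for param in model_conv.parameters():
    param.requires_grad = False   # gradients will not be computed in backward()

# Parameters of newly constructed modules have requires_grad=True by default,
# so only this replacement final layer is trained.
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 2)   # assumed two output classes

# Only the final layer's parameters are given to the optimizer.
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)
```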
ResNet 50 Implementation
Implementing a ResNet 50 model involves several key steps.
First, you need to choose a pre-trained model, which in this case is ResNet-50. This model is used for image classification on the QMNIST digit dataset (a reconstruction and extension of MNIST).
The pre-trained ResNet-50 model is then combined with data augmentation techniques such as RandomResizedCrop(224), which crops a random region of the image and resizes it to 224x224.
RandomHorizontalFlip() is also used, flipping the image horizontally with a probability of 0.5. Another augmentation method is RandomRotation(10), which rotates the image by a random angle of up to 10 degrees.
The image is then converted to a PyTorch tensor using ToTensor(). Additionally, Grayscale(num_output_channels=3) replicates the single grayscale channel across three channels so the input matches the three-channel format that ResNet-50 expects.
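A sketch of that augmentation pipeline, with assumed Normalize statistics, might look like this:

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),            # random crop resized to 224x224
    transforms.RandomHorizontalFlip(),            # horizontal flip with probability 0.5
    transforms.RandomRotation(10),                # rotate by up to 10 degrees
    transforms.Grayscale(num_output_channels=3),  # replicate 1 channel to the 3 ResNet-50 expects
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),  # assumed statistics
])
```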
The ResNet-50 model is then trained and evaluated on the QMNIST dataset. The images are first preprocessed by converting them to tensors and normalizing them.
The trained model is then run on the QMNIST data to check whether it works properly. The model's accuracy is checked by comparing its predictions with the ground-truth labels.
Here are the key steps in deploying the ResNet 50 model:
- Preprocess the QMNIST images by converting them to tensors and normalizing them.
- Take samples from the QMNIST test dataset via a DataLoader to evaluate model performance.
- Run the trained model on the QMNIST data to check whether it works properly.
- Check the model's accuracy by comparing its predictions with the ground-truth labels (a sketch follows this list).
- Report the model's accuracy on the QMNIST test set.
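A minimal sketch of the accuracy check, assuming a trained model, a test_loader built from the QMNIST test split, and a device are already defined:

```python
import torch

model.eval()
correct, total = 0, 0
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        predictions = outputs.argmax(dim=1)            # predicted class per image
        correct += (predictions == labels).sum().item()
        total += labels.size(0)

print(f"Accuracy on the QMNIST test set: {100 * correct / total:.2f}%")
```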
Executing the Training Script
Executing the training script is a crucial step in the PyTorch transfer learning process. To see the effect of transfer learning, run the script twice: once without pretrained weights and again with them.
You'll need to set the learning rate accordingly, as using 0.001 without pretrained weights might be too slow to train or may not train at all. The training script will output the results, including the validation accuracy and loss.
As you monitor the training, you might see fluctuations in the validation metrics. In the last epoch without pretrained weights, the validation accuracy was 61.818% and the validation loss was 1.153.
On the other hand, when using pretrained weights, the final epoch's validation results are much better. You can expect a validation accuracy of more than 98% and a validation loss of 0.098.
Training times will vary depending on your system's specifications. It's also worth noting that using pretrained weights can significantly improve the results, as seen above, where the validation accuracy increased to more than 98%.
Executing the Inference Script
To execute the inference script, you'll need to run the inference.py script. This script contains the code to run inference using the trained model. All the test images are located in the input/test_images directory.
The inference code loads the trained weights from the model checkpoint saved from training and fine-tuning the pretrained EfficientNetB0 model. The next code block iterates over all the test images and runs the inference on each one of them.
You'll see the output on the terminal screen, and you can also inspect the results saved to disk. The model was able to correctly predict the King, Queen, and Knight, but not the other three classes.
Execute Inference.py Script
To execute the inference.py script, all you need to do is run it. The script reads its test images from the input/test_images directory.
The script will run inference on all the test images and save the results to disk. You'll also see the output on the terminal screen.
The model is trained on a dataset of around 500 images, which is a relatively small amount of data. Despite this, the model is able to correctly predict the class of some images, such as the King, Queen, and Knight.
However, the model's performance is not perfect, and it makes mistakes on other images. This might seem like a bad performance, but it's actually not surprising given the limited amount of training data.
To improve the model's performance, you could try applying more data augmentation techniques or training the model for a few more epochs. You could also try using a larger EfficientNet model, such as EfficientNetB1.
Visualizing the Predictions
You can use the trained model to make predictions on custom images and visualize the predicted class labels along with the images.
The inference script can display predictions for a few images, making it easier to understand the model's output.
To visualize the predictions, you can use the generic function to display the predicted class labels and images side by side.
This function is particularly useful when working with a small dataset, allowing you to quickly assess the model's performance.
The predicted class labels will be displayed along with the images, giving you a clear understanding of the model's output.
By visualizing the predictions, you can identify any patterns or biases in the model's output and make adjustments as needed.
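A rough sketch of such a helper is shown below; the model, transform, class_names, and device objects are assumed to come from the earlier training code, and the image paths are hypothetical:

```python
import glob
import matplotlib.pyplot as plt
import torch
from PIL import Image

def visualize_predictions(image_paths, model, transform, class_names, device):
    """Show each image alongside the class label the model predicts for it."""
    model.eval()
    for path in image_paths:
        image = Image.open(path).convert("RGB")
        batch = transform(image).unsqueeze(0).to(device)   # add a batch dimension
        with torch.no_grad():
            pred = model(batch).argmax(dim=1).item()
        plt.figure()
        plt.imshow(image)
        plt.title(f"Predicted: {class_names[pred]}")
        plt.axis("off")
        plt.show()

# Hypothetical usage with the test images directory mentioned above.
visualize_predictions(glob.glob("input/test_images/*.jpg"), model, transform,
                      class_names, device)
```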
Evaluate
Evaluating your model's performance is a crucial step in the transfer learning process. You'll need to pass a function to the Trainer to compute and report metrics.
The 🤗 Evaluate library provides a simple accuracy function you can load with the evaluate.load function. This function will help you calculate the accuracy of your predictions.
Before passing your predictions to compute, you need to convert the logits to predictions. This is because all 🤗 Transformers models return logits.
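A typical compute_metrics function along these lines, following the standard Trainer workflow, looks like this:

```python
import numpy as np
import evaluate

metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)   # convert logits to predicted class ids
    return metric.compute(predictions=predictions, references=labels)
```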
To monitor your evaluation metrics during fine-tuning, specify the eval_strategy parameter in your training arguments. This will report the evaluation metric at the end of each epoch.
Evaluating your model can take significantly less time on GPU compared to CPU. On GPU, it takes less than a minute, whereas on CPU it takes around 15-25 minutes.
Important Concepts
In the world of PyTorch transfer learning, there are several important concepts to grasp. Pre-trained models are a great starting point, as they've already been trained on large datasets like ImageNet for vision tasks.
These pre-trained models can be used as a foundation for your own projects, saving you time and computational resources. Fine-tuning is a technique that allows you to re-train the pre-trained model with a new dataset, using a small learning rate to adapt it to your specific task.
Transfer learning can be approached in two main ways: fine-tuning and feature extraction. Fine-tuning involves re-training the entire model, while feature extraction uses the pre-trained model as a fixed feature extractor, only replacing the final classification layer during training.
Normalization is also a crucial step in the process, as it helps speed up model training by normalizing the input data: subtracting the mean and dividing by the standard deviation. This can be achieved using a transform like Normalize(mean=[0.5], std=[0.5]).
What Is Transfer Learning?
Learning is a fundamental concept in deep learning, and it's essential to understand how it works. Transfer learning is a technique that allows us to take the patterns learned by another model and apply them to our own problem.
A pre-trained model on a large dataset can be reused as a starting point for a new task, significantly reducing training time and improving performance. This approach is particularly useful when dealing with limited datasets.
Computer vision models can learn patterns on millions of images in datasets like ImageNet, and then use those patterns to infer on another problem. Language models can learn the structure of language by reading large amounts of text, like all of Wikipedia.
The premise of transfer learning is to find a well-performing existing model and apply it to our own problem, making it possible to leverage already trained models and adjust them to match new tasks.
Important Concepts
Pre-trained models are a crucial part of transfer learning. They're deep learning models that have been pre-trained on large datasets like ImageNet for vision tasks.
You can use these pre-trained models as a starting point for your own projects, saving you time and effort. Fine-tuning is another technique where you re-train the pre-trained model with a new dataset, but with a small learning rate.
Feature extraction is a key concept in transfer learning. You can use a pre-trained model as a fixed feature extractor, replacing only the final classification layer during training.
Normalizing input data is essential for efficient model training. By subtracting the mean and dividing by the standard deviation, you can speed up the training process.
Transforms are a vital data preprocessing stage in computer vision tasks. They help convert input data to a suitable form and scale, making it easier for models to process.
Frequently Asked Questions
Is transfer learning the same as fine-tuning?
Transfer learning and fine-tuning are related but distinct concepts in machine learning, with transfer learning capturing general patterns and fine-tuning adapting a model to a specific task. Fine-tuning builds upon transfer learning by further training the model on task-specific data.
What are the disadvantages of transfer learning?
Transfer learning can be limited by domain mismatches and overfitting, making it less effective for tasks with significantly different data distributions or requirements. Understanding these potential drawbacks can help you decide if transfer learning is the right approach for your project.