A Comprehensive Guide to Hugging Face Transformers Format


The Hugging Face Transformers Format is a widely-used standard for pre-trained language models. It's designed to make it easy to load, fine-tune, and use these models in various NLP tasks.

One of the key benefits of the Hugging Face Transformers Format is its simplicity. Each model stores its metadata in a plain JSON configuration file (config.json), making it easy to read and understand.

The format is also highly flexible, allowing users to easily switch between different models and tasks. This flexibility is particularly useful in NLP applications where models often need to be fine-tuned for specific use cases.

Hugging Face Transformers Format supports a wide range of models, including popular ones like BERT and RoBERTa. These models have been pre-trained on large datasets and can be fine-tuned for specific tasks like sentiment analysis and named entity recognition.

Model Management

Model management is a crucial aspect of working with Hugging Face Transformers. You can save the best-performing model checkpoint to Artifacts by setting load_best_model_at_end=True in the TrainingArguments.
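For example, a minimal sketch of the relevant settings (the output directory and strategy values here are illustrative):

```python
import os
from transformers import TrainingArguments

# Log checkpoints to W&B Artifacts as they are saved;
# set this to "end" to log only the final model.
os.environ["WANDB_LOG_MODEL"] = "checkpoint"

args = TrainingArguments(
    output_dir="./results",          # illustrative path
    evaluation_strategy="steps",     # evaluate periodically during training
    save_strategy="steps",           # checkpointing must match the eval strategy
    load_best_model_at_end=True,     # reload the best checkpoint when training ends
    report_to="wandb",               # send logs to Weights & Biases
)
```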


To centralize all your best model versions, save your model checkpoints to Artifacts. This makes it easy to organize them by ML task, stage them for production, or bookmark them for further evaluation.

If you saved your model to W&B Artifacts with WANDB_LOG_MODEL, you can download your model weights for additional training or to run inference. Just load them back into the same Hugging Face architecture that you used before.

Model Loading and Saving

You can load a saved model by downloading its weights from W&B Artifacts; the weights are saved there when you set WANDB_LOG_MODEL to 'checkpoint' or 'end' during training.

To load a saved model, you need to use the same Hugging Face architecture that you used before saving it.

You can save a model using TorchScript, which allows you to export a model like BertModel to disk under a specified filename. For example, you can save a BertModel to a file called traced_bert.pt.
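A sketch along the lines of the Hugging Face TorchScript docs (the model name and example inputs are illustrative):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# torchscript=True prepares the model for tracing.
model = BertModel.from_pretrained("bert-base-uncased", torchscript=True)
model.eval()

# Trace the model with example inputs, then save the TorchScript program.
inputs = tokenizer("Hello, world!", return_tensors="pt")
traced_model = torch.jit.trace(model, [inputs["input_ids"], inputs["attention_mask"]])
torch.jit.save(traced_model, "traced_bert.pt")
```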

Here's how models are named when saved to W&B Artifacts: by default, your model is saved as model-{run_id} when WANDB_LOG_MODEL is set to 'end', or as checkpoint-{run_id} when it is set to 'checkpoint'.

Loading a Saved Model


Loading a saved model is a breeze with Weights & Biases. You can download your model weights for additional training or to run inference, just like you would load any other model.

To load a saved model, you'll need to have saved it to W&B Artifacts using WANDB_LOG_MODEL; this lets you store and manage your models in one place and later load the weights back into the same Hugging Face architecture you used before.

Here's a quick rundown of how to load a saved model:
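The sketch below assumes the model was logged as an Artifact named model-<run_id>; the project and artifact names are placeholders:

```python
import wandb
from transformers import AutoModelForSequenceClassification

run = wandb.init(project="my-project")  # placeholder project name

# Download the artifact that WANDB_LOG_MODEL created during training.
artifact = run.use_artifact("model-abc123:latest", type="model")
model_dir = artifact.download()

# Load the weights back into the same architecture used for training.
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
```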

Loading a saved model is an essential part of the machine learning workflow, and with Weights & Biases, it's easier than ever.

Loading Pre-Trained BERT

To load a pre-trained BERT model, you can use the Hugging Face Transformers library, which provides a PyTorch interface for BERT.


This library also includes interfaces for other pre-trained language models like OpenAI's GPT and GPT-2.

First, you'll need to install the Hugging Face Transformers library.

Then, you can import PyTorch and the pre-trained BERT model, along with a BERT tokenizer.

The transformers library provides several classes for applying BERT to different tasks, including token classification and text classification.

For simple embedding extraction, the BertModel is a good choice, and it comes in two sizes: 'base' and 'large'.
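Putting those steps together, a minimal sketch for loading the base model and its tokenizer:

```python
from transformers import BertModel, BertTokenizer

# 'bert-base-uncased' is the 'base' size; 'bert-large-uncased' is 'large'.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()  # inference mode for embedding extraction
```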

Visualize Training Outputs

Visualizing your model's outputs during training is often essential to really understand how it is learning. The callbacks system in the Transformers Trainer lets you log additional data, such as text generation outputs or predictions, to W&B Tables.

By logging your model's outputs to W&B Tables, you can track and visualize its performance over time, making it easier to identify areas for improvement.

Training and Logging


Logging to Weights & Biases is handled automatically by the WandbCallback in the Transformers library, which logs metrics and evaluation outputs to W&B. If you need to customize your Hugging Face logging, you can subclass WandbCallback and add additional functionality, such as logging text generation outputs or predictions to W&B Tables.

One subtlety: a custom callback must be added after the Trainer is instantiated, via trainer.add_callback(), rather than passed in during initialization, because the Trainer instance itself is passed to the callback when the callback is constructed.
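A minimal sketch of such a callback; the sampled dataset, table columns, and logged values are all illustrative:

```python
import wandb
from transformers.integrations import WandbCallback

class LogPredictionsCallback(WandbCallback):
    def __init__(self, trainer, sample_dataset):
        super().__init__()
        self.trainer = trainer              # needs the already-built Trainer
        self.sample_dataset = sample_dataset

    def on_evaluate(self, args, state, control, **kwargs):
        super().on_evaluate(args, state, control, **kwargs)
        # Run the model on a fixed sample and log predictions to a W&B Table.
        output = self.trainer.predict(self.sample_dataset)
        table = wandb.Table(
            columns=["predicted_class"],
            data=[[int(p.argmax())] for p in output.predictions],
        )
        wandb.log({"sample_predictions": table})

# Added only after the Trainer exists, because the callback receives it:
# trainer.add_callback(LogPredictionsCallback(trainer, small_eval_split))
```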

Pipelines and GPUs

Pipelines are the backbone of Hugging Face Transformers, making it easy to use models for inference.


They provide an abstraction for the complex code behind the library, offering a simple API for tasks like Masked Language Modeling and Sentiment Analysis.

Each pipeline bundles a tokenizer, which maps raw text input to tokens, with a model that makes predictions from those inputs. This built-in pre-processing ensures the input arrives in exactly the form the model expects.

With Hugging Face Pipelines, you can perform various NLP tasks, including text classification, named entity recognition, question answering, and text generation.

To utilize a GPU for acceleration, pass a device to the pipeline: `device=0` selects the first CUDA device, while the default of `device=-1` stays on CPU. GPU acceleration can significantly reduce inference latency, especially for batched inputs, but keep in mind that it requires a compatible GPU and sufficient memory.
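For example (the input text is illustrative, and the task-default model is downloaded on first use):

```python
from transformers import pipeline

# device=0 runs the model on the first CUDA GPU.
classifier = pipeline("sentiment-analysis", device=0)
print(classifier("Hugging Face pipelines make inference easy!"))
```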

Transformers Library

The Transformers library is a game-changer for natural language processing tasks. It's a library in Hugging Face that provides APIs and tools to download and train state-of-the-art pre-trained models.

You can save a lot of time and resources by using a pre-trained model: a network that was previously trained on a large dataset and saved, so you don't have to train from scratch. The Transformers library lets you easily download and fine-tune these models, making it a super useful tool for NLP tasks.

With Transformers, you can perform tasks like sentiment analysis, which is a crucial aspect of understanding how people feel about a particular topic or product.

NLP Tasks and Applications

Hugging Face is a platform that provides pre-trained language models for NLP tasks such as text classification, sentiment analysis, and more.

The NLP tasks we'll cover are text classification, named entity recognition, question answering, and text generation. These tasks can be performed using Hugging Face Pipelines.

Here are some examples of NLP tasks and their applications:

  • Text classification: This task can be used to classify text into categories such as spam or not spam, or positive or negative sentiment.
  • Named entity recognition: This task can be used to identify and extract specific entities such as names, locations, and organizations from text.
  • Question answering: This task can be used to answer questions based on a given text or set of texts.
  • Text generation: This task can be used to generate new text based on a given input or set of inputs.
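Each of these tasks maps directly onto a pipeline identifier. A brief sketch (the inputs are illustrative, and each pipeline downloads a task-default model on first use):

```python
from transformers import pipeline

ner = pipeline("ner")
print(ner("Hugging Face is based in New York City."))

qa = pipeline("question-answering")
print(qa(question="Where is Hugging Face based?",
         context="Hugging Face is based in New York City."))
```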

Transformer Applications

The Transformers library is a powerful tool for handling various NLP tasks, with convenient functions for tackling them efficiently.


The easiest way to tackle NLP tasks is to use the pipeline function, which connects a model with its necessary pre-processing and post-processing steps. This allows you to directly input any text and get an answer.

To use the Transformers library, you first need to install it with `pip install transformers`; the Installation and Setup section below covers this in more detail.

Text Generation

Text Generation is a powerful tool for creating meaningful sentences. It can auto-complete a prompt and generate text about a topic.

To perform text generation with Hugging Face Pipelines, you can use the text-generation pipeline. This pipeline can be instantiated with a default model, or you can select a specific model from the model Hub.

The default checkpoint for this task is GPT-2, and many other generation models are available on the model Hub.

To try out a model, create a pipeline and pass your task and model name to it.
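For example, a sketch using GPT-2 (the prompt and generation settings are illustrative):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
outputs = generator(
    "In this course, we will teach you how to",
    max_length=30,            # total length: prompt plus generated tokens
    num_return_sequences=2,   # produce two alternative completions
)
for out in outputs:
    print(out["generated_text"])
```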


Installation and Setup


Installing the Hugging Face Transformers library is a straightforward process. You can install it using pip by running the command `pip install transformers` in your terminal or command prompt.

This command will install the latest version of Transformers from PyPI onto your machine.

You'll also need to install PyTorch to interact with models at a lower level. This can be done using the command `pip install torch`.

Note that installing PyTorch can take a considerable amount of time, typically requiring the download of several hundred megabytes of dependencies.

To verify that the installations were successful, start a Python REPL and import transformers and torch. If the imports run without errors, then you've successfully installed the dependencies needed for this tutorial.
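A quick sanity check you can run in the REPL or as a script:

```python
# Both imports should succeed without errors if installation worked.
import torch
import transformers

print(transformers.__version__)
print(torch.__version__)
```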

Model Conversion and Optimization

Model conversion is a breeze with Hugging Face Transformers. You can use the transformers.onnx package as a Python module, e.g. `python -m transformers.onnx --model=bert-base-cased onnx/`, to export a checkpoint to the ONNX format.

The package supports exporting models from the Hugging Face model hub or from a local checkpoint, and it generates the ONNX graph under the specified output directory, logging its progress along the way.


The outputs available in the ONNX inference runtime are listed in each model's ONNX configuration. For BERT, for example, the outputs property returns a dictionary where each key corresponds to an expected output and each value indicates the dynamic axes of that output.
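Once exported, the model can be run with ONNX Runtime. A sketch, assuming the export command above wrote onnx/model.onnx:

```python
from onnxruntime import InferenceSession
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
session = InferenceSession("onnx/model.onnx")

# ONNX Runtime expects NumPy arrays; the output name comes from the
# model's ONNX configuration.
inputs = tokenizer("Using BERT with ONNX Runtime!", return_tensors="np")
outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
```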

You can also use the script convert_graph_to_onnx.py to export a model to ONNX. This script works for both PyTorch and TensorFlow models and ensures that the model and its weights are correctly initialized, that inputs and outputs are correctly generated, and that the generated model can be loaded through onnxruntime.

The conversion tool supports different options to tune the behavior of the generated model, including changing the target opset version, exporting pipeline-specific prediction heads, and using the external data format.

Here are the options available in the conversion tool:

  • Change the target opset version of the generated model
  • Export pipeline-specific prediction heads
  • Use the external data format (PyTorch only)

By using these options, you can optimize the generated model for better performance.

Model Format and Conversion


You can export a Hugging Face Transformers model to the ONNX format using the transformers.onnx package. This package is available as a Python module.

The package allows you to export a checkpoint using a ready-made configuration with a single command. For example, you can export a BERT model with `python -m transformers.onnx --model=bert-base-cased onnx/`.

The ONNX format is a unified and community-driven format to store and efficiently execute neural networks, and it's supported by the ONNX Runtime. Starting from transformers v2.10.0, Hugging Face partnered with ONNX Runtime to provide an easy export of transformers models to the ONNX format.

The conversion tool supports different options, such as changing the target opset version of the generated model, exporting pipeline-specific prediction heads, and using the external data format (PyTorch only). The tool also ensures that the model and its weights are correctly initialized from the Hugging Face model hub or a local checkpoint.

Here are some of the models that are supported by the transformers.onnx package:

  • ALBERT
  • BART
  • BERT
  • CamemBERT
  • DistilBERT
  • GPT Neo
  • LayoutLM
  • Longformer
  • mBART
  • OpenAI GPT-2
  • RoBERTa
  • T5
  • XLM-RoBERTa

About the Datasets


The Hugging Face Dataset library is a game-changer for working with datasets. It provides an API to quickly download many public datasets and preprocess them.

You can directly download and cache a dataset from its identifier on the Dataset Hub with the load_dataset function. This function returns a DatasetDict, which is a dictionary containing each split of your dataset.

The amazing thing about the Hugging Face Datasets library is that everything is saved to disk using Apache Arrow and memory-mapped, so even if your dataset is huge, you won't run out of RAM.

The features attribute of a Dataset gives you more information about its columns. You can remove the columns you don't need with the remove_columns method.

You should also rename the label column to labels, as models from Hugging Face Transformers expect that name.
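A short sketch tying these pieces together ("glue"/"mrpc" is an illustrative dataset identifier, and the column names below match it):

```python
from datasets import load_dataset

# Downloads and caches the dataset; returns a DatasetDict of splits.
raw = load_dataset("glue", "mrpc")
print(raw["train"].features)  # inspect column names and types

# Drop a column the model doesn't need and rename "label" to "labels",
# the name Transformers models expect.
cleaned = raw.remove_columns(["idx"]).rename_column("label", "labels")
```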

Graph Conversion

Graph conversion is a crucial step in making your transformer models compatible with various platforms and hardware.


Exporting a model is done through the script convert_graph_to_onnx.py at the root of the transformers sources, e.g. `python convert_graph_to_onnx.py --framework pt --model bert-base-cased bert-base-cased.onnx`. The conversion tool works for both PyTorch and TensorFlow models and ensures that the model and its weights are correctly initialized from the Hugging Face model hub or a local checkpoint.

The conversion tool supports different options which let you tune the behavior of the generated model. You can change the target opset version of the generated model, export pipeline-specific prediction heads, or use the external data format (PyTorch only).

Currently, inputs and outputs are always exported with dynamic sequence axes, preventing some optimizations on the ONNX Runtime. If you'd like to see such support for fixed-length inputs/outputs, please open up an issue on transformers.

The models that are close to fully feature complete and can be converted this way are the same as those supported by the transformers.onnx package, listed above under Model Format and Conversion.

Input Formatting

Input formatting is a crucial step in preparing your data for BERT. BERT marks the start of every input with a special [CLS] (classification) token and the end of each sentence with a special [SEP] (separator) token, which must be added to each input sequence.


The [SEP] token at the end of each sentence also serves to separate the two inputs when BERT is given a pair of sentences and asked to reason about them together.

To process a batch of input data efficiently, it's necessary to pad or truncate all of the sentences to a single, constant length. This can be done by adding special "padding" (PAD) tokens to the end of shorter sentences or by truncating longer sentences to the desired length.

The encode function provided by the Hugging Face Transformers library will handle the parsing and data preparation, but before encoding your text you need to decide on a maximum sentence length for padding and truncation.

To do this, you can use the tokenizer.encode_plus function, which combines multiple steps for you:

  1. Split the sentence into tokens.
  2. Add the special [CLS] and [SEP] tokens.
  3. Map the tokens to their IDs.
  4. Pad or truncate all sentences to the same length.
  5. Create attention masks that differentiate real tokens from [PAD] tokens.

To process two sentences, BERT also needs segment IDs: assign each token of the first sentence, plus its ‘[SEP]’ token, a 0, and every token of the second sentence a 1.
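A sketch of encode_plus on a sentence pair (the sentences and maximum length are illustrative):

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

encoded = tokenizer.encode_plus(
    "The cat sat on the mat.",      # first sentence
    "It was very comfortable.",     # optional second sentence
    add_special_tokens=True,        # add [CLS] and [SEP]
    max_length=32,                  # pad/truncate to a fixed length
    padding="max_length",
    truncation=True,
    return_attention_mask=True,     # 1 for real tokens, 0 for [PAD]
    return_tensors="pt",
)
# token_type_ids holds the 0/1 segment IDs described above.
print(encoded["input_ids"], encoded["token_type_ids"], encoded["attention_mask"])
```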

Embeddings and the Hugging Face Ecosystem


Hugging Face's transformers library can also be used to generate embeddings, typically by running text through a pre-trained model such as BERT and extracting its hidden states. Pre-trained models, datasets, and tokenizers are all accessible through the Hugging Face API, which democratizes NLP by putting state-of-the-art tools within easy reach of developers.
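A minimal sketch of embedding extraction with pre-trained BERT (the model name and input text are illustrative):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("Hello, world!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One embedding vector per input token: shape (1, sequence_length, 768).
embeddings = outputs.last_hidden_state
```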

Hugging Face pipelines provide an easy-to-use API that connects a model with its necessary pre-processing and post-processing steps, letting you carry out various NLP tasks through simple pipeline objects. The transformers library, which provides the APIs and tools behind these pipelines, is a key component of the Hugging Face ecosystem.

Customization and Support

You can add a custom configuration for an unsupported architecture by creating a custom ONNX configuration object that details the model inputs and outputs.

This configuration object should have two properties: inputs and outputs. The inputs property returns a dictionary where each key corresponds to an expected input and each value indicates the dynamic axes of that input.


For BERT, there are three necessary inputs (input_ids, attention_mask, and token_type_ids), each with the same shape made up of two dimensions: the batch is the first dimension and the sequence is the second.

The configuration object must use OrderedDict for both inputs and outputs properties, as inputs are matched against their relative position within the PreTrainedModel.forward() prototype and outputs are matched against their position in the returned BaseModelOutputX instance.
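A sketch of what such a configuration can look like for a BERT-style model, following the pattern in the transformers.onnx documentation (the class name is illustrative):

```python
from collections import OrderedDict
from typing import Mapping

from transformers.onnx import OnnxConfig

class CustomOnnxConfig(OnnxConfig):
    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        # OrderedDict is required: inputs are matched by position against
        # PreTrainedModel.forward(). Axis 0 is the batch, axis 1 the sequence.
        return OrderedDict(
            [
                ("input_ids", {0: "batch", 1: "sequence"}),
                ("attention_mask", {0: "batch", 1: "sequence"}),
                ("token_type_ids", {0: "batch", 1: "sequence"}),
            ]
        )
```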

To add this configuration object to the model class, you need to add it to the initialization of the model class and to the general transformers initialization. An example of such an addition is visible in the MBart model.

Frequently Asked Questions

What is the Hugging Face transformer model?

The Hugging Face transformer model is a pre-trained deep learning framework that enables users to tap into state-of-the-art language understanding and processing capabilities. This powerful tool allows for fine-tuning and customization to suit various NLP tasks and applications.
