Hugging Face models are a powerful tool for NLP tasks, and using them in Python is straightforward: you install the library with pip and import the models you need directly in your code.
To get started, you'll need to install the Transformers library, which is the foundation of the Hugging Face ecosystem. It provides a simple interface to the models, making it easy to integrate them into your Python code.
The Transformers library runs on top of deep learning frameworks such as PyTorch (with TensorFlow and JAX also supported), which means you can combine Hugging Face models with other models and tools from those ecosystems.
Some popular Hugging Face models include BERT, RoBERTa, and DistilBERT, which have achieved state-of-the-art results on various NLP tasks. These models are pre-trained on large datasets and can be fine-tuned for specific tasks.
You can use Hugging Face models for a variety of NLP tasks, such as text classification, sentiment analysis, and language translation.
Getting Started
Getting started with Hugging Face's pretrained models is a breeze. You can tap into the knowledge stored in models like BERT, GPT, and T5 to perform complex tasks with minimal configuration.
Hugging Face's transformers library contains models in many categories, including text classification, token classification, translation, and summarization, which makes it a popular choice among NLP practitioners.
To get started, explore the many models available, which cover different languages and tasks. Here are some of the task types you can work with (a quick pipeline sketch follows the list):
- Text classification
- Named entity recognition (NER)
- Question answering
- Text generation
- Machine translation
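As a quick illustration of the first task on that list, here's a minimal sketch using the pipeline API; the model is left to the library's default, which is downloaded on first use:

```python
from transformers import pipeline

# Build a sentiment-analysis (text classification) pipeline; the first call
# downloads a default pre-trained model and tokenizer from the Hugging Face Hub.
classifier = pipeline("sentiment-analysis")

result = classifier("Hugging Face makes NLP remarkably approachable.")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99}]
```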
Installing
To get started with Hugging Face models, you'll need to install the transformers library. Run the following command to install it: `pip install transformers`.
You might also want to install PyTorch or TensorFlow, depending on the backend you wish to use. This will give you more flexibility and options for your project.
If you plan to use MLflow for batch inference (for example on Databricks), you'll need MLflow version 2.3 or higher.
Here are the minimum requirements for using Hugging Face transformers with MLflow:
- MLflow 2.3
- A cluster with the Hugging Face transformers library installed
Keep in mind that many popular NLP models work best on GPU hardware, so you may want to consider using recent GPU hardware for the best performance.
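As a small sketch of that advice, you can ask a pipeline to run on a GPU by passing a device index; here device=0 targets the first GPU and -1 falls back to CPU:

```python
import torch
from transformers import pipeline

# Use the first GPU when one is available (device=0), otherwise run on CPU (device=-1).
device = 0 if torch.cuda.is_available() else -1
summarizer = pipeline("summarization", device=device)
```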
What Is Hugging Face?
Hugging Face is a company and open-source community that has simplified working with advanced NLP models. They provide tools to download and use pre-trained models like GPT, BERT, RoBERTa, and more, making it easier for developers to get started.
Their library includes models for a range of tasks, including text classification, named entity recognition, question answering, text generation, and machine translation.
Here are some examples of what you can do with the Hugging Face library:
- Text classification: identify the sentiment of a piece of text as positive, negative, or neutral
- Named entity recognition (NER): extract specific entities like names, locations, and organizations from text
- Question answering: use a model to answer a question based on a given text
- Text generation: generate new text based on a given prompt or input
- Machine translation: translate text from one language to another
The real power of Hugging Face lies in its Transformers library, which provides seamless integration with pre-trained models.
Preparing Data
Preparing data is a crucial step in using Hugging Face models in Python. To fine-tune a pretrained model, you need to download a dataset and prepare it for training.
You can start by loading the Yelp Reviews dataset, which is a great example to practice processing data for training. To process your dataset in one step, use the 🤗 Datasets map method to apply a preprocessing function over the entire dataset.
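Here's a short sketch of those two steps, following the official fine-tuning tutorial; it assumes the `datasets` library is installed alongside `transformers`:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load the Yelp Reviews dataset and a BERT tokenizer.
dataset = load_dataset("yelp_review_full")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize_function(examples):
    # Pad and truncate so variable-length reviews become fixed-length model inputs.
    return tokenizer(examples["text"], padding="max_length", truncation=True)

# Apply the preprocessing function over the entire dataset in one step.
tokenized_datasets = dataset.map(tokenize_function, batched=True)
```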
If you want to avoid slowing down training, you can load your data as a tf.data.Dataset instead. There are two convenience methods to do this: prepare_tf_dataset() and to_tf_dataset(). The prepare_tf_dataset() method is recommended in most cases, as it can inspect the model to automatically figure out which columns are usable as model inputs and discard the others.
Here are the two convenience methods for loading data as a tf.data.Dataset:
- prepare_tf_dataset(): called on the model, so it can inspect the model to decide which dataset columns are usable as inputs
- to_tf_dataset(): a lower-level dataset method where you specify exactly which columns and label columns to include
Pretrained Models
Pretrained models are a game-changer in NLP, and they're incredibly easy to use. Models like BERT, GPT, and T5 have been trained on massive datasets and can be fine-tuned for specific tasks with minimal configuration.
To get started with pretrained models, you'll need to load them into your code. This can be done with just a few lines of code, using the AutoModelForSequenceClassification class, for example.
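For example, here's a minimal sketch that loads a BERT checkpoint with a classification head sized for the five Yelp star ratings used elsewhere in this guide:

```python
from transformers import AutoModelForSequenceClassification

# Load pre-trained BERT weights and attach a new, randomly initialized
# classification head with five output labels.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)
```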
Prepare a Dataset
To prepare a dataset for fine-tuning a pre-trained model, you need to download a dataset and process it for training. This involves loading the dataset, applying a tokenizer to process the text, and including a padding and truncation strategy to handle variable sequence lengths.
You can use the 🤗 Datasets map method to apply a preprocessing function over the entire dataset in one step. This method is particularly useful for handling large datasets.
To speed up the process, you can create a smaller subset of the full dataset to fine-tune on. This can significantly reduce the time it takes to prepare the dataset.
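A sketch of that shortcut, assuming the tokenized Yelp dataset from the earlier step:

```python
# Shuffle, then keep only 1,000 examples each for training and evaluation so
# fine-tuning experiments run quickly.
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(1000))
```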
If you're training with TensorFlow, you can again load your processed data as a tf.data.Dataset using the prepare_tf_dataset() and to_tf_dataset() convenience methods described earlier.
Before using prepare_tf_dataset(), you need to add the tokenizer outputs to your dataset as columns. This will enable the method to correctly pad batches as they're loaded. If all samples in your dataset are the same length and no padding is necessary, you can skip passing the tokenizer argument.
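Here's a sketch of prepare_tf_dataset() in use, assuming a TensorFlow backend and the tokenized dataset and tokenizer from the earlier steps:

```python
from transformers import TFAutoModelForSequenceClassification

# prepare_tf_dataset() inspects the model to keep only the usable columns and
# returns a batched tf.data.Dataset that can be passed straight to model.fit().
tf_model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)

tf_train_dataset = tf_model.prepare_tf_dataset(
    tokenized_datasets["train"],
    batch_size=16,
    shuffle=True,
    tokenizer=tokenizer,  # lets the method pad batches correctly as they're loaded
)
```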
Training Models
Training models with Hugging Face is a breeze. You can use the Trainer class, which is optimized for training Hugging Face models, making it easier to start training without manually writing your own training loop.
The Trainer API supports a wide range of training options and features such as logging, gradient accumulation, and mixed precision.
To train a model, you'll need to load it and specify the number of expected labels. For example, if you're using the Yelp Review dataset, you know there are five labels.
If you're using PyTorch Trainer, you'll see a warning about some pretrained weights not being used and some weights being randomly initialized. Don't worry, this is completely normal!
Here are the steps to prepare your dataset for training in native PyTorch (a short sketch follows the list):
- Remove the text column because the model does not accept raw text as an input.
- Rename the label column to labels because the model expects the argument to be named labels.
- Set the format of the dataset to return PyTorch tensors instead of lists.
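A sketch of those three steps, applied to the tokenized Yelp dataset from earlier:

```python
# The model can't take raw text, so drop the text column.
tokenized_datasets = tokenized_datasets.remove_columns(["text"])
# The model expects the target column to be named "labels".
tokenized_datasets = tokenized_datasets.rename_column("label", "labels")
# Return PyTorch tensors instead of Python lists.
tokenized_datasets.set_format("torch")
```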
By following these steps, your data will be ready, and kicking off fine-tuning then takes a single line of code: a call to the Trainer's train() method.
Hyperparameters and Optimization
To optimize your Hugging Face model, you'll need to tweak its hyperparameters. This involves creating a TrainingArguments class that contains all the hyperparameters you can tune, as well as flags for activating different training options.
You can start with the default training hyperparameters, but feel free to experiment with these to find your optimal settings. For instance, you can specify where to save the checkpoints from your training.
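Here's a sketch that pulls the pieces together with default hyperparameters; the model and dataset names refer to the earlier sketches, and output_dir is just an example path:

```python
from transformers import Trainer, TrainingArguments

# Default hyperparameters; output_dir is where training checkpoints are saved.
training_args = TrainingArguments(output_dir="test_trainer")

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
)

trainer.train()  # fine-tuning itself is a single call
```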
Tuning performance involves using each GPU effectively, which you can adjust by changing the batch size sent to the GPU by the Transformers pipeline. This will help you make the most of your cluster.
Hyperparameters
You can create a TrainingArguments class to specify hyperparameters and training options. This class contains all the hyperparameters you can tune, as well as flags for activating different training options.
To start, you can use the default training hyperparameters, but feel free to experiment with these to find your optimal settings. Experimenting with hyperparameters can help you improve the performance of your model.
You should specify where to save the checkpoints from your training, allowing you to track your model's progress. This can be a crucial step in fine-tuning your model.
Tuning performance also involves using each GPU effectively, which you can adjust by changing the size of batches sent to the GPU by the Transformers pipeline.
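For example, the pipeline accepts a batch_size argument; here's a sketch, with the value chosen purely for illustration:

```python
from transformers import pipeline

# batch_size controls how many examples are sent to the GPU at once; the best
# value depends on the model and on available GPU memory.
classifier = pipeline("sentiment-analysis", device=0, batch_size=32)
```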
Learning Rate Scheduler
Creating a learning rate scheduler is a crucial step in fine-tuning your model.
You can create the same default learning rate scheduler that the Trainer uses by calling get_scheduler() from the transformers library.
The AdamW optimizer from PyTorch is a good choice for fine-tuning the model.
To speed up training, specify a device, such as a GPU, if you have access to one.
If you don't have a GPU, you can get free access to a cloud GPU with a hosted notebook like Colaboratory or SageMaker StudioLab.
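Here's a sketch of that setup for a native PyTorch training loop; train_dataloader is assumed to be a DataLoader you've built from the prepared dataset, and the epoch count and learning rate are illustrative defaults:

```python
import torch
from torch.optim import AdamW
from transformers import get_scheduler

# AdamW optimizer over the model parameters.
optimizer = AdamW(model.parameters(), lr=5e-5)

# Linear scheduler, the same default the Trainer would create.
num_epochs = 3
num_training_steps = num_epochs * len(train_dataloader)  # assumes an existing DataLoader
lr_scheduler = get_scheduler(
    name="linear",
    optimizer=optimizer,
    num_warmup_steps=0,
    num_training_steps=num_training_steps,
)

# Move the model to a GPU when one is available to speed up training.
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
model.to(device)
```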
Model Evaluation
Model evaluation is a crucial step in the machine learning process, and Hugging Face makes it easy with their Trainer and Evaluate libraries. You'll need to pass a function to compute and report metrics to the Trainer, which can be loaded with the evaluate.load function.
The 🤗 Evaluate library provides a simple accuracy function you can use to score your predictions. Before calling the metric's compute method, you'll need to convert the model's raw logits into class predictions (for example with an argmax over the logits).
Visualizing your model outputs during training or evaluation is also essential to understand how your model is training. You can log additional helpful data to W&B, such as your models' text generation outputs or other predictions, using the callbacks system in the Transformers Trainer.
Evaluate
Evaluating your model's performance is a crucial step in the training process. You'll need to pass a function to the Trainer to compute and report metrics.
The 🤗 Evaluate library provides a simple accuracy function you can load with the evaluate.load function. This function is easy to use and can help you get started with evaluating your model's performance.
To calculate the accuracy of your predictions, you'll need to call the compute method on the metric. Before passing your predictions to compute, you'll need to convert the logits to predictions.
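Here's a sketch of such a metrics function, following the fine-tuning tutorial:

```python
import numpy as np
import evaluate

# Load the accuracy metric from the 🤗 Evaluate library.
metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    # Convert raw logits to class predictions before computing accuracy.
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)
```

Pass this function to the Trainer through its compute_metrics argument so the metric is computed and reported whenever evaluation runs.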
You can also specify the eval_strategy parameter in your training arguments to report the evaluation metric at the end of each epoch. This will allow you to monitor your evaluation metrics during fine-tuning.
If you'd like to get a better understanding of how your model is performing, you can customize the WandbCallback to run model predictions and log evaluation samples to a W&B Table during training. This can be done using the on_evaluate method of the Trainer callback.
Returning Complex Types
Handling complex return types correctly is crucial when evaluating models that return structured output, such as named-entity recognition pipelines. These pipelines return a list of dict objects containing the entity, its span, its type, and an associated score.
You can get a sense of the return types to use through inspection of pipeline results, for example by running the pipeline on the driver. This is a great way to understand the structure of the output.
To represent this as a return type, you can use an array of struct fields, listing the dict entries as the fields of the struct. This is exactly what you do in named-entity recognition pipelines.
By using an array of struct fields, you can accurately represent the complex return type of your model. This makes it easier to evaluate and understand the output of your model.
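As a hypothetical sketch, a Spark schema for an NER pipeline's output might look like the following; the exact field names depend on your pipeline's output, so confirm them by inspecting results on the driver first:

```python
from pyspark.sql.types import (
    ArrayType,
    FloatType,
    IntegerType,
    StringType,
    StructField,
    StructType,
)

# Each input row yields a list of entity dicts, so the return type is an array
# of structs whose fields mirror the keys seen in the pipeline's output.
ner_schema = ArrayType(StructType([
    StructField("word", StringType()),
    StructField("entity_group", StringType()),
    StructField("score", FloatType()),
    StructField("start", IntegerType()),
    StructField("end", IntegerType()),
]))
```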
Model Management
You can load a saved model from W&B Artifacts with just a few lines of code, and it's a great way to reuse your previous work.
To have a saved model to load, you'll first need to have enabled the WANDB_LOG_MODEL feature, which saves your model weights to W&B Artifacts. You can then download those weights and load them back into the same Hugging Face architecture.
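Here's a sketch of that round trip, based on the W&B integration docs; the artifact path is a placeholder for your own entity, project, and artifact name:

```python
import wandb
from transformers import AutoModelForSequenceClassification

run = wandb.init(project="my-project")

# Download the model weights that were logged as a W&B Artifact (placeholder name).
artifact = run.use_artifact("my-entity/my-project/model-abc123:v0", type="model")
model_dir = artifact.download()

# Load the weights back into the same Hugging Face architecture.
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
```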
Loading a pretrained model is also a breeze, and it's a great way to get started with a new task. For example, you can load the BERT model using the AutoModelForSequenceClassification class, which is specifically designed for tasks like sentiment analysis.
Saving the Best Model
If you're like me, you've probably spent hours fine-tuning your model, tweaking parameters, and testing different approaches. But how do you save the best model for future use? The answer lies in setting `load_best_model_at_end=True` in your `TrainingArguments`.
This will save the best performing model checkpoint to Artifacts, where you can easily access and reuse it later. You can also use W&B's Model Registry to centralize your best model versions across your team and stage them for production or further evaluation.
To save your model checkpoints to Artifacts, you'll need to set the `WANDB_LOG_MODEL` environment variable to `end` or `checkpoint`. This will upload the model checkpoint every `args.save_steps` from the `TrainingArguments`.
Here's a quick rundown of the `WANDB_LOG_MODEL` options: `checkpoint` uploads a checkpoint every `args.save_steps` from the `TrainingArguments`, `end` uploads only the final model when training finishes, and leaving the variable unset disables model logging.
By using these settings, you can easily save and manage your best models, making it easier to reuse them in future projects or applications.
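Putting it together, here's a sketch of the environment variable plus the matching `TrainingArguments`; note that `load_best_model_at_end=True` requires the evaluation and save strategies to match, and that older transformers versions call the first argument `evaluation_strategy` rather than `eval_strategy`:

```python
import os
from transformers import TrainingArguments

# Log a checkpoint to W&B Artifacts every args.save_steps ("end" logs only the final model).
os.environ["WANDB_LOG_MODEL"] = "checkpoint"

training_args = TrainingArguments(
    output_dir="test_trainer",
    report_to="wandb",
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)
```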
Cache the Model
Caching the model can save you time and money. You can cache the Hugging Face model in the DBFS root volume or on a mount point to decrease ingress costs and reduce the time to load the model on a new or restarted cluster.
To do this, set the TRANSFORMERS_CACHE environment variable in your code before loading the pipeline. This is a simple step that can make a big difference in your model management workflow.
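Here's a sketch of that setup; the cache path is a placeholder for whatever DBFS location or mount point you choose:

```python
import os

# Point the Transformers cache at a DBFS path before loading any model or pipeline.
os.environ["TRANSFORMERS_CACHE"] = "/dbfs/hugging_face_transformers_cache/"

from transformers import pipeline

summarizer = pipeline("summarization")  # weights are now cached at the path above
```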
Alternatively, you can log the model to MLflow with the MLflow `transformers` flavor to achieve similar results. This approach can also help you track your model's performance and make it easier to reproduce your results.