Hugging Face Vertex AI: A Comprehensive Guide to Integration

Posted Nov 14, 2024


Hugging Face Vertex AI is a powerful integration that allows you to deploy and manage Hugging Face models on Google Cloud Vertex AI.

Hugging Face models can be easily uploaded to Vertex AI, and the platform provides a user-friendly interface for model deployment and management.

By integrating Hugging Face with Vertex AI, you can take advantage of Vertex AI's scalable and secure infrastructure to deploy your models at scale.

This integration also provides features such as automated model versioning, model serving, and model monitoring, making it easier to manage your models and get them into production.

Model Management

To upload a model from the Hugging Face Hub, first decide which model to use. In this example, facebook/bart-large-mnli, a zero-shot classification model, is a good choice.
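As a quick illustration of what this model does, here is a hedged sketch of running it locally with the transformers library (the pipeline call downloads the model weights on first use; the helper simply reads the pipeline's sorted output):

```python
def top_label(result: dict) -> str:
    """Return the highest-scoring label from a zero-shot pipeline result.

    The zero-shot pipeline returns labels sorted by descending score.
    """
    return result["labels"][0]

def classify(text: str, candidate_labels: list) -> dict:
    """Run facebook/bart-large-mnli locally (downloads the weights on first use)."""
    from transformers import pipeline  # pip install transformers

    classifier = pipeline(
        "zero-shot-classification", model="facebook/bart-large-mnli"
    )
    return classifier(text, candidate_labels)

# Usage:
#   result = classify("A thrilling movie", ["positive", "negative"])
#   best = top_label(result)
```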

To pull the model from the Hugging Face Hub, clone its repository with git clone, which requires git-lfs to be installed in advance so that the large weight files are fetched along with the rest of the repository.
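The clone step can be scripted. The sketch below shells out to git (the repository URL pattern is the Hub's standard layout), and a follow-up gsutil copy would move the files to GCS:

```python
import subprocess

def hub_url(repo_id: str) -> str:
    """Build the git clone URL for a Hugging Face Hub repository."""
    return f"https://huggingface.co/{repo_id}"

def clone_model(repo_id: str) -> None:
    """Clone a model repo; git-lfs must be installed so large files are fetched."""
    subprocess.run(["git", "lfs", "install"], check=True)
    subprocess.run(["git", "clone", hub_url(repo_id)], check=True)

# After cloning, copy the files to GCS, e.g.:
#   gsutil -m cp -r bart-large-mnli gs://your-bucket/bart-large-mnli
```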

Once the model is uploaded to Google Cloud Storage (GCS), you can then register the model in Vertex AI.

Model Registry

The Model Registry is a crucial component of Vertex AI, allowing you to manage and organize your machine learning models in a centralized location.

To register a model on Vertex AI, you can use the google-cloud-aiplatform Python SDK. First initialize the Vertex AI session, then upload the model configuration. Only the configuration is uploaded, not the model weights; the Hugging Face DLC for TEI downloads the weights from the Hugging Face Hub on startup, using the MODEL_ID environment variable.

You can specify various parameters when registering a model, including display_name, serving_container_image_uri, serving_container_environment_variables, and serving_container_ports.

Here are the parameters you can specify when registering a model on Vertex AI:

  • display_name: the name that will be shown in the Vertex AI Model Registry
  • serving_container_image_uri: the location of the Hugging Face DLC for TEI that will be used for serving the model
  • serving_container_environment_variables: the environment variables that will be used during the container runtime
  • (optional) serving_container_ports: the port where the Vertex AI endpoint will be exposed, by default 8080
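Putting these parameters together, a registration call might look like the following sketch (the container image URI and model ID in the usage comment are placeholders; check the current Hugging Face DLC listing for the exact image):

```python
def tei_model_spec(model_id: str, display_name: str,
                   image_uri: str, port: int = 8080) -> dict:
    """Build the keyword arguments for aiplatform.Model.upload().

    Only configuration is registered; the TEI container downloads the
    weights from the Hugging Face Hub at startup via MODEL_ID.
    """
    return {
        "display_name": display_name,
        "serving_container_image_uri": image_uri,
        "serving_container_environment_variables": {"MODEL_ID": model_id},
        "serving_container_ports": [port],
    }

def register_model(project: str, location: str, spec: dict):
    """Register the model in the Vertex AI Model Registry."""
    from google.cloud import aiplatform  # pip install google-cloud-aiplatform

    aiplatform.init(project=project, location=location)
    return aiplatform.Model.upload(**spec)

# Usage (hypothetical model and image URI):
# spec = tei_model_spec(
#     model_id="BAAI/bge-base-en-v1.5",
#     display_name="bge-base-en-v1.5",
#     image_uri="us-docker.pkg.dev/.../huggingface-text-embeddings-inference-cpu",
# )
# model = register_model("my-project", "us-central1", spec)
```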

By registering your models in the Model Registry, you can easily manage and deploy them on Vertex AI, streamlining your machine learning workflow and improving collaboration among team members.

Use Case Overview

Model management is a crucial part of any machine learning project, and the Hugging Face libraries are a convenient starting point, offering a wide range of pre-trained models and datasets.

In this use case, you fine-tune a BERT model on the IMDB dataset using the Hugging Face transformers library; the dataset itself is downloaded via the Hugging Face datasets library.

Fine-tuning BERT on IMDB produces a classifier that predicts whether a movie review is positive or negative, a useful tool for anyone analyzing review sentiment.
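A minimal fine-tuning sketch with transformers and datasets might look like this (the model name and hyperparameters are illustrative, and the run downloads both the model and the dataset):

```python
# IMDB sentiment labels: 0 is negative, 1 is positive.
ID2LABEL = {0: "negative", 1: "positive"}

def finetune_bert_on_imdb(output_dir: str = "bert-imdb") -> None:
    """Sketch of fine-tuning BERT on IMDB (downloads model and data)."""
    from datasets import load_dataset            # pip install datasets
    from transformers import (                   # pip install transformers
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    dataset = load_dataset("imdb")
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, padding="max_length")

    tokenized = dataset.map(tokenize, batched=True)
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2, id2label=ID2LABEL
    )
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=1),
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["test"],
    )
    trainer.train()
```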

Integration Setup

To set up the integration between Google Vertex AI and Hugging Face, you can create custom workflows using n8n's nodes. These nodes come with global operations and settings, as well as app-specific parameters that can be configured.

You can use the HTTP Request node to query data from any app or service with a REST API, making it easy to make custom API calls. This node also supports predefined or generic credential types.

Google Vertex AI is a unified machine learning platform that enables developers to build, deploy, and manage models efficiently. It provides a wide range of tools and services, such as AutoML and datasets, to accelerate the deployment of AI solutions.

To connect Google Vertex AI and Hugging Face, you need to establish a link between the two platforms to route data through the workflow. This connection will allow data to flow from the output of one node to the input of another.

You can have single or multiple connections for each node, giving you flexibility in how you set up your workflow.

Workflow Configuration

To configure a workflow for Hugging Face and Google Vertex AI, you'll need to set up data flow between the two platforms. This involves choosing the right configuration based on your specific needs.

You can configure data flow to go from Google Vertex AI to Hugging Face or vice versa. This flexibility allows you to work with your data in the way that makes the most sense for your project.

To test and activate your workflow, you'll need to save and run it to see if everything works as expected. This will give you a chance to check past executions and isolate any mistakes.

Once you've tested your workflow, be sure to save it and activate it to make it live and ready for use.

API and Methods

To get started with Hugging Face Vertex AI, you'll need to understand the API and methods involved. Hugging Face supports generic authentication.

To create a container for your custom training job, you'll first need to enable the Container Registry API; this is a prerequisite step in setting up your Vertex AI project.

Supported Methods

Hugging Face supports generic authentication. For details on this and the other supported methods, see the Hugging Face integrations page.

Enable the API

Enabling the Container Registry API is a necessary step in setting up your custom training job. Navigate to the Container Registry and select Enable if it isn't already enabled; you'll use it to host the container for your custom training job.
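If you prefer to script this step, the same API can be enabled from the gcloud CLI; a sketch via subprocess (the service name is the standard Container Registry identifier):

```python
import subprocess

def enable_api_cmd(service: str) -> list:
    """Build the gcloud command that enables a Google Cloud API."""
    return ["gcloud", "services", "enable", service]

def enable_container_registry() -> None:
    """Enable the Container Registry API for the active gcloud project."""
    subprocess.run(enable_api_cmd("containerregistry.googleapis.com"), check=True)
```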

Model Deployment and Usage

Model deployment is a crucial step in making your Hugging Face model available on Vertex AI. You can deploy your model using the aiplatform.Model object returned by the upload method, which will deploy an endpoint using FastAPI.

To deploy an endpoint, you'll need to specify a machine type, such as n1-standard-4 from the N1 series, which provides 4 vCPUs and can have a GPU accelerator attached. Deployment can take around 15-20 minutes.

After deployment, your model is registered in the Vertex AI Model Registry and ready to serve predictions from a Vertex AI Endpoint. You can also find Hugging Face models on Vertex AI by searching the partners section of Model Garden and clicking on Hugging Face.

To deploy a Hugging Face model from there, you'll need to provide a Hugging Face access token; deployment then creates a new model instance in the Vertex AI Model Registry.

The recommended deployment recipe provided by Vertex AI Model Garden includes the associated machine type, but you'll also need to specify the inference runtime used to serve predictions; this is where the Hugging Face Deep Learning Container comes in.

The key arguments to the deploy method include machine_type, accelerator_type, and accelerator_count. Note that machine_type and accelerator_type are tied together, so you'll need to select an instance that supports the accelerator you're using.
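The pairing constraint can be made explicit in code. The sketch below validates a few known machine/accelerator combinations before building the deploy arguments (the table is not exhaustive; consult the Vertex AI documentation for the full matrix):

```python
# A few valid machine/accelerator pairings; not exhaustive.
SUPPORTED_ACCELERATORS = {
    "n1-standard-4": {"NVIDIA_TESLA_T4", "NVIDIA_TESLA_V100"},
    "g2-standard-4": {"NVIDIA_L4"},
    "a2-highgpu-1g": {"NVIDIA_TESLA_A100"},
}

def deploy_args(machine_type: str, accelerator_type: str,
                accelerator_count: int = 1) -> dict:
    """Build deploy() keyword arguments, checking the machine/GPU pairing."""
    if accelerator_type not in SUPPORTED_ACCELERATORS.get(machine_type, set()):
        raise ValueError(f"{accelerator_type} is not available on {machine_type}")
    return {
        "machine_type": machine_type,
        "accelerator_type": accelerator_type,
        "accelerator_count": accelerator_count,
    }

# Usage with a registered aiplatform.Model:
# endpoint = model.deploy(**deploy_args("n1-standard-4", "NVIDIA_TESLA_T4"))
```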

Jay Matsuda

Lead Writer
