As you start your journey in MLOps, it's essential to understand the process of taking a machine learning model from development to deployment with automation. This involves several key steps that streamline the process, making it more efficient and reliable.
One crucial step is model training, where you develop and refine your machine learning model using various algorithms and techniques. Training is typically done in a local environment, such as a Jupyter Notebook, and involves experimenting with different parameters and hyperparameters to achieve optimal results.
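As a minimal sketch of that experimentation loop (using scikit-learn on a toy dataset; your model, data, and parameter grid will differ):

```python
# Minimal hyperparameter-tuning sketch with scikit-learn (illustrative only;
# the actual model, data, and grid depend on your project).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Cross-validated search over a small hyperparameter grid.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X_train, y_train)

print("Best params:", search.best_params_)
print("Held-out accuracy:", search.score(X_test, y_test))
```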
The next step is model deployment, which involves putting your trained model into production and making it accessible to users. This is where automation comes in, as you can use tools like Kubernetes and Docker to containerize and deploy your model with minimal manual intervention.
MLOps Fundamentals
MLOps is all about combining machine learning with software engineering to build production-grade ML applications.
By following software engineering best practices, you can refactor your code into clean Python scripts that are easy to test, document, log, serve, and version.
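For a small illustration of what that refactoring looks like (a hypothetical train.py; the file and function names are ours, not the course's):

```python
# train.py -- a notebook cell refactored into a documented, testable function
# (hypothetical example; names are illustrative).
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def compute_accuracy(predictions: list[int], labels: list[int]) -> float:
    """Return the fraction of predictions that match their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    accuracy = correct / len(labels)
    logger.info("accuracy=%.3f on %d examples", accuracy, len(labels))
    return accuracy


if __name__ == "__main__":
    print(compute_accuracy([1, 0, 1], [1, 1, 1]))
```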
The Made-With-ML course provides a comprehensive MLOps curriculum, including lessons and code repositories like GokuMohandas/Made-With-ML.
To design, develop, deploy, and iterate on production-grade ML applications, keep these key MLOps concepts in mind:
- Testing: Ensure your ML models are robust and accurate (see the test sketch after this list)
- Documentation: Keep track of your code and its dependencies
- Logging: Monitor your application's performance and errors
- Serving: Deploy your ML models in a production-ready environment
- Versioning: Manage different versions of your code and models
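To make the testing idea concrete, here's a minimal pytest sketch; the predict stub below is a hypothetical stand-in for a real trained model:

```python
# test_model.py -- behavioral tests with pytest. The `predict` stub stands in
# for a real model; in practice you'd import your own.
import pytest


def predict(text: str) -> str:
    """Toy stand-in for a trained sentiment model."""
    return "negative" if "terrible" in text else "positive"


@pytest.mark.parametrize(
    "text, expected",
    [
        ("this course is great", "positive"),
        ("this course is terrible", "negative"),
    ],
)
def test_predict_sentiment(text, expected):
    assert predict(text) == expected
```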
Databricks and Environment
In an MLOps course, you'll learn how to implement best practices on Databricks, which empowers data scientists and speeds up time to production.
The cluster environment determines where our workloads will be executed, including the operating system and dependencies.
To create or update a cluster environment, you'll need to follow the steps outlined in the cluster environment section of the course materials.
Implementing Best Practices on Databricks
Implementing best practices on Databricks is crucial for empowering data scientists and speeding up time to production. It's not about the tools you use, but how you use them to follow MLOps principles.
To follow these principles, you must be able to unambiguously trace a model back to its corresponding code/commit in git, its ML model artifacts, and the data used to train it. This ensures reproducibility and makes it easier to track changes.
Databricks offers features like Unity Catalog, model serving, feature serving, and Databricks Asset Bundles that can help you implement MLOps best practices. In practice, though, adopting them isn't always straightforward: documentation can be sparse, and most available training materials are notebook-first.
Developing on Databricks requires Python experience and basic knowledge of git and CI/CD. You can use Databricks Asset Bundles (DABs) to streamline your workflow.
To get started, explore the Jupyter notebook to interactively walk through the core machine learning workloads. This will give you hands-on experience with Databricks.
Cluster Environment
The cluster environment is where our workloads will be executed, and it's determined by the OS, the dependencies, and more. The course has already created this environment for us, so we could skip ahead, but let's take a look at how to create or update one ourselves. When building one from scratch, the main decisions are the operating system and the dependencies that will be used for execution.
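As a rough sketch of creating a cluster programmatically (assuming the databricks-sdk Python package is installed and authentication is configured; the names and settings are illustrative, and your workspace's exact steps are in the course materials):

```python
# Sketch: creating a cluster with the Databricks Python SDK (assumes
# `databricks-sdk` is installed and auth is configured via environment
# variables or a config profile; values are illustrative).
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

cluster = w.clusters.create(
    cluster_name="mlops-course-cluster",  # illustrative name
    spark_version=w.clusters.select_spark_version(latest=True),
    node_type_id=w.clusters.select_node_type(local_disk=True),
    num_workers=1,
    autotermination_minutes=30,  # shut down when idle to avoid costs
).result()  # block until the cluster is running

print(f"Cluster ready: {cluster.cluster_id}")
```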
Experimentation and Tracking
In an actual production setting, it's best to have a central location to store all of your experiments.
You can spin up your own MLflow server for all of your team members to track their experiments on or use a managed solution like Weights & Biases, Comet, etc.
If you're running experiments on your local laptop, you can access the MLflow dashboard at http://localhost:8080/.
If you're on Anyscale Workspaces, you'll need to expose the port of the MLflow server to generate a public URL to your MLflow server.
Storing experiments in a central location is especially helpful when working with a team, since everyone can access the same experiments and view the results.
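Here's a minimal tracking sketch with the MLflow Python API (assuming mlflow is installed and a tracking server is reachable at the URL shown; the experiment name and values are illustrative):

```python
# Log a run to a central MLflow server (local here; point the URI at your
# team's server in production).
import mlflow

mlflow.set_tracking_uri("http://localhost:8080")
mlflow.set_experiment("mlops-course")  # illustrative experiment name

with mlflow.start_run():
    mlflow.log_param("learning_rate", 1e-3)
    mlflow.log_metric("val_loss", 0.42)
```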
Model Deployment
Model deployment is a crucial step in the MLOps process, where you take your trained machine learning model and make it accessible to others.
To deploy a machine learning model, you can use frameworks like Flask, as seen in Project 15, where a model is deployed as a web application.
In this process, you'll need to consider factors like model serving, model monitoring, and model maintenance.
CI/CD
In this section, we'll explore how to automate the deployment process using GitHub Actions.
We'll start by adding necessary credentials to the /settings/secrets/actions page of our GitHub repository.
This involves adding a personal access token, which can be obtained by following these steps: New GitHub personal access token → Add a name → Toggle repo and workflow → Click Generate token (scroll down) → Copy the token and paste it when prompted for your password.
With credentials in place, we can make changes to our code (not on main branch) and push them to GitHub. The workflow will then trigger automatically, producing comments with the training and evaluation results directly on the PR.
We can then review the results and, if satisfied, merge the PR into the main branch, which will trigger the serve workflow to rollout our new service to production!
Here are the steps involved in the CI/CD process:
- Add necessary credentials to the /settings/secrets/actions page
- Make changes to code and push to GitHub
- Trigger the workflow and review results
- Merge PR into main branch to rollout new service
Deploying Single Container App on Minikube
To deploy a single-container app on Minikube, you'll need to set up Minikube, for example with VirtualBox on a Windows 10 Home system. Minikube is a tool that lets you run Kubernetes locally.
You'll learn various concepts of Kubernetes like pods, deployments, services, and ingress. These concepts will help you understand how to create and manage your containerized application.
To create a deployment, you'll run a command such as kubectl create deployment (or apply a manifest), which creates a Deployment object in your Kubernetes cluster. That Deployment object defines the desired state of your application.
You can access your deployed application using Kubernetes ingress. Ingress is a way to expose your application to the outside world by routing traffic to it.
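As an illustration of those concepts, here's a sketch using the official kubernetes Python client against a running Minikube (the image and names are illustrative; kubectl or YAML manifests work just as well):

```python
# Sketch: creating a Deployment on Minikube with the Kubernetes Python client
# (assumes `kubernetes` is installed and `minikube start` has written a
# kubeconfig; image and names are illustrative).
from kubernetes import client, config

config.load_kube_config()  # picks up the Minikube context
apps = client.AppsV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="hello-app"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "hello"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "hello"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="hello",
                        image="nginx:1.25",  # any container image works here
                        ports=[client.V1ContainerPort(container_port=80)],
                    )
                ]
            ),
        ),
    ),
)

apps.create_namespaced_deployment(namespace="default", body=deployment)
```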
Deploying Multi-Container App on Minikube and GKE
Deploying a multi-container application on Minikube and GKE is a crucial step in model deployment: it means understanding how the same application runs in both local and cloud environments. Minikube is a great tool for testing and developing applications before deploying them to a production environment like Google Kubernetes Engine (GKE), giving you the benefits of a local testing environment.
Project 7 focuses on deploying a multi-container application on Minikube and GKE. This project covers the use of Kubernetes Secrets and Persistent Volume Claims.
Kubernetes Secrets are used to store sensitive information such as passwords and keys. Persistent Volume Claims are used to persist data even after the container is deleted.
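As a sketch of both objects, here are plain dict manifests passed to the kubernetes Python client, mirroring the YAML you'd normally kubectl apply (names and values are illustrative):

```python
# Sketch: a Secret and a PersistentVolumeClaim via the Kubernetes Python
# client (same kubeconfig setup as above; values are illustrative).
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# Secrets keep credentials out of container images and manifests.
core.create_namespaced_secret(
    namespace="default",
    body={
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "db-credentials"},
        "stringData": {"username": "app", "password": "change-me"},
    },
)

# A PVC requests storage that outlives any single container.
core.create_namespaced_persistent_volume_claim(
    namespace="default",
    body={
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": "app-data"},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": "1Gi"}},
        },
    },
)
```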
Deploying ML Models with Flask
Deploying ML models with Flask is a great way to share your machine learning expertise with others.
Project 15 focuses on this very topic, showing how to deploy a machine learning model as a web application using the Flask framework.
Flask is a lightweight and flexible framework that's perfect for building web applications that run machine learning models.
It's easy to learn and use, even for those without extensive web development experience.
To deploy a machine learning model with Flask, you'll need a model that's already been trained and saved in a format your application code can load back, such as a pickled scikit-learn model.
Project 15 shows how to use a pre-trained model to make predictions and display the results in a web application.
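Here's a minimal sketch of such an app (assuming a scikit-learn model saved to model.pkl with joblib; this is illustrative, not Project 15's actual code):

```python
# app.py -- a minimal Flask prediction endpoint (illustrative; assumes a
# scikit-learn model was trained and saved to model.pkl with joblib).
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.pkl")  # load the pre-trained model once at startup


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"features": [5.1, 3.5, 1.4, 0.2]}.
    features = request.get_json()["features"]
    prediction = model.predict([features])[0]
    return jsonify({"prediction": prediction.item()})  # numpy scalar -> plain Python


if __name__ == "__main__":
    app.run(debug=True)
```

You can then exercise the endpoint with a POST request, e.g. curl -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d '{"features": [5.1, 3.5, 1.4, 0.2]}'.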
One of the benefits of deploying ML models with Flask is that it allows you to share your model with others and get feedback in real-time.
This can be especially useful for models that are still in the testing phase or for models that need to be fine-tuned based on user input.
Frequently Asked Questions
How long does it take to learn MLOps?
Learning MLOps typically takes a month or more to get started, with deeper understanding and professionalization of processes achievable through step-by-step implementation. Start by focusing on the biggest impact areas for the greatest return on investment.
What is the average salary of MLOps?
The average salary for an MLOps Engineer in India ranges from ₹21.2 lakhs to ₹63.7 lakhs, with entry-level positions starting at around ₹600,000 annually. Discover the factors influencing MLOps salaries and how to maximize your earning potential.
Sources
- https://maven.com/marvelousmlops/mlops-with-databricks
- https://github.com/GokuMohandas/mlops-course
- https://iowastateonline.iastate.edu/programs-and-courses/artificial-intelligence/mlops-machine-learning-operations/
- https://cloudxlab.com/course/116/mlops-certification-training
- https://www.appliedai-institute.de/en/free-online-courses/the-mlops-workbook