A Step-by-Step Guide to Designing Machine Learning Systems

Designing machine learning systems can be a daunting task, but breaking it down into smaller steps can make it more manageable.

First, clearly define the problem you're trying to solve with your machine learning system. This involves identifying the specific task or outcome you want to achieve, such as image classification or sentiment analysis.

Next, gather and preprocess your data. This includes collecting relevant data, handling missing values, and converting the data into a format suitable for machine learning algorithms. Data preprocessing is an essential step in the machine learning pipeline.
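
As a hedged sketch of what this can look like in practice, the snippet below uses pandas and scikit-learn to impute missing values, scale numeric features, and one-hot encode categorical ones; the file and column names are made up purely for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical file and column names, for illustration only.
df = pd.read_csv("data.csv")
numeric_cols = ["age", "num_sessions"]
categorical_cols = ["country", "device_type"]

preprocess = ColumnTransformer([
    # Numeric features: fill missing values with the median, then standardize.
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric_cols),
    # Categorical features: fill missing values, then one-hot encode.
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("encode", OneHotEncoder(handle_unknown="ignore"))]), categorical_cols),
])

X = preprocess.fit_transform(df[numeric_cols + categorical_cols])
```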

After data preprocessing, you can start selecting the appropriate machine learning algorithm for your task. The choice of algorithm depends on the type of problem you're trying to solve and the characteristics of your data.

Designing a Machine Learning System

You should focus on the Data and Modeling areas when answering an ML Design interview question, as these are the core components of what you will build.

The general thrust of ML Design interviews is to understand your thought process when faced with an almost real-world problem: how you would collect and preprocess the data, and which model you would choose.

In an ML System Design interview, you should ask the interviewer whether they would like you to explain how to productionize each component, and at a minimum cover the steps outlined in the sections below when sketching a general architecture.

Design a Learning System

Designing a Learning System is a crucial step in creating a machine learning system. It involves understanding the requirements and clarifying them to ensure you're on the same page as the interviewer.

You should restate the prompt in your own words and confirm with the interviewer that you've understood it. This ensures that you're answering the correct question.

Some questions to ask to understand the scope of the project include:

  1. How much data would we have access to?
  2. What are the hardware constraints, such as time and compute power available?
  3. Do we need a model that's quick to respond or extremely accurate?
  4. Do we need to think about retraining the model?

The architecture of the system should be designed to cover everything from data ingestion to model serving. This means creating a general architecture that includes the necessary components, such as data storage and model deployment.
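
As one hedged way to make this concrete, the skeleton below sketches the main stages as plain Python functions. The names and signatures are assumptions for illustration; a real system would back each stage with actual storage, training, and serving infrastructure.

```python
from typing import Any

def ingest(source: str) -> list[dict]:
    """Pull raw records from the data source (database, event stream, files)."""
    ...

def preprocess(records: list[dict]) -> tuple[list[list[float]], list[int]]:
    """Clean records and convert them into model-ready features and labels."""
    ...

def train(features: list[list[float]], labels: list[int]) -> Any:
    """Fit the chosen model on the prepared data."""
    ...

def evaluate(model: Any, holdout: tuple) -> dict:
    """Score the candidate model on a held-out set before promoting it."""
    ...

def serve(model: Any) -> None:
    """Deploy the model behind a batch job or an online endpoint."""
    ...
```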

In designing machine learning systems, it's essential to consider the key design decisions, including reliability, scalability, maintainability, and adaptability to changing environments and business requirements.

Review

Designing a machine learning system can be a daunting task, but with the right guidance it can be a game-changer for your company. Chip Huyen, a masterful teacher, has written a book that is widely regarded as one of the best resources for building, deploying, and scaling machine learning models at a company for maximum impact.

The book is a must-read for anyone serious about ML in production, as it provides the most relevant information for designing and implementing ML systems end to end. Laurence Moroney, AI and ML Lead at Google, agrees, noting that Chip has admirably cut through the chaff to deliver the most essential information.

Because it focuses on the first principles behind designing ML systems, the book is also a valuable guide for navigating the ephemeral landscape of tooling and platform options. If you're looking for a comprehensive resource on designing and implementing ML systems for production, this is the one to reach for.

Machine Learning Interview Template

The Machine Learning Interview Template is a must-have for any aspiring ML engineer. It's a generic template that guides you through almost any ML system design question you can get in an interview.

To answer an ML Design interview question, focus on the two main areas: Data and Modeling. The interviewer wants to see your thought process when faced with a real-world problem, and data collection/preprocessing, along with the model you choose, are the core components of what you will build.

You should spend most of your time in the interview on these areas, and that's where the interviewer will be looking to see how you perform.

Data and Labeling

Data and Labeling is a crucial step in designing machine learning systems.

We restricted our label space to four categories: natural-language-processing, computer-vision, mlops, and other.

The labeling process involves ingestion and quality assurance (QA) checks.

We initially assumed that content can only belong to one category, but reality showed us that content can belong to more than one category, known as multilabel.

Even so, we simplified our approach to multiclass, since many libraries either don't support multilabel scenarios or make them significantly more complicated to handle.

Data

Data is a crucial component of machine learning, and it's essential to understand its role in the labeling process. High-quality data is the backbone of accurate machine learning models.

Noise in the data can lead to biased models, which is why data preprocessing is a vital step. Data preprocessing involves handling missing values, outliers, and irrelevant features.

Data labeling is a manual process that requires human judgment and expertise. Labeling data involves assigning relevant labels to the data points, which helps the machine learning model understand the context.

Data quality directly drives the accuracy of the machine learning model. Poor data quality leads to poor model performance, which is why it's essential to ensure data accuracy.

Data annotation is a time-consuming process, but it's necessary for building high-quality machine learning models. Annotated data helps the model learn from the data and make accurate predictions.

The type of data used in machine learning can vary, from text and images to audio and video. Each type of data requires specific labeling techniques and tools to ensure accurate labeling.

Labeling

Labeling is a crucial step in the machine learning process. We decided to restrict the label space to four main categories: natural-language-processing, computer-vision, mlops, and other.

The labeling process involves ingestion and quality assurance (QA) checks.

Content can belong to more than one category (multilabel), but we chose to limit the task to multiclass for simplicity, because many libraries either don't support multilabel scenarios or make them more complicated to handle.

Here's a quick rundown of the label categories we're working with:

  • natural-language-processing
  • computer-vision
  • mlops
  • other (anything that doesn't fit the first three categories)
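
As a minimal sketch of the multiclass setup, assuming each item is assigned exactly one of these tags, the label mapping could look like this:

```python
# The four categories from our restricted label space.
CATEGORIES = ["natural-language-processing", "computer-vision", "mlops", "other"]
CLASS_TO_INDEX = {label: i for i, label in enumerate(CATEGORIES)}

def encode_label(tag: str) -> int:
    """Map a single tag to a class index, collapsing unknown tags into 'other'."""
    return CLASS_TO_INDEX.get(tag, CLASS_TO_INDEX["other"])

print(encode_label("computer-vision"))  # 1
print(encode_label("time-series"))      # 3 (collapsed into 'other')
```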

Modeling and Evaluation

Modeling and evaluation are crucial steps in designing machine learning systems.

To ensure our model is effective, we need to think about when and how we'll evaluate it. This involves defining metrics that will help us measure its performance.

There are core principles to follow when modeling, including end-to-end utility, manual before ML, augment vs. automate, internal vs. external, and thorough testing and evaluation.

These principles make it possible to benchmark iterations against each other and to plug-and-play components within the system.

Here are some key considerations for thorough testing and evaluation:

  • Creating a gold-standard labeled dataset that is representative of the problem space.
  • Rule-based text matching approaches to categorize content.
  • Predicting labels (probabilistic) from content title and description.

By following these principles and considerations, we can develop a robust and effective machine learning system.

Modeling

Modeling is a crucial step in any project, and there are some core principles to keep in mind to ensure success. The end result of every iteration should deliver minimum end-to-end utility so that we can benchmark iterations against each other and plug-and-play with the system. This allows for easy comparison and integration of different approaches.

Manual before ML is another important principle. It's essential to try to see how well a simple rule-based system performs before moving onto more complex ones. This helps to establish a baseline and ensures that we're not overcomplicating things.
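
As a hedged illustration of this principle, applied to the label space described earlier, a rule-based categorizer can be as simple as keyword matching; the keyword lists here are assumptions for illustration.

```python
# Hypothetical keyword lists; a real rule-based baseline would curate these.
KEYWORDS = {
    "natural-language-processing": ["nlp", "text", "transformer", "language"],
    "computer-vision": ["image", "vision", "detection", "segmentation"],
    "mlops": ["pipeline", "deployment", "monitoring", "ci/cd"],
}

def rule_based_label(title: str, description: str) -> str:
    """Assign the first category whose keywords appear in the text, else 'other'."""
    text = f"{title} {description}".lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return "other"

print(rule_based_label("Fine-tuning a transformer", "Text classification tips"))
# -> "natural-language-processing"
```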

Augment vs. automate is a key consideration. We should allow the system to supplement the decision-making process as opposed to making the actual decision. This approach enables us to leverage the strengths of both humans and machines.

Internal vs. external is also a crucial aspect. Not all early releases have to be end-user facing. We can use early versions for internal validation, feedback, data collection, etc.

Here are some key principles to keep in mind when modeling:

  • End-to-end utility
  • Manual before ML
  • Augment vs. automate
  • Internal vs. external
  • Thorough testing and evaluation

Some of the advantages of starting simple include getting internal feedback on end-to-end utility, performing A/B testing to understand UI/UX design, and deploying locally (or internally) to start generating the additional data required for more complex approaches.

Before diving into machine learning models, it's essential to establish a baseline model. A good baseline for our earlier prompt would be to recommend the most popular products to the users. This "model" will always be easy to implement and provides a baseline that all other models should outperform.
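
A minimal sketch of such a popularity baseline, assuming we have a log of (user, product) interactions:

```python
from collections import Counter

# Hypothetical interaction log: (user_id, product_id) pairs.
interactions = [("u1", "p1"), ("u2", "p1"), ("u2", "p3"), ("u3", "p2"), ("u4", "p1")]

def most_popular(interactions, k: int = 2) -> list[str]:
    """Recommend the k most frequently interacted-with products to everyone."""
    counts = Counter(product for _, product in interactions)
    return [product for product, _ in counts.most_common(k)]

print(most_popular(interactions))  # ['p1', 'p3']
```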

Here are some traditional ML models that are quick to train:

  • Logistic regression
  • Decision trees

These models are great for getting started, but it's essential to discuss their pros and cons. For example, logistic regression is a simple and interpretable model, but it may not perform well on complex datasets. Decision trees are easy to train and can handle non-linear relationships, but they can be prone to overfitting.
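
A hedged sketch of training both quick baselines with scikit-learn, using synthetic data standing in for the real dataset:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data as a stand-in for our real features and labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Both models train in seconds and give us an interpretable starting point.
for model in (LogisticRegression(max_iter=1000), DecisionTreeClassifier(max_depth=5)):
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))
```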

When it comes to model versioning, there are several tools to consider:

  • DVC
  • Amazon SageMaker
  • Google Cloud AI Platform

These tools enable us to manage different versions of our models, track changes, and collaborate with others.
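
As one hedged example using DVC's Python API, and assuming the model file is already tracked in a DVC repository with a git tag marking the version we want, loading a specific model version might look like this:

```python
import dvc.api

# Assumes "models/model.pkl" is tracked by DVC in this repo and that the
# git tag "v1.0" marks the version we want to load.
with dvc.api.open("models/model.pkl", rev="v1.0", mode="rb") as f:
    model_bytes = f.read()
```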

Evaluation

Evaluation is a crucial step in the modeling process. We need to determine which metrics to prioritize, and how to evaluate our model.

To decide which metrics to prioritize, we need to consider the specific task at hand. For example, in an email spam detector, precision is very important: it's better to let some spam through to the inbox than to mistakenly send an important email to the spam folder.

We should always give at least two metrics: one for offline evaluation and one for online evaluation. Offline metrics are used to score the model when it's being built, while online metrics are used once the model is in production.

Offline metrics include AUC, F1, R², MSE, and Intersection over Union. Online metrics, on the other hand, are use case specific and could include the click-through rate or how long users spend watching a video that was recommended.
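
A minimal sketch of computing a few of these offline metrics with scikit-learn, using placeholder labels and predictions:

```python
from sklearn.metrics import f1_score, mean_squared_error, r2_score, roc_auc_score

# Placeholder labels and predictions, for illustration only.
y_true = [0, 1, 1, 0, 1]
y_prob = [0.2, 0.8, 0.6, 0.3, 0.9]              # predicted probabilities
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # thresholded class predictions

print("AUC:", roc_auc_score(y_true, y_prob))
print("F1:", f1_score(y_true, y_pred))

# For regression tasks, R² and MSE apply instead.
y_true_reg = [2.0, 3.5, 1.0]
y_pred_reg = [2.1, 3.0, 1.2]
print("R2:", r2_score(y_true_reg, y_pred_reg))
print("MSE:", mean_squared_error(y_true_reg, y_pred_reg))
```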

Non-functional metrics, such as training speed and scalability, extensibility to new techniques, and tooling for easy training, debugging, evaluation, and deployment, are also important to consider.

To conduct offline evaluation, we need a gold standard holdout dataset that we can use to benchmark all of our models. We'll also be creating slices of data that we want to evaluate in isolation.

Here are some key considerations for offline evaluation:

  • True positives (TP): we correctly predicted the class.
  • False positives (FP): we incorrectly predicted the class but it was another class.
  • True negatives (TN): we correctly predicted that it wasn't the class.
  • False negatives (FN): we incorrectly predicted that it wasn't the class but it was.

By considering these metrics and evaluation methods, we can ensure that our model is well-evaluated and effective in production.
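
To tie these counts back to concrete metrics, the hedged sketch below derives precision and recall from predictions and then evaluates one slice in isolation; the slice definition is an assumption for illustration.

```python
from sklearn.metrics import precision_score, recall_score

# Placeholder gold labels and predictions.
y_true = [1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0]
print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall:", recall_score(y_true, y_pred))        # TP / (TP + FN)

# Evaluating a slice in isolation, e.g. only short-text examples (hypothetical flag).
is_short = [True, False, True, True, False, False, True]
slice_true = [t for t, s in zip(y_true, is_short) if s]
slice_pred = [p for p, s in zip(y_pred, is_short) if s]
print("slice recall:", recall_score(slice_true, slice_pred))
```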

Inference and Serving

Inference and serving are crucial steps in designing machine learning systems. To make predictions, we need to decide whether to perform batch (offline) or real-time (online) inference.

Batch inference is ideal for tasks where predictions can be made on a finite set of inputs and then written to a database for low latency inference. This approach can generate and cache predictions for very fast inference for users, and the model doesn't need to be spun up as its own service since it's never used in real-time. However, predictions can become stale if user interests change.
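
A hedged sketch of the batch pattern: a scheduled job scores every known user and writes the results to a cache (here just a dict), so the serving path is a fast lookup. The model interface is an assumption for illustration.

```python
def batch_predict(model, user_features: dict[str, list[float]]) -> dict[str, float]:
    """Nightly batch job: score every known user and return a prediction cache."""
    # Assumes a scikit-learn-style model exposing a .predict() method.
    return {user_id: float(model.predict([feats])[0])
            for user_id, feats in user_features.items()}

# At serving time, a cache lookup replaces a live model call; entries may be
# stale until the next batch run (e.g. brand-new users have no entry yet).
# prediction_cache = batch_predict(model, user_features)
# score = prediction_cache.get(user_id)
```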

Some tasks where batch serving is ideal include recommending content that existing users will like based on their viewing history. But new users may just receive generic recommendations until their history is processed the next day.

Online inference, on the other hand, is suitable for real-time predictions where input features are fed to the model to retrieve predictions. This approach can yield more up-to-date predictions, but requires managed microservices to handle request traffic and real-time monitoring since the input space is unbounded.
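
For the online pattern, a common setup is a small web service that loads the model once at startup and scores each request as it arrives. The sketch below uses FastAPI purely as an example framework, with a placeholder in place of a real model call.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
# model = load_model("model.pkl")  # in a real service, loaded once at startup

class PredictionRequest(BaseModel):
    features: list[float]

@app.post("/predict")
def predict(req: PredictionRequest):
    # score = model.predict([req.features])[0]
    score = 0.0  # placeholder so this sketch runs without a trained model
    return {"score": score}
```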

To serve our model to users, we need to decide whether to run it on the user's phone/computer or on our own service. Running it on the user's device would use their memory and battery, but would provide quick latency. On the other hand, storing the model on our own service would increase latency and privacy concerns, but remove the burden of taking up memory and battery on the user's device.

Here are some measurements that we should log to monitor performance:

  • error rates
  • time to return queries
  • metric scores

To keep the model current and address biases and misuse, we should also discuss how often we would retrain it. Some models need to be retrained every day, while others may only need retraining every week or month.
