Applied machine learning and AI are changing engineering practice, enabling the creation of intelligent systems that can learn from data and make decisions with limited human intervention. This is made possible by algorithms and statistical models that can analyze and interpret complex data.
Machine learning algorithms are trained on datasets, often large ones, allowing them to learn patterns and relationships that would be difficult or impossible for humans to identify. For some classification tasks, a dataset of around 10,000 samples can be enough to train a model to high accuracy, although the amount of data needed depends on the problem and the model.
As engineers, it's essential to understand the fundamentals of machine learning and AI, including supervised and unsupervised learning, regression, classification, and clustering. By grasping these concepts, engineers can design and develop intelligent systems that can improve efficiency, accuracy, and decision-making in various industries.
Machine Learning Fundamentals
Most machine learning models fall into one of two broad categories: supervised learning models and unsupervised learning models. Supervised learning models make predictions, and they're trained with labeled data.
A great example of a supervised learning model is the US Postal Service's model that turns handwritten zip codes into digits. Another example is the model your credit card company uses to authorize purchases.
Unsupervised learning models, on the other hand, don't require labeled data. They're used to provide insights into existing data, or to group data into categories and categorize future inputs accordingly.
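The difference between the two categories can be sketched in a few lines of plain Python. This is a minimal illustration, not a production approach: the data values, the label names, and the helper functions `train_centroids`, `predict`, and `two_means` are all invented for the example. The supervised part uses labels to learn one centroid per class; the unsupervised part sees the same values without labels and groups them with a two-cluster k-means.

```python
# Toy 1-D data: hypothetical measurements, invented for illustration.
labeled = [(1.0, "small"), (1.2, "small"), (0.8, "small"),
           (5.0, "large"), (5.3, "large"), (4.7, "large")]

def train_centroids(samples):
    """Supervised: use the labels to compute one mean (centroid) per class."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two groups (k-means, k=2).

    Assumes the values contain at least two distinct numbers.
    """
    lo, hi = min(values), max(values)  # initial cluster centers
    for _ in range(iterations):
        group_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        group_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo = sum(group_lo) / len(group_lo)
        hi = sum(group_hi) / len(group_hi)
    return group_lo, group_hi

centroids = train_centroids(labeled)
print(predict(centroids, 1.1))        # label predicted for a new input

unlabeled = [x for x, _ in labeled]   # same data, labels discarded
print(two_means(unlabeled))           # two groups found without any labels
```

Note that the unsupervised version finds the groups but cannot name them; attaching meaning to each cluster is left to whoever inspects the result.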
Classification
Classification is a fundamental concept in machine learning that involves categorizing data into predefined groups. It's a crucial task in many industries, including e-commerce and advertising.
Google's 2010 research paper on predicting advertiser churn for Google AdWords is a great example of classification in action. The paper explores how to use machine learning to predict which advertisers are likely to leave Google AdWords.
Classification can be used to categorize documents, items, or even products. For instance, NAVER's 2016 research paper on large-scale item categorization in e-commerce using multiple recurrent neural networks shows how machine learning can be used to categorize products at scale.
Here are some notable examples of classification in machine learning:
- Prediction of Advertiser Churn for Google AdWords (Paper), Google, 2010
- High-Precision Phrase-Based Document Classification on a Modern Scale (Paper), LinkedIn, 2011
- Chimera: Large-scale Classification using Machine Learning, Rules, and Crowdsourcing (Paper), Walmart, 2014
- Large-scale Item Categorization in e-Commerce Using Multiple Recurrent Neural Networks (Paper), NAVER, 2016
- Discovering and Classifying In-app Message Intent at Airbnb, Airbnb, 2019
- Teaching Machines to Triage Firefox Bugs, Mozilla, 2019
- Categorizing Products at Scale, Shopify, 2020
- How We Built the Good First Issues Feature, GitHub, 2020
- Testing Firefox More Efficiently with Machine Learning, Mozilla, 2020
- Using ML to Subtype Patients Receiving Digital Mental Health Interventions (Paper), Microsoft, 2020
- Scalable Data Classification for Security and Privacy (Paper), Facebook, 2020
- Uncovering Online Delivery Menu Best Practices with Machine Learning, DoorDash, 2020
- Using a Human-in-the-Loop to Overcome the Cold Start Problem in Menu Item Tagging, DoorDash, 2020
- Deep Learning: Product Categorization and Shelving, Walmart, 2021
- Large-scale Item Categorization for e-Commerce (Paper), DianPing and eBay, 2012
- Semantic Label Representation with an Application on Multimodal Product Categorization, Walmart, 2022
- Building Airbnb Categories with ML and Human-in-the-Loop, Airbnb, 2022
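To make the idea of assigning items to predefined groups concrete, here is a minimal sketch of a naive Bayes text classifier in plain Python. It is not the method used in any of the papers listed above; the training titles, the category names, and the functions `train` and `classify` are hypothetical and exist only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (product title, category) pairs.
TRAINING = [
    ("red cotton t-shirt", "clothing"),
    ("blue denim jeans", "clothing"),
    ("wireless bluetooth headphones", "electronics"),
    ("usb phone charger", "electronics"),
]

def train(examples):
    """Count words per category and how many examples each category has."""
    word_counts = defaultdict(Counter)
    category_counts = Counter()
    for text, category in examples:
        category_counts[category] += 1
        word_counts[category].update(text.split())
    return word_counts, category_counts

def classify(text, word_counts, category_counts):
    """Return the category with the highest log-probability for the text."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_examples = sum(category_counts.values())
    best_category, best_score = None, -math.inf
    for category, counts in word_counts.items():
        # Start from the log prior of the category.
        score = math.log(category_counts[category] / total_examples)
        total_words = sum(counts.values())
        for word in text.split():
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category

word_counts, category_counts = train(TRAINING)
print(classify("blue cotton jeans", word_counts, category_counts))
```

Large-scale systems like those in the list replace the word counts with learned representations, but the contract is the same: text in, one of a fixed set of categories out.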
Forecasting
Much of machine learning is about making predictions, and forecasting is one of its most important applications. Companies like Uber and Gojek have developed automated forecasting tools to predict demand and supply.
Forecasting is a crucial aspect of business operations, and it's not just about predicting numbers, but also about understanding the underlying patterns and trends in the data behind them.
Companies like DoorDash and Grubhub have also developed forecasting tools to predict order volume and supply. These tools use machine learning algorithms to analyze historical data and make predictions about future demand.
One of the key challenges in forecasting is retraining machine learning models in the wake of unexpected events, such as the COVID-19 pandemic. Companies like DoorDash have developed strategies to retrain their models and adapt to changing circumstances.
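The idea of predicting future demand from historical data can be illustrated with simple exponential smoothing. This is a classical statistical baseline chosen for brevity, not the method used by any company named in this section; the function name `exponential_smoothing_forecast`, the `alpha` default, and the order counts are assumptions made for the example.

```python
def exponential_smoothing_forecast(history, alpha=0.5):
    """One-step-ahead forecast via simple exponential smoothing.

    alpha in (0, 1] controls how strongly recent observations are weighted:
    higher values react faster to change, lower values smooth out noise.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    level = history[0]
    for observed in history[1:]:
        # Blend the newest observation with the running estimate.
        level = alpha * observed + (1 - alpha) * level
    return level

daily_orders = [100, 120, 110, 130]  # hypothetical order counts
print(exponential_smoothing_forecast(daily_orders))
```

One crude way to adapt after an unexpected event is to pass only a recent window of the history, for example `daily_orders[-2:]`, or to raise `alpha` so that older observations fade faster. Retraining a learned model on post-event data follows the same intuition.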
These are just a few of the companies that have developed forecasting tools using machine learning, and the range of applications is broad.
Search
Machine learning has revolutionized the way we search for information online. One of the key applications of machine learning is in search ranking, which is the process of determining the order in which search results are displayed to users.
Amazon, for instance, uses a complex ranking system to display search results, which involves multiple factors such as relevance, popularity, and user behavior. In fact, Amazon's search ranking system is so sophisticated that it can even detect and prevent clickjacking attacks.
The goal of search ranking is to provide users with the most relevant and useful search results, which can be achieved through the use of machine learning algorithms. These algorithms can analyze large amounts of data, identify patterns, and make predictions about the relevance of search results.
In 2016, Yahoo described its ranking approach in the paper "Ranking Relevance in Yahoo Search". The system combines several machine learning algorithms and uses a variety of factors such as keyword matching, document similarity, and user behavior to rank search results.
Here are some notable examples of search ranking in practice:
- In 2017, Twitter used deep learning to improve the ranking of search results on its platform.
- Alibaba's e-commerce search engine uses a ranking system that takes into account factors such as user behavior, product attributes, and merchant information.
- In 2019, Airbnb developed a search ranking system that uses machine learning to personalize search results based on user behavior and preferences.
By using machine learning to improve search ranking, companies can provide users with more relevant and useful search results, which can lead to increased engagement, conversion rates, and customer satisfaction.
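At its simplest, ranking means computing a score for each candidate result and sorting by it. The sketch below uses a fixed weighted sum of two features; the feature names, the weights, and the candidate data are hypothetical, and production systems such as those mentioned above typically learn the scoring function from user behavior data rather than hard-coding it.

```python
# Hypothetical weights; a production system would typically learn these from data.
WEIGHTS = {"relevance": 0.7, "popularity": 0.3}

def score(result):
    """Weighted sum of the result's feature values."""
    return sum(weight * result[name] for name, weight in WEIGHTS.items())

def rank(results):
    """Return results ordered from highest to lowest score."""
    return sorted(results, key=score, reverse=True)

candidates = [
    {"title": "A", "relevance": 0.8, "popularity": 0.1},
    {"title": "B", "relevance": 0.7, "popularity": 0.9},
    {"title": "C", "relevance": 0.3, "popularity": 0.2},
]
print([r["title"] for r in rank(candidates)])
```

Here result B outranks A despite slightly lower relevance because its popularity is much higher, which shows how combining several signals changes the final order.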
Sequence Modelling
Sequence modelling is a key area of machine learning where algorithms are trained to recognize patterns in sequential data. This can be particularly useful in applications like predicting clinical events or understanding consumer histories.
Recurrent neural networks (RNNs) are a type of neural network architecture well-suited for sequence modelling tasks. They can learn to recognize patterns in sequential data, such as time series data or text sequences.
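What makes an RNN suited to sequences is that each step combines the current input with a hidden state carried over from the previous step. The following sketch runs the forward pass of a single-unit recurrent cell in plain Python; the weights are fixed, hypothetical values chosen for illustration, whereas a real network would learn them during training and use vectors and matrices rather than scalars.

```python
import math

def rnn_forward(inputs, w_in, w_rec, bias, h0=0.0):
    """Run a single-unit recurrent cell over a sequence.

    Each step computes h_t = tanh(w_in * x_t + w_rec * h_{t-1} + bias)
    and the function returns the hidden state after every step.
    """
    hidden = h0
    states = []
    for x in inputs:
        hidden = math.tanh(w_in * x + w_rec * hidden + bias)
        states.append(hidden)
    return states

# Hypothetical fixed weights; training would normally fit these to data.
states = rnn_forward([1.0, 0.0, 0.0], w_in=1.0, w_rec=0.5, bias=0.0)
print(states)

# The same values in a different order give a different final state,
# which is what lets the model be sensitive to sequence order.
print(rnn_forward([1.0, 0.0], 1.0, 0.5, 0.0)[-1])
print(rnn_forward([0.0, 1.0], 1.0, 0.5, 0.0)[-1])
```

The first input keeps influencing later hidden states even when the later inputs are zero, though its effect fades at each step.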
For example, a study by Sutter Health in 2015 used RNNs to predict clinical events, while another study by Zalando in 2016 used deep learning to understand consumer histories.
The applications of sequence modelling are diverse, and include early detection of heart failure onset, notification attendance prediction, and click-through rate prediction.
These examples demonstrate the potential of sequence modelling in a variety of domains, and highlight the importance of this area of machine learning.
Weak Supervision
Weak supervision is a technique used in machine learning to train models with limited or noisy labeled data. This approach is often used when it's not feasible to collect large amounts of high-quality labeled data.
One notable example of weak supervision is Snorkel DryBell, a case study by Google in 2019 that deployed weak supervision at an industrial scale. This project demonstrated the effectiveness of weak supervision in real-world applications.
Weak supervision can be achieved through various methods, including label synthesis, weak labeling, and active learning. These methods can be used individually or in combination to train models with limited labeled data.
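One common form of weak labeling is to write several cheap, noisy heuristics and combine their votes into a training label. The sketch below shows that idea in plain Python with a simple majority vote. It does not use the API of Snorkel or any other system named in this section, and the heuristics, label names, and the `weak_label` function are hypothetical.

```python
from collections import Counter

ABSTAIN = None

# Hypothetical labeling functions: noisy heuristics that vote or abstain.
def lf_contains_free(text):
    return "spam" if "free" in text.lower() else ABSTAIN

def lf_many_exclamations(text):
    return "spam" if text.count("!") >= 3 else ABSTAIN

def lf_mentions_meeting(text):
    return "ham" if "meeting" in text.lower() else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_free, lf_many_exclamations, lf_mentions_meeting]

def weak_label(text):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    counts = Counter(v for v in votes if v is not ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return ranked[0][0]

for message in ["FREE prize!!! Click now", "Team meeting at 10am",
                "Free slot for a meeting?", "Hello there"]:
    print(message, "->", weak_label(message))
```

Examples where every heuristic abstains or the votes tie stay unlabeled. More elaborate approaches estimate how accurate each heuristic is instead of weighting all votes equally.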
The Osprey system, developed by Intel in 2019, is another example of weak supervision in action. Osprey applies weak supervision to imbalanced extraction problems without requiring its users to write code.
In some cases, weak supervision can be used to improve the performance of machine-learned products. The Overton system, developed by Apple in 2019, is designed to monitor and improve machine-learned products by providing feedback to developers.
By leveraging weak supervision, developers can build more robust and accurate machine learning models, even with limited labeled data.
Generation
Machine learning is a field that's rapidly advancing, with new breakthroughs and innovations emerging every year. One area where we've seen significant progress is in the generation of new content, such as text, images, and even entire videos.
Better language models have been developed, enabling machines to generate fluent text that can be hard to distinguish from human writing. This has huge implications for applications like chatbots, virtual assistants, and language translation software.
The GPT-3 model, for example, has been shown to be a few-shot learner: given only a handful of examples of a new task in its prompt, it can often perform that task without task-specific fine-tuning. This means that machines can learn to perform complex tasks with just a few examples, rather than requiring large amounts of task-specific training data.
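In the few-shot setting, the examples are placed directly in the model's input text rather than used to update its weights. The sketch below only builds such a prompt as a string and does not call any model. The function name `build_few_shot_prompt` and the "Input:/Output:" layout are assumptions made for illustration, not a format required by GPT-3.

```python
def build_few_shot_prompt(task, examples, query):
    """Format a task description, worked examples, and a new query as one prompt."""
    lines = [task, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final entry is left incomplete for the model to continue.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("sea otter", "loutre de mer")],
    "peppermint",
)
print(prompt)
```

A language model given this text is expected to continue after the last "Output:" by following the pattern set by the two worked examples.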
In addition to text generation, researchers have also made significant strides in image generation and super resolution. Image GPT, a model developed by OpenAI, applies the same sequence-modelling approach to pixels and can generate and complete images, while deep learned super resolution techniques have been used to enhance the quality of feature films.
Here are some key papers and projects that have contributed to these advancements:
- Better Language Models and Their Implications (Paper)
- Image GPT (Paper, Code)
- Language Models are Few-Shot Learners (Paper)
- Deep Learned Super Resolution for Feature Film Production (Paper)
- Unit Test Case Generation with Transformers
Machine Learning vs AI
Machine learning and artificial intelligence (AI) are often used interchangeably, but technically speaking, machine learning is a subset of AI.
AI encompasses not only machine learning models but also other types of models such as expert systems and reinforcement learning systems.
An example of a reinforcement learning system is AlphaGo, which was the first computer program to beat a professional human Go player.
It trains on games that have already been played and learns strategies for winning on its own.
Deep learning is a subset of machine learning and what most people refer to as AI today.
Deep learning is machine learning performed with neural networks.
There are forms of deep learning that don't involve neural networks, but the vast majority of deep learning today involves neural networks.
Machine learning models can be divided into conventional models and deep-learning models.
Conventional models use learning algorithms to model patterns in data, while deep-learning models use neural networks to do the same.
Neural networks have been developed to excel at certain tasks, including computer vision and tasks involving human languages.