Applied machine learning and AI are transforming engineering, enabling intelligent systems that learn from data and make decisions with little human intervention. This is made possible by algorithms and statistical models that can analyze and interpret complex data.
Machine learning algorithms are trained on datasets, often large ones, which lets them learn patterns and relationships that would be difficult or impossible for humans to identify by hand. How much data is enough depends on the problem; for many classification tasks, a labeled dataset on the order of 10,000 samples can be sufficient to reach useful accuracy.
Engineers therefore need to understand the fundamentals of machine learning and AI, including supervised and unsupervised learning, regression, classification, and clustering. With these concepts in hand, they can design and build intelligent systems that improve efficiency, accuracy, and decision-making across industries.
Machine Learning Fundamentals
Most machine learning models fall into one of two broad categories: supervised learning models and unsupervised learning models. Supervised learning models make predictions, and they're trained with labeled data.
A great example of a supervised learning model is the US Postal Service's model that converts images of handwritten ZIP codes into digits. Another example is the model your credit card company uses to authorize purchases.
Unsupervised learning models, on the other hand, don't require labeled data. They're used to provide insights into existing data, or to group data into categories and categorize future inputs accordingly.
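To make the distinction concrete, here is a minimal sketch (not tied to any of the systems mentioned in this article) that trains a supervised classifier on labeled synthetic data and, separately, clusters the same data without labels, using scikit-learn:

```python
# Minimal sketch contrasting supervised and unsupervised learning with
# scikit-learn on synthetic data; the dataset and model choices are
# illustrative only.
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = make_blobs(n_samples=500, centers=3, random_state=0)

# Supervised: the model is trained with labeled examples and then predicts
# labels for data it has not seen before.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("classification accuracy:", clf.score(X_test, y_test))

# Unsupervised: no labels are provided; the model groups the data into
# clusters and can assign future inputs to those clusters.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster assignments for new points:", kmeans.predict(X_test[:5]))
```

The supervised model needs the labels up front; the clustering model discovers the groups on its own and can still assign new inputs to them.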
Classification
Classification is a fundamental concept in machine learning that involves categorizing data into predefined groups. It's a crucial task in many industries, including e-commerce and advertising.
Google's 2010 research paper on predicting advertiser churn for Google AdWords is a good example of classification in action: it uses machine learning to predict which advertisers are likely to leave the platform.
Classification can be used to categorize documents, items, or products. For instance, NAVER's 2016 paper on large-scale item categorization in e-commerce shows how multiple recurrent neural networks can categorize products quickly and at scale, and Walmart has published similar work on product categorization and shelving.
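As a rough illustration of what product categorization looks like in code, here is a tiny, hypothetical example built with scikit-learn on a handful of made-up product titles; the systems cited in this section operate on millions of items with far richer models:

```python
# Illustrative sketch only: a small TF-IDF + logistic regression pipeline for
# categorizing product titles. The titles and categories are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

titles = [
    "stainless steel kitchen knife set",
    "cast iron skillet 12 inch",
    "wireless bluetooth headphones",
    "usb-c fast charging cable",
]
categories = ["kitchen", "kitchen", "electronics", "electronics"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(titles, categories)

# With so little data this is only a toy, but the shared "headphones" token
# should push the prediction toward "electronics".
print(model.predict(["noise cancelling over-ear headphones"]))
```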
Here are some notable examples of classification in machine learning:
- Prediction of Advertiser Churn for Google AdWords (Paper), Google, 2010
- High-Precision Phrase-Based Document Classification on a Modern Scale (Paper), LinkedIn, 2011
- Large-scale Item Categorization for e-Commerce (Paper), DianPing and eBay, 2012
- Chimera: Large-scale Classification using Machine Learning, Rules, and Crowdsourcing (Paper), Walmart, 2014
- Large-scale Item Categorization in e-Commerce Using Multiple Recurrent Neural Networks (Paper), NAVER, 2016
- Discovering and Classifying In-app Message Intent at Airbnb, Airbnb, 2019
- Teaching Machines to Triage Firefox Bugs, Mozilla, 2019
- Categorizing Products at Scale, Shopify, 2020
- How We Built the Good First Issues Feature, GitHub, 2020
- Testing Firefox More Efficiently with Machine Learning, Mozilla, 2020
- Using ML to Subtype Patients Receiving Digital Mental Health Interventions (Paper), Microsoft, 2020
- Scalable Data Classification for Security and Privacy (Paper), Facebook, 2020
- Uncovering Online Delivery Menu Best Practices with Machine Learning, DoorDash, 2020
- Using a Human-in-the-Loop to Overcome the Cold Start Problem in Menu Item Tagging, DoorDash, 2020
- Deep Learning: Product Categorization and Shelving, Walmart, 2021
- Semantic Label Representation with an Application on Multimodal Product Categorization, Walmart, 2022
- Building Airbnb Categories with ML and Human-in-the-Loop, Airbnb, 2022
Forecasting
Forecasting is one of the most widely deployed applications of machine learning in industry. Companies like Uber and Gojek have built automated forecasting tools to predict demand and supply.
Good forecasting is not just about predicting numbers; it's about capturing the underlying patterns and trends in historical data so that predictions hold up in production.
Delivery companies like DoorDash and Grubhub likewise use machine learning models trained on historical data to forecast order volume and supply.
One of the key challenges in forecasting is retraining models in the wake of unexpected events, such as the COVID-19 pandemic; DoorDash, for example, has described how it retrained its models to adapt to the sudden shift in demand.
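At its simplest, demand forecasting can be framed as supervised learning over lagged historical values. The sketch below is a toy version of that idea on synthetic daily order counts; it is not how any of the companies above implement their systems, which also handle seasonality, holidays, external covariates, and uncertainty estimates:

```python
# Toy demand-forecasting sketch: predict the next day's order volume from the
# previous seven days using a gradient-boosted regressor. Data is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
days = np.arange(365)
orders = 1000 + 200 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 30, 365)

# Build lag features: the previous 7 days predict the next day.
lags = 7
X = np.array([orders[i : i + lags] for i in range(len(orders) - lags)])
y = orders[lags:]

split = 300  # train on the first 300 days, evaluate on the rest
model = GradientBoostingRegressor().fit(X[:split], y[:split])
preds = model.predict(X[split:])
mae = np.mean(np.abs(preds - y[split:]))
print(f"mean absolute error on held-out days: {mae:.1f} orders")
```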
Uber, Gojek, DoorDash, Grubhub, Lyft, LinkedIn, and Amazon are just a few of the companies that have published write-ups on forecasting with machine learning; see the Sources below for specific articles and papers. The applications range from delivery demand to arrival-time prediction.
Search
Machine learning has revolutionized the way we search for information online. One of the key applications of machine learning is in search ranking, which is the process of determining the order in which search results are displayed to users.
Amazon, for instance, ranks search results using many signals, including relevance, popularity, and shopper behavior.
The goal of search ranking is to provide users with the most relevant and useful search results, which can be achieved through the use of machine learning algorithms. These algorithms can analyze large amounts of data, identify patterns, and make predictions about the relevance of search results.
In 2016, Yahoo published "Ranking Relevance in Yahoo Search", a paper describing a ranking system that combines machine learning models with signals such as keyword matching, document similarity, and user behavior.
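Search ranking is frequently framed as learning to rank: a model scores each (query, document) pair from a set of features, and results are sorted by the predicted score. The sketch below is a toy pointwise version with invented features and labels, not the approach of any specific system cited here:

```python
# Toy pointwise learning-to-rank sketch: score (query, document) pairs from
# hand-crafted features and sort candidate documents by predicted relevance.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Features per (query, document) pair: [keyword match, popularity, past CTR]
X_train = np.array([
    [0.9, 0.8, 0.30],
    [0.7, 0.2, 0.10],
    [0.1, 0.9, 0.05],
    [0.3, 0.1, 0.02],
])
relevance = np.array([3.0, 2.0, 1.0, 0.0])  # graded relevance labels

ranker = GradientBoostingRegressor().fit(X_train, relevance)

candidates = {"doc_a": [0.8, 0.5, 0.20], "doc_b": [0.2, 0.9, 0.04]}
scores = {doc: ranker.predict(np.array([f]))[0] for doc, f in candidates.items()}
print("ranked results:", sorted(scores, key=scores.get, reverse=True))
```

Production rankers typically use pairwise or listwise objectives and are trained on logged user interactions rather than a handful of labeled rows.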
Here are a few more examples of machine learning applied to search ranking:
- In 2017, Twitter used deep learning to improve the ranking of search results on its platform.
- Alibaba's e-commerce search engine uses a ranking system that takes into account factors such as user behavior, product attributes, and merchant information.
- In 2019, Airbnb developed a search ranking system that uses machine learning to personalize search results based on user behavior and preferences.
By using machine learning to improve search ranking, companies can provide users with more relevant and useful search results, which can lead to increased engagement, conversion rates, and customer satisfaction.
Sequence Modelling
Sequence modelling is a key area of machine learning where algorithms are trained to recognize patterns in sequential data. This can be particularly useful in applications like predicting clinical events or understanding consumer histories.
Recurrent neural networks (RNNs) are a type of neural network architecture well-suited for sequence modelling tasks. They can learn to recognize patterns in sequential data, such as time series data or text sequences.
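The core mechanism is a hidden state that is updated at every step of the sequence. The sketch below implements a single forward pass of a vanilla RNN cell in plain NumPy, purely to illustrate that recurrence; real systems use trained LSTM, GRU, or transformer layers:

```python
# Bare-bones RNN forward pass showing how the hidden state carries information
# across a sequence. Weights are random; nothing here is trained.
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 4, 8, 5

W_x = rng.normal(0, 0.1, (hidden_dim, input_dim))   # input-to-hidden weights
W_h = rng.normal(0, 0.1, (hidden_dim, hidden_dim))  # hidden-to-hidden weights
b = np.zeros(hidden_dim)

sequence = rng.normal(0, 1.0, (seq_len, input_dim))  # e.g. 5 time steps of events
h = np.zeros(hidden_dim)                             # hidden state starts empty

for x_t in sequence:
    # Each step depends on the current input AND the previous hidden state,
    # which is how the network "remembers" earlier parts of the sequence.
    h = np.tanh(W_x @ x_t + W_h @ h + b)

print("final hidden state summarizing the sequence:", h.round(3))
```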
For example, the 2015 Doctor AI study trained RNNs on Sutter Health patient records to predict clinical events, and Zalando used deep learning in 2016 to understand consumer histories.
The applications of sequence modelling are diverse, and include early detection of heart failure onset, notification attendance prediction, and click-through rate prediction.
These examples, along with the sequence-modelling papers listed in the Sources, demonstrate the potential of sequence modelling across domains ranging from healthcare to e-commerce.
Weak Supervision
Weak supervision is a technique used in machine learning to train models with limited or noisy labeled data. This approach is often used when it's not feasible to collect large amounts of high-quality labeled data.
One notable example of weak supervision is Snorkel DryBell, a case study by Google in 2019 that deployed weak supervision at an industrial scale. This project demonstrated the effectiveness of weak supervision in real-world applications.
Weak supervision can be implemented in several ways, including programmatic labeling functions, distant supervision from existing databases, and noisy crowdsourced labels. These weak signals can be used individually or combined to produce training labels when hand-labeled data is scarce.
The Osprey system, developed by Intel in 2019, is another example of weak supervision in action. Osprey uses weak supervision to address imbalanced extraction problems without requiring code modifications.
In some cases, weak supervision can be used to improve the performance of machine-learned products. The Overton system, developed by Apple in 2019, is designed to monitor and improve machine-learned products by providing feedback to developers.
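The sketch below is a deliberately simplified stand-in for this idea: a few hand-written labeling functions vote on each example, and the votes are combined by simple majority. Systems like Snorkel instead learn a statistical label model that weighs each function by its estimated accuracy:

```python
# Simplified weak-supervision sketch: noisy labeling functions vote on each
# example, and a majority vote produces training labels. Rules are invented.
ABSTAIN, NOT_SPAM, SPAM = -1, 0, 1

def lf_contains_link(text):
    return SPAM if "http" in text else ABSTAIN

def lf_all_caps(text):
    return SPAM if text.isupper() else ABSTAIN

def lf_greeting(text):
    return NOT_SPAM if text.lower().startswith(("hi", "hello")) else ABSTAIN

labeling_functions = [lf_contains_link, lf_all_caps, lf_greeting]

def weak_label(text):
    votes = [lf(text) for lf in labeling_functions if lf(text) != ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)  # majority vote over non-abstains

messages = ["hello, lunch tomorrow?", "CLICK NOW http://spam.example", "FREE PRIZES!!!"]
print([weak_label(m) for m in messages])  # -> [0, 1, 1]
```

The resulting (noisy) labels can then be used to train an ordinary supervised model.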
By leveraging weak supervision, developers can build reasonably robust and accurate machine learning models even with limited hand-labeled data; Snorkel DryBell, Osprey, and Overton are all examples of this approach running at industrial scale.
Generation
Machine learning is a field that's rapidly advancing, with new breakthroughs and innovations emerging every year. One area where we've seen significant progress is in the generation of new content, such as text, images, and even entire videos.
Better language models have been developed, enabling machines to generate text that is often hard to distinguish from human writing. This has significant implications for applications like chatbots, virtual assistants, and language translation software.
The GPT-3 model, for example, has been shown to be a few-shot learner, able to learn new tasks with minimal training data. This means that machines can learn to perform complex tasks with just a few examples, rather than requiring large amounts of training data.
In addition to text generation, researchers have made significant strides in image generation and super resolution. Image GPT, a model from OpenAI, applies the GPT architecture to sequences of pixels to produce coherent image completions, and deep learned super resolution has been used at Pixar to upscale feature-film frames.
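As a small hands-on illustration of text generation (not the GPT-3 or Image GPT models discussed above, which are not available this way), the following sketch assumes the open source Hugging Face transformers library and the public GPT-2 checkpoint are installed:

```python
# Assumes `pip install transformers torch`; downloads the public GPT-2 model
# on first run. This only illustrates the prompt-and-generate pattern.
from transformers import pipeline, set_seed

set_seed(42)
generator = pipeline("text-generation", model="gpt2")

prompt = "Machine learning lets engineers build systems that"
outputs = generator(prompt, max_new_tokens=30, num_return_sequences=1)
print(outputs[0]["generated_text"])
```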
Here are some key papers and projects that have contributed to these advancements:
- Better Language Models and Their Implications (Paper)
- Image GPT (Paper, Code)
- Language Models are Few-Shot Learners (Paper)
- Deep Learned Super Resolution for Feature Film Production (Paper)
- Unit Test Case Generation with Transformers
Machine Learning vs. AI
Machine learning and artificial intelligence (AI) are often used interchangeably, but technically speaking, machine learning is a subset of AI.
AI encompasses not only machine learning models but also other types of models, such as expert systems and reinforcement learning systems.
An example of a reinforcement learning system is AlphaGo, which was the first computer program to beat a professional human Go player.
It trains on games that have already been played and learns strategies for winning on its own.
Deep learning is a subset of machine learning and what most people refer to as AI today.
Deep learning is machine learning performed with neural networks.
There are forms of deep learning that don't involve neural networks, but the vast majority of deep learning today involves neural networks.
Machine learning models can be divided into conventional models and deep-learning models.
Conventional models use learning algorithms to model patterns in data, while deep-learning models use neural networks to do the same.
Neural networks have been developed to excel at certain tasks, including computer vision and tasks involving human languages.
We'll take a closer look at neural networks in Chapter 8.
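For contrast with the neural networks covered in Chapter 8, here is what a conventional model looks like in code: a k-nearest neighbors regressor built with Scikit-Learn (both referenced in the Sources), fit on a few made-up data points:

```python
# A conventional (non-deep) model: k-nearest neighbors regression with
# Scikit-Learn. The house-size and price values are invented for illustration.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

X = np.array([[800], [1000], [1200], [1500], [1800], [2200]])  # square feet
y = np.array([150_000, 180_000, 210_000, 260_000, 300_000, 360_000])  # price

model = KNeighborsRegressor(n_neighbors=3).fit(X, y)
print("predicted price for 1,600 sq ft:", model.predict([[1600]])[0])
```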
Sources
- Big Data Analytics and Applied Machine Learning with Python ... (emory.edu)
- applyingML (applyingml.com)
- Improving Accuracy By Certainty Estimation of Human Decisions, Labels, and Raters (fb.com)
- Paper (thodrek.github.io)
- Data Management Challenges in Production Machine Learning (research.google)
- Monitoring Data Quality at Scale with Statistical Modeling (uber.com)
- Introducing Fabricator: A Declarative Feature Engineering Framework (doordash.engineering)
- Developing scalable feature engineering DAGs (outerbounds.com)
- Open sourcing Feathr – LinkedIn’s feature store for productive machine learning (linkedin.com)
- Near real-time features for near real-time personalization (linkedin.com)
- ML Feature Serving Infrastructure at Lyft (lyft.com)
- Optimal Feature Discovery: Better, Leaner Machine Learning Models Through Information Theory (uber.com)
- Building Riviera: A Declarative Real-Time Feature Engineering Framework (doordash.engineering)
- Feast: Bridging ML Models and Data (gojek.io)
- Accelerating Machine Learning with the Feature Store Service (condenast.com)
- Introducing Feast: An Open Source Feature Store for Machine Learning (google.com)
- Building the Activity Graph, Part 2 (Feature Storage Section) (linkedin.com)
- Distributed Time Travel for Feature Generation (netflixtechblog.com)
- Building Airbnb Categories with ML and Human-in-the-Loop (medium.com)
- Using a Human-in-the-Loop to Overcome the Cold Start Problem in Menu Item Tagging (doordash.engineering)
- Uncovering Online Delivery Menu Best Practices with Machine Learning (doordash.engineering)
- Testing Firefox More Efficiently with Machine Learning (mozilla.org)
- Teaching Machines to Triage Firefox Bugs (mozilla.org)
- Paper (kdd.org)
- Large-scale Item Categorization in e-Commerce Using Multiple Recurrent Neural Networks (kdd.org)
- Chimera: Large-scale Classification using Machine Learning, Rules, and Crowdsourcing (acm.org)
- High-Precision Phrase-Based Document Classification on a Modern Scale (linkedin.com)
- Prediction of Advertiser Churn for Google AdWords (research.google)
- Using Machine Learning to Predict the Value of Ad Requests (twitter.com)
- Using Machine Learning to Predict Value of Homes On Airbnb (medium.com)
- Causal Forecasting at Lyft (Part 1) (lyft.com)
- DeepETA: How Uber Predicts Arrival Times Using Deep Learning (uber.com)
- The history of Amazon’s forecasting algorithm (amazon.science)
- Greykite: A flexible, intuitive, and fast forecasting library (linkedin.com)
- Managing Supply and Demand Balance Through Machine Learning (doordash.engineering)
- Introducing Orbit, An Open Source Package for Time Series Inference and Forecasting (uber.com)
- Retraining Machine Learning Models in the Wake of COVID-19 (doordash.engineering)
- Under the Hood of Gojek’s Automated Forecasting Tool (gojek.io)
- Engineering Extreme Event Forecasting at Uber with RNN (uber.com)
- Paper (arxiv.org)
- Homepage Recommendation with Exploitation and Exploration (doordash.engineering)
- Evolving DoorDash’s Substitution Recommendations Algorithm (doordash.engineering)
- Recommend API: Unified end-to-end machine learning infrastructure to generate recommendations (slack.engineering)
- RecSysOps: Best Practices for Operating a Large-Scale Recommender System (medium.com)
- Blueprints for recommender system architectures: 10th anniversary edition (amatriain.net)
- Improving job matching with machine-learned activity features (linkedin.com)
- Beyond Matrix Factorization: Using hybrid features for user-business recommendations (yelp.com)
- Lessons Learned from Building out Context-Aware Recommender Systems (onepeloton.com)
- How We Built: An Early-Stage Machine Learning Model for Recommendations (onepeloton.com)
- Building a Deep Learning Based Retrieval System for Personalized Recommendations (ebayinc.com)
- The Amazon Music conversational recommender is hitting the right notes (amazon.science)
- Understanding Data Storage and Ingestion for Large-Scale Deep Recommendation Model Training (arxiv.org)
- "Are you sure?": Preliminary Insights from Scaling Product Comparisons to Multiple Shops (arxiv.org)
- On YouTube's Recommendation System (blog.youtube)
- Deep Retrieval: End-to-End Learnable Structure Model for Large-Scale Recommendations (arxiv.org)
- Self-supervised Learning for Large-scale Item Recommendations (arxiv.org)
- Lessons Learned Addressing Dataset Bias in Model-Based Candidate Generation (arxiv.org)
- Multi-task Learning and Calibration for Utility-based Home Feed Ranking (medium.com)
- Improving the Quality of Recommended Pins with Lightweight Ranking (medium.com)
- Multi-task Learning for Related Products Recommendations at Pinterest (medium.com)
- A Case Study of Session-based Recommendations in the Home-improvement Domain (acm.org)
- Improved Deep & Cross Network for Feature Cross Learning in Web-scale LTR Systems (arxiv.org)
- Zero-Shot Heterogeneous Transfer Learning from RecSys to Cold-Start Search Retrieval (arxiv.org)
- Building a Heterogeneous Social Network Recommendation System (linkedin.com)
- A Closer Look at the AI Behind Course Recommendations on LinkedIn Learning (Part 2) (linkedin.com)
- The Evolution of Kit: Automating Marketing Using Machine Learning (shopify.com)
- Contextual and Sequential User Embeddings for Large-Scale Music Recommendation (acm.org)
- For Your Ears Only: Personalizing Spotify Home with Machine Learning (atspotify.com)
- ATBRG: Adaptive Target-Behavior Relational Graph Network for Effective Recommendation (arxiv.org)
- MiNet: Mixed Interest Network for Cross-Domain Click-Through Rate Prediction (arxiv.org)
- Controllable Multi-Interest Framework for Recommendation (arxiv.org)
- TPG-DNN: A Method for User Intent Prediction with Multi-task Learning (arxiv.org)
- Paper (arxiv.org)
- Deep Interest with Hierarchical Attention Network for Click-Through Rate Prediction (arxiv.org)
- Temporal-Contextual Recommendation in Real-Time (amazon.science)
- Learning to be Relevant: Evolution of a Course Recommendation System (acm.org)
- Using Machine Learning to Predict what File you Need Next (Part 2) (dropbox.tech)
- Food Discovery with Uber Eats: Using Graph Learning to Power Recommendations (uber.com)
- Powered by AI: Instagram’s Explore recommender system (facebook.com)
- Personalized Recommendations for Experiences Using Deep Learning (tripadvisor.com)
- Multi-Interest Network with Dynamic Routing for Recommendation at Tmall (arxiv.org)
- Paper (arxiv.org)
- SDM: Sequential Deep Matching Model for Online Large-scale Recommender System (arxiv.org)
- Behavior Sequence Transformer for E-commerce Recommendation in Alibaba (arxiv.org)
- Explore, Exploit, and Explain: Personalizing Explainable Recommendations with Bandits (acm.org)
- Pixie: A System for Recommending 3+ Billion Items to 200+ Million Users in Real-Time (arxiv.org)
- Paper (nips.cc)
- A Meta-Learning Perspective on Cold-Start Recommendations for Items (nips.cc)
- Personalized Recommendations in LinkedIn Learning (linkedin.com)
- Session-based Recommendations with Recurrent Neural Networks (arxiv.org)
- Learning a Personalized Homepage (netflixtechblog.com)
- Recommending Music on Spotify with Deep Learning (benanne.github.io)
- Learning to Rank Recommendations with the k -Order Statistic Loss (acm.org)
- Deep Learning for Search Ranking at Etsy (etsy.com)
- How to Optimise Rankings with Cascade Bandits (medium.com)
- Learning To Rank Diversely (medium.com)
- SearchSage: Learning Search Query Representations at Pinterest (medium.com)
- Siamese BERT-based Model for Web Search Relevance Ranking (arxiv.org)
- Paper (arxiv.org)
- Graph Intention Network for Click-through Rate Prediction in Sponsored Search (arxiv.org)
- Using Learning-to-rank to Precisely Locate Where to Deliver Packages (amazon.science)
- Towards Personalized and Semantic Retrieval for E-commerce Search via Embedding Learning (arxiv.org)
- Embedding-based Retrieval in Facebook Search (arxiv.org)
- Things Not Strings: Understanding Search Intent with Better Recall (doordash.engineering)
- GDMix: A Deep Ranking Personalization Framework (linkedin.com)
- COLD: Towards the Next Generation of Pre-Ranking System (arxiv.org)
- AI at Scale in Bing (bing.com)
- Video (crossminds.ai)
- Ads Allocation in Feed via Constrained Optimization (acm.org)
- Quality Matches Via Personalized AI for Hirer and Seeker Preferences (linkedin.com)
- Improving Deep Learning for Airbnb Search (arxiv.org)
- Query2vec: Search query expansion with query embeddings (grubhub.com)
- How We Used Semantic Search to Make Our Search 10x Smarter (medium.com)
- Aggregating Search Results from Heterogeneous Sources via Reinforcement Learning (arxiv.org)
- Neural Code Search: ML-based Code Search Using Natural Language Queries (facebook.com)
- Paper (arxiv.org)
- Entity Personalized Talent Search Models with Tree Interaction Features (arxiv.org)
- Machine Learning-Powered Search Ranking of Airbnb Experiences (medium.com)
- Reinforcement Learning to Rank in E-Commerce Search Engine (arxiv.org)
- Paper (arxiv.org)
- Globally Optimized Mutual Influence Aware Ranking in E-Commerce Search (arxiv.org)
- Powering Search & Recommendations at DoorDash (doordash.engineering)
- An Ensemble-based Approach to Click-Through Rate Prediction for Promoted Listings at Etsy (arxiv.org)
- Using Deep Learning at Scale in Twitter’s Timelines (twitter.com)
- Learning to Rank Personalized Search Results in Professional Networks (arxiv.org)
- Paper (kdd.org)
- Ranking Relevance in Yahoo Search (kdd.org)
- Embeddings at Spotify's Scale - How Hard Could It Be? (arize.com)
- Multi-objective Hyper-parameter Optimization of Behavioral Song Embeddings (arxiv.org)
- The Embeddings That Came in From the Cold: Improving Vectors for New and Rare Products with Content-Based Inference (acm.org)
- BERT Goes Shopping: Comparing Distributional Models for Product Representations (aclanthology.org)
- Machine Learning for a Better Developer Experience (netflixtechblog.com)
- Should we Embed? A Study on Performance of Embeddings for Real-Time Recommendations (arxiv.org)
- Personalized Store Feed with Vector Embeddings (doordash.engineering)
- Towards Deep and Representation Learning for Talent Search at LinkedIn (arxiv.org)
- Understanding Latent Style (stitchfix.com)
- Paper (kdd.org)
- Embeddings@Twitter (twitter.com)
- ML-Enhanced Code Completion Improves Developer Productivity (googleblog.com)
- (Part 2) (arxiv.org)
- How we reduced our text similarity runtime by 99.96% (medium.com)
- WIDeText: A Multimodal Deep Learning Framework (medium.com)
- GeDi: A Powerful New Method for Controlling Language Models (einstein.ai)
- Photon: A Robust Cross-Domain Text-to-SQL System (aclweb.org)
- Deploying Lifelong Open-Domain Dialogue Learning (arxiv.org)
- A Highly Efficient, Real-Time Text-to-Speech System Deployed on CPUs (facebook.com)
- Paper (arxiv.org)
- Using Neural Networks to Find Answers in Tables (googleblog.com)
- Goal-Oriented End-to-End Conversational Models with Profile Features in a Real-World Setting (amazon.science)
- Building Smart Replies for Member Messages (linkedin.com)
- Search-based User Interest Modeling with Sequential Behavior Data for CTR Prediction (arxiv.org)
- Practice on Long Sequential User Behavior Modeling for Click-Through Rate Prediction (arxiv.org)
- Deep Learning for Electronic Health Records (googleblog.com)
- Continual Prediction of Notification Attendance with Classical and Deep Networks (arxiv.org)
- Paper (doogkong.github.io)
- Deep Learning for Understanding Consumer Histories (zalando.com)
- Doctor AI: Predicting Clinical Events via Recurrent Neural Networks (arxiv.org)
- An Efficient Training Approach for Very Large Scale Face Recognition (arxiv.org)
- Using Machine Learning to Detect Deficient Coverage in Colonoscopy Screenings (googleblog.com)
- On-device Supermarket Product Recognition (googleblog.com)
- RepNet: Counting Repetitions in Videos (googleblog.com)
- Machine Learning-based Damage Assessment for Disaster Relief (googleblog.com)
- Making machines recognize and transcribe conversations in meetings using audio and video (microsoft.com)
- How we Improved Computer Vision Metrics by More Than 5% Only by Cleaning Labelling Errors (deepomatic.com)
- Selecting the Best Image for Each Merchant Using Exploration and Machine Learning (doordash.engineering)
- Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms (arxiv.org)
- Shifting Consumption towards Diverse content via Reinforcement Learning (atspotify.com)
- Part 2 (towardsdatascience.com)
- Deep Reinforcement Learning in Production Part1 (towardsdatascience.com)
- Paper (arxiv.org)
- Dynamic Pricing on E-commerce Platform with Deep Reinforcement Learning (arxiv.org)
- Reinforcement Learning for On-Demand Logistics (doordash.engineering)
- Budget Constrained Bidding by Model-free Reinforcement Learning in Display Advertising (arxiv.org)
- Deep Reinforcement Learning for Sponsored Search Real-time Bidding (arxiv.org)
- Improving the accuracy of our machine learning WAF using data augmentation and sampling (cloudflare.com)
- Evolving our machine learning to stop mobile bots (cloudflare.com)
- Fighting fraud with Triplet Loss (olx.com)
- Cloudflare Bot Management: Machine Learning and More (cloudflare.com)
- Blocking Slack Invite Spam With Machine Learning (slack.engineering)
- Detecting and Preventing Abuse on LinkedIn using Isolation Forests (linkedin.com)
- Paper (aaai.org)
- Metapaths guided Neighbors aggregated Network for Heterogeneous Graph Reasoning (arxiv.org)
- Video (crossminds.ai)
- Traffic Prediction with Advanced Graph Neural Networks (deepmind.com)
- AliGraph: A Comprehensive Graph Neural Network Platform (arxiv.org)
- Graph Convolutional Neural Networks for Web-Scale Recommender Systems (arxiv.org)
- Building The LinkedIn Knowledge Graph (linkedin.com)
- Optimizing DoorDash’s Marketing Spend with Machine Learning (doordash.engineering)
- Next-Generation Optimization for Dasher Dispatch at DoorDash (doordash.engineering)
- How Trip Inferences and Machine Learning Optimize Delivery Times on Uber Eats (uber.com)
- (Part 1) (grab.com)
- One-shot Text Labeling using Attention and Belief Propagation for Information Extraction (arxiv.org)
- Paper (arxiv.org)
- AutoKnow: self-driving knowledge collection for products of thousands of types (amazon.science)
- Using Machine Learning to Index Text from Billions of Images (dropbox.tech)
- Bootstrapping Conversational Agents with Weak Supervision (aaai.org)
- Paper (ajratner.github.io)
- Snorkel DryBell: A Case Study in Deploying Weak Supervision at Industrial Scale (acm.org)
- Unit Test Case Generation with Transformers (arxiv.org)
- Paper (pixar.com)
- Deep Learned Super Resolution for Feature Film Production (pixar.com)
- Language Models are Few-Shot Learners (arxiv.org)
- Paper (openai.com)
- Better Language Models and Their Implications (openai.com)
- The Machine Learning Behind Hum to Search (googleblog.com)
- Improving On-Device Speech Recognition with VoiceFilter-Lite (googleblog.com)
- MPC-based machine learning: Achieving end-to-end privacy-preserving machine learning (facebook.com)
- Federated Learning with Formal Differential Privacy Guarantees (googleblog.com)
- Federated Learning: Collaborative Machine Learning without Centralized Training Data (googleblog.com)
- Accelerating our A/B experiments with machine learning (dropbox.tech)
- Meet Dash-AB — The Statistics Engine of Experimentation at DoorDash (doordash.engineering)
- Overtracking and Trigger Analysis: Reducing sample sizes while INCREASING sensitivity (booking.ai)
- Challenges in Experimentation (lyft.com)
- Interpreting A/B Test Results: False Negatives and Power (netflixtechblog.com)
- Iterating Real-time Assignment Algorithms Through Experimentation (doordash.engineering)
- Leveraging Causal Modeling to Get More Value from Flat Experiment Results (doordash.engineering)
- Improving Online Experiment Capacity by 4X with Parallelization and Increased Sensitivity (doordash.engineering)
- Improving Experimental Power through Control Using Predictions as Covariate (doordash.engineering)
- Our Evolution Towards T-REX: The Prehistory of Experimentation Infrastructure at LinkedIn (linkedin.com)
- Paper (mlr.press)
- Paper (nips.cc)
- Announcing a New Framework for Designing Optimal Experiments with Pyro (uber.com)
- Paper (arxiv.org)
- Constrained Bayesian Optimization with Noisy Experiments (fb.com)
- Under the Hood of Uber’s Experimentation Platform (uber.com)
- Analyzing Experiment Outcomes: Beyond Average Treatment Effects (uber.com)
- Building an Intelligent Experimentation Platform with Uber Engineering (uber.com)
- The Reusable Holdout: Preserving Validity in Adaptive Data Analysis (googleblog.com)
- Overlapping Experiment Infrastructure: More, Better, Faster Experimentation (research.google)
- Dealing with Train-serve Skew in Real-time ML Models: A Short Guide (nubank.com.br)
- Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks (arxiv.org)
- How We Scaled Bert To Serve 1+ Billion Daily Requests on CPUs (roblox.com)
- LiFT: A Scalable Framework for Measuring Fairness in ML Applications (linkedin.com)
- Elastic Distributed Training with XGBoost on Ray (uber.com)
- Didact AI: The anatomy of an ML-powered stock picking engine (principiamundi.com)
- Monzo’s machine learning stack (monzo.com)
- Zalando's Machine Learning Platform (zalando.com)
- The Magic of Merlin: Shopify's New Machine Learning Platform (shopify.engineering)
- DARWIN: Data Science and Artificial Intelligence Workbench at LinkedIn (linkedin.com)
- Redesigning Etsy’s Machine Learning Platform (etsy.com)
- Evolving Reddit’s ML Model Deployment and Serving Architecture (reddit.com)
- LyftLearn: ML Model Training Infrastructure built on Kubernetes (lyft.com)
- Introducing Flyte: Cloud Native Machine Learning and Data Processing Platform (lyft.com)
- Meet Michelangelo: Uber’s Machine Learning Platform (uber.com)
- ML Education at Uber: Frameworks Inspired by Engineering Principles (uber.com)
- Automatic Retraining for Machine Learning Models: Tips and Lessons Learned (nubank.com.br)
- Best Practices for Real-time Machine Learning: Alerting (nubank.com.br)
- Maintaining Machine Learning Model Accuracy Through Monitoring (doordash.engineering)
- Tuning Model Performance (uber.com)
- Continuous Integration and Deployment for Machine Learning Online Serving and Models (uber.com)
- Challenges in Deploying Machine Learning: a Survey of Case Studies (arxiv.org)
- 150 Successful Machine Learning Models: 6 Lessons Learned at Booking.com (booking.ai)
- On Challenges in Machine Learning Model Management (computer.org)
- Rules of Machine Learning: Best Practices for ML Engineering (google.com)
- Paper (nips.cc)
- Paper (arxiv.org)
- Practical Recommendations for Gradient-Based Training of Deep Architectures (arxiv.org)
- AlphaGo (oreil.ly)
- Scikit-Learn (oreil.ly)
- k-nearest neighbors (oreil.ly)
- KNeighborsRegressor (oreil.ly)
- Segment Anything Model (SAM) (segment-anything.com)
- YOKOT.AI (yokot.ai)
- Doctor of Engineering in A.I. & Machine Learning (gwu.edu)