A confusion matrix is a table used to evaluate the performance of a classification model by comparing its predictions with the actual outcomes.
It's a simple yet powerful tool that helps you understand how well your model is doing.
A typical confusion matrix displays four key metrics: true positives, false positives, true negatives, and false negatives.
These metrics are essential for assessing the accuracy and reliability of your model.
Imagine you're building a model to predict whether someone will buy a product based on their browsing history. The true positives would be the people who actually bought the product and were correctly predicted to do so.
What is a Confusion Matrix?
A confusion matrix summarizes the performance of a machine learning model on a set of test data. It's a way to display the number of correct and incorrect predictions the model made.
The confusion matrix is often used to measure the performance of classification models, which aim to predict a categorical label for each input instance. This is especially useful when you're trying to understand how well your model is working.
Here's a breakdown of the different types of instances you'll see in a confusion matrix:
- True Positive (TP): the model correctly predicts a positive outcome, and the actual outcome was indeed positive.
- False Positive (FP): the model incorrectly predicts a positive outcome when the actual outcome was negative. This is also known as a Type I error.
- True Negative (TN): the model correctly predicts a negative outcome, and the actual outcome was indeed negative.
- False Negative (FN): the model incorrectly predicts a negative outcome when the actual outcome was positive. This is also known as a Type II error.
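To make these definitions concrete, here's a minimal sketch that counts the four outcome types for a small set of binary labels (the label lists are made up for illustration):

```python
# Made-up actual and predicted binary labels (1 = positive, 0 = negative).
actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

# Count each outcome type by comparing actual and predicted labels pairwise.
tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # correct positives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # Type I errors
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # correct negatives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # Type II errors

print(tp, fp, tn, fn)  # 3 1 3 1
```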
Creating a Confusion Matrix
Creating a confusion matrix is a crucial step in evaluating the performance of a classification model. You can use the confusion_matrix function from sklearn.metrics to compute it.
To create a confusion matrix, you'll need to import the necessary libraries, including scikit-learn and matplotlib. For a binary classification problem, the confusion matrix is a table with two rows and two columns that reports the number of true positives, false negatives, false positives, and true negatives.
Here's a breakdown of what each cell in the matrix returned by scikit-learn represents (rows are actual classes, columns are predicted classes, with the negative class 0 listed first):
- cm[0][0] = TN (True Negatives)
- cm[0][1] = FP (False Positives)
- cm[1][0] = FN (False Negatives)
- cm[1][1] = TP (True Positives)
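A minimal sketch of this computation, using made-up labels, might look like the following (for binary labels 0 and 1, cm.ravel() returns the cells in TN, FP, FN, TP order):

```python
from sklearn.metrics import confusion_matrix

# Made-up actual and predicted labels for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[3 1]
#  [1 3]]

# Unpack the four cells: TN, FP, FN, TP with the default label order [0, 1].
tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)  # 3 1 1 3
```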
You can visualize the confusion matrix with ConfusionMatrixDisplay from sklearn.metrics (the older plot_confusion_matrix helper has been removed from recent versions of scikit-learn). A plot will help you understand the performance of your model and identify areas for improvement.
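For example, assuming the y_true and y_pred lists from the previous sketch, a plot can be drawn directly from the label arrays:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Same made-up labels as in the previous sketch.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# Build and draw the confusion matrix display from the label arrays.
ConfusionMatrixDisplay.from_predictions(y_true, y_pred)
plt.show()
```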
Understanding Confusion Matrix Metrics
Recall measures the effectiveness of a classification model in identifying all relevant instances in a dataset. It's calculated as the ratio of true positive instances to the sum of true positive and false negative instances: Recall = TP / (TP + FN).
Specificity, also known as the True Negative Rate, measures a model's ability to correctly identify negative instances. It's calculated as the ratio of true negative instances to the sum of true negative and false positive instances: Specificity = TN / (TN + FP).
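As a quick sketch, both metrics can be computed from the four cell counts (the counts below are invented for illustration):

```python
# Invented cell counts for illustration.
tp, fn = 5, 1
tn, fp = 3, 1

recall = tp / (tp + fn)       # 5 / 6 ≈ 0.8333
specificity = tn / (tn + fp)  # 3 / 4 = 0.75

print(round(recall, 4), specificity)  # 0.8333 0.75
```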
Accuracy
Accuracy is a measure of how well a model performs by comparing the total correct instances to the total instances. It's a simple yet effective way to gauge a model's overall performance.
Accuracy is calculated by dividing the sum of true positives and true negatives by the total number of instances. This includes both correct and incorrect predictions.
The formula for accuracy is (TP + TN) / (TP + TN + FP + FN), where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives. For example, in a scenario with 5 true positives, 3 true negatives, 1 false positive, and 1 false negative, the accuracy would be 8/10 = 0.8.
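Here's that arithmetic as a small sketch:

```python
# Counts from the worked example above.
tp, tn, fp, fn = 5, 3, 1, 1

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 8 / 10 = 0.8
```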
Recall
Recall measures the effectiveness of a classification model in identifying all relevant instances from a dataset. It's the ratio of true positive instances to the sum of true positive and false negative instances. Recall is a crucial metric, especially in medical diagnoses, where identifying all actual positive cases is critical, even if it results in some false positives.
The formula for Recall is TP / (TP + FN), where TP is the number of true positive instances and FN is the number of false negative instances. For example, if a model has 5 true positive instances and 1 false negative instance, its Recall would be 5 / (5 + 1) = 0.8333.
Types of Classification
In multi-class classification, you have more than two possible classes for your model to predict. The confusion matrix expands to accommodate these additional classes.
The confusion matrix has two main components: rows and columns. Rows represent the actual classes (ground truth) in your dataset, while columns represent the predicted classes by your model.
Each cell within the matrix shows the count of instances where the model predicted a particular class when the actual class was another.
A 3X3 confusion matrix is a common example, arising when the model distinguishes between three classes, so each actual class can be predicted as any of the three.
In a multi-class classification scenario, the precision, recall, and f1-score for each class are often displayed alongside the confusion matrix.
Multi-Class Classification with Python
A 3X3 confusion matrix for an image classifier with three classes (cat, dog, and horse) works the same way: each cell shows the count of instances where the model predicted a particular class (column) for a given actual class (row).
To calculate true negatives for a given class, we count the images that did not belong to that class and were not predicted as it. For simplicity, assume there were 10 images that were not cats, dogs, or horses, and the model correctly classified all of them as "not cat," "not dog," and "not horse."
True Negative (TN) count: 10 for each class, since the model correctly identified each of these images as not belonging to that class.
Here are the definitions of the metrics used in the classification report:
- True Positive (TP): The image was of a particular animal (cat, dog, or horse), and the model correctly predicted that animal.
- True Negative (TN): The image was not of a particular animal and the model correctly predicted it as not that animal.
- False Positive (FP): The image was not of a particular animal, but the model incorrectly predicted it as that animal.
- False Negative (FN): The image was of a particular animal, but the model incorrectly predicted it as a different animal.
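To show how these per-class counts fall out of a multi-class matrix, here's a small sketch that derives them from an invented 3X3 confusion matrix (all counts are made up for illustration):

```python
import numpy as np

# Rows are actual classes, columns are predicted classes; counts are invented.
classes = ["cat", "dog", "horse"]
cm = np.array([
    [8, 1, 1],   # actual cat
    [0, 9, 1],   # actual dog
    [1, 0, 9],   # actual horse
])

for i, name in enumerate(classes):
    tp = cm[i, i]                 # predicted as this class and actually this class
    fp = cm[:, i].sum() - tp      # predicted as this class but actually another
    fn = cm[i, :].sum() - tp      # actually this class but predicted as another
    tn = cm.sum() - tp - fp - fn  # everything else
    print(f"{name}: TP={tp}, FP={fp}, FN={fn}, TN={tn}")
```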
To create a confusion matrix with Python in Scikit-learn, you need to follow these steps:
1. Run a classification algorithm on your training data.
2. Import metrics from the sklearn module.
3. Run the confusion matrix function on actual and predicted values.
4. Plot the confusion matrix.
5. Inspect the classification report.
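Assuming a synthetic dataset and a logistic regression classifier (both chosen here purely for illustration), the five steps might look like this sketch:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

# 1. Run a classification algorithm on your training data.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = LogisticRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# 2-3. Import metrics and run the confusion matrix function on actual and predicted values.
cm = metrics.confusion_matrix(y_test, y_pred)
print(cm)

# 4. Plot the confusion matrix.
metrics.ConfusionMatrixDisplay(confusion_matrix=cm).plot()
plt.show()

# 5. Inspect the classification report (per-class precision, recall, and f1-score).
print(metrics.classification_report(y_test, y_pred))
```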
Confusion Matrix Parameters
The parameters of a confusion matrix are crucial to its interpretation, and Scikit-learn's confusion_matrix() function provides several options to customize its display.
The first parameter is y_true, which represents the ground truth target values. This is an array-like of shape (n_samples,).
The y_pred parameter is another important one, as it contains the estimated targets as returned by a classifier. It too is an array-like of shape (n_samples,).
You can also specify a list of labels to index the matrix using the labels parameter. This can be used to reorder or select a subset of labels. If None is given, those that appear at least once in y_true or y_pred are used in sorted order. This parameter is an array-like of shape (n_classes), and defaults to None.
If you need to assign different weights to each sample, you can use the sample_weight parameter. This is an array-like of shape (n_samples,), and defaults to None.
Finally, you can choose to normalize the confusion matrix using the normalize parameter. This can be set to 'true', 'pred', or 'all' to normalize over the true rows, predicted columns, or all the population respectively. If None is given, the matrix will not be normalized.
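As a short sketch of these parameters, assuming a few made-up string labels:

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration.
y_true = ["cat", "dog", "cat", "horse", "dog", "cat"]
y_pred = ["cat", "cat", "cat", "horse", "dog", "dog"]

# labels fixes the row/column order; normalize='true' divides each row
# (the actual classes) so cells become per-class rates rather than counts.
cm = confusion_matrix(
    y_true,
    y_pred,
    labels=["cat", "dog", "horse"],
    normalize="true",
)
print(cm)
```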
Example and Explanation
A confusion matrix is a table used to evaluate the performance of a classification model, and it's a crucial tool for data scientists and analysts.
The matrix displays the number of true positives, true negatives, false positives, and false negatives, which are all essential metrics for assessing model accuracy.
A true positive is a correct prediction, where the model correctly identifies a positive class, as seen in the example where 85 out of 100 actual positive instances were correctly predicted.
A false positive is a wrong prediction, where the model incorrectly identifies a negative class as positive, such as the 15 instances where actual negatives were incorrectly predicted as positives.
The precision of a model is calculated by dividing the number of true positives by the sum of true positives and false positives, resulting in a precision of 0.85 in our example.
The recall of a model is calculated by dividing the number of true positives by the sum of true positives and false negatives, resulting in a recall of 0.85 in our example.
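The same arithmetic, written out as a small sketch:

```python
# Counts from the example: 85 true positives, 15 false positives, 15 false negatives.
tp, fp, fn = 85, 15, 15

precision = tp / (tp + fp)  # 85 / 100
recall = tp / (tp + fn)     # 85 / 100
print(precision, recall)    # 0.85 0.85
```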
A high precision model is one whose positive predictions are mostly correct, while a high recall model is one that correctly identifies most of the actual positive instances.
Frequently Asked Questions
What are the colors of confusion matrix display?
The colors of a confusion matrix display depend on the colormap used to draw it. Scikit-learn's ConfusionMatrixDisplay defaults to matplotlib's 'viridis' colormap, but you can pass any other colormap (for example, a yellow/orange/red scale) to better visualize your data.
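For example, assuming the made-up labels from the earlier sketches, you can pass any matplotlib colormap name (here 'Blues', chosen arbitrarily) via the cmap argument:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# cmap accepts any matplotlib colormap name; 'Blues' is used here as an example.
ConfusionMatrixDisplay.from_predictions(y_true, y_pred, cmap="Blues")
plt.show()
```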
What are the 4 values in a confusion matrix?
A confusion matrix displays four key values: true positives, true negatives, false positives, and false negatives. These values help analyze model performance and identify misclassifications.