What is a Confusion Matrix?
A confusion matrix is a performance-measurement technique for machine learning classification. It summarizes the prediction results of a classifier on a set of test data for which the true values are known, measuring the classifier's performance in depth.
It gives you insight not only into the number of errors being made by your classifier but, more importantly, into the types of errors that are being made.
Outcomes of Confusion Matrix
The four different combinations from the predicted and actual values of a classifier are:
- True Positive (TP): You predicted a positive value, and it is actually positive.
- False Positive (FP): You predicted a positive value, but it is actually negative.
- False Negative (FN): You predicted a negative value, but it is actually positive.
- True Negative (TN): You predicted a negative value, and it is actually negative.
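As a minimal sketch, the four outcomes can be counted directly from paired lists of actual and predicted labels. The labels and data below are chosen purely for illustration (1 = positive class, 0 = negative class):

```python
# Count TP, FP, FN, TN from paired actual/predicted labels.
# 1 = positive class, 0 = negative class (example data for illustration).
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 1, 1, 0, 0, 1, 1, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)

print(tp, fp, fn, tn)  # → 3 2 1 2
```

In practice a library routine such as scikit-learn's `confusion_matrix` would do this counting for you, but the logic is exactly these four comparisons.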
Example of Confusion Matrix
Suppose you predict the result of a cricket match between India and Australia.
- True Positive: When you predict a positive outcome and it turns out to be correct. Suppose you predict that India will win and it wins.
- True Negative: When you predict a negative outcome and it turns out to be correct. Suppose you predict that India will lose and it loses.
- False Positive: When you predict a positive outcome and it turns out to be wrong. Suppose you predict that India will win and it loses.
- False Negative: When you predict a negative outcome and it turns out to be wrong. Suppose you predict that India will lose but it wins.
Important Terms Derived from a Confusion Matrix
Precision: It measures how good our model is when the prediction is positive, i.e., how likely a positive prediction is to be correct. It is the ratio of true positives to all positive predictions: TP / (TP + FP). It is useful when a false positive is a greater concern than a false negative.
Recall: Precision alone is not very helpful because it ignores the positives the model misses. Recall measures how good our model is at finding the positive class. It is the ratio of correct positive predictions to all actual positives: TP / (TP + FN). It is useful when false negatives are costlier than false positives.
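The two formulas can be sketched as follows; the counts here are example values chosen for illustration, not from any real model:

```python
# Precision = TP / (TP + FP): of all positive predictions, how many were right.
# Recall    = TP / (TP + FN): of all actual positives, how many were found.
tp, fp, fn = 3, 2, 1  # example counts, chosen for illustration

precision = tp / (tp + fp)  # 3 / 5 = 0.6
recall = tp / (tp + fn)     # 3 / 4 = 0.75

print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Note how the two metrics pull in different directions: a model that predicts positive for everything has perfect recall but poor precision, and vice versa.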
How to Calculate a Confusion Matrix?
- You need a test dataset or a validation dataset with expected outcome values.
- Make a prediction for each row in the test dataset.
- From the expected outcomes and the predictions, count:
- The total correct predictions for each class.
- The total incorrect predictions for each class.
- Each row of the matrix corresponds to a predicted class.
- Each column of the matrix corresponds to an actual class.
- Enter the total counts of correct and incorrect classifications into the table.
- The count of correct predictions for a class goes into the cell where that class's predicted row meets its actual column (the diagonal).
- The count of incorrect predictions goes into the cell at the row of the predicted class and the column of the actual class.
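The steps above can be sketched in a few lines of Python. The class names and data are invented for illustration, and the orientation follows this article's convention (rows = predicted, columns = actual); note that some libraries, such as scikit-learn, use the opposite orientation (rows = actual), so always check your tool's convention before reading off cells:

```python
# Build a 2x2 confusion matrix following the steps above:
# rows = predicted class, columns = actual class (this article's convention).
from collections import Counter

classes = ["win", "lose"]  # illustrative class labels
actual    = ["win", "lose", "win", "win", "lose"]
predicted = ["win", "win", "lose", "win", "lose"]

# Count each (predicted, actual) pair, then lay the counts out as a table.
counts = Counter(zip(predicted, actual))
matrix = [[counts[(p, a)] for a in classes] for p in classes]

for p, row in zip(classes, matrix):
    print(p, row)
# → win [2, 1]
#   lose [1, 1]
```

The diagonal cells (predicted class equals actual class) hold the correct predictions; everything off the diagonal is an error.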
Benefits of Using a Confusion Matrix
- It provides insight not only into the number of errors made by a classifier but also into the types of errors being made.
- It reflects where a classification model gets confused while making predictions.
- It is the basis for useful evaluation metrics such as Recall, Precision, Accuracy, and the AUC-ROC curve.
- It helps overcome the limitations of relying on classification accuracy alone.
In short, a confusion matrix is a summary table of the number of correct and incorrect predictions made on a classification task. By visualizing it, you can judge the accuracy of the model from the diagonal values, which count the correct classifications.