What is a Confusion Matrix?

Skill-Lync

Classification algorithms generate one of two kinds of output. In one, the output is a class; in the other, the output is a probability.

Let us consider an example to understand this better. Suppose we create an algorithm to predict whether an item belongs to class A or class B. Algorithms such as K-Nearest Neighbors (KNN) or Support Vector Machine (SVM) typically produce a class as output, while Random Forest, Gradient Boosting and AdaBoost give a probability.

In the former set of algorithms, the model predicts whether an item belongs to class A or class B, while in the latter set, the model gives the probability that the item belongs to class A or class B. Each output in this case has two probability values, one for the item belonging to class A and one for it belonging to class B.

The advantage of probability-based output is that it lets the user set a threshold on which the decision about the class is based.

To understand its power, let us first understand a bit more about the confusion matrix.

A confusion matrix is a 2x2 table in the case of a two-class problem. It gives the details of correct classifications and misclassifications.

Suppose we have two classes, A and B, and the labels are given to us. We construct a model and use it to make some predictions. We then compare our predictions with the actual labels.

Four scenarios are possible, with class A treated as the positive class. In the first, the actual class is A and our model predicts A. We call this a True Positive.

In the second case, the actual class is A but our model has classified the item as B (not A). This is called a False Negative.

In the third case, the actual class is B but our model has classified the item as A. This is called a False Positive.

Finally, we have the scenario where the item belongs to B and our model classifies it as B. This is called a True Negative.

These four values are used to construct the confusion matrix.

TP: True Positive

FP: False Positive

TN: True Negative

FN: False Negative
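The four counts can be tallied directly from the actual and predicted labels. The following is a minimal sketch, assuming class A is the positive class; the function name `confusion_counts` and the sample labels are made up for illustration.

```python
def confusion_counts(actual, predicted, positive="A"):
    """Count TP, FP, TN, FN, treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # actual A, predicted A
        elif a == positive:
            fn += 1  # actual A, predicted B
        elif p == positive:
            fp += 1  # actual B, predicted A
        else:
            tn += 1  # actual B, predicted B
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Hypothetical labels, chosen only for illustration.
actual = ["A", "A", "A", "B", "B", "B", "B", "A"]
predicted = ["A", "B", "A", "B", "A", "B", "B", "A"]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```

Arranged as a 2x2 table, TP and FN fill the row for actual class A, and FP and TN fill the row for actual class B.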

Classification models that give a probability as output require a threshold from the user to make a judgement.

For instance, suppose a model has given the probability values [0.4, 0.6] for an item I.

This means that item I belongs to A with a probability of 0.4 (call it p_a) and belongs to B with a probability of 0.6 (call it p_b). These values are only an example; they can be anything.

Here we can set a threshold, say p = 0.7 (again, it can be any value). The model then compares the first probability value with the threshold p.

In this case, 0.4 is compared with p = 0.7.

Let us assume the rule is: if p_a > p, assign A. Since 0.4 is less than 0.7, the item is assigned to class B by default. If p were 0.3, the same item would have been assigned to A.

This way the model can be tuned to give varied output.
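The rule described above can be sketched in a few lines. The function name `assign_class` is hypothetical; the probabilities and thresholds are the ones used in the example.

```python
def assign_class(p_a, threshold):
    """Assign class A when p_a exceeds the threshold, otherwise class B."""
    return "A" if p_a > threshold else "B"


probabilities = [0.4, 0.6]  # [p_a, p_b] for item I
p_a = probabilities[0]

print(assign_class(p_a, 0.7))  # B, because 0.4 is not greater than 0.7
print(assign_class(p_a, 0.3))  # A, because 0.4 is greater than 0.3
```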

For instance, if we require that the items labeled positive are almost all true positives, we set p higher. Similarly, if we want the items labeled negative to be almost all true negatives, we set p lower.
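To see how the threshold shifts the balance between the four counts, the sketch below applies three thresholds to the same set of items. The scores and labels are hypothetical, and the rule is assumed to be "assign A when p_a > p".

```python
def counts_at(scores, actual, threshold):
    """Return (TP, FP, TN, FN) when class A is assigned for p_a > threshold."""
    tp = fp = tn = fn = 0
    for p_a, label in zip(scores, actual):
        predicted = "A" if p_a > threshold else "B"
        if predicted == "A" and label == "A":
            tp += 1
        elif predicted == "A":
            fp += 1
        elif label == "B":
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Hypothetical p_a values and true labels, for illustration only.
scores = [0.95, 0.85, 0.70, 0.55, 0.45, 0.30, 0.20, 0.10]
actual = ["A", "A", "B", "A", "B", "A", "B", "B"]

for p in (0.2, 0.5, 0.8):
    print(p, counts_at(scores, actual, p))
# 0.2 (4, 2, 2, 0)
# 0.5 (3, 1, 3, 1)
# 0.8 (2, 0, 4, 2)
```

On this sample, the highest threshold leaves no false positives but misses two positives, while the lowest threshold misses nothing but lets in two false positives.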

The following metrics suit scenarios where our priorities differ.

False Positive Rate (FPR) = FP/(FP +TN).

In some problems, our main criterion is to reduce the false positive rate, because the cost of dealing with false positives is high.

For example, imagine the government giving a package of 1 Cr (10 million) to every COVID-positive patient. In this scenario, every false positive is a big loss to the government. To reduce such cases, the threshold is raised so that we are more certain about what is labeled positive.

False Negative Rate (FNR), or miss rate = FN/(TP+FN)

This is the fraction of actual positives that the model misses; it equals 1 - sensitivity. It is the right measure when the cost of missing a true positive is high. For instance, a fraudulent transaction must be detected as reliably as possible, because missing it would result in tremendous loss.

True Negative Rate (TNR) = TN/(TN+FP)

This is also called the specificity. It is the fraction of actual negatives that the model labels as negative. For instance, during tests for COVID, those who test negative are allowed to move freely, while those who test positive quarantine themselves. A high TNR means that few truly negative people are quarantined unnecessarily.

True Positive Rate (TPR), or recall, or sensitivity = TP/(TP+FN)

This is an important metric when it is inexpensive to check everyone but expensive to miss even one positive case. This is also described as catching all thieves. For example, it is easy to check everyone who enters an airport, but letting even one dangerous person through can be disastrous. In such scenarios, TPR is used to evaluate the model.

Negative Predictive Value (NPV) = TN/(TN+FN)

This is important when you are dealing with medical data. Here you want to be certain about a negative prediction: when the model labels a person as negative for a disease, that person should truly be negative. If the model labels a sick person as negative and the treatment is missed, the person may become more ill or may lose their life.

Positive Predictive Value (PPV), or precision = TP/(TP+FP)

This is also important when you are dealing with medical data. Here you want to be certain about a positive prediction: when the model labels a person as positive for a disease, that person should truly be positive. If the model is wrong, we treat a healthy person and waste our resources.

False Discovery Rate (FDR) = FP/(FP+TP)

This metric is important when a false discovery is costly. For instance, a false alarm about a bomb threat leads to complete evacuation of the area, a full scan of the area and public panic. So we do not want many false discoveries.

Accuracy = (TP+TN)/(TP+TN+FP+FN)

We use this when all classes are equally important and the dataset is reasonably balanced.
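All of the rates above follow from the same four counts, so they can be computed together. This is a minimal sketch; the helper names `ratio` and `rates` and the sample counts are assumptions made for illustration.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def rates(tp, fp, tn, fn):
    """Compute the confusion-matrix rates from the four counts."""
    return {
        "FPR": ratio(fp, fp + tn),  # false positive rate
        "FNR": ratio(fn, tp + fn),  # false negative rate (miss rate)
        "TNR": ratio(tn, tn + fp),  # true negative rate (specificity)
        "TPR": ratio(tp, tp + fn),  # true positive rate (recall, sensitivity)
        "NPV": ratio(tn, tn + fn),  # TN/(TN+FN), reliability of negative predictions
        "PPV": ratio(tp, tp + fp),  # TP/(TP+FP), precision
        "FDR": ratio(fp, fp + tp),  # false discovery rate
        "Accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


# Hypothetical counts, chosen only for illustration.
metrics = rates(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that FPR + TNR = 1, FNR + TPR = 1 and FDR + PPV = 1, so each pair carries the same information from opposite sides.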

Apart from this we also have the following metrics, which are useful for checking how good our model is.

Cohen's kappa: an alternative to accuracy that works well with imbalanced datasets. An imbalanced dataset is one where one class occurs far more often than the other. For instance, suppose we create a model to predict whether a given soldier is male or female. There are far more male than female soldiers, so data covering all military personnel is imbalanced. In a balanced dataset, each class has roughly the same number of items.
Matthews correlation coefficient: for imbalanced datasets
ROC curve: for balanced datasets
Precision-recall curve
F-beta score: the weighted harmonic mean of precision and recall, where beta sets the weight given to recall
ROC-AUC: the area under the ROC curve, which plots sensitivity (TPR) against 1 - specificity (FPR) for various values of the threshold probability
PR-AUC score: the area under the precision-recall curve, which plots precision against recall for various values of the threshold probability; suited to imbalanced datasets
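Three of the metrics in this list can be written directly in terms of the four confusion-matrix counts. The sketch below uses the standard two-class formulas; the function names and the sample counts are hypothetical.

```python
import math


def cohen_kappa(tp, fp, tn, fn):
    """Agreement between predictions and labels, corrected for chance."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals of the matrix.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 if a margin is empty."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


def f_beta(tp, fp, fn, beta=1.0):
    """Weighted harmonic mean of precision and recall."""
    b2 = beta * beta
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)


# Hypothetical counts, chosen only for illustration.
tp, fp, tn, fn = 40, 10, 45, 5
print(f"kappa: {cohen_kappa(tp, fp, tn, fn):.3f}")
print(f"MCC:   {mcc(tp, fp, tn, fn):.3f}")
print(f"F1:    {f_beta(tp, fp, fn):.3f}")
print(f"F2:    {f_beta(tp, fp, fn, beta=2.0):.3f}")
```

A beta above 1 weights recall more heavily than precision, and a beta below 1 does the opposite.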

Author

Navin Baskar
