## Introduction to Naive Bayes Classifier | by Priyanka Meena | Nov, 2020 | Medium

Naive Bayes is a term used collectively for classification algorithms that are based on Bayes Theorem. For the uninitiated, classification algorithms are algorithms used to categorize a new observation into predefined classes. For example, let’s assume that you are working as a data analyst with a major bank in London and you need to predict, based on historical data, whether a customer will default on a bank loan or not.

You must be wondering: there is “Bayes” in the name because the algorithm is based on Bayes Theorem, but why “Naive”? Is it because the algorithm is “naive” or “dumb”? No! The algorithm is not “dumb”; in fact, it works better than some very complicated algorithms at times. The algorithm is “Naive” because it works on the general assumption that the presence of a particular feature in a class is independent of, or completely unrelated to, the presence of any other feature in the same class. For example, a customer may default on a bank loan if he/she has a low credit score, low applicant_income, etc. Each of these features is assumed to contribute independently to the probability that the candidate will default; that is, the presence of one feature is not related to another.
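Stated a little more formally, the naive assumption can be sketched as follows. The notation is an assumption introduced here for illustration and is not from the original article: x_1 … x_n stand for the features and y for the class.

```latex
P(x_1, x_2, \dots, x_n \mid y) = \prod_{i=1}^{n} P(x_i \mid y)
```

For the loan example, this would read as P(low credit score, low income | default) = P(low credit score | default) * P(low income | default).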

Don’t be disheartened if some of the terms sound alien to you. The aim of this series of articles is to explain machine learning algorithms in the simplest possible way, so that by the end of this series you will be able to build your own machine learning models with great ease. So let’s proceed with this article on the Naive Bayes Classifier.

Bayes Theorem! What exactly is Bayes Theorem?

Bayes Theorem is a very popular mathematical formula used to determine the conditional probability of an event, based on prior knowledge of conditions that might be related to the event.

Wait! What exactly is conditional probability?

It is the likelihood of an outcome occurring, given that another event has already occurred. For example, two cards are drawn without replacement from a deck of 52 cards. What is the probability that the second card is an ace, given that the first card drawn was also an ace?

So, P(drawing the first ace) = Total no. of aces / Total no. of cards = 4/52

P(drawing the second ace | first card was an ace) = 3/51 (this is because after the first draw we are left with only 3 aces in the deck, and the total number of cards also reduces to 51). So, this is what conditional probability is all about: the probability of the second event depends on the occurrence of the first one.
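If you would like to check the card example by brute force, here is a minimal sketch in Python. It is only an illustration and not part of the original walkthrough: it models the deck as a list of booleans (an assumption made for brevity) and counts every possible ordered pair of draws.

```python
from fractions import Fraction
from itertools import permutations

# Model the deck as 52 cards: 4 aces (True) and 48 other cards (False).
deck = [True] * 4 + [False] * 48

# Every ordered pair of distinct card positions = two draws without replacement.
pairs = list(permutations(range(52), 2))

# Keep the pairs where the first card is an ace, then those where both are aces.
first_ace = [(i, j) for i, j in pairs if deck[i]]
both_aces = [(i, j) for i, j in first_ace if deck[j]]

p_first_ace = Fraction(len(first_ace), len(pairs))
p_second_given_first = Fraction(len(both_aces), len(first_ace))

print(p_first_ace)           # 1/13, the reduced form of 4/52
print(p_second_given_first)  # 1/17, the reduced form of 3/51
```

Counting outcomes this way is slow for large problems, but it makes the "given that" part concrete: we only look at the draws where the first card was already an ace.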

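Coming back to Bayes Theorem, it is mathematically given by the following formula:

P(A|B) = P(B|A) * P(A) / P(B)

Here P(A|B) is the posterior probability of A given B, P(B|A) is the likelihood of B given A, P(A) is the prior probability of A, and P(B) is the overall probability of B.

Now that we have a decent understanding of what Bayes Theorem is, let’s go ahead and understand its use in classification problems.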

## NAIVE BAYES INTUITION

Problem: Try to predict whether a candidate with a credit score of 180 will default on a loan or not. Consider the following frequency table for calculating the probability of default.

We can follow the steps below to calculate the probabilities. Since the candidate has a credit score of 180, let’s predict the label for the (100–200) band.

Step 1: Calculate the prior probability of the class (shown here for Yes) and the overall probability of the score band

P(Yes) = 11/29 = 0.379

P(100–200) = 10/29 = 0.345

Step 2: Find the likelihood of the attribute for the class

P(100–200 | Yes) = 6/11 = 0.545

Step 3: Calculate the posterior probability using Bayes Theorem

P(Yes|100–200) = P(100–200|Yes) * P(Yes)/P(100–200)

= 0.545*0.379/0.345 = 0.5987 ≈ 0.599 (working with the unrounded fractions gives exactly 6/10 = 0.6)

Step 4: Make prediction

Since the probability of the candidate defaulting is greater than 50% (this is the assumed decision threshold, and it can vary based on the use case), we can say that the candidate will default.
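The four steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than the article's own code: the function name `posterior` and its parameter names are hypothetical, and the only data used are the four counts quoted in the steps (29 candidates, 11 defaulters, 10 candidates in the 100–200 band, 6 of them defaulters).

```python
from fractions import Fraction


def posterior(n_total, n_class, n_feature, n_feature_and_class):
    """Return P(class | feature) from raw counts using Bayes Theorem."""
    prior = Fraction(n_class, n_total)                   # P(class)
    evidence = Fraction(n_feature, n_total)              # P(feature)
    likelihood = Fraction(n_feature_and_class, n_class)  # P(feature | class)
    return likelihood * prior / evidence


# Counts taken from the worked example above.
p_default = posterior(n_total=29, n_class=11, n_feature=10, n_feature_and_class=6)

# Assumed 50% cut-off, as in Step 4; it may differ by use case.
threshold = Fraction(1, 2)

print(float(p_default))       # 0.6
print(p_default > threshold)  # True, so we predict "will default"
```

Using exact fractions avoids the small rounding drift you get when each intermediate value is rounded to three decimals.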

## END TO END NAIVE BAYES CLASSIFIER

Having learned how a Naive Bayes classifier works, let’s try to build a classification model based on it using sklearn. Sklearn, or scikit-learn, is an open source machine learning library written in Python.

For the purpose of this article, we will be using the social_network_ads dataset. In this problem, we will try to predict whether a user has purchased a product by clicking on the advertisements shown to him/her on social media, based on age and estimated salary. So let’s get started.

Step 1: Import basic libraries