Logistic Regression

Machine Learning Algorithms from Scratch

Rithuraj Nambiar
Analytics Vidhya

--

“I believe that we do not know anything for certain, but everything probably.”
Christiaan Huygens, Oeuvres Complètes

We had our time with one of the basic regression algorithms; now it's time to see if we can get a seat at the classification table. Classification algorithms classify the samples in a dataset into different classes. To take an example, separating our garbage into wet garbage and dry garbage would be a classic classification problem.

The main task of any classification algorithm is to predict which sample belongs to which category (in our case, which garbage goes where). There are tonnes of classification algorithms out there, so let us start with a basic one, Logistic Regression. Logistic Regression is a binary classifier, meaning it states its prediction as 0 or 1, i.e. false or true. It calculates the probability of the target class with the help of the feature variables. Probability sits at its foundation, and as you guessed, it is one of the many topics borrowed from probability theory.

Enough small talk, let us get to the bottom of it!

What is Logistic Regression?

Logistic Regression is also known as Logit Regression. It is commonly used to estimate the probability that an instance belongs to a particular class. Let's take an example to understand this better: if you have 10 e-mails in your inbox, how will the mail system classify them into spam or not spam? It would first scan the mails, detect words commonly used in spam, and then sort the ten emails into the spam and non-spam folders; quite probably, the mail system is using Logistic Regression. What this algorithm basically does is: if the estimated probability is above a threshold, say 60%, it predicts that the instance belongs to one class (say the positive class, labeled 1); otherwise it classifies the instance as the other class (say the negative class, labeled 0). We can say that Logistic Regression is a binary classifier, as it labels the dataset with 0's and 1's. Normally, Logistic Regression is used for two-class problems.

Logistic Regression uses a sigmoid function to predict the probability of a particular class.

Sigmoid Function

As logistic regression is borrowed from probability, the sigmoid function returns a value in the range of 0 to 1. The usual threshold is 0.5: if the value is greater than or equal to 0.5, it is a positive case, else a negative case. This might all sound like gibberish for now, so let us get hands-on and have a closer, better look at the algorithm.

Implementation

We will start our project by importing the required libraries, here we are using:

  • Pandas: for data handling and manipulation
  • NumPy: for mathematical operations
  • Matplotlib: for data visualization
  • load_iris: to load the pre-existing Iris dataset from scikit-learn
  • train_test_split: for splitting the data into training and test sets
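A minimal sketch of how these imports might look (the exact import style is an assumption):

```python
import pandas as pd                               # data handling and manipulation
import numpy as np                                # mathematical operations
import matplotlib.pyplot as plt                   # data visualization
from sklearn.datasets import load_iris            # the built-in Iris dataset
from sklearn.model_selection import train_test_split  # train/test splitting
```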

After importing the data, we'll see how it actually looks. As we are importing the data from sklearn, we don't need to format it; it is already in an ideal shape.

But since we are going to train the data with logistic regression, and logistic regression is a binary classifier, we will recode our data so that the target variable is binary.

There are four feature columns available to us, but we will use only two of them, Petal Length and Petal Width. Also, as mentioned before, we have to convert our target variable into binary data.

Thus, we are going to predict whether an iris flower belongs to the virginica species or not.

Then, we will convert our raw data into a data frame with three columns: Petal Length, Petal Width, and Virginica.
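One way this data frame might be built; the column names follow the text, while marking virginica via `iris.target == 2` is my assumption about the encoding:

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
# Keep only the last two feature columns: petal length and petal width
df = pd.DataFrame(iris.data[:, 2:4], columns=["Petal Length", "Petal Width"])
# Binary target: 1 if the flower is virginica (class label 2), else 0
df["Virginica"] = (iris.target == 2).astype(int)
```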

With the data frame ready, we will first separate the data into feature and target variables, and then use the train_test_split function to split it into training and test sets.
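A sketch of that split; the 80/20 ratio and the `random_state` value are my assumptions, not values from the text:

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
df = pd.DataFrame(iris.data[:, 2:4], columns=["Petal Length", "Petal Width"])
df["Virginica"] = (iris.target == 2).astype(int)

# Separate the features from the target, then hold out a test set
X = df[["Petal Length", "Petal Width"]].values
y = df["Virginica"].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # 80/20 split is an assumption
```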

So now our data is ready for being trained under logistic regression. Before moving ahead let us have a look at what actually is going to happen.

Sigmoid Function

We have talked about our sigmoid function and looked at its graph; in numeric form it is sigmoid(z) = 1 / (1 + e^(-z)). We won't be able to use the equation directly, so we will modify it a bit.
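In code, the sigmoid function might look like this:

```python
import math

def sigmoid(z):
    # Maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))
```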

Prediction for Logistic Regression

We will be using the above equation to make a prediction. The equation is almost the same; we are just adding in our weights (b0, b1, b2), so that z = b0 + b1*x1 + b2*x2. These weights help us make accurate predictions. They are first initialized to some value (say, all equal to 0), and then another equation is used to update them.
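A sketch of the prediction step with the weights plugged in; the function name and the two-feature row layout are assumptions:

```python
import math

def predict(row, weights):
    # weights = [b0, b1, b2]; row = [petal_length, petal_width]
    z = weights[0] + weights[1] * row[0] + weights[2] * row[1]
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid of the weighted sum
```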

For our first prediction, since we initialized all the weights to 0, whatever the data is, the prediction turns out to be 0.5.

So now, it's time to improve.

While updating the weights, we have to specify the learning rate; I have chosen 0.3. Then we just plug the other values into the update rule and thus obtain the new weights.
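One common form of this stochastic update nudges each weight by the prediction error scaled by the sigmoid's slope; the helper names here are mine, and this particular rule is an assumption about which update the text uses:

```python
import math

def predict(row, weights):
    z = weights[0] + weights[1] * row[0] + weights[2] * row[1]
    return 1.0 / (1.0 + math.exp(-z))

def update_weights(row, label, weights, lr=0.3):
    # One stochastic gradient step on a single training row
    yhat = predict(row, weights)
    grad = (label - yhat) * yhat * (1.0 - yhat)
    weights[0] += lr * grad            # bias b0 (its input is implicitly 1)
    weights[1] += lr * grad * row[0]   # b1, scaled by petal length
    weights[2] += lr * grad * row[1]   # b2, scaled by petal width
    return weights
```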

So now we know how to update the weights and how to make a prediction. Let us put all these pieces together and create a function for the same.

The above function first converts the data into a NumPy array, then loops for the given number of epochs; inside that loop, another loop iterates over the data, makes a prediction for each row, and keeps updating the weights. For our convenience, I have created an empty list for the weights, so that we can look later at the values the weights took on.
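The training function described above might be sketched like this, with a `history` list standing in for the author's list of weight snapshots:

```python
import math
import numpy as np

def predict(row, weights):
    # Sigmoid of the weighted sum b0 + b1*x1 + b2*x2
    z = weights[0] + weights[1] * row[0] + weights[2] * row[1]
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.3, epochs=35):
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    weights = [0.0, 0.0, 0.0]
    history = []                       # weight snapshots, one per epoch
    for _ in range(epochs):
        for row, label in zip(X, y):
            yhat = predict(row, weights)
            grad = (label - yhat) * yhat * (1.0 - yhat)
            weights[0] += lr * grad
            weights[1] += lr * grad * row[0]
            weights[2] += lr * grad * row[1]
        history.append(list(weights))
    return weights, history
```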

Once the weights after all the epochs are found, we need to make our predictions in order to see how good those values are.

The above function creates a list, pred_list, containing the predictions made using the learned weights.
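A sketch of such a prediction function; `pred_list` follows the text, the rest of the names are assumptions:

```python
import math

def predict_probs(X_test, weights):
    # Apply the learned weights to every test row
    pred_list = []
    for row in X_test:
        z = weights[0] + weights[1] * row[0] + weights[2] * row[1]
        pred_list.append(1.0 / (1.0 + math.exp(-z)))
    return pred_list
```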

The predictions made by the sigmoid function lie between 0 and 1, but we need the values to be exactly 0 or 1. So we will have another, crisp function that turns the values into our required format.

If the value is more than or equal to 0.5 then it would be a positive class or else it would be a negative class.
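The crisp function can be as short as:

```python
def crisp(pred_list, threshold=0.5):
    # Turn probabilities into hard 0/1 class labels
    return [1 if p >= threshold else 0 for p in pred_list]
```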

So now our predictions are ready, now we have to compare the predictions with the true value so as to know how the model performed. For that, we have another function.

This function would calculate the accuracy percentage of our model.
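A minimal accuracy function of this kind:

```python
def accuracy(actual, predicted):
    # Percentage of predictions that match the true labels
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual) * 100
```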

So we run all these functions one by one (in this case, I chose 35 epochs) and get our prediction score.
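Tying the pieces together end to end might look like the sketch below; the split ratio, random seed, and the exact accuracy you get are assumptions and run-dependent, so your score may differ from the 82% reported in the text:

```python
import math
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

def predict(row, weights):
    z = weights[0] + weights[1] * row[0] + weights[2] * row[1]
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.3, epochs=35):
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for row, label in zip(X, y):
            yhat = predict(row, weights)
            grad = (label - yhat) * yhat * (1.0 - yhat)
            weights[0] += lr * grad
            weights[1] += lr * grad * row[0]
            weights[2] += lr * grad * row[1]
    return weights

iris = load_iris()
X = iris.data[:, 2:4]                  # petal length, petal width
y = (iris.target == 2).astype(int)     # 1 = virginica
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

weights = train(X_train, y_train)
preds = [1 if predict(row, weights) >= 0.5 else 0 for row in X_test]
acc = sum(p == a for p, a in zip(preds, y_test)) / len(y_test) * 100
print(f"Accuracy: {acc:.0f}%")
```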

Woohoo! We predicted the class of the flower with an accuracy of 82%. That is a great thing!

Conclusion

Yay! We did it: we used stochastic gradient descent for Logistic Regression to predict the class of an iris flower using petal length and width. To summarize, we split our data into train and test sets, used the sigmoid function to calculate the predictions, and then updated our weights until we got better predictions.

You can also have a look at the GitHub Repository that contains the code and data that I have used in the blog.

Thanks a lot for reading till the end!

If you require any further information, please do not hesitate to contact me.

Portfolio : rithurajnambiar17.github.io

LinkedIn: https://www.linkedin.com/in/rithuraj-nambiar/

GitHub: https://www.github.com/rithurajnambiar17/

E-Mail: rithurajnambiar17@gmail.com
