Posted 2021-02-09Updated 2021-02-09Data Science

Logistic Regression in 20 Lines of Python

Logistic regression is a classification algorithm widely used in industry. Its structure is simple, with the following main advantages and disadvantages:

pros

Training and inference are both fast
Easy to implement
Low memory usage
Good interpretability

cons

Because it cannot fit nonlinear relationships, it places higher demands on feature engineering.
It is relatively sensitive to multicollinearity

Structurally, the only difference between logistic regression and linear regression is the addition of the sigmoid function, also called the logistic function, which converts the output into a normalized value.

Linear regression

$$ Y=X \cdot W+B $$

Logistic regression

$$ Y=\sigma (X \cdot W+B) $$

where

$$ \sigma(t)=\frac{1}{1+e^{-t}} $$

is the sigmoid function

$$ \hat y=\left\{\begin{array}{ll} 0, & z<0 \\ 0.5, & z="0" 1,>0 \end{array}, \quad z=w^{T} x+b\right. $$

The sigmoid function is introduced because the output of linear regression does not lie between 0 and 1, nor can it fit discrete variables. An ideal classification function is not differentiable, while the logit function is a convex function differentiable to arbitrary order and has good mathematical properties. In addition, the sigmoid function can represent likelihood continuously, although its output is not a “probability” in the strict mathematical sense.

Once the model is fixed, the next step is to estimate the parameters by maximum likelihood.
Under a Bernoulli distribution, maximum likelihood estimation leads to the cross-entropy loss function.
Assume

$$ \begin{array}{l} P(Y=1 \mid x)=p(x) \\ P(Y=0 \mid x)=1-p(x) \end{array} $$

The probability density function of the Bernoulli distribution is

$$ f_{X}(x)=p^{x}(1-p)^{1-x}=\left\{\begin{array}{ll} p & \text { if } x=1 \\ 1-p & \text { if } x=0 \end{array}\right. $$

The likelihood function is

$$ L(w)=\prod\left[p\left(x_{i}\right)\right]^{y_{i}}\left[1-p\left(x_{i}\right)\right]^{1-y_{i}} $$

Rewriting it in logarithmic form for easier computation, we find that it is exactly the same as cross-entropy:

$$ \ln L(w)=\sum\left[y_{i} \ln p\left(x_{i}\right)+\left(1-y_{i}\right) \ln \left(1-p\left(x_{i}\right)\right)\right] $$

Its negative can be used as the loss function, so maximizing the likelihood is equivalent to minimizing the loss:

$$ J(w)=-\frac{1}{N} \ln L(w) $$

This article uses gradient descent to search for the optimal parameters.
The derivation of the partial derivative of the loss with respect to the weights is omitted here:

$$ \frac{\partial J(w)}{\partial w_{i}}=\left(p\left(x_{i}\right)-y_{i}\right) x_{i} $$

The detailed derivation can be found in [1].

Update the parameters:

$$ w_{i}^{k+1}=w_{i}^{k}-\alpha \frac{\partial J(w)}{\partial w_{i}} $$

The math is done. The following is the implementation.
The complete Python code needed to define a usable logistic regression model is under 20 lines:

import numpy as np

# utility function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

class LogisticsReg():
    def __init__(self,learn_rate=1e-3,niter=1000):
        # set the learning rate and the number of iterations
        self.lr=learn_rate
        self.iter=int(niter)

    def fit(self,x,y):
        # insert a column of 1s; after multiplying by \\(w_0\\) it acts as the bias term b
        # this greatly simplifies both the code and the derivation
        x=np.insert(x, 0, values=1, axis=1)

        # randomly initialize w
        w=np.random.uniform(size=(x.shape[1]))

        for i in range(self.iter):
            # compute the model prediction
            p = sigmoid(x.dot(w))
            # gradient descent
            w -= self.lr * x.T.dot(p - y)
            
        self.w=w

    def predict(self,x):
        # a column must also be inserted at prediction time
        x=np.insert(x, 0, values=1, axis=1)
        return sigmoid(x.dot(self.w))

References
[1] https://blog.csdn.net/jasonzzj/article/details/52017438

Logistic Regression in 20 Lines of Python

https://en.heth.ink/LogisticRegression/

Author

Posted on

2021-02-09

Updated on

2021-02-09

Logistic Regression in 20 Lines of Python

Author

Posted on

Updated on

Licensed under

Categories

Recents