Linear Fit using Gradient Descent with Numpy

Zora Hirbodvash
3 min readFeb 17, 2021

Here, we apply gradient descent to line fitting, one of the simplest models; fitting a line is just one specific application of the general gradient descent method.

We start by explaining what gradient descent is. Consider a function f(x) defined over a vector space, which we can think of as a potential field. We aim to minimize (or maximize) this function over the vector space. The idea of gradient descent for minimization is to take very small repeated steps in the direction opposite to the gradient; if we travel along the direction of the gradient, we get a higher value of f(x). As a result, we keep updating the position x as follows:

x_(n+1)=x_n-α∇f(x_n)

The new value of x is built from the old value of x by subtracting the step size multiplied by the gradient evaluated at the old x. Keep in mind that we cannot choose the step to be too large, since the gradient at each point is only a local indicator of how f(x) changes. Hence, we take the step (alpha) to be a very small value. Alpha is also known as the learning rate in ML.
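
To make the update rule concrete before turning to line fitting, here is a minimal sketch of it on a toy one-dimensional function, f(x) = (x - 3)², whose gradient is 2(x - 3); the function, starting point, and step size are illustrative choices, not part of the line-fitting example.

def grad_f(x):
    # gradient of the toy function f(x) = (x - 3)**2
    return 2 * (x - 3)

x = 10.0       # arbitrary initial guess
alpha = 0.1    # small step size (learning rate)
for n in range(50):
    x = x - alpha * grad_f(x)   # x_(n+1) = x_n - alpha * grad f(x_n)

print(x)  # approaches the minimizer x = 3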

First, we start by importing libraries:

import numpy as np
import matplotlib.pyplot as plt

Then, we generate some random data and visualize it with matplotlib:

xdata=np.linspace(1,10,100)
ydata=xdata*5+np.random.randn(100)*5+10
plt.scatter(xdata,ydata)

We are trying to fit the data with a regression model. We can speculate that a straight line (a linear regression model) is appropriate, so our model is y = m*x + b.

Choosing m and b well is the essential task here. We can collect m and b in a vector:

c = (m, b)

We apply the linear model and assess the quality of a parameter choice by looking at the error incurred by the model's predictions.

def f(x, c):
    return c[0]*x + c[1]

y_pred=f(xdata,[12,-17])

plt.plot(xdata, ydata, 'b.')
plt.plot(xdata, y_pred, 'r--')

def error(xdata, ydata, c):
    y_pred = f(xdata, c)
    return y_pred - ydata

c=[14,-10]
e=error(xdata,ydata,c)
w = (xdata[1] - xdata[0])/3
plt.bar(xdata,e,width=w)

Next, we need a loss function that assesses the quality of the model parameter c. The loss function is:

L(c) = Σ_i (f(x_i, c) - y_i)²

Its partial derivatives with respect to m and b are

∂L/∂m = 2 Σ_i e_i x_i
∂L/∂b = 2 Σ_i e_i

where e_i = f(x_i, c) - y_i is the error of the i-th prediction.

Let's implement the loss function and its gradient with respect to m and b:

def Loss(c):
    e = error(xdata, ydata, c)
    L = np.sum(e**2)
    return L

def Loss_grad(c):
    e = error(xdata, ydata, c)
    grad_m = np.sum(2*e*xdata)
    grad_b = np.sum(2*e)
    return np.array([grad_m, grad_b])
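
As an optional sanity check (an addition to the original walkthrough, not required for the fit), we can compare the analytic gradient with a simple central finite-difference approximation; the helper numerical_grad below and the test point [14, -10] are just illustrative choices.

def numerical_grad(c, h=1e-6):
    # central finite-difference approximation of the gradient of Loss
    grads = []
    for i in range(len(c)):
        cp, cm = list(c), list(c)
        cp[i] += h
        cm[i] -= h
        grads.append((Loss(cp) - Loss(cm)) / (2*h))
    return np.array(grads)

print(Loss_grad([14, -10]))       # analytic gradient
print(numerical_grad([14, -10]))  # should agree closely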

Finally, we perform the gradient descent optimization. We start with an initial guess for the model parameters (c0), a step size (alpha), and a total number of iterations (n).

def gradientDescent(c0, alpha, n):
    c = np.array(c0)
    for i in range(n):
        c = c - alpha * Loss_grad(c)
        L = Loss(c)
        print(i, c, L)
    return c

C=gradientDescent([9,-14],1e-8,3000)

plt.plot(xdata, ydata, 'b.')
y_pred = f(xdata, C)
plt.plot(xdata, y_pred, 'r--')
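
Since the data were generated around y = 5x + 10, a good fit should end up with m near 5 and b near 10. As a final check (this comparison is an addition to the original walkthrough), we can compare the gradient descent result with NumPy's closed-form least-squares fit; with a very small learning rate and a limited number of iterations, the two may still differ noticeably.

print(C)                            # parameters found by gradient descent
print(np.polyfit(xdata, ydata, 1))  # closed-form least-squares [m, b] for comparison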
