This post reproduces how the predictive distribution of Bayesian linear regression converges, following Chapter 3 of Pattern Recognition and Machine Learning. What we are doing is almost the same as the Bayesian estimation in Chapter 1; the only real difference is that the basis functions are the Gaussians of (3.4), so the code can largely be reused.
Compared to Chapter 1, Chapter 3 rewrites the formulas in vector form. Is there a particular intention behind this? Since the concept of the feature vector is introduced in (3.16), I suppose that is why it is written this way, but I am not sure.
(1) What we want to find is the predictive distribution (3.57), which takes the Gaussian form

$$
p(t|{\bf x}, {\bf t}, \alpha, \beta) = \mathcal{N}(t\,|\,{\bf m}_N^{\rm T}\phi({\bf x}),\ \sigma_N^2({\bf x})) \qquad (3.58)
$$
(2) The basis functions are the Gaussians of (3.4), $\phi_j(x) = \exp\left\{-\frac{(x-\mu_j)^2}{2s^2}\right\}$. The book specifies 9 basis functions but gives no instruction on how to place them, so the centers $\mu_j$ are spread evenly over the input range.
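As a minimal sketch of this placement (assuming, as in the script below, $M = 9$ centers spread evenly over $[0, 1]$ and width $s = 0.1$), the feature vector $\phi(x)$ can be evaluated like this:

```python
import numpy as np

M, s = 9, 0.1              # 9 basis functions of width s (assumed values)
mu = np.linspace(0, 1, M)  # centers mu_j spread evenly over the input range [0, 1]

def phi(x):
    """Feature vector phi(x): the M Gaussian basis functions (3.4) evaluated at x."""
    return np.exp(-(x - mu) ** 2 / (2 * s ** 2))

print(phi(0.5))            # e.g. the feature vector at x = 0.5
```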
(3) The mean and variance of the predictive distribution are computed from the following three formulas, so implement each of them to obtain the predictive distribution (a vectorized sketch follows the equations).

$$
{\bf m}_N = \beta {\bf S}_N \Phi^{\rm T} {\bf t} \qquad (3.53)
$$

$$
\sigma_N^2({\bf x}) = \beta^{-1} + \phi({\bf x})^{\rm T} {\bf S}_N \phi({\bf x}) \qquad (3.59)
$$

$$
{\bf S}_N^{-1} = \alpha {\bf I} + \beta \Phi^{\rm T} \Phi \qquad (3.54)
$$
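Substituting (3.53) into the mean of (3.58) gives ${\bf m}_N^{\rm T}\phi(x) = \beta\,\phi(x)^{\rm T}{\bf S}_N \sum_n \phi(x_n) t_n$, which is the per-sample-sum form (analogous to (1.70)–(1.72) in Chapter 1) that the functions `m`, `s2`, and `S` in the script implement. As a hedged, vectorized sketch of the same three formulas (reusing the basis from the sketch above and the example values $\alpha = 0.1$, $\beta = 9$ used later):

```python
import numpy as np

M, s = 9, 0.1                            # same assumed basis as above
mu = np.linspace(0, 1, M)

def phi(x):
    # Gaussian basis functions (3.4)
    return np.exp(-(x - mu) ** 2 / (2 * s ** 2))

def posterior(x_train, y_train, alpha=0.1, beta=9.0):
    """Posterior over the weights: S_N from (3.54), m_N from (3.53)."""
    Phi = np.array([phi(x) for x in x_train])                    # N x M design matrix
    S_N = np.linalg.inv(alpha * np.eye(M) + beta * Phi.T @ Phi)  # (3.54)
    m_N = beta * S_N @ Phi.T @ np.asarray(y_train)               # (3.53)
    return m_N, S_N

def predictive(x, m_N, S_N, beta=9.0):
    """Predictive distribution (3.58): mean m_N^T phi(x) and variance (3.59)."""
    p = phi(x)
    return m_N @ p, 1.0 / beta + p @ S_N @ p
```

The matrix form and the per-sample sums in the script are the same thing, since $\sum_n \phi(x_n)\phi(x_n)^{\rm T} = \Phi^{\rm T}\Phi$ and $\sum_n \phi(x_n) t_n = \Phi^{\rm T}{\bf t}$.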
For drawing the figure, I referred to @naoya_t's article "Reproduction of Bayesian linear regression (PRML §3.3)", which shows how the predictive distribution converges as the sample data increase. I wasn't confident I could come up with a nicer way to draw it myself, so I followed it almost verbatim. Thank you very much.
```python
import numpy as np
from numpy.linalg import inv
import matplotlib.pyplot as plt


def func(x_train, y_train):
    # params for the prior and the noise precision
    alpha = 0.1
    Beta = 9
    s = 0.1

    # use 9 Gaussian basis functions with evenly spaced centers
    M = 9
    mu_i = np.linspace(0, 1, M)

    # (3.4) Gaussian basis function: column feature vector phi(x)
    def phi(x):
        return np.array([np.exp(-(x - mu) ** 2 / (2 * s ** 2)) for mu in mu_i]).reshape((M, 1))

    # (3.54)' ((1.72)) posterior covariance S_N
    def S(x_train, y_train):
        Sigma = np.zeros((M, M))
        for n in range(len(x_train)):
            Sigma += np.dot(phi(x_train[n]), phi(x_train[n]).T)
        S_inv = alpha * np.identity(M) + Beta * Sigma
        return inv(S_inv)

    # (3.53)' ((1.70)) mean of the predictive distribution
    def m(x, x_train, y_train, S_N):
        total = np.zeros((M, 1))
        for n in range(len(x_train)):
            total += phi(x_train[n]) * y_train[n]
        return Beta * phi(x).T.dot(S_N).dot(total)

    # (3.59)' ((1.71)) variance of the predictive distribution
    def s2(x, S_N):
        return 1.0 / Beta + phi(x).T.dot(S_N).dot(phi(x))

    S_N = S(x_train, y_train)

    # the underlying sine curve
    x_real = np.arange(0, 1, 0.01)
    y_real = np.sin(2 * np.pi * x_real)

    # predictive distribution over the whole range of x
    mean = np.array([m(x, x_train, y_train, S_N)[0, 0] for x in x_real])
    variance = np.array([s2(x, S_N)[0, 0] for x in x_real])
    SD = np.sqrt(variance)
    upper = mean + SD
    lower = mean - SD

    plt.plot(x_train, y_train, 'bo')
    plt.plot(x_real, y_real, 'g-')
    plt.plot(x_real, mean, 'r-')
    plt.fill_between(x_real, upper, lower, color='pink')
    plt.xlim(0.0, 1.0)
    plt.ylim(-2, 2)
    plt.title("Figure 3.8")
    plt.show()


if __name__ == "__main__":
    # full set of observed data points (underlying function plus some noise)
    x_train = np.linspace(0, 1, 26)

    # "small level of random noise having a Gaussian distribution"
    loc = 0
    scale = 0.3
    y_train = np.sin(2 * np.pi * x_train) + np.random.normal(loc, scale, 26)

    # pick k sample data points at random
    def randidx(n, k):
        r = list(range(n))
        np.random.shuffle(r)
        return np.sort(r[0:k])

    for k in (1, 2, 5, 26):
        indices = randidx(len(x_train), k)
        func(x_train[indices], y_train[indices])
```