Exercise: Linear Regression
Exercise page: http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex2/ex2.html
The data set contains the heights of boys between 2 and 8 years old: x is age, y is height, and the number of training examples is m = 50.
We will fit a linear regression model using gradient descent.
Step 1: Data preparation
Load the data:
>> x=load('ex2x.dat');
>> y=load('ex2y.dat');
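A quick sanity check that the data loaded correctly (the exercise states there are 50 training examples):
>> m = length(y)   % should print 50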
Visualize the data:
figure % open a new figure window
plot(x, y, 'o');
ylabel('Height in meters')
xlabel('Age in years')
The resulting scatter plot of the training data looks like this:
We set up the model as:
hθ(x) = θ0 + θ1·x
Therefore we also need to prepend a column of ones to the data matrix x, which plays the role of x0 = 1:
m = length(y); % store the number of training examples
x = [ones(m, 1), x]; % Add a column of ones to x
After adding this column, note that the age variable now lives in the second column of x.
Step 2: Linear regression
First, recall the model we are fitting:
hθ(x) = θ0 + θ1·x
The update rule for batch gradient descent (note: batch, not online/stochastic gradient descent) is:
θj := θj − α · (1/m) · Σ_{i=1..m} (hθ(x^(i)) − y^(i)) · xj^(i)
where the update is applied simultaneously to θ0 and θ1.
The exercise asks:
1. Run one iteration of gradient descent with a learning rate of 0.07. Initialize the weight vector θ = (θ0, θ1) to zero, and compute the value of θ after that single iteration.
First, initialize the weight vector theta:
>> theta=zeros(1,2)
theta =
     0     0
Compute the weight vector after one iteration:
delta = (x*theta') - y;          % prediction error h_theta(x) - y for every training example
sum = delta'*x;                  % gradient terms summed over the training set (note: this shadows MATLAB's built-in sum)
delta_theta = sum*0.07/m;        % scale by the learning rate 0.07 and by 1/m
theta1 = theta - delta_theta;    % one batch gradient descent update
The resulting theta1 is:
theta1 =
    0.0745    0.3800
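Because θ starts at zero, the first update has a simple closed form, which makes a handy sanity check. The snippet below is not part of the original exercise; it just re-derives the numbers above from the loaded data (x is the augmented matrix from Step 1, so the ages are in its second column):
% With theta = [0 0], delta = -y, so one update reduces to
% theta1 = 0.07 * [mean(y), mean(age .* y)]
check = 0.07 * [mean(y), mean(x(:,2).*y)]   % should print roughly 0.0745    0.3800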
2. Keep iterating gradient descent until it converges to a fixed point.
Test code (it plots the current fitted line every 100 iterations, until the iteration stops):
function [ result ] = gradientDescent( x, y, alpha, accuracy )
%GRADIENTDESCENT Fit a line to (x, y) by batch gradient descent,
%   plotting the current fit every 100 iterations.
orgX = x;
plot(orgX, y, 'o');
ylabel('Height in meters')
xlabel('Age in years')
m = length(y);
x = [ones(m,1), x];
theta = zeros(1,2);
hold on;
times = 0;
while(1)
    times = times + 1;
    delta = (x*theta') - y;
    sum = delta'*x;
    delta_theta = sum*alpha/m;
    theta1 = theta - delta_theta;
    if(all(abs(theta1(:) - theta(:)) < accuracy))
        result = theta;
        break;
    end
    theta = theta1;
    if(mod(times,100) == 0)
        plot(x(:,2), x*theta', '-', 'color', [(mod(times/100,256))/256 128/256 2/256]);
        % remember that x is now a matrix with 2 columns
        % and the second column contains the age values
        legend('Training data', 'Linear regression');
    end
end
end
The resulting plot looks like this:
As the iterations proceed, you can see that the line determined by theta gradually fits the data set.
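To reproduce this, the function can be called as follows (note that it expects the raw age vector, not the augmented matrix, because it adds the column of ones internally):
>> x = load('ex2x.dat');
>> y = load('ex2y.dat');
>> result = gradientDescent(x, y, 0.07, 0.00000001);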
A cleaner version of the code (it only plots the final theta, without the intermediate steps):
function [ result ] = gradientDescent( x, y, alpha, accuracy )
%GRADIENTDESCENT Fit a line to (x, y) by batch gradient descent
%   and plot only the final fit.
orgX = x;
plot(orgX, y, 'o');
ylabel('Height in meters')
xlabel('Age in years')
m = length(y);
x = [ones(m,1), x];
theta = zeros(1,2);
hold on;
while(1)
    delta = (x*theta') - y;
    sum = delta'*x;
    delta_theta = sum*alpha/m;
    theta1 = theta - delta_theta;
    if(all(abs(theta1(:) - theta(:)) < accuracy))
        result = theta;
        plot(x(:,2), x*theta', '-', 'color', [200/256 128/256 2/256]);
        % remember that x is now a matrix with 2 columns
        % and the second column contains the age values
        legend('Training data', 'Linear regression');
        break;
    end
    theta = theta1;
end
end
The result:
Run the code:
result=gradientDescent(x,y,0.07,0.00000001)
Here 0.07 is the step size (i.e. the learning rate), and 0.00000001 is the tolerance that decides when two successive floating-point theta vectors are close enough to be treated as equal, which stops the iteration.
The value of theta at convergence is:
theta =
    0.7502    0.0639
3. Now that we have trained the weight vector theta, we can use it to make predictions for new inputs.
Predict the heights of two boys, aged 3.5 and 7.
The code:
>> boys=[3.5;7];              % ages of the two boys
>> boys=[ones(2,1),boys];     % add the intercept column, just as for the training data
>> heights=boys*theta';       % h_theta(x) for each boy; theta is the converged vector from part 2
The output:
heights =
    0.9737
    1.1973
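As an optional cross-check that is not part of the exercise, the same least-squares fit can be computed in closed form with MATLAB's backslash operator; the result should be very close to the theta found by gradient descent:
>> xls = [ones(length(y),1), load('ex2x.dat')];   % design matrix with the intercept column
>> theta_ls = (xls\y)'                            % expected to be close to 0.7502    0.0639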
4. Understanding J(θ)
J(θ) is the cost function associated with the weight vector: the choice of θ determines the cost, and we want the cost to be as small as possible, i.e. we are looking for the minimum of the cost function. A simple way to find the minimum of a (differentiable) function is to take its derivative and look at the points where the derivative is zero; such a point is a candidate for the minimum (it may only be a local minimum). Gradient descent takes the partial derivatives of J(θ) and moves step by step towards a point where they are zero.
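For reference, the cost used in this exercise (and implemented in showJTheta below) is the usual least-squares cost:
J(θ) = (1/(2m)) · Σ_{i=1..m} (hθ(x^(i)) − y^(i))²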
因?yàn)槲覀冊(cè)诖司毩?xí)中使用的θ向量只有兩個(gè)維度,因此可以使用matlab的plot函數(shù)進(jìn)行可視化。在一般的應(yīng)用中,θ向量的維度很高,會(huì)形成超平面,因此無(wú)法簡(jiǎn)單的用plot的方法進(jìn)行可視化。
代碼:
function [] = showJTheta( x, y )
%SHOWJTHETA Plot the cost function J(theta) over a grid of theta values.
J_vals = zeros(100, 100);   % initialize J_vals to 100x100 matrix of 0's
theta0_vals = linspace(-3, 3, 100);
theta1_vals = linspace(-1, 1, 100);
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = sum(sum((x*t - y).^2,2),1)/(2*length(y));
    end
end
% Plot the surface plot
% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';
figure;
surf(theta0_vals, theta1_vals, J_vals);
axis([-3, 3, -1, 1, 0, 40]);
xlabel('\theta_0'); ylabel('\theta_1')
end
Run it:
x=load('ex2x.dat');
y=load('ex2y.dat');
x=[ones(length(y),1),x];
showJTheta(x,y);
The result:
Now append the statements that draw a contour plot (they use theta0_vals, theta1_vals and J_vals, so they belong inside showJTheta, after the surf call):
figure;
% Plot the cost function with 15 contours spaced logarithmically
% between 0.01 and 100
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 2, 15));
xlabel('\theta_0'); ylabel('\theta_1');
The result:
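As an optional extra that is not in the original write-up, the θ found by gradient descent in part 2 can be marked on the contour plot to confirm that it sits at the bottom of the bowl:
hold on;
plot(0.7502, 0.0639, 'rx', 'MarkerSize', 10, 'LineWidth', 2);  % converged theta from gradient descent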