Andrew Ng's "Machine Learning" Study Notes 14: Advice for Applying Machine Learning, Implementing and Improving a Machine Learning Model

Published: 2024/7/23

I. Task Overview
II. Code Implementation
- 1. Preparing the data
- 2. Cost function
- 3. Gradient computation
- 4. Regularized cost function and gradient
- 5. Fitting the data
- 6. Creating polynomial features
- 7. Preparing the polynomial regression data
- 8. Plotting learning curves
  - λ = 0
  - λ = 1
  - λ = 100
- 9. Finding the best λ

The previous notes described how to go about building a machine learning model in practice: first implement a simple model quickly, then plot learning curves to diagnose what is wrong with the current model and decide how to improve it. This note implements that whole process.

Dataset: https://pan.baidu.com/s/1Er82YunOOmyTJIW-0H40mQ
Extraction code: 41rj

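I. Task Overview

In this note we implement regularized linear regression and use it to study models with different bias-variance properties. We first use regularized linear regression to predict the amount of water flowing out of a dam (flow) from the change in water level (water_level). We then plot learning curves to diagnose the learning algorithm, determine whether it suffers from high bias or high variance, and decide how to adjust it.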

II. Code Implementation

1. Preparing the data

import numpy as np
import scipy.io as sio
import scipy.optimize as opt
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def load_data():
    """Load the training, validation and test sets from ex5data1.mat.

    d['X'] has shape (12, 1); pandas has trouble building a DataFrame from
    that 2-D ndarray, so every array is flattened with np.ravel.
    """
    d = sio.loadmat('ex5data1.mat')
    return map(np.ravel, [d['X'], d['y'], d['Xval'], d['yval'], d['Xtest'], d['ytest']])

This reads in the training, validation, and test sets. Check their dimensions:

X, y, Xval, yval, Xtest, ytest = load_data()
print('X:', X.shape)
print('Xval:', Xval.shape)
print('Xtest:', Xtest.shape)

The training, validation, and test sets contain 12, 21, and 21 examples respectively.

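Next, visualize the training data to get an intuitive feel for it:

df = pd.DataFrame({'water_level': X, 'flow': y})

# Recent seaborn releases require x/y as keywords and use `height` instead of `size`
sns.lmplot(x='water_level', y='flow', data=df, fit_reg=False, height=7)
plt.show()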

Finally, insert a bias column into X for all three datasets. This column of ones corresponds to the constant term of the linear regression hypothesis, that is, the term multiplied by theta0. Then check the dimensions again:

X, Xval, Xtest = [np.insert(x.reshape(x.shape[0], 1), 0, np.ones(x.shape[0]), axis=1)
                  for x in (X, Xval, Xtest)]
print('X:', X.shape)
print('Xval:', Xval.shape)
print('Xtest:', Xtest.shape)
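As a small illustration of the bias-column step, the sketch below applies the same `np.insert` call to a made-up three-element vector. The values are hypothetical and are not taken from the dataset.

```python
import numpy as np

# Hypothetical 1-D feature vector standing in for X (not from ex5data1.mat)
x = np.array([2.0, -1.0, 3.0])

# Reshape to a column, then insert a column of ones at column index 0
x_with_bias = np.insert(x.reshape(x.shape[0], 1), 0, np.ones(x.shape[0]), axis=1)

print(x_with_bias.shape)  # (3, 2)
print(x_with_bias)
```

The first column is all ones and the second column holds the original values, so `x_with_bias @ theta` evaluates theta0 + theta1 * x for every example.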

2. Cost function

The linear regression cost function is half the mean of the squared differences between the predictions X @ theta and the labels y.

The corresponding implementation:

def cost(theta, X, y):
    """Unregularized linear regression cost.

    theta: parameter vector, shape (n + 1,)
    X: design matrix including the bias column, shape (m, n + 1)
    y: labels, shape (m,)
    """
    m = X.shape[0]  # number of examples

    inner = X @ theta - y         # residuals, shape (m,)
    square_sum = inner.T @ inner  # sum of squared residuals
    return square_sum / (2 * m)
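The figure with the formula did not survive in this copy of the post. Reading the formula back off the `cost` function (so this is a reconstruction, assuming the usual linear hypothesis h_θ(x) = θᵀx), it corresponds to:

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2,
\qquad h_\theta(x) = \theta^{T} x
```

Here m is the number of examples, matching `m = X.shape[0]` in the code, and the sum of squares is what `inner.T @ inner` computes.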

Evaluate the cost at an initial parameter vector. Note the dimension of theta: it needs one entry per column of X.

theta = np.ones(X.shape[1])
cost(theta, X, y)

3. Gradient computation

The gradient of the linear regression cost is X.T @ (X @ theta - y) / m, which gives one partial derivative per parameter.

The implementation:

def gradient(theta, X, y):
    """Gradient of the unregularized cost with respect to theta, shape (n + 1,)."""
    m = X.shape[0]  # number of examples

    # X.T @ residuals sums the per-example contributions for each parameter
    grad = (X.T @ (X @ theta - y)) / m
    return grad


gradient(theta, X, y)
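One way to gain confidence in an analytic gradient like this is to compare it against a finite-difference approximation of the cost. The sketch below does that on hypothetical random data; `numerical_gradient` and the toy arrays are illustrative names that do not appear in the original post, and `cost` and `gradient` are restated so the block runs on its own.

```python
import numpy as np

def cost(theta, X, y):
    m = X.shape[0]
    inner = X @ theta - y
    return inner @ inner / (2 * m)

def gradient(theta, X, y):
    m = X.shape[0]
    return (X.T @ (X @ theta - y)) / m

def numerical_gradient(f, theta, eps=1e-6):
    """Central-difference approximation of the gradient of f at theta."""
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad

# Hypothetical toy data: 5 examples, a bias column plus one feature
rng = np.random.default_rng(0)
X_toy = np.column_stack([np.ones(5), rng.normal(size=5)])
y_toy = rng.normal(size=5)
theta_toy = np.ones(2)

analytic = gradient(theta_toy, X_toy, y_toy)
numeric = numerical_gradient(lambda t: cost(t, X_toy, y_toy), theta_toy)
print(np.allclose(analytic, numeric, atol=1e-6))
```

Because the cost is quadratic in theta, the central difference agrees with the analytic gradient up to floating-point rounding.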

4. Regularized cost function and gradient

The regularized gradient adds (l / m) * theta_j to every component of the gradient except the intercept theta_0:

def regularized_gradient(theta, X, y, l=1):
    """Gradient of the regularized cost; l is the regularization strength lambda."""
    m = X.shape[0]  # number of examples

    regularized_term = theta.copy()  # same shape as theta
    regularized_term[0] = 0          # don't regularize the intercept theta_0
    regularized_term = (l / m) * regularized_term

    return gradient(theta, X, y) + regularized_term


regularized_gradient(theta, X, y)

The regularized cost adds (l / (2 * m)) times the sum of the squared parameters, again excluding theta_0:

def regularized_cost(theta, X, y, l=1):
    """Regularized linear regression cost; theta[0] is not penalized."""
    m = X.shape[0]

    regularized_term = (l / (2 * m)) * np.power(theta[1:], 2).sum()
    return cost(theta, X, y) + regularized_term
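The formula images for this section are also missing from this copy. Reconstructed from `regularized_cost` and `regularized_gradient` (an inference from the code, with h_θ(x) = θᵀx assumed and n denoting the number of non-bias features), the regularized cost and its partial derivatives are:

```latex
J_{\text{reg}}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
  + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2

\frac{\partial J_{\text{reg}}}{\partial \theta_0}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}

\frac{\partial J_{\text{reg}}}{\partial \theta_j}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
  + \frac{\lambda}{m} \theta_j \qquad (j \ge 1)
```

The penalty sum starts at j = 1, which is why the code slices `theta[1:]` in the cost and zeroes `regularized_term[0]` in the gradient.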

5. Fitting the data

def linear_regression_np(X, y, l=1):
    """Fit regularized linear regression.

    INPUT: data X, labels y, regularization parameter l
    OUTPUT: the scipy OptimizeResult; the fitted parameters are in res.x
    """
    # STEP 1: initialize the parameters
    theta = np.ones(X.shape[1])

    # STEP 2: call the optimizer to fit the parameters
    res = opt.minimize(fun=regularized_cost,
                       x0=theta,
                       args=(X, y, l),
                       method='TNC',
                       jac=regularized_gradient,
                       options={'disp': True})
    return res

Call the fitting function to optimize the parameters:

# linear_regression_np initializes theta itself, so no starting value is passed in
final_theta = linear_regression_np(X, y, l=0).x
print(final_theta)

These are theta0 and theta1 fitted on the training data. Next, visualize the model given by these parameters.

b = final_theta[0]  # intercept
m = final_theta[1]  # slope

plt.scatter(X[:, 1], y, label="Training data")
plt.plot(X[:, 1], X[:, 1] * m + b, label="Prediction")
plt.legend(loc=2)
plt.show()
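As a cross-check on an optimizer result, the regularized least-squares problem also has a closed-form solution. The sketch below is not the method used in this post (which relies on SciPy's TNC optimizer); it solves the normal equations on hypothetical toy data and confirms that the regularized gradient vanishes at the solution. `normal_equation` and the toy arrays are illustrative names.

```python
import numpy as np

def normal_equation(X, y, l=0.0):
    """Closed-form minimizer of the regularized cost (intercept not penalized)."""
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0  # leave theta_0 unregularized, as regularized_cost does
    # Setting the regularized gradient to zero gives (X^T X + l * L) theta = X^T y
    return np.linalg.solve(X.T @ X + l * L, X.T @ y)

def regularized_gradient(theta, X, y, l=1.0):
    m = X.shape[0]
    reg = theta.copy()
    reg[0] = 0.0
    return (X.T @ (X @ theta - y)) / m + (l / m) * reg

# Hypothetical toy data: a noisy line, 8 examples
rng = np.random.default_rng(1)
X_toy = np.column_stack([np.ones(8), rng.normal(size=8)])
y_toy = 3.0 + 2.0 * X_toy[:, 1] + rng.normal(scale=0.1, size=8)

theta_closed = normal_equation(X_toy, y_toy, l=1.0)

# At the minimizer every component of the gradient should be (numerically) zero
print(np.abs(regularized_gradient(theta_closed, X_toy, y_toy, l=1.0)).max())
```

With the real data, comparing `normal_equation(X, y, l=0)` against `final_theta` would be one way to confirm that the optimizer converged.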

The model cannot fit even the training data well, so it is clearly underfitting. This example is obvious, but in other cases the plot alone may not be conclusive, so it is better to plot learning curves: fit the model on subsets of the training data of increasing size, use the resulting parameters to compute the error on that subset and on the validation set, and plot both errors against the number of training examples:

1. Fit the model on a subset of the training set.

2. Do not use regularization when computing the training cost and the cross-validation cost.

3. Remember to compute the training cost on the same subset of the training set that was used for fitting.

training_cost, cv_cost = [], []

m = X.shape[0]  # number of training examples
for i in range(1, m + 1):
    # Fit on the first i training examples
    res = linear_regression_np(X[:i, :], y[:i], l=0)

    # Training cost on the same subset, validation cost on the full validation set
    tc = regularized_cost(res.x, X[:i, :], y[:i], l=0)
    cv = regularized_cost(res.x, Xval, yval, l=0)

    # Store the results
    training_cost.append(tc)
    cv_cost.append(cv)

plt.plot(np.arange(1, m + 1), training_cost, label='training cost')
plt.plot(np.arange(1, m + 1), cv_cost, label='cv cost')
plt.legend(loc=1)
plt.show()

The plot shows that both the training error and the validation error are fairly high, which indicates underfitting.
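The learning-curve procedure can also be sketched without SciPy or plotting. The version below uses synthetic data (a quadratic relationship that a straight line underfits) and `np.linalg.lstsq` in place of the TNC optimizer; both are substitutions made for illustration, not what the post itself uses.

```python
import numpy as np

def cost(theta, X, y):
    r = X @ theta - y
    return r @ r / (2 * X.shape[0])

def add_bias(x):
    return np.column_stack([np.ones(x.shape[0]), x])

# Hypothetical data: y depends on x squared, so a linear model has high bias
rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=12)
x_val = rng.uniform(-3, 3, size=21)
y_train = x_train ** 2 + rng.normal(scale=0.1, size=12)
y_val = x_val ** 2 + rng.normal(scale=0.1, size=21)
X_train, X_val = add_bias(x_train), add_bias(x_val)

training_cost, cv_cost = [], []
for i in range(1, X_train.shape[0] + 1):
    # Fit on the first i examples only (unregularized least squares)
    theta, *_ = np.linalg.lstsq(X_train[:i], y_train[:i], rcond=None)
    training_cost.append(cost(theta, X_train[:i], y_train[:i]))  # same subset
    cv_cost.append(cost(theta, X_val, y_val))                    # full validation set

for i, (tc, cv) in enumerate(zip(training_cost, cv_cost), start=1):
    print(f"m={i:2d}  train={tc:8.4f}  cv={cv:8.4f}")
```

With one or two training examples a line fits them exactly, so the training cost starts at essentially zero and grows as more examples are added. In a high-bias setting like this one, both costs would be expected to stay well above zero as the subset grows.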

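6. Creating polynomial features

Because the model underfits, a simple linear function is not enough; we add polynomial features to increase the model's complexity.

def prepare_poly_data(*args, power):
    """Build polynomial design matrices.

    args: keep feeding in X, Xval, or Xtest;
    the results are returned in the same order.
    """
    def prepare(x):
        # Feature mapping
        df = poly_features(x, power=power)

        # Normalization (DataFrame.as_matrix() was removed from pandas; use to_numpy())
        ndarr = normalize_feature(df).to_numpy()

        # Add the bias term
        return np.insert(ndarr, 0, np.ones(ndarr.shape[0]), axis=1)

    return [prepare(x) for x in args]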

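Feature mapping was covered in an earlier note, so the code is given directly:

def poly_features(x, power, as_ndarray=False):
    """Map x to the columns f1..f{power}, where f{i} = x ** i."""
    data = {'f{}'.format(i): np.power(x, i) for i in range(1, power + 1)}
    df = pd.DataFrame(data)

    # DataFrame.as_matrix() was removed from pandas; to_numpy() replaces it
    return df.to_numpy() if as_ndarray else df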

Try the code above by building polynomial features up to degree 3:

X, y, Xval, yval, Xtest, ytest = load_data()
poly_features(X, power=3)
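To see what the feature mapping produces without loading the dataset, here is a NumPy-only sketch on a made-up input. `poly_matrix` is a hypothetical helper that mirrors what `poly_features(..., as_ndarray=True)` is meant to return.

```python
import numpy as np

def poly_matrix(x, power):
    """Columns are x**1, x**2, ..., x**power (no bias column)."""
    return np.column_stack([np.power(x, i) for i in range(1, power + 1)])

x_demo = np.array([1.0, 2.0, 3.0])  # hypothetical inputs
features = poly_matrix(x_demo, power=3)
print(features)
# [[ 1.  1.  1.]
#  [ 2.  4.  8.]
#  [ 3.  9. 27.]]
```

Each row holds the successive powers of one input value, which is why the columns differ so much in scale and why the next step normalizes them.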

7. Preparing the polynomial regression data

Expand the features up to degree 8, or whatever degree you need.

Normalize the x^n features so that they share a comparable scale.

Don't forget to add the bias term.

def normalize_feature(df):
    """Standardize each column: subtract its mean and divide by its standard deviation."""
    return df.apply(lambda column: (column - column.mean()) / column.std())


X_poly, Xval_poly, Xtest_poly = prepare_poly_data(X, Xval, Xtest, power=8)
X_poly[:3, :]
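One design choice worth noting: `prepare_poly_data` standardizes each dataset with its own mean and standard deviation. A common alternative, sketched here on synthetic inputs and not the approach taken in this post, is to compute the statistics on the training set only and reuse them for the validation and test sets. `poly_matrix`, `fit_scaler` and `transform` are hypothetical helper names.

```python
import numpy as np

def poly_matrix(x, power):
    return np.column_stack([x ** i for i in range(1, power + 1)])

def fit_scaler(F):
    # ddof=1 matches the sample standard deviation that pandas' Series.std() uses
    return F.mean(axis=0), F.std(axis=0, ddof=1)

def transform(F, mean, std):
    Z = (F - mean) / std
    return np.column_stack([np.ones(Z.shape[0]), Z])  # prepend the bias column

# Hypothetical inputs standing in for the raw X and Xval vectors
rng = np.random.default_rng(0)
x_train = rng.uniform(-40, 40, size=12)
x_val = rng.uniform(-40, 40, size=21)

# Statistics come from the training features only
mean, std = fit_scaler(poly_matrix(x_train, power=8))
X_train_poly = transform(poly_matrix(x_train, 8), mean, std)
X_val_poly = transform(poly_matrix(x_val, 8), mean, std)

print(X_train_poly.shape, X_val_poly.shape)
```

With shared statistics, a given water level maps to the same feature vector regardless of which set it came from. The results reported in this post were obtained with per-set normalization, so numbers from this variant would not necessarily match them.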

8. Plotting learning curves

λ = 0

First, with no regularization, so λ = 0:

def plot_learning_curve(X, y, Xval, yval, l=0):
    """Plot training and cross-validation cost against the training-set size.

    INPUT: training set X, y; cross-validation set Xval, yval; regularization parameter l
    OUTPUT: none; the two curves are drawn on the current matplotlib figure
    """
    # STEP 1: initialize, get the number of examples, start iterating
    training_cost, cv_cost = [], []
    m = X.shape[0]

    for i in range(1, m + 1):
        # STEP 2: fit on the first i examples, with regularization strength l
        res = linear_regression_np(X[:i, :], y[:i], l=l)

        # STEP 3: compute the costs without the regularization term
        tc = cost(res.x, X[:i, :], y[:i])
        cv = cost(res.x, Xval, yval)

        # STEP 4: store the results
        training_cost.append(tc)
        cv_cost.append(cv)

    plt.plot(np.arange(1, m + 1), training_cost, label='training cost')
    plt.plot(np.arange(1, m + 1), cv_cost, label='cv cost')
    plt.legend(loc=1)


plot_learning_curve(X_poly, y, Xval_poly, yval, l=0)
plt.show()

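In this learning curve the training error is very low while the validation error is not, which indicates overfitting. With λ = 0 nothing restrains overfitting, and the polynomial degree is fairly high, so overfitting is to be expected.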

λ = 1

plot_learning_curve(X_poly, y, Xval_poly, yval, l=1)
plt.show()

The training error increases slightly and the validation error drops to a fairly low level, which is a reasonably good outcome.

λ = 100

plot_learning_curve(X_poly, y, Xval_poly, yval, l=100)
plt.show()

With too much regularization, the model now underfits.

9. Finding the best λ

l_candidate = [0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10]
training_cost, cv_cost = [], []

for l in l_candidate:
    # Fit with regularization strength l, then score without the penalty term
    res = linear_regression_np(X_poly, y, l)

    tc = cost(res.x, X_poly, y)
    cv = cost(res.x, Xval_poly, yval)

    training_cost.append(tc)
    cv_cost.append(cv)

plt.plot(l_candidate, training_cost, label='training')
plt.plot(l_candidate, cv_cost, label='cross validation')
plt.legend(loc=2)
plt.xlabel('lambda')
plt.ylabel('cost')
plt.show()

Find the best λ, that is, the λ with the smallest validation error:

# best cv I got from all those candidates
l_candidate[np.argmin(cv_cost)]
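The selection step can be tried end to end without the dataset. The sketch below sweeps the same candidate list on synthetic data, using a closed-form regularized fit instead of the TNC optimizer; the data, `fit_ridge` and `make_set` are all assumptions made for illustration.

```python
import numpy as np

def cost(theta, X, y):
    r = X @ theta - y
    return r @ r / (2 * X.shape[0])

def fit_ridge(X, y, l):
    """Closed-form regularized fit; the intercept (column 0) is not penalized."""
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0
    # lstsq rather than solve, so l = 0 with an ill-conditioned X^T X still works
    theta, *_ = np.linalg.lstsq(X.T @ X + l * L, X.T @ y, rcond=None)
    return theta

rng = np.random.default_rng(0)

def make_set(n):
    # Hypothetical noisy sine data with degree-8 polynomial features
    x = rng.uniform(-1, 1, size=n)
    y = np.sin(3 * x) + rng.normal(scale=0.2, size=n)
    X = np.column_stack([x ** p for p in range(0, 9)])  # column 0 is the bias
    return X, y

X_tr, y_tr = make_set(12)
X_cv, y_cv = make_set(21)

l_candidate = [0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10]

# Fit with each l, but score on the validation set without the penalty term
cv_cost = [cost(fit_ridge(X_tr, y_tr, l), X_cv, y_cv) for l in l_candidate]

best_l = l_candidate[int(np.argmin(cv_cost))]
print('best lambda on the validation set:', best_l)
```

The key point is that λ is chosen by validation cost alone, leaving the test set untouched for the final error estimate.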

Compute the test error for each of these λ values on the test set; the ultimate goal is a low test error:

# use test data to compute the cost
for l in l_candidate:
    theta = linear_regression_np(X_poly, y, l).x
    print('test cost(l={}) = {}'.format(l, cost(theta, Xtest_poly, ytest)))

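After tuning, λ = 0.3 is the best choice, since it gives the lowest test cost. The test error for the λ = 1 chosen above is very close to that optimum. The steps above are the general workflow for building and improving a model.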
