當(dāng)前位置：首頁 > 编程资源 > 编程问答 >内容正文

编程问答

机器学习实践三---神经网络学习

發(fā)布時(shí)間：2023/11/29 编程问答 28 豆豆

生活随笔收集整理的這篇文章主要介紹了机器学习实践三---神经网络学习小編覺得挺不錯(cuò)的,現(xiàn)在分享給大家,幫大家做個(gè)參考.

Neural Networks

在這個(gè)練習(xí)中，將實(shí)現(xiàn)神經(jīng)網(wǎng)絡(luò)BP算法,練習(xí)的內(nèi)容是手寫數(shù)字識(shí)別。

Visualizing the data

這次數(shù)據(jù)還是5000個(gè)樣本，每個(gè)樣本是一張20*20的灰度圖片 fig, ax_array = plt.subplots(nrows=10, ncols=10, figsize=(6, 4))for row in range(10):for column in range(10):ax_array[row, column].matshow(sample_images[10 * row + column].reshape((20, 20)).T, cmap='gray')ax_array[row, column].axis('off')plt.show()returndata = loadmat("ex4data1.mat") X = data['X'] y = data['y']m = X.shape[0] rand_sample_num = np.random.permutation(m) sample_images = X[rand_sample_num[0:100], :] display_data(sample_images)

Model representation

這是一個(gè)簡(jiǎn)單的神經(jīng)網(wǎng)絡(luò)，輸入層、隱藏層、輸出，樣本圖片是20*20，所以輸入層是400個(gè)單元，（再加上一個(gè)額外偏差單元），第二層隱藏層是25個(gè)單元，輸出層是10個(gè)單元。從上面的數(shù)據(jù)顯示中有兩個(gè)變量X 和y。
ex4weights.mat 中提供了訓(xùn)練好的網(wǎng)絡(luò)參數(shù)theta1, theta2,
theta1 has size 25 x 401
theta2 has size 10 x 26

Feedforward and cost function

為了最后的輸出，我們將標(biāo)簽值也就是數(shù)字從0到9，轉(zhuǎn)化為one-hot 碼

from sklearn.preprocessing import OneHotEncoder def to_one_hot(y):encoder = OneHotEncoder(sparse=False) # return a array instead of matrixy_onehot = encoder.fit_transform(y.reshape(-1,1))return y_onehot

加載數(shù)據(jù)

X, label_y = load_mat('ex4data1.mat') X = np.insert(X, 0, 1, axis=1) y = to_one_hot(label_y)

load weight

def load_weight(path):data = loadmat(path)return data['Theta1'], data['Theta2']t1, t2 = load_weight('ex4weights.mat')

theta 轉(zhuǎn)化
因?yàn)閛pt.minimize傳參問題，我們這里對(duì)theta進(jìn)行平坦化

# 展開 def unrool(var1, var2):return np.r_[var1.flatten(), var2.flatten()] # 分開矩陣化 def rool(array):return array[:25*401].reshape(25, 401), array[25*401:].reshape(10, 26)

Feedforward Regularized cost function

這里主要是前饋傳播和代價(jià)函數(shù)的一些邏輯，正則化為了預(yù)防高方差問題。

def sigmoid(z):return 1 / (1 + np.exp(-z))#前饋傳播 def feed_forward(theta, X):theta1, theta2 = rool(theta)a1 = Xz2 = a1.dot(theta1.T)a2 = np.insert(sigmoid(z2), 0, 1, axis=1)z3 = a2.dot(theta2.T)a3 = sigmoid(z3)return a1, z2, a2, z3, a3# a1, z2, a2, z3, h = feed_forward(t1, t2, X)def cost(theta, X, y):a1, z2, a2, z3, h = feed_forward(theta, X)J = -y * np.log(h) - (1-y) * np.log(1 - h)return J# Implement Regularization def regularized_cost(theta, X, y, l=1):theta1, theta2 = rool(theta)temp_theta1 = theta1[:, 1:]temp_theta2 = theta2[:, 1:]reg = temp_theta1.flatten().T.dot(temp_theta1.flatten()) + temp_theta2.flatten().T.dot(temp_theta2.flatten())regularized_theta = l / (2 * len(X)) * reg return regularized_theta + cost(theta, X, y)

Backprogation

反向傳播算法，是機(jī)器學(xué)習(xí)比較難推理的算法了，也是最重要的算法，為了得到最優(yōu)的theta值，通過進(jìn)行反向傳播，來不斷跟新theta值, 當(dāng)然還有一些超參數(shù)，如lambda、a 、訓(xùn)練迭代次數(shù)，如果進(jìn)行adam、Rmsprop等優(yōu)化學(xué)習(xí)效率算法，還有有一些其他的超參數(shù)。

# random initalization# 梯度 def gradient(theta, X, y):theta1, theta2 = rool(theta)a1, z2, a2, z3, h = feed_forward(theta, X)d3 = h - yd2 = d3.dot(theta2[:, 1:]) * sigmoid_gradient(z2)D2 = d3.T.dot(a2)D1 = d2.T.dot(a1)D = (1 / len(X)) * unrool(D1, D2)return D

Sigmoid gradient

也就是對(duì)sigmoid 函數(shù)求導(dǎo)

def sigmoid_gradient(z):return sigmoid(z) * (1 - sigmoid(z))

Random initialization

初始化參數(shù)，我們一般使用隨機(jī)初始化np.random.randn(-2,2),生成高斯分布，再乘以一個(gè)小的數(shù)，這樣把它初始化為很小的隨機(jī)數(shù)，
這樣直觀地看就相當(dāng)于把訓(xùn)練放在了邏輯回歸的直線部分進(jìn)行開始，初始化參數(shù)還可以盡量避免梯度消失和梯度爆炸的問題。

def random_init(size):return np.random.randn(-2, 2, size) * 0.01

Backporpagation

Regularized Neural Networks

正則化神經(jīng)網(wǎng)絡(luò)

def regularized_gradient(theta, X, y, l=1):a1, z2, a2, z3, h = feed_forward(theta, X)D1, D2 = rool(gradient(theta, X, y))t1[:, 0] = 0t2[:, 0] = 0reg_D1 = D1 + (l / len(X)) * t1reg_D2 = D2 + (l / len(X)) * t2return unrool(reg_D1, reg_D2)

Learning parameters using fmincg

調(diào)優(yōu)參數(shù)

def nn_training(X, y):init_theta = random_init(10285) # 25*401 + 10*26res = opt.minimize(fun=regularized_cost,x0=init_theta,args=(X, y, 1),method='TNC',jac=regularized_gradient,options={'maxiter': 400})return resres = nn_training(X, y)

準(zhǔn)確率
def accuracy(theta, X, y):
_, _, _, _, h = feed_forward(res.x, X)
y_pred = np.argmax(h, axis=1) + 1
print(classification_report(y, y_pred))

accuracy(res.x, X, label_y)

Visualizing the hidden layer

隱藏層顯示跟輸入層顯示差不多

def plot_hidden(theta):t1, _ = rool(theta)t1 = t1[:, 1:]fig, ax_array = plt.subplots(5, 5, sharex=True, sharey=True, figsize=(6, 6))for r in range(5):for c in range(5):ax_array[r, c].matshow(t1[r * 5 + c].reshape(20, 20), cmap='gray_r')plt.xticks([])plt.yticks([])plt.show()plot_hidden(res.x)

super parameter lambda update

神經(jīng)網(wǎng)絡(luò)是非常強(qiáng)大的模型，可以形成高度復(fù)雜的決策邊界。如果沒有正則化，神經(jīng)網(wǎng)絡(luò)就有可能“過度擬合”一個(gè)訓(xùn)練集，從而使它在訓(xùn)練集上獲得接近100%的準(zhǔn)確性，但在以前沒有見過的新例子上則不會(huì)。你可以設(shè)置較小的正則化λ值和MaxIter參數(shù)高的迭代次數(shù)為自己看到這個(gè)結(jié)果。

創(chuàng)作挑戰(zhàn)賽新人創(chuàng)作獎(jiǎng)勵(lì)來咯，堅(jiān)持創(chuàng)作打卡瓜分現(xiàn)金大獎(jiǎng)

總結(jié)

以上是生活随笔為你收集整理的机器学习实践三---神经网络学习的全部?jī)?nèi)容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網(wǎng)站內(nèi)容還不錯(cuò)，歡迎將生活随笔推薦給好友。

上一篇：机器学习实践二 -多分类和神经网络
下一篇：机器学习实践四--正则化线性回归和偏