
莫烦 (Morvan) PyTorch

Published: 2023/12/20



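Activation functions

Y = AF(Wx)
Here AF() is the activation function, which is simply a nonlinear function applied to the linear output Wx. Common choices are relu, sigmoid and tanh.

Tips for choosing an activation function: when the network has only two or three layers, almost any activation function will do; when the network is very deep, choose carefully and watch out for exploding (or vanishing) gradients.
For a CNN, relu is recommended.
For an RNN, tanh or relu is recommended.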
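To make the three functions concrete, here is a minimal sketch that applies them to a few sample inputs. The sample range (-5 to 5) is an assumption chosen for illustration.

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-5, 5, 11)     # sample inputs from -5 to 5

y_relu = F.relu(x)                # max(0, x): negative inputs become 0
y_sigmoid = torch.sigmoid(x)      # squashes every input into (0, 1)
y_tanh = torch.tanh(x)            # squashes every input into (-1, 1)

print(y_relu)
print(y_sigmoid)
print(y_tanh)
```

All three are nonlinear, which is what lets stacked layers represent more than a single linear map.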

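Regression

Building the network

import torch
import torch.nn.functional as F

class Net(torch.nn.Module):  # inherit from torch's Module
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()  # run the parent class's __init__
        # define the form of each layer
        self.hidden = torch.nn.Linear(n_feature, n_hidden)  # hidden layer, linear output
        self.predict = torch.nn.Linear(n_hidden, n_output)  # output layer, linear output

    def forward(self, x):  # this is also the forward method of Module
        # forward pass: feed in the input, the network computes the output
        x = F.relu(self.hidden(x))  # activation applied to the hidden layer's linear output
        x = self.predict(x)  # output value
        return x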

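Training the network

# the optimizer is the training tool; there are four commonly used optimizers
optimizer = torch.optim.SGD(net.parameters(), lr=0.2)  # pass in all parameters of net, and the learning rate
loss_func = torch.nn.MSELoss()  # error between prediction and target (mean squared error)

for t in range(100):
    prediction = net(x)  # feed the training data x to net, get the prediction
    loss = loss_func(prediction, y)  # compute the error between the two
    optimizer.zero_grad()  # clear the gradients left over from the previous step
    loss.backward()  # backpropagate the error, compute the gradients
    optimizer.step()  # apply the updates to net's parameters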
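The loop above relies on net, x and y being defined elsewhere. The following self-contained sketch fills them in with toy data (y = x^2 plus noise, an assumption chosen for illustration) so the whole regression can be run end to end.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(1)  # reproducible toy data

# Toy data (assumed): y = x^2 plus a little noise, both of shape (100, 1)
x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)
y = x.pow(2) + 0.2 * torch.rand(x.size())

class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super().__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden)
        self.predict = torch.nn.Linear(n_hidden, n_output)

    def forward(self, x):
        return self.predict(F.relu(self.hidden(x)))

net = Net(n_feature=1, n_hidden=10, n_output=1)
optimizer = torch.optim.SGD(net.parameters(), lr=0.2)
loss_func = torch.nn.MSELoss()

losses = []
for t in range(200):
    prediction = net(x)
    loss = loss_func(prediction, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

Recording the loss at every step is an addition for inspection only; it is not needed for training.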

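Quick build

Writing a class Net is not the only way to build a network; torch.nn.Sequential() is a quicker alternative.

net = torch.nn.Sequential(
    torch.nn.Linear(1, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 1),
)

In Sequential, the activation is inserted as a layer object such as torch.nn.ReLU(). In a hand-written Net, the activation is called inside forward(), which is somewhat more flexible.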

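Saving and loading

Saving

torch.save(net1, 'net.pkl')  # save the entire network
torch.save(net1.state_dict(), 'net_params.pkl')  # save only the network's parameters (faster, uses less space)

Loading

def restore_net():
    # restore the entire net1 into net2
    # note: recent PyTorch versions may need torch.load('net.pkl', weights_only=False) to unpickle a whole module
    net2 = torch.load('net.pkl')
    prediction = net2(x)

Loading the network parameters
Network parameters: the parameters that by themselves characterize the network.
To load all of the network's parameters, first build a net3 with the same structure as net1, then copy the saved parameters into it:

net3.load_state_dict(torch.load('net_params.pkl'))
prediction = net3(x)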
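As a sanity check of the parameters-only route, the sketch below saves a state_dict to a temporary directory, loads it into a freshly built network of the same structure, and compares the outputs. The helper make_net and the temporary path are assumptions for illustration.

```python
import os
import tempfile
import torch

def make_net():
    # hypothetical helper: the same structure must be rebuilt before loading a state_dict
    return torch.nn.Sequential(
        torch.nn.Linear(1, 10),
        torch.nn.ReLU(),
        torch.nn.Linear(10, 1),
    )

net1 = make_net()
x = torch.linspace(-1, 1, 5).unsqueeze(1)   # dummy input, shape (5, 1)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "net_params.pkl")
    torch.save(net1.state_dict(), path)      # parameters only
    net3 = make_net()                        # new net, randomly initialized
    net3.load_state_dict(torch.load(path))   # overwrite with the saved parameters

same = torch.equal(net1(x), net3(x))
print(same)
```

Saving only the state_dict keeps the file independent of how the class was pickled, at the cost of having to rebuild the structure in code.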

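Batch training

DataLoader

import torch.utils.data as Data

# first convert the data into a Dataset that torch understands
torch_dataset = Data.TensorDataset(x, y)

# put the dataset into a DataLoader
loader = Data.DataLoader(
    dataset=torch_dataset,  # torch TensorDataset format
    batch_size=BATCH_SIZE,  # mini-batch size, i.e. how many samples to take each time
    shuffle=True,  # whether to shuffle the data (shuffling is usually better)
    num_workers=2,  # use 2 worker subprocesses to load the data
)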
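To show what the loader actually yields, here is a small sketch that iterates over it for two epochs. BATCH_SIZE = 4 and the ten-sample tensors are assumed values; num_workers is set to 0 so the example runs in the main process.

```python
import torch
import torch.utils.data as Data

BATCH_SIZE = 4                      # assumed value for illustration
x = torch.linspace(1, 10, 10)       # 10 samples
y = torch.linspace(10, 1, 10)

dataset = Data.TensorDataset(x, y)
loader = Data.DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=0)

batch_sizes = []
for epoch in range(2):
    for step, (batch_x, batch_y) in enumerate(loader):
        # a real training step would go here
        batch_sizes.append(len(batch_x))
        print(epoch, step, batch_x.tolist(), batch_y.tolist())
```

With 10 samples and a batch size of 4, the last batch of each epoch holds only the 2 remaining samples.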

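Optimizers

Make the neural network smarter!

SGD
Momentum
AdaGrad
RMSProp
Adam

SGD

Momentum
Think of plain SGD as a person staggering along a winding path. Momentum moves this person from flat ground onto a slope: once he takes a small step downhill, inertia keeps him going down, and he takes fewer detours. This is the Momentum parameter update.

AdaGrad
AdaGrad acts on the learning rate instead: it gives the person a pair of uncomfortable shoes, so that swaying from side to side hurts his feet. The shoes resist detours and force him to walk straight ahead.

RMSProp
RMSProp combines ideas from Momentum and AdaGrad and has advantages of both. However, RMSProp does not include all of Momentum, so Adam improves on it further.

Adam
Adam usually reaches the target quickly and well, converging fast to a good solution.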
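The analogies above correspond to the commonly stated update rules, sketched here in plain Python on the toy loss f(w) = w^2. The loss, the starting point and all hyperparameter values are assumptions for illustration, and library implementations differ in details.

```python
import math

def grad(w):
    """Gradient of the toy loss f(w) = w**2."""
    return 2.0 * w

def sgd(w, steps=200, lr=0.05):
    for _ in range(steps):
        w -= lr * grad(w)                      # step against the gradient
    return w

def momentum(w, steps=200, lr=0.05, mu=0.8):
    v = 0.0
    for _ in range(steps):
        v = mu * v - lr * grad(w)              # velocity accumulates past gradients
        w += v
    return w

def adagrad(w, steps=200, lr=0.5, eps=1e-8):
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s += g * g                             # running sum of squared gradients
        w -= lr * g / (math.sqrt(s) + eps)     # larger history, smaller step
    return w

def rmsprop(w, steps=200, lr=0.05, alpha=0.9, eps=1e-8):
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s = alpha * s + (1 - alpha) * g * g    # decaying average of squared gradients
        w -= lr * g / (math.sqrt(s) + eps)
    return w

def adam(w, steps=200, lr=0.05, b1=0.9, b2=0.99, eps=1e-8):
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g              # Momentum-like first moment
        v = b2 * v + (1 - b2) * g * g          # RMSProp-like second moment
        m_hat = m / (1 - b1 ** t)              # bias correction
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

w0 = 5.0
results = {f.__name__: f(w0) for f in (sgd, momentum, adagrad, rmsprop, adam)}
for name, w in results.items():
    print(f"{name:9s} final w = {w:+.5f}")
```

On such a simple loss every method moves toward the minimum at w = 0; the differences between them matter more on noisy, high-dimensional problems.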

Optimizer

SGD
Momentum
RMSProp
Adam

net_SGD = Net()
net_Momentum = Net()
net_RMSprop = Net()
net_Adam = Net()

# different optimizers
opt_SGD = torch.optim.SGD(net_SGD.parameters(), lr=LR)
opt_Momentum = torch.optim.SGD(net_Momentum.parameters(), lr=LR, momentum=0.8)
opt_RMSprop = torch.optim.RMSprop(net_RMSprop.parameters(), lr=LR, alpha=0.9)
opt_Adam = torch.optim.Adam(net_Adam.parameters(), lr=LR, betas=(0.9, 0.99))

In experiments it is still worth trying each optimizer to see which one works better.

CNN

Reading the architecture from bottom to top: the input image first passes through a convolution layer, and the convolved information is then processed by pooling, here max pooling.
The same processing is applied once more, and the result of this second pass is fed into two fully connected layers, which are just ordinary neural network layers. Finally a classifier is attached for the class prediction. This is only a brief introduction to how a convolutional neural network processes images.
Convolutional layer: mainly extracts features.
Max pooling layer: mainly downsamples, largely without harming the recognition result.
Fully connected layer: mainly performs the classification.

import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Sequential(  # input shape (1, 28, 28)
            nn.Conv2d(
                in_channels=1,  # number of input channels
                out_channels=16,  # n_filters
                kernel_size=5,  # filter size
                stride=1,  # filter movement/step
                padding=2,  # to keep the Conv2d output the same width and height, use padding=(kernel_size-1)/2 when stride=1
            ),  # output shape (16, 28, 28)
            nn.ReLU(),  # activation
            nn.MaxPool2d(kernel_size=2),  # downsample over 2x2 regions, output shape (16, 14, 14)
        )
        self.conv2 = nn.Sequential(  # input shape (16, 14, 14)
            nn.Conv2d(16, 32, 5, 1, 2),  # output shape (32, 14, 14)
            nn.ReLU(),  # activation
            nn.MaxPool2d(2),  # output shape (32, 7, 7)
        )
        self.out = nn.Linear(32 * 7 * 7, 10)  # fully connected layer, output 10 classes

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size(0), -1)  # flatten the feature maps to (batch_size, 32 * 7 * 7)
        output = self.out(x)
        return output

The overall flow of this CNN is: convolution (Conv2d) -> activation (ReLU) -> pooling, downsampling (MaxPool2d) -> the same steps once more -> flatten the multi-dimensional feature maps -> fully connected layer (Linear) -> output
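The shapes quoted in the comments follow from the usual output-size formula for convolution and pooling without dilation. The sketch below computes them by hand and checks them against PyTorch on a dummy batch; the helper conv_out is a hypothetical name introduced here.

```python
import torch
import torch.nn as nn

def conv_out(size, kernel, stride=1, padding=0):
    # output size of a convolution or pooling layer (no dilation)
    return (size + 2 * padding - kernel) // stride + 1

after_conv1 = conv_out(28, kernel=5, stride=1, padding=2)            # padding=2 keeps 28
after_pool1 = conv_out(after_conv1, kernel=2, stride=2)              # 2x2 pooling halves it
after_conv2 = conv_out(after_pool1, kernel=5, stride=1, padding=2)
after_pool2 = conv_out(after_conv2, kernel=2, stride=2)

layers = nn.Sequential(
    nn.Conv2d(1, 16, 5, 1, 2), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 5, 1, 2), nn.ReLU(), nn.MaxPool2d(2),
)
features = layers(torch.zeros(4, 1, 28, 28))   # dummy batch of 4 grayscale images
flat = features.view(features.size(0), -1)     # what the Linear layer receives

print(after_conv1, after_pool1, after_pool2, tuple(features.shape), tuple(flat.shape))
```

This is why the final Linear layer takes 32 * 7 * 7 inputs.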

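RNN

An RNN learns from sequential data. When the error is propagated backward, it is multiplied by a weight W at every time step. If W is less than 1, the error shrinks toward 0 by the time it reaches the earliest time step, which is the vanishing gradient problem; if W is greater than 1, the error blows up, which is the exploding gradient problem. LSTM was proposed to address this problem.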
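A quick piece of arithmetic shows how fast this happens. The sketch treats W as a single scalar and uses 100 time steps; both are simplifying assumptions.

```python
steps = 100                       # assumed sequence length

vanishing = 0.9 ** steps          # W slightly below 1
unchanged = 1.0 ** steps          # W exactly 1
exploding = 1.1 ** steps          # W slightly above 1

print(f"0.9^{steps} = {vanishing:.3e}")
print(f"1.0^{steps} = {unchanged:.3e}")
print(f"1.1^{steps} = {exploding:.3e}")
```

Even a weight only 10% away from 1 changes the error by more than four orders of magnitude over 100 steps.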

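LSTM recurrent neural network

Main line: the main storyline (the cell state).
Side line: the original RNN pathway (the hidden state).
1. Input: new information is written into the main storyline according to its importance.
2. Forget: if the current side storyline changes our view of the earlier plot, the forget control discards part of the old main storyline and replaces it proportionally with the new plot.
3. Output: decides what to output based on the current main storyline and side storyline.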

RNN classification

import torch.nn as nn

class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.LSTM(  # LSTM works much better than nn.RNN() here
            input_size=28,  # number of pixels in each image row
            hidden_size=64,  # rnn hidden units
            num_layers=1,  # number of RNN layers
            batch_first=True,  # input & output have batch size as the first dimension, e.g. (batch, time_step, input_size)
        )
        self.out = nn.Linear(64, 10)  # output layer

    def forward(self, x):
        # x shape (batch, time_step, input_size)
        # r_out shape (batch, time_step, hidden_size)
        # h_n shape (n_layers, batch, hidden_size); LSTM has two hidden states, h_n is the side line, h_c is the main line
        # h_c shape (n_layers, batch, hidden_size)
        r_out, (h_n, h_c) = self.rnn(x, None)  # None means the initial hidden state is all zeros
        # take r_out at the last time step
        # here r_out[:, -1, :] is also the value of h_n
        out = self.out(r_out[:, -1, :])
        return out
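The shape comments, and the claim that the last time step of r_out equals h_n, can be checked directly for a single-layer, one-directional LSTM. The batch size of 5 and the random input are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=28, hidden_size=64, num_layers=1, batch_first=True)
x = torch.randn(5, 28, 28)          # (batch, time_step, input_size), dummy data

r_out, (h_n, h_c) = lstm(x)         # omitting the state starts from zeros
last_step = r_out[:, -1, :]         # output at the final time step
matches = torch.allclose(last_step, h_n[-1])

print(tuple(r_out.shape), tuple(h_n.shape), tuple(h_c.shape), matches)
```

With more than one layer, h_n[-1] is the state of the top layer, which is still what r_out reports at the last step.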


RNN regression

import torch
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.RNN(  # this time a plain RNN is enough
            input_size=1,
            hidden_size=32,  # rnn hidden units
            num_layers=1,  # number of RNN layers
            batch_first=True,  # input & output have batch size as the first dimension, e.g. (batch, time_step, input_size)
        )
        self.out = nn.Linear(32, 1)

    def forward(self, x, h_state):  # the hidden state is continuous, so we keep passing it along
        # x (batch, time_step, input_size)
        # h_state (n_layers, batch, hidden_size)
        # r_out (batch, time_step, hidden_size)
        r_out, h_state = self.rnn(x, h_state)  # h_state is also an input to the RNN
        outs = []  # predictions for all time steps
        for time_step in range(r_out.size(1)):  # compute the output for every time step
            outs.append(self.out(r_out[:, time_step, :]))
        return torch.stack(outs, dim=1), h_state
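Two details of this forward pass are worth seeing in isolation: the per-time-step loop gives the same result as applying the Linear layer to r_out in one call, and the returned hidden state can be fed into the next chunk of the sequence. The sketch below uses random data, and detaching the state between chunks is an assumption about how one would typically train, not something stated above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=1, hidden_size=32, num_layers=1, batch_first=True)
out = nn.Linear(32, 1)

x = torch.randn(1, 10, 1)            # (batch, time_step, input_size), dummy data
r_out, h_state = rnn(x, None)        # None means an initial hidden state of zeros

# per-time-step loop, as in the forward method above
looped = torch.stack([out(r_out[:, t, :]) for t in range(r_out.size(1))], dim=1)
# nn.Linear acts on the last dimension, so it can also take r_out directly
direct = out(r_out)
same = torch.allclose(looped, direct, atol=1e-6)

h_state = h_state.detach()           # cut the graph before reusing the state
r_out2, h_state2 = rnn(torch.randn(1, 10, 1), h_state)

print(same, tuple(looped.shape), tuple(h_state2.shape))
```

The loop form makes the per-step computation explicit, while the single call is shorter.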

Autoencoder

An autoencoder is a form of unsupervised learning: it takes in a large amount of input information and then summarizes the essence of the original data.
Encoder
Reduces the dimensionality of the features.
Decoder
Decompresses the essence back into the original information.

AutoEncoder

import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self):
        super(AutoEncoder, self).__init__()
        # compress
        self.encoder = nn.Sequential(
            nn.Linear(28*28, 128),  # 28*28 -> 128
            nn.Tanh(),
            nn.Linear(128, 64),  # 128 -> 64
            nn.Tanh(),
            nn.Linear(64, 12),  # 64 -> 12
            nn.Tanh(),
            nn.Linear(12, 3),  # compress to 3 features, for 3D visualization
        )
        # decompress
        self.decoder = nn.Sequential(
            nn.Linear(3, 12),
            nn.Tanh(),
            nn.Linear(12, 64),
            nn.Tanh(),
            nn.Linear(64, 128),
            nn.Tanh(),
            nn.Linear(128, 28*28),
            nn.Sigmoid(),  # activation that keeps the output in (0, 1)
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded

autoencoder = AutoEncoder()
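Because the learning is unsupervised, the training target is the input itself. The following sketch runs a few training steps on random stand-in data with a shortened encoder and decoder; the layer sizes, learning rate and data are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# shortened encoder/decoder (assumed sizes), same idea as the class above
encoder = nn.Sequential(nn.Linear(28 * 28, 128), nn.Tanh(), nn.Linear(128, 3))
decoder = nn.Sequential(nn.Linear(3, 128), nn.Tanh(), nn.Linear(128, 28 * 28), nn.Sigmoid())

params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=0.005)
loss_func = nn.MSELoss()

images = torch.rand(16, 28 * 28)     # stand-in for a batch of flattened images in [0, 1)

for step in range(5):
    encoded = encoder(images)        # compress to 3 features
    decoded = decoder(encoded)       # reconstruct 28*28 values
    loss = loss_func(decoded, images)   # the target is the input itself
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(tuple(encoded.shape), tuple(decoded.shape), loss.item())
```

The 3-dimensional encoded output is what would be plotted in a 3D visualization.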

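DQN

DQN is a reinforcement learning method that combines a neural network with Q-learning.

The neural network predicts the values Q(s2, a1) and Q(s2, a2); these are the Q estimates.

The action with the largest Q estimate is chosen, in exchange for a reward from the environment.

The Q target ("Q reality") is the same value that was used earlier in Q-learning.

The parameters of the neural network are then updated.

Basic structure of the target network and the evaluation network

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(N_STATES, 10)
        self.fc1.weight.data.normal_(0, 0.1)  # initialization
        self.out = nn.Linear(10, N_ACTIONS)
        self.out.weight.data.normal_(0, 0.1)  # initialization

    def forward(self, x):
        x = self.fc1(x)
        x = F.relu(x)
        actions_value = self.out(x)
        return actions_value

class DQN(object):
    def __init__(self):
        # build the target net, the eval net and the memory
        pass

    def choose_action(self, x):
        # mechanism for choosing an action from the environment observation
        # (returns the chosen action)
        pass

    def store_transition(self, s, a, r, s_):
        # store a memory
        # if the memory is full, overwrite old data
        pass

    def learn(self):
        # update the target network
        # learn from the memories in the memory bank
        pass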
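The learn step is left as a skeleton above. One common way to fill it in is sketched below: the Q estimate comes from the evaluation network, the Q target from the target network, and only the evaluation network is updated. The constants, the fake batch of transitions and the helper make_net are all assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N_STATES, N_ACTIONS, GAMMA = 4, 2, 0.9    # assumed values

def make_net():
    # hypothetical helper with the same layout as Net above
    return nn.Sequential(nn.Linear(N_STATES, 10), nn.ReLU(), nn.Linear(10, N_ACTIONS))

eval_net, target_net = make_net(), make_net()
target_net.load_state_dict(eval_net.state_dict())   # target net starts as a copy

# a fake batch of stored transitions (s, a, r, s_)
b_s = torch.randn(8, N_STATES)
b_a = torch.randint(0, N_ACTIONS, (8, 1))
b_r = torch.randn(8, 1)
b_s_ = torch.randn(8, N_STATES)

q_eval = eval_net(b_s).gather(1, b_a)                # Q estimate of the action taken
q_next = target_net(b_s_).detach()                   # no gradient through the target net
q_target = b_r + GAMMA * q_next.max(dim=1, keepdim=True)[0]   # Q target
loss = nn.MSELoss()(q_eval, q_target)

optimizer = torch.optim.Adam(eval_net.parameters(), lr=0.01)
optimizer.zero_grad()
loss.backward()
optimizer.step()                                     # only eval_net is updated

print(tuple(q_eval.shape), tuple(q_target.shape), loss.item())
```

In a full agent the target network would be refreshed from the evaluation network every fixed number of learning steps.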

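GAN

GAN in plain words: a novice painter paints from random inspiration; a novice critic receives paintings (without knowing whether each one is by the novice or by a famous painter), gives a verdict, and at the same time tells the novice how to paint. As a result the novice's paintings look more and more like those of the famous painter.

import torch.nn as nn

G = nn.Sequential(  # Generator
    nn.Linear(N_IDEAS, 128),  # random ideas (could be from a normal distribution)
    nn.ReLU(),
    nn.Linear(128, ART_COMPONENTS),  # making a painting from these random ideas
)

D = nn.Sequential(  # Discriminator
    nn.Linear(ART_COMPONENTS, 128),  # receive art work either from the famous artist or a newbie like G
    nn.ReLU(),
    nn.Linear(128, 1),
    nn.Sigmoid(),  # tell the probability that the art work is made by the artist
)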
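To show how the painter and the critic push against each other, here is a sketch of a single training step. It uses binary cross-entropy with the common non-saturating generator loss, which is a swapped-in choice and not necessarily the loss used in the original tutorial; the constants and the stand-in "famous" paintings are also assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N_IDEAS, ART_COMPONENTS, BATCH = 5, 15, 64     # assumed values

G = nn.Sequential(nn.Linear(N_IDEAS, 128), nn.ReLU(), nn.Linear(128, ART_COMPONENTS))
D = nn.Sequential(nn.Linear(ART_COMPONENTS, 128), nn.ReLU(), nn.Linear(128, 1), nn.Sigmoid())
opt_D = torch.optim.Adam(D.parameters(), lr=1e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=1e-4)
bce = nn.BCELoss()

real = torch.rand(BATCH, ART_COMPONENTS)       # stand-in for the famous artist's paintings
fake = G(torch.randn(BATCH, N_IDEAS))          # paintings made from random ideas

# critic step: real paintings should score 1, fakes 0 (fake is detached so G is untouched)
d_loss = bce(D(real), torch.ones(BATCH, 1)) + bce(D(fake.detach()), torch.zeros(BATCH, 1))
opt_D.zero_grad()
d_loss.backward()
opt_D.step()

# painter step: try to make the critic score the fakes as real
g_loss = bce(D(fake), torch.ones(BATCH, 1))
opt_G.zero_grad()
g_loss.backward()
opt_G.step()

print(d_loss.item(), g_loss.item())
```

Repeating these two steps over many batches is what gradually moves the generated paintings toward the real ones.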

Using Dropout to reduce overfitting

torch.nn.Dropout(0.5): here 0.5 means that a random 50% of the neurons are switched off (dropped) during training.

net_dropped = torch.nn.Sequential(
    torch.nn.Linear(1, N_HIDDEN),
    torch.nn.Dropout(0.5),  # drop 50% of the neurons
    torch.nn.ReLU(),
    torch.nn.Linear(N_HIDDEN, N_HIDDEN),
    torch.nn.Dropout(0.5),  # drop 50% of the neurons
    torch.nn.ReLU(),
    torch.nn.Linear(N_HIDDEN, 1),
)
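Dropout behaves differently in training and evaluation mode, which is easy to overlook when testing a network. The sketch below applies a Dropout layer to a vector of ones in both modes; the vector length of 1000 is an assumption for illustration.

```python
import torch

torch.manual_seed(0)
drop = torch.nn.Dropout(0.5)
x = torch.ones(1000)

drop.train()                 # training mode: units are dropped at random
y_train = drop(x)            # survivors are scaled by 1 / (1 - 0.5) = 2

drop.eval()                  # evaluation mode: dropout is switched off
y_eval = drop(x)             # output equals the input

zero_fraction = (y_train == 0).float().mean().item()
print(zero_fraction, y_train.unique().tolist(), torch.equal(y_eval, x))
```

For a whole network, calling net.eval() before prediction and net.train() before resuming training switches every Dropout layer at once.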

Batch Normalization (BN)

Batch normalization is a way of bringing scattered data into a uniform scale.
BN is added between each fully connected layer and its activation function, normalizing every layer of the network.

from torch import nn
from torch.nn import init

# N_HIDDEN, B_INIT and ACTIVATION are constants assumed to be defined elsewhere

class Net(nn.Module):
    def __init__(self, batch_normalization=False):
        super(Net, self).__init__()
        self.do_bn = batch_normalization
        self.fcs = []  # too many layers, so build them in a for loop
        self.bns = []
        self.bn_input = nn.BatchNorm1d(1, momentum=0.5)  # BN for the input

        for i in range(N_HIDDEN):  # build the layers
            input_size = 1 if i == 0 else 10
            fc = nn.Linear(input_size, 10)
            setattr(self, 'fc%i' % i, fc)  # important: PyTorch requires each layer to be set as an attribute of the class; I spent 2 days finding this bug
            self._set_init(fc)  # parameter initialization
            self.fcs.append(fc)
            if self.do_bn:
                bn = nn.BatchNorm1d(10, momentum=0.5)
                setattr(self, 'bn%i' % i, bn)  # important: same attribute requirement as for the fc layers
                self.bns.append(bn)

        self.predict = nn.Linear(10, 1)  # output layer
        self._set_init(self.predict)  # parameter initialization

    def _set_init(self, layer):  # parameter initialization
        init.normal_(layer.weight, mean=0., std=.1)
        init.constant_(layer.bias, B_INIT)

    def forward(self, x):
        pre_activation = [x]
        if self.do_bn:
            x = self.bn_input(x)  # apply BN to the input if enabled
        layer_input = [x]
        for i in range(N_HIDDEN):
            x = self.fcs[i](x)
            pre_activation.append(x)  # kept for plotting later
            if self.do_bn:
                x = self.bns[i](x)  # apply BN if enabled
            x = ACTIVATION(x)
            layer_input.append(x)  # kept for plotting later
        out = self.predict(x)
        return out, layer_input, pre_activation

# build two nets, one without BN and one with BN
nets = [Net(batch_normalization=False), Net(batch_normalization=True)]
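What "a uniform scale" means can be checked on a single BatchNorm1d layer: in training mode each feature column of the output has mean close to 0 and standard deviation close to 1, whatever the scale of the input. The input statistics (mean about 3, standard deviation about 5) are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(256, 10) * 5 + 3        # scattered data: mean about 3, std about 5
bn = nn.BatchNorm1d(10, momentum=0.5)

bn.train()
y = bn(x)                               # normalized with this batch's own statistics

col_mean = y.mean(dim=0)                # per-feature mean after BN
col_std = y.std(dim=0, unbiased=False)  # per-feature std after BN

print(x.mean().item(), x.std().item())
print(col_mean.abs().max().item(), col_std.mean().item())
```

In evaluation mode the layer uses its running averages instead of the current batch, which is where the momentum argument comes in.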
