
Deep Learning Notes: RNN training caveat: in PyTorch you must set hidden = hidden.data before each batch, otherwise backpropagation will traverse all previous timesteps


In PyTorch, before training on each batch you need to set hidden = hidden.data; otherwise the backpropagated gradients will traverse all previous timesteps.

TensorFlow also carries new_state forward from batch to batch, but there is no explicit detach step. The graph is truncated by TensorFlow's own mechanism: gradients only flow through the timesteps unrolled within the current batch, because the state leaves and re-enters the graph between sess.run calls:

for e in range(epochs):
    # Train network
    new_state = sess.run(model.initial_state)
    loss = 0
    for x, y in get_batches(encoded, batch_size, num_steps):
        counter += 1
        start = time.time()
        feed = {model.inputs: x,
                model.targets: y,
                model.keep_prob: keep_prob,
                model.initial_state: new_state}  # previous state fed back in as plain data
        batch_loss, new_state, _ = sess.run([model.loss, model.final_state, model.optimizer],
                                            feed_dict=feed)
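The reason no explicit detach is needed here: sess.run returns plain NumPy values, and anything fed back through feed_dict enters the graph as constant input data. A quick check (a sketch, assuming the model and session above):

new_state = sess.run(model.initial_state)
print(type(new_state))
# For an LSTM-based model this is typically a (tuple of) LSTMStateTuple whose
# fields are numpy.ndarray -- plain data with no graph history attached, so
# feeding it back via feed_dict cannot carry gradients across batch boundaries.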

PyTorch, by contrast, uses automatic differentiation and records the full history of the hidden state. So before each batch you must explicitly pull the value out of the state with hidden = hidden.data. This is equivalent to detaching it: the backward pass stops there instead of traversing all previous timesteps.
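To see why the detach is mandatory, here is a minimal failing sketch (toy shapes, not from the original post): without the detach, the second batch's backward() tries to traverse the first batch's graph, which autograd has already freed, and PyTorch raises a RuntimeError.

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=1, hidden_size=8, batch_first=True)
hidden = torch.zeros(1, 4, 8)      # (num_layers, batch, hidden_size)

for step in range(2):
    x = torch.randn(4, 5, 1)       # (batch, seq_len, input_size)
    out, hidden = rnn(x, hidden)
    loss = out.sum()
    # Step 0 succeeds; step 1 raises "Trying to backward through the graph a
    # second time" because `hidden` still points into step 0's freed graph.
    loss.backward()
    # Fix: cut the history at the batch boundary before the next iteration:
    # hidden = hidden.data         # the post's idiom
    # hidden = hidden.detach()     # the modern equivalent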

# train the RNN
def train(rnn, n_steps, print_every):
    # initialize the hidden state
    hidden = None
    for batch_i, step in enumerate(range(n_steps)):
        # defining the training data
        time_steps = np.linspace(step * np.pi, (step + 1) * np.pi, seq_length + 1)
        data = np.sin(time_steps)
        data.resize((seq_length + 1, 1))  # input_size=1
        x = data[:-1]
        y = data[1:]

        # convert data into Tensors
        x_tensor = torch.Tensor(x).unsqueeze(0)  # unsqueeze gives a 1, batch_size dimension
        y_tensor = torch.Tensor(y)

        # outputs from the rnn
        prediction, hidden = rnn(x_tensor, hidden)

        ## Representing Memory ##
        # make a new variable for hidden and detach the hidden state from its history
        # this way, we don't backpropagate through the entire history
        hidden = hidden.data

        # calculate the loss
        loss = criterion(prediction, y_tensor)
        # zero gradients
        optimizer.zero_grad()
        # perform backprop and update weights
        loss.backward()
        optimizer.step()

        # display loss and predictions
        if batch_i % print_every == 0:
            print('Loss: ', loss.item())
            plt.plot(time_steps[1:], x, 'r.')  # input
            plt.plot(time_steps[1:], prediction.data.numpy().flatten(), 'b.')  # predictions
            plt.show()

    return rnn
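One caveat the example above glosses over: for nn.LSTM the hidden state is an (h, c) tuple, so hidden.data alone won't work. A minimal sketch of the usual repackaging helper (names are illustrative, following the pattern in PyTorch's word_language_model example):

import torch
import torch.nn as nn

def repackage_hidden(h):
    # Detach hidden states from their history; handles both a single tensor
    # (nn.RNN / nn.GRU) and the (h, c) tuple returned by nn.LSTM.
    if isinstance(h, torch.Tensor):
        return h.detach()
    return tuple(repackage_hidden(v) for v in h)

lstm = nn.LSTM(input_size=1, hidden_size=32, batch_first=True)
hidden = None
for _ in range(5):
    x = torch.randn(8, 20, 1)          # (batch, seq_len, input_size)
    out, hidden = lstm(x, hidden)
    hidden = repackage_hidden(hidden)  # gradients stop at each batch boundary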

Summary

In PyTorch, detach the hidden state before every batch (hidden = hidden.data, or equivalently hidden.detach()); otherwise autograd backpropagates through the entire history of previous timesteps. In TensorFlow the sess.run / feed_dict loop truncates that history implicitly, since the state is passed between batches as plain data.