
TensorFlow: Understanding the tf.contrib.rnn.DropoutWrapper function (Google has patented Dropout!) and the MultiRNNCell function

Published: 2025/3/21 38 豆豆


1. Understanding the tf.contrib.rnn.DropoutWrapper function

1.1 Source code walkthrough

1.2 Example usage

2. Understanding the tf.contrib.rnn.MultiRNNCell function

2.1 Source code walkthrough

2.2 Example usage

TensorFlow official API documentation: https://tensorflow.google.cn/api_docs

1. Understanding the tf.contrib.rnn.DropoutWrapper function

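In machine learning, a model with too many parameters and too few training samples overfits easily, and overfitting is a common problem when training neural networks. It shows up as follows: on the training data the loss is small and the prediction accuracy is high, but on the test data the loss is large and the prediction accuracy is low.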

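Overfitting is a real headache in model training. Dropout, proposed by Geoffrey Hinton in 2012, is very effective at preventing it. Many Dropout variants followed, and the technique became a standard training trick among machine learning researchers. Unexpectedly, Google applied for a patent on the technique, and the patent is now officially in force: it took effect on 2019-06-26 and expires on 2034-09-03.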

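Dropout means that, each time data flows through the network, every unit works normally with probability keep_prob and otherwise outputs 0. It is an effective regularization method that reduces overfitting. When dropout is applied to an RNN in the usual way, the recurrent connection itself is not dropped: the state (memory) passed from time step t-1 to time step t is left intact, and dropout is applied only within the same time step t, where information passes between stacked cell layers. In other words, dropout in an RNN is used on the inputs, on the outputs, between different recurrent layers, or on fully connected layers, but not inside the recurrence of a single layer.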
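The keep_prob rule described above can be sketched in plain Python. This is an illustrative sketch, not TensorFlow's implementation: the helper name dropout and the rng argument are assumptions made for this example. Kept values are scaled by 1 / keep_prob (as tf.nn.dropout also does) so that the expected activation is unchanged.

```python
import random

def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob, otherwise output 0.

    Kept values are scaled by 1 / keep_prob so the expected value is unchanged.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)                 # fixed seed so the run is repeatable
activations = [1.0] * 10
dropped = dropout(activations, keep_prob=0.5, rng=rng)
print(dropped)                         # each entry is either 0.0 or 2.0
```

With keep_prob=1.0 nothing is dropped and the values pass through unchanged, which is the setting normally used at evaluation time.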

1.1 Source code walkthrough

Operator adding dropout to inputs and outputs of the given cell.
tf.compat.v1.nn.rnn_cell.DropoutWrapper(*args, **kwargs)
Args:

cell: an RNNCell, a projection to output_size is added to it.
input_keep_prob: unit Tensor or float between 0 and 1, input keep probability; if it is constant and 1, no input dropout will be added.
output_keep_prob: unit Tensor or float between 0 and 1, output keep probability; if it is constant and 1, no output dropout will be added.
state_keep_prob: unit Tensor or float between 0 and 1, state keep probability; if it is constant and 1, no state dropout will be added. State dropout is performed on the outgoing states of the cell. Note the state components to which dropout is applied when state_keep_prob is in (0, 1) are also determined by the argument dropout_state_filter_visitor (e.g. by default dropout is never applied to the c component of an LSTMStateTuple).
variational_recurrent: Python bool. If True, then the same dropout pattern is applied across all time steps per run call. If this parameter is set, input_size must be provided.
input_size: (optional) (possibly nested tuple of) TensorShape objects containing the depth(s) of the input tensors expected to be passed in to the DropoutWrapper. Required and used iff variational_recurrent = True and input_keep_prob < 1.
dtype: (optional) The dtype of the input, state, and output tensors. Required and used iff variational_recurrent = True.
seed: (optional) integer, the randomness seed.
dropout_state_filter_visitor: (optional), default: (see below). Function that takes any hierarchical level of the state and returns a scalar or depth=1 structure of Python booleans describing which terms in the state should be dropped out. In addition, if the function returns True, dropout is applied across this sublevel. If the function returns False, dropout is not applied across this entire sublevel. Default behavior: perform dropout on all terms except the memory (c) state of LSTMCellState objects, and don't try to apply dropout to TensorArray objects:

    def dropout_state_filter_visitor(s):
        if isinstance(s, LSTMCellState):
            # Never perform dropout on the c state.
            return LSTMCellState(c=False, h=True)
        elif isinstance(s, TensorArray):
            return False
        return True

**kwargs: dict of keyword arguments for base layer.
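To make the variational_recurrent argument more concrete, here is a minimal plain-Python sketch of the difference between sampling a fresh dropout mask at every time step and reusing one mask across all time steps. The helper names (make_mask, apply_mask, dropout_over_time) are hypothetical and only illustrate the idea; they are not part of TensorFlow.

```python
import random

def make_mask(size, keep_prob, rng):
    """Sample a 0/1 keep mask: 1 with probability keep_prob."""
    return [1.0 if rng.random() < keep_prob else 0.0 for _ in range(size)]

def apply_mask(vector, mask, keep_prob):
    """Zero the dropped units and rescale the kept ones by 1 / keep_prob."""
    return [v * m / keep_prob for v, m in zip(vector, mask)]

def dropout_over_time(sequence, keep_prob, variational, rng):
    """Apply dropout to every time step of a sequence of vectors.

    variational=True  -> one mask, sampled once, shared by all time steps.
    variational=False -> a new mask is sampled at every time step.
    """
    size = len(sequence[0])
    shared = make_mask(size, keep_prob, rng) if variational else None
    outputs, masks = [], []
    for step in sequence:
        mask = shared if variational else make_mask(size, keep_prob, rng)
        masks.append(mask)
        outputs.append(apply_mask(step, mask, keep_prob))
    return outputs, masks

sequence = [[1.0] * 8 for _ in range(5)]   # 5 time steps, depth 8
_, per_step_masks = dropout_over_time(sequence, 0.5, False, random.Random(0))
_, shared_masks = dropout_over_time(sequence, 0.5, True, random.Random(0))
```

In the shared case the mask has to be sampled before any input is seen, so its depth must be known in advance. That is presumably why the documentation requires input_size when variational_recurrent is set.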
Methods

get_initial_state

get_initial_state(inputs=None, batch_size=None, dtype=None)

zero_state

zero_state(batch_size, dtype)
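The default state filter quoted in the Args above (dropout on the h component, never on the c memory) can be illustrated with a small stand-alone sketch. LSTMState here is a hypothetical namedtuple standing in for the library's LSTM state type, and dropout_state is an illustrative helper, not a TensorFlow function.

```python
import random
from collections import namedtuple

# Hypothetical stand-in for the library's LSTM state tuple.
LSTMState = namedtuple("LSTMState", ["c", "h"])

def state_filter(state):
    """Return booleans saying which parts of the state may be dropped out."""
    if isinstance(state, LSTMState):
        return LSTMState(c=False, h=True)   # never drop the memory (c) state
    return True

def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob (rescaled), else output 0."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

def dropout_state(state, keep_prob, rng):
    """Apply dropout only to the components the filter marks as True."""
    flags = state_filter(state)
    if isinstance(state, LSTMState):
        return LSTMState(
            c=dropout(state.c, keep_prob, rng) if flags.c else state.c,
            h=dropout(state.h, keep_prob, rng) if flags.h else state.h,
        )
    return dropout(state, keep_prob, rng) if flags else state

state = LSTMState(c=[1.0, 2.0, 3.0], h=[1.0, 1.0, 1.0])
new_state = dropout_state(state, keep_prob=0.5, rng=random.Random(0))
print(new_state.c)   # unchanged: [1.0, 2.0, 3.0]
```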

1.2 Example usage

Related article: TF/LSTM: multi-class classification on the MNIST handwritten digit dataset with a multi-layer LSTM

from tensorflow.contrib import rnn  # TF 1.x only

# Define one LSTM layer. Only hidden_size is needed; the input dimension of X is matched automatically.
lstm_cell = rnn.BasicLSTMCell(num_units=hidden_size, forget_bias=1.0, state_is_tuple=True)
# Add a dropout layer. Usually only output_keep_prob is set.
lstm_cell = rnn.DropoutWrapper(cell=lstm_cell, input_keep_prob=1.0, output_keep_prob=keep_prob)
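The wrapper pattern used above can be sketched without TensorFlow. ToyCell and ToyDropoutWrapper are hypothetical classes written for this illustration; they only mimic the idea that the wrapper drops units on the way into and out of the cell while the recurrent state is passed on untouched.

```python
import random

class ToyCell:
    """Hypothetical cell: output and new state are both inputs + state."""
    def __call__(self, inputs, state):
        new_state = [x + s for x, s in zip(inputs, state)]
        return new_state, new_state

class ToyDropoutWrapper:
    """Wraps a cell and applies dropout to its inputs and outputs only."""
    def __init__(self, cell, input_keep_prob=1.0, output_keep_prob=1.0, seed=None):
        self._cell = cell
        self._input_keep_prob = input_keep_prob
        self._output_keep_prob = output_keep_prob
        self._rng = random.Random(seed)

    def _dropout(self, values, keep_prob):
        if keep_prob >= 1.0:              # constant 1 means no dropout is added
            return list(values)
        return [v / keep_prob if self._rng.random() < keep_prob else 0.0
                for v in values]

    def __call__(self, inputs, state):
        inputs = self._dropout(inputs, self._input_keep_prob)
        output, new_state = self._cell(inputs, state)
        output = self._dropout(output, self._output_keep_prob)
        return output, new_state          # the state itself is not dropped

cell = ToyDropoutWrapper(ToyCell(), input_keep_prob=1.0, output_keep_prob=0.5, seed=0)
output, state = cell([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0])
```

Because dropout is a wrapper rather than part of the cell, the same cell definition can be reused at evaluation time simply by setting the keep probabilities to 1.0.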

2. Understanding the tf.contrib.rnn.MultiRNNCell function

2.1 Source code walkthrough

RNN cell composed sequentially of multiple simple cells.


tf.compat.v1.nn.rnn_cell.MultiRNNCell(cells, state_is_tuple=True)

Args:

cells: list of RNNCells that will be composed in this order.
state_is_tuple: If True, accepted and returned states are n-tuples, where n = len(cells). If False, the states are all concatenated along the column axis. This latter behavior will soon be deprecated.

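A toy version of sequential composition may help here: the output of each cell becomes the input of the next, and the overall state is an n-tuple with one entry per cell (n = len(cells)). ToyCell and ToyMultiCell are hypothetical classes for illustration, with made-up arithmetic in place of a real recurrent update.

```python
class ToyCell:
    """Hypothetical cell with num_units state values.

    Each state value is increased by the sum of the inputs; the output
    equals the new state, so its depth is num_units.
    """
    def __init__(self, num_units):
        self.num_units = num_units

    def __call__(self, inputs, state):
        total = sum(inputs)
        new_state = [s + total for s in state]
        return new_state, new_state

class ToyMultiCell:
    """Runs the cells in list order; the state is a tuple, one entry per cell."""
    def __init__(self, cells):
        self._cells = list(cells)

    def zero_state(self):
        return tuple([0.0] * cell.num_units for cell in self._cells)

    def __call__(self, inputs, state):
        if len(state) != len(self._cells):
            raise ValueError("expected one state per cell")
        new_states = []
        current = inputs
        for cell, cell_state in zip(self._cells, state):
            # The output of this layer is the input of the next layer.
            current, new_cell_state = cell(current, cell_state)
            new_states.append(new_cell_state)
        return current, tuple(new_states)

stacked = ToyMultiCell([ToyCell(4), ToyCell(2)])
output, state = stacked([1.0, 2.0], stacked.zero_state())
print(output)        # [12.0, 12.0]
print(len(state))    # 2, one state per cell
```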

Methods

get_initial_state

get_initial_state(inputs=None, batch_size=None, dtype=None)

zero_state

zero_state(batch_size, dtype)

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is a N-D tensor of shape [batch_size, state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size, s] for each s in state_size.

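The return rule for zero_state can be mimicked with nested Python lists. This is a sketch under simplifying assumptions: state_size is either an int or a plain nested list/tuple of ints, and a "tensor" of shape [batch_size, s] is represented as a list of batch_size rows with s zeros each. The function below is a stand-alone illustration, not the library method.

```python
def zero_state(batch_size, state_size):
    """Return zero-filled state(s) shaped [batch_size, s] for each s in state_size."""
    if isinstance(state_size, int):
        return [[0.0] * state_size for _ in range(batch_size)]
    # Nested list or tuple: recurse and keep the same container type.
    # (Plain list/tuple only; namedtuples would need special handling.)
    return type(state_size)(zero_state(batch_size, s) for s in state_size)

single = zero_state(2, 3)                    # one 2 x 3 block of zeros
nested = zero_state(2, ((3, 3), (4,)))       # same nesting as state_size
print(single)                                # [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
```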

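2.2 Example usage

Related article: DL/LSTM: an introduction to the LSTM paper (principles, key steps, RNN/LSTM/GRU comparison, single-layer and multi-layer LSTM) with a detailed worked example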

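from tensorflow.contrib.rnn import BasicLSTMCell, MultiRNNCell  # TF 1.x only

num_units = [128, 64]
# One LSTM cell per layer, composed in list order.
cells = [BasicLSTMCell(num_units=n) for n in num_units]
stacked_rnn_cell = MultiRNNCell(cells)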
