
深度学习优化算法实现(Momentum, Adam)


Contents

  • Momentum
    • Initialization
    • Updating parameters
  • Adam
    • Initialization
    • Updating parameters

Besides plain gradient descent, there are several general-purpose optimization algorithms that usually outperform it. This post only records the Momentum and Adam algorithms encountered while completing the optimization assignment in Andrew Ng's deep learning course, and only gives the essential code. For the underlying theory, see 深度学习优化算法解析(Momentum, RMSProp, Adam), which explains in more detail the three optimization algorithms covered in Ng's course.
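For quick reference, these are the update rules the code below implements, written in the notation of Ng's assignment (for each layer l, with learning rate \alpha and time step t; the same updates apply to the bias terms b^{[l]}):

\begin{aligned}
\text{Momentum:}\quad & v_{dW^{[l]}} = \beta\, v_{dW^{[l]}} + (1-\beta)\, dW^{[l]}, \qquad W^{[l]} \leftarrow W^{[l]} - \alpha\, v_{dW^{[l]}} \\
\text{Adam:}\quad & v_{dW^{[l]}} = \beta_1\, v_{dW^{[l]}} + (1-\beta_1)\, dW^{[l]}, \qquad v^{\text{corrected}}_{dW^{[l]}} = \frac{v_{dW^{[l]}}}{1-\beta_1^{\,t}} \\
& s_{dW^{[l]}} = \beta_2\, s_{dW^{[l]}} + (1-\beta_2)\, \big(dW^{[l]}\big)^2, \qquad s^{\text{corrected}}_{dW^{[l]}} = \frac{s_{dW^{[l]}}}{1-\beta_2^{\,t}} \\
& W^{[l]} \leftarrow W^{[l]} - \alpha\, \frac{v^{\text{corrected}}_{dW^{[l]}}}{\sqrt{s^{\text{corrected}}_{dW^{[l]}}} + \epsilon}
\end{aligned}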

Momentum

Initialization

import numpy as np


def initialize_velocity(parameters):
    """
    Initializes the velocity as a python dictionary with:
        - keys: "dW1", "db1", ..., "dWL", "dbL"
        - values: numpy arrays of zeros of the same shape as the
          corresponding gradients/parameters.

    Arguments:
    parameters -- python dictionary containing your parameters.
        parameters['W' + str(l)] = Wl
        parameters['b' + str(l)] = bl

    Returns:
    v -- python dictionary containing the current velocity.
        v['dW' + str(l)] = velocity of dWl
        v['db' + str(l)] = velocity of dbl
    """
    L = len(parameters) // 2  # number of layers in the neural network
    v = {}

    # Initialize each velocity with zeros of the same shape as its parameter
    for l in range(L):
        v['dW' + str(l + 1)] = np.zeros(np.shape(parameters['W' + str(l + 1)]))
        v['db' + str(l + 1)] = np.zeros(np.shape(parameters['b' + str(l + 1)]))

    return v

Updating parameters

def update_parameters_with_momentum(parameters, grads, v, beta, learning_rate):
    """
    Update parameters using Momentum.

    Arguments:
    parameters -- python dictionary containing your parameters:
        parameters['W' + str(l)] = Wl
        parameters['b' + str(l)] = bl
    grads -- python dictionary containing your gradients for each parameter:
        grads['dW' + str(l)] = dWl
        grads['db' + str(l)] = dbl
    v -- python dictionary containing the current velocity:
        v['dW' + str(l)] = ...
        v['db' + str(l)] = ...
    beta -- the momentum hyperparameter, scalar
    learning_rate -- the learning rate, scalar

    Returns:
    parameters -- python dictionary containing your updated parameters
    v -- python dictionary containing your updated velocities
    """
    L = len(parameters) // 2  # number of layers in the neural network

    # Momentum update for each parameter
    for l in range(L):
        # compute velocities
        v['dW' + str(l + 1)] = beta * v['dW' + str(l + 1)] + (1 - beta) * grads['dW' + str(l + 1)]
        v['db' + str(l + 1)] = beta * v['db' + str(l + 1)] + (1 - beta) * grads['db' + str(l + 1)]
        # update parameters
        parameters['W' + str(l + 1)] += -learning_rate * v['dW' + str(l + 1)]
        parameters['b' + str(l + 1)] += -learning_rate * v['db' + str(l + 1)]

    return parameters, v
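As a quick sanity check, here is a minimal usage sketch of the two Momentum functions above. The two-layer shapes and random values are made up purely for illustration, and the sketch assumes the functions above are already defined in the session.

import numpy as np

# Hypothetical two-layer network: shapes chosen only for illustration
parameters = {'W1': np.random.randn(3, 4), 'b1': np.zeros((3, 1)),
              'W2': np.random.randn(1, 3), 'b2': np.zeros((1, 1))}
grads = {'dW1': np.random.randn(3, 4), 'db1': np.random.randn(3, 1),
         'dW2': np.random.randn(1, 3), 'db2': np.random.randn(1, 1)}

v = initialize_velocity(parameters)   # velocities start at zero
parameters, v = update_parameters_with_momentum(parameters, grads, v,
                                                beta=0.9, learning_rate=0.01)
print(parameters['W1'].shape, v['dW1'].shape)   # shapes are preserved by the update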

Adam

Initialization

def initialize_adam(parameters):
    """
    Initializes v and s as two python dictionaries with:
        - keys: "dW1", "db1", ..., "dWL", "dbL"
        - values: numpy arrays of zeros of the same shape as the
          corresponding gradients/parameters.

    Arguments:
    parameters -- python dictionary containing your parameters.
        parameters["W" + str(l)] = Wl
        parameters["b" + str(l)] = bl

    Returns:
    v -- python dictionary that will contain the exponentially weighted
         average of the gradient.
        v["dW" + str(l)] = ...
        v["db" + str(l)] = ...
    s -- python dictionary that will contain the exponentially weighted
         average of the squared gradient.
        s["dW" + str(l)] = ...
        s["db" + str(l)] = ...
    """
    L = len(parameters) // 2  # number of layers in the neural network
    v = {}
    s = {}

    # Initialize v (first moment) and s (second moment) with zeros
    for l in range(L):
        v['dW' + str(l + 1)] = np.zeros(np.shape(parameters['W' + str(l + 1)]))
        v['db' + str(l + 1)] = np.zeros(np.shape(parameters['b' + str(l + 1)]))
        s['dW' + str(l + 1)] = np.zeros(np.shape(parameters['W' + str(l + 1)]))
        s['db' + str(l + 1)] = np.zeros(np.shape(parameters['b' + str(l + 1)]))

    return v, s

Updating parameters

def update_parameters_with_adam(parameters, grads, v, s, t, learning_rate=0.01,
                                beta1=0.9, beta2=0.999, epsilon=1e-8):
    """
    Update parameters using Adam.

    Arguments:
    parameters -- python dictionary containing your parameters:
        parameters['W' + str(l)] = Wl
        parameters['b' + str(l)] = bl
    grads -- python dictionary containing your gradients for each parameter:
        grads['dW' + str(l)] = dWl
        grads['db' + str(l)] = dbl
    v -- Adam variable, moving average of the first gradient, python dictionary
    s -- Adam variable, moving average of the squared gradient, python dictionary
    t -- current update step (starting from 1), used for bias correction
    learning_rate -- the learning rate, scalar
    beta1 -- exponential decay hyperparameter for the first moment estimates
    beta2 -- exponential decay hyperparameter for the second moment estimates
    epsilon -- hyperparameter preventing division by zero in Adam updates

    Returns:
    parameters -- python dictionary containing your updated parameters
    v -- Adam variable, moving average of the first gradient, python dictionary
    s -- Adam variable, moving average of the squared gradient, python dictionary
    """
    L = len(parameters) // 2  # number of layers in the neural network
    v_corrected = {}          # bias-corrected first moment estimate
    s_corrected = {}          # bias-corrected second moment estimate

    # Perform Adam update on all parameters
    for l in range(L):
        # Moving average of the gradients (first moment)
        v['dW' + str(l + 1)] = beta1 * v['dW' + str(l + 1)] + (1 - beta1) * grads['dW' + str(l + 1)]
        v['db' + str(l + 1)] = beta1 * v['db' + str(l + 1)] + (1 - beta1) * grads['db' + str(l + 1)]

        # Bias-corrected first moment estimate
        v_corrected['dW' + str(l + 1)] = v['dW' + str(l + 1)] / (1 - beta1 ** t)
        v_corrected['db' + str(l + 1)] = v['db' + str(l + 1)] / (1 - beta1 ** t)

        # Moving average of the squared gradients (second moment)
        s['dW' + str(l + 1)] = beta2 * s['dW' + str(l + 1)] + (1 - beta2) * np.square(grads['dW' + str(l + 1)])
        s['db' + str(l + 1)] = beta2 * s['db' + str(l + 1)] + (1 - beta2) * np.square(grads['db' + str(l + 1)])

        # Bias-corrected second moment estimate
        s_corrected['dW' + str(l + 1)] = s['dW' + str(l + 1)] / (1 - beta2 ** t)
        s_corrected['db' + str(l + 1)] = s['db' + str(l + 1)] / (1 - beta2 ** t)

        # Parameter update; note the bias-corrected second moment in the denominator
        parameters['W' + str(l + 1)] += -learning_rate * v_corrected['dW' + str(l + 1)] / (np.sqrt(s_corrected['dW' + str(l + 1)]) + epsilon)
        parameters['b' + str(l + 1)] += -learning_rate * v_corrected['db' + str(l + 1)] / (np.sqrt(s_corrected['db' + str(l + 1)]) + epsilon)

    return parameters, v, s
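To see the Adam helpers working end to end, here is a small self-contained sketch that minimizes a toy quadratic objective. The objective, shapes, learning rate and step count are made up for illustration and are not part of the assignment; the sketch assumes the functions above are already defined.

import numpy as np

# Toy problem: drive W1 and b1 towards zero by minimizing ||W1||^2 + ||b1||^2,
# whose gradients are simply 2*W1 and 2*b1.
parameters = {'W1': np.random.randn(2, 2), 'b1': np.random.randn(2, 1)}
v, s = initialize_adam(parameters)

for t in range(1, 201):                      # t starts at 1 for the bias correction
    grads = {'dW1': 2 * parameters['W1'],    # gradient of ||W1||^2
             'db1': 2 * parameters['b1']}    # gradient of ||b1||^2
    parameters, v, s = update_parameters_with_adam(parameters, grads, v, s, t,
                                                   learning_rate=0.1)

print(np.abs(parameters['W1']).max())        # should be close to 0 after 200 steps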
