TensorFlow2 - Neural Network Training
TensorFlow2 Neural Network Training
Table of Contents
- TensorFlow2 Neural Network Training
- Gradient Descent
- Backpropagation
- Training Visualization
- Additional Notes
Gradient Descent
- The gradient $\nabla f=\left(\frac{\partial f}{\partial x_{1}} ; \frac{\partial f}{\partial x_{2}} ; \ldots ; \frac{\partial f}{\partial x_{n}}\right)$ is the vector of partial derivatives of a function with respect to its variables $x$. Its direction is the direction in which the function value increases, and its norm measures how fast the function increases. By repeatedly moving the parameters a certain step in the direction opposite to the gradient, we can reach a minimum of the function (global or local).
- $\theta_{t+1}=\theta_{t}-\alpha_{t} \nabla f\left(\theta_{t}\right)$. This parameter update rule is called gradient descent. In practice the gradient is multiplied by a learning rate smaller than 1, because the raw gradient usually has a fairly large norm; using it directly makes the function value fluctuate and makes it hard to converge to an equilibrium point (this is also why the learning rate should not be too large).
- For many functions, however, GD (gradient descent) does not find the global optimum: it often converges to a local optimum and stops there (even though that local optimum may already be close to the global one). This is determined by the properties of the function; in practice, gradient descent performs well on convex functions.
- Deep-learning frameworks such as TensorFlow and PyTorch support automatic differentiation. In TensorFlow2, simply wrap the computation whose gradients you need inside a GradientTape context and TensorFlow will compute the gradients of the recorded operations for you. Note that tape.gradient(loss, [w1, w2, ...]) can be called only once: the recorded gradients occupy a lot of memory and are released after being retrieved. To call it multiple times, create the tape with tf.GradientTape(persistent=True) (and remember to release the tape when you are done). TensorFlow2 also supports higher-order derivatives by nesting tapes. An example follows.
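As a minimal sketch (not from the original post), the snippet below shows a single gradient call, a gradient-descent update of the form $\theta \leftarrow \theta - \alpha \nabla f(\theta)$, a persistent tape that can be queried repeatedly, and a nested tape for a second-order derivative. The variables `w`, `b`, and `x` are illustrative.

```python
import tensorflow as tf

w = tf.Variable(2.0)
b = tf.Variable(1.0)
x = tf.constant(3.0)

# 1. A plain tape: tape.gradient() may only be called once,
#    after which the recorded resources are released.
with tf.GradientTape() as tape:
    loss = (w * x + b) ** 2
dw, db = tape.gradient(loss, [w, b])

# A single gradient-descent step: theta <- theta - lr * grad
lr = 0.01
w.assign_sub(lr * dw)
b.assign_sub(lr * db)

# 2. A persistent tape: gradient() can be called repeatedly;
#    delete the tape yourself to free its resources.
with tf.GradientTape(persistent=True) as tape:
    y = w * x + b
    loss = y ** 2
dy_dw = tape.gradient(y, w)
dloss_dw = tape.gradient(loss, w)
del tape

# 3. Nested tapes give higher-order derivatives.
with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        loss = (w * x + b) ** 3
    first = t1.gradient(loss, w)   # d(loss)/dw
second = t2.gradient(first, w)     # d2(loss)/dw2

print(float(dw), float(first), float(second))
```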
Backpropagation
- Backpropagation (BP) is the core algorithm for training deep neural networks, and it is built on the chain rule. The loss at the output layer is propagated backwards through the weights (the reverse of the forward pass) to layer $i$, and the gradient at layer $i$ is then used to update that layer's parameters; this is repeated layer by layer back through the network. See the earlier BP neural network post for the detailed derivation; a small hand-written chain-rule check also follows the full example below.
- In TensorFlow2 the classic BP neural-network layer is wrapped as the fully connected (Dense) layer, which performs the hidden-layer operations of a BP network automatically. The code below uses Dense layers to build a BP network and trains it to classify the Fashion-MNIST dataset.

```python
"""
Author: Zhou Chen
Date: 2019/10/15
Desc: About
"""
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics

(x, y), (x_test, y_test) = datasets.fashion_mnist.load_data()
print(x.shape, y.shape)


def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.
    y = tf.cast(y, dtype=tf.int32)
    return x, y


batch_size = 64
db = tf.data.Dataset.from_tensor_slices((x, y))
db = db.map(preprocess).shuffle(10000).batch(batch_size)
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.map(preprocess).shuffle(10000).batch(batch_size)

model = Sequential([
    layers.Dense(256, activation=tf.nn.relu),  # [b, 784] => [b, 256]
    layers.Dense(128, activation=tf.nn.relu),  # [b, 256] => [b, 128]
    layers.Dense(64, activation=tf.nn.relu),   # [b, 128] => [b, 64]
    layers.Dense(32, activation=tf.nn.relu),   # [b, 64] => [b, 32]
    layers.Dense(10),                          # [b, 32] => [b, 10]
])
model.build(input_shape=[None, 28 * 28])
optimizer = optimizers.Adam(lr=1e-3)


def main():
    for epoch in range(30):
        # forward pass
        for step, (x, y) in enumerate(db):
            x = tf.reshape(x, [-1, 28 * 28])
            with tf.GradientTape() as tape:
                logits = model(x)
                y_onehot = tf.one_hot(y, depth=10)
                loss_mse = tf.reduce_mean(tf.losses.MSE(y_onehot, logits))
                loss_ce = tf.reduce_mean(tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True))
            grads = tape.gradient(loss_ce, model.trainable_variables)
            # backward pass: apply the gradients
            optimizer.apply_gradients(zip(grads, model.trainable_variables))
            if step % 100 == 0:
                print(epoch, step, "loss:", float(loss_mse), float(loss_ce))

        # test
        total_correct, total_num = 0, 0
        for x, y in db_test:
            x = tf.reshape(x, [-1, 28 * 28])
            logits = model(x)
            prob = tf.nn.softmax(logits, axis=1)
            pred = tf.cast(tf.argmax(prob, axis=1), dtype=tf.int32)
            correct = tf.reduce_sum(tf.cast(tf.equal(pred, y), dtype=tf.int32))
            total_correct += int(correct)
            total_num += int(x.shape[0])
        acc = total_correct / total_num
        print("acc", acc)


if __name__ == '__main__':
    main()
```
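To make the chain-rule description above concrete, here is a minimal sketch (not part of the original post) that backpropagates a mean-squared-error loss through a tiny two-layer network by hand and checks the result against tf.GradientTape. The names `w1`, `w2`, `x`, and `y_true` are illustrative.

```python
import tensorflow as tf

x = tf.constant([[1.0, 2.0]])      # input, shape [1, 2]
y_true = tf.constant([[1.0]])      # target, shape [1, 1]
w1 = tf.Variable(tf.random.normal([2, 3]))
w2 = tf.Variable(tf.random.normal([3, 1]))

with tf.GradientTape() as tape:
    h = tf.nn.sigmoid(x @ w1)      # hidden layer activation, shape [1, 3]
    y_pred = h @ w2                # linear output layer, shape [1, 1]
    loss = tf.reduce_mean((y_pred - y_true) ** 2)

auto_dw1, auto_dw2 = tape.gradient(loss, [w1, w2])

# Hand-derived chain rule for the same graph (batch size 1):
dy = 2.0 * (y_pred - y_true)               # dL/dy_pred
manual_dw2 = tf.transpose(h) @ dy          # dL/dw2
dh = dy @ tf.transpose(w2)                 # propagate the error back to h
dz = dh * h * (1.0 - h)                    # sigmoid'(z) = h * (1 - h)
manual_dw1 = tf.transpose(x) @ dz          # dL/dw1

# Both differences should be ~0 (up to float32 precision).
print(tf.reduce_max(tf.abs(auto_dw2 - manual_dw2)).numpy())
print(tf.reduce_max(tf.abs(auto_dw1 - manual_dw1)).numpy())
```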
Training Visualization
- TensorFlow ships with a companion visualization toolkit, TensorBoard (install it with pip; recent versions of TensorFlow install it automatically). It is a web-based tool for monitoring the training process and training data, and it reads its data from a designated directory on local disk. Using TensorBoard generally takes three steps: create a log directory, create a summary writer instance, and feed data to that writer (a minimal sketch of just these three steps appears at the end of this section).
- Run tensorboard --logdir logs to monitor the chosen log directory. At this point nothing has been written to the directory yet, so TensorBoard reports that there is no data to display.
- The latter two steps are usually embedded in the training loop, as in the example below. (Note that TensorBoard does not combine multiple sample images into a single grid by itself; it displays them one by one. Combining them requires your own helper, which the code below provides.)

```python
"""
Author: Zhou Chen
Date: 2019/10/15
Desc: About
"""
import tensorflow as tf
from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
import datetime
from matplotlib import pyplot as plt
import io


def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.
    y = tf.cast(y, dtype=tf.int32)
    return x, y


def plot_to_image(figure):
    # Save the plot to a PNG in memory.
    buf = io.BytesIO()
    plt.savefig(buf, format='png')
    # Closing the figure prevents it from being displayed directly inside the notebook.
    plt.close(figure)
    buf.seek(0)
    # Convert PNG buffer to TF image
    image = tf.image.decode_png(buf.getvalue(), channels=4)
    # Add the batch dimension
    image = tf.expand_dims(image, 0)
    return image


def image_grid(images):
    """Return a 5x5 grid of the MNIST images as a matplotlib figure."""
    # Create a figure to contain the plot.
    figure = plt.figure(figsize=(10, 10))
    for i in range(25):
        # Start next subplot.
        plt.subplot(5, 5, i + 1, title='name')
        plt.xticks([])
        plt.yticks([])
        plt.grid(False)
        plt.imshow(images[i], cmap=plt.cm.binary)
    return figure


batchsz = 128
(x, y), (x_val, y_val) = datasets.mnist.load_data()
print('datasets:', x.shape, y.shape, x.min(), x.max())

db = tf.data.Dataset.from_tensor_slices((x, y))
db = db.map(preprocess).shuffle(60000).batch(batchsz).repeat(10)

ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
ds_val = ds_val.map(preprocess).batch(batchsz, drop_remainder=True)

network = Sequential([layers.Dense(256, activation='relu'),
                      layers.Dense(128, activation='relu'),
                      layers.Dense(64, activation='relu'),
                      layers.Dense(32, activation='relu'),
                      layers.Dense(10)])
network.build(input_shape=(None, 28 * 28))
network.summary()

optimizer = optimizers.Adam(lr=0.01)

current_time = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
log_dir = 'logs/' + current_time
summary_writer = tf.summary.create_file_writer(log_dir)

# get x from (x, y)
sample_img = next(iter(db))[0]
# get first image instance
sample_img = sample_img[0]
sample_img = tf.reshape(sample_img, [1, 28, 28, 1])
with summary_writer.as_default():
    tf.summary.image("Training sample:", sample_img, step=0)

for step, (x, y) in enumerate(db):
    with tf.GradientTape() as tape:
        # [b, 28, 28] => [b, 784]
        x = tf.reshape(x, (-1, 28 * 28))
        # [b, 784] => [b, 10]
        out = network(x)
        # [b] => [b, 10]
        y_onehot = tf.one_hot(y, depth=10)
        # [b]
        loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y_onehot, out, from_logits=True))

    grads = tape.gradient(loss, network.trainable_variables)
    optimizer.apply_gradients(zip(grads, network.trainable_variables))

    if step % 100 == 0:
        print(step, 'loss:', float(loss))
        with summary_writer.as_default():
            tf.summary.scalar('train-loss', float(loss), step=step)

    # evaluate
    if step % 500 == 0:
        total, total_correct = 0., 0
        for _, (x, y) in enumerate(ds_val):
            # [b, 28, 28] => [b, 784]
            x = tf.reshape(x, (-1, 28 * 28))
            # [b, 784] => [b, 10]
            out = network(x)
            # [b, 10] => [b]
            pred = tf.argmax(out, axis=1)
            pred = tf.cast(pred, dtype=tf.int32)
            # bool type
            correct = tf.equal(pred, y)
            # bool tensor => int tensor => numpy
            total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy()
            total += x.shape[0]

        print(step, 'Evaluate Acc:', total_correct / total)

        val_images = x[:25]
        val_images = tf.reshape(val_images, [-1, 28, 28, 1])
        with summary_writer.as_default():
            tf.summary.scalar('test-acc', float(total_correct / total), step=step)
            tf.summary.image("val-onebyone-images:", val_images, max_outputs=25, step=step)

            val_images = tf.reshape(val_images, [-1, 28, 28])
            figure = image_grid(val_images)
            tf.summary.image('val-images:', plot_to_image(figure), step=step)
```
- Once data is being written, the TensorBoard dashboard shows the logged training loss, test accuracy, and validation images.
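As a minimal sketch (not part of the original post) of just the three steps listed earlier, the snippet below picks a log directory, creates a summary writer, and writes a scalar to it on every step. The directory name, tag string, and the stand-in loss value are illustrative.

```python
import datetime
import tensorflow as tf

# 1. Choose a log directory (one subdirectory per run).
log_dir = 'logs/' + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")

# 2. Create a summary writer bound to that directory.
writer = tf.summary.create_file_writer(log_dir)

# 3. Write data under the writer during training.
for step in range(100):
    fake_loss = 1.0 / (step + 1)   # stand-in for a real training loss
    with writer.as_default():
        tf.summary.scalar('train-loss', fake_loss, step=step)

# Then run `tensorboard --logdir logs` and open the printed URL in a browser.
```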
Additional Notes
- This post gave a brief overview of the Dense layer in TensorFlow2's layers module, backpropagation, and training visualization.
- The post is also published on my personal blog; feel free to browse the other articles there.
- If you spot any mistakes, corrections are welcome.