ResNet Explained, with a Keras Implementation

Published: 2025/3/15

- Overview of ResNet
  - The Pascal VOC dataset
    - First-level directory
    - Second-level directory
    - Third-level directory
  - The degradation problem
  - Residual Learning
  - Identity vs Projection Shortcuts
  - Bottleneck architecture
  - ResNet architecture table
  - Results from the ResNet paper
    - Strategy used to build the ResNet
    - Overall code flow
  - Experimental results
  - Analysis of the results
  - References

This post explains the classic ResNet architecture in detail and implements it in code. If you find shortcomings or have a different view, please leave a comment below.

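Overview of ResNet

Kaiming He et al. proposed ResNet to address the degradation problem that appears when training very deep networks. By using residual learning, ResNet made it possible to train networks with far more layers than before.

Because ResNet uses shortcut (identity mapping) connections, the stacked layers no longer have to approximate the unknown target function H(x) directly. They approximate the residual F(x) = H(x) - x instead, and the block outputs F(x) + x. The authors argue that the two formulations can express the same functions but are not equally easy to optimize, and they hypothesize that F(x) is much easier to optimize than H(x). The idea is inspired by residual vector encoding in image processing: reformulating a problem as residual problems across multiple scales makes optimization considerably easier.

For deeper networks (50 layers or more), ResNet introduces the bottleneck block, which reduces the computational cost.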

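ResNet has two kinds of shortcut: the identity shortcut and the projection shortcut. An identity shortcut adds the input unchanged, zero-padding the extra channels when the dimension increases, while a projection shortcut has the form

$y = F(x, \{W_i\}) + W_s x$

so that the shortcut matches the change in dimensions.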

ResNet generalizes well across image tasks. In 2015 it took first place in ImageNet classification, ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

This post explains the ResNet structure in detail and implements the network in code. It also draws on a second paper (listed in the references at the end) to understand ResNet more deeply. The network is trained, validated, and tested on the VOC2012 dataset. For fast development, Keras is used as the framework.

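The Pascal VOC dataset

Pascal VOC provides a standardized, high-quality set of datasets for image classification, detection, and segmentation, and an image recognition challenge was held on it every year. The VOC2012 training set (which includes the validation set) is available for download from the Pascal VOC site.

VOC2012 contains images of 20 object classes, about 17,000 images in total. I split the dataset into three parts, a training set, a validation set, and a test set, in the ratio 8:1:1.
Some screenshots of the directory layout:

First-level directory

Second-level directory

Third-level directory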
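The 8:1:1 split described above can be sketched as follows. This is only an illustration, not the script actually used for this post: the function name `split_8_1_1`, the fixed seed, and the file names are hypothetical, and in practice the split would be done per class so that each class directory keeps the same ratio.

```python
import random


def split_8_1_1(items, seed=0):
    """Shuffle items and split them into train/validation/test at 8:1:1."""
    items = list(items)
    random.Random(seed).shuffle(items)   # deterministic shuffle for reproducibility
    n_train = len(items) * 8 // 10       # 80% for training
    n_valid = len(items) // 10           # 10% for validation
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]     # the remainder (about 10%) for testing
    return train, valid, test


# Hypothetical file names standing in for the images of one class.
files = ["img_%03d.jpg" % i for i in range(100)]
train, valid, test = split_8_1_1(files)
print(len(train), len(valid), len(test))  # 80 10 10
```

Each of the three lists would then be copied into its own root directory, with one subdirectory per class, which is the layout `flow_from_directory` expects.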

Next, we load this dataset with Keras:

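    from keras.preprocessing.image import ImageDataGenerator

    IM_WIDTH = 224    # image width
    IM_HEIGHT = 224   # image height
    batch_size = 32   # batch size
    # train_root, vaildation_root and test_root are defined in the full script further down.

    # train data (with augmentation)
    train_datagen = ImageDataGenerator(
        width_shift_range=0.1,
        height_shift_range=0.1,
        shear_range=0.1,
        zoom_range=0.1,
        horizontal_flip=True,
        rescale=1./255
    )
    train_generator = train_datagen.flow_from_directory(
        train_root,
        target_size=(IM_HEIGHT, IM_WIDTH),  # Keras expects (height, width)
        batch_size=batch_size,
        shuffle=True
    )

    # validation data
    vaild_datagen = ImageDataGenerator(
        width_shift_range=0.1,
        height_shift_range=0.1,
        shear_range=0.1,
        zoom_range=0.1,
        horizontal_flip=True,
        rescale=1./255
    )
    # Fixed: the original called train_datagen here instead of vaild_datagen.
    vaild_generator = vaild_datagen.flow_from_directory(
        vaildation_root,
        target_size=(IM_HEIGHT, IM_WIDTH),
        batch_size=batch_size,
    )

    # test data (rescaling only, no augmentation)
    test_datagen = ImageDataGenerator(rescale=1./255)
    # Fixed: the original called train_datagen here, which augmented the test images.
    test_generator = test_datagen.flow_from_directory(
        test_root,
        target_size=(IM_HEIGHT, IM_WIDTH),
        batch_size=batch_size,
    )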

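I use three ImageDataGenerator instances, one each for the training, validation, and test sets. ImageDataGenerator has to be imported first: `from keras.preprocessing.image import ImageDataGenerator`. It performs data augmentation, which makes the model more robust, and offers many transformations, including rotation, flipping, and shifting. Its flow_from_directory method reads images from a directory in batches, so the whole dataset never has to be loaded at once (which avoids running out of memory).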
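To make two of these transformations concrete, here is a minimal pure-Python sketch of a horizontal flip and of the `rescale=1./255` step on a tiny grayscale image. It is not how Keras implements them; the helper names `horizontal_flip` and `rescale` and the 2x3 image are made up for illustration.

```python
def horizontal_flip(image):
    """Mirror an image stored as a list of rows (each row a list of pixel values)."""
    return [row[::-1] for row in image]


def rescale(image, divisor=255.0):
    """Divide every pixel by 255, mapping the range 0..255 to 0..1."""
    return [[pixel / divisor for pixel in row] for row in image]


# A hypothetical 2x3 grayscale image.
image = [[0, 128, 255],
         [255, 128, 0]]

print(horizontal_flip(image))  # [[255, 128, 0], [0, 128, 255]]
print(rescale(image)[0][2])    # 1.0
```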

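The degradation problem

Intuitively, a deeper network should have more learning capacity. Degradation refers to the opposite observation: as the number of layers grows, accuracy first improves and saturates, then drops rapidly. Figure 1 of the ResNet paper illustrates this with the training and test error of a 20-layer and a 56-layer plain network.

In that figure, the 56-layer network has higher training error and higher test error than the 20-layer network. Because the training error is also higher, the cause is not overfitting.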

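Residual Learning

To address the degradation problem, the paper proposes residual learning, implemented by building residual blocks. As shown in Figure 2 of the paper, once the residual structure is introduced, the stacked layers no longer approximate the unknown target function H(x) directly but the residual F(x) = H(x) - x, and the shortcut adds x back. The authors argue that the two formulations can express the same functions but differ in how hard they are to optimize, and they hypothesize that F(x) is much easier to optimize than H(x). The idea is inspired by residual vector encoding in image processing, where reformulating a problem as residual problems across multiple scales makes optimization considerably easier.

The identity shortcut in that figure adds the input x element-wise to F(x), the output of two stacked layers, and the sum is the output of the block. It adds no parameters and only a negligible amount of computation, and common libraries (e.g., Caffe, TensorFlow) can perform it easily.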
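As a minimal sketch of this idea, the block below computes y = F(x) + x on plain Python lists, with F(x) = W2 · relu(W1 · x) standing in for the two stacked layers. The helper names and the fully connected form are assumptions for illustration (the paper uses convolutions and also applies a ReLU after the addition, which is omitted here). With all weights at zero the block is exactly the identity, which is one way to see why a residual block can easily fall back to an identity mapping.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]


def linear(W, v):
    """Matrix-vector product, with W given as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def residual_block(x, W1, W2):
    """y = F(x) + x, where F(x) = W2 * relu(W1 * x)."""
    fx = linear(W2, relu(linear(W1, x)))
    return [f + a for f, a in zip(fx, x)]   # element-wise shortcut addition


x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

# With F(x) = 0 the block reproduces its input.
print(residual_block(x, zeros, zeros))  # [1.0, -2.0, 3.0]
```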

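The paper compares a plain network (that is, one without identity shortcuts) with a network that has shortcuts (Figure 3 of the paper):

The leftmost network is the classic VGG-19, the middle one is a 34-layer plain network in the style of VGG-19, and the rightmost one is the 34-layer ResNet with shortcuts. Solid black lines are identity shortcuts between layers of the same dimension (the same number of filters). Dotted lines are shortcuts between layers of different dimensions (a different number of filters).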

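Identity vs Projection Shortcuts

Besides the simplest identity shortcuts (direct element-wise addition at equal dimensions), the paper also studies projection shortcuts ($y = F(x, \{W_i\}) + W_s x$). It compares three options:

i. Connect directly where the dimension does not change, and zero-pad the shortcut where the dimension increases. Because the shortcuts are identities, they add no parameters.

ii. Connect directly where the dimension does not change, and use a projection where the dimension increases. Projections add parameters.

iii. Use projections for all shortcuts.

The authors evaluated all three options and found that iii is marginally better than ii, and ii is slightly better than i. They attribute the small gain of iii to the extra parameters introduced by $W_s$, and conclude that projection shortcuts are not essential for solving the degradation problem.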
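The difference between the options can be sketched on a single channel vector (one pixel position). This is only an illustration under simplifying assumptions: the function names are made up, spatial striding is ignored, and the projection is written as a plain matrix, which is what a 1x1 convolution applies at each position.

```python
def zero_pad_shortcut(x, out_channels):
    """Option i: append zero channels until x has out_channels entries (no parameters)."""
    return x + [0.0] * (out_channels - len(x))


def projection_shortcut(x, Ws):
    """Options ii and iii: y = Ws x, a learned linear map over the channels."""
    return [sum(w * a for w, a in zip(row, x)) for row in Ws]


def projection_params(in_channels, out_channels):
    """Number of weights in a 1x1 convolution without bias."""
    return in_channels * out_channels


x = [1.0, 2.0]                       # 2 input channels
Ws = [[1.0, 0.0],                    # hypothetical 4x2 projection matrix
      [0.0, 1.0],
      [1.0, 1.0],
      [0.0, 0.0]]

print(zero_pad_shortcut(x, 4))       # [1.0, 2.0, 0.0, 0.0]
print(projection_shortcut(x, Ws))    # [1.0, 2.0, 3.0, 0.0]
print(projection_params(64, 128))    # 8192
```

The last line shows the cost of one projection when going from 64 to 128 channels: 8192 extra weights, against zero for the zero-padded identity.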

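Bottleneck architecture

For deeper networks (50 layers or more), the authors use the bottleneck block, shown on the right of Figure 5 in the paper. I suspect they were inspired by the Inception module in GoogLeNet. In the bottleneck, the first 1x1 convolution reduces the number of channels (which lowers the computational cost), and the last 1x1 convolution increases it again so that the output has the same dimension as the input (and the identity shortcut can be applied).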
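A quick weight count shows how much the bottleneck saves. The sketch below compares the 1x1, 3x3, 1x1 bottleneck with widths 64, 64, 256 on a 256-channel input against a basic block of two 3x3 convolutions kept at 256 channels. The second block is a comparison chosen here for illustration, and biases and BatchNormalization parameters are ignored.

```python
def conv_weights(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution, without bias."""
    return kernel * kernel * c_in * c_out


# Bottleneck: 1x1 reduce (256 -> 64), 3x3 (64 -> 64), 1x1 restore (64 -> 256).
bottleneck = (conv_weights(1, 256, 64)
              + conv_weights(3, 64, 64)
              + conv_weights(1, 64, 256))

# Two 3x3 convolutions that stay at 256 channels.
basic = 2 * conv_weights(3, 256, 256)

print(bottleneck)                    # 69632
print(basic)                         # 1179648
print(round(basic / bottleneck, 1))  # 16.9
```

At the same input and output width, the bottleneck needs roughly 17 times fewer weights, which is why it is the block of choice once the network becomes very deep.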

ResNet architecture table

Table 1

Table 1 of the paper shows how each ResNet variant is put together. It also lists the number of floating-point operations (FLOPs) of each network: although the ResNets are deep, all of them need fewer operations than VGG-19 (19.6×10^9 FLOPs).
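The layer counts in the names follow directly from the table. As a small check, assuming the stage layout 3, 4, 6, 3 blocks for conv2_x to conv5_x (the layout the resnet_34 and resnet_50 functions in this post use), the depth is one initial convolution, plus the convolutions inside the blocks, plus the final fully connected layer:

```python
# Blocks per stage (conv2_x, conv3_x, conv4_x, conv5_x).
BLOCKS = [3, 4, 6, 3]


def depth(blocks, convs_per_block):
    """Count weighted layers: first 7x7 conv + convs in all blocks + final dense layer."""
    return 1 + convs_per_block * sum(blocks) + 1


print(depth(BLOCKS, 2))  # 34  (basic block: two 3x3 convolutions)
print(depth(BLOCKS, 3))  # 50  (bottleneck block: 1x1, 3x3, 1x1)
```

Projection shortcuts and pooling layers are not counted, which matches the naming convention of ResNet-34 and ResNet-50.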

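Results from the ResNet paper

In Figure 4 of the paper, the left panel shows plain networks and the right panel shows residual networks. Thin curves are training error and bold curves are validation error. The plain networks show degradation: the 34-layer network has higher training and validation error than the 18-layer network. The residual networks do not show this behavior, so residual learning resolves the degradation problem.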

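To build the ResNet, we used the following strategy:

- Use the identity_Block function to build ResNet34 and the bottleneck_Block function to build ResNet50.
- Apply BatchNormalization after every convolution layer. It normalizes the activations of each channel to roughly zero mean and unit variance, which speeds up convergence and has a mild regularizing effect.
- Follow Table 1 when assembling the network, reading the parameters off the table while building.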
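For readers unfamiliar with the BatchNormalization step mentioned in the list above, the core computation can be sketched on a single channel as follows. This is a simplified illustration, not the Keras layer: the learned scale and shift parameters and the moving averages used at inference time are left out, and the function name `batch_norm` is made up.

```python
def batch_norm(values, eps=1e-5):
    """Normalize a batch of activations of one channel to zero mean, unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


batch = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(batch)
print([round(v, 3) for v in normalized])  # [-1.342, -0.447, 0.447, 1.342]
```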

The overall flow of the code is:

1. Import the required libraries
2. Set the model parameters and other configuration
3. Create the three generators for the training, validation, and test sets
4. Write the identity_Block function
5. Write the bottleneck_Block function
6. Build the network according to the ResNet architecture table
7. Train and validate the model
8. Save the model
9. Evaluate the model on the test set

    # coding=utf-8
    import os

    from keras.models import Model, load_model
    from keras.layers import (Input, Dense, BatchNormalization, Conv2D, MaxPooling2D,
                              AveragePooling2D, ZeroPadding2D, Flatten, add)
    from keras.utils import plot_model
    from keras.metrics import top_k_categorical_accuracy
    from keras.preprocessing.image import ImageDataGenerator

    # Global constants
    NB_CLASS = 20
    IM_WIDTH = 224
    IM_HEIGHT = 224
    train_root = '/home/faith/keras/dataset/traindata/'
    vaildation_root = '/home/faith/keras/dataset/vaildationdata/'
    test_root = '/home/faith/keras/dataset/testdata/'
    batch_size = 32
    EPOCH = 60

    # train data (with augmentation)
    train_datagen = ImageDataGenerator(
        width_shift_range=0.1,
        height_shift_range=0.1,
        shear_range=0.1,
        zoom_range=0.1,
        horizontal_flip=True,
        rescale=1./255
    )
    train_generator = train_datagen.flow_from_directory(
        train_root,
        target_size=(IM_HEIGHT, IM_WIDTH),  # Keras expects (height, width)
        batch_size=batch_size,
        shuffle=True
    )

    # validation data
    vaild_datagen = ImageDataGenerator(
        width_shift_range=0.1,
        height_shift_range=0.1,
        shear_range=0.1,
        zoom_range=0.1,
        horizontal_flip=True,
        rescale=1./255
    )
    # Fixed: the original called train_datagen here instead of vaild_datagen.
    vaild_generator = vaild_datagen.flow_from_directory(
        vaildation_root,
        target_size=(IM_HEIGHT, IM_WIDTH),
        batch_size=batch_size,
    )

    # test data (rescaling only, no augmentation)
    test_datagen = ImageDataGenerator(rescale=1./255)
    # Fixed: the original called train_datagen here, which augmented the test images.
    test_generator = test_datagen.flow_from_directory(
        test_root,
        target_size=(IM_HEIGHT, IM_WIDTH),
        batch_size=batch_size,
    )


    def Conv2d_BN(x, nb_filter, kernel_size, strides=(1, 1), padding='same', name=None):
        """Convolution (with ReLU) followed by BatchNormalization.

        Note: this applies conv -> ReLU -> BN, whereas the paper uses conv -> BN -> ReLU.
        """
        if name is not None:
            bn_name = name + '_bn'
            conv_name = name + '_conv'
        else:
            bn_name = None
            conv_name = None

        x = Conv2D(nb_filter, kernel_size, padding=padding, strides=strides,
                   activation='relu', name=conv_name)(x)
        x = BatchNormalization(axis=3, name=bn_name)(x)
        return x


    def identity_Block(inpt, nb_filter, kernel_size, strides=(1, 1), with_conv_shortcut=False):
        """Basic residual block: two convolutions plus a shortcut."""
        x = Conv2d_BN(inpt, nb_filter=nb_filter, kernel_size=kernel_size, strides=strides, padding='same')
        x = Conv2d_BN(x, nb_filter=nb_filter, kernel_size=kernel_size, padding='same')
        if with_conv_shortcut:
            # Projection shortcut, used when the shape changes.
            shortcut = Conv2d_BN(inpt, nb_filter=nb_filter, strides=strides, kernel_size=kernel_size)
            x = add([x, shortcut])
            return x
        else:
            # Identity shortcut.
            x = add([x, inpt])
            return x


    def bottleneck_Block(inpt, nb_filters, strides=(1, 1), with_conv_shortcut=False):
        """Bottleneck residual block: 1x1 reduce, 3x3, 1x1 restore, plus a shortcut."""
        k1, k2, k3 = nb_filters
        x = Conv2d_BN(inpt, nb_filter=k1, kernel_size=1, strides=strides, padding='same')
        x = Conv2d_BN(x, nb_filter=k2, kernel_size=3, padding='same')
        x = Conv2d_BN(x, nb_filter=k3, kernel_size=1, padding='same')
        if with_conv_shortcut:
            shortcut = Conv2d_BN(inpt, nb_filter=k3, strides=strides, kernel_size=1)
            x = add([x, shortcut])
            return x
        else:
            x = add([x, inpt])
            return x


    def resnet_34(width, height, channel, classes):
        inpt = Input(shape=(width, height, channel))
        x = ZeroPadding2D((3, 3))(inpt)

        # conv1
        x = Conv2d_BN(x, nb_filter=64, kernel_size=(7, 7), strides=(2, 2), padding='valid')
        x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)

        # conv2_x
        x = identity_Block(x, nb_filter=64, kernel_size=(3, 3))
        x = identity_Block(x, nb_filter=64, kernel_size=(3, 3))
        x = identity_Block(x, nb_filter=64, kernel_size=(3, 3))

        # conv3_x
        x = identity_Block(x, nb_filter=128, kernel_size=(3, 3), strides=(2, 2), with_conv_shortcut=True)
        x = identity_Block(x, nb_filter=128, kernel_size=(3, 3))
        x = identity_Block(x, nb_filter=128, kernel_size=(3, 3))
        x = identity_Block(x, nb_filter=128, kernel_size=(3, 3))

        # conv4_x
        x = identity_Block(x, nb_filter=256, kernel_size=(3, 3), strides=(2, 2), with_conv_shortcut=True)
        x = identity_Block(x, nb_filter=256, kernel_size=(3, 3))
        x = identity_Block(x, nb_filter=256, kernel_size=(3, 3))
        x = identity_Block(x, nb_filter=256, kernel_size=(3, 3))
        x = identity_Block(x, nb_filter=256, kernel_size=(3, 3))
        x = identity_Block(x, nb_filter=256, kernel_size=(3, 3))

        # conv5_x
        x = identity_Block(x, nb_filter=512, kernel_size=(3, 3), strides=(2, 2), with_conv_shortcut=True)
        x = identity_Block(x, nb_filter=512, kernel_size=(3, 3))
        x = identity_Block(x, nb_filter=512, kernel_size=(3, 3))

        x = AveragePooling2D(pool_size=(7, 7))(x)
        x = Flatten()(x)
        x = Dense(classes, activation='softmax')(x)

        model = Model(inputs=inpt, outputs=x)
        return model


    def resnet_50(width, height, channel, classes):
        inpt = Input(shape=(width, height, channel))
        x = ZeroPadding2D((3, 3))(inpt)

        # conv1
        x = Conv2d_BN(x, nb_filter=64, kernel_size=(7, 7), strides=(2, 2), padding='valid')
        x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)

        # conv2_x
        x = bottleneck_Block(x, nb_filters=[64, 64, 256], strides=(1, 1), with_conv_shortcut=True)
        x = bottleneck_Block(x, nb_filters=[64, 64, 256])
        x = bottleneck_Block(x, nb_filters=[64, 64, 256])

        # conv3_x
        x = bottleneck_Block(x, nb_filters=[128, 128, 512], strides=(2, 2), with_conv_shortcut=True)
        x = bottleneck_Block(x, nb_filters=[128, 128, 512])
        x = bottleneck_Block(x, nb_filters=[128, 128, 512])
        x = bottleneck_Block(x, nb_filters=[128, 128, 512])

        # conv4_x
        x = bottleneck_Block(x, nb_filters=[256, 256, 1024], strides=(2, 2), with_conv_shortcut=True)
        x = bottleneck_Block(x, nb_filters=[256, 256, 1024])
        x = bottleneck_Block(x, nb_filters=[256, 256, 1024])
        x = bottleneck_Block(x, nb_filters=[256, 256, 1024])
        x = bottleneck_Block(x, nb_filters=[256, 256, 1024])
        x = bottleneck_Block(x, nb_filters=[256, 256, 1024])

        # conv5_x
        x = bottleneck_Block(x, nb_filters=[512, 512, 2048], strides=(2, 2), with_conv_shortcut=True)
        x = bottleneck_Block(x, nb_filters=[512, 512, 2048])
        x = bottleneck_Block(x, nb_filters=[512, 512, 2048])

        x = AveragePooling2D(pool_size=(7, 7))(x)
        x = Flatten()(x)
        x = Dense(classes, activation='softmax')(x)

        model = Model(inputs=inpt, outputs=x)
        return model


    def acc_top2(y_true, y_pred):
        return top_k_categorical_accuracy(y_true, y_pred, k=2)


    def check_print():
        # Create a Keras model
        model = resnet_50(IM_WIDTH, IM_HEIGHT, 3, NB_CLASS)
        model.summary()
        # Save a PNG of the model graph
        plot_model(model, to_file='resnet.png')
        # top_k_categorical_accuracy defaults to k=5, i.e. top-5 accuracy
        model.compile(optimizer='adam', loss='categorical_crossentropy',
                      metrics=['acc', top_k_categorical_accuracy])
        print('Model Compiled')
        return model


    if __name__ == '__main__':
        # Resume from a saved model if one exists, otherwise build a new one.
        if os.path.exists('resnet_50.h5'):
            model = load_model('resnet_50.h5')
        else:
            model = check_print()

        # fit_generator / evaluate_generator are the generator APIs of older Keras 2
        # releases; step counts must be integers, hence the floor division.
        model.fit_generator(train_generator,
                            validation_data=vaild_generator,
                            epochs=EPOCH,
                            steps_per_epoch=train_generator.n // batch_size,
                            validation_steps=vaild_generator.n // batch_size)
        model.save('resnet_50.h5')
        loss, acc, top_acc = model.evaluate_generator(test_generator,
                                                      steps=test_generator.n // batch_size)
        print('Test result:loss:%f,acc:%f,top_acc:%f' % (loss, acc, top_acc))

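Experimental results

| Data | Loss | Acc | Top5-acc |
|---|---|---|---|
| Training set | 1.85 | 39.9% | 85.3% |
| Validation set | 2.01 | 36.6% | 82.0% |
| Testing set | 2.08 | 35.7% | 78.1% |

| Dataset | VOC2012 | Classes | 20 |
|---|---|---|---|
| Model | ResNet | Framework | Keras |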
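The Top5-acc column counts a prediction as correct when the true class is among the five highest-scoring classes. The idea can be sketched in plain Python as below. This is an illustration of the metric, not the Keras implementation; the function name `top_k_accuracy` and the toy scores are made up, and k is set to 1 and 2 only because the toy example has three classes.

```python
def top_k_accuracy(y_true, y_scores, k=5):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for label, scores in zip(y_true, y_scores):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(y_true)


# Two samples, three classes; each row holds the predicted class scores.
y_true = [2, 0]
y_scores = [[0.1, 0.7, 0.2],
            [0.5, 0.3, 0.2]]

print(top_k_accuracy(y_true, y_scores, k=1))  # 0.5
print(top_k_accuracy(y_true, y_scores, k=2))  # 1.0
```

With 20 classes, top-5 accuracy is a much more forgiving metric than plain accuracy, which is worth keeping in mind when reading the two columns side by side.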

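Analysis of the results

The model performs somewhat worse on the test set than on the training set, so it overfits slightly. To limit overfitting and to speed up convergence, a BatchNormalization layer follows every convolution layer. The model was trained for only 60 epochs of roughly 500 iterations each. That is too little training, so the results do not show what the network is capable of.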
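The iteration count can be estimated from the figures given in this post. The sketch below assumes exactly 17,000 images (the post only says "about 17,000"), the 8:1:1 split, and the batch size of 32 from the script, so the result is an approximation.

```python
total_images = 17000     # "about 17,000 images", as stated in this post
batch_size = 32          # batch size used in the training script
epochs = 60              # EPOCH in the training script

train_images = total_images * 8 // 10            # 80% of the data is used for training
steps_per_epoch = train_images // batch_size     # whole batches per epoch

print(train_images, steps_per_epoch)  # 13600 425
print(steps_per_epoch * epochs)       # 25500
```

That gives roughly 425 iterations per epoch, the same order of magnitude as the "roughly 500" quoted above, and about 25,000 weight updates in total.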

References

These are the references for this post, and I would like to thank the authors of each of them. Readers who still have questions about ResNet, or who want a deeper understanding, can consult them.

Blog reference 1

Blog reference 2

Paper 1: Deep Residual Learning for Image Recognition

Paper 2: Residual Networks are Exponential Ensembles of Relatively Shallow Networks
