
Using a Pretrained Convolutional Neural Network (Cat vs. Dog Image Classification)


In this post we use a network pretrained on ImageNet to classify a new dataset: photos of cats and dogs. We'll use the VGG16 model, which comes prepackaged with Keras, so we can simply import it.

from keras.applications import VGG16

conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(150, 150, 3))

A note on these three arguments:

  • weights: the weight checkpoint from which to initialize the model.
  • include_top: whether to include the densely connected classifier on top of the network. By default, this classifier corresponds to ImageNet's 1,000 classes. Because we intend to use our own classifier (with just two classes: cat and dog), we don't include it.
  • input_shape: the shape of the image tensors fed into the network (an optional argument); if you omit it, the network can process inputs of any size (see the sketch below).
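As a side note, here is a minimal sketch (my addition, not from the original post) of what omitting input_shape looks like; the convolutional base then accepts inputs of arbitrary spatial size:

# Sketch: without input_shape, the base handles variable-sized inputs.
from keras.applications import VGG16

flexible_base = VGG16(weights='imagenet', include_top=False)
flexible_base.summary()  # output shapes are reported as (None, None, None, ...)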

Here is the detailed architecture of the VGG16 convolutional base:

conv_base.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 150, 150, 3)       0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 37, 37, 256)       295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 37, 37, 256)       590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 37, 37, 256)       590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 18, 18, 256)       0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 18, 18, 512)       1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 18, 18, 512)       2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 18, 18, 512)       2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 9, 9, 512)         0
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 9, 9, 512)         2359808
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 9, 9, 512)         2359808
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 9, 9, 512)         2359808
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 4, 4, 512)         0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0

The final feature map has shape (4, 4, 512). This is the feature on top of which we will stick a densely connected classifier.
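As a quick sanity check (my addition, consistent with the summary above): each of the five MaxPooling2D stages halves the spatial dimensions with floor division, so the side length goes 150 → 75 → 37 → 18 → 9 → 4, and block5 outputs 512 channels, which is exactly where the (4, 4, 512) shape comes from.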


Fast feature extraction without data augmentation (computationally cheap)

First, we run instances of ImageDataGenerator to extract images and their labels as Numpy arrays, then call the conv_base model's predict method to extract features from these images.

import os
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

base_dir = '/Users/fchollet/Downloads/cats_and_dogs_small'

train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20

def extract_features(directory, sample_count):
    features = np.zeros(shape=(sample_count, 4, 4, 512))
    labels = np.zeros(shape=(sample_count))
    generator = datagen.flow_from_directory(
        directory,
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode='binary')
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        features[i * batch_size : (i + 1) * batch_size] = features_batch
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            # Generators yield data endlessly in a loop, so we must
            # break once every image has been seen.
            break
    return features, labels

train_features, train_labels = extract_features(train_dir, 2000)
validation_features, validation_labels = extract_features(validation_dir, 1000)
test_features, test_labels = extract_features(test_dir, 1000)

The extracted features currently have shape (samples, 4, 4, 512). We want to feed them to a densely connected classifier, so we must first flatten them to (samples, 8192):

train_features = np.reshape(train_features, (2000, 4 * 4 * 512))
validation_features = np.reshape(validation_features, (1000, 4 * 4 * 512))
test_features = np.reshape(test_features, (1000, 4 * 4 * 512))

Now we can define a densely connected classifier and train it on the features and labels we just saved:

from keras import models
from keras import layers
from keras import optimizers

model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_dim=4 * 4 * 512))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(optimizer=optimizers.RMSprop(lr=2e-5),
              loss='binary_crossentropy',
              metrics=['acc'])

history = model.fit(train_features, train_labels,
                    epochs=30,
                    batch_size=20,
                    validation_data=(validation_features, validation_labels))

Training is very fast, because we only have to deal with two Dense layers. Let's look at the loss and accuracy curves during training:

import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()



The plots show that we reach a validation accuracy of about 90%, much better than the small model we trained from scratch earlier. But the plots also show that, despite a fairly high dropout rate, the model overfits almost from the start. That's because this technique doesn't use data augmentation, which is essential for preventing overfitting on small image datasets.


Feature extraction with data augmentation (computationally expensive)

This technique is slower and more expensive, but it allows us to use data augmentation during training. It works by extending the conv_base model and running the whole thing end to end on the input data. (It is so computationally expensive that you should only attempt it with access to a GPU.)

Building on the conv_base we defined earlier:

from keras import models
from keras import layers

model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
vgg16 (Model)                (None, 4, 4, 512)         14714688
_________________________________________________________________
flatten_1 (Flatten)          (None, 8192)              0
_________________________________________________________________
dense_3 (Dense)              (None, 256)               2097408
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 257
=================================================================
Total params: 16,812,353
Trainable params: 16,812,353
Non-trainable params: 0

As you can see, the VGG16 convolutional base has 14,714,688 parameters, and the classifier we added on top contributes about 2 million more, which is a lot.
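To see where those roughly 2 million classifier parameters come from (the numbers match the summary above): the Dense(256) layer acts on the flattened 4 * 4 * 512 = 8192-dimensional feature vector, giving 8192 * 256 weights + 256 biases = 2,097,408 parameters, and the final Dense(1) layer adds 256 weights + 1 bias = 257, for a classifier total of 2,097,665.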

Before compiling and training the model, it is very important to freeze the convolutional base. Freezing one or more layers means keeping their weights fixed during training. If we don't do this, the representations previously learned by the convolutional base will be modified during training. Because the Dense layers on top are randomly initialized, very large weight updates would propagate through the network, severely damaging the representations it has already learned.

In Keras, you freeze a network by setting its trainable attribute to False:

print('This is the number of trainable weights '
      'before freezing the conv base:', len(model.trainable_weights))

This is the number of trainable weights before freezing the conv base: 30

conv_base.trainable = False
print('This is the number of trainable weights '
      'after freezing the conv base:', len(model.trainable_weights))

This is the number of trainable weights after freezing the conv base: 4

With this setup, only the weights of the two Dense layers we added will be trained: four weight tensors in total, two per layer (the main weight matrix and the bias vector). Note that for changes to the trainable attribute to take effect, you must modify the attribute first and then compile the model.
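A minimal sketch of that ordering (my illustration; the full training script below performs the same compile in context):

conv_base.trainable = False                   # 1. set the trainable flags first...
model.compile(loss='binary_crossentropy',     # 2. ...then compile: compilation
              optimizer=optimizers.RMSprop(lr=2e-5),  # captures the current
              metrics=['acc'])                        # trainable state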

Now we can train the model, using data augmentation:

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    # This is the target directory
    train_dir,
    # All images will be resized to 150x150
    target_size=(150, 150),
    batch_size=20,
    # Since we use binary_crossentropy loss, we need binary labels
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['acc'])

history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50,
    verbose=2)

model.save('cats_and_dogs_small_3.h5')

Let's look at the validation accuracy again:

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()



Validation accuracy now reaches nearly 96%, and overfitting is much reduced.


Fine-tuning the model

Next we use fine-tuning to push the model's performance further. The steps for fine-tuning a network are as follows:

  • (1) Add your custom network on top of an already-trained base network.
  • (2) Freeze the base network.
  • (3) Train the part you added.
  • (4) Unfreeze some layers in the base network.
  • (5) Jointly train both these layers and the part you added.

We already completed the first three steps when doing feature extraction. Let's proceed with step 4: we'll unfreeze conv_base and then freeze individual layers inside it.

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 150, 150, 3)       0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 37, 37, 256)       295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 37, 37, 256)       590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 37, 37, 256)       590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 18, 18, 256)       0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 18, 18, 512)       1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 18, 18, 512)       2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 18, 18, 512)       2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 9, 9, 512)         0
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 9, 9, 512)         2359808
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 9, 9, 512)         2359808
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 9, 9, 512)         2359808
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 4, 4, 512)         0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0

Looking back at these layers: we will fine-tune the last three convolutional layers, which means that all layers up to block4_pool should be frozen, and the three layers after it (block5_conv1, block5_conv2, and block5_conv3) should be trainable.

Keep in mind that the more parameters you train, the greater the risk of overfitting. The convolutional base has about 15 million parameters, so it would be risky to train that many on our small dataset. The best strategy in this situation is therefore to fine-tune only the top two or three layers of the convolutional base.

conv_base.trainable = True

set_trainable = False
for layer in conv_base.layers:
    if layer.name == 'block5_conv1':
        set_trainable = True
    if set_trainable:
        layer.trainable = True
    else:
        layer.trainable = False
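To verify which layers ended up trainable, a quick check (my addition) is to print each layer's flag:

for layer in conv_base.layers:
    print(layer.name, layer.trainable)
# Expected: False for everything up to block4_pool,
# True for block5_conv1 through block5_pool.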

Now we can begin fine-tuning the network, using the RMSProp optimizer with a very low learning rate. The reason for the low learning rate is that we want to limit the magnitude of the changes made to the representations of the three layers we're fine-tuning; updates that are too large may harm these representations.

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-5),
              metrics=['acc'])

history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=50)

model.save('cats_and_dogs_small_4.h5')

Let's plot the curves and see how we did:

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()



These curves look noisy. To make them more readable, we can smooth them by replacing each loss and accuracy value with an exponential moving average of the previous points (each point becomes previous * factor + point * (1 - factor)). Here's a trivial utility function to do this:

def smooth_curve(points, factor=0.8):
    smoothed_points = []
    for point in points:
        if smoothed_points:
            previous = smoothed_points[-1]
            smoothed_points.append(previous * factor + point * (1 - factor))
        else:
            smoothed_points.append(point)
    return smoothed_points

plt.plot(epochs, smooth_curve(acc), 'bo', label='Smoothed training acc')
plt.plot(epochs, smooth_curve(val_acc), 'b', label='Smoothed validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, smooth_curve(loss), 'bo', label='Smoothed training loss')
plt.plot(epochs, smooth_curve(val_loss), 'b', label='Smoothed validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()


With the exponential moving average, the validation curves are much cleaner. We can see that fine-tuning improves accuracy by about 1%, from roughly 96% to about 97%.

Finally, let's evaluate this model on the test data:

test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

# steps=50 with batch_size=20 covers all 50 * 20 = 1000 test images
test_loss, test_acc = model.evaluate_generator(test_generator, steps=50)
print('test acc:', test_acc)

Found 1000 images belonging to 2 classes.
test acc: 0.967999992371

We get a test accuracy of about 97%. In the original Kaggle competition on this dataset, this would have been one of the top results.
Remarkably, we achieved it using only a small fraction of the training data (about 10%). There is a big difference between training on 20,000 samples and training on 2,000 samples!

