Implementing the AlexNet CNN Architecture Using TensorFlow 2.0+ and Keras
Introduction
The main content of this article presents how the AlexNet Convolutional Neural Network (CNN) architecture is implemented using TensorFlow and Keras.
But first, allow me to provide a brief background behind the AlexNet CNN architecture.
AlexNet was first utilized in a public setting when it won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2012). It was at this contest that AlexNet showed that deep convolutional neural networks could be used to solve image classification.
AlexNet won the ILSVRC 2012 contest by a large margin.
The research paper that detailed the internal components of the CNN architecture also introduced some novel techniques and methods, such as efficient computing-resource utilization, data augmentation, GPU training, and multiple strategies to prevent overfitting within neural networks.
I have written an article that presents key ideas and techniques that AlexNet brought to the world of computer vision and deep learning.
Here are some of the key learning objectives of this article:
Introduction to neural network implementation with Keras and TensorFlow
Data preprocessing with TensorFlow
Training visualization with TensorBoard
Descriptions of standard machine learning terms and terminology
AlexNet Implementation
The AlexNet CNN is probably one of the simplest ways to approach understanding deep learning concepts and techniques.
AlexNet is not a complicated architecture when compared with the state-of-the-art CNN architectures that have emerged in more recent years.
AlexNet is simple enough for beginners and intermediate deep learning practitioners to pick up some good practices on model implementation techniques.
All code presented in this article is written using Jupyter Lab. At the end of this article is a GitHub link to the notebook that includes all code in the implementation section.
So let’s begin.
1. Tools and Libraries
We begin implementation by importing the following libraries:
TensorFlow: An open-source platform for the implementation, training, and deployment of machine learning models.
Keras: An open-source library used for the implementation of neural network architectures that run on both CPUs and GPUs.
Matplotlib: A Python visualization tool used for creating interactive charts and plotting images.
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import os
import time
2. Dataset
The CIFAR-10 dataset contains 60,000 colour images, each with dimensions 32x32px. The content of the images within the dataset is sampled from 10 classes.
[Image: Classes within the CIFAR-10 dataset]
CIFAR-10 images were aggregated by some of the creators of the AlexNet network, Alex Krizhevsky and Geoffrey Hinton.
The deep learning Keras library provides direct access to the CIFAR-10 dataset with relative ease through its datasets module. Accessing common datasets such as CIFAR-10 or MNIST becomes a trivial task with Keras.
(train_images, train_labels), (test_images, test_labels) = keras.datasets.cifar10.load_data()
In order to reference the class names of the images during the visualization stage, a Python list containing the class names is initialized with the variable name CLASS_NAMES.
CLASS_NAMES = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
The CIFAR-10 dataset is partitioned into 50,000 training images and 10,000 test images by default. The last partition of the dataset we require is the validation data.
The validation data is obtained by taking the first 5,000 images within the training data, leaving 45,000 images for training.
validation_images, validation_labels = train_images[:5000], train_labels[:5000]
train_images, train_labels = train_images[5000:], train_labels[5000:]
Training Dataset: The partition of the dataset exposed to the neural network during training; this is the data the network learns from directly.
Validation Dataset: This group of the dataset is utilized during training to assess the performance of the network at various iterations.
Test Dataset: This partition of the dataset evaluates the performance of our network after the completion of the training phase.
TensorFlow provides a suite of functions and operations that enables easy data manipulation and modification through a defined input pipeline.
To be able to access these methods and procedures, we must transform our dataset into an efficient data representation that TensorFlow is familiar with. This is achieved using the tf.data.Dataset API.
More specifically, the tf.data.Dataset.from_tensor_slices method takes the train, test, and validation dataset partitions and returns a corresponding TensorFlow Dataset representation.
train_ds = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
test_ds = tf.data.Dataset.from_tensor_slices((test_images, test_labels))
validation_ds = tf.data.Dataset.from_tensor_slices((validation_images, validation_labels))
3. Preprocessing
Preprocessing in any machine learning project is the transformation of data from one form to another.
Usually, preprocessing is conducted to ensure the data utilized is in an appropriate format.
First, let’s visualize the images within the CIFAR-10 dataset.
The code snippet below uses the Matplotlib library to render the pixel data of five training images as actual images. There is also an indicator of the class each depicted object within the images belongs to.
Excuse the blurriness of the images; the CIFAR-10 images have small dimensions, which makes visualization of the actual pictures a bit difficult.
plt.figure(figsize=(20,20))
for i, (image, label) in enumerate(train_ds.take(5)):
    ax = plt.subplot(5,5,i+1)
    plt.imshow(image)
    plt.title(CLASS_NAMES[label.numpy()[0]])
    plt.axis('off')
The primary preprocessing transformations that will be imposed on the data presented to the network are:
- Normalizing and standardizing the images.
- Resizing the images from 32x32 to 227x227, since the AlexNet network input expects a 227x227 image.
We’ll create a function called process_images.
This function will perform all preprocessing work that we require for the data. This function is called further down the machine learning workflow.
def process_images(image, label):
    # Normalize images to have a mean of 0 and standard deviation of 1
    image = tf.image.per_image_standardization(image)
    # Resize images from 32x32 to 227x227
    image = tf.image.resize(image, (227,227))
    return image, label
4. Data/Input Pipeline
So far, we have obtained and partitioned the dataset and created a function to process the dataset. The next step is to build an input pipeline.
An input/data pipeline is a series of functions or methods that are called consecutively, one after another; each function either acts upon the data or enforces an operation on the data flowing through the pipeline.
Let’s get the size of each dataset partition we created; the sizes of the partitions are required to ensure that the dataset is thoroughly shuffled before it is passed through the network.
train_ds_size = tf.data.experimental.cardinality(train_ds).numpy()
test_ds_size = tf.data.experimental.cardinality(test_ds).numpy()
validation_ds_size = tf.data.experimental.cardinality(validation_ds).numpy()
print("Training data size:", train_ds_size)
print("Test data size:", test_ds_size)
print("Validation data size:", validation_ds_size)
For our basic input/data pipeline, we will conduct three primary operations:
Preprocess the data within the dataset
Shuffle the dataset
Batch the data within the dataset
train_ds = (train_ds
            .map(process_images)
            .shuffle(buffer_size=train_ds_size)
            .batch(batch_size=32, drop_remainder=True))
test_ds = (test_ds
           .map(process_images)
           .shuffle(buffer_size=test_ds_size)
           .batch(batch_size=32, drop_remainder=True))
validation_ds = (validation_ds
                 .map(process_images)
                 .shuffle(buffer_size=validation_ds_size)
                 .batch(batch_size=32, drop_remainder=True))
5. Model Implementation
Within this section, we will implement the AlexNet CNN architecture from scratch.
Through the utilization of the Keras Sequential API, we can implement consecutive neural network layers within our models, stacked one after another.
Here are the types of layers the AlexNet CNN architecture is composed of, along with a brief description:
Convolutional layer: A convolution is a mathematical operation that computes a dot product between two sets of elements. Within deep learning, the convolution operation acts on the filters/kernels and the image data array within the convolutional layer. Therefore, a convolutional layer is simply a layer that houses the convolution operation occurring between the filters and the images passed through a convolutional neural network.
Batch Normalisation layer: Batch normalization is a technique that mitigates the effect of unstable gradients within a neural network through the introduction of an additional layer that performs operations on the inputs from the previous layer. The operations standardize and normalize the input values; the values are then transformed through scaling and shifting operations.
MaxPooling layer: Max pooling is a variant of sub-sampling where the maximum pixel value within the receptive field of a unit in the sub-sampling layer is taken as the output. A max-pooling operation with, for example, a 2x2 window slides across the input data, outputting the maximum of the pixels within the kernel's receptive field (a short sketch follows this list).
Flatten layer: Takes an input shape and flattens the input image data into a one-dimensional array.
Dense layer: A dense layer contains an arbitrary number of embedded units/neurons. Each neuron is a perceptron.
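To make the max-pooling operation concrete, here is a minimal, hypothetical sketch (not part of the original article's code) that applies a 2x2 window with stride 2 to a toy 4x4 input:
import tensorflow as tf

# Toy 4x4 single-channel 'image', reshaped to (batch, height, width, channels)
x = tf.reshape(tf.constant([[1., 3., 2., 4.],
                            [5., 6., 7., 8.],
                            [9., 2., 1., 0.],
                            [3., 4., 5., 6.]]), (1, 4, 4, 1))
# A 2x2 window with stride 2 keeps only the maximum of each window
pooled = tf.nn.max_pool2d(x, ksize=2, strides=2, padding='VALID')
print(tf.squeeze(pooled).numpy())  # [[6. 8.] [9. 6.]]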
Some other operations and techniques utilized within the AlexNet CNN that are worth mentioning are:
Activation Function: A mathematical operation that transforms the result or signals of neurons into a normalized output. The purpose of an activation function as a component of a neural network is to introduce non-linearity within the network. The inclusion of an activation function enables the neural network to have greater representational power and solve complex functions.
Rectified Linear Unit Activation Function (ReLU): A type of activation function that transforms the value results of a neuron. The transformation imposed by ReLU on values from a neuron is represented by the formula y = max(0, x). The ReLU activation function clamps any negative values from the neuron to 0, while positive values remain unchanged. The result of this transformation is utilized as the output of the current layer and used as input to the subsequent layer within the neural network.
Softmax Activation Function: A type of activation function that is utilized to derive the probability distribution of a set of numbers within an input vector. The output of a softmax activation function is a vector in which its set of values represents the probability of an occurrence of a class or event. The values within the vector all add up to 1.
Dropout: The dropout technique works by randomly reducing the number of interconnecting neurons within a neural network. At every training step, each neuron has a chance of being left out, or rather, dropped out of the collated contributions from connected neurons (the sketch below shows this in action).
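The short sketch below (my own illustration, not from the original notebook) shows ReLU, softmax, and dropout acting on small tensors:
import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 1.0, 3.0])

# ReLU: negative values are clamped to 0, positive values pass through unchanged
print(tf.nn.relu(x).numpy())   # [0. 0. 0. 1. 3.]

# Softmax: the values become a probability distribution that sums to 1
print(tf.nn.softmax(x).numpy().sum())   # 1.0

# Dropout: during training, each unit is zeroed with probability `rate`,
# and the surviving units are scaled by 1/(1 - rate)
drop = tf.keras.layers.Dropout(rate=0.5)
print(drop(tf.ones((1, 8)), training=True).numpy())  # roughly half zeros, the rest 2.0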
The code snippet below represents the Keras implementation of the AlexNet CNN architecture.
model = keras.models.Sequential([
    keras.layers.Conv2D(filters=96, kernel_size=(11,11), strides=(4,4), activation='relu', input_shape=(227,227,3)),
    keras.layers.BatchNormalization(),
    keras.layers.MaxPool2D(pool_size=(3,3), strides=(2,2)),
    keras.layers.Conv2D(filters=256, kernel_size=(5,5), strides=(1,1), activation='relu', padding="same"),
    keras.layers.BatchNormalization(),
    keras.layers.MaxPool2D(pool_size=(3,3), strides=(2,2)),
    keras.layers.Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), activation='relu', padding="same"),
    keras.layers.BatchNormalization(),
    # Note: the original AlexNet paper uses 3x3 kernels in the next two convolutional
    # layers; this implementation uses 1x1 kernels, which the parameter counts in the
    # model summary below reflect.
    keras.layers.Conv2D(filters=384, kernel_size=(1,1), strides=(1,1), activation='relu', padding="same"),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(filters=256, kernel_size=(1,1), strides=(1,1), activation='relu', padding="same"),
    keras.layers.BatchNormalization(),
    keras.layers.MaxPool2D(pool_size=(3,3), strides=(2,2)),
    keras.layers.Flatten(),
    keras.layers.Dense(4096, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(4096, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation='softmax')
])
6. TensorBoard
At this point, we have the custom AlexNet network implemented.
Before we proceed onto training, validation, and evaluation of the network with data, we first have to set up some monitoring facilities.
TensorBoard is a tool that provides a suite of visualization and monitoring mechanisms. For the work in this tutorial, we’ll be utilizing TensorBoard to monitor the progress of the training of the network.
More specifically, we’ll be monitoring the following metrics: training loss, training accuracy, validation loss, validation accuracy.
In the short code snippet below, we create a reference to the directory where we would like all TensorBoard files to be stored. The function get_run_logdir returns the location of the exact directory, which is named according to the time the training phase starts.
To complete the process, we pass the directory that stores the TensorBoard-related files for a particular training session to the TensorBoard callback.
root_logdir = os.path.join(os.curdir, "logs\\fit\\")

def get_run_logdir():
    run_id = time.strftime("run_%Y_%m_%d-%H_%M_%S")
    return os.path.join(root_logdir, run_id)

run_logdir = get_run_logdir()
tensorboard_cb = keras.callbacks.TensorBoard(run_logdir)
7. Training and Results
To train the network, we have to compile it.
The compilation process involves specifying the following items:
Loss function: A method that quantifies ‘how well’ a machine learning model performs. The quantification is an output (cost) based on a set of inputs, which are referred to as parameter values. The parameter values are used to estimate a prediction, and the ‘loss’ is the difference between the predictions and the actual values (a short worked example follows this list).
Optimization Algorithm: An optimizer within a neural network is an algorithmic implementation that facilitates the process of gradient descent within a neural network by minimizing the loss values provided via the loss function. To reduce the loss, it is paramount the values of the weights within the network are selected appropriately.
Learning Rate: An integral component of a neural network implementation, as it is a factor that determines the magnitude of the updates made to the network's weight values. The learning rate is a type of hyperparameter.
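Below is a minimal, illustrative sketch of the sparse categorical cross-entropy loss used later in this article; the prediction values are made up for demonstration:
import tensorflow as tf

y_true = tf.constant([2])                  # the true class index
y_pred = tf.constant([[0.1, 0.2, 0.7]])    # predicted class probabilities
loss = tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
print(loss.numpy())                        # ~0.357, i.e. -ln(0.7)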
model.compile(loss='sparse_categorical_crossentropy', optimizer=tf.optimizers.SGD(learning_rate=0.001), metrics=['accuracy'])
model.summary()
We can also get more insight into the layer composition of the network by running the model.summary() function.
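As a quick sanity check on the summary output that follows, the parameter count of the first convolutional layer can be derived by hand:
# Conv2D parameters = (kernel_height * kernel_width * input_channels + 1 bias) * filters
print((11 * 11 * 3 + 1) * 96)  # 34944, matching the conv2d row in the summary below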
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 55, 55, 96)        34944
_________________________________________________________________
batch_normalization (BatchNo (None, 55, 55, 96)        384
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 27, 27, 96)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 27, 27, 256)       614656
_________________________________________________________________
batch_normalization_1 (Batch (None, 27, 27, 256)       1024
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 256)       0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 13, 13, 384)       885120
_________________________________________________________________
batch_normalization_2 (Batch (None, 13, 13, 384)       1536
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 13, 13, 384)       147840
_________________________________________________________________
batch_normalization_3 (Batch (None, 13, 13, 384)       1536
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 13, 13, 256)       98560
_________________________________________________________________
batch_normalization_4 (Batch (None, 13, 13, 256)       1024
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 6, 6, 256)         0
_________________________________________________________________
flatten (Flatten)            (None, 9216)              0
_________________________________________________________________
dense (Dense)                (None, 4096)              37752832
_________________________________________________________________
dropout (Dropout)            (None, 4096)              0
_________________________________________________________________
dense_1 (Dense)              (None, 4096)              16781312
_________________________________________________________________
dropout_1 (Dropout)          (None, 4096)              0
_________________________________________________________________
dense_2 (Dense)              (None, 10)                40970
=================================================================
Total params: 56,361,738
Trainable params: 56,358,986
Non-trainable params: 2,752
_________________________________________________________________
At this point, we are ready to train the network.
Training the custom AlexNet network is very simple with the Keras module enabled through TensorFlow. We simply have to call the fit() method and pass the relevant arguments.
Epoch: A numeric value that indicates the number of times the network has been exposed to all the data points within the training dataset.
model.fit(train_ds,
          epochs=50,
          validation_data=validation_ds,
          validation_freq=1,
          callbacks=[tensorboard_cb])
After executing this cell of code within the notebook, the network will begin to train and validate against the data provided. You’ll start to see training and validation logs such as the one shown below:
Train for 1562 steps, validate for 156 steps
Epoch 1/50
1/1562 [..............................] - ETA: 3:05:44 - loss: 5.6104 - accuracy: 0.0625
WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.168881). Check your callbacks.
1562/1562 [==============================] - 42s 27ms/step - loss: 2.0966 - accuracy: 0.3251 - val_loss: 1.4436 - val_accuracy: 0.4920
Epoch 2/50
1562/1562 [==============================] - 39s 25ms/step - loss: 1.5864 - accuracy: 0.4382 - val_loss: 1.2939 - val_accuracy: 0.5447
Epoch 3/50
1562/1562 [==============================] - 39s 25ms/step - loss: 1.4391 - accuracy: 0.4889 - val_loss: 1.1749 - val_accuracy: 0.5859
Epoch 4/50
1562/1562 [==============================] - 39s 25ms/step - loss: 1.3278 - accuracy: 0.5307 - val_loss: 1.0841 - val_accuracy: 0.6228
Epoch 5/50
1562/1562 [==============================] - 39s 25ms/step - loss: 1.2349 - accuracy: 0.5630 - val_loss: 1.0094 - val_accuracy: 0.6569
Epoch 6/50
1562/1562 [==============================] - 40s 25ms/step - loss: 1.1657 - accuracy: 0.5876 - val_loss: 0.9599 - val_accuracy: 0.6851
Epoch 7/50
1562/1562 [==============================] - 39s 25ms/step - loss: 1.1054 - accuracy: 0.6128 - val_loss: 0.9102 - val_accuracy: 0.6937
Epoch 8/50
1562/1562 [==============================] - 40s 26ms/step - loss: 1.0477 - accuracy: 0.6285 - val_loss: 0.8584 - val_accuracy: 0.7109
Epoch 9/50
1562/1562 [==============================] - 39s 25ms/step - loss: 1.0026 - accuracy: 0.6461 - val_loss: 0.8392 - val_accuracy: 0.7137
Epoch 10/50
1562/1562 [==============================] - 39s 25ms/step - loss: 0.9601 - accuracy: 0.6627 - val_loss: 0.7684 - val_accuracy: 0.7398
Epoch 11/50
1562/1562 [==============================] - 40s 25ms/step - loss: 0.9175 - accuracy: 0.6771 - val_loss: 0.7683 - val_accuracy: 0.7476
Epoch 12/50
1562/1562 [==============================] - 40s 25ms/step - loss: 0.8827 - accuracy: 0.6914 - val_loss: 0.7012 - val_accuracy: 0.7702
Epoch 13/50
1562/1562 [==============================] - 40s 25ms/step - loss: 0.8465 - accuracy: 0.7035 - val_loss: 0.6496 - val_accuracy: 0.7903
Epoch 14/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.8129 - accuracy: 0.7160 - val_loss: 0.6137 - val_accuracy: 0.7991
Epoch 15/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.7832 - accuracy: 0.7250 - val_loss: 0.6181 - val_accuracy: 0.7957
Epoch 16/50
1562/1562 [==============================] - 40s 25ms/step - loss: 0.7527 - accuracy: 0.7371 - val_loss: 0.6102 - val_accuracy: 0.7953
Epoch 17/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.7193 - accuracy: 0.7470 - val_loss: 0.5236 - val_accuracy: 0.8327
Epoch 18/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.6898 - accuracy: 0.7559 - val_loss: 0.5091 - val_accuracy: 0.8425
Epoch 19/50
1562/1562 [==============================] - 40s 25ms/step - loss: 0.6620 - accuracy: 0.7677 - val_loss: 0.4824 - val_accuracy: 0.8468
Epoch 20/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.6370 - accuracy: 0.7766 - val_loss: 0.4491 - val_accuracy: 0.8620
Epoch 21/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.6120 - accuracy: 0.7850 - val_loss: 0.4212 - val_accuracy: 0.8694
Epoch 22/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.5846 - accuracy: 0.7943 - val_loss: 0.4091 - val_accuracy: 0.8746
Epoch 23/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.5561 - accuracy: 0.8070 - val_loss: 0.3737 - val_accuracy: 0.8872
Epoch 24/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.5314 - accuracy: 0.8150 - val_loss: 0.3808 - val_accuracy: 0.8810
Epoch 25/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.5107 - accuracy: 0.8197 - val_loss: 0.3246 - val_accuracy: 0.9048
Epoch 26/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.4833 - accuracy: 0.8304 - val_loss: 0.3085 - val_accuracy: 0.9115
Epoch 27/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.4595 - accuracy: 0.8425 - val_loss: 0.2992 - val_accuracy: 0.9111
Epoch 28/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.4395 - accuracy: 0.8467 - val_loss: 0.2566 - val_accuracy: 0.9305
Epoch 29/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.4157 - accuracy: 0.8563 - val_loss: 0.2482 - val_accuracy: 0.9339
Epoch 30/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.3930 - accuracy: 0.8629 - val_loss: 0.2129 - val_accuracy: 0.9449
Epoch 31/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.3727 - accuracy: 0.8705 - val_loss: 0.1999 - val_accuracy: 0.9525
Epoch 32/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.3584 - accuracy: 0.8751 - val_loss: 0.1791 - val_accuracy: 0.9593
Epoch 33/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.3387 - accuracy: 0.8830 - val_loss: 0.1770 - val_accuracy: 0.9557
Epoch 34/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.3189 - accuracy: 0.8905 - val_loss: 0.1613 - val_accuracy: 0.9643
Epoch 35/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.3036 - accuracy: 0.8969 - val_loss: 0.1421 - val_accuracy: 0.9681
Epoch 36/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.2784 - accuracy: 0.9039 - val_loss: 0.1290 - val_accuracy: 0.9736
Epoch 37/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.2626 - accuracy: 0.9080 - val_loss: 0.1148 - val_accuracy: 0.9762
Epoch 38/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.2521 - accuracy: 0.9145 - val_loss: 0.0937 - val_accuracy: 0.9828
Epoch 39/50
1562/1562 [==============================] - 42s 27ms/step - loss: 0.2387 - accuracy: 0.9190 - val_loss: 0.1045 - val_accuracy: 0.9768
Epoch 40/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.2215 - accuracy: 0.9247 - val_loss: 0.0850 - val_accuracy: 0.9860
Epoch 41/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.2124 - accuracy: 0.9274 - val_loss: 0.0750 - val_accuracy: 0.9862
Epoch 42/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.1980 - accuracy: 0.9335 - val_loss: 0.0680 - val_accuracy: 0.9896
Epoch 43/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.1906 - accuracy: 0.9350 - val_loss: 0.0616 - val_accuracy: 0.9912
Epoch 44/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.1769 - accuracy: 0.9410 - val_loss: 0.0508 - val_accuracy: 0.9922
Epoch 45/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.1648 - accuracy: 0.9455 - val_loss: 0.0485 - val_accuracy: 0.9936
Epoch 46/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.1571 - accuracy: 0.9487 - val_loss: 0.0435 - val_accuracy: 0.9952
Epoch 47/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.1514 - accuracy: 0.9501 - val_loss: 0.0395 - val_accuracy: 0.9950
Epoch 48/50
1562/1562 [==============================] - 41s 26ms/step - loss: 0.1402 - accuracy: 0.9535 - val_loss: 0.0274 - val_accuracy: 0.9984
Epoch 49/50
1562/1562 [==============================] - 40s 26ms/step - loss: 0.1357 - accuracy: 0.9549 - val_loss: 0.0308 - val_accuracy: 0.9966
Epoch 50/50
1562/1562 [==============================] - 42s 27ms/step - loss: 0.1269 - accuracy: 0.9596 - val_loss: 0.0251 - val_accuracy: 0.9976
<tensorflow.python.keras.callbacks.History at 0x2de3aaa0ec8>
For better visualization and monitoring of training performance, we’ll use the TensorBoard functionality.
Open up a terminal at the directory level where the TensorBoard log folder exists and run the following command:
tensorboard --logdir logs
[Image: Directory level where the TensorBoard log file resides]
Follow the instructions in the terminal and navigate to 'localhost:6006' (this could be a different port number for you).
You will then be presented with a page that is similar to the image depicted below:
[Image: The TensorBoard tool]
Below is a snippet of the visualization of the complete training and validation phase provided by TensorBoard.
[Image: TensorBoard training and validation monitoring]
8. Evaluation
The last official step is to assess the trained network through network evaluation.
The evaluation phase will provide a performance score of the trained model on unseen data. For the evaluation phase of the model, we’ll be utilizing the batch of test data created at earlier steps.
Evaluating a model is very simple: you simply call the evaluate() method and pass the batched test data.
model.evaluate(test_ds)
After executing the cell block above, we are presented with a score that indicates the performance of the model on unseen data.
312/312 [==============================] - 8s 27ms/step - loss: 0.9814 - accuracy: 0.7439
[0.9813630809673132, 0.7438902]
The first element of the returned result contains the evaluation loss, 0.9814; the second element is the evaluation accuracy, 0.7439.
The custom-implemented AlexNet network was trained, validated, and evaluated on the CIFAR-10 dataset, producing a model with an evaluation accuracy of 74% on a test dataset containing 10,000 data points.
Bonus (Optional)
This section includes some information that supplements the implementation of an AlexNet convolutional neural network.
Although this additional information is not crucial for understanding the implementation process, these sections will provide readers with some background knowledge that can be leveraged in future work.
The sections covered are as follows:
Local Response Normalisation
Why we batch and shuffle the dataset before training
Local Response Normalisation
Many are familiar with batch normalization, but the AlexNet architecture used a different method of normalization within the network: Local Response Normalization (LRN).
LRN is a technique that introduces competition among the activations of neighbouring neurons. Neighbouring neurons here are neurons across several feature maps that share the same spatial position. By normalizing the activations, neurons with high activations are highlighted while their neighbours are dampened; this essentially mimics the lateral inhibition that happens within neurobiology.
LRN is not widely utilized in modern CNN architectures, as there are other, more effective methods of normalization. However, LRN implementations can still be found in some standard machine learning libraries and frameworks, so feel free to experiment.
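For readers who want to experiment, TensorFlow exposes LRN through tf.nn.local_response_normalization. Below is a minimal sketch using the hyperparameters reported in the AlexNet paper (n=5, k=2, alpha=1e-4, beta=0.75); the input tensor here is hypothetical:
import tensorflow as tf

# Hypothetical 4-D activation map: (batch, height, width, channels)
activations = tf.random.normal((1, 13, 13, 96))

# Each activation is divided by a term summed over neighbouring channels at the
# same spatial position, mimicking lateral inhibition
lrn_out = tf.nn.local_response_normalization(
    activations, depth_radius=5, bias=2.0, alpha=1e-4, beta=0.75)
print(lrn_out.shape)  # (1, 13, 13, 96); the shape is unchanged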
Why do we shuffle the dataset?
Shuffling the dataset before training is a traditional process within a typical machine learning project. But why do we do it?
When conducting data aggregation, it is common to consecutively accumulate images or data points that correspond to the same classes and labels. A typical result after loading the data used to train and validate a network is a set of images/data points arranged in order of their corresponding classes.
Neural networks learn by detecting patterns in the spatial information within images.
Suppose we have a dataset of 10,000 images with five classes. The first 2,000 images belong to Class 1, the second 2,000 images belong to Class 2, and so on.
During the training phase, if we present the network with unshuffled training data, we would find that the neural network will learn patterns that closely correlate to Class 1, as these are the images and data points the neural network is exposed to first. This will increase the difficulty of an optimization algorithm discovering an optimal solution for the entire dataset.
By shuffling the dataset, we ensure two key things:
1. There is large enough variance within the dataset that enables each data point within the training data to have an independent effect on the network. Therefore we can have a network that generalizes well to the entire dataset, rather than a subsection of the dataset.
2. Our validation partition of the dataset is obtained from the training data; if we fail to shuffle the dataset appropriately, we find that our validation dataset will not be representative of the classes within training data. For example, our validation dataset might only contain data points from the last class of the training data, as opposed to equal representation of every class with the dataset.
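A tiny, illustrative sketch (not part of the original notebook) makes the effect visible: a toy dataset ordered by class comes out interleaved after shuffling:
import tensorflow as tf

# Toy labels arranged in class order, as they often are after aggregation
labels = tf.data.Dataset.from_tensor_slices([0, 0, 0, 0, 1, 1, 1, 1])
print([int(x) for x in labels])    # [0, 0, 0, 0, 1, 1, 1, 1]

# Shuffling with a buffer covering the whole dataset interleaves the classes
shuffled = labels.shuffle(buffer_size=8, seed=42)
print([int(x) for x in shuffled])  # e.g. [1, 0, 0, 1, 0, 1, 1, 0]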
Why do we batch the dataset before training?
Dataset partitions are usually batched for memory optimization reasons. There are two ways you can train a network: approach #1 feeds the entire dataset to the network in one pass, while approach #2 feeds the dataset to the network in smaller batches.
Approach #1 will work for a small dataset, but when you start approaching a larger-sized dataset, you will find that approach #1 consumes a lot of memory resources.
By using approach #1 for a large dataset, the images or data points are all held in memory, and this typically causes an ‘Out of Memory’ error during training.
Approach #2 is a more conservative method of training a network with a large dataset while maintaining efficient memory management. By batching the training data, we only hold 16, 32, or 128 data points in memory at any given time, as opposed to the entire dataset.
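The short sketch below (again, illustrative rather than from the original notebook) shows how batching turns a dataset into a stream of fixed-size chunks, so only one chunk needs to live in memory at a time:
import tensorflow as tf

ds = tf.data.Dataset.range(10).batch(batch_size=4, drop_remainder=True)
for batch in ds:
    print(batch.numpy())  # [0 1 2 3], then [4 5 6 7]; the trailing 2 items are dropped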
Conclusion
This detailed article covers some topics surrounding typical processes within deep learning projects. We’ve gone through the following subject areas:
- Machine and deep learning tools and libraries
- Data partitioning
- Creating input and data pipelines using TensorFlow
- Data preprocessing
- Convolutional neural network implementation (AlexNet)
- Model performance monitoring using TensorBoard
- Model evaluation
In the future, we’ll cover the implementation of another well-known convolutional neural network architecture: GoogLeNet.
More From Me
To connect with me or find more content similar to this article, do the following:
Subscribe to my email list for weekly newsletters
Follow me on Medium
Connect and reach me on LinkedIn
Translated from: https://towardsdatascience.com/implementing-alexnet-cnn-architecture-using-tensorflow-2-0-and-keras-2113e090ad98