Caffe Official Tutorial Translation (6): Learning LeNet


Preface

Recently I decided to work through the official Caffe tutorials again and translate the official documentation along the way. I have added some annotations of my own, all marked in italics, and in places included problems I ran into or my own run results. Feedback and corrections are welcome; no flaming, please!
Link to the original tutorial: http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/01-learning-lenet.ipynb

Solving in Python with LeNet

In this example, we'll explore Caffe's Python interface, focusing on the Solver interface.

1. Setup

  • Set up the Python environment: we'll use the pylab import for numpy and plotting.

from pylab import *
%matplotlib inline

  • Import caffe, adding its path to sys.path. Make sure you have compiled pycaffe beforehand.

import sys
caffe_root = '/home/xhb/caffe/caffe/'  # root path of caffe; set this for your own machine
sys.path.insert(0, caffe_root + 'python')
import caffe

  • We'll use the data and net definition from the provided LeNet example (you need to download the data and create the databases yourself, as shown below).

# run scripts from caffe root
import os
os.chdir(caffe_root)
# Download data
!data/mnist/get_mnist.sh
# Prepare data
!examples/mnist/create_mnist.sh
# back to examples
os.chdir('examples')

Downloading...
Creating lmdb...
I0301 12:48:30.756855 995 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_train_lmdb
I0301 12:48:30.757007 995 convert_mnist_data.cpp:88] A total of 60000 items.
I0301 12:48:30.757015 995 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
I0301 12:48:35.242076 995 convert_mnist_data.cpp:108] Processed 60000 files.
I0301 12:48:35.257020 996 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_test_lmdb
I0301 12:48:35.257267 996 convert_mnist_data.cpp:88] A total of 10000 items.
I0301 12:48:35.257280 996 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
I0301 12:48:35.941156 996 convert_mnist_data.cpp:108] Processed 10000 files.
Done.

2. Creating the net

Now let's make a variant of LeNet, the classic 1989 convnet architecture.
We'll need two additional files:
- the net prototxt, defining the architecture and pointing to the train/test data
- the solver prototxt, defining the learning parameters (hyperparameters)
We start by creating the net. We'll write it as Python code that serializes to Caffe's protobuf model format in a succinct and natural way.
This network expects to read from the LMDB databases we created above, but it could also read directly from ndarrays using a MemoryDataLayer (see the sketch after the code below).

from caffe import layers as L, params as P

def lenet(lmdb, batch_size):
    # our version of LeNet: a series of linear and simple nonlinear transformations
    n = caffe.NetSpec()

    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                             transform_param=dict(scale=1./255), ntop=2)

    n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.fc1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.fc1, in_place=True)
    n.score = L.InnerProduct(n.relu1, num_output=10, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.score, n.label)

    return n.to_proto()

with open('mnist/lenet_auto_train.prototxt', 'w') as f:
    f.write(str(lenet('mnist/mnist_train_lmdb', 64)))

with open('mnist/lenet_auto_test.prototxt', 'w') as f:
    f.write(str(lenet('mnist/mnist_test_lmdb', 100)))
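As mentioned above, the LMDB Data layer could in principle be swapped for a MemoryData layer fed from ndarrays. A minimal sketch, assuming the standard MNIST shapes; Net.set_input_arrays is part of mainline pycaffe and expects float32 arrays, but this variant is untested in this walkthrough:

def lenet_memory(batch_size):
    # same idea as lenet(), but data comes from memory instead of LMDB
    n = caffe.NetSpec()
    n.data, n.label = L.MemoryData(batch_size=batch_size, channels=1, height=28, width=28, ntop=2)
    n.score = L.InnerProduct(n.data, num_output=10, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.score, n.label)
    return n.to_proto()

# after loading a solver on such a net, feed it float32 arrays like:
# solver.net.set_input_arrays(images.astype(np.float32), labels.astype(np.float32))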

Through Google's protobuf library, the net has now been written to disk in a more verbose but human-readable serialization format. You can read, write, and modify this description directly. Let's take a look at the train net.

!cat mnist/lenet_auto_train.prototxt

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00392156885937
  }
  data_param {
    source: "mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 50
    kernel_size: 5
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "fc1"
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "fc1"
  top: "fc1"
}
layer {
  name: "score"
  type: "InnerProduct"
  bottom: "fc1"
  top: "score"
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "score"
  bottom: "label"
  top: "loss"
}

Now let's look at the learning parameters (hyperparameters), which are also written in a prototxt file (already provided in the Caffe source). We're using SGD with momentum, weight decay, and a specific learning rate schedule.

# Note: I edited lenet_auto_solver.prototxt here, because I am not working under caffe_root
# and so cannot use relative paths; if the paths in this file are wrong, the programs below
# will simply die, so if nothing runs, check whether the paths defined in this file are correct.
!cat mnist/lenet_auto_solver.prototxt

# The train/test net protocol buffer definition
# train_net: "mnist/lenet_auto_train.prototxt"
train_net: "/home/xhb/caffe/caffe/examples/mnist/lenet_auto_train.prototxt"
# test_net: "mnist/lenet_auto_test.prototxt"
test_net: "/home/xhb/caffe/caffe/examples/mnist/lenet_auto_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "/home/xhb/caffe/caffe/examples/mnist/lenet"
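For reference, Caffe's "inv" policy decays the learning rate as base_lr * (1 + gamma * iter)^(-power). A small sketch of what the schedule above works out to:

# effective learning rate under the "inv" policy, using the solver values above
base_lr, gamma, power = 0.01, 0.0001, 0.75

def inv_lr(it):
    return base_lr * (1 + gamma * it) ** (-power)

print inv_lr(0), inv_lr(5000), inv_lr(10000)
# -> 0.01 at the start, ~0.0074 at iteration 5000, ~0.0059 at iteration 10000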

3. Loading and checking the solver

  • Let's pick a device and load the solver. We'll use SGD with momentum for optimization, but other methods (such as Adagrad and Nesterov's accelerated gradient) are also available.

# Note: I ran this on a laptop, so I used CPU mode rather than GPU mode
# caffe.set_device(0)
# caffe.set_mode_gpu()
caffe.set_mode_cpu()

### load the solver and create train and test nets
solver = None  # ignore this workaround for lmdb data (can't instantiate two solvers on the same data)
solver = caffe.SGDSolver('mnist/lenet_auto_solver.prototxt')
  • To get an idea of the architecture of our net, we can check the dimensions of the intermediate features (blobs) and parameters.

# each output is (batch size, feature dim, spatial dim)
[(k, v.data.shape) for k, v in solver.net.blobs.items()]

[('data', (64, 1, 28, 28)),
 ('label', (64,)),
 ('conv1', (64, 20, 24, 24)),
 ('pool1', (64, 20, 12, 12)),
 ('conv2', (64, 50, 8, 8)),
 ('pool2', (64, 50, 4, 4)),
 ('fc1', (64, 500)),
 ('score', (64, 10)),
 ('loss', ())]

# just print the weight sizes (we'll omit the biases)
[(k, v[0].data.shape) for k, v in solver.net.params.items()]

[('conv1', (20, 1, 5, 5)),
 ('conv2', (50, 20, 5, 5)),
 ('fc1', (500, 800)),
 ('score', (10, 500))]
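Note how the shapes fit together: the pool2 blob is (64, 50, 4, 4), and flattening 50 × 4 × 4 = 800 features per image gives exactly the input dimension of the fc1 weight matrix (500, 800): 500 outputs, 800 inputs.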
  • Before taking off, let's check that the whole net is loaded as we expect. We'll run a forward pass on the train and test nets and check that they contain your data.

solver.net.forward()  # train net
solver.test_nets[0].forward()  # test net (there can be more than one, hence the list)

{'loss': array(2.3477354049682617, dtype=float32)}

Note: my result here differs slightly from the official notebook's, which is: {'loss': array(2.365971088409424, dtype=float32)}
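Both values are close to what we should expect before any training: with random initialization the softmax output is roughly uniform over the 10 classes, so the loss should be about -ln(1/10):

import numpy as np
print -np.log(1. / 10)  # -> 2.302585..., the loss of a uniform 10-way guess

The small gaps between 2.30 and the values above just reflect the particular random weight draws.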

# use a little trick to tile the first eight images
imshow(solver.net.blobs['data'].data[:8, 0].transpose(1, 0, 2).reshape(28, 8*28), cmap='gray')
axis('off')
print 'train labels:', solver.net.blobs['label'].data[:8]

train labels: [ 5.  0.  4.  1.  9.  2.  1.  3.]

imshow(solver.test_nets[0].blobs['data'].data[:8, 0].transpose(1, 0, 2).reshape(28, 8*28), cmap='gray')
axis('off')
print 'test labels:', solver.test_nets[0].blobs['label'].data[:8]

test labels: [ 7.  2.  1.  0.  4.  1.  4.  9.]

4. Stepping the solver

Both the train and test nets are loading data and labels correctly.
- Let's take one step of SGD and see what happens.

solver.step(1)

imshow(solver.net.params['conv1'][0].diff[:, 0].reshape(4, 5, 5, 5).transpose(0, 2, 1, 3).reshape(4*5, 5*5), cmap='gray')
axis('off')

(-0.5, 24.5, 19.5, -0.5)
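To unpack that indexing: conv1's weight gradient has shape (20, 1, 5, 5); [:, 0] selects the single input channel, leaving twenty 5×5 filters, and the reshape/transpose pair tiles them into a 4×5 grid, i.e. one 20×25 image of all the first-layer filter gradients. The tuple printed at the end is just the axis limits returned by axis('off').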

5. Writing a custom training loop

Something is happening. Let's run the net for a while, keeping track of a few things as it goes. Note that this is the same process as training with the compiled caffe binary. In particular:
- logging will continue to print to the terminal as usual
- snapshots (saved intermediate models) will be taken at the interval specified in the solver prototxt, here every 5000 iterations
- testing will happen at the specified interval, here every 500 iterations
Since we control the loop in Python, we are free to compute additional things as we go, as shown below.
We can also do other things, for example:
- write a custom stopping criterion
- change the solving process by updating the net inside the loop

%%time
niter = 200
test_interval = 25
# losses will also be stored in the log
train_loss = zeros(niter)
test_acc = zeros(int(np.ceil(niter / test_interval)))
output = zeros((niter, 8, 10))

# the main solver loop
for it in range(niter):
    solver.step(1)  # SGD by Caffe

    # store the train loss
    train_loss[it] = solver.net.blobs['loss'].data

    # store the output on the first test batch
    # (start the forward pass at conv1 to avoid loading new data)
    solver.test_nets[0].forward(start='conv1')
    output[it] = solver.test_nets[0].blobs['score'].data[:8]

    # run a full test every so often
    # (Caffe can also do this for us and write to a log, but we show here
    #  how to do it directly in Python, where more complicated things are easier.)
    if it % test_interval == 0:
        print 'Iteration', it, 'testing...'
        correct = 0
        for test_it in range(100):
            solver.test_nets[0].forward()
            correct += sum(solver.test_nets[0].blobs['score'].data.argmax(1)
                           == solver.test_nets[0].blobs['label'].data)
        test_acc[it // test_interval] = correct / 1e4

Iteration 0 testing...
Iteration 25 testing...
Iteration 50 testing...
Iteration 75 testing...
Iteration 100 testing...
Iteration 125 testing...
Iteration 150 testing...
Iteration 175 testing...
CPU times: user 1min 21s, sys: 68 ms, total: 1min 21s
Wall time: 1min 20s
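A note on the accuracy computation: each full test runs 100 batches of 100 images, i.e. all 10,000 MNIST test images, so dividing the number of correct predictions by 1e4 gives the test accuracy directly.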
  • Let's plot the train loss and test accuracy.

_, ax1 = subplots()
ax2 = ax1.twinx()
ax1.plot(arange(niter), train_loss)
ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
ax2.set_ylabel('test accuracy')
ax2.set_title('Test Accuracy: {:.2f}'.format(test_acc[-1]))

Text(0.5,1,u'Test Accuracy: 0.94')

The loss seems to drop quickly and converge (apart from local stochastic oscillations), while the accuracy rises correspondingly. Hooray!
- Since we saved the results on the first test batch, we can also watch how the prediction scores evolved. We'll put time on the x axis and each possible label on the y axis, with lightness indicating confidence.

for i in range(8):
    figure(figsize=(2, 2))
    imshow(solver.test_nets[0].blobs['data'].data[i, 0], cmap='gray')
    figure(figsize=(10, 2))
    imshow(output[150:200, i].T, interpolation='nearest', cmap='gray')
    xlabel('iteration')
    ylabel('label')

[Figures: each of the eight test digits, each followed by its raw score trajectory over iterations 150-200.]
We started out with little ability to classify any of these digits correctly, but by the end we classify them all. If you've been following along, you'll see the last digit is the hardest: a slanted "9" that is easily mistaken for a "4".
- Note that these are the raw output scores of the net rather than the softmax-computed probability vectors. The latter, shown below, make the net's confidence easier to read off.

for i in range(8):
    figure(figsize=(2, 2))
    imshow(solver.test_nets[0].blobs['data'].data[i, 0], cmap='gray')
    figure(figsize=(10, 2))
    imshow(exp(output[150:200, i].T) / exp(output[150:200, i].T).sum(0), interpolation='nearest', cmap='gray')
    xlabel('iteration')
    ylabel('label')

[Figures: the same eight digits, now with softmax probability trajectories.]

6. Experimenting with architecture and optimization

Now that we have defined LeNet nets for training and testing, there are other things to try:
- define new architectures and compare them against the current one
- tune optimization by setting base_lr and the like, or simply train longer
- switch the solver type, for example replacing SGD with AdaDelta or Adam
You can explore these on your own by editing the all-in-one example below. Comments marked "EDIT HERE" indicate the suggested places to make changes.
By default, a simple linear classifier is defined as a baseline.
If the changes you make don't work out, try the following suggestions (a hedged sketch combining suggestions 1 and 4 appears after the example):
1. Switch the nonlinearity from ReLU to ELU, or to a saturating nonlinearity such as Sigmoid
2. Stack more fully connected and nonlinear layers
3. Search over learning rates in steps of 10x (for instance, 0.1 and 0.001)
4. Switch the solver type to Adam (this adaptive solver type is generally less sensitive to hyperparameters, but that is not guaranteed...)
5. Train longer by setting niter higher (say 500 or 1000) to see the differences

examples_path = '/home/xhb/caffe/caffe/examples/'
train_net_path = examples_path + 'mnist/custom_auto_train.prototxt'
test_net_path = examples_path + 'mnist/custom_auto_test.prototxt'
solver_config_path = examples_path + 'mnist/custom_auto_solver.prototxt'

### define net
def custom_net(lmdb, batch_size):
    # define your own net!
    n = caffe.NetSpec()

    # keep this data layer for all networks
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                             transform_param=dict(scale=1./255), ntop=2)

    # EDIT HERE to try different networks
    # this single layer defines a simple linear classifier
    # (in particular this defines a multiway logistic regression)
    n.score = L.InnerProduct(n.data, num_output=10, weight_filler=dict(type='xavier'))

    # EDIT HERE this is the LeNet variant we have already tried
    # n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))
    # n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    # n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))
    # n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    # n.fc1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))
    # EDIT HERE consider L.ELU or L.Sigmoid for the nonlinearity
    # n.relu1 = L.ReLU(n.fc1, in_place=True)
    # n.score = L.InnerProduct(n.fc1, num_output=10, weight_filler=dict(type='xavier'))

    # keep this loss layer for all networks
    n.loss = L.SoftmaxWithLoss(n.score, n.label)

    return n.to_proto()

with open(train_net_path, 'w') as f:
    f.write(str(custom_net('mnist/mnist_train_lmdb', 64)))
with open(test_net_path, 'w') as f:
    f.write(str(custom_net('mnist/mnist_test_lmdb', 100)))

### define solver
from caffe.proto import caffe_pb2
s = caffe_pb2.SolverParameter()

# Set a seed for reproducible experiments:
# this controls for randomization in training.
s.random_seed = 0xCAFFE

# Specify locations of the train and (maybe) test networks.
s.train_net = train_net_path
s.test_net.append(test_net_path)
s.test_interval = 500  # Test after every 500 training iterations.
s.test_iter.append(100)  # Test on 100 batches each time we test.

s.max_iter = 10000  # no. of times to update the net (training iterations)

# EDIT HERE to try different solvers
# solver types include "SGD", "Adam", and "Nesterov" among others.
s.type = "SGD"

# Set the initial learning rate for SGD.
s.base_lr = 0.01  # EDIT HERE to try different learning rates
# Set momentum to accelerate learning by
# taking weighted average of current and previous updates.
s.momentum = 0.9
# Set weight decay to regularize and prevent overfitting
s.weight_decay = 5e-4

# Set `lr_policy` to define how the learning rate changes during training.
# This is the same policy as our default LeNet.
s.lr_policy = 'inv'
s.gamma = 0.0001
s.power = 0.75
# EDIT HERE to try the fixed rate (and compare with adaptive solvers)
# `fixed` is the simplest policy that keeps the learning rate constant.
# s.lr_policy = 'fixed'

# Display the current training loss and accuracy every 1000 iterations.
s.display = 1000

# Snapshots are files used to store networks we've trained.
# We'll snapshot every 5K iterations -- twice during training.
s.snapshot = 5000
s.snapshot_prefix = 'mnist/custom_net'

# Train on the GPU
s.solver_mode = caffe_pb2.SolverParameter.GPU

# Write the solver to a temporary file and return its filename.
with open(solver_config_path, 'w') as f:
    f.write(str(s))

### load the solver and create train and test nets
solver = None  # ignore this workaround for lmdb data (can't instantiate two solvers on the same data)
solver = caffe.get_solver(solver_config_path)

### solve
niter = 250  # EDIT HERE increase to train for longer
test_interval = niter / 10
# losses will also be stored in the log
train_loss = zeros(niter)
test_acc = zeros(int(np.ceil(niter / test_interval)))

# the main solver loop
for it in range(niter):
    solver.step(1)  # SGD by Caffe

    # store the train loss
    train_loss[it] = solver.net.blobs['loss'].data

    # run a full test every so often
    # (Caffe can also do this for us and write to a log, but we show here
    #  how to do it directly in Python, where more complicated things are easier.)
    if it % test_interval == 0:
        print 'Iteration', it, 'testing...'
        correct = 0
        for test_it in range(100):
            solver.test_nets[0].forward()
            correct += sum(solver.test_nets[0].blobs['score'].data.argmax(1)
                           == solver.test_nets[0].blobs['label'].data)
        test_acc[it // test_interval] = correct / 1e4

_, ax1 = subplots()
ax2 = ax1.twinx()
ax1.plot(arange(niter), train_loss)
ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
ax2.set_ylabel('test accuracy')
ax2.set_title('Custom Test Accuracy: {:.2f}'.format(test_acc[-1]))

Iteration 0 testing...
Iteration 25 testing...
Iteration 50 testing...
Iteration 75 testing...
Iteration 100 testing...
Iteration 125 testing...
Iteration 150 testing...
Iteration 175 testing...
Iteration 200 testing...
Iteration 225 testing...
Text(0.5,1,u'Custom Test Accuracy: 0.88')
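As a concrete example of suggestions 1 and 4 together, here is a minimal sketch. It assumes your Caffe build includes the ELU layer and the Adam solver (both are in mainline Caffe, but this variant is untested here):

def custom_net_elu(lmdb, batch_size):
    # a small ELU MLP in place of the linear classifier (touches suggestion 2 as well)
    n = caffe.NetSpec()
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                             transform_param=dict(scale=1./255), ntop=2)
    n.fc1 = L.InnerProduct(n.data, num_output=500, weight_filler=dict(type='xavier'))
    n.elu1 = L.ELU(n.fc1, in_place=True)
    n.fc2 = L.InnerProduct(n.fc1, num_output=500, weight_filler=dict(type='xavier'))
    n.elu2 = L.ELU(n.fc2, in_place=True)
    n.score = L.InnerProduct(n.fc2, num_output=10, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.score, n.label)
    return n.to_proto()

# and for suggestion 4, in the solver definition above:
# s.type = "Adam"
# s.momentum = 0.9     # Adam's beta1
# s.momentum2 = 0.999  # Adam's beta2
# s.delta = 1e-8       # Adam's epsilon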
