Recalling the ImageNet training steps: http://caffe.berkeleyvision.org/gathered/examples/imagenet.html
Brewing ImageNet
This guide is meant to get you ready to train your own model on your own data. If you just want an ImageNet-trained network, then note that since training takes a lot of energy and we hate global warming, we provide the CaffeNet model trained as described below in the model zoo.
Data Preparation
The guide specifies all paths and assumes all commands are executed from the root caffe directory.
By “ImageNet” we here mean the ILSVRC12 challenge, but you can easily train on the whole of ImageNet as well, just with more disk space, and a little longer training time.
We assume that you already have downloaded the ImageNet training data and validation data, and they are stored on your disk like:
/path/to/imagenet/train/n01440764/n01440764_10026.JPEG
/path/to/imagenet/val/ILSVRC2012_val_00000001.JPEG
You will first need to prepare some auxiliary data for training. This data can be downloaded by:
./data/ilsvrc12/get_ilsvrc_aux.sh
The training and validation input are described in train.txt and val.txt as text listing all the files and their labels. Note that we use a different indexing for labels than the ILSVRC devkit: we sort the synset names in their ASCII order, and then label them from 0 to 999. See synset_words.txt for the synset/name mapping.
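For example, the first lines of train.txt pair each image path with its zero-based label (n01440764 sorts first in ASCII order, so its label is 0):

n01440764/n01440764_10026.JPEG 0
n01440764/n01440764_10027.JPEG 0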
You may want to resize the images to 256x256 in advance. By default, we do not explicitly do this because in a cluster environment, one may benefit from resizing images in a parallel fashion, using mapreduce. For example, Yangqing used his lightweight mincepie package. If you prefer things to be simpler, you can also use shell commands, something like:
for name in /path/to/imagenet/val/*.JPEG; do
  convert -resize 256x256\! $name $name
done
Take a look at examples/imagenet/create_imagenet.sh. Set the paths to the train and val dirs as needed, and set "RESIZE=true" to resize all images to 256x256 if you haven't resized the images in advance. Now simply create the leveldbs with examples/imagenet/create_imagenet.sh. Note that examples/imagenet/ilsvrc12_train_leveldb and examples/imagenet/ilsvrc12_val_leveldb should not exist before this execution; they will be created by the script. GLOG_logtostderr=1 simply dumps more information for you to inspect, and you can safely ignore it.
Compute Image Mean
The model requires us to subtract the image mean from each image, so we have to compute the mean. tools/compute_image_mean.cpp implements that - it is also a good example to familiarize yourself with how to manipulate the multiple components, such as protocol buffers, leveldbs, and logging, if you are not familiar with them. Anyway, the mean computation can be carried out as:
./examples/imagenet/make_imagenet_mean.sh
which will make data/ilsvrc12/imagenet_mean.binaryproto.
Model Definition
We are going to describe a reference implementation for the approach first proposed by Krizhevsky, Sutskever, and Hinton in their NIPS 2012 paper.
The network definition (models/bvlc_reference_caffenet/train_val.prototxt) follows the one in Krizhevsky et al. Note that if you deviated from file paths suggested in this guide, you'll need to adjust the relevant paths in the .prototxt files.
If you look carefully at models/bvlc_reference_caffenet/train_val.prototxt, you will notice several include sections specifying either phase: TRAIN or phase: TEST. These sections allow us to define two closely related networks in one file: the network used for training and the network used for testing. These two networks are almost identical, sharing all layers except for those marked with include { phase: TRAIN } or include { phase: TEST }. In this case, only the input layers and one output layer are different.
Input layer differences: The training network's data input layer draws its data from examples/imagenet/ilsvrc12_train_leveldb and randomly mirrors the input image. The testing network's data layer takes data from examples/imagenet/ilsvrc12_val_leveldb and does not perform random mirroring.
Output layer differences: Both networks output the softmax_loss layer, which in training is used to compute the loss function and to initialize the backpropagation, while in validation this loss is simply reported. The testing network also has a second output layer, accuracy, which is used to report the accuracy on the test set. In the process of training, the test network will occasionally be instantiated and tested on the test set, producing lines like Test score #0: xxx and Test score #1: xxx. In this case score 0 is the accuracy (which will start around 1/1000 = 0.001 for an untrained network) and score 1 is the loss (which will start around 7 for an untrained network).
We will also lay out a protocol buffer for running the solver. Let’s make a few plans:
We will run in batches of 256, and run a total of 450,000 iterations (about 90 epochs). For every 1,000 iterations, we test the learned net on the validation data. We set the initial learning rate to 0.01, and decrease it every 100,000 iterations (about 20 epochs). Information will be displayed every 20 iterations. The network will be trained with momentum 0.9 and a weight decay of 0.0005. For every 10,000 iterations, we will take a snapshot of the current status.
Sound good? This is implemented in?models/bvlc_reference_caffenet/solver.prototxt.
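For reference, the settings in that file implement exactly this plan; modulo details of your checkout, they look like:

net: "models/bvlc_reference_caffenet/train_val.prototxt"
test_iter: 1000
test_interval: 1000
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/bvlc_reference_caffenet/caffenet_train"
solver_mode: GPU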
Training ImageNet
Ready? Let’s train.
./build/tools/caffe train --solver=models/bvlc_reference_caffenet/solver.prototxt
Sit back and enjoy!
Dataset preparation:
ImageNet consists of variable-resolution images, while our system requires a constant input dimensionality. Therefore, we down-sampled the images to a fixed resolution of 256 × 256. Given a rectangular image, we first rescaled the image such that the shorter side was of length 256, and then cropped out the central 256×256 patch from the resulting image. We did not pre-process the images in any other way, except for subtracting the mean activity over the training set from each pixel.
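A minimal sketch of that preprocessing with Pillow (for illustration only; the Caffe scripts above instead squash images straight to 256x256):

from PIL import Image

def alexnet_preprocess(path, size=256):
    # Rescale so the shorter side equals `size`, then crop the central size x size patch.
    img = Image.open(path)
    w, h = img.size
    scale = float(size) / min(w, h)
    img = img.resize((int(round(w * scale)), int(round(h * scale))), Image.BILINEAR)
    left = (img.width - size) // 2
    top = (img.height - size) // 2
    return img.crop((left, top, left + size, top + size))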
Reference: http://blog.csdn.net/u010417185/article/details/52651761
Crop in data augmentation:
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 600
    mean_file: "examples/images/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/images/train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 600
    mean_file: "examples/images/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/images/val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
From the data layer definition above, you can see that mirroring and crop_size are used, and a mean_file is defined.
Cropping with crop_size exposes both the central region and corner features, and mirror produces mirrored copies, which helps compensate for a small dataset.
The key point here is how crop_size behaves differently in the training and test layers:
First, understand that mean_file and crop_size have little to do with each other. The mean_file is built from the training images, and crop_size crops the training images; both operate on the original training set. If the original training images are 800*800 and the crop_size patches are 600*600, the mean_file and the images that crop_size operates on both come from the 800*800 image set.
The paper crops 224x224 regions from 256x256 images; if the input exceeds 256, the crop size needs to grow as well. Multi-scale training does advocate applying the same crop size to inputs of different sizes, but there the largest input is only 512, so the gap is still acceptable.
In Caffe, if crop_size is defined, images larger than crop_size are randomly cropped during training, while at test time only the central patch is taken (see /caffe/src/caffe/data_transformer.cpp):
// We only do random crop when we do training.
if (phase_ == TRAIN) {
  h_off = Rand(datum_height - crop_size + 1);
  w_off = Rand(datum_width - crop_size + 1);
} else {
  h_off = (datum_height - crop_size) / 2;
  w_off = (datum_width - crop_size) / 2;
}
Below is a program I found online that does the cropping manually.
See the linked post for details: http://blog.csdn.net/u011762313/article/details/48343799
We can crop the images ourselves and feed the crops into pycaffe, which can improve the recognition rate (in the pycaffe classify-with-a-caffemodel flow, replace the classification step with the following):
# Accumulate class probabilities over all crops (Python 2 code: the two
# range() lists are concatenated, sliding the window left-to-right and back).
pridects = np.zeros((1, CLASS_NUM))

img_shape = np.array(img.shape)

# Crop dimensions (height, width)
crop_dims = (32, 96)
crop_dims = np.array(crop_dims)

# How far the crop window can slide along the width
w_range = img_shape[1] - crop_dims[1]

for k in range(0, w_range + 1, crop_dims[1] / 4) + range(w_range, 1, -crop_dims[1] / 4):
    # Crop a window at horizontal offset k
    crop_img = img[:, k:k + crop_dims[1], :]
    # Preprocess and feed the crop into the network
    net.blobs['data'].data[...] = transformer.preprocess('data', crop_img)
    # Forward pass
    out = net.forward()
    # Accumulate the probabilities
    pridects += out['prob']

# The prediction is the class with the largest accumulated probability
pridect = pridects.argmax()
Caffe provides an oversampling helper (oversample; see /caffe/python/caffe/io.py) that crops the center, the four corners, and their mirrors, 10 images in total.
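A quick sketch of how oversample might be used (the 227x227 crop size is an assumption, matching CaffeNet):

import caffe

img = caffe.io.load_image('temp.jpg')           # H x W x K array, float values in [0, 1]
crops = caffe.io.oversample([img], (227, 227))  # (10, 227, 227, K): center + 4 corners, each mirrored
# Preprocess each crop, forward it through the net, and average the 'prob' outputs.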
After defining the network with pycaffe and running training and testing to obtain a caffemodel file, we can use that caffemodel for classification:

import caffe
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;"><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># caffemodel文件</span>
MODEL_FILE = <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'model/_iter_10000.caffemodel'</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># deploy文件,參考/caffe/models/bvlc_alexnet/deploy.prototxt</span>
DEPLOY_FILE = <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'deploy.prototxt'</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 測試圖片存放文件夾</span>
TEST_ROOT = <span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'datas/'</span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li></ul>
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;">caffe.set_mode_gpu()
net = caffe.Net(DEPLOY_FILE, MODEL_FILE, caffe.TEST)</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li></ul>
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;"><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 'data'對應于deploy文件:</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># input: "data"</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># input_dim: 1</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># input_dim: 3</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># input_dim: 32</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># input_dim: 96</span>
transformer = caffe.io.Transformer({<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'data'</span>: net.blobs[<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'data'</span>].data.shape})
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># python讀取的圖片文件格式為H×W×K,需轉化為K×H×W</span>
transformer.set_transpose(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'data'</span>, (<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>))
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># python中將圖片存儲為[0, 1],而caffe中將圖片存儲為[0, 255],</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 所以需要一個轉換</span>
transformer.set_raw_scale(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'data'</span>, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">255</span>)
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># caffe中圖片是BGR格式,而原始格式是RGB,所以要轉化</span>
transformer.set_channel_swap(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'data'</span>, (<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>))
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 將輸入圖片格式轉化為合適格式(與deploy文件相同)</span>
net.blobs[<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'data'</span>].reshape(<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">3</span>, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">32</span>, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">96</span>)</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li></ul>
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;"><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 詳見/caffe/python/caffe/io.py</span>
img = caffe.io.load_image(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'temp.jpg'</span>)
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 讀取的圖片文件格式為H×W×K,需轉化</span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li></ul>
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;"><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 數據輸入、預處理</span>
net.blobs[<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'data'</span>].data[...] = transformer.preprocess(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'data'</span>, img)
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 前向迭代,即分類</span>
out = net.forward()
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 輸出結果為各個可能分類的概率分布</span>
pridects = out[<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'prob'</span>]
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 上述'prob'來源于deploy文件:</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># layer {</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># name: "prob"</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># type: "Softmax"</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># bottom: "ip2"</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># top: "prob"</span>
<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># }</span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li></ul>
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;">pridect = pridects.argmax()</code>
Note: if the images are large, reduce batch_size accordingly; otherwise the GPU may run out of memory and raise an error.
In AlexNet training, the train-set batch_size is 256 and the test-set batch_size is 50, which is not proportional to the sizes of the two sets.
On the question of input image size: http://caffecn.cn/?/question/74
It is worth reading the caffe.proto file, which holds the detailed parameter definitions for every layer type; ConvolutionParameter has what you are looking for.
Look at how convert_imageset is invoked in examples/imagenet (create_imagenet.sh):
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/ilsvrc12_train_lmdb
To build a network, the first requirement is that the data flow is consistent: every layer's output shape must be an integer, not a fraction. Whether the resulting network performs well is a separate question; as long as the shapes work out, the input images can be any shape.

If your image is a square with side 256, a convolution layer's output side is [(256 - kernel_size) / stride] + 1, and this value must be an integer to have physical meaning: a feature map with side 7.7 is meaningless. The same reasoning applies to pooling layers. A fully connected layer's output shape is always an integer; its only requirement is that its input be of fixed length throughout training.

If your images are not square, you can scale them to a uniform (non-square) size when building the leveldb/lmdb database, and then use a non-square kernel_size so that the convolution outputs are still integers.
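A small helper makes this shape check concrete (a sketch; the pad term generalizes the formula above, and Caffe's pooling layers actually round up rather than failing):

def conv_output_size(input_size, kernel_size, stride, pad=0):
    # Output side length of a convolution; only an exact division is physically meaningful.
    out = (input_size + 2 * pad - kernel_size) / float(stride) + 1
    assert out == int(out), "fractional feature map size has no physical meaning"
    return int(out)

# CaffeNet's conv1: 227x227 input, 11x11 kernel, stride 4 -> 55x55 feature maps.
print(conv_output_size(227, 11, 4))  # 55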
Other questions: http://blog.csdn.net/u010417185/article/details/52649178
1. Does computing the image mean require uniform image sizes?
When computing the image mean, first make all images the same size; otherwise an error will be raised.
Quoting part of the official explanation:
Mean subtraction is a common preprocessing step. Following the PCA chapter of the UFLDL tutorial, there are two variants for images. The first, commonly used, can be called the per-dimension mean: a separate mean is subtracted within each input dimension (each pixel position). The second is the per-image mean: the UFLDL tutorial notes that when training on natural images, computing an independent mean and variance for each pixel (each dimension) makes little sense, because images are statistically stationary - the statistics of one part of an image match those of any other part. The tutorial's advice is that for non-natural images (e.g. MNIST, or a single object on a white background), other kinds of normalization are worth considering, but for natural images the per-image mean is a reasonable default.
The point of this passage is that the appropriate mean-subtraction method can change with the kind of images you train on.
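As a rough illustration of the two options (a numpy sketch on random data, not Caffe code):

import numpy as np

# N flattened images, one per row.
X = np.random.rand(100, 256 * 256).astype(np.float32)

# Per-dimension mean: one mean per pixel position, computed over the dataset
# (the kind of statistic stored in imagenet_mean.binaryproto).
X_dim = X - X.mean(axis=0)

# Per-image mean: each image has its own scalar mean intensity subtracted.
X_img = X - X.mean(axis=1, keepdims=True)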
With that understood, let's look at the error raised when the image sizes are not uniform:
The error message clearly shows a failed "size_in_datum == data_size" check.
Here is the cause of the problem:
When converting images to LevelDB you hit "Check failed: data.size() == data_size". The root cause is in the source: the failing line is convert_imageset.cpp:84 (F0714 20:31:14.899121 26565 convert_imageset.cpp:84]), namely CHECK_EQ(data.size(), data_size) << "Incorrect data field size " << data.size(); - the two sizes disagree. Looking at the code:
int data_size;
bool data_size_initialized = false;
for (int line_id = 0; line_id < lines.size(); ++line_id) {
  if (!ReadImageToDatum(root_folder + lines[line_id].first,
                        lines[line_id].second, datum)) {
    continue;
  }
  if (!data_size_initialized) {
    data_size = datum.channels() * datum.height() * datum.width();
    data_size_initialized = true;
  } else {
    const string& data = datum.data();
    CHECK_EQ(data.size(), data_size) << "Incorrect data field size "
        << data.size();
  }
}
From the code above: in the first iteration data_size_initialized is false, so the if (!data_size_initialized) branch sets data_size = datum.channels() * datum.height() * datum.width() and flips data_size_initialized to true. Every later iteration takes the else branch, so adding images of a different size triggers the check failure. The options are to resize all images to the same size before converting to LevelDB, or to change the call to the resizing overload ReadImageToDatum(root_folder + lines[line_id].first, lines[line_id].second, width, height, datum).
Reference post: http://blog.csdn.net/alan317/article/details/37772457
2. In practice image sizes vary, and both enlarging and shrinking can distort them; how should the data be handled then?
If the images vary in size and heavy scaling would badly distort them and lose information, you cannot simply normalize the image size directly.
Approach:
Use a compromise size: for example, fix the width of every image at 600 and scale the height proportionally. The resized images can then be sliced into patches, which normalizes the final input size (see the sketch below).
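A sketch of that idea with Pillow (the width of 600 and the vertical slicing are illustrative choices):

from PIL import Image

def resize_fixed_width(img, target_w=600):
    # Fix the width and scale the height proportionally, avoiding distortion.
    w, h = img.size
    return img.resize((target_w, int(round(h * target_w / float(w)))), Image.BILINEAR)

def slice_square_patches(img, size=600):
    # Cut the resized image into size x size patches from top to bottom.
    return [img.crop((0, top, size, top + size))
            for top in range(0, img.height - size + 1, size)]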
3. What does crop_size do?
It crops the image: if the original is 800*800 but detection only needs 600*600, crop_size extracts the patch. In TRAIN phase the crop is taken at random; in the other phases only the central region is taken.
For details, see http://blog.csdn.net/u010417185/article/details/52651761
4. Determining the test_iter value in the solver configuration file
net: "examples/cifar10/cifar10_quick_train_test.prototxt"
test_iter: 100
test_interval: 100
base_lr: 0.001
momentum: 0.9
weight_decay: 0.004
lr_policy: "fixed"
display: 100
max_iter: 4000
snapshot: 4000
snapshot_format: HDF5
snapshot_prefix: "examples/cifar10/cifar10_quick"
solver_mode: CPU
When writing the solver configuration, I was at first unsure how test_iter is computed: from the batch size together with the whole image collection (training set plus test set), or from a single image set. After reading the given explanation and example carefully, it is determined by the test batch size and the test set alone: if the batch size is 100, the training set has 6,000 images and the test set has 1,000 images, then test_iter = 1000/100 = 10, independent of the training set size.
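In other words, as a quick check:

test_set_size = 1000    # images in the test set
test_batch_size = 100   # batch_size of the TEST-phase data layer
test_iter = test_set_size // test_batch_size  # 10 iterations cover the whole test set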
Overall workflow, following http://blog.csdn.net/alexqiweek/article/details/51281240:
1. Data preparation
Under caffe/data, create a directory myself, and inside it two directories, train and val.
Note: the images must be in .jpeg format.
train holds the training data; it contains two directories, bird (70 images) and cat (70 images).
val holds the data used for testing: 20 images each of bird and cat.
In a terminal, switch to caffe/data/myself and generate train.txt, val.txt, and test.txt from the data above.
test.txt has the same content as val.txt, just without the trailing numeric labels.
Command to generate val.txt: find -name *.jpeg | grep -v train | cut -d/ -f3 > val.txt
Command to generate train.txt: find -name *.jpeg | grep train | cut -d/ -f3-4 > train.txt. Since the bird and cat images must be distinguished by appending different labels, two more commands are needed: sed -i '1,70s/.*/&  0/' train.txt and sed -i '71,141s/.*/&  1/' train.txt (see the sample lines below).
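After these commands, train.txt should look something like the following (the filenames are hypothetical):

bird/bird_0001.jpeg  0
...
cat/cat_0001.jpeg  1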
2. Creating the databases
Under caffe/examples, create a directory myself, and copy create_imagenet.sh from caffe/examples/imagenet into it.
The contents of create_imagenet.sh:
Line 5: EXAMPLE sets the path where the generated database files are written.
Line 6: DATA sets the path holding the source files used to build the database.
Line 9: TRAIN_DATA_ROOT gives the absolute path to the training data.
Line 10: VAL_DATA_ROOT gives the absolute path to the validation data. If TRAIN_DATA_ROOT or VAL_DATA_ROOT is wrong, you will get a pile of image-not-found errors.
Lines 12 to 21 resize the images to a uniform 256x256.
Lines 45 and 55 set the names of the generated database directories.
From the caffe root directory, run ./examples/myself/create_imagenet.sh; two database directories will be generated under the EXAMPLE path set in create_imagenet.sh (here, examples/myself).
3. Training the network [training with CaffeNet may take longer than with LeNet; this experiment uses CaffeNet]
① Copy train_val.prototxt from models/bvlc_alexnet to examples/myself.
This file defines the structure of the network to be trained.
② Copy solver.prototxt from models/bvlc_alexnet to examples/myself.
This file holds the configuration and settings needed when training the network.
Line 1 gives the relative path to the file defining the network structure.
③ Copy make_imagenet_mean.sh from examples/imagenet to examples/myself. It computes the image mean; the underlying source is /tools/compute_image_mean.cpp.
④ Copy train_caffenet.sh from examples/imagenet to examples/myself.
This is a script whose content is the command that trains the network.
From the caffe root directory, run ./examples/myself/train_caffenet.sh to start training the network.
4. Testing the network with the test data
Test the network with: ./build/tools/caffe.bin test --model=examples/myself/train_val.prototxt --weights=examples/myself/caffenet_model/caffenet_train_iter_16000.caffemodel. Here train_val.prototxt is the network definition, and caffenet_train_iter_16000.caffemodel is a model snapshot produced during training.
[Problem encountered]
[Solution]
Copy the three files shown above to /caffe/examples/myself, so that they sit in the same directory as the CaffeNet network-definition .prototxt file and the .caffemodel and .solverstate files produced during training.
5. Testing the network on a single image and displaying the extracted features
Write the files related to "Classification: Instant Recognition with Caffe".
Under /caffe/examples/myself, run python ./xxxxx.py to execute the script related to "Classification: Instant Recognition with Caffe".
[Error 1]
[Solution]
Edit the .prototxt file that defines the CaffeNet training network so that the relative paths become absolute paths.
[Error 2]
[Solution 2]
This problem cannot be solved as such, because the network used for testing is the same network used for training. Consider using a different network to test whether the trained model is accurate; modify the xxxxxxx.py file mentioned earlier.