SSD Installation and Training
                                 Installation
*************************************************************************
###0 Install git and download the SSD source
sudo apt-get install git
git clone https://github.com/weiliu89/caffe.git
cd caffe
git checkout ssd
sudo apt-get install python-pip
sudo apt-get install python-numpy
sudo apt-get install python-scipy
pip install cython -i http://pypi.douban.com/simple
pip install easydict
You can also pass git clone a target directory to choose where the source is stored.
###1 Edit Makefile.config
Copy Makefile.config.example in the root directory to Makefile.config,
then adjust the relevant settings for your own system environment, for example:
MATLAB_DIR
PYTHON_INCLUDE
BLAS
###2 Compile
make -j8
make py
make test -j8
make runtest -j8
Error1
With multiple GPUs, make runtest may fail. In that case, try:
export CUDA_VISIBLE_DEVICES=0; make runtest -j8
If you get the error "Check failed: error == cudaSuccess (10 vs. 0) invalid device ordinal",
first make sure that a specific, valid GPU is selected, or try:
unset CUDA_VISIBLE_DEVICES
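The "invalid device ordinal" error is easier to reason about once you know that CUDA renumbers the GPUs listed in CUDA_VISIBLE_DEVICES from 0. The sketch below (Python 3) only models that renumbering; it does not call CUDA, the function names are hypothetical, and `physical_count` is an input you supply rather than something it detects. It also assumes the variable holds a well-formed comma-separated list of integer indices.

```python
def visible_ordinals(cuda_visible_devices, physical_count):
    """Return the physical GPU index behind each ordinal the process sees.

    cuda_visible_devices: value of the environment variable, or None if unset.
    physical_count: number of GPUs installed (supplied by the caller).
    """
    if cuda_visible_devices is None:
        # Variable unset: every physical GPU is visible under its own index.
        return list(range(physical_count))
    ids = [s.strip() for s in cuda_visible_devices.split(",") if s.strip()]
    # Listed GPUs are renumbered from 0 in the order they appear.
    return [int(s) for s in ids if int(s) < physical_count]


def is_valid_ordinal(device_id, cuda_visible_devices, physical_count):
    """True if device_id is an ordinal the process is allowed to use."""
    visible = visible_ordinals(cuda_visible_devices, physical_count)
    return 0 <= device_id < len(visible)


# Two GPUs installed, only GPU 0 exposed: asking for device 1 is invalid.
print(is_valid_ordinal(1, "0", 2))   # False
# With the variable unset, both ordinals are valid again.
print(is_valid_ordinal(1, None, 2))  # True
```

This is why a test that hard-codes device 1 can fail right after `export CUDA_VISIBLE_DEVICES=0`, and why unsetting the variable makes the error go away.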
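Error2
Compile errors when building your own code against caffe
include and lib
Use the include and lib directories built on your own machine (caffe/include, caffe/build/lib).
Missing caffe.pb.h:
/home/xxx/caffe/include/caffe/blob.hpp:9:34: fatal error: caffe/proto/caffe.pb.h: No such file or directory
 #include "caffe/proto/caffe.pb.h"
Solution: use protoc to generate caffe.pb.h and caffe.pb.cc from caffe/src/caffe/proto/caffe.proto. The header has to end up in include/caffe/proto/ to match the #include above, and that directory must exist before protoc runs:
li@li:~/caffe/src/caffe/proto$ protoc --cpp_out=/home/xxx/caffe/include/caffe/proto/ caffe.proto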
Error3
stdc++
Linker error:
/usr/bin/ld: caffe_cnn_handler.o: undefined reference to symbol '_ZNSs4_Rep10_M_destroyERKSaIcE@@GLIBCXX_3.4'
//usr/lib/x86_64-linux-gnu/libstdc++.so.6: error adding symbols: DSO missing from command line
The linker is not being given libstdc++.so.6. The fix is to add the following to the Makefile:
LIBS += -L/usr/lib/x86_64-linux-gnu -lstdc++
*************************************************************************
                                   Testing
*************************************************************************
###1 Download a trained model and put it under models/VGGNet/
For example, extract models_VGGNet_VOC0712_SSD_300x300.tar.gz into ./models/VGGNet, which gives:
models/VGGNet/VOC0712/SSD_300x300
models/VGGNet/VOC0712/SSD_300x300_webcam
###2 Test
From the root directory, run:
python examples/ssd/score_ssd_pascal.py
(the reported score should be around 0.718)
###3 Demo
From the root directory, run:
python examples/ssd/ssd_pascal_webcam.py
*************************************************************************
                                   Training
*************************************************************************
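###1 Build your own dataset (similar to Faster R-CNN). See my other post: Faster R-CNN installation, training, and debugging
① Create directories
(1) Under data/VOCdevkit/VOC2007, create Annotations, ImageSets/Main, and JPEGImages
Notes:
Annotations: XML files converted from the txt labels
JPEGImages: image files
ImageSets/Main: lists of file names (without extension)
Training set:        train.txt
Train+val set:       trainval.txt
Test set:            test.txt
Validation set:      val.txt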
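The four list files can be produced from the image base names with a few lines of Python. This is a minimal sketch (Python 3): the function names, the split fractions, and the fixed seed are illustrative assumptions, not values from this post, so adjust them to your own data.

```python
import os
import random


def split_names(names, trainval_fraction=0.8, train_fraction=0.75, seed=0):
    """Split image base names into the four VOC list files.

    trainval_fraction: share of all images that go to trainval (assumed).
    train_fraction: share of trainval that goes to train (assumed).
    Returns a dict mapping list-file name to a sorted list of base names.
    """
    names = sorted(names)
    random.Random(seed).shuffle(names)  # deterministic shuffle
    n_trainval = int(len(names) * trainval_fraction)
    n_train = int(n_trainval * train_fraction)
    trainval, test = names[:n_trainval], names[n_trainval:]
    train, val = trainval[:n_train], trainval[n_train:]
    return {
        "trainval.txt": sorted(trainval),
        "train.txt": sorted(train),
        "val.txt": sorted(val),
        "test.txt": sorted(test),
    }


def write_lists(splits, main_dir):
    """Write one base name per line (no extension) into ImageSets/Main."""
    if not os.path.isdir(main_dir):
        os.makedirs(main_dir)
    for filename, names in splits.items():
        with open(os.path.join(main_dir, filename), "w") as f:
            f.write("\n".join(names) + "\n")


# Ten hypothetical images named 000001 ... 000010.
splits = split_names(["%06d" % i for i in range(1, 11)])
for filename in sorted(splits):
    print(filename, len(splits[filename]))
```

To write the files, call `write_lists(splits, "data/VOCdevkit/VOC2007/ImageSets/Main")` from the caffe root. train and val together make up trainval, and test never overlaps with it.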
② Copy
Copy create_list.sh, create_data.sh, and labelmap_voc.prototxt from data/VOC0712 to data/VOCdevkit/VOC2007/
③ Adjust the paths
**create_list.sh**: 3 changes
1. root_dir=$HOME/data/VOCdevkit/
   change to: root_dir=$HOME/caffe/data/VOCdevkit/
2. for name in VOC2007 VOC2012
   change to: for name in VOC2007
3. $bash_dir/../../build/tools/get_image_size
   change to: $HOME/caffe/build/tools/get_image_size
**create_data.sh**: 5 changes
1. root_dir=$cur_dir/../..
   change to: root_dir=$HOME/caffe
2. data_root_dir="$HOME/data/VOCdevkit"
   change to: data_root_dir="$HOME/caffe/data/VOCdevkit"
3. dataset_name="VOC0712"
   change to: dataset_name="VOC2007"
4. mapfile="$root_dir/data/$dataset_name/labelmap_voc.prototxt"
   change to: mapfile="$root_dir/data/VOCdevkit/$dataset_name/labelmap_voc.prototxt"
5. python $root_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir $root_dir/data/$dataset_name/$subset.txt $data_root_dir/$dataset_name/$db/$dataset_name"_"$subset"_"$db examples/$dataset_name
   change to (only the path of the $subset.txt list file differs):
python $root_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir $root_dir/data/VOCdevkit/$dataset_name/$subset.txt $data_root_dir/$dataset_name/$db/$dataset_name"_"$subset"_"$db examples/$dataset_name
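To check the edits to create_data.sh, it helps to expand the shell variables by hand and see which paths the script ends up using. The sketch below (Python 3) does that expansion under two assumptions that are inferred from the LMDB names used later in this post rather than stated in the changes above: `db` is `lmdb`, and the subsets are `test` and `trainval`. The function name and the example root `/home/user/caffe` are hypothetical.

```python
import os


def lmdb_paths(caffe_root, dataset_name="VOC2007", db="lmdb",
               subsets=("test", "trainval")):
    """Expand the paths create_data.sh passes to create_annoset.py."""
    data_root_dir = os.path.join(caffe_root, "data", "VOCdevkit")
    result = {}
    for subset in subsets:
        db_name = "{}_{}_{}".format(dataset_name, subset, db)
        result[subset] = {
            # $root_dir/data/VOCdevkit/$dataset_name/$subset.txt
            "list_file": os.path.join(data_root_dir, dataset_name,
                                      subset + ".txt"),
            # $data_root_dir/$dataset_name/$db/$dataset_name_$subset_$db
            "lmdb": os.path.join(data_root_dir, dataset_name, db, db_name),
            # symlink created under examples/$dataset_name
            "link": os.path.join(caffe_root, "examples", dataset_name,
                                 db_name),
        }
    return result


for subset, paths in sorted(lmdb_paths("/home/user/caffe").items()):
    print(subset)
    for key in ("list_file", "lmdb", "link"):
        print("  {:9s} {}".format(key, paths[key]))
```

If the list_file paths printed here do not match where create_list.sh wrote trainval.txt and test.txt, create_data.sh will fail before any LMDB is written.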
**labelmap_voc.prototxt**
Note that the names must be lowercase. Delete the labels you do not need; keep the background entry with label 0, and add the name and label of each of your own classes.
For example:
item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "face"
  label: 1
  display_name: "face"
}
item {
  name: "pedestrian"
  label: 2
  display_name: "pedestrian"
}
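For more than a couple of classes, the label map is easier to generate than to edit by hand. The following is a small sketch (Python 3) with a hypothetical helper name; it emits the same item layout as the example above, always with the background entry at label 0, and returns the class count including background.

```python
def make_labelmap(class_names):
    """Build labelmap text for the given classes plus the background entry.

    Returns (prototxt_text, num_classes), where num_classes counts the
    background entry as well.
    """
    items = [("none_of_the_above", 0, "background")]
    for label, name in enumerate(class_names, start=1):
        name = name.lower()  # names must be lowercase
        items.append((name, label, name))
    blocks = []
    for name, label, display_name in items:
        blocks.append(
            'item {\n'
            '  name: "%s"\n'
            '  label: %d\n'
            '  display_name: "%s"\n'
            '}\n' % (name, label, display_name)
        )
    return "".join(blocks), len(items)


text, num_classes = make_labelmap(["face", "pedestrian"])
print(text)
print("num_classes =", num_classes)  # num_classes = 3
```

The count matters later: `num_classes` in the training script includes the background class, so it should equal the number of items in this file.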
###2 Convert to LMDB
Create a VOC2007 folder under ./examples; it holds the symlinks to the LMDB files.
Then, from the root directory, run the modified sh files:
./data/VOCdevkit/VOC2007/create_list.sh
./data/VOCdevkit/VOC2007/create_data.sh
If you get "No module named caffe" or "No module named caffe.proto",
run this in the terminal: export PYTHONPATH=$PYTHONPATH:/home/**(your user name on the server)/caffe/python
If that still does not work, open ./scripts/create_annoset.py
and add the following code after import sys:
import os.path as osp

def add_path(path):
    if path not in sys.path:
        sys.path.insert(0, path)

caffe_path = osp.join('/home/****/caffe/python')
add_path(caffe_path)
###3 If you are using LMDB files that someone else has already built, you only need to create the symlinks
In ./scripts, create a file named create_link.py and paste in the following code:
import argparse
import os
import shutil
import subprocess
import sys

from caffe.proto import caffe_pb2
from google.protobuf import text_format

example_dir = '/home/li/caffe/examples/VOC2007'
out_dir = '/home/***/caffe/data/VOCdevkit/VOC2007/lmdb'
lmdb_name = ['VOC2007_test_lmdb', 'VOC2007_trainval_lmdb']

# Create example_dir if it does not exist yet.
if not os.path.exists(example_dir):
    os.makedirs(example_dir)
for lmdb_sub in lmdb_name:
    link_dir = os.path.join(example_dir, lmdb_sub)
    # Remove an existing link first (lexists also catches a broken symlink).
    if os.path.lexists(link_dir):
        os.unlink(link_dir)
    os.symlink(os.path.join(out_dir, lmdb_sub), link_dir)
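###4 Download the pretrained model
Download the pretrained model VGG_ILSVRC_16_layers_fc_reduced.caffemodel and put it under ./models/VGGNet/
In the training script below, each line that needs to be changed is marked with a trailing ######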
from __future__ import print_function
import caffe
from caffe.model_libs import *
from google.protobuf import text_format

import math
import os
import shutil
import stat
import subprocess
import sys

# Add extra layers on top of a "base" network (e.g. VGGNet or Inception).
def AddExtraLayers(net, use_batchnorm=True, lr_mult=1):
    use_relu = True

    # Add additional convolutional layers.
    # 19 x 19
    from_layer = net.keys()[-1]

    # TODO(weiliu89): Construct the name using the last layer to avoid duplication.
    # 10 x 10
    out_layer = "conv6_1"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 1, 0, 1,
        lr_mult=lr_mult)

    from_layer = out_layer
    out_layer = "conv6_2"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 512, 3, 1, 2,
        lr_mult=lr_mult)

    # 5 x 5
    from_layer = out_layer
    out_layer = "conv7_1"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 128, 1, 0, 1,
        lr_mult=lr_mult)

    from_layer = out_layer
    out_layer = "conv7_2"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 1, 2,
        lr_mult=lr_mult)

    # 3 x 3
    from_layer = out_layer
    out_layer = "conv8_1"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 128, 1, 0, 1,
        lr_mult=lr_mult)

    from_layer = out_layer
    out_layer = "conv8_2"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 0, 1,
        lr_mult=lr_mult)

    # 1 x 1
    from_layer = out_layer
    out_layer = "conv9_1"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 128, 1, 0, 1,
        lr_mult=lr_mult)

    from_layer = out_layer
    out_layer = "conv9_2"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 0, 1,
        lr_mult=lr_mult)

    return net


### Modify the following parameters accordingly ###
# The directory which contains the caffe code.
# We assume you are running the script at the CAFFE_ROOT.
caffe_root = os.getcwd()

# Set true if you want to start training right after generating all files.
run_soon = True
# Set true if you want to load from most recently saved snapshot.
# Otherwise, we will load from the pretrain_model defined below.
resume_training = True
# If true, Remove old model files.
remove_old_models = False

# The database file for training data. Created by data/VOC0712/create_data.sh
train_data = "examples/VOC2007/VOC2007_trainval_lmdb"######
# The database file for testing data. Created by data/VOC0712/create_data.sh
test_data = "examples/VOC2007/VOC2007_test_lmdb"######
# Specify the batch sampler.
resize_width = 300
resize_height = 300
resize = "{}x{}".format(resize_width, resize_height)
batch_sampler = [
        {
                'sampler': {
                        },
                'max_trials': 1,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.1,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.3,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.5,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.7,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.9,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'max_jaccard_overlap': 1.0,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        ]
train_transform_param = {
        'mirror': True,
        'mean_value': [104, 117, 123],
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [
                        P.Resize.LINEAR,
                        P.Resize.AREA,
                        P.Resize.NEAREST,
                        P.Resize.CUBIC,
                        P.Resize.LANCZOS4,
                        ],
                },
        'distort_param': {
                'brightness_prob': 0.5,
                'brightness_delta': 32,
                'contrast_prob': 0.5,
                'contrast_lower': 0.5,
                'contrast_upper': 1.5,
                'hue_prob': 0.5,
                'hue_delta': 18,
                'saturation_prob': 0.5,
                'saturation_lower': 0.5,
                'saturation_upper': 1.5,
                'random_order_prob': 0.0,
                },
        'expand_param': {
                'prob': 0.5,
                'max_expand_ratio': 4.0,
                },
        'emit_constraint': {
            'emit_type': caffe_pb2.EmitConstraint.CENTER,
            }
        }
test_transform_param = {
        'mean_value': [104, 117, 123],
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [P.Resize.LINEAR],
                },
        }

# If true, use batch norm for all newly added layers.
# Currently only the non batch norm version has been tested.
use_batchnorm = False
lr_mult = 1
# Use different initial learning rate.
if use_batchnorm:
    base_lr = 0.0004
else:
    # A learning rate for batch_size = 1, num_gpus = 1.
    base_lr = 0.000004######

# Modify the job name if you want.
job_name = "SSD_{}".format(resize)
# The name of the model. Modify it if you want.
model_name = "VGG_VOC2007_{}".format(job_name)######

# Directory which stores the model .prototxt file.
save_dir = "models/VGGNet/VOC2007/{}".format(job_name)######
# Directory which stores the snapshot of models.
snapshot_dir = "models/VGGNet/VOC2007/{}".format(job_name)######
# Directory which stores the job script and log file.
job_dir = "jobs/VGGNet/VOC2007/{}".format(job_name)######
# Directory which stores the detection results.
output_result_dir = "{}/data/VOCdevkit/results/VOC2007/{}/Main".format(os.environ['HOME'], job_name)######

# model definition files.
train_net_file = "{}/train.prototxt".format(save_dir)
test_net_file = "{}/test.prototxt".format(save_dir)
deploy_net_file = "{}/deploy.prototxt".format(save_dir)
solver_file = "{}/solver.prototxt".format(save_dir)
# snapshot prefix.
snapshot_prefix = "{}/{}".format(snapshot_dir, model_name)
# job script path.
job_file = "{}/{}.sh".format(job_dir, model_name)

# Stores the test image names and sizes. Created by data/VOC0712/create_list.sh
name_size_file = "data/VOCdevkit/VOC2007/test_name_size.txt"######
# The pretrained model. We use the Fully convolutional reduced (atrous) VGGNet.
pretrain_model = "models/VGGNet/VGG_ILSVRC_16_layers_fc_reduced.caffemodel"######
# Stores LabelMapItem.
label_map_file = "data/VOCdevkit/VOC2007/labelmap_voc.prototxt"######

# MultiBoxLoss parameters.
num_classes = 2######
share_location = True
background_label_id=0
train_on_diff_gt = True
normalization_mode = P.Loss.VALID
code_type = P.PriorBox.CENTER_SIZE
ignore_cross_boundary_bbox = False
mining_type = P.MultiBoxLoss.MAX_NEGATIVE
neg_pos_ratio = 3.
loc_weight = (neg_pos_ratio + 1.) / 4.
multibox_loss_param = {
    'loc_loss_type': P.MultiBoxLoss.SMOOTH_L1,
    'conf_loss_type': P.MultiBoxLoss.SOFTMAX,
    'loc_weight': loc_weight,
    'num_classes': num_classes,
    'share_location': share_location,
    'match_type': P.MultiBoxLoss.PER_PREDICTION,
    'overlap_threshold': 0.5,
    'use_prior_for_matching': True,
    'background_label_id': background_label_id,
    'use_difficult_gt': train_on_diff_gt,
    'mining_type': mining_type,
    'neg_pos_ratio': neg_pos_ratio,
    'neg_overlap': 0.5,
    'code_type': code_type,
    'ignore_cross_boundary_bbox': ignore_cross_boundary_bbox,
    }
loss_param = {
    'normalization': normalization_mode,
    }

# parameters for generating priors.
# minimum dimension of input image
min_dim = 300
# conv4_3 ==> 38 x 38
# fc7 ==> 19 x 19
# conv6_2 ==> 10 x 10
# conv7_2 ==> 5 x 5
# conv8_2 ==> 3 x 3
# conv9_2 ==> 1 x 1
mbox_source_layers = ['conv4_3', 'fc7', 'conv6_2', 'conv7_2', 'conv8_2', 'conv9_2']
# in percent %
min_ratio = 20
max_ratio = 90
step = int(math.floor((max_ratio - min_ratio) / (len(mbox_source_layers) - 2)))
min_sizes = []
max_sizes = []
for ratio in xrange(min_ratio, max_ratio + 1, step):
    min_sizes.append(min_dim * ratio / 100.)
    max_sizes.append(min_dim * (ratio + step) / 100.)
min_sizes = [min_dim * 10 / 100.] + min_sizes
max_sizes = [min_dim * 20 / 100.] + max_sizes
steps = [8, 16, 32, 64, 100, 300]
aspect_ratios = [[2], [2, 3], [2, 3], [2, 3], [2], [2]]
# L2 normalize conv4_3.
normalizations = [20, -1, -1, -1, -1, -1]
# variance used to encode/decode prior bboxes.
if code_type == P.PriorBox.CENTER_SIZE:
    prior_variance = [0.1, 0.1, 0.2, 0.2]
else:
    prior_variance = [0.1]
flip = True
clip = False

# Solver parameters.
# Defining which GPUs to use.
gpus = "0"######
gpulist = gpus.split(",")
num_gpus = len(gpulist)

# Divide the mini-batch to different GPUs.
batch_size = 32######
accum_batch_size = 32######
iter_size = accum_batch_size / batch_size
solver_mode = P.Solver.CPU
device_id = 0
batch_size_per_device = batch_size
if num_gpus > 0:
    batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
    iter_size = int(math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus)))
    solver_mode = P.Solver.GPU
    device_id = int(gpulist[0])

if normalization_mode == P.Loss.NONE:
    base_lr /= batch_size_per_device
elif normalization_mode == P.Loss.VALID:
    base_lr *= 25. / loc_weight
elif normalization_mode == P.Loss.FULL:
    # Roughly there are 2000 prior bboxes per image.
    # TODO(weiliu89): Estimate the exact # of priors.
    base_lr *= 2000.

# Evaluate on whole test set.
num_test_image = 15439######
test_batch_size = 8######
test_iter = num_test_image / test_batch_size

solver_param = {
    # Train parameters
    'base_lr': base_lr,
    'weight_decay': 0.0005,
    'lr_policy': "multistep",
    'stepvalue': [80000, 100000, 120000],
    'gamma': 0.1,
    'momentum': 0.9,
    'iter_size': iter_size,
    'max_iter': 120000,
    'snapshot': 80000,
    'display': 10,
    'average_loss': 10,
    'type': "SGD",
    'solver_mode': solver_mode,
    'device_id': device_id,
    'debug_info': False,
    'snapshot_after_train': True,
    # Test parameters
    'test_iter': [test_iter],
    'test_interval': 10000,
    'eval_type': "detection",
    'ap_version': "11point",
    'test_initialization': False,
    }

# parameters for generating detection output.
det_out_param = {
    'num_classes': num_classes,
    'share_location': share_location,
    'background_label_id': background_label_id,
    'nms_param': {'nms_threshold': 0.45, 'top_k': 400},
    'save_output_param': {
        'output_directory': output_result_dir,
        'output_name_prefix': "comp4_det_test_",
        'output_format': "VOC",
        'label_map_file': label_map_file,
        'name_size_file': name_size_file,
        'num_test_image': num_test_image,
        },
    'keep_top_k': 200,
    'confidence_threshold': 0.01,
    'code_type': code_type,
    }

# parameters for evaluating detection results.
det_eval_param = {
    'num_classes': num_classes,
    'background_label_id': background_label_id,
    'overlap_threshold': 0.5,
    'evaluate_difficult_gt': False,
    'name_size_file': name_size_file,
    }

### Hopefully you don't need to change the following ###
# Check file.
check_if_exist(train_data)
check_if_exist(test_data)
check_if_exist(label_map_file)
check_if_exist(pretrain_model)
make_if_not_exist(save_dir)
make_if_not_exist(job_dir)
make_if_not_exist(snapshot_dir)

# Create train net.
net = caffe.NetSpec()
net.data, net.label = CreateAnnotatedDataLayer(train_data, batch_size=batch_size_per_device,
        train=True, output_label=True, label_map_file=label_map_file,
        transform_param=train_transform_param, batch_sampler=batch_sampler)

VGGNetBody(net, from_layer='data', fully_conv=True, reduced=True, dilated=True,
    dropout=False)

AddExtraLayers(net, use_batchnorm, lr_mult=lr_mult)

mbox_layers = CreateMultiBoxHead(net, data_layer='data', from_layers=mbox_source_layers,
        use_batchnorm=use_batchnorm, min_sizes=min_sizes, max_sizes=max_sizes,
        aspect_ratios=aspect_ratios, steps=steps, normalizations=normalizations,
        num_classes=num_classes, share_location=share_location, flip=flip, clip=clip,
        prior_variance=prior_variance, kernel_size=3, pad=1, lr_mult=lr_mult)

# Create the MultiBoxLossLayer.
name = "mbox_loss"
mbox_layers.append(net.label)
net[name] = L.MultiBoxLoss(*mbox_layers, multibox_loss_param=multibox_loss_param,
        loss_param=loss_param, include=dict(phase=caffe_pb2.Phase.Value('TRAIN')),
        propagate_down=[True, True, False, False])

with open(train_net_file, 'w') as f:
    print('name: "{}_train"'.format(model_name), file=f)
    print(net.to_proto(), file=f)
shutil.copy(train_net_file, job_dir)

# Create test net.
net = caffe.NetSpec()
net.data, net.label = CreateAnnotatedDataLayer(test_data, batch_size=test_batch_size,
        train=False, output_label=True, label_map_file=label_map_file,
        transform_param=test_transform_param)

VGGNetBody(net, from_layer='data', fully_conv=True, reduced=True, dilated=True,
    dropout=False)

AddExtraLayers(net, use_batchnorm, lr_mult=lr_mult)

mbox_layers = CreateMultiBoxHead(net, data_layer='data', from_layers=mbox_source_layers,
        use_batchnorm=use_batchnorm, min_sizes=min_sizes, max_sizes=max_sizes,
        aspect_ratios=aspect_ratios, steps=steps, normalizations=normalizations,
        num_classes=num_classes, share_location=share_location, flip=flip, clip=clip,
        prior_variance=prior_variance, kernel_size=3, pad=1, lr_mult=lr_mult)

conf_name = "mbox_conf"
if multibox_loss_param["conf_loss_type"] == P.MultiBoxLoss.SOFTMAX:
    reshape_name = "{}_reshape".format(conf_name)
    net[reshape_name] = L.Reshape(net[conf_name], shape=dict(dim=[0, -1, num_classes]))
    softmax_name = "{}_softmax".format(conf_name)
    net[softmax_name] = L.Softmax(net[reshape_name], axis=2)
    flatten_name = "{}_flatten".format(conf_name)
    net[flatten_name] = L.Flatten(net[softmax_name], axis=1)
    mbox_layers[1] = net[flatten_name]
elif multibox_loss_param["conf_loss_type"] == P.MultiBoxLoss.LOGISTIC:
    sigmoid_name = "{}_sigmoid".format(conf_name)
    net[sigmoid_name] = L.Sigmoid(net[conf_name])
    mbox_layers[1] = net[sigmoid_name]

net.detection_out = L.DetectionOutput(*mbox_layers,
    detection_output_param=det_out_param,
    include=dict(phase=caffe_pb2.Phase.Value('TEST')))
net.detection_eval = L.DetectionEvaluate(net.detection_out, net.label,
    detection_evaluate_param=det_eval_param,
    include=dict(phase=caffe_pb2.Phase.Value('TEST')))

with open(test_net_file, 'w') as f:
    print('name: "{}_test"'.format(model_name), file=f)
    print(net.to_proto(), file=f)
shutil.copy(test_net_file, job_dir)

# Create deploy net.
# Remove the first and last layer from test net.
deploy_net = net
with open(deploy_net_file, 'w') as f:
    net_param = deploy_net.to_proto()
    # Remove the first (AnnotatedData) and last (DetectionEvaluate) layer from test net.
    del net_param.layer[0]
    del net_param.layer[-1]
    net_param.name = '{}_deploy'.format(model_name)
    net_param.input.extend(['data'])
    net_param.input_shape.extend([
        caffe_pb2.BlobShape(dim=[1, 3, resize_height, resize_width])])
    print(net_param, file=f)
shutil.copy(deploy_net_file, job_dir)

# Create solver.
solver = caffe_pb2.SolverParameter(
        train_net=train_net_file,
        test_net=[test_net_file],
        snapshot_prefix=snapshot_prefix,
        **solver_param)

with open(solver_file, 'w') as f:
    print(solver, file=f)
shutil.copy(solver_file, job_dir)

max_iter = 0
# Find most recent snapshot.
for file in os.listdir(snapshot_dir):
    if file.endswith(".solverstate"):
        basename = os.path.splitext(file)[0]
        iter = int(basename.split("{}_iter_".format(model_name))[1])
        if iter > max_iter:
            max_iter = iter

train_src_param = '--weights="{}" \\\n'.format(pretrain_model)
if resume_training:
    if max_iter > 0:
        train_src_param = '--snapshot="{}_iter_{}.solverstate" \\\n'.format(snapshot_prefix, max_iter)

if remove_old_models:
    # Remove any snapshots smaller than max_iter.
    for file in os.listdir(snapshot_dir):
        if file.endswith(".solverstate"):
            basename = os.path.splitext(file)[0]
            iter = int(basename.split("{}_iter_".format(model_name))[1])
            if max_iter > iter:
                os.remove("{}/{}".format(snapshot_dir, file))
        if file.endswith(".caffemodel"):
            basename = os.path.splitext(file)[0]
            iter = int(basename.split("{}_iter_".format(model_name))[1])
            if max_iter > iter:
                os.remove("{}/{}".format(snapshot_dir, file))

# Create job file.
with open(job_file, 'w') as f:
    f.write('cd {}\n'.format(caffe_root))
    f.write('./build/tools/caffe train \\\n')
    f.write('--solver="{}" \\\n'.format(solver_file))
    f.write(train_src_param)
    if solver_param['solver_mode'] == P.Solver.GPU:
        f.write('--gpu {} 2>&1 | tee {}/{}.log\n'.format(gpus, job_dir, model_name))
    else:
        f.write('2>&1 | tee {}/{}.log\n'.format(job_dir, model_name))

# Copy the python script to job_dir.
py_file = os.path.abspath(__file__)
shutil.copy(py_file, job_dir)

# Run the job.
os.chmod(job_file, stat.S_IRWXU)
if run_soon:
    subprocess.call(job_file, shell=True)
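Two values in the training script are derived rather than set directly: the learning rate that ends up in solver.prototxt, and test_iter. The sketch below (Python 3) redoes that arithmetic with the numbers used in the script, so you can see what a change to base_lr, num_test_image, or test_batch_size does. The function names are hypothetical, and the normalization mode is passed as a plain string instead of the `P.Loss` enum.

```python
def effective_base_lr(base_lr, neg_pos_ratio=3.0, normalization="VALID",
                      batch_size_per_device=32):
    """Apply the same base_lr scaling the training script applies."""
    loc_weight = (neg_pos_ratio + 1.0) / 4.0
    if normalization == "NONE":
        return base_lr / batch_size_per_device
    if normalization == "VALID":
        return base_lr * 25.0 / loc_weight
    if normalization == "FULL":
        # The script assumes roughly 2000 prior boxes per image.
        return base_lr * 2000.0
    raise ValueError("unknown normalization: %s" % normalization)


def num_test_iterations(num_test_image, test_batch_size):
    """Whole test batches only, like Python 2 integer division."""
    return num_test_image // test_batch_size


lr = effective_base_lr(0.000004)
print("base_lr written to the solver: %.6f" % lr)  # 0.000100

iters = num_test_iterations(15439, 8)
leftover = 15439 - iters * 8
print("test_iter =", iters, "images not evaluated:", leftover)  # 1929, 7
```

The script itself writes `test_iter = num_test_image / test_batch_size`, which only yields an integer under Python 2. If you run it under Python 3, `//` is the equivalent operator.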
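train.prototxt, test.prototxt, deploy.prototxt, solver.prototxt, and the other files are all generated automatically when this training script runs.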
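The prior box sizes that end up in the generated prototxt files come from a short loop in the training script. As a worked example, here is the same computation as a standalone function (Python 3, so `range` replaces `xrange`; the function name is hypothetical), evaluated for the script's values min_dim = 300, min_ratio = 20, max_ratio = 90, and six source layers.

```python
import math


def prior_box_sizes(min_dim=300, min_ratio=20, max_ratio=90, num_layers=6):
    """Reproduce the min_sizes / max_sizes lists from the training script."""
    # One layer (conv4_3) is handled separately, hence num_layers - 2 steps.
    step = int(math.floor((max_ratio - min_ratio) / (num_layers - 2)))
    min_sizes, max_sizes = [], []
    for ratio in range(min_ratio, max_ratio + 1, step):
        min_sizes.append(min_dim * ratio / 100.0)
        max_sizes.append(min_dim * (ratio + step) / 100.0)
    # conv4_3 gets the smallest boxes: 10% to 20% of the input size.
    min_sizes = [min_dim * 10 / 100.0] + min_sizes
    max_sizes = [min_dim * 20 / 100.0] + max_sizes
    return min_sizes, max_sizes


layers = ['conv4_3', 'fc7', 'conv6_2', 'conv7_2', 'conv8_2', 'conv9_2']
min_sizes, max_sizes = prior_box_sizes()
for layer, lo, hi in zip(layers, min_sizes, max_sizes):
    print("%-8s min_size=%5.1f max_size=%5.1f" % (layer, lo, hi))
```

For the 300x300 input this gives min sizes 30, 60, 111, 162, 213, 264 and max sizes 60, 111, 162, 213, 264, 315 pixels. If your objects are much smaller or larger than that range, min_ratio and max_ratio are the values to revisit.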
gpus = '0,1,2,3': if you have one GPU, remove 1,2,3; if you have two, remove 2,3.
If you have no GPU, comment out the following lines and the program will train on the CPU (this also resolves the cudaSuccess (10 vs. 0) error):
#if num_gpus > 0:
#    batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
#    iter_size = int(math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus)))
#    solver_mode = P.Solver.GPU
#    device_id = int(gpulist[0])
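The effect of the gpus string on the per-device batch size and on iter_size can be checked without running caffe. The sketch below (Python 3) mirrors the script's arithmetic in a hypothetical helper; solver modes are returned as plain strings instead of `P.Solver` values. Unlike the script, it treats an empty gpus string as "no GPU": in the script, `"".split(",")` returns `['']`, so num_gpus is still 1, which is why the lines above have to be commented out for CPU training.

```python
import math


def plan_batches(gpus, batch_size, accum_batch_size):
    """Return solver mode, device id, per-device batch size and iter_size."""
    gpulist = [g.strip() for g in gpus.split(",") if g.strip()]
    num_gpus = len(gpulist)
    if num_gpus == 0:
        # CPU fallback: the defaults the script sets before the GPU branch.
        return {
            "solver_mode": "CPU",
            "device_id": 0,
            "batch_size_per_device": batch_size,
            "iter_size": accum_batch_size // batch_size,
        }
    per_device = int(math.ceil(float(batch_size) / num_gpus))
    iter_size = int(math.ceil(float(accum_batch_size) / (per_device * num_gpus)))
    return {
        "solver_mode": "GPU",
        "device_id": int(gpulist[0]),
        "batch_size_per_device": per_device,
        "iter_size": iter_size,
    }


print(plan_batches("0,1,2,3", 32, 32))  # 8 images per GPU, iter_size 1
print(plan_batches("0", 8, 32))         # 8 images per step, iter_size 4
print(plan_batches("", 32, 32))         # CPU mode
```

The second call shows the usual way to cope with limited GPU memory: lower batch_size and leave accum_batch_size at 32, so iter_size rises and gradients are still accumulated over 32 images per update.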
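###5 Edit ./examples/ssd/ssd_pascal_webcam.py
Make the corresponding changes there.
###6 Train
From the root directory, run:
python ./examples/ssd/ssd_pascal.py
If you get cudaSuccess (2 vs. 0), the GPU does not have enough memory for this batch size. Reduce batch_size in caffe/examples/ssd/ssd_pascal.py from the default 32 to 16, 8, or 4.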