
【Python3】Tensorflow_Fasterrcnn: training on your own dataset; Keras_Yolov3_GPU: training on your own dataset

Published: 2024/4/24


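Contents

1. Tensorflow_Fasterrcnn: training on your own dataset
- 1.1 Environment setup
- 1.2 Running the demo images with the pretrained Resnet101 model
- 1.3 Testing the pretrained Resnet101 model on the dataset (evaluating mAP)
- 1.4 Training and testing on the VOC dataset with the pretrained VGG16 model
- 1.5 Training on your own labelled dataset
2. Keras_Yolov3_GPU: training on your own dataset
- 2.1 Data preparation
  - 2.1.1 Selecting images with enough pixels
  - 2.1.2 Labelling and checking the data
  - 2.1.3 Image compression
  - 2.1.4 Splitting into training and test sets
- 2.2 Model training
- 2.3 Model testing
  - 2.3.1 Single image
  - 2.3.2 Video
  - 2.3.3 Multiple images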

1. Tensorflow_Fasterrcnn: training on your own dataset

1.1 Environment setup

Repository being reproduced: https://github.com/endernewton/tf-faster-rcnn
Deep learning environment setup: https://blog.csdn.net/weixin_43435675/article/details/88359636
Walkthrough of the Faster R-CNN code: https://www.cnblogs.com/darkknightzh/p/10043864.html
Results on the COCO dataset: (figure omitted)

$ conda create -n tf-faster-rcnn python=3.6
$ conda remove -n tf-faster-rcnn --all   # only if you want to delete this conda environment
$ source activate tf-faster-rcnn
$ conda env list
$ source deactivate

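After downloading, the repository contains these folders:

data: datasets and weights
lib: network definitions and data loading
tools: training and testing scripts
VOC dataset share: https://pan.baidu.com/s/1UPRbgR2rMZUDHHGI6roGuA (extraction code: jlj8)
Pretrained model folders, resnet101 (for VOC, pre-trained on the 07+12 set) and vgg16: https://pan.baidu.com/s/14kIWwxdOqCCioiyBDNtnlg (extraction code: gmta)
To build for GPU: change line 130 of tf-faster-rcnn/lib/setup.py to match the compute capability of your own GPU. Compute capability lookup: http://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/

To build for CPU: (figure omitted)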

1.2 Running the demo images with the pretrained Resnet101 model

1. Build the Cython modules: inside the virtual environment, run make clean and then make in the lib directory (install the dependencies first with pip install easydict and pip install Cython). Note: if running python3 tools/demo.py afterwards fails with ImportError: No module named gpu_nms, change python to python3 in the Makefile and rebuild.
2. Install the COCO API: run make inside the same virtual environment:

cd data
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
make

3. Download the VOC dataset (or use the Baidu Cloud share above):

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar

Extract all of these tars into one directory named VOCdevkit:

tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar

Copy the VOCdevkit folder into tf-faster-rcnn/data and rename it VOCdevkit2007. If you do not need the segmentation data, the folder can be trimmed as follows: (figure omitted)

4. Put the downloaded Resnet101 pretrained model folder voc_2007_trainval+voc_2012_trainval (a folder containing only a few ckpt files, from the Baidu Cloud share above) in the tf-faster-rcnn directory. Create an output folder in tf-faster-rcnn and place a symlink to the pretrained model inside it. The output folder is also where each training run stores its trained model:

NET=res101
TRAIN_IMDB=voc_2007_trainval+voc_2012_trainval
mkdir -p output/${NET}/${TRAIN_IMDB}   # create the empty directory
cd output/${NET}/${TRAIN_IMDB}
ln -s ../../../voc_2007_trainval+voc_2012_trainval ./default   # create the symlink
cd ../../..

5. From the tf-faster-rcnn directory, run:

GPU_ID=0
CUDA_VISIBLE_DEVICES=${GPU_ID} ./tools/demo.py

1.3 Testing the pretrained Resnet101 model on the dataset (evaluating mAP)

1. Use the same Resnet101 model as above (pretrained on ImageNet and VOC0712) and evaluate it on the VOC0712 test set. Reading test_faster_rcnn.sh shows that the set actually used is the VOC07 test set; its test.txt has 4952 lines.

On line 121 of tf-faster-rcnn/lib/datasets/voc_eval.py, change
with open(cachefile,'w') as f:
to:
with open(cachefile,'wb') as f:

Also change line 105 from
cachefile = os.path.join(cachedir, '%s_annots.pkl' % imagesetfile)
to
cachefile = os.path.join(cachedir, '%s_annots.pkl' % imagesetfile.split("/")[-1].split(".")[0])
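The two edits above can be tried in isolation. The following is a minimal sketch, not code from voc_eval.py: the helper names cache_file_name, write_cache and read_cache and the sample paths are assumptions made for illustration. It shows what the split expression extracts from an image-set path and why the cache file needs a binary handle under Python 3.

```python
import os
import pickle
import tempfile


def cache_file_name(cachedir, imagesetfile):
    """Build the cache path from the bare image-set name (e.g. 'test')."""
    # '.../ImageSets/Main/test.txt' -> 'test.txt' -> 'test'
    set_name = imagesetfile.split("/")[-1].split(".")[0]
    return os.path.join(cachedir, '%s_annots.pkl' % set_name)


def write_cache(cachefile, recs):
    # pickle.dump writes bytes, so under Python 3 the file must be opened
    # with 'wb'; a text-mode handle ('w') raises TypeError.
    with open(cachefile, 'wb') as f:
        pickle.dump(recs, f)


def read_cache(cachefile):
    # Read the cache back in binary mode as well.
    with open(cachefile, 'rb') as f:
        return pickle.load(f)
```

Without the split, the whole image-set path (slashes included) would be formatted into the file name, which presumably points at a directory that does not exist under the cache folder.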

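2. From the tf-faster-rcnn root directory, run:

GPU_ID=0
./experiments/scripts/test_faster_rcnn.sh $GPU_ID pascal_voc_0712 res101

When all 4952 images have been tested, the results are printed. (figure omitted)

After testing, /output/res101/voc_2007_test/default/res101_faster_rcnn_iter_110000/…pkl is created automatically.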

1.4 Training and testing on the VOC dataset with the pretrained VGG16 model

1. Download the VGG16 weights with wget -v http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz, or from the official page: https://github.com/tensorflow/models/tree/master/research/slim#pre-trained-models

./experiments/scripts/train_faster_rcnn.sh [GPU_ID] [DATASET] [NET]
# GPU_ID is the index of the GPU to use
# NET is the network type, one of {vgg16, res50, res101, res152}
# DATASET is one of {pascal_voc, pascal_voc_0712, coco}, predefined in train_faster_rcnn.sh; you can add your own as needed
# Examples:
./experiments/scripts/train_faster_rcnn.sh 0 pascal_voc vgg16
./experiments/scripts/train_faster_rcnn.sh 1 coco res101

The demo above created, inside output, a symlink to a Res101 model already trained on VOC0712. If you want to train Res101 on VOC0712 again, remember to delete that symlink first.

2. Train and test VGG16 on VOC07. To save time and only check for errors, I set the number of iterations to 20: on line 22 of ./experiments/scripts/train_faster_rcnn.sh, change ITERS=70000 to ITERS=20. Because train_faster_rcnn.sh calls test_faster_rcnn.sh at the end, also change ITERS to 20 in ./experiments/scripts/test_faster_rcnn.sh.

$ ./experiments/scripts/train_faster_rcnn.sh 0 pascal_voc vgg16

The output lists AP for 20 classes and 21 entries under Results. (figure omitted)

With only 20 iterations the accuracy is very poor. Running train_faster_rcnn.sh automatically runs test_faster_rcnn.sh afterwards, so testing starts as soon as training finishes.
3. Visualising with Tensorboard: once training is complete, the training log can be visualised with Tensorboard.
logdir is the directory containing the Tensorboard event files to display; port is the port on which the Tensorboard page is served.
$ tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=7001 &
This produces the files shown below. (figure omitted)

Copy the http://… address into a browser. After only 20 iterations the results are poor.

4. By default, trained networks are stored in: output/[NET]/[DATASET_trainval]/default/

Test results are stored in: output/[NET]/[DATASET_test]/default/[SNAPSHOT]/

Tensorboard data for training and validation is stored in:

tensorboard/[NET]/[DATASET]/default/
tensorboard/[NET]/[DATASET]/default_val/
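To make the directory templates above concrete, here is a small sketch that fills them in for one run. The function name output_dirs and the snapshot name in the usage note are assumptions for illustration, not names taken from the repository's scripts.

```python
import os


def output_dirs(net, train_imdb, test_imdb, snapshot, tag='default'):
    """Fill in the default directory templates for one training run."""
    return {
        # output/[NET]/[DATASET_trainval]/default/
        'model': os.path.join('output', net, train_imdb, tag),
        # output/[NET]/[DATASET_test]/default/[SNAPSHOT]/
        'test': os.path.join('output', net, test_imdb, tag, snapshot),
        # tensorboard/[NET]/[DATASET]/default/ and .../default_val/
        'tb_train': os.path.join('tensorboard', net, train_imdb, tag),
        'tb_val': os.path.join('tensorboard', net, train_imdb, tag + '_val'),
    }
```

For example, output_dirs('vgg16', 'voc_2007_trainval', 'voc_2007_test', 'vgg16_faster_rcnn_iter_20') would place the trained model under output/vgg16/voc_2007_trainval/default, assuming the snapshot is named after the network and the iteration count.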

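1.5 Training on your own labelled dataset

1. In the same directory as the PASCAL VOC data (VOCdevkit2007), create a folder named "MSDD" containing three subfolders: annotations_cache, results and VOC2007. annotations_cache holds cached annotation data. If you change the dataset, remember to clear this cache, otherwise the old data is still loaded.
Inside results, create the following path: (figure omitted)

2. Inside VOC2007, create three subfolders (Annotations, ImageSets and JPEGImages):

Annotations holds all xml label files, named like 000001.xml.
JPEGImages holds all jpg images, named like 000001.jpg.
3. Inside ImageSets, create 3 subfolders: (figure omitted)

The Main folder holds the txt files that list the sample IDs belonging to each subset.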
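The folders named in steps 1 to 3 can also be created from a script. This is a minimal sketch under stated assumptions: the function name make_msdd_skeleton is made up here, and only the folders explicitly named in the text are created (the path inside results and the other ImageSets subfolders are not preserved in this copy, so they are left out).

```python
from pathlib import Path


def make_msdd_skeleton(data_dir):
    """Create the MSDD folder tree under data_dir and return its root."""
    root = Path(data_dir) / 'MSDD'
    sub_dirs = [
        root / 'annotations_cache',                 # cached annotations
        root / 'results',                           # evaluation results
        root / 'VOC2007' / 'Annotations',           # 000001.xml, ...
        root / 'VOC2007' / 'JPEGImages',            # 000001.jpg, ...
        root / 'VOC2007' / 'ImageSets' / 'Main',    # train.txt, test.txt, ...
    ]
    for sub_dir in sub_dirs:
        # parents=True creates missing parents; exist_ok=True makes reruns safe.
        sub_dir.mkdir(parents=True, exist_ok=True)
    return root
```

Calling make_msdd_skeleton('data') from the repository root would create data/MSDD next to data/VOCdevkit2007.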

4. Either create a symlink in the data folder, or place the dataset directly under data: (figure omitted)

Code changes: 1. Create msdd.py: in lib/datasets/, copy pascal_voc.py and name the copy after your dataset. This file builds the imdb of each subset, such as msdd_train and msdd_val. The original pascal_voc.py code and the corresponding msdd.py code are shown side by side:

# pascal_voc.py
class pascal_voc(imdb):
    def __init__(self, image_set, year, use_diff=False):
        name = 'voc_' + year + '_' + image_set
        if use_diff:
            name += '_diff'
        imdb.__init__(self, name)
        self._year = year
        self._image_set = image_set
        self._devkit_path = self._get_default_path()
        self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
        self._classes = ('__background__',  # always index 0
                         'aeroplane', 'bicycle', 'bird', 'boat',
                         'bottle', 'bus', 'car', 'cat', 'chair',
                         'cow', 'diningtable', 'dog', 'horse',
                         'motorbike', 'person', 'pottedplant',
                         'sheep', 'sofa', 'train', 'tvmonitor')
        self._class_to_ind = dict(list(zip(self.classes, list(range(self.num_classes)))))
        self._image_ext = '.jpg'
        self._image_index = self._load_image_set_index()
        # Default to roidb handler
        self._roidb_handler = self.gt_roidb
        self._salt = str(uuid.uuid4())
        self._comp_id = 'comp4'
        # PASCAL specific config options
        self.config = {'cleanup': True,
                       'use_salt': True,
                       'use_diff': use_diff,
                       'matlab_eval': False,
                       'rpn_file': None}

# msdd.py
class msdd(imdb):
    def __init__(self, image_set):
        name = 'msdd_' + image_set
        imdb.__init__(self, name)
        self._year = '2007'
        self._image_set = image_set
        self._devkit_path = self._get_default_path()
        # data location: data/MSDD/VOC2007
        self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
        self._classes = ('__background__',  # always index 0
                         'cr', 'ln', 'pa', 'ps',  # replace with your own classes
                         'rs', 'sc')
        self._class_to_ind = dict(list(zip(self.classes, list(range(self.num_classes)))))
        self._image_ext = '.jpg'  # jpg format is recommended
        self._image_index = self._load_image_set_index()
        # Default to roidb handler
        self._roidb_handler = self.gt_roidb
        self._salt = str(uuid.uuid4())
        self._comp_id = 'comp4'
        # PASCAL specific config options
        self.config = {'cleanup': True,
                       'use_salt': True,
                       'use_diff': False,  # use_diff is turned off
                       'matlab_eval': False,
                       'rpn_file': None}

# pascal_voc.py
    def _get_default_path(self):
        """Return the default path where PASCAL VOC is expected to be installed."""
        return os.path.join(cfg.DATA_DIR, 'VOCdevkit' + self._year)

# msdd.py
    def _get_default_path(self):
        """Return the default path where the MSDD dataset is expected to be installed."""
        return os.path.join(cfg.DATA_DIR, 'MSDD')  # devkit location: data/MSDD

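2. Edit factory.py, also in lib/datasets/. This file builds a dictionary of imdb constructors for every subset (xxx_train, xxx_test). It calls pascal_voc.py, msdd.py and so on to build each subset's imdb in the format that dataset requires. Following the existing entries, insert the code below at the corresponding place:

# factory.py
from datasets.msdd import msdd  # goes with the existing dataset imports at the top of the file

# Set up msdd_<split>
for split in ['train', 'val', 'trainval', 'test']:
    # only splits that have a txt file in ImageSets/Main; using a subset is fine
    name = 'msdd_{}'.format(split)
    __sets[name] = (lambda split=split: msdd(split))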
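The split=split default argument in the factory.py lambda matters. The sketch below is a standalone illustration (the function names build_late and build_bound are made up, and strings stand in for the imdb objects): it contrasts a lambda that looks up the loop variable when called with one that captures the current value when defined.

```python
def build_late():
    """Lambdas read `split` when called, so all see the last loop value."""
    table = {}
    for split in ['train', 'val']:
        table['msdd_' + split] = (lambda: split)
    return table


def build_bound():
    """The default argument freezes the value of `split` at definition time."""
    table = {}
    for split in ['train', 'val']:
        table['msdd_' + split] = (lambda split=split: split)
    return table
```

With build_late, table['msdd_train']() returns 'val' because the loop has already finished by the time the lambda runs. With build_bound it returns 'train', which is the behaviour a per-split constructor table needs.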

3. Edit train_faster_rcnn.sh: following the other datasets, add an entry for your own. Every IMDB name you can choose is one generated in factory.py.

# train_faster_rcnn.sh
  msdd)
    TRAIN_IMDB="msdd_trainval"   # IMDB used for training
    TEST_IMDB="msdd_test"        # IMDB used for testing
    STEPSIZE="[50000]"
    ITERS=20
    ANCHORS="[8,16,32]"
    RATIOS="[0.5,1,2]"
    ;;

4. Edit test_faster_rcnn.sh in the same way:

# test_faster_rcnn.sh
  msdd)
    TRAIN_IMDB="msdd_trainval"   # IMDB used for training
    TEST_IMDB="msdd_test"        # IMDB used for testing
    ITERS=20
    ANCHORS="[8,16,32]"
    RATIOS="[0.5,1,2]"
    ;;

5. Start training and testing:

$ ./experiments/scripts/train_faster_rcnn.sh 0 msdd vgg16

6. Inference
In tools/demo.py, change the CLASSES, NETS and DATASETS variables to match your own dataset, and change the test image names in the main function. The trained model can then be used to run detection on a few images.
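As a rough illustration of that edit, the three variables might look like the sketch below for the MSDD example. This is an assumption-laden sketch rather than the author's actual demo.py: the checkpoint file name assumes a VGG16 run with ITERS=20, and checkpoint_path is a hypothetical helper mirroring how the output folder is laid out in section 1.4.

```python
import os

# Class names must match the tuple defined in msdd.py, background first.
CLASSES = ('__background__', 'cr', 'ln', 'pa', 'ps', 'rs', 'sc')

# Network name -> checkpoint file (assumed name for a 20-iteration run).
NETS = {'vgg16': ('vgg16_faster_rcnn_iter_20.ckpt',)}

# Dataset key -> name of the training IMDB whose output folder holds the model.
DATASETS = {'msdd': ('msdd_trainval',)}


def checkpoint_path(net, dataset, tag='default'):
    """Hypothetical helper: output/[NET]/[TRAIN_IMDB]/default/[CKPT]."""
    return os.path.join('output', net, DATASETS[dataset][0], tag, NETS[net][0])
```

If demo.py hard-codes the number of classes when it builds the network, that number would presumably need to become len(CLASSES) (7 here) as well.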

2. Keras_Yolov3_GPU: training on your own dataset

2.1 Data preparation

Repository: https://github.com/qqwweee/keras-yolo3. Hardware: a GTX 1070 with 8 GB of video memory. On Windows it runs out of video memory, so Ubuntu is used. Ubuntu deep learning environment setup: https://blog.csdn.net/weixin_43435675/article/details/88359636

Original images: https://pan.baidu.com/s/17iI62gt9HyRbQ-Wr8h28jw (extraction code: 4swb). Run mkdir n01440764 to create the folder n01440764, then run tar -xvf n01440764.tar -C n01440764 to extract the archive. The directory given after -C must already exist, otherwise the command fails.

labelImg download: https://github.com/tzutalin/labelImg. Open a Terminal in the keras_YOLOv3 folder and run the following:
1. To speed up apt-get downloads, switch the Ubuntu package mirror: in Settings, choose Software & Updates and set Download from to http://mirrors.aliyun.com/ubuntu
2. Run sudo apt-get install pyqt5-dev-tools to install pyqt5-dev-tools.
3. Run cd labelImg-master to enter the labelImg-master folder. Run pip install -r requirements/requirements-linux-python3.txt to install the libraries labelImg needs at runtime. If Anaconda is already installed, this step may be unnecessary.
4. Run make qt5py3 to build the components labelImg needs at runtime, then run python labelImg.py to open labelImg.

2.1.1 Selecting images with enough pixels

Some images in n01440764 are smaller than 416 * 416, which is bad for training. Create _01_select_images.py, which:
1. picks 200 images with enough pixels from the folder n01440764;
2. copies the selected images into a new folder, selected_images.

import os
import random
import shutil
from PIL import Image

# Return the paths of the files in dirPath whose names contain partOfFileName
def getFilePathList(dirPath, partOfFileName=''):
    allFileName_list = list(os.walk(dirPath))[0][2]
    fileName_list = [k for k in allFileName_list if partOfFileName in k]
    filePath_list = [os.path.join(dirPath, k) for k in fileName_list]
    return filePath_list

# Copy up to sample_number images whose width and height are both at least 416
def generate_qualified_images(dirPath, sample_number, new_dirPath):
    jpgFilePath_list = getFilePathList(dirPath, '.JPEG')
    random.shuffle(jpgFilePath_list)
    if not os.path.isdir(new_dirPath):
        os.makedirs(new_dirPath)
    i = 0
    for jpgFilePath in jpgFilePath_list:
        image = Image.open(jpgFilePath)
        width, height = image.size
        if width >= 416 and height >= 416:
            i += 1
            new_jpgFilePath = os.path.join(new_dirPath, '%03d.jpg' % i)
            shutil.copy(jpgFilePath, new_jpgFilePath)
        if i == sample_number:
            break

# Put 200 qualifying samples into the selected_images folder
generate_qualified_images('n01440764', 200, 'selected_images')

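2.1.2 Labelling and checking the data

The 200 labelled images: https://pan.baidu.com/s/13-fRksSjUeEii54gClA3Pw (extraction code: 57lz)
Create a code file _02_check_labels.py, copy the code below into it and run it. The script:
1. checks the labelled folder for images that were left unlabelled;
2. checks the xml files for misspelled object class names.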
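The original listing for _02_check_labels.py is not preserved in this copy. The following is a minimal sketch of the two checks, written under assumptions: images and xml files sit in the same folder with matching base names (as labelImg produces by default), and the function name check_labels and its parameters are invented for illustration.

```python
import os
import xml.etree.ElementTree as ET


def check_labels(dir_path, class_names, image_suffix='.jpg'):
    """Return (images without an xml file, (xml file, unknown class) pairs)."""
    file_names = sorted(os.listdir(dir_path))

    # Check 1: every image should have an xml file with the same base name.
    missing_xml = []
    for name in file_names:
        if name.endswith(image_suffix):
            xml_name = name[:-len(image_suffix)] + '.xml'
            if not os.path.isfile(os.path.join(dir_path, xml_name)):
                missing_xml.append(name)

    # Check 2: every <object><name> must be one of the allowed class names.
    bad_names = []
    for name in file_names:
        if name.endswith('.xml'):
            root = ET.parse(os.path.join(dir_path, name)).getroot()
            for obj in root.iter('object'):
                class_name = obj.find('name').text
                if class_name not in class_names:
                    bad_names.append((name, class_name))
    return missing_xml, bad_names
```

A call such as check_labels('selected_images', ['fish']) would return two lists, both empty when every image is labelled and every class name is spelled correctly. The class name 'fish' is only an example.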

2.1.3 Image compression

Compressing the images in advance means training no longer has to resize each image on the fly, which speeds up training. Create _03_compress_images.py, which:
1. resizes the jpg files in the old folder and saves them to the new folder images_416x416;
2. updates the xml file belonging to each jpg file and saves it to images_416x416.

import os
import xml.etree.ElementTree as ET
from PIL import Image

# Return the paths of the files in dirPath whose names contain partOfFileName
def getFilePathList(dirPath, partOfFileName=''):
    allFileName_list = list(os.walk(dirPath))[0][2]
    fileName_list = [k for k in allFileName_list if partOfFileName in k]
    filePath_list = [os.path.join(dirPath, k) for k in fileName_list]
    return filePath_list

# Write a new xml file whose size and box coordinates match the resized image
def generateNewXmlFile(old_xmlFilePath, new_xmlFilePath, new_size):
    new_width, new_height = new_size
    with open(old_xmlFilePath) as file:
        fileContent = file.read()
    root = ET.XML(fileContent)
    # Horizontal scale factor; update the width node
    width = root.find('size').find('width')
    old_width = int(width.text)
    width_times = new_width / old_width
    width.text = str(new_width)
    # Vertical scale factor; update the height node
    height = root.find('size').find('height')
    old_height = int(height.text)
    height_times = new_height / old_height
    height.text = str(new_height)
    # Rescale xmin, ymin, xmax and ymax of every labelled object
    scale_by_tag = (('xmin', width_times), ('ymin', height_times),
                    ('xmax', width_times), ('ymax', height_times))
    for object_item in root.findall('object'):
        bndbox = object_item.find('bndbox')
        for tag, times in scale_by_tag:
            node = bndbox.find(tag)
            node.text = str(int(int(node.text) * times))
    tree = ET.ElementTree(root)
    tree.write(new_xmlFilePath)

# Convert every xml file in a folder
def batch_modify_xml(old_dirPath, new_dirPath, new_size):
    xmlFilePath_list = getFilePathList(old_dirPath, '.xml')
    for xmlFilePath in xmlFilePath_list:
        # unlike str.split, os.path.split returns (directory, file name)
        xmlFileName = os.path.split(xmlFilePath)[1]
        new_xmlFilePath = os.path.join(new_dirPath, xmlFileName)
        generateNewXmlFile(xmlFilePath, new_xmlFilePath, new_size)

# Resize one jpg file
def generateNewJpgFile(old_jpgFilePath, new_jpgFilePath, new_size):
    old_image = Image.open(old_jpgFilePath)
    # new_size is (width, height). Image.LANCZOS is the high-quality filter
    # formerly also available as Image.ANTIALIAS, which Pillow 10 removed.
    new_image = old_image.resize(new_size, Image.LANCZOS)
    new_image.save(new_jpgFilePath)

# Resize every labelled jpg file in a folder
def batch_modify_jpg(old_dirPath, new_dirPath, new_size):
    if not os.path.isdir(new_dirPath):
        os.makedirs(new_dirPath)
    xmlFilePath_list = getFilePathList(old_dirPath, '.xml')
    for xmlFilePath in xmlFilePath_list:
        old_jpgFilePath = xmlFilePath[:-4] + '.jpg'
        jpgFileName = os.path.split(old_jpgFilePath)[1]
        new_jpgFilePath = os.path.join(new_dirPath, jpgFileName)
        generateNewJpgFile(old_jpgFilePath, new_jpgFilePath, new_size)

if __name__ == '__main__':
    old_dirPath = 'selected_images'
    new_width = 416
    new_height = 416
    new_size = (new_width, new_height)
    new_dirPath = 'images_%sx%s' % (str(new_width), str(new_height))
    batch_modify_jpg(old_dirPath, new_dirPath, new_size)
    batch_modify_xml(old_dirPath, new_dirPath, new_size)
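The box arithmetic in the xml conversion is easy to check by hand. The sketch below isolates it in a standalone function (scale_box is a name invented here) using the same rule as the script: multiply each coordinate by the width or height ratio and truncate to an integer.

```python
def scale_box(box, old_size, new_size):
    """Rescale (xmin, ymin, xmax, ymax) from old_size to new_size.

    Both sizes are (width, height). Values are truncated with int(),
    matching the conversion used in the script above.
    """
    xmin, ymin, xmax, ymax = box
    width_times = new_size[0] / old_size[0]
    height_times = new_size[1] / old_size[1]
    return (int(xmin * width_times), int(ymin * height_times),
            int(xmax * width_times), int(ymax * height_times))
```

For a 500 x 375 image resized to 416 x 416, the box (100, 50, 300, 200) becomes (83, 55, 249, 221): x values are multiplied by 416/500 = 0.832 and y values by 416/375 ≈ 1.109. Because the two factors differ, the image and its boxes are stretched rather than scaled uniformly.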

2.1.4 Splitting into training and test sets

Edit the class file resources/className_list.txt, one class per line. Running python _04_generate_txtFile.py -d images_416x416 creates the training set file dataset_train.txt and the test set file dataset_test.txt. Note that the script's default class file path is ../resources/category_list.txt, so pass -c if your file is named differently. The code of _04_generate_txtFile.py:

import argparse
import os
import xml.etree.ElementTree as ET
from sklearn.model_selection import train_test_split

# Parse the class list from a text file, one class per line
def get_classNameList(txtFilePath):
    with open(txtFilePath, 'r', encoding='utf8') as file:
        fileContent = file.read()
    # strip() removes leading and trailing spaces, tabs and newlines
    line_list = [k.strip() for k in fileContent.split('\n') if k.strip() != '']
    className_list = sorted(line_list, reverse=False)
    return className_list

# Return the paths of the files in dirPath whose names contain partOfFileName
def get_filePathList(dirPath, partOfFileName=''):
    allFileName_list = list(os.walk(dirPath))[0][2]
    fileName_list = [k for k in allFileName_list if partOfFileName in k]
    filePath_list = [os.path.join(dirPath, k) for k in fileName_list]
    return filePath_list

# Parse the command-line arguments
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--dirPath', type=str, help='directory path', default='../resources/images_416x416')
    # lower-case default, matching the .jpg files written in sections 2.1.1 and 2.1.3
    parser.add_argument('-s', '--suffix', type=str, default='.jpg')
    parser.add_argument('-c', '--class_txtFilePath', type=str, default='../resources/category_list.txt')
    argument_namespace = parser.parse_args()
    return argument_namespace

# Main
if __name__ == '__main__':
    argument_namespace = parse_args()
    dataset_dirPath = argument_namespace.dirPath
    assert os.path.exists(dataset_dirPath), 'not exists this path: %s' % dataset_dirPath
    suffix = argument_namespace.suffix
    class_txtFilePath = argument_namespace.class_txtFilePath
    xmlFilePath_list = get_filePathList(dataset_dirPath, '.xml')
    className_list = get_classNameList(class_txtFilePath)
    train_xmlFilePath_list, test_xmlFilePath_list = train_test_split(xmlFilePath_list, test_size=0.1)
    dataset_list = [('dataset_train', train_xmlFilePath_list), ('dataset_test', test_xmlFilePath_list)]
    for dataset in dataset_list:
        # dataset[0] is 'dataset_train' or 'dataset_test'
        txtFile_path = '%s.txt' % dataset[0]
        with open(txtFile_path, 'w') as txtFile:
            for xmlFilePath in dataset[1]:
                # the original hard-coded '.JPG' here and never used suffix
                jpgFilePath = xmlFilePath[:-4] + suffix
                txtFile.write(jpgFilePath)
                with open(xmlFilePath) as xmlFile:
                    xmlFileContent = xmlFile.read()
                root = ET.XML(xmlFileContent)
                for obj in root.iter('object'):
                    className = obj.find('name').text
                    if className not in className_list:
                        print('error!! className not in className_list')
                        continue
                    classId = className_list.index(className)
                    bndbox = obj.find('bndbox')
                    bound = [int(bndbox.find('xmin').text), int(bndbox.find('ymin').text),
                             int(bndbox.find('xmax').text), int(bndbox.find('ymax').text)]
                    txtFile.write(" " + ",".join([str(k) for k in bound]) + ',' + str(classId))
                txtFile.write('\n')
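Each line the script writes has the form: image path, then one space-separated xmin,ymin,xmax,ymax,classId group per object. A minimal sketch of reading such a line back follows. The function name parse_annotation_line and the sample values are assumptions for illustration, and the sketch assumes the image path contains no spaces.

```python
def parse_annotation_line(line):
    """Split one annotation line into (image_path, list of 5-int tuples)."""
    parts = line.split()
    image_path = parts[0]
    # Each remaining item is 'xmin,ymin,xmax,ymax,classId'.
    boxes = [tuple(int(v) for v in item.split(',')) for item in parts[1:]]
    return image_path, boxes
```

For instance, the line 'images_416x416/001.jpg 48,240,195,371,0 8,12,352,398,0' yields the path 'images_416x416/001.jpg' and two boxes, both of class 0. An image whose objects were all skipped produces a line with a path and an empty box list.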

# Split for the VOC-style layout: reads ./Annotations and writes the txt files in ./Main
from os import listdir
from os.path import isfile, isdir, join
import random

path = './Annotations'  # contains only xml files
files = listdir(path)
# print(files)  # [..., ....xml]

data_rate = {'test': 10, 'train': 60, 'val': 30}
test, train, validation = list(), list(), list()
for index, file_name in enumerate(files):
    rand = random.randint(1, 100)
    filename = file_name.split('.')[0]
    if (rand <= 10):
        test.append(filename)
    elif (rand <= 70):
        train.append(filename)
    elif (rand <= 100):
        validation.append(filename)

print('test: \n', test)
print('train: \n', train)
print('validation: \n', validation)

with open('./Main/test.txt', 'w') as f:  # 0.1
    for name in test:
        f.write(name + '\n')
with open('./Main/train.txt', 'w') as f:  # 0.6
    for name in train:
        f.write(name + '\n')
with open('./Main/val.txt', 'w') as f:  # 0.3
    for name in validation:
        f.write(name + '\n')
with open('./Main/trainval.txt', 'w') as f:  # 0.9
    for name in train:
        f.write(name + '\n')
    for name in validation:
        f.write(name + '\n')
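Because the script above draws a random number per file, the 10/60/30 proportions are only approximate, and a small dataset can end up with an empty subset. As an alternative (a swapped-in technique, not the author's method), the sketch below shuffles once and slices, which gives fixed subset sizes. The function name split_names and the seed parameter are assumptions.

```python
import random


def split_names(names, test_ratio=0.1, val_ratio=0.3, seed=0):
    """Shuffle once, then slice into (train, val, test) with fixed sizes."""
    names = sorted(names)                    # start from a stable order
    random.Random(seed).shuffle(names)       # reproducible shuffle
    n_test = int(round(len(names) * test_ratio))
    n_val = int(round(len(names) * val_ratio))
    test = names[:n_test]
    val = names[n_test:n_test + n_val]
    train = names[n_test + n_val:]
    return train, val, test
```

With 200 sample IDs this gives exactly 120 training, 60 validation and 20 test names, and trainval is simply train + val.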

2.2 Model training

Open a Terminal in the keras-yolo3-master folder and run python _05_train.py to start training. To change the number of training epochs, edit the arguments of the fit_generator call in _05_train.py (line 85 in the author's file), namely the value of the epochs argument (line 90). The code of _05_train.py:

# Common libraries
import argparse
import os
import numpy as np
# Keras
import keras.backend as K
from keras.layers import Input, Lambda
from keras.models import Model
# Functions from model.py and utils.py in the yolo3 folder
from yolo3.model import preprocess_true_boxes, yolo_body, yolo_loss
from yolo3.utils import get_random_data

# Parse the class list category_list from a text file, one class per line
def get_categoryList(txtFilePath):
    with open(txtFilePath, 'r', encoding='utf8') as file:
        fileContent = file.read()
    line_list = [k.strip() for k in fileContent.split('\n') if k.strip() != '']
    category_list = sorted(line_list, reverse=False)
    return category_list

# Parse anchor_ndarray from the anchor text file, e.g. ./model_data/yolo_anchors.txt
def get_anchorNdarray(anchor_txtFilePath):
    with open(anchor_txtFilePath) as file:
        anchor_ndarray = [float(k) for k in file.read().split(',')]
    return np.array(anchor_ndarray).reshape(-1, 2)

# Build the YOLOv3 model: yolo_body provides the inference layers,
# and the loss function is attached to complete the network.
def create_model(input_shape, anchor_ndarray, num_classes, load_pretrained=True, freeze_body=False,
                 weights_h5FilePath='../resources/saved_models/trained_weights.h5'):
    K.clear_session()  # get a new session
    image_input = Input(shape=(None, None, 3))
    height, width = input_shape
    num_anchors = len(anchor_ndarray)
    y_true = [Input(shape=(height // k, width // k, num_anchors // 3, num_classes + 5)) for k in [32, 16, 8]]
    model_body = yolo_body(image_input, num_anchors // 3, num_classes)
    print('Create YOLOv3 model with {} anchors and {} classes.'.format(num_anchors, num_classes))
    if load_pretrained and os.path.exists(weights_h5FilePath):
        model_body.load_weights(weights_h5FilePath, by_name=True, skip_mismatch=True)
        print('Load weights from this path: {}.'.format(weights_h5FilePath))
        if freeze_body:
            num = len(model_body.layers) - 7
            for i in range(num):
                model_body.layers[i].trainable = False
            print('Freeze the first {} layers of total {} layers.'.format(num, len(model_body.layers)))
    model_loss = Lambda(yolo_loss,
                        output_shape=(1,),
                        name='yolo_loss',
                        arguments={'anchors': anchor_ndarray,
                                   'num_classes': num_classes,
                                   'ignore_thresh': 0.5})([*model_body.output, *y_true])
    model = Model([model_body.input, *y_true], model_loss)
    return model

# Calling this function starts training
def train(model, annotationFilePath, input_shape, anchor_ndarray, num_classes,
          logDirPath='../resources/saved_models/'):
    model.compile(optimizer='adam', loss={'yolo_loss': lambda y_true, y_pred: y_pred})
    # Split into training and validation sets
    batch_size = 8
    val_split = 0.05
    with open(annotationFilePath) as file:
        lines = file.readlines()
    np.random.shuffle(lines)
    num_val = int(len(lines) * val_split)
    num_train = len(lines) - num_val
    print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))
    # Train on the data produced by the generators
    model.fit_generator(
        data_generator(lines[:num_train], batch_size, input_shape, anchor_ndarray, num_classes),
        steps_per_epoch=max(1, num_train // batch_size),
        validation_data=data_generator(lines[num_train:], batch_size, input_shape, anchor_ndarray, num_classes),
        validation_steps=max(1, num_val // batch_size),
        epochs=1000,
        initial_epoch=0)
    # Save the model when training ends
    if not os.path.isdir(logDirPath):
        os.makedirs(logDirPath)
    model_savedPath = os.path.join(logDirPath, 'trained_weights.h5')
    model.save(model_savedPath)

# Image data generator
def data_generator(annotation_lines, batch_size, input_shape, anchors, num_classes):
    n = len(annotation_lines)
    np.random.shuffle(annotation_lines)
    i = 0
    while True:
        image_data = []
        box_data = []
        for b in range(batch_size):
            i %= n
            image, box = get_random_data(annotation_lines[i], input_shape, random=True)
            image_data.append(image)
            box_data.append(box)
            i += 1
        image_data = np.array(image_data)
        box_data = np.array(box_data)
        y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
        yield [image_data, *y_true], np.zeros(batch_size)

# Parse the command-line arguments
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-w', '--width', type=int, default=416)
    parser.add_argument('-he', '--height', type=int, default=416)
    parser.add_argument('-c', '--class_txtFilePath', type=str, default='../resources/category_list.txt')
    parser.add_argument('-a', '--anchor_txtFilePath', type=str, default='./model_data/yolo_anchors.txt')
    argument_namespace = parser.parse_args()
    return argument_namespace

# Main
if __name__ == '__main__':
    argument_namespace = parse_args()
    class_txtFilePath = argument_namespace.class_txtFilePath
    anchor_txtFilePath = argument_namespace.anchor_txtFilePath
    category_list = get_categoryList(class_txtFilePath)
    anchor_ndarray = get_anchorNdarray(anchor_txtFilePath)
    width = argument_namespace.width
    height = argument_namespace.height
    # (height, width), each a multiple of 32; create_model unpacks it in this
    # order (the original passed (width, height), which only works when both are equal)
    input_shape = (height, width)
    model = create_model(input_shape, anchor_ndarray, len(category_list))
    annotationFilePath = 'dataset_train.txt'
    train(model, annotationFilePath, input_shape, anchor_ndarray, len(category_list))
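The "multiple of 32" requirement follows from the shapes of the three y_true inputs built in create_model. The sketch below recomputes those shapes without Keras. The function name y_true_shapes is made up here, and the example values (9 anchors, 1 class) are assumptions chosen for illustration.

```python
def y_true_shapes(input_shape, num_anchors, num_classes):
    """Shapes of the three label tensors for an input of (height, width)."""
    height, width = input_shape
    # One tensor per output scale (strides 32, 16, 8); each grid cell holds
    # num_anchors // 3 anchors, each with 4 box values, 1 objectness value
    # and one score per class.
    return [(height // k, width // k, num_anchors // 3, num_classes + 5)
            for k in (32, 16, 8)]
```

For a 416 x 416 input with 9 anchors and 1 class this gives (13, 13, 3, 6), (26, 26, 3, 6) and (52, 52, 3, 6). A size that is not a multiple of 32 would be floored by the integer division and no longer line up with the network's output grids.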

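2.3 Model testing

Trained model weights: https://pan.baidu.com/s/1gPkH_zdSS_Eu1V9hSb7MXA
(extraction code: a0ld). After unzipping fish_weights.zip, put trained_weights.h5 into the folder saved_model.

2.3.1 Single image

Open a Terminal in the keras-yolo3-master folder, run jupyter notebook, and open _07_yolo_test.ipynb. The first code cell loads the YOLOv3 model. The second loads the test set file dataset_test.txt and stores the image paths in jpgFilePath_list. The third opens an image from its path and calls the detect_image method of the YoloModel object to run object detection on it.

# Cell 1: load the model. The keyword must be weights_h5FilePath, the key that
# YoloModel actually reads (the original passed weightsFilePath, which is ignored).
from _06_yolo import YoloModel
yolo_model = YoloModel(weights_h5FilePath='saved_model/trained_weights.h5')

# Cell 2: collect the image paths of the test set
with open('dataset_test.txt') as file:
    line_list = file.readlines()
jpgFilePath_list = [k.split()[0] for k in line_list]
jpgFilePath_list

# Cell 3: detect objects in the first image
from PIL import Image
jpgFilePath = jpgFilePath_list[0]
image = Image.open(jpgFilePath)
yolo_model.detect_image(image)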

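2.3.2 Video

Combine the images into one video: open a Terminal in the keras-YOLOv3 folder and run sudo apt-get install ffmpeg to install ffmpeg. In the same Terminal, run ffmpeg -start_number 1 -r 1 -i images_416x416/%03d.jpg -vcodec mpeg4 keras-yolo3-master/1.mp4. Make sure the Terminal's current directory contains the folder images_416x416.
ffmpeg arguments:
1. -start_number: used together with -i; the number at which the %03d index starts (default 0);
2. -r: the frame rate, i.e. how many images per second go into the video;
3. -i: short for input; the path pattern of the images used to build the video;
4. -vcodec: the video codec; mpeg4 is a commonly used one;
5. the last argument is the path where the output file is saved.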
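The same ffmpeg call can be assembled from Python, which makes each argument explicit. This is a minimal sketch: the function names build_ffmpeg_command and make_video are assumptions for illustration, and only the flags already listed above are used.

```python
import subprocess


def build_ffmpeg_command(image_pattern, output_path,
                         start_number=1, fps=1, codec='mpeg4'):
    """Return the ffmpeg argument list for turning numbered images into a video."""
    return ['ffmpeg',
            '-start_number', str(start_number),  # first value of the %03d index
            '-r', str(fps),                      # images per second
            '-i', image_pattern,                 # e.g. images_416x416/%03d.jpg
            '-vcodec', codec,                    # video codec
            output_path]                         # output file


def make_video(image_pattern, output_path, **kwargs):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(image_pattern, output_path, **kwargs),
                   check=True)
```

Passing the arguments as a list avoids shell quoting problems with the % character in the image pattern. make_video('images_416x416/%03d.jpg', 'keras-yolo3-master/1.mp4') would run the same command as the one shown above, provided ffmpeg is installed.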
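In the same Terminal, run pip install opencv-python to install the opencv-python library. Then cd keras-yolo3-master and run python yolo_video.py --input 1.mp4 --output fish_output.mp4, which runs object detection on the video file 1.mp4 and saves the result as fish_output.mp4. YOLOv3 is fast: in this example, detecting one image takes only about 0.05 seconds. Without intervention, detection on the next frame starts as soon as the previous one finishes, which is too fast for the eye to follow. The author therefore modified line 183 of _06_yolo.py so that the program pauses for 0.5 seconds after each frame, which makes the video easier to watch. The code of _06_yolo.py: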

# -*- coding: utf-8 -*-
# Common libraries
import colorsys
import os
import time
import numpy as np
# Keras
from keras import backend as K
from keras.layers import Input
# Functions from model.py and utils.py in the yolo3 folder
from yolo3.model import yolo_eval, yolo_body
from yolo3.utils import letterbox_image
# PIL drawing
from PIL import Image, ImageFont, ImageDraw

# Build one colour per class; each colour is a tuple of 3 RGB values
def get_colorList(category_quantity):
    hsv_list = []
    for i in range(category_quantity):
        hue = i / category_quantity
        saturation = 1
        value = 1
        hsv = (hue, saturation, value)
        hsv_list.append(hsv)
    colorFloat_list = [colorsys.hsv_to_rgb(*k) for k in hsv_list]
    color_list = [tuple([int(x * 255) for x in k]) for k in colorFloat_list]
    return color_list

# Class YoloModel
class YoloModel(object):
    defaults = {
        "weights_h5FilePath": '../resources/trained_weights.h5',
        "anchor_txtFilePath": 'model_data/yolo_anchors.txt',
        "category_txtFilePath": '../resources/category_list.txt',
        "score": 0.3,
        "iou": 0.35,
        "model_image_size": (416, 416)  # must be a multiple of 32
    }

    @classmethod
    def get_defaults(cls, n):
        if n in cls.defaults:
            return cls.defaults[n]
        else:
            return 'Unrecognized attribute name "%s"' % n

    # Constructor
    def __init__(self, **kwargs):
        self.__dict__.update(self.defaults)  # set up default values
        self.__dict__.update(kwargs)  # and update with user overrides
        self.category_list = self.get_categoryList()
        self.anchor_ndarray = self.get_anchorNdarray()
        self.session = K.get_session()
        self.boxes, self.scores, self.classes = self.generate()

    # Parse the class list category_list from a text file, one class per line
    def get_categoryList(self):
        with open(self.category_txtFilePath, 'r', encoding='utf8') as file:
            fileContent = file.read()
        line_list = [k.strip() for k in fileContent.split('\n') if k.strip() != '']
        category_list = sorted(line_list, reverse=False)
        return category_list

    # Parse anchor_ndarray from the anchor text file
    def get_anchorNdarray(self):
        with open(self.anchor_txtFilePath, 'r', encoding='utf8') as file:
            number_list = [float(k) for k in file.read().split(',')]
        anchor_ndarray = np.array(number_list).reshape(-1, 2)
        return anchor_ndarray

    # Load the model
    def generate(self):
        # In Keras, if only the weights were saved after training,
        # the network must be built first and the weights loaded into it
        num_anchors = len(self.anchor_ndarray)
        num_classes = len(self.category_list)
        self.yolo_model = yolo_body(Input(shape=(None, None, 3)), num_anchors // 3, num_classes)
        self.yolo_model.load_weights(self.weights_h5FilePath)
        # One box colour per class
        category_quantity = len(self.category_list)
        self.color_list = get_colorList(category_quantity)
        # Detection outputs: boxes, scores and classes
        self.input_image_size = K.placeholder(shape=(2, ))
        boxes, scores, classes = yolo_eval(self.yolo_model.output,
                                           self.anchor_ndarray,
                                           category_quantity,
                                           self.input_image_size,
                                           score_threshold=self.score,
                                           iou_threshold=self.iou)
        return boxes, scores, classes

    # Detect objects in one image
    def detect_image(self, image):
        startTime = time.time()
        # Prepare the input data for the network
        boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
        image_data = np.array(boxed_image).astype('float') / 255
        image_data = np.expand_dims(image_data, 0)  # Add batch dimension.
        # Run the network
        out_boxes, out_scores, out_classes = self.session.run(
            [self.boxes, self.scores, self.classes],
            feed_dict={self.yolo_model.input: image_data,
                       self.input_image_size: [image.size[1], image.size[0]],
                       K.learning_phase(): 0})
        # Font object used for the labels
        font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
                                  size=np.floor(2e-2 * image.size[1] + 0.5).astype('int32'))
        thickness = (image.size[0] + image.size[1]) // 300
        # Draw each detected box
        for i, c in enumerate(out_classes):
            # Text shown above the box
            predicted_class = self.category_list[c]
            score = out_scores[i]
            label = '{} {:.2f}'.format(predicted_class, score)
            draw = ImageDraw.Draw(image)
            # Note: ImageDraw.textsize was removed in Pillow 10; this code needs an older Pillow
            label_size = draw.textsize(label, font)
            box = out_boxes[i]
            top, left, bottom, right = box
            top = max(0, np.floor(top + 0.5).astype('int32'))
            left = max(0, np.floor(left + 0.5).astype('int32'))
            bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
            right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
            # Put the label above the box if there is room, otherwise just inside it
            if top - label_size[1] >= 0:
                text_region = np.array([left, top - label_size[1]])
            else:
                text_region = np.array([left, top + 1])
            # Draw one rectangle per unit of thickness
            for j in range(thickness):
                draw.rectangle([left + j, top + j, right - j, bottom - j], outline=self.color_list[c])
            # Draw the label background and text
            draw.rectangle([tuple(text_region), tuple(text_region + label_size)], fill=self.color_list[c])
            draw.text(text_region, label, fill=(0, 0, 0), font=font)
            del draw
        # Print the time spent on this image
        usedTime = time.time() - startTime
        print('Detection on this image took %.2f seconds' % (usedTime))
        return image

    # Close the tensorflow session
    def close_session(self):
        self.session.close()

# Run detection on a video
def detect_video(yolo, video_path, output_path=""):
    import cv2
    vid = cv2.VideoCapture(video_path)
    if not vid.isOpened():
        raise IOError("Couldn't open webcam or video")
    video_FourCC = int(vid.get(cv2.CAP_PROP_FOURCC))
    video_fps = vid.get(cv2.CAP_PROP_FPS)
    video_size = (int(vid.get(cv2.CAP_PROP_FRAME_WIDTH)),
                  int(vid.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    isOutput = True if output_path != "" else False
    if isOutput:
        print("!!! TYPE:", type(output_path), type(video_FourCC), type(video_fps), type(video_size))
        print(video_FourCC, video_fps, video_size)
        out = cv2.VideoWriter(output_path, video_FourCC, video_fps, video_size)
    accum_time = 0
    curr_fps = 0
    fps = "FPS: ??"
    prev_time = time.time()
    cv2.namedWindow("result", cv2.WINDOW_NORMAL)
    cv2.resizeWindow('result', video_size[0], video_size[1])
    while True:
        return_value, frame = vid.read()
        try:
            # Array axis 0 is height, axis 1 is width, axis 2 is the colour channel.
            # PIL uses RGB channel order while cv2 uses BGR, so reverse the last axis.
            image = Image.fromarray(frame[..., ::-1])
        except Exception:
            # frame is None once the video has ended
            break
        image = yolo.detect_image(image)
        result = np.asarray(image)
        curr_time = time.time()
        exec_time = curr_time - prev_time
        prev_time = curr_time
        accum_time = accum_time + exec_time
        curr_fps = curr_fps + 1
        if accum_time > 1:
            accum_time = accum_time - 1
            fps = "FPS: " + str(curr_fps)
            curr_fps = 0
        cv2.putText(result, text=fps, org=(3, 15), fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                    fontScale=0.50, color=(255, 0, 0), thickness=2)
        cv2.imshow("result", result[..., ::-1])
        if isOutput:
            out.write(result[..., ::-1])
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        # Pause after each frame so the result is easier to watch
        sleepTime = 0.5
        time.sleep(sleepTime)
    yolo.close_session()
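The rounding and clipping of box coordinates in detect_image can be followed with plain numbers. The sketch below repeats that step with the standard library instead of NumPy. The function name clip_box and the sample values are assumptions for illustration.

```python
import math


def clip_box(box, image_size):
    """Round a (top, left, bottom, right) box and clip it to the image.

    image_size is (width, height), as returned by PIL's Image.size.
    """
    top, left, bottom, right = box
    width, height = image_size
    # floor(x + 0.5) rounds to the nearest integer, as in detect_image.
    top = max(0, math.floor(top + 0.5))
    left = max(0, math.floor(left + 0.5))
    bottom = min(height, math.floor(bottom + 0.5))
    right = min(width, math.floor(right + 0.5))
    return top, left, bottom, right
```

For a 416 x 416 image, the raw box (-3.2, 10.6, 420.9, 400.4) becomes (0, 11, 416, 400): the negative top is raised to 0 and the bottom edge is pulled back to the image height. Note the (top, left, bottom, right) order, which differs from the xmin,ymin,xmax,ymax order used in the annotation files.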

2.3.3 Multiple images

This section combines the previous two: it reads the images in a folder, runs detection on each one and shows the results as a video. Create _08_detect_multi_images.py with the following code:

# The YOLO model class
from _06_yolo import YoloModel
# Common libraries
import argparse
import os
import time
import cv2
import numpy as np
from PIL import Image

# Return the paths of the files in dirPath whose names contain partOfFileName
def get_filePathList(dirPath, partOfFileName=''):
    all_fileName_list = next(os.walk(dirPath))[2]
    fileName_list = [k for k in all_fileName_list if partOfFileName in k]
    filePath_list = [os.path.join(dirPath, k) for k in fileName_list]
    return filePath_list

# Run detection on several images and optionally save the result as an avi video
def detect_multi_images(weights_h5FilePath, imageFilePath_list, out_aviFilePath=None):
    yolo_model = YoloModel(weights_h5FilePath=weights_h5FilePath)
    windowName = 'detect_multi_images_result'
    cv2.namedWindow(windowName, cv2.WINDOW_NORMAL)
    width = 1000
    height = 618
    display_size = (width, height)
    cv2.resizeWindow(windowName, width, height)
    videoWriter = None
    if out_aviFilePath is not None:
        fourcc = cv2.VideoWriter_fourcc('M', 'P', 'E', 'G')
        videoWriter = cv2.VideoWriter(out_aviFilePath, fourcc, 1.3, display_size)
    for imageFilePath in imageFilePath_list:
        image = Image.open(imageFilePath)
        out_image = yolo_model.detect_image(image)
        # Image.LANCZOS replaces Image.ANTIALIAS, which Pillow 10 removed
        resized_image = out_image.resize(display_size, Image.LANCZOS)
        resized_image_ndarray = np.array(resized_image)
        # Array axis 0 is height, axis 1 is width, axis 2 is the colour channel.
        # PIL uses RGB channel order while cv2 uses BGR, so reverse the last axis.
        cv2.imshow(windowName, resized_image_ndarray[..., ::-1])
        if videoWriter is not None:
            videoWriter.write(resized_image_ndarray[..., ::-1])
        # Press the space bar once to pause, and press any key to continue
        pressKey = cv2.waitKey(500)
        if ord(' ') == pressKey:
            cv2.waitKey(0)
        # Press Esc or q to leave the loop
        if 27 == pressKey or ord('q') == pressKey:
            break
    # On exit, close the model, the video writer and the cv2 windows
    yolo_model.close_session()
    if videoWriter is not None:  # the original released it even when no output path was given
        videoWriter.release()
    cv2.destroyAllWindows()

# Parse the command-line arguments: dirPath, image_suffix, weights_h5FilePath
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--dirPath', type=str, help='directory path', default='../resources/n01440764')
    parser.add_argument('--image_suffix', type=str, default='.JPEG')
    parser.add_argument('-w', '--weights_h5FilePath', type=str, default='../resources/trained_weights.h5')
    argument_namespace = parser.parse_args()
    return argument_namespace

# Main
if __name__ == '__main__':
    argument_namespace = parse_args()
    dirPath = argument_namespace.dirPath
    image_suffix = argument_namespace.image_suffix
    weights_h5FilePath = argument_namespace.weights_h5FilePath
    imageFilePath_list = get_filePathList(dirPath, image_suffix)
    out_aviFilePath = '../resources/fish_output_2.avi'
    detect_multi_images(weights_h5FilePath, imageFilePath_list, out_aviFilePath)

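The following features of the original author's code were removed: support for YOLOv3_tiny, and multi-GPU support during detection. (figure omitted)

Contents of the model_data folder: (figure omitted)

Contents of the yolo3 folder: (figure omitted)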
