
Training YOLOv5 on the KITTI Dataset

Published: 2023/12/31


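I. Introduction to the KITTI dataset
KITTI is a benchmark dataset for evaluating computer vision algorithms in autonomous driving scenarios. It was created jointly by the Karlsruhe Institute of Technology (KIT) in Germany and the Toyota Technological Institute at Chicago (TTIC).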

Scenes covered: urban areas, rural areas, and highways.
Here we use only the part of the dataset related to pedestrians and vehicles.

The data can be downloaded from the official site:
http://www.cvlibs.net/download.php?file=data_object_image_2.zip
http://www.cvlibs.net/download.php?file=data_object_label_2.zip
These two archives contain the images and the labels.

Create a folder named kitti under yolov5/dataset
and put the data into it:

kitti
├── images
│   ├── val
│   │   ├── 000000.png
│   │   └── .......
│   └── train
│       ├── 000000.png
│       └── .......
└── labels
    └── train

Note: do not put the label files in yet. They have to be converted first.

II. Converting the KITTI dataset
Open one of the label files,
for example 000000.txt:

Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01

This is KITTI's own label format. YOLOv5 cannot read it directly, so we have to convert it.
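To make the sample line easier to read, here is a minimal sketch that splits it into named fields. The field order (type, truncated, occluded, alpha, then the 2D box as left, top, right, bottom) follows the KITTI object label format; the names KittiObject and parse_kitti_line are my own and only for illustration. The remaining seven fields (3D dimensions, location, rotation) are ignored because this tutorial only needs the 2D box.

```python
from typing import NamedTuple


class KittiObject(NamedTuple):
    """2D part of one KITTI label line (illustrative name, not from any library)."""
    category: str
    truncated: float
    occluded: int
    alpha: float
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def parse_kitti_line(line: str) -> KittiObject:
    """Split one whitespace-separated KITTI label line and keep the first 8 fields."""
    fields = line.split()
    return KittiObject(
        category=fields[0],
        truncated=float(fields[1]),
        occluded=int(fields[2]),
        alpha=float(fields[3]),
        xmin=float(fields[4]),  # left edge of the box, in pixels
        ymin=float(fields[5]),  # top edge
        xmax=float(fields[6]),  # right edge
        ymax=float(fields[7]),  # bottom edge
    )


sample = "Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01"
obj = parse_kitti_line(sample)
print(obj.category, obj.xmin, obj.ymin, obj.xmax, obj.ymax)
```

Only the category and the four box coordinates are used in the conversion steps that follow.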

First we merge the categories, because we only need three classes (adjust the paths in the code to your own setup).

# modify_annotations_txt.py
# Reduce the original 8 KITTI object categories to the 3 we need: Car, Pedestrian, Cyclist.
# Car, Van, Truck and Tram are merged into Car; Pedestrian and Person_sitting are merged
# into Pedestrian; Cyclist is kept unchanged. DontCare and Misc lines are dropped.
# Note: the label files are rewritten in place.
import glob

txt_list = glob.glob('path/to/your/downloaded/label/folder/*.txt')  # adjust to your label folder


def show_category(txt_list):
    category_list = []
    for item in txt_list:
        try:
            with open(item) as tdf:
                for each_line in tdf:
                    labeldata = each_line.strip().split(' ')  # strip surrounding whitespace and split into fields
                    category_list.append(labeldata[0])  # keep only the first field, the category
        except IOError as ioerr:
            print('File error:' + str(ioerr))
    print(set(category_list))  # print the set of categories found


def merge(line):
    # Join the fields with single spaces (none after the last field) and end with a newline
    return ' '.join(line) + '\n'


print('before modify categories are:\n')
show_category(txt_list)

for item in txt_list:
    new_txt = []
    try:
        with open(item, 'r') as r_tdf:
            for each_line in r_tdf:
                labeldata = each_line.strip().split(' ')
                if labeldata[0] in ['Truck', 'Van', 'Tram']:  # merge the vehicle classes
                    labeldata[0] = 'Car'
                if labeldata[0] == 'Person_sitting':  # merge the pedestrian classes
                    labeldata[0] = 'Pedestrian'
                if labeldata[0] == 'DontCare':  # skip the DontCare class
                    continue
                if labeldata[0] == 'Misc':  # skip the Misc class
                    continue
                new_txt.append(merge(labeldata))
        # Write the new contents back; 'w+' truncates the original file first
        with open(item, 'w+') as w_tdf:
            for temp in new_txt:
                w_tdf.write(temp)
    except IOError as ioerr:
        print('File error:' + str(ioerr))

print('\nafter modify categories are:\n')
show_category(txt_list)
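The same merging rule can also be written as a lookup table, which makes the mapping easy to check at a glance. This is only a sketch of an alternative, not the script used in this article; CLASS_MAP and remap_line are names I made up, and the two test lines are synthetic (the numbers of the sample line above with the category swapped).

```python
from typing import Optional

# Source category -> merged category. Anything not listed (DontCare, Misc) is dropped.
CLASS_MAP = {
    "Car": "Car",
    "Van": "Car",
    "Truck": "Car",
    "Tram": "Car",
    "Pedestrian": "Pedestrian",
    "Person_sitting": "Pedestrian",
    "Cyclist": "Cyclist",
}


def remap_line(line: str) -> Optional[str]:
    """Return the line with its category merged, or None if the line should be dropped."""
    fields = line.split()
    if not fields:
        return None  # blank line
    new_category = CLASS_MAP.get(fields[0])
    if new_category is None:
        return None  # category we do not train on
    return " ".join([new_category] + fields[1:])


# Synthetic lines for illustration only
van_line = "Van 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01"
dontcare_line = "DontCare 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01"
print(remap_line(van_line))
print(remap_line(dontcare_line))
```

With a table like this, adding or removing a class means changing one dictionary entry instead of another if branch.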

Next we convert the labels to XML files.
Create an Annotations folder to hold the XML.

# kitti_txt_to_xml.py
# encoding:utf-8
# Build a VOC-style XML annotation from scratch as a DOM tree
from xml.dom.minidom import Document
import cv2
import os


def add_text_element(doc, parent, tag, text):
    # Append <tag>text</tag> to parent
    node = doc.createElement(tag)
    node.appendChild(doc.createTextNode(text))
    parent.appendChild(node)


def generate_xml(name, split_lines, img_size, class_ind):
    doc = Document()  # create the DOM document object
    annotation = doc.createElement('annotation')
    doc.appendChild(annotation)
    add_text_element(doc, annotation, 'folder', 'KITTI')
    add_text_element(doc, annotation, 'filename', name + '.png')

    source = doc.createElement('source')
    annotation.appendChild(source)
    add_text_element(doc, source, 'database', 'The KITTI Database')
    add_text_element(doc, source, 'annotation', 'KITTI')

    size = doc.createElement('size')
    annotation.appendChild(size)
    # img_size is the cv2 image shape: (height, width, channels)
    add_text_element(doc, size, 'width', str(img_size[1]))
    add_text_element(doc, size, 'height', str(img_size[0]))
    add_text_element(doc, size, 'depth', str(img_size[2]))

    for split_line in split_lines:
        line = split_line.strip().split()
        if line and line[0] in class_ind:  # skip blank lines and unwanted classes
            obj = doc.createElement('object')
            annotation.appendChild(obj)
            add_text_element(doc, obj, 'name', line[0])
            bndbox = doc.createElement('bndbox')
            obj.appendChild(bndbox)
            # Fields 4..7 of a KITTI line are the box's left, top, right, bottom in pixels
            add_text_element(doc, bndbox, 'xmin', str(int(float(line[4]))))
            add_text_element(doc, bndbox, 'ymin', str(int(float(line[5]))))
            add_text_element(doc, bndbox, 'xmax', str(int(float(line[6]))))
            add_text_element(doc, bndbox, 'ymax', str(int(float(line[7]))))

    # Write the DOM object to Annotations/<name>.xml so the file name matches the image name
    with open(os.path.join('Annotations', name + '.xml'), 'w') as f:
        f.write(doc.toprettyxml(indent=''))


if __name__ == '__main__':
    class_ind = ('Pedestrian', 'Car', 'Cyclist')
    cur_dir = os.getcwd()
    labels_dir = os.path.join(cur_dir, 'Labels')
    # os.walk yields the current directory, its subdirectories, and its files
    for parent, dirnames, filenames in os.walk(labels_dir):
        for file_name in filenames:
            full_path = os.path.join(parent, file_name)  # full path of the label file
            with open(full_path) as f:
                split_lines = f.readlines()
            name = file_name[:-4]  # drop the 4-character extension .txt
            img_name = name + '.png'
            img_path = os.path.join('./JPEGImages/train', img_name)  # adjust this path to your image folder
            img_size = cv2.imread(img_path).shape
            generate_xml(name, split_lines, img_size, class_ind)
    print('all txts have been converted into xmls')

At this point the .txt labels have been converted to .xml and stored under Annotations.

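Finally we convert the .xml files to the label format used for YOLO training,
that is, the darknet txt format: one object per line, written as class index, x_center, y_center, width, height, with the coordinates normalized by the image width and height.
For example: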

0 0.9074074074074074 0.7413333333333333 0.09178743961352658 0.256
0 0.3635265700483092 0.6386666666666667 0.0785024154589372 0.14533333333333334
2 0.6996779388083736 0.5066666666666667 0.008051529790660225 0.08266666666666667
0 0.7024959742351047 0.572 0.0430756843800322 0.09733333333333333
0 0.6755233494363929 0.544 0.03140096618357488 0.06933333333333333
0 0.48027375201288247 0.5453333333333333 0.030998389694041867 0.068
0 0.5032206119162641 0.528 0.021739130434782608 0.05333333333333334
0 0.6533816425120773 0.5306666666666666 0.02214170692431562 0.056
0 0.5515297906602254 0.5053333333333333 0.017713365539452495 0.03866666666666667

# xml_to_yolo_txt.py
# Place this script in the same directory as the VOC_KITTI folder
import glob
import os
import xml.etree.ElementTree as ET

# Class names as they appear in the XML files. The position in this list becomes the
# class index in the label files, so the order must match `names` in data/kitti.yaml.
class_names = ['Car', 'Pedestrian', 'Cyclist']
# Path to the xml files
path = './Annotations/'


# Convert a single xml file to txt
def single_xml_to_txt(xml_file):
    tree = ET.parse(xml_file)
    root = tree.getroot()
    # Output path: same name as the xml file, with a .txt extension
    txt_path = os.path.splitext(xml_file)[0] + '.txt'
    # The children of <size> are width, height, depth, in that order
    picture_width = int(root.find('size')[0].text)
    picture_height = int(root.find('size')[1].text)
    with open(txt_path, 'w') as txt_file:
        for member in root.findall('object'):
            class_name = member[0].text
            # Index that corresponds to the class name
            class_num = class_names.index(class_name)
            box_x_min = int(member[1][0].text)  # top-left x
            box_y_min = int(member[1][1].text)  # top-left y
            box_x_max = int(member[1][2].text)  # bottom-right x
            box_y_max = int(member[1][3].text)  # bottom-right y
            print(box_x_max, box_x_min, box_y_max, box_y_min)
            # Convert to a normalized center position, width and height
            x_center = float(box_x_min + box_x_max) / (2 * picture_width)
            y_center = float(box_y_min + box_y_max) / (2 * picture_height)
            width = float(box_x_max - box_x_min) / picture_width
            height = float(box_y_max - box_y_min) / picture_height
            print(class_num, x_center, y_center, width, height)
            txt_file.write(str(class_num) + ' ' + str(x_center) + ' ' + str(y_center) + ' '
                           + str(width) + ' ' + str(height) + '\n')


# Convert every xml file in a folder to txt
def dir_xml_to_txt(path):
    for xml_file in glob.glob(path + '*.xml'):
        single_xml_to_txt(xml_file)


dir_xml_to_txt(path)
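To see the arithmetic on a concrete case, here is a small sketch that normalizes the box of the Pedestrian sample line from 000000.txt. The function name kitti_box_to_yolo is hypothetical, and the 1242 x 375 image size is an assumption (a common KITTI resolution, but sizes differ between images, so in practice read the real size from each image). Unlike the script above, this sketch keeps the float pixel coordinates instead of truncating them to integers, so the last digits can differ slightly.

```python
def kitti_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel box (left, top, right, bottom) to normalized YOLO
    (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Box of the Pedestrian sample line; the image size is assumed, not read from the image
yolo_box = kitti_box_to_yolo(712.40, 143.00, 810.73, 307.92, img_w=1242, img_h=375)
print(" ".join(f"{v:.6f}" for v in yolo_box))
```

All four values must land in the range 0 to 1. If they do not, the width and height were probably swapped or the wrong image size was used.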

Finally, move all the txt files generated under Annotations/ into the labels folder of the dataset we created earlier (labels/train):

kitti
├── images
│   ├── val
│   │   ├── 000000.png
│   │   └── .......
│   └── train
│       ├── 000000.png
│       └── .......
└── labels
    └── train
        ├── 000000.txt
        └── .......
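The tree contains an images/val folder, but the steps above do not say how to fill it. One possible way, sketched here under my own assumptions, is to hold out a random fraction of the file names for validation. The function split_stems, the 20% fraction, and the seed are all made up for illustration; moving the actual files is left out. As far as I know, YOLOv5 looks for each image's label by replacing images with labels in the image path, so the held-out labels would go into labels/val.

```python
import random


def split_stems(stems, val_fraction=0.1, seed=0):
    """Shuffle file stems reproducibly and return (train_stems, val_stems)."""
    shuffled = sorted(stems)  # sort a copy first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Fake stems in KITTI's six-digit naming style, only for demonstration
stems = [f"{i:06d}" for i in range(20)]
train_stems, val_stems = split_stems(stems, val_fraction=0.2)
print(len(train_stems), len(val_stems))
```

Each stem then names one image (stem + '.png') and one label (stem + '.txt') that have to be moved together.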

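The dataset is now ready.
Next we can start training. The procedure is the same as in my previous tutorial, which you can read first to learn the steps for training YOLOv5:
https://blog.csdn.net/qq_45978858/article/details/119686255?spm=1001.2014.3001.5501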

III. Training on the KITTI dataset
Let's get straight to it.
1. In the data folder, make a copy of coco.yaml, rename it kitti.yaml, and change its contents to:

train: ../yolov5-kitty/dataset/kitti/images/train  # training images
val: ../yolov5-kitty/dataset/kitti/images/train  # validation images (the same folder as train here, so the reported metrics are measured on training data)

# Classes
nc: 3  # number of classes
names: ['Car','Pedestrian','Cyclist']  # order must match the class indices used in the label files

2. In the models folder, edit yolov5s.yaml so that it reads:

# Parameters
nc: 3  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, C3, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

You also need a yolov5s.pt weights file in the yolov5 folder. A download link is given in my earlier blog post.
3. Start training

Enter the command on a single line, with the arguments separated by spaces:
python train.py --img 640 --batch-size 16 --epochs 10 --data data/kitti.yaml --cfg models/yolov5s.yaml --weights yolov5s.pt

I trained for only 10 epochs here, but it still took more than an hour. The accuracy was also very good, above 0.9.

The training results can now be found under runs/train/exp.

The accuracy figures all look fine.

We can try out the best weights from training:

python detect.py --weights runs/train/exp/weights/best.pt --source Road_traffic_video2.mp4 --device 0
Here --source can be an image, a video, or 0 (the webcam).

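As you can see, the results are reasonably good. I trained for only 10 epochs; if you have the hardware, you can train for 300 epochs or even longer.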
