RealSense D455 depth camera + YOLO V5 for object detection (Part 2)

Published: 2024/3/24

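1. Code source
2. Environment setup
3. Code analysis
- 3.1 Converting detect.py into realsensedetect.py (you can also replace your own detect.py with the file below and run it directly)
- 3.2 Tools for comparing files and folders
4. Reflections and closing remarks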

Link to the first article: RealSense D455 depth camera + YOLO V5 for object detection (Part 1)

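Why a second article on combining the RealSense D455 with YOLO V5? The previous one was written after I found a project on GitHub and got it running. Later I could not get that approach to work with the YOLO V5 code I had cloned myself; something was still missing. After learning from various sources, I applied it to the unmodified YOLO v5 code from GitHub, and detection now works well.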

The result combines a D435 or D455 depth camera with YOLO v5: while objects are being recognized, their distance from the camera is measured as well.

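Why do this at all?

1. Why a RealSense D455 depth camera? It is an ordinary color camera with an added infrared-based depth sensor. Like any 2D camera, it gives us the projection of the 3D world onto the 2D pixel plane, that is, an image. But a projection loses the depth dimension: an apple can appear as large as a football, because we do not know how far each object is from the camera. A depth camera supplies that missing distance.

2. Why YOLO? It offers a good balance of real-time speed and accuracy, which makes it usable in industrial and agricultural production.

That is why combining the two is worthwhile.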
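The apple-versus-football ambiguity can be made concrete with the pinhole camera model, in which the projected width in pixels is the focal length times the real width divided by the distance. The sketch below is only an illustration: the focal length, object sizes, and distances are assumed values, not measurements from a D455.

```python
def projected_width_px(real_width_m, distance_m, focal_length_px):
    """Width in pixels of an object under the ideal pinhole camera model."""
    return focal_length_px * real_width_m / distance_m


FOCAL_PX = 600.0  # assumed focal length in pixels, for illustration only

# An 8 cm apple at 0.5 m and a 22 cm football at 1.375 m (assumed values).
apple_px = projected_width_px(0.08, 0.5, FOCAL_PX)
football_px = projected_width_px(0.22, 1.375, FOCAL_PX)

# Both objects cover about the same number of pixels, so the image alone
# cannot tell them apart by size; only the depth value resolves it.
print("apple: %.1f px, football: %.1f px" % (apple_px, football_px))
```

With a depth reading for each object, the same formula can be inverted to recover the real width from the pixel width.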

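1. Code source

This is the first time I have put modified code on GitHub, so stars are very welcome. The main change is that detect.py has been rewritten as realsensedetect.py. If you want to use the code, you can git clone it from the repository (in case the link does not carry over, here it is in full: https://github.com/wenyishengkingkong/realsense-D455-YOLOV5.git).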

2. Environment setup

Set up the environment the same way as for YOLO V5, or follow the simple setup described in the previous article.

Then cd into the project folder and run:

python realsensedetect.py

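The script is the rewritten detect.py. When it runs, a window shows the color stream with each detected object boxed and labeled with its class name and its distance in meters.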

3. Code analysis

3.1 Converting detect.py into realsensedetect.py (you can also replace your own detect.py with the file below and run it directly)

import argparse
import os
import shutil
import time
from pathlib import Path

import cv2
import torch
import torch.backends.cudnn as cudnn
from numpy import random
import numpy as np
import pyrealsense2 as rs

from models.experimental import attempt_load
from utils.general import (check_img_size, non_max_suppression, apply_classifier, scale_coords,
                           xyxy2xywh, plot_one_box, strip_optimizer, set_logging)
from utils.torch_utils import select_device, load_classifier, time_synchronized
from utils.datasets import letterbox


def detect(save_img=False):
    out, source, weights, view_img, save_txt, imgsz = \
        opt.save_dir, opt.source, opt.weights, opt.view_img, opt.save_txt, opt.img_size
    webcam = source == '0' or source.startswith(('rtsp://', 'rtmp://', 'http://')) or source.endswith('.txt')

    # Initialize
    set_logging()
    device = select_device(opt.device)
    if os.path.exists(out):  # output dir
        shutil.rmtree(out)  # delete dir
    os.makedirs(out)  # make new dir
    half = device.type != 'cpu'  # half precision only supported on CUDA

    # Load model
    model = attempt_load(weights, map_location=device)  # load FP32 model
    imgsz = check_img_size(imgsz, s=model.stride.max())  # check img_size
    if half:
        model.half()  # to FP16

    # Set Dataloader
    vid_path, vid_writer = None, None
    view_img = True
    cudnn.benchmark = True  # set True to speed up constant image size inference
    # dataset = LoadStreams(source, img_size=imgsz)

    # Get names and colors
    names = model.module.names if hasattr(model, 'module') else model.names
    colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(len(names))]

    # Run inference
    t0 = time.time()
    img = torch.zeros((1, 3, imgsz, imgsz), device=device)  # init img
    _ = model(img.half() if half else img) if device.type != 'cpu' else None  # run once

    pipeline = rs.pipeline()
    # Create the config object
    config = rs.config()
    # config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 60)
    config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 60)

    # Start streaming
    pipeline.start(config)
    align_to_color = rs.align(rs.stream.color)
    try:
        while True:
            start = time.time()
            # Wait for a coherent pair of frames: depth and color
            frames = pipeline.wait_for_frames()
            frames = align_to_color.process(frames)  # align depth to the color image
            depth_frame = frames.get_depth_frame()
            color_frame = frames.get_color_frame()
            color_image = np.asanyarray(color_frame.get_data())
            depth_image = np.asanyarray(depth_frame.get_data())
            mask = np.zeros([color_image.shape[0], color_image.shape[1]], dtype=np.uint8)
            mask[0:480, 320:640] = 255

            # Wrap the single color frame the way LoadStreams would
            sources = [source]
            imgs = [None]
            path = sources
            imgs[0] = color_image
            im0s = imgs.copy()
            img = [letterbox(x, new_shape=imgsz)[0] for x in im0s]
            img = np.stack(img, 0)
            img = img[:, :, :, ::-1].transpose(0, 3, 1, 2)  # BGR to RGB, to 3x416x416, uint8 to float32
            img = np.ascontiguousarray(img, dtype=np.float16 if half else np.float32)
            img /= 255.0  # 0 - 255 to 0.0 - 1.0

            # Get detections
            img = torch.from_numpy(img).to(device)
            if img.ndimension() == 3:
                img = img.unsqueeze(0)
            t1 = time_synchronized()
            pred = model(img, augment=opt.augment)[0]

            # Apply NMS
            pred = non_max_suppression(pred, opt.conf_thres, opt.iou_thres, classes=opt.classes,
                                       agnostic=opt.agnostic_nms)
            t2 = time_synchronized()

            for i, det in enumerate(pred):  # detections per image
                p, s, im0 = path[i], '%g: ' % i, im0s[i].copy()
                s += '%gx%g ' % img.shape[2:]  # print string
                gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
                if det is not None and len(det):
                    # Rescale boxes from img_size to im0 size
                    det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()

                    # Print results
                    for c in det[:, -1].unique():
                        n = (det[:, -1] == c).sum()  # detections per class
                        s += '%g %ss, ' % (n, names[int(c)])  # add to string

                    # Write results
                    for *xyxy, conf, cls in reversed(det):
                        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                        line = (cls, conf, *xywh) if opt.save_conf else (cls, *xywh)  # label format

                        distance_list = []
                        # Center pixel of the box: average of the top-left and bottom-right corners
                        mid_pos = [int((int(xyxy[0]) + int(xyxy[2])) / 2), int((int(xyxy[1]) + int(xyxy[3])) / 2)]
                        # Shorter box side, used to bound the depth sampling range
                        min_val = min(abs(int(xyxy[2]) - int(xyxy[0])), abs(int(xyxy[3]) - int(xyxy[1])))
                        radius = min_val // 4
                        height, width = color_image.shape[:2]
                        randnum = 40
                        for _ in range(randnum):  # "_" avoids shadowing the outer loop index i
                            # numpy's randint excludes the upper bound; +1 also avoids an empty range
                            bias = random.randint(-radius, radius + 1)
                            # Clamp to the frame so get_distance is never called out of range
                            x = min(max(mid_pos[0] + bias, 0), width - 1)
                            y = min(max(mid_pos[1] + bias, 0), height - 1)
                            dist = depth_frame.get_distance(int(x), int(y))
                            if dist:  # 0 means no valid depth at this pixel
                                distance_list.append(dist)
                        # Sort the valid samples and keep the middle half (a trimmed mean)
                        distance_list = np.sort(np.array(distance_list))
                        quarter = len(distance_list) // 4
                        distance_list = distance_list[quarter:len(distance_list) - quarter]
                        if len(distance_list):
                            label = '%s %.2f%s' % (names[int(cls)], np.mean(distance_list), 'm')
                        else:
                            label = names[int(cls)]  # no valid depth reading inside the box
                        plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)

                # Print time (inference + NMS)
                print('%sDone. (%.3fs)' % (s, t2 - t1))

                # Stream results
                if view_img:
                    cv2.imshow(p, im0)
                    if cv2.waitKey(1) == ord('q'):  # q to quit
                        raise StopIteration
    except StopIteration:
        pass  # the user pressed q
    finally:
        pipeline.stop()  # release the camera
        cv2.destroyAllWindows()

    print('Done. (%.3fs)' % (time.time() - t0))


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default='yolov5m.pt', help='model.pt path(s)')
    parser.add_argument('--source', type=str, default='inference/images', help='source')  # file/folder, 0 for webcam
    parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--view-img', action='store_true', help='display results')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-dir', type=str, default='inference/output', help='directory to save results')
    parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
    parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--update', action='store_true', help='update all models')
    opt = parser.parse_args()
    print(opt)

    with torch.no_grad():  # context manager: nothing inside it tracks gradients
        detect()
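The distance step in the listing is the part that is new relative to detect.py: it samples depth at random points near the center of each box, discards invalid readings, sorts the rest, and averages the middle half. The following is a minimal standalone sketch of that idea, not the repository's code. The name robust_distance and the get_distance callable are hypothetical; the callable stands in for the depth frame's get_distance(x, y), which returns 0 where there is no depth data. Unlike the original, this variant trims by the number of valid samples.

```python
import numpy as np


def robust_distance(get_distance, box, n_samples=40, rng=None):
    """Trimmed-mean distance (meters) for a box given as (x1, y1, x2, y2).

    get_distance is any callable (x, y) -> meters that returns 0 for
    pixels without valid depth. Returns None if no sample is valid.
    """
    if rng is None:
        rng = np.random.default_rng()
    x1, y1, x2, y2 = (int(v) for v in box)
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    # Sample within a quarter of the shorter side around the center.
    radius = min(abs(x2 - x1), abs(y2 - y1)) // 4

    samples = []
    for _ in range(n_samples):
        # Same offset on both axes, i.e. samples lie on the box diagonal.
        bias = int(rng.integers(-radius, radius + 1))
        d = get_distance(cx + bias, cy + bias)
        if d > 0:
            samples.append(d)
    if not samples:
        return None

    samples = np.sort(np.array(samples))
    k = len(samples) // 4  # drop the lowest and highest quarter
    return float(samples[k:len(samples) - k].mean())


def fake_depth(x, y):
    """Synthetic depth: a flat wall at 2 m with some dead pixel columns."""
    return 0.0 if x % 7 == 0 else 2.0


estimate = robust_distance(fake_depth, (100, 100, 200, 180),
                           rng=np.random.default_rng(0))
print(estimate)
```

Trimming both tails keeps a few background or foreground pixels inside the box from pulling the estimate, which a plain mean would not do.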

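By now the amount of code may be giving you a headache. In fact only a few lines were changed; the rest is mostly reordering and moving things around. If comparing by hand seems tedious, two tools can help you compare the files. (Note: the code above is based on release v3.1 of YOLO V5. I expect other releases to work as well, although helper names and import paths under utils differ between releases, so the imports may need adjusting. I have not tried other object detection algorithms, but the same approach should carry over.)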

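3.2 Tools for comparing files and folders

The first is PyCharm, which works on both Windows and Ubuntu: select the files or folders, right-click, and choose the Compare With option to see the differences. Comparing realsensedetect.py above with detect.py shows exactly how much was changed. The second, on Windows only, is Diffinity, which should work well too.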
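If neither tool is at hand, the same comparison can be sketched with Python's standard difflib module. This is an alternative to the tools named above, not something the original post uses, and diff_files is a hypothetical helper name.

```python
import difflib
from pathlib import Path


def diff_files(old_path, new_path):
    """Return the unified diff between two text files as a list of lines."""
    old = Path(old_path).read_text(encoding="utf-8").splitlines(keepends=True)
    new = Path(new_path).read_text(encoding="utf-8").splitlines(keepends=True)
    return list(difflib.unified_diff(old, new,
                                     fromfile=str(old_path),
                                     tofile=str(new_path)))


# Run from the project folder; nothing happens if the files are absent.
if Path("detect.py").exists() and Path("realsensedetect.py").exists():
    print("".join(diff_files("detect.py", "realsensedetect.py")))
```

Lines starting with "-" exist only in detect.py and lines starting with "+" only in realsensedetect.py.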

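4. Reflections and closing remarks

Why use a RealSense depth camera? As the previous article explained, it adds one more dimension: distance. What is that extra dimension good for? The first use is social distance monitoring. If the detector finds a person who is not wearing a mask, you also know how far that person is from the camera, so you can remind them to put the mask on before they reach a crowded entrance, reducing the risk of cross-infection. That is one practical example. The second and main use is 3D reconstruction: given an object's 2D pixel coordinates and its distance values, the object can be rebuilt in three dimensions through 3D reconstruction or mathematical modeling, which matters a great deal. Finally, the same information can be used for 3D modeling and, with the PCL library, for more accurate distance calculation in real-world applications.

This is the first time I have published my own code on GitHub, and I hope it helps you. If you are interested in my work, feel free to follow me; maybe one day it will be useful to you.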
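The step from "2D pixel plus distance" to a 3D point can be sketched with the ideal pinhole model. The intrinsics below (FX, FY, CX, CY) are assumed values for a 640x480 image, not the calibration of a real D455, and lens distortion is ignored. In practice the intrinsics come from the camera's stream profile, and pyrealsense2 provides rs2_deproject_pixel_to_point for this conversion.

```python
import math


def deproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Map pixel (u, v) with depth along the optical axis to camera-frame XYZ."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Assumed intrinsics for illustration only.
FX, FY = 600.0, 600.0   # focal lengths in pixels
CX, CY = 320.0, 240.0   # principal point

center_point = deproject_pixel(320, 240, 2.0, FX, FY, CX, CY)
edge_point = deproject_pixel(620, 240, 2.0, FX, FY, CX, CY)

# The depth value is measured along the optical axis, so the straight-line
# range to an off-center point is larger than its depth.
edge_range = math.sqrt(sum(c * c for c in edge_point))
print(center_point, edge_point, round(edge_range, 3))
```

Applying this to every pixel of an aligned depth frame yields a point cloud, which is the usual input for reconstruction or for processing with a library such as PCL.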
