MindX SDK + YOLOv5 Deployment, with Python Image and Video Inference
1. Introduction
This article builds on the MindX SDK + PyTorch YOLOv5 application case on Huawei Cloud:
https://bbs.huaweicloud.com/forum/thread-118598-1-1.html
The original post uses a pretrained yolov5s.onnx model and runs image inference in C++. Its Python implementation is incomplete, so this article implements both image and video inference in Python.
The overall workflow:
1. Base environment: Atlas 800-3000, mxManufacture, Ascend-CANN-toolkit, Ascend Driver
2. Model conversion: PyTorch model to ONNX model (yolov5s.pt -> yolov5s.onnx)
3. ONNX model simplification, then ONNX to OM model conversion
4. Pipeline (business flow) orchestration and configuration
5. Python inference code development (image and video)
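Steps 2 and 3 above can be sketched as shell commands. This is a hedged sketch, not the post's exact procedure: the export script and flags follow the public yolov5 repo, the simplifier is the onnx-simplifier package, and ATC's output name and `--soc_version` value depend on your environment.

```shell
# Export yolov5s.pt to ONNX using the yolov5 repo's export script.
python export.py --weights yolov5s.pt --include onnx --opset 11

# Simplify the ONNX graph with onnx-simplifier.
python -m onnxsim yolov5s.onnx yolov5s_sim.onnx

# Convert ONNX to an OM model with ATC (--framework=5 selects ONNX);
# set --soc_version to match your Ascend chip.
atc --model=yolov5s_sim.onnx --framework=5 --output=yolov5s \
    --input_format=NCHW --soc_version=Ascend310
```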
2. Image inference workflow
Step 1: Initialize the stream manager.

```python
# Imports for the MindX SDK Python API.
from StreamManagerApi import StreamManagerApi, MxDataInput

streamManagerApi = StreamManagerApi()
ret = streamManagerApi.InitManager()
if ret != 0:
    print("Failed to init Stream manager, ret=%s" % str(ret))
    exit()

# Load the pipeline description and create the streams it defines.
with open("../pipeline/yolov5x_example.pipeline", 'rb') as f:
    pipelineStr = f.read()
ret = streamManagerApi.CreateMultipleStreams(pipelineStr)
if ret != 0:
    print("Failed to create Stream, ret=%s" % str(ret))
    exit()
```

Step 2: Load the image and run inference.
```python
import json
import cv2

dataPath = "dog.jpg"
savePath = "dog_result.jpg"

# Read the image as raw bytes and wrap it in an SDK data buffer.
dataInput = MxDataInput()
with open(dataPath, 'rb') as f:
    dataInput.data = f.read()

# Send the buffer to the stream named in the pipeline file.
streamName = b'classification+detection'
inPluginId = 0
uniqueId = streamManagerApi.SendDataWithUniqueId(streamName, inPluginId, dataInput)
if uniqueId < 0:
    print("Failed to send data to stream.")
    exit()

# Block for up to 3000 ms waiting for the matching result.
inferResult = streamManagerApi.GetResultWithUniqueId(streamName, uniqueId, 3000)
if inferResult.errorCode != 0:
    print("GetResultWithUniqueId error. errorCode=%d, errorMsg=%s" % (
        inferResult.errorCode, inferResult.data.decode()))
    exit()
```

Step 3: Parse the inference result to obtain the detection coordinates and confidence, then draw them on the image. The JSON result is parsed into a dictionary to get the two corner points (x0, y0) and (x1, y1) of each detection box plus its confidence; OpenCV loads the image and draws the boxes and confidence labels.
```python
# Decode the JSON result and draw each detection on the image.
infer_results = inferResult.data.decode()
temp_dic = json.loads(infer_results)
img = cv2.imread(dataPath)
if 'MxpiObject' in temp_dic.keys():
    for i in range(len(temp_dic["MxpiObject"])):
        name = temp_dic["MxpiObject"][i]["classVec"][0]["className"]
        confidence = temp_dic["MxpiObject"][i]["classVec"][0]["confidence"]
        text = name + ":" + str(confidence)
        x0 = int(temp_dic["MxpiObject"][i]["x0"])
        y0 = int(temp_dic["MxpiObject"][i]["y0"])
        x1 = int(temp_dic["MxpiObject"][i]["x1"])
        y1 = int(temp_dic["MxpiObject"][i]["y1"])
        img = cv2.rectangle(img, (x0, y0), (x1, y1), (0, 255, 0), 2)
        cv2.putText(img, text, (x0, y0 + 20), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    cv2.imwrite(savePath, img)
else:
    cv2.putText(img, 'No object detected!', (0, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    cv2.imwrite(savePath, img)

# Destroy the streams once inference is done.
streamManagerApi.DestroyAllStreams()
```

Result:
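The parsing above can be isolated into a small helper. The sample payload below is hypothetical (the field names follow the code above, but the values are invented), so this is a sketch of the result schema rather than real plugin output:

```python
import json

# Hypothetical sample of the plugin's JSON output; values are made up.
sample = '''
{"MxpiObject": [
  {"x0": 132.4, "y0": 218.9, "x1": 312.1, "y1": 542.7,
   "classVec": [{"className": "dog", "confidence": 0.92}]}
]}
'''

def parse_detections(payload):
    """Return a list of (label, confidence, (x0, y0, x1, y1)) tuples."""
    result = json.loads(payload)
    detections = []
    for obj in result.get("MxpiObject", []):
        cls = obj["classVec"][0]
        box = tuple(int(obj[k]) for k in ("x0", "y0", "x1", "y1"))
        detections.append((cls["className"], cls["confidence"], box))
    return detections

print(parse_detections(sample))
# → [('dog', 0.92, (132, 218, 312, 542))]
```

Using `.get("MxpiObject", [])` also covers the no-detection case, so the `else` branch above only needs to handle the drawing.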
3. Video inference
The test video here is in MP4 format, so OpenCV is used to decode it; each frame is extracted as an image and then run through the same inference path, which makes video inference essentially the same as image inference. Alternatively, the MP4 could be converted to an H.264 stream, since Ascend hardware can decode H.264 and H.265 directly.
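For the H.264 route, extracting a raw elementary stream from an MP4 container can be done with ffmpeg. This is a hedged sketch assuming the MP4's video track is already H.264-encoded (the `h264_mp4toannexb` bitstream filter only repackages, it does not transcode); the filenames are placeholders:

```shell
# Copy the H.264 track out of the MP4 container as an Annex-B raw stream.
ffmpeg -i test.mp4 -c:v copy -bsf:v h264_mp4toannexb -f h264 test.264
```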
The implementation:
```python
videoCapture = cv2.VideoCapture(videoPath)
# Get the video frame rate, width, and height.
fps = videoCapture.get(cv2.CAP_PROP_FPS)
width = videoCapture.get(cv2.CAP_PROP_FRAME_WIDTH)
height = videoCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)

# The model takes 640x640 input; these factors map model-space
# boxes back to the original frame size.
model_width = 640
model_height = 640
x_scale = width / model_width
y_scale = height / model_height  # fixed: the original divided by model_width

size = (int(width), int(height))
videoWriter = cv2.VideoWriter(savePath, cv2.VideoWriter_fourcc('X', 'V', 'I', 'D'), fps, size)

count = 0
success, frame = videoCapture.read()
while success:
    # Resize the frame to the model input size and re-encode it as JPEG,
    # since the pipeline expects encoded image bytes.
    img_temp = 'temp.jpg'
    img = cv2.resize(frame, (model_width, model_height), interpolation=cv2.INTER_LINEAR)
    cv2.imwrite(img_temp, img)

    dataInput = MxDataInput()
    with open(img_temp, 'rb') as f:
        dataInput.data = f.read()
    streamName = b'classification+detection'
    inPluginId = 0
    uniqueId = streamManagerApi.SendDataWithUniqueId(streamName, inPluginId, dataInput)
    if uniqueId < 0:
        print("Failed to send data to stream.")
        exit()

    # Obtain the inference result by specifying streamName and uniqueId.
    inferResult = streamManagerApi.GetResultWithUniqueId(streamName, uniqueId, 3000)
    if inferResult.errorCode != 0:
        print("GetResultWithUniqueId error. errorCode=%d, errorMsg=%s" % (
            inferResult.errorCode, inferResult.data.decode()))
        exit()

    infer_results = inferResult.data.decode()
    temp_dic = json.loads(infer_results)
    if 'MxpiObject' in temp_dic.keys():
        for i in range(len(temp_dic["MxpiObject"])):
            name = temp_dic["MxpiObject"][i]["classVec"][0]["className"]
            confidence = temp_dic["MxpiObject"][i]["classVec"][0]["confidence"]
            text = name + ":" + str(confidence)
            # Scale box corners back to the original frame resolution.
            x0 = int(x_scale * temp_dic["MxpiObject"][i]["x0"])
            y0 = int(y_scale * temp_dic["MxpiObject"][i]["y0"])
            x1 = int(x_scale * temp_dic["MxpiObject"][i]["x1"])
            y1 = int(y_scale * temp_dic["MxpiObject"][i]["y1"])
            cv2.rectangle(frame, (x0, y0), (x1, y1), (0, 255, 0), 2)
            cv2.putText(frame, text, (x0, y0 + 20), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    videoWriter.write(frame)
    count += 1
    print(count)
    success, frame = videoCapture.read()

# Destroy the streams once all frames are processed.
streamManagerApi.DestroyAllStreams()
```

Video inference result:
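The coordinate mapping in the video loop can be factored into a small helper. A minimal sketch (the function name is mine, not part of the SDK):

```python
def scale_box(box, frame_w, frame_h, model_w=640, model_h=640):
    """Map a box from model input space (model_w x model_h) back to
    the original frame resolution."""
    x0, y0, x1, y1 = box
    x_scale = frame_w / model_w
    y_scale = frame_h / model_h
    return (int(x0 * x_scale), int(y0 * y_scale),
            int(x1 * x_scale), int(y1 * y_scale))

# A 1920x1080 frame resized to 640x640: a box at model coordinates
# (64, 64, 320, 320) maps back to (192, 108, 960, 540).
print(scale_box((64, 64, 320, 320), 1920, 1080))
```

Keeping the scale factors per-axis matters here because the resize to 640x640 does not preserve the aspect ratio, so x and y stretch by different amounts.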
Summary

This article walked through deploying YOLOv5 with the MindX SDK on Ascend hardware: converting the PyTorch model to ONNX and then to OM, orchestrating and configuring the pipeline, and implementing both image and video inference in Python.