Converting a .pth file to a .weight file for YOLO

Published: 2023/12/16


From a .pth file to a .weight file

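Task description

First, .pth is the common save format for models trained with PyTorch, while .weight is the extension darknet uses for saving and loading models. Converting .pth to .weight lets a model trained in PyTorch be used in darknet, for example as a pretrained model or directly for detection. To do this, we first need to get a few things straight:

How to pull the parameters out of the .pth file correctly, and how to write them in the way the .weight file requires.
How the weight file is laid out: which part is the header, which part holds the network parameters, and what the header contains.
How the network parameters are laid out: which modules have parameters, in what order the parameters within one module are stored, in what order different modules are stored, and so on.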

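Background material

torch.load can parse a .pth file and return the stored parameters as key-value pairs, so the weights of any layer can be fetched directly and converted however you like.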

net = torch.load(src_file,map_location=torch.device('cpu'))

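The result is a collection of key-value pairs mapping parameter names to their tensors.

One resource explains how the header of a .weight file must be written for darknet to read it (link).
Another points the way for the conversion itself: whose parameters make up the parameter section of the .weight file, and in what order each module is written (link).
My guess is that darknet reads the weights by following the config file, so the .weight file should be written in the same order as the config.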
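
As a rough sketch of the header the first resource describes: in the darknet source I am familiar with, the file starts with three 32-bit integers (major, minor, revision) followed by a `seen` counter, which is read as a 64-bit value when major * 10 + minor >= 2. That layout is an assumption that should be checked against the darknet version actually in use, and the helper names `pack_header` and `unpack_header` are made up for illustration.

```python
import struct

# Assumed layout (little-endian, no padding):
# int32 major, int32 minor, int32 revision, uint64 seen  -> 20 bytes
HEADER_FMT = "<3iQ"
HEADER_SIZE = struct.calcsize(HEADER_FMT)


def pack_header(major, minor, revision, seen):
    """Return the header bytes for the assumed layout (hypothetical helper)."""
    return struct.pack(HEADER_FMT, major, minor, revision, seen)


def unpack_header(raw):
    """Parse the leading header bytes of a weight file (hypothetical helper)."""
    major, minor, revision, seen = struct.unpack(HEADER_FMT, raw[:HEADER_SIZE])
    return {"major": major, "minor": minor, "revision": revision, "seen": seen}


header = pack_header(0, 2, 0, 32013312)
parsed = unpack_header(header)
```

Under this layout, the five int32 values [0, 2, 0, 32013312, 0] are byte-for-byte the same as version 0.2.0 with seen = 32013312, because the last two int32 values form the low and high halves of the 64-bit counter.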

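Code and notes

The code here converts official_yolov3_weights_pytorch.pth, a set of weights from a model trained with the YOLOv3 algorithm in PyTorch. I also ran torch.load on it to make sure the parameters line up with darknet's yolov3 config file, so the converted weights can easily be verified in darknet.

The point of this experiment was to verify that existing methods import convolution kernels and other weights in the right order (that is, how exactly an NCHW tensor is flattened into a one-dimensional vector). I could not find any explicit description of how convolution kernel weights are stored in a .weight file, and I was worried that the existing conversion code might be off, so I wanted to check. The verification is fairly crude: take a well-trained .pth file, convert it to a .weight file, and run a test in darknet. If the detections look good, the conversion code is fine.

The code is posted below. It only converts YOLOv3 weights, but the idea carries over: for ResNet or another architecture, you only need to order the loads to match the corresponding darknet config.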
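
To adapt the idea to another network, it helps to list the convolutional layers in the order the config declares them. The sketch below is a minimal hand-written parser, not darknet's own. It assumes the usual cfg shape of `[section]` headers followed by `key=value` lines, and the function name `conv_layers_in_cfg` and the sample text are hypothetical.

```python
def conv_layers_in_cfg(cfg_text):
    """Return (filters, size, has_bn) per [convolutional] section, in file order."""
    sections = []
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines
        if not line or line[0] in "#;":
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return [
        (int(opts["filters"]), int(opts["size"]), opts.get("batch_normalize", "0") == "1")
        for name, opts in sections
        if name == "convolutional"
    ]


# Made-up fragment for illustration, not the real yolov3.cfg
sample_cfg = """
[net]
batch=1

[convolutional]
batch_normalize=1
filters=32
size=3

[shortcut]
from=-3

[convolutional]
filters=255
size=1
"""

layers = conv_layers_in_cfg(sample_cfg)
```

Each tuple then says which parameter group to write next: a conv with bn takes the five-array group, a conv without bn takes bias and weight only, and sections such as shortcut contribute nothing.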

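The key points are:
1. torch.load() can parse the parameter key-value pairs stored in a .pth file. They are stored in order, which to some extent reflects the network structure.
2. For a conv with bn in the config, the parameters are written to the .weight file in this order:
'bn1.bias', 'bn1.weight', 'bn1.running_mean', 'bn1.running_var', 'conv1.weight'
3. For a conv without bn in the config, the parameters are written to the .weight file in this order:
'conv.bias', 'conv.weight'
4. A convolution kernel has size NxCxHxW. After indexing conv_weight out of the .pth, numpy's tofile() flattens it into a one-dimensional vector in exactly the order the .weight file expects, like this: conv_weight.data.cpu().numpy().tofile(fp)
5. darknet imports weights in the order the layers appear in the config file; layers such as route and shortcut do not affect the import order.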
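
Points 2 and 4 can be illustrated with plain NumPy arrays standing in for the tensors. The helper `write_conv_bn` and the toy shapes are hypothetical. The sketch only shows that `tofile()` on a C-contiguous NxCxHxW array writes the elements with W varying fastest, and that the five arrays end up back to back in the stated order.

```python
import os
import tempfile

import numpy as np


def write_conv_bn(fp, bn_bias, bn_weight, running_mean, running_var, conv_weight):
    """Write one conv+bn block in the order listed above (hypothetical helper)."""
    for arr in (bn_bias, bn_weight, running_mean, running_var, conv_weight):
        np.ascontiguousarray(arr, dtype=np.float32).tofile(fp)


# Toy shapes: 2 output channels, 3 input channels, 2x2 kernel
n, c, h, w = 2, 3, 2, 2
conv_weight = np.arange(n * c * h * w, dtype=np.float32).reshape(n, c, h, w)
bn_bias = np.full(n, 0.5, dtype=np.float32)
bn_weight = np.ones(n, dtype=np.float32)
running_mean = np.zeros(n, dtype=np.float32)
running_var = np.ones(n, dtype=np.float32)

path = os.path.join(tempfile.mkdtemp(), "toy.weight")
with open(path, "wb") as fp:
    write_conv_bn(fp, bn_bias, bn_weight, running_mean, running_var, conv_weight)

# Read everything back as one flat float32 vector
flat = np.fromfile(path, dtype=np.float32)
kernel = flat[4 * n:]  # the four bn arrays come first, n values each

# Element (i, j, k, l) of the kernel sits at offset ((i*c + j)*h + k)*w + l
i, j, k, l = 1, 2, 0, 1
offset = ((i * c + j) * h + k) * w + l
```

Here `kernel[offset]` equals `conv_weight[i, j, k, l]`, which is the row-major flattening the conversion relies on. A tensor that is not contiguous (for example after a transpose) is written in that same logical order too, but only because NumPy makes a contiguous copy first.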

import numpy as np
import torch

# Paths of the source (.pth) and destination (.weight) files
src_file = '/disk2/pretrained_model/official_yolov3_weights_pytorch.pth'
dst_file = '/disk2/pretrained_model/yolov3.weight'

# ---------------------- structure of yolov3 ----------------------
# backbone part (the trailing number is the residual block count)
backbone = ['module.backbone.',
            'module.backbone.layer1.', 'module.backbone.layer1.residual',  # 1
            'module.backbone.layer2.', 'module.backbone.layer2.residual',  # 2
            'module.backbone.layer3.', 'module.backbone.layer3.residual',  # 8
            'module.backbone.layer4.', 'module.backbone.layer4.residual',  # 8
            'module.backbone.layer5.', 'module.backbone.layer5.residual']  # 4
num_of_residual = {'layer1': 1, 'layer2': 2, 'layer3': 8, 'layer4': 8, 'layer5': 4}
ds_convbn = ['ds_bn.bias', 'ds_bn.weight', 'ds_bn.running_mean', 'ds_bn.running_var', 'ds_conv.weight']
convbn1 = ['bn1.bias', 'bn1.weight', 'bn1.running_mean', 'bn1.running_var', 'conv1.weight']
convbn2 = ['bn2.bias', 'bn2.weight', 'bn2.running_mean', 'bn2.running_var', 'conv2.weight']

# head part
embeddings = ['module.embedding0.', 'module.embedding1_cbl.', 'module.embedding1.',
              'module.embedding2_cbl.', 'module.embedding2.']
convbn = ['bn.bias', 'bn.weight', 'bn.running_mean', 'bn.running_var', 'conv.weight']
conv = ['conv_out.bias', 'conv_out.weight']

# ---------------------- load the .pth file ----------------------
net = torch.load(src_file, map_location=torch.device('cpu'))


def write_params(fp, prefix, names):
    # Append each named tensor to the file as a flat array, in the given order
    for name in names:
        content = net[prefix + name]
        content.data.cpu().numpy().tofile(fp)


# ---------------------- write the .weight file ----------------------
with open(dst_file, 'wb') as fp:
    # write the header information into the file
    header_info = np.array([0, 2, 0, 32013312, 0], dtype=np.int32)
    header_info.tofile(fp)

    # write the backbone part
    for layer in backbone:
        stage, tail = layer.split('.')[-2:]
        if stage == 'backbone':
            # first conv+bn of the network
            write_params(fp, layer, convbn1)
        elif tail == '':
            # downsample conv+bn at the start of layer1..layer5
            write_params(fp, layer, ds_convbn)
        else:
            # residual blocks: two conv+bn pairs each
            for j in range(num_of_residual[stage]):
                layer_new = layer + '_' + str(j) + '.'
                write_params(fp, layer_new, convbn1)
                write_params(fp, layer_new, convbn2)

    # write the head part
    for embedding in embeddings:
        if embedding.split('_')[-1] == 'cbl.':
            write_params(fp, embedding, convbn)
        else:
            # six conv+bn layers, then the output conv without bn
            for j in range(6):
                write_params(fp, embedding + str(j) + '.', convbn)
            write_params(fp, embedding, conv)
# finished

Testing the converted weights with darknet

./darknet detect cfg/yolov3.cfg pretrain_model/yolov3.weight data/dog.jpg

Running this gives a detection result.
Look at that! The dog could not be more dog! This shows that the converted weights are fine~
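
A cheaper check that can run before the darknet test is to compare the file size with the parameter count the config implies. This is only a sketch: it assumes a 20-byte header and float32 storage, counts four per-channel arrays for a conv with bn and one for a conv without, and the function names and the toy layer list are made up (it is not YOLOv3).

```python
HEADER_BYTES = 20  # assumed: 3 x int32 version fields + one 64-bit 'seen'
FLOAT_BYTES = 4    # float32


def conv_param_count(out_ch, in_ch, kernel, batch_norm):
    """Number of float32 values stored for one convolutional layer."""
    weights = out_ch * in_ch * kernel * kernel
    # With bn: bias, scale, running mean, running var. Without bn: bias only.
    extras = 4 * out_ch if batch_norm else out_ch
    return weights + extras


def expected_file_size(layers):
    """Expected size in bytes of a weight file for the given conv layers."""
    total = sum(conv_param_count(*layer) for layer in layers)
    return HEADER_BYTES + FLOAT_BYTES * total


# (out_channels, in_channels, kernel_size, has_bn), made-up toy network
toy_layers = [(32, 3, 3, True), (255, 32, 1, False)]
size = expected_file_size(toy_layers)
```

Comparing this number with `os.path.getsize(dst_file)` catches missing or extra layers, though it cannot detect values written in the wrong order, so the detection test is still needed.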

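For now, the conversion code has to be written for one specific model. Also, because a .weight file does not store the network structure and can only be loaded together with a config file, I do not yet know how to import everything in one pass when the modules to import are not contiguous or do not start from the beginning. If the goal is only to convert weights for pretraining, that need may not arise.

These are guesses and may not be right. If you know a better way to do the conversion, please share it in the comments so we can learn from each other~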
