A First Look at TVM: Optimizing resnet50 with TVM

Published: 2023/12/10

Testing a TVM-compiled resnet50 on the CPU

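Testing resnet50 on the CPU
- The compiled resnet50 model
- Image pre-processing
- Running the compiled model
- Viewing the output
Auto-tuning resnet50
- Tuning the model (auto-tune)
- Compiling the tuned model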

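Testing resnet50 on the CPU

If you landed directly on this post, you may not know where the compiled model came from; that step is covered in an earlier post. As a quick recap, the compiled model is packed into a tar archive. First, unpack it: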

mkdir model
tar -xvf resnet50-v2-7-tvm.tar -C model
ls model

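You will see three files in the model directory:

mod.so: the model itself, compiled into a C++ shared library that the TVM runtime loads and calls
mod.params: the model's pre-trained parameters
mod.json: a text file describing the Relay computation graph

Your application can load these files directly, and the model can be invoked through the TVM runtime API.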
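Before handing a package to the runtime, it can help to confirm that the archive really contains the three files listed above. The following is a minimal sketch using only the Python standard library; the helper name `missing_package_members` is hypothetical, and the expected file names are taken from the listing above rather than from any guarantee made by tvmc.

```python
import tarfile

# File names as listed in this post; other TVM versions may package differently.
EXPECTED_MEMBERS = {"mod.so", "mod.params", "mod.json"}


def missing_package_members(tar_path):
    """Return the expected file names that are absent from the archive."""
    with tarfile.open(tar_path) as archive:
        # Strip a leading "./" so "./mod.so" and "mod.so" compare equal.
        names = {
            name[2:] if name.startswith("./") else name
            for name in archive.getnames()
        }
    return EXPECTED_MEMBERS - names
```

Calling `missing_package_members("resnet50-v2-7-tvm.tar")` would return an empty set when all three files are present.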

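The compiled resnet50 model

We have compiled the model module; now we need to test it. The test uses the TVM runtime API, which tvmc wraps for us. To run it we need:

The compiled model, fresh off the compiler

An input image

Each model expects a particular input size, data type, and data layout, so an image usually needs pre-processing before inference and post-processing afterwards. tvmc accepts numpy .npz files, which keeps this simple. I am fond of cats, so, as in the TVM tutorial, I use the same cat photo here.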

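Image pre-processing

For resnet50, the image must be in ImageNet format. Below is an example pre-processing script; the post-processing script comes later. The pre-processing script uses the pillow module; if you do not have it, install it with pip3 install pillow.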

#!python ./preprocess.py
from tvm.contrib.download import download_testdata
from PIL import Image
import numpy as np

# If the download fails, just use the image above
img_url = "https://s3.amazonaws.com/model-server/inputs/kitten.jpg"
img_path = download_testdata(img_url, "imagenet_cat.png", module="data")

# Resize it to 224x224
resized_image = Image.open(img_path).resize((224, 224))
img_data = np.asarray(resized_image).astype("float32")

# ONNX expects NCHW input, so convert the array
img_data = np.transpose(img_data, (2, 0, 1))

# Normalize according to ImageNet
imagenet_mean = np.array([0.485, 0.456, 0.406])
imagenet_stddev = np.array([0.229, 0.224, 0.225])
norm_img_data = np.zeros(img_data.shape).astype("float32")
for i in range(img_data.shape[0]):
    norm_img_data[i, :, :] = (img_data[i, :, :] / 255 - imagenet_mean[i]) / imagenet_stddev[i]

# Add batch dimension
img_data = np.expand_dims(norm_img_data, axis=0)

# Save to .npz (outputs imagenet_cat.npz)
np.savez("imagenet_cat", data=img_data)
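To make the layout change and the normalization formula in the script concrete, here is a small pure-Python sketch of the same two steps on a toy image. It is an illustration only, not a replacement for the numpy version: the function name `hwc_to_normalized_chw` and the 1 x 2 toy image are made up for this example, while the mean and standard deviation values are the ones used in the script.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STDDEV = (0.229, 0.224, 0.225)


def hwc_to_normalized_chw(image):
    """Convert an H x W x 3 nested list of 0-255 values to normalized C x H x W."""
    height, width = len(image), len(image[0])
    return [
        [
            [
                # Scale to [0, 1], then subtract the channel mean and divide by its stddev.
                (image[y][x][c] / 255 - IMAGENET_MEAN[c]) / IMAGENET_STDDEV[c]
                for x in range(width)
            ]
            for y in range(height)
        ]
        for c in range(3)
    ]


# A 1 x 2 toy "image": one black pixel and one white pixel.
toy = [[[0, 0, 0], [255, 255, 255]]]
chw = hwc_to_normalized_chw(toy)
print(len(chw), len(chw[0]), len(chw[0][0]))  # prints: 3 1 2 (channels, height, width)
```

The black pixel's red channel becomes (0 - 0.485) / 0.229, roughly -2.118, and the white pixel's becomes (1 - 0.485) / 0.229, roughly 2.249, which is the value range the model sees after normalization.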

Running the compiled model

With the compiled model and the converted image in hand, we can test the model:

python -m tvm.driver.tvmc run \
  --inputs imagenet_cat.npz \
  --output predictions.npz \
  resnet50-v2-7-tvm.tar

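The tar package contains the compiled model's runtime library. tvmc wraps the TVM runtime interface, and after the run it writes the predictions to an .npz file. In this example the model runs on the same machine that compiled it, but you can also run it on a remote device over RPC; see python -m tvm.driver.tvmc run --help for the RPC options.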
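If you want to launch the same run from a Python script, one option is to assemble the command line and pass it to subprocess. The sketch below only builds the argument list; the helper name `build_tvmc_run_command` is hypothetical, and it uses only the flags that appear in this post (`--inputs`, `--output`, `--print-time`, `--repeat`).

```python
import sys


def build_tvmc_run_command(package, inputs, output, print_time=False, repeat=None):
    """Assemble the argv list for `python -m tvm.driver.tvmc run`."""
    command = [
        sys.executable, "-m", "tvm.driver.tvmc", "run",
        "--inputs", inputs,
        "--output", output,
    ]
    if print_time:
        command.append("--print-time")
    if repeat is not None:
        command += ["--repeat", str(repeat)]
    command.append(package)  # the compiled package goes last, as in the shell command
    return command


command = build_tvmc_run_command(
    "resnet50-v2-7-tvm.tar", "imagenet_cat.npz", "predictions.npz"
)
print(" ".join(command))

# To actually execute it (requires TVM to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

Passing a list rather than a single string avoids shell quoting problems when file names contain spaces.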

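Viewing the output

Every model has its own output tensor format. Here we download a label lookup table for resnet50, use it to interpret the output tensor, and print the result. This uses a post-processing script: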

#!python ./postprocess.py
import os.path
import numpy as np

from scipy.special import softmax

from tvm.contrib.download import download_testdata

# Download a list of labels
labels_url = "https://s3.amazonaws.com/onnx-model-zoo/synset.txt"
labels_path = download_testdata(labels_url, "synset.txt", module="data")

with open(labels_path, "r") as f:
    labels = [l.rstrip() for l in f]

output_file = "predictions.npz"

# Open the output and read the output tensor
if os.path.exists(output_file):
    with np.load(output_file) as data:
        scores = softmax(data["output_0"])
        scores = np.squeeze(scores)
        ranks = np.argsort(scores)[::-1]
        for rank in ranks[0:5]:
            print("class='%s' with probability=%f" % (labels[rank], scores[rank]))

Running it gives the following result:

class='n02123045 tabby, tabby cat' with probability=0.610552
class='n02123159 tiger cat' with probability=0.367179
class='n02124075 Egyptian cat' with probability=0.019365
class='n02129604 tiger, Panthera tigris' with probability=0.001273
class='n04040759 radiator' with probability=0.000261

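The top four predictions are all felines (three kinds of cat and a tiger); the fifth, a radiator, has a negligible probability of 0.000261.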
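The post-processing script relies on scipy's softmax and numpy's argsort. To show what those two steps compute, here is a pure-Python sketch of softmax followed by a top-k selection. It is a stand-in for illustration, not the scipy implementation; the function names and the `logits` values are made up for this example.

```python
import math


def softmax(scores):
    """Map raw scores to probabilities that sum to 1."""
    peak = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, k=5):
    """Return (index, probability) pairs for the k largest probabilities."""
    ranked = sorted(
        range(len(probabilities)), key=lambda i: probabilities[i], reverse=True
    )
    return [(i, probabilities[i]) for i in ranked[:k]]


logits = [1.0, 3.0, 0.5, 2.0]  # hypothetical raw scores for four classes
for index, prob in top_k(softmax(logits), k=2):
    print("class index %d with probability=%f" % (index, prob))
```

In the real script, each index would be looked up in the labels list read from synset.txt to get the class name.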

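Auto-tuning resnet50

The model above was compiled with basic settings only, with no tuning for the target platform. tvmc can auto-tune a model for the characteristics of a specific target. We often do not know in advance which optimization strategies work best on a platform; the auto-tune module builds a search space for us and searches it for the best-performing configuration. Tuning here is not the fine-tuning done during model training: it does not change the model's prediction accuracy, only its runtime speed on the target platform. TVM provides several schedule templates and selects the one that performs best on the target, and this tuning can also be driven through tvmc. In its simplest form, tvmc needs us to provide:

The target platform
An output file for the tuning records
The model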

Tuning the model (auto-tune)

The following command runs one tuning session:

python -m tvm.driver.tvmc tune \
  --target "llvm" \
  --output resnet50-v2-7-autotuner_records.json \
  resnet50-v2-7.onnx

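You can also name the target architecture more precisely. For a Skylake CPU, for example, pass --target "llvm -mcpu=skylake", which lets the auto-tune module find operator optimizations better suited to that hardware. The auto-tune module optimizes each operator subgraph in the model; every subgraph has its own schedule search space, and auto-tune looks for the best result in it¹.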

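Running this command on the model prints the tuning results as it goes.

Tuning can take a long time, so tvmc provides options such as --repeat and --number to control how much time is spent measuring.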

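Compiling the tuned model

The tuning results are recorded in resnet50-v2-7-autotuner_records.json. This file can later be used to:

Compile the optimized model, via tvmc compile --tuning-records
Serve as the starting point for further tuning, via tvmc tune --tuning-records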

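The compiler uses the tuning records to generate efficient code for the target platform. Pass them with tvmc compile --tuning-records; see tvmc compile --help for details. With the tuning data collected, we can recompile the model: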

python -m tvm.driver.tvmc compile \
  --target "llvm" \
  --tuning-records resnet50-v2-7-autotuner_records.json \
  --output resnet50-v2-7-tvm_autotuned.tar \
  resnet50-v2-7.onnx

Verify the predictions of the tuned model:

python -m tvm.driver.tvmc run \
  --inputs imagenet_cat.npz \
  --output predictions.npz \
  resnet50-v2-7-tvm_autotuned.tar

python postprocess.py

The output matches the earlier predictions.

tvmc can also report inference time, which makes it easy to compare the two builds:

python -m tvm.driver.tvmc run \
  --inputs imagenet_cat.npz \
  --output predictions.npz \
  --print-time \
  --repeat 100 \
  resnet50-v2-7-tvm_autotuned.tar

# Execution time summary:
# mean (ms)   max (ms)   min (ms)   std (ms)
#   19.19       99.95      16.60       9.33

python -m tvm.driver.tvmc run \
  --inputs imagenet_cat.npz \
  --output predictions.npz \
  --print-time \
  --repeat 100 \
  resnet50-v2-7-tvm.tar

# Execution time summary:
# mean (ms)   max (ms)   min (ms)   std (ms)
#   22.93      150.05      21.02      12.93

On my server, the mean inference time dropped by 3 to 4 ms (from 22.93 ms to 19.19 ms), so tuning did have some effect. Hooray!
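To put the improvement in relative terms, the arithmetic on the two reported means can be sketched as follows. The helper name `speedup_summary` is made up for this example; the inputs are the mean times from the two runs above.

```python
def speedup_summary(baseline_ms, tuned_ms):
    """Return (time saved in ms, percent reduction, speedup ratio)."""
    saved_ms = baseline_ms - tuned_ms
    return saved_ms, saved_ms / baseline_ms * 100, baseline_ms / tuned_ms


saved, percent, ratio = speedup_summary(22.93, 19.19)
print("saved %.2f ms (%.1f%%), speedup %.2fx" % (saved, percent, ratio))
# prints: saved 3.74 ms (16.3%), speedup 1.19x
```

Note that the reported standard deviations (9.33 ms and 12.93 ms) are larger than the 3.74 ms gap between the means, so the minimum times (16.60 ms versus 21.02 ms) are a useful second check on the improvement.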

¹ The search uses the XGBoost grid algorithm, but other algorithms are available; see tvmc tune --help for details.
