[OpenVINO 3] POT Quantization Workflow

Published: 2024/1/8

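Model quantization reduces memory usage and speeds up computation. It is a fairly mature technique and is already widely used.
OpenVINO provides two quantization approaches.
Reference (official documentation): https://docs.openvino.ai/latest/openvino_docs_model_optimization_guide.html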

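Post-training Optimization with POT. Quantizes a model with post-training methods, for example post-training 8-bit quantization, without retraining or fine-tuning the model.
Training-time Optimization with NNCF. Optimizes the model during training, inside the DL framework. For example, it supports quantization-aware training and pruning in PyTorch and TensorFlow.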

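The quantization workflow is as follows:

Train a full-precision model.

Run Model Optimizer or the NNCF module to obtain an IR model or a quantized framework model.

Run POT to quantize the model, or run Model Optimizer to obtain the optimized IR model.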

2. Post-training Optimization Tool

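Advantages:

No need to retrain the model.
Converts a full-precision IR model to the low-precision INT8 data type, which reduces model size and lowers latency.

Drawback: accuracy drops somewhat, and in some cases the drop can be significant.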
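To make the size and accuracy trade-off concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is an illustration of the general idea only, not POT's implementation: the function names and the sample weights are made up for this example, and POT derives its quantization parameters from calibration statistics rather than from a single maximum.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to signed 8-bit codes."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map INT8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Each value now fits in 1 byte instead of 4 (FP32). The price is a rounding
# error of at most half a quantization step per value.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Small values such as 0.003 collapse to code 0, which is where the accuracy loss comes from.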

The POT quantization flow:
Input model -> IR files produced by MO -> run the POT tool (calibration data can be supplied) -> quantized model

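2.1 Obtaining an OpenVINO IR model with the MO tool

IR stands for Intermediate Representation. The generated IR is itself an OpenVINO model and can be FP32 or FP16.
The MO (Model Optimizer) tool ships with OpenVINO and can be run from the command line.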

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    """
    @File    : hello_openvino.py
    @Date    : 2022/9/26
    @Require :
    @Author  : https://blog.csdn.net/hjxu2016
    @Function:
    """
    import os

    import openvino.inference_engine as ie

    print(ie.__version__)

    if __name__ == "__main__":
        file = "F:/PyPro/Classification/weight/res18_5_focal_loss_9954.onnx"
        # Output directory: same location as the ONNX file, with a suffix.
        f = str(file).replace('.onnx', '_openvino_model_fp16' + os.sep)

        # Variant with an explicit input shape:
        # cmd = f"mo --input_model {file} --output_dir {f} --data_type FP16 --log_level NOTSET --input_shape [1,3,224,224]"
        # cmd = f"mo --help "
        cmd = f"mo --input_model {file} --output_dir {f} --data_type FP16 --log_level NOTSET"

        # Run the Model Optimizer and print its output.
        p = os.popen(cmd)
        print(p.read())
        # Alternative: subprocess.check_output(cmd, shell=True)
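As a possible variation on the script above, the mo call can be built as an argument list and run with subprocess.run, which avoids shell quoting problems when paths contain spaces. This is a sketch, not the author's method: the helper names build_mo_command and run_mo and the path "model.onnx" are hypothetical, and only the flags already used in the script above appear here.

```python
import subprocess


def build_mo_command(onnx_path, data_type="FP16"):
    """Return the mo invocation as an argument list plus the output directory."""
    # Same naming rule as the script above: "<model>_openvino_model_fp16/".
    suffix = "_openvino_model_" + data_type.lower()
    output_dir = onnx_path.replace(".onnx", suffix) + "/"
    command = [
        "mo",
        "--input_model", onnx_path,
        "--output_dir", output_dir,
        "--data_type", data_type,
        "--log_level", "NOTSET",
    ]
    return command, output_dir


def run_mo(onnx_path, data_type="FP16"):
    """Run mo and return its stdout; raises CalledProcessError on failure."""
    command, _ = build_mo_command(onnx_path, data_type)
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


# Building the command does not require OpenVINO to be installed.
command, output_dir = build_mo_command("model.onnx")
print(" ".join(command))
```

Calling run_mo("model.onnx") would then execute the conversion, provided mo is on the PATH.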

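2.2 DefaultQuantization and AccuracyAwareQuantization

POT provides two quantization algorithms, DefaultQuantization and AccuracyAwareQuantization.
Quantization can be run from a Python script.
It can also be run through the command-line interface; this post covers only the Python workflow.

The whole process takes roughly three steps:

Prepare the data and the data interface

Set the quantization algorithm parameters

Define and run the quantization process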

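2.2.1 Preparing the data and the data interface

In most cases you need to subclass openvino.tools.pot.DataLoader to supply the data.
Interface documentation: https://docs.openvino.ai/latest/pot_default_quantization_usage.html
The interface fetches samples from a dataset, applies model-specific preprocessing, and supports access by index.

The DataLoader interface has two methods:

__len__(): returns the size of the dataset.
__getitem__(): accesses a sample by index and can also encapsulate model-specific preprocessing logic. It should return data in (data, annotation) format, where data is the input passed to the model at inference time and must therefore be preprocessed appropriately. It can be a numpy.array object or a dictionary whose keys are the model input names and whose values are the numpy.array objects for those inputs. DefaultQuantization does not use the annotation, so in that case it can be None.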

    from abc import ABC, abstractmethod

    from addict import Dict


    class DataLoader(ABC):
        """An abstract class representing a dataset.

        All custom datasets should inherit.
        ``__len__`` provides the size of the dataset and
        ``__getitem__`` supports integer indexing in range from 0 to len(self)
        """

        def __init__(self, config):
            """ Constructor
            :param config: data loader specific config
            """
            self.config = config if isinstance(config, Dict) else Dict(config)

        @abstractmethod
        def __getitem__(self, index):
            pass

        @abstractmethod
        def __len__(self):
            pass
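A concrete loader only has to fill in those two methods. The sketch below uses a small stand-in base class that mirrors the interface above so that it runs without OpenVINO installed; in real use the class would inherit from openvino.tools.pot.DataLoader instead. The class names, the in-memory samples, and the divide-by-255 preprocessing are assumptions made for illustration, and real data would normally be numpy arrays rather than lists.

```python
from abc import ABC, abstractmethod


class DataLoaderBase(ABC):
    """Stand-in for openvino.tools.pot.DataLoader (assumption, for illustration)."""

    def __init__(self, config):
        self.config = dict(config)

    @abstractmethod
    def __getitem__(self, index):
        pass

    @abstractmethod
    def __len__(self):
        pass


class InMemoryDataLoader(DataLoaderBase):
    """Serves pre-loaded samples and applies a simple preprocessing step."""

    def __init__(self, config, samples):
        super().__init__(config)
        self._samples = list(samples)

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(f"index {index} out of range")
        pixels, label = self._samples[index]
        # Model-specific preprocessing: scale 8-bit pixel values to [0, 1].
        data = [p / 255.0 for p in pixels]
        # (data, annotation); the annotation may be None for DefaultQuantization.
        return data, label


# Two hypothetical samples: (pixel values, annotation).
loader = InMemoryDataLoader(
    {"name": "demo"},
    [([0, 255, 51], 1), ([102, 0, 255], None)],
)
data, annotation = loader[0]
print(len(loader), data, annotation)
```

Keeping the preprocessing inside __getitem__ means the calibration data goes through the same transformation the model expects at inference time.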

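2.2.2 Setting the quantization parameters

The DefaultQuantization algorithm has some mandatory and some optional parameters, which are defined as a dictionary.
If you choose AccuracyAwareQuantization, you can set maximal_drop, the maximum allowed accuracy drop. The algorithm then automatically searches for the layers whose quantization costs the most accuracy and leaves those layers unquantized.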

    {
        "name": "DefaultQuantization",  # or "AccuracyAwareQuantization"
        "params": {
            "target_device": "ANY",
            "stat_subset_size": 300,
            "stat_batch_size": 1,
            "maximal_drop": 0.01,
        },
    }

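The default quantization algorithm has three parameters; the fourth one listed, maximal_drop, applies to AccuracyAwareQuantization:

target_device: currently only "ANY" or "CPU" can be selected.
stat_subset_size: the size of the data subset used to compute the activation statistics for quantization. If it is not specified, the whole dataset is used. At least 300 samples are recommended.
stat_batch_size: the batch size used to compute the activation statistics for quantization. If it is not specified, it is 1.
maximal_drop: the maximum accuracy drop allowed.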
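The parameters above can be assembled with a small helper that adds maximal_drop only when AccuracyAwareQuantization is selected. This is a sketch under assumptions: make_algorithms is a hypothetical helper, not part of POT, and it returns a one-element list on the assumption that the pipeline takes a list of algorithm dictionaries, as in the referenced notebook.

```python
ALGORITHM_NAMES = ("DefaultQuantization", "AccuracyAwareQuantization")


def make_algorithms(name="DefaultQuantization", target_device="ANY",
                    stat_subset_size=300, stat_batch_size=1,
                    maximal_drop=0.01):
    """Build the algorithm configuration in the dictionary format shown above."""
    if name not in ALGORITHM_NAMES:
        raise ValueError(f"unknown algorithm: {name}")
    params = {
        "target_device": target_device,
        "stat_subset_size": stat_subset_size,
        "stat_batch_size": stat_batch_size,
    }
    # maximal_drop is only meaningful for the accuracy-aware algorithm.
    if name == "AccuracyAwareQuantization":
        params["maximal_drop"] = maximal_drop
    return [{"name": name, "params": params}]


default_algo = make_algorithms()
aware_algo = make_algorithms("AccuracyAwareQuantization", maximal_drop=0.01)
print(default_algo)
print(aware_algo)
```

Switching algorithms then changes one argument instead of editing the dictionary by hand.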

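2.2.3 Setting the evaluation metric

With DefaultQuantization, the metric can be used to compare accuracy before and after quantization. It is optional in that case and can be set to None.
With AccuracyAwareQuantization the metric must be set, because the algorithm uses it to keep the accuracy drop within the allowed range.
The following example is an IoU metric for segmentation.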

    import numpy as np
    from openvino.tools.pot import Metric


    class Accuracy(Metric):
        """IoU accumulated over all evaluated samples (reported as "accuracy")."""

        def __init__(self):
            super().__init__()
            self._name = "accuracy"
            self._matches = []
            self.intersection = 0.0
            self.union = 0.0

        @property
        def value(self):
            """Returns the running IoU after the last model output."""
            return {self._name: self._matches[-1]}

        @property
        def avg_value(self):
            """Returns the IoU over all model outputs:
            total intersection divided by total union.
            """
            miou = 1.0 * self.intersection / self.union
            print('miou', miou)
            return {self._name: miou}

        def update(self, output, target):
            """Updates the accumulated intersection and union.
            :param output: model output
            :param target: annotations
            """
            # The second model output holds the predicted mask.
            predict = output[1]
            predict = predict[0] > 0.5
            target = target[0] > 0.5
            intersection = np.sum((predict) & (target))
            self.intersection += intersection
            self.union += np.sum(predict) + np.sum(target) - intersection
            self._matches.append([self.intersection / (self.union + 0.00001)])

        def reset(self):
            """Resets the metric. This is a required method that should
            initialize all attributes to their initial value.
            """
            self.intersection = 0
            self.union = 0
            self._matches = []

        def get_attributes(self):
            """Returns a dictionary of metric attributes
            {metric_name: {attribute_name: value}}.
            Required attributes: 'direction': 'higher-better' or 'higher-worse'
                                 'type': metric type
            """
            return {self._name: {"direction": "higher-better", "type": "accuracy"}}
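The arithmetic inside update can be checked by hand on tiny masks. The sketch below repeats the same accumulation in plain Python: threshold at 0.5, count the intersection, and compute the union as |P| + |T| - |P ∩ T|. The helper name update_iou and the sample masks are hypothetical. Note that the result is total intersection over total union across the dataset, which is not the same as averaging per-image IoU values, even though the listing prints it as 'miou'.

```python
def update_iou(state, predict, target, threshold=0.5):
    """Add one sample's intersection and union to the running totals."""
    total_intersection, total_union = state
    p = [v > threshold for v in predict]
    t = [v > threshold for v in target]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    # |P ∪ T| = |P| + |T| - |P ∩ T|
    union = sum(p) + sum(t) - intersection
    return total_intersection + intersection, total_union + union


state = (0, 0)
# Sample 1: prediction hits 2 of the 3 target pixels, no false positives.
state = update_iou(state, [0.9, 0.8, 0.1, 0.2], [1, 1, 1, 0])
# Sample 2: one correct pixel and one false positive.
state = update_iou(state, [0.7, 0.3, 0.6, 0.1], [1, 0, 0, 0])

iou = state[0] / state[1]
print(state, iou)
```

Sample 1 contributes intersection 2 and union 3, sample 2 contributes intersection 1 and union 2, so the accumulated IoU is 3 / 5.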

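2.2.4 Running the quantization

The example is adapted from
https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/301-tensorflow-training-openvino/301-tensorflow-training-openvino-pot.ipynb
There are nine steps in total. The accuracy metric is optional: with DefaultQuantization it can be set to None, or it can be used to compare accuracy before and after quantization.
With AccuracyAwareQuantization the metric is required, because the algorithm uses it to keep the accuracy drop within the allowed range.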

    import copy

    from openvino.tools.pot import (IEEngine, compress_model_weights,
                                    create_pipeline, load_model, save_model)

    # model_config, engine_config, algorithms, ClassificationDataLoader and
    # Accuracy are expected to be defined beforehand (not shown here).
    folder = "F:/DataSet/LyophilizedBall/classification/val/"

    # Step 1: load the model
    model = load_model(model_config)
    original_model = copy.deepcopy(model)
    # print(model)

    # Step 2: initialize the data loader
    data_loader = ClassificationDataLoader(folder)

    # Step 3 (optional): set the metric, which can be used to compare against the original model
    metric = Accuracy()
    # metric = None

    # Step 4: initialize the engine, which computes statistics and metrics from the data
    engine = IEEngine(config=engine_config, data_loader=data_loader, metric=metric)

    # Step 5: create the pipeline of compression algorithms
    pipeline = create_pipeline(algo_config=algorithms, engine=engine)

    # Step 6: run the pipeline
    compressed_model = pipeline.run(model=model)

    # Step 7 (optional): compress the model weights to reduce the size of the final .bin file
    compress_model_weights(model=compressed_model)

    # Step 8 (optional): save the model; returns the paths of the saved model
    compress_model_path = save_model(model=compressed_model, save_path="./models/weight/ptqModel")
    print(compress_model_path)

    # Step 9 (optional): evaluate the original and compressed model and print the results
    original_metric_results = pipeline.evaluate(original_model)
    if original_metric_results:
        print(f"Accuracy of the original model: {next(iter(original_metric_results.values())):.5f}")

    quantized_metric_results = pipeline.evaluate(compressed_model)
    if quantized_metric_results:
        print(f"Accuracy of the quantized model: {next(iter(quantized_metric_results.values())):.5f}")
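The listing above uses model_config and engine_config without showing them. A minimal sketch of how they might be built from the IR path follows. The helper make_configs and the path "ir/model.xml" are hypothetical; the keys model_name, model, weights, and device follow the POT usage documentation as far as I know, and the referenced notebook wraps such dictionaries in addict.Dict, which may be needed depending on the POT version.

```python
from pathlib import PurePosixPath


def make_configs(ir_xml_path, device="CPU"):
    """Derive model and engine configuration dictionaries from an IR .xml path."""
    xml = PurePosixPath(ir_xml_path)
    model_config = {
        "model_name": xml.stem,
        "model": str(xml),
        # The weights file sits next to the .xml with the same base name.
        "weights": str(xml.with_suffix(".bin")),
    }
    engine_config = {"device": device}
    return model_config, engine_config


# Hypothetical IR location, for illustration only.
model_config, engine_config = make_configs("ir/model.xml")
print(model_config)
print(engine_config)
```

Deriving the .bin path from the .xml path keeps the two files from drifting apart when the model is renamed.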
