scikit-learn: Evaluating Classification Models (classification_report)

Published: 2023/11/28

20201225

Exporting the classification report to CSV

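import pandas as pd
from sklearn.metrics import classification_report

# y_test and y_pred are the true and predicted labels from an earlier step
report = classification_report(y_test, y_pred, output_dict=True)
# One row per label (plus the summary rows) after transposing
df = pd.DataFrame(report).transpose()
df.to_csv("result.csv", index=True)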
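
The snippet above assumes that y_test and y_pred already exist. A self-contained sketch might look like the following; the label lists are made up for illustration, and the file is written to a temporary directory rather than the working directory.

```python
import os
import tempfile

import pandas as pd
from sklearn.metrics import classification_report

# Hypothetical labels, standing in for y_test / y_pred from a real model
y_test = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 2, 2, 2, 2]

report = classification_report(y_test, y_pred, output_dict=True)

# The outer dict keys become columns, so transpose to get one row per label
df = pd.DataFrame(report).transpose()

# Write to a temporary directory (an assumption made to keep the sketch tidy)
csv_path = os.path.join(tempfile.mkdtemp(), "result.csv")
df.to_csv(csv_path, index=True)

# Read the file back, using the first column (the label names) as the index
loaded = pd.read_csv(csv_path, index_col=0)
print(loaded)
```

Keeping index=True matters here: the row index holds the label names, so dropping it would leave the CSV rows unlabeled.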

Usage

Parameters

sklearn.metrics.classification_report(y_true, y_pred, labels=None, target_names=None, sample_weight=None, digits=2, output_dict=False)

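y_true: 1-D array, the ground-truth class labels
y_pred: 1-D array, the class labels predicted by the model
labels: list, the labels to include in the report
target_names: list, display names for the labels, in the same order as labels
sample_weight: 1-D array, the weight of each sample in the evaluation
digits: number of decimal places shown in the report; ignored when output_dict=True, in which case the values are returned unrounded
output_dict: if True, the report is returned as a dict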

Return value: a string, or a dict when output_dict=True.
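
To make the parameters concrete, here is a small sketch with hypothetical labels and display names ("cat", "dog", "bird" are invented for illustration). It produces the text form with digits=3 and the dict form with output_dict=True.

```python
from sklearn.metrics import classification_report

# Hypothetical integer labels for a three-class problem
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

# labels selects which classes are reported and in what order;
# target_names gives the display names, matched to labels by position
text_report = classification_report(
    y_true,
    y_pred,
    labels=[0, 1, 2],
    target_names=["cat", "dog", "bird"],
    digits=3,
)
print(text_report)

# With output_dict=True the values come back as unrounded floats
dict_report = classification_report(
    y_true,
    y_pred,
    labels=[0, 1, 2],
    target_names=["cat", "dog", "bird"],
    output_dict=True,
)
print(dict_report["dog"]["precision"])
```

In the dict form, the per-label entries are keyed by the display names when target_names is given.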

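The report contains the precision, recall and F1-score of each class label.

Precision: of all samples predicted as positive, the fraction that really are positive, TP / (TP+FP)
Recall: of all samples that really are positive, the fraction predicted as positive, TP / (TP+FN)
F1-score: the harmonic mean of precision and recall, 2 * precision * recall / (precision + recall)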
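
The three formulas can be checked by hand. The sketch below is plain Python with a hypothetical helper, precision_recall_f1, and made-up labels; returning 0.0 when a denominator is zero is an assumption of this sketch.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 for one label treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Fall back to 0.0 when a denominator is zero (assumption of this sketch)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical binary labels: TP = 3, FP = 1, FN = 1
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive=1)
print(precision, recall, f1)  # 0.75 0.75 0.75
```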

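The report also gives the overall micro average, macro average and weighted average.

Micro average: computed from the TP, FP and FN counts pooled over all samples and labels
Macro average: the unweighted mean of the per-label results
Weighted average: the mean of the per-label results, weighted by each label's support (its number of true samples)

For single-label classification with every label included, the micro average equals the accuracy, and newer scikit-learn versions print that row as accuracy.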
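
As a rough illustration of how the three averages differ, the sketch below uses the average parameter of precision_score on made-up, imbalanced labels, and recomputes the macro and weighted values by hand from the per-label scores.

```python
from sklearn.metrics import precision_score

# Hypothetical three-class labels with supports 6, 3 and 1
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 0, 1, 2, 1, 1, 0, 2]

# Per-label precision: 4/5, 2/3 and 1/2
per_label = precision_score(y_true, y_pred, average=None)

micro = precision_score(y_true, y_pred, average="micro")
macro = precision_score(y_true, y_pred, average="macro")
weighted = precision_score(y_true, y_pred, average="weighted")

# Recompute macro and weighted averages from the per-label values
support = [y_true.count(label) for label in (0, 1, 2)]
manual_macro = sum(per_label) / len(per_label)
manual_weighted = sum(p * s for p, s in zip(per_label, support)) / sum(support)

print(f"micro={micro:.3f} macro={macro:.3f} weighted={weighted:.3f}")
```

With these labels the micro average is 0.70, the macro average is about 0.656 and the weighted average is 0.73; the gap comes from label 0 having most of the samples.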

In binary classification, the recall of the positive label is called sensitivity, and the recall of the negative label is called specificity.
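
One way to obtain both numbers, sketched here with made-up binary labels, is to call recall_score twice and switch pos_label between the positive and the negative class.

```python
from sklearn.metrics import recall_score

# Hypothetical binary labels: 4 positives (1) and 6 negatives (0)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

# Sensitivity: recall of the positive label, 3 of 4 positives found
sensitivity = recall_score(y_true, y_pred, pos_label=1)

# Specificity: recall of the negative label, 4 of 6 negatives found
specificity = recall_score(y_true, y_pred, pos_label=0)

print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```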

Evaluating a random forest on the iris dataset


from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Convert the labels [0, 1, 2] to the names ['setosa' 'versicolor' 'virginica']
y_labels = iris.target_names[y]

# Split the dataset into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y_labels, test_size=0.2)

# Train the model on the training set
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)

# Predict on the test set
y_pred = clf.predict(X_test)

# Generate the report as text
print(classification_report(y_test, y_pred))
"""
              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        10
  versicolor       0.83      1.00      0.91        10
   virginica       1.00      0.80      0.89        10

   micro avg       0.93      0.93      0.93        30
   macro avg       0.94      0.93      0.93        30
weighted avg       0.94      0.93      0.93        30
"""

# Generate the report as a dict
report = classification_report(y_test, y_pred, output_dict=True)
for key, value in report["setosa"].items():
    print(f"{key:10s}:{value:10.2f}")
"""
precision :      1.00
recall    :      1.00
f1-score  :      1.00
support   :     10.00
"""


20201207

              precision    recall  f1-score   support

           0       0.94      0.98      0.96      5259
           1       0.06      0.02      0.03       307

    accuracy                           0.93      5566
   macro avg       0.50      0.50      0.49      5566
weighted avg       0.90      0.93      0.91      5566

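The precision column is the precision of each class.

Accuracy is the number of correct predictions divided by the total number of samples.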
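
Accuracy and precision can diverge sharply on imbalanced data, as in the report above. The sketch below uses made-up labels (18 negatives, 2 positives) to show a high accuracy alongside a precision of zero for the minority class.

```python
from sklearn.metrics import accuracy_score, precision_score

# Hypothetical imbalanced data: 18 negatives followed by 2 positives
y_true = [0] * 18 + [1] * 2

# A model with one false positive that also misses both real positives
y_pred = [0] * 17 + [1] + [0] * 2

# Accuracy: 17 correct predictions out of 20 samples
accuracy = accuracy_score(y_true, y_pred)

# Precision of label 1: 0 true positives out of 1 predicted positive
precision_pos = precision_score(y_true, y_pred, pos_label=1)

print(f"accuracy={accuracy:.2f} precision(1)={precision_pos:.2f}")
```

This is why the per-class rows are worth reading in addition to the accuracy row when one class dominates.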
