[Python-ML] Learning Curves and Validation Curves with scikit-learn

Published: 2025/4/16 · 豆豆
# -*- coding: utf-8 -*-
'''
Created on 2018-01-18
@author: Jason.F
@summary: Diagnosing overfitting and underfitting
    Learning curve: how the score changes with the number of training samples
    Validation curve: how the score changes with a hyperparameter value
'''
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
# train_test_split, learning_curve and validation_curve live in
# sklearn.model_selection; the older sklearn.cross_validation and
# sklearn.learning_curve modules were removed in scikit-learn 0.20.
from sklearn.model_selection import learning_curve, train_test_split, validation_curve
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Load the data
df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data', header=None)
X = df.loc[:, 2:].values
y = df.loc[:, 1].values
le = LabelEncoder()
y = le.fit_transform(y)  # encode the class labels as integers
print(le.transform(['M', 'B']))

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=1)

# Chain standardization and model training in one pipeline
pipe_lr = Pipeline([('scl', StandardScaler()),
                    ('clf', LogisticRegression(random_state=1, penalty='l2'))])

# case 1: learning curve
# train_sizes sets the absolute or relative number of training samples
# used for each point on the curve
train_sizes, train_scores, test_scores = learning_curve(
    estimator=pipe_lr, X=X_train, y=y_train,
    train_sizes=np.linspace(0.1, 1.0, 10), cv=10, n_jobs=1)

# Aggregate the per-fold scores (test_scores are cross-validation scores)
train_mean = np.mean(train_scores, axis=1)
train_std = np.std(train_scores, axis=1)
test_mean = np.mean(test_scores, axis=1)
test_std = np.std(test_scores, axis=1)

# Plot the learning curve
plt.plot(train_sizes, train_mean, color='blue', marker='o', markersize=5, label='training accuracy')
plt.fill_between(train_sizes, train_mean + train_std, train_mean - train_std, alpha=0.15, color='blue')
plt.plot(train_sizes, test_mean, color='green', linestyle='--', marker='s', markersize=5, label='test accuracy')
plt.fill_between(train_sizes, test_mean + test_std, test_mean - test_std, alpha=0.15, color='green')
plt.grid()
plt.xlabel('Number of training samples')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.ylim([0.8, 1.0])
plt.show()

# case 2: validation curve
param_range = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
# 10-fold cross-validation over the regularization parameter C
train_scores, test_scores = validation_curve(
    estimator=pipe_lr, X=X_train, y=y_train,
    param_name='clf__C', param_range=param_range, cv=10)

# Aggregate the per-fold scores
train_mean = np.mean(train_scores, axis=1)
train_std = np.std(train_scores, axis=1)
test_mean = np.mean(test_scores, axis=1)
test_std = np.std(test_scores, axis=1)

# Plot the validation curve
plt.plot(param_range, train_mean, color='blue', marker='o', markersize=5, label='training accuracy')
plt.fill_between(param_range, train_mean + train_std, train_mean - train_std, alpha=0.15, color='blue')
plt.plot(param_range, test_mean, color='green', linestyle='--', marker='s', markersize=5, label='test accuracy')
plt.fill_between(param_range, test_mean + test_std, test_mean - test_std, alpha=0.15, color='green')
plt.grid()
plt.xscale('log')
plt.xlabel('Parameter C')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.ylim([0.8, 1.0])
plt.show()
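The docstring's point about diagnosing overfitting and underfitting can also be checked numerically, without a plot. The sketch below is an illustration rather than part of the original script. It assumes scikit-learn's bundled `load_breast_cancer` copy of the Wisconsin diagnostic data is an acceptable stand-in for the downloaded `wdbc.data` file (its 0/1 label encoding differs, which does not affect accuracy). The helper name `learning_curve_gap` and the use of `shuffle=True` are choices made here, not taken from the post.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def learning_curve_gap(estimator, X, y, cv=10):
    """Hypothetical helper: mean train/validation accuracy and their gap per training size."""
    train_sizes, train_scores, val_scores = learning_curve(
        estimator=estimator, X=X, y=y,
        train_sizes=np.linspace(0.1, 1.0, 10), cv=cv, n_jobs=1,
        shuffle=True, random_state=1)  # shuffle so small subsets contain both classes
    train_mean = train_scores.mean(axis=1)
    val_mean = val_scores.mean(axis=1)
    return train_sizes, train_mean, val_mean, train_mean - val_mean


# Assumption: the bundled dataset replaces the CSV download used in the post.
X, y = load_breast_cancer(return_X_y=True)
pipe_lr = Pipeline([('scl', StandardScaler()),
                    ('clf', LogisticRegression(random_state=1))])

sizes, train_mean, val_mean, gap = learning_curve_gap(pipe_lr, X, y)
for n, tr, va, g in zip(sizes, train_mean, val_mean, gap):
    print(f'n={n:4d}  train={tr:.3f}  validation={va:.3f}  gap={g:.3f}')
```

As a rough reading guide: a large gap that persists as the training size grows suggests overfitting (high variance), while low training and validation accuracy with a small gap suggests underfitting (high bias).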

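Results: the two `plt.show()` calls display a learning curve (accuracy against the number of training samples) and a validation curve (accuracy against the regularization parameter C, on a log-scaled axis).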
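The validation curve can also be read numerically rather than by eye. The following sketch picks the value of `C` with the highest mean cross-validated accuracy. It is an illustration under stated assumptions: the helper `best_param_from_validation_curve` is hypothetical, and the bundled `load_breast_cancer` dataset stands in for the downloaded `wdbc.data` file.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import validation_curve
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def best_param_from_validation_curve(estimator, X, y, param_name, param_range, cv=10):
    """Hypothetical helper: return the parameter value with the best mean validation score."""
    train_scores, val_scores = validation_curve(
        estimator=estimator, X=X, y=y,
        param_name=param_name, param_range=param_range, cv=cv)
    val_mean = val_scores.mean(axis=1)  # average over the cv folds
    best_idx = int(np.argmax(val_mean))
    return param_range[best_idx], val_mean


# Assumption: the bundled dataset replaces the CSV download used in the post.
X, y = load_breast_cancer(return_X_y=True)
pipe_lr = Pipeline([('scl', StandardScaler()),
                    ('clf', LogisticRegression(random_state=1))])

param_range = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
best_C, val_mean = best_param_from_validation_curve(
    pipe_lr, X, y, param_name='clf__C', param_range=param_range)

for C, score in zip(param_range, val_mean):
    print(f'C={C:<7} mean validation accuracy={score:.3f}')
print('best C:', best_C)
```

Choosing by `argmax` alone ignores the fold-to-fold spread; when several values of `C` score within one standard deviation of each other, preferring the smaller `C` (stronger regularization) is a common, more conservative choice.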

