Lazy Predict for ML Models

Published: 2023/12/15

Hey, hope you are having a wonderful day!

Whenever I start a new ML project, the same lines always pop up in my mind:

“I need to fit the data to every model, then apply metrics to check which model has better accuracy on the available dataset, then choose the best model. This process is time-consuming, and it might not even be that effective.”

For this problem, I found a simple solution while surfing through python.org: a small Python library named “lazypredict”, and it does wonders.

Let me tell you how it works:

Install the library

pip install lazypredict

Note

  • lazypredict only works for Python versions ≥ 3.6
  • It's built on top of various other libraries, so if you don't have those libraries on your system, Python will throw a ModuleNotFoundError; read the error carefully and install the required libraries.
  • lazypredict covers only supervised learning (classification and regression)

    I will be using a Jupyter notebook in this article.

    Code

    # import the necessary modules
    import warnings
    warnings.filterwarnings('ignore')
    import time
    from sklearn.datasets import load_iris, fetch_california_housing
    from sklearn.model_selection import train_test_split
    from lazypredict.Supervised import LazyClassifier, LazyRegressor
    • warnings: package for handling warnings; 'ignore' is used when we need to filter out all the warnings
    • time: package for time measurement
    • sklearn.datasets: package for loading datasets; today we're going to use the classic datasets everyone works with: load_iris() for a classification problem and fetch_california_housing() for a regression problem
    • sklearn.model_selection.train_test_split: used to split the dataset into train and test sets
    • lazypredict: the package we're going to learn today; lazypredict.Supervised provides two main classes, LazyClassifier for classification and LazyRegressor for regression

    LazyClassifier

    # load the iris dataset
    data = load_iris()
    X = data.data
    Y = data.target
    • data is a dictionary-like object with two keys: data, which contains the independent feature/column values, and target, which contains the dependent feature values
    • X holds all the independent feature values
    • Y holds all the dependent feature values
    # split the dataset
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=23)
    classi = LazyClassifier(verbose=0, predictions=True)
    • We split the data into train and test sets using train_test_split()
    • The test set will be 0.3 (30%) of the dataset
    • random_state controls how the data is shuffled into train and test indices; just choose any number you like!
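As a quick sanity check of the split (assuming scikit-learn is installed): the iris dataset has 150 rows, so test_size=0.3 leaves 105 samples for training and 45 for testing:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

data = load_iris()
X, Y = data.data, data.target

# 30% of the 150 iris samples are held out for testing
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=23)
print(len(X_train), len(X_test))  # 105 45
```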

    Tip 1: If you want to see the source code behind any function or object in a Jupyter notebook, just add ? or ?? after the object or function you want to inspect and execute the cell.

    • Next, we call LazyClassifier() and initialize it as classi with two parameters, verbose and predictions
    • verbose: int; if non-zero, progress messages are printed. Above 50, the output is sent to stdout. The frequency of the messages increases with the verbosity level; above 10, all iterations are reported. I would suggest you try different values based on your depth of analysis.
    • predictions: boolean; if set to True, fit() will also return all the predicted values from the models
    # fit and train the models
    start_time_1 = time.time()
    models_c, predictions_c = classi.fit(X_train, X_test, Y_train, Y_test)
    end_time_1 = time.time()
    • We fit the train and test data with the classi object
    • classi.fit() returns two values:
    • models_c: contains all the models along with some metrics for each
    • predictions_c: contains all the predicted values from the models
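Because models_c comes back as a pandas DataFrame, you can rank it like any other leaderboard. A sketch with a hand-made stand-in DataFrame (the model names and scores here are invented for illustration, not real lazypredict output):

```python
import pandas as pd

# illustrative stand-in for the models_c DataFrame that classi.fit() returns
models_c = pd.DataFrame(
    {"Accuracy": [0.93, 0.98, 0.96], "F1 Score": [0.93, 0.98, 0.96]},
    index=["DecisionTreeClassifier", "LinearDiscriminantAnalysis", "RandomForestClassifier"],
)

# sort the leaderboard so the best-scoring model comes first
leaderboard = models_c.sort_values("Accuracy", ascending=False)
print(leaderboard.index[0])  # LinearDiscriminantAnalysis
```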

    # to check which model did better on the iris dataset
    models_c
    • To be honest, I didn't know some of these classification models even existed until I saw this
    • You may be wondering why ROC AUC is None. It's not that the function fails to give proper output; ROC AUC is None because we used a multi-class dataset.

    Tip 2: For the above dataset, or any multi-class problem, we can use roc_auc_score rather than the ROC AUC column.
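A sketch of Tip 2 on the same iris split, assuming scikit-learn; LogisticRegression is only a stand-in here for any classifier that exposes predict_proba:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

data = load_iris()
X_train, X_test, Y_train, Y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=23
)

# any model with predict_proba works here
clf = LogisticRegression(max_iter=1000).fit(X_train, Y_train)
probs = clf.predict_proba(X_test)  # shape (n_samples, n_classes)

# one-vs-rest averaging handles the multi-class case
print(roc_auc_score(Y_test, probs, multi_class="ovr"))
```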

    # to check the predictions for the models
    predictions_c
    • These are just a few sample predictions from the models

    LazyRegressor

    • So we've checked out LazyClassifier; it would be sad if we didn't pay some attention to LazyRegressor
    • The following code is similar to the LazyClassifier example, so let's pick up the pace and skip some of the explanations
    # load the fetch_california_housing dataset
    data1 = fetch_california_housing()
    X1 = data1.data
    Y1 = data1.target
    • data1 is a dictionary-like object with data and target as keys
    # split the dataset
    X_train1, X_test1, Y_train1, Y_test1 = train_test_split(X1, Y1, test_size=0.3, random_state=23)
    regr = LazyRegressor(verbose=0, predictions=True)
    • Next, we fit the regressor on the train and test data
    # fit and train the models
    start_time_2 = time.time()
    models_r, predictions_r = regr.fit(X_train1, X_test1, Y_train1, Y_test1)
    end_time_2 = time.time()

    Note

    1. Before running the above cell, make sure you close all unnecessary background processes, because it takes a lot of computational power

    2. If you have low computational power (RAM, GPU), I would suggest using Google Colab; it's the simplest solution you can get

    # to check which model did better on the fetch_california_housing dataset
    models_r
    • And again, I didn't know there were so many models for regression
    # to check the predictions for the models
    predictions_r

    Time Complexity

    • We should talk about the time taken, because reducing it as much as possible is the main goal for all of us
    # time taken (time.time() measures seconds, not milliseconds)
    print("The time taken by LazyClassifier for {0} samples is {1} s".format(len(data.data), round(end_time_1 - start_time_1, 2)))
    print("The time taken by LazyRegressor for {0} samples is {1} s".format(len(data1.data), round(end_time_2 - start_time_2, 2)))
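time.time() is fine for coarse wall-clock timing, but time.perf_counter() is the standard-library choice when you want a more precise measurement; the summation below is just a stand-in for a fit() call:

```python
import time

start = time.perf_counter()
total = sum(i * i for i in range(100_000))  # stand-in for model fitting
elapsed = time.perf_counter() - start

print(f"done in {elapsed:.4f} s")
```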

    Tip 3: Add %%time at the top of a cell to check the execution time of the current Jupyter cell.

    Note

    • Use this library in the first iteration of your ML project, before hyperparameter tuning
    • lazypredict only works for Python versions ≥ 3.6
    • If you don't have the computational power, just use Google Colab

    The GitHub link for the code is here.

    If you want to read the official docs, see the lazypredict documentation.

    That's all you need to know about the lazypredict library for now.

    I hope you learned something new from this article today that will make your ML projects a bit easier.

    Thank you for dedicating a few minutes of your day.

    If you have any doubts, just comment down below and I will be happy to help you out!

    Thank you!

    -Mani

    Translated from: https://medium.com/swlh/lazy-predict-for-ml-models-c513a5daf792
