History of pruning algorithm development and Python implementation (finished)


Python implementations for all 7 post-pruning algorithms are collected here.

Table of Decision Trees:

| name of tree | inventor | name of article | year |
| --- | --- | --- | --- |
| ID3 | Ross Quinlan | 《Discovering rules by induction from large collections of examples》 | 1979 |
| ID3 | Ross Quinlan | Another origin: 《Learning efficient classification procedures and their application to chess end games》 | 1983a |
| CART | Leo Breiman | 《Classification and Regression Trees》 | 1984 |
| C4.5 | Ross Quinlan | 《C4.5: Programs for Machine Learning》 | 1993 |
| C5.0 | Ross Quinlan | commercial edition of C4.5, no relevant papers | - |

Table of Pruning Algorithms:

| name of post-pruning algorithm | name of article or book | year | inventor | the tree pruned | Remark |
| --- | --- | --- | --- | --- | --- |
| Pessimistic Pruning | 《Simplifying Decision Trees》part 2.3 | 1986b (some sources say 1987b; the date printed on the paper is used here) | Quinlan | C4.5 | Ross Quinlan invented "Pessimistic Pruning"; John Mingers renamed it "Pessimistic Error Pruning" in his article 《An Empirical Comparison of Pruning Methods for Decision Tree Induction》 |
| Reduced Error Pruning | 《Simplifying Decision Trees》part 2.2 | 1986b | Quinlan | C4.5 | an extra validation set is required to prune |
| Cost-Complexity Pruning | 《Classification and Regression Trees》Section 3.3 | 1984 | L. Breiman | CART | for classification trees only |
| Error-Complexity Pruning | 《Classification and Regression Trees》Section 8.5.1 | 1984 | L. Breiman | CART | for regression trees only |
| Critical Value Pruning | 《Expert System-Rule Induction with Statistical Data》; another view says 《An Empirical Comparison of Pruning Methods for Decision Tree Induction》, but that article's author says he cites a 1987 paper | 1987a | John Mingers | not explicitly stated in the paper | the origin of CVP is disputed; this table follows the source cited on p.212 of 《An Empirical Comparison of Pruning Methods for Ensemble Classifiers》 |
| Minimum-Error Pruning | 《Learning decision rules in noisy domains》 | 1986 | Niblett and Bratko | - | cannot be downloaded from the Internet |
| Minimum-Error Pruning | 《On estimating probabilities in tree pruning》 | 1991 | Bojan Cestnik, Ivan Bratko | - | modified MEP algorithm |
| Error-Based Pruning | 《C4.5: Programs for Machine Learning》Section 4.2 | 1993 | Quinlan | C4.5 | EBP is an evolution of PEP |

Pruning targets for classification trees (ID3, C4.5, CART-classification):
1. simplify the classification tree without losing too much precision
2. improve generalization ability on validation sets
3. alleviate overfitting

Pruning targets for regression trees (CART-regression):
1. simplify the regression tree without increasing MSE too much
2. improve generalization ability on validation sets
3. alleviate overfitting
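
To make these targets concrete, here is a minimal sketch of the size/accuracy trade-off using sklearn's built-in cost-complexity pruning (the CCP from the table above); the dataset and alpha values are illustrative choices, not taken from any of the referenced articles:

```python
# Minimal sketch: how pruning trades tree size against accuracy, using
# sklearn's built-in cost-complexity pruning (ccp_alpha). The dataset and
# alpha values are illustrative, not from the referenced articles.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for alpha in [0.0, 0.005, 0.02]:  # larger alpha => heavier pruning
    clf = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0)
    clf.fit(X_train, y_train)
    print(f"alpha={alpha}: leaves={clf.get_n_leaves()}, "
          f"test accuracy={clf.score(X_test, y_test):.3f}")
```

The point of pruning is visible in the output: as alpha grows, the leaf count drops much faster than the test accuracy.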

----------------For C4.5-----------------
U25%(1,16) and U25%(1,168) in 《C4.5: Programs for Machine Learning》
Differences between the C4.5 in Weka 3.6.10 and Professor Quinlan's C4.5 algorithm
Reproducing the experimental results of 《C4.5: Programs for Machine Learning》Chapter 4
How Ross Quinlan handles missing values in C4.5-Release8
A detailed reading of how C4.5's latest release (Release 8) relates to MDL
Some understanding of 《Inferring Decision Trees Using the Minimum Description Length Principle》
Some understanding of 《Improved Use of Continuous Attributes in C4.5》
Code architecture diagram of C4.5-Release8

------------REP-start--------
Detailed explanation of REP (Reduced Error Pruning) code for ID3, plus drawing Figures 4.5, 4.6, and 4.7 (decision trees) from Zhou Zhihua's 《Machine Learning》
Drawing Figures 4.4 and 4.9 from Zhou Zhihua's 《Machine Learning》(reprint, with an added entropy display)
Handling continuous values in ID3 decision trees, plus drawing Figures 4.8 and 4.10 from Zhou Zhihua's 《Machine Learning》
sklearn does not implement the ID3 algorithm

------------REP-end--------

----------------PEP-start-----------------
Earliest PEP Algorithm Principles
Pessimistic Error Pruning example of C4.5
Pessimistic Error Pruning illustration with C4.5: Python implementation
----------------PEP-end----------------

-------------EBP-start--------------------
Error-Based Pruning: algorithm, code implementation, and examples
One might ask why not simply use a Python interface to Weka's J48.
Note that Weka follows Quinlan's C-language code (bluntly, J48 is a copy of C4.5-Release8). On some datasets, e.g. the hypo dataset, Weka generates a very large decision tree, which is bad:
a decision tree exists to aid classification and generate knowledge,
and a huge tree is clearly hard to use.
J48: the Java edition of C4.5-Release8
-------------EBP-end--------------------

-------------MEP-start--------------------
Two examples of Minimum Error Pruning (reprint)
MEP (Minimum Error Pruning) principle with Python implementation

-----------------MEP-end--------------------

-----------------CVP-start--------------------
Proof of the chi-square statistic used in CVP
CVP (Critical Value Pruning) illustration, with the principle explained in detail
CVP (Critical Value Pruning) examples with Python implementation
Error in a paper about CVP
Computing contingency tables in Python, with analysis of experimental results
Computing the chi-square distribution in Python
-----------------CVP-end--------------------

------------------------For CART-------------------------------------------------------
Notes from 《Classification and Regression Trees》
--------CCP-start---------------------
Drawing the decision tree on p.59 of 《Statistical Learning Methods》(統計學習方法), sklearn version
CCP (Cost Complexity Pruning) on sklearn with Python implementation
A theoretical defect in selecting the best pruned tree from CCP with cross-validation
Details of the 1-SE rule in CCP pruning of CART
--------CCP-end---------------------

--------ECP-start---------------------
Explaining the difference between model trees and regression trees with examples
Error Complexity Pruning for sklearn's Regression Tree with Python implementation
--------ECP-end---------------------

Note:
Critical Value Pruning can be used both for pre-pruning and post-pruning.
In pre-pruning, IM (Information Measure) is replaced with the chi-square statistic.
When post-pruning with the $\chi^2$ test, you need an independent dataset.
Of course, if you grow a tree with CVP, you need not post-prune it with CVP on the same data that was used to grow the tree.
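
The chi-square test that CVP relies on can be run directly on a contingency table. A minimal sketch with scipy follows; the counts are made up for illustration:

```python
# Minimal sketch: chi-square test on a contingency table, the statistic
# CVP uses to judge a split. The counts below are made up for illustration.
import numpy as np
from scipy.stats import chi2_contingency

# rows: the two branches of a candidate split; columns: class counts
table = np.array([[30, 10],
                  [5, 25]])
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.3f}, p-value={p:.4f}, dof={dof}")
# A small p-value means the split separates the classes better than chance,
# so the node is worth keeping; otherwise it is a candidate for pruning.
```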

Table about whether you need extra validation sets when pruning

In the "Reference and Quotation" column of the following table,
the word "test" means a set with class labels,
so it actually means a "validation dataset".

| Pruning Algorithm | Need extra validation datasets? | Reference and Quotation |
| --- | --- | --- |
| REP (Reduced Error Pruning) | yes | According to 《An Empirical Comparison of Pruning Methods for Decision Tree Induction》Section 2.2.4: "The pruned node will often make fewer errors on the test data than the sub-tree makes." |
| CCP (Cost Complexity Pruning) | pruning stage: no<br>selecting the best pruned tree: ① (small datasets) cross-validation: yes; ② (large datasets) 1-SE rule: no | According to 《Simplifying Decision Trees》part 2.1: "Finally, the procedure requires a test set distinct from the original training set" |
| ECP (Error Complexity Pruning) | pruning stage: no<br>selecting the best pruned tree: ① (small datasets) cross-validation: yes; ② (large datasets) 1-SE rule: no | According to 《An Empirical Comparison of Pruning Methods for Decision Tree Induction》part 2.2.1, page 230: "Instead, each of the pruned trees is used to classify an independent test data set" |
| CVP (Critical Value Pruning) | pre-pruning: no<br>post-pruning: yes | 《Expert System-Rule Induction with Statistical Data》(pre-pruning): "Initial runs were performed using chi-square purely as a means of choosing attributes - not of judging their significance - on the two years separately and on the data as a whole." |
| MEP (Minimum Error Pruning) | it depends on how you set m | According to 《On Estimating Probabilities in Tree Pruning》Section 1: "m can be adjusted to some essential properties of the learning domain, such as the level of noise in the learning data."; Section 2: "m can be set so as to maximise the classification accuracy on an independent data set" |
| PEP (Pessimistic Error Pruning) | no | According to 《Simplifying Decision Trees》Section 2.3: "Unlike these methods, it does not require a test set separate from the cases in the training set from which the tree was constructed." |
| EBP (Error Based Pruning) | no | According to 《C4.5: Programs for Machine Learning》page 40: "The approach taken in C4.5 belongs to the second family of techniques that use only the training set from which the tree was built." |

Attention:
When your dataset is small, you need cross-validation in CCP and ECP to produce K candidate models; each candidate model is then pruned repeatedly to get its pruned-tree sequence (K such sequences in total), and finally the best pruned tree is chosen from among the K sequences. A minimal sketch of this route follows below.

When your dataset is large, you use the 1-SE rule in CCP and ECP to select the best pruned tree.
In the GitHub link above, we use method ②.
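
As a concrete illustration of route ① (cross-validation), the following sketch scores each alpha on sklearn's cost-complexity pruning path by cross-validation and refits the best pruned tree; the dataset and cv=5 are illustrative choices:

```python
# Minimal sketch of route ①: pick the best CCP-pruned tree by
# cross-validation over the pruning path. Dataset and cv=5 are illustrative.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Candidate alphas come from the cost-complexity pruning path of the full tree.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
alphas = path.ccp_alphas[:-1]  # the last alpha prunes everything down to the root

scores = [cross_val_score(DecisionTreeClassifier(ccp_alpha=a, random_state=0),
                          X, y, cv=5).mean() for a in alphas]
best_alpha = alphas[int(np.argmax(scores))]
best_tree = DecisionTreeClassifier(ccp_alpha=best_alpha, random_state=0).fit(X, y)
print(f"best alpha={best_alpha:.4f}, leaves={best_tree.get_n_leaves()}")
```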

Pruning style table of algorithms

| Pruning Algorithm | Pruning Style | Reference and Quotation |
| --- | --- | --- |
| REP (Reduced Error Pruning) | bottom-up | not stated in the earliest article about it, 《Simplifying Decision Trees》 |
| CCP (Cost Complexity Pruning) | bottom-up | 《Classification and Regression Trees》(Leo Breiman), Chapter 3.3 MINIMAL COST-COMPLEXITY PRUNING: "Thus, the pruning process produces a finite sequence of subtrees T1, T2, T3, … with progressively fewer terminal nodes" |
| ECP (Error Complexity Pruning) | bottom-up | 《Classification and Regression Trees》(Leo Breiman), Chapter 8.5.1: "Now minimal error-complexity pruning is done exactly as minimal cost-complexity pruning in classification." |
| CVP (Critical Value Pruning) | top-down in pre-pruning<br>bottom-up in post-pruning | pre-pruning: 《Expert Systems-Rule Induction with Statistical Data》: "The ID3 algorithm, with the enhancements mentioned previously, was modified to calculate $\chi^2$ instead of IM"<br>post-pruning: https://www.cs.rit.edu/~rlaz/prec2010/slides/DecisionTrees.pdf |
| MEP (Minimum Error Pruning) | bottom-up | 《The effects of pruning methods on the predictive accuracy of induced decision rules》: "Niblett and Bratko [26] proposed a bottom-up approach for searching a single tree that minimizes the expected error rate." |
| PEP (Pessimistic Error Pruning) | top-down | 《Top-Down Induction of Decision Trees Classifiers – A Survey》part E: "The pessimistic pruning procedure performs top-down traversing over the internal nodes." |
| EBP (Error Based Pruning) | bottom-up | 《C4.5: Programs for Machine Learning》page 39: "Start from the bottom of the tree and examine each nonleaf sub-tree." |
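
The "bottom-up" style in this table is simply post-order traversal: a node is only considered for pruning after all of its children have been. Below is a minimal skeleton of that traversal over a toy tree; the Node class and the should_prune predicate are hypothetical placeholders, not any specific algorithm's criterion:

```python
# Minimal skeleton of bottom-up (post-order) pruning, the style shared by
# REP/CCP/ECP/MEP/EBP in the table above. Node and should_prune are
# hypothetical placeholders, not any particular algorithm's criterion.
from dataclasses import dataclass, field

@dataclass
class Node:
    children: list = field(default_factory=list)
    majority_class: int = 0  # the prediction if this node is collapsed to a leaf

    def is_leaf(self) -> bool:
        return not self.children

def prune_bottom_up(node: Node, should_prune) -> None:
    """Post-order: finalize every child subtree before testing this node."""
    for child in node.children:
        prune_bottom_up(child, should_prune)
    if not node.is_leaf() and should_prune(node):
        node.children = []  # collapse the whole subtree into a leaf

# A top-down style (as in PEP) would test the node *before* recursing and
# would skip the rest of the subtree as soon as a node is pruned.
```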

Do the above pruning algorithms tend to over-prune or under-prune?
According to the following two articles:
《Simplifying Decision Trees by Pruning and Grafting: New Results》
《Top-Down Induction of Decision Trees Classifiers – A Survey》
the result table is as follows:

| Pruning Algorithm | tendency of pruning |
| --- | --- |
| REP | over-pruning, or not significant |
| PEP | under-pruning |
| EBP | under-pruning |
| MEP | under-pruning |
| CVP | under-pruning |
| CCP | under-pruning (from my own experiment) |
| ECP | under-pruning (from my own experiment) |

Markdown table generation tool:
https://tool.lu/tables
