History of pruning algorithm development and Python implementation (finished)
Python implementations for all 7 post-pruning algorithms are collected here.
Table of Decision Trees:
| Algorithm | Author | Source | Year |
| --- | --- | --- | --- |
| ID3 | Ross Quinlan | 《Discovering rules by induction from large collections of examples》 | 1979 |
| ID3 | Ross Quinlan | Another origin: 《Learning efficient classification procedures and their application to chess end games》 | 1983a |
| CART | Leo Breiman | 《Classification and Regression Trees》 | 1984 |
| C4.5 | Ross Quinlan | 《C4.5: Programs for Machine Learning》 | 1993 |
| C5.0 | Ross Quinlan | Commercial edition of C4.5; no relevant papers | - |
Table of Pruning Algorithms:
| Algorithm | Source | Year | Author | System | Notes |
| --- | --- | --- | --- | --- | --- |
| Pessimistic Pruning | 《Simplifying Decision Trees》, part 2.3 | 1986b (some sources say 1987b; the date printed on the paper is used here) | Quinlan | C4.5 | Ross Quinlan invented "Pessimistic Pruning"; John Mingers renamed it "Pessimistic Error Pruning" in 《An Empirical Comparison of Pruning Methods for Decision Tree Induction》 |
| Reduced Error Pruning | 《Simplifying Decision Trees》, part 2.2 | 1986b | Quinlan | C4.5 | Requires an extra validation set for pruning |
| Cost-Complexity Pruning | 《Classification and Regression Trees》, Section 3.3 | 1984 | Leo Breiman | CART | For classification trees only |
| Error-Complexity Pruning | 《Classification and Regression Trees》, Section 8.5.1 | 1984 | Leo Breiman | CART | For regression trees only |
| Critical Value Pruning | 《Expert Systems-Rule Induction with Statistical Data》; another claimed source is 《An Empirical Comparison of Pruning Methods for Decision Tree Induction》, but that paper's author says he is citing a 1987 paper | 1987a | John Mingers | Not stated in the paper | Accounts of CVP's origin differ; the source given here follows what is mentioned on p. 212 of 《An Empirical Comparison of Pruning Methods for Ensemble Classifiers》 |
| Minimum-Error Pruning | 《Learning decision rules in noisy domains》 | 1986 | Niblett and Bratko | - | Cannot be downloaded from the Internet |
| Minimum-Error Pruning | 《On estimating probabilities in tree pruning》 | 1991 | Bojan Cestnik, Ivan Bratko | - | Modified MEP algorithm |
| Error-Based Pruning | 《C4.5: Programs for Machine Learning》, Section 4.2 | 1993 | Quinlan | C4.5 | EBP is an evolution of PEP |
Pruning targets for classification trees (ID3, C4.5, CART-classification):
1. Simplify the tree without losing too much accuracy.
2. Improve generalization ability on validation sets.
3. Alleviate overfitting.
Pruning targets for regression trees (CART-regression):
1. Simplify the tree without increasing MSE too much.
2. Improve generalization ability on validation sets.
3. Alleviate overfitting.
----------------For C4.5-----------------
U_25%(1,16) and U_25%(1,168) in 《C4.5: Programs for Machine Learning》 (see the sketch after this list)
Differences between the C4.5 in Weka 3.6.10 and Professor Quinlan's C4.5 algorithm
Reproducing the experimental results of Chapter 4 of 《C4.5: Programs for Machine Learning》
Ross Quinlan's handling of missing values in C4.5 Release 8
A detailed interpretation of the relationship between the latest C4.5 release (Release 8) and MDL
Some notes on 《Inferring Decision Trees Using the Minimum Description Length Principle》
Some notes on 《Improved Use of Continuous Attributes in C4.5》
Code architecture diagram of C4.5 Release 8
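The U_25%(1,16) and U_25%(1,168) values referenced in the first post above are upper confidence limits of the binomial distribution, which C4.5's error-based pruning looks up for E errors in N cases. A minimal sketch of computing U_CF(E, N) with scipy (the function name `ucf` is ours, not Quinlan's):

```python
from scipy.stats import beta

def ucf(e, n, cf=0.25):
    """Upper limit of the binomial confidence interval that C4.5 uses:
    the error rate p at which observing <= e errors in n cases has
    probability cf (i.e. the Clopper-Pearson upper bound)."""
    if e >= n:
        return 1.0
    return beta.ppf(1.0 - cf, e + 1, n - e)

print(ucf(0, 6))    # ~0.206, the U_25%(0,6) value quoted in Quinlan's book
print(ucf(1, 16))   # U_25%(1,16)
print(ucf(1, 168))  # U_25%(1,168)
```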
------------REP-start--------
Detailed explanation of REP (Reduced Error Pruning) code for ID3 + drawing Figures 4.5, 4.6, and 4.7 of the decision-tree chapter in Zhou Zhihua's 《機器學習》 (Machine Learning)
Drawing Figures 4.4 and 4.9 of Zhou Zhihua's 《機器學習》 (reprint, with an added entropy display)
Handling continuous values in ID3 decision trees + drawing Figures 4.8 and 4.10 of Zhou Zhihua's 《機器學習》
sklearn does not implement the ID3 algorithm
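As a companion to the posts above, here is a minimal sketch of the REP rule itself (the `Node` layout and function names are ours): prune bottom-up, replacing a subtree by its majority-class leaf whenever that does not increase the error on the validation cases reaching the node.

```python
class Node:
    def __init__(self, attr=None, children=None, label=None):
        self.attr = attr                # attribute tested at this node
        self.children = children or {}  # attribute value -> child Node
        self.label = label              # majority class of training cases here

def predict(node, x):
    """Classify example x (a dict: attribute -> value)."""
    if not node.children:
        return node.label
    child = node.children.get(x[node.attr])
    return predict(child, x) if child else node.label

def rep(node, val):
    """Reduced Error Pruning. `val` holds the validation cases (x, y)
    that reach this node; prune the children first (bottom-up)."""
    if not node.children:
        return node
    for v, child in node.children.items():
        node.children[v] = rep(child, [(x, y) for x, y in val if x[node.attr] == v])
    subtree_err = sum(predict(node, x) != y for x, y in val)
    leaf_err = sum(node.label != y for _, y in val)
    if leaf_err <= subtree_err:
        node.children = {}  # replace the subtree with a majority-class leaf
    return node
```

Routing only the validation cases that reach each node keeps the comparison local: pruning a node cannot change how ancestors route the remaining cases.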
------------REP-end--------
----------------PEP-start-----------------
Earliest PEP Algorithm Principles
Pessimistic Error Pruning example of C4.5
Pessimistic Error Pruning illustration with C4.5 - Python implementation
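A minimal sketch of the PEP test at a single internal node, using the continuity correction of 1/2 per leaf from 《Simplifying Decision Trees》 (the function name and example numbers are ours):

```python
import math

def pep_should_prune(node_errors, subtree_errors, n_leaves, n_cases):
    """Pessimistic Error Pruning test at an internal node.
    node_errors    : training errors if the node became a leaf
    subtree_errors : sum of training errors over the subtree's leaves
    n_leaves       : number of leaves in the subtree
    n_cases        : training cases reaching the node
    PEP itself traverses the tree top-down, applying this test at each node."""
    e_leaf = node_errors + 0.5                 # corrected error as a leaf
    e_tree = subtree_errors + 0.5 * n_leaves   # corrected error of the subtree
    se = math.sqrt(e_tree * (n_cases - e_tree) / n_cases)
    return e_leaf <= e_tree + se               # prune if within one SE

# e.g. a node with 100 cases: 10 errors as a leaf; subtree: 5 leaves, 6 errors
print(pep_should_prune(10, 6, 5, 100))
```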
----------------PEP-end----------------
-------------EBP-start--------------------
Error-Based Pruning: algorithm, code implementation, and worked examples
Some may ask why we do not simply use the Python interface to Weka's J48.
Note that Weka follows Quinlan's C implementation (put bluntly, J48 is a port of C4.5 Release 8). On some datasets, e.g. the hypo dataset, Weka generates a very large decision tree, which is bad:
since the original purpose of a decision tree is to aid classification and produce readable knowledge,
a huge tree is clearly impractical to use.
J48: the Java edition of C4.5 Release 8
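A minimal sketch of the EBP decision, reusing the `ucf` helper from the C4.5 section above; the worked numbers follow the congressional-votes example in Chapter 4 of the book as recalled here, and branch grafting (the other half of EBP) is omitted:

```python
def predicted_errors(e, n, cf=0.25):
    """EBP's pessimistic error count for a leaf with n cases and e errors."""
    return n * ucf(e, n, cf)

def ebp_should_prune(node_e, node_n, leaf_stats, cf=0.25):
    """Replace the subtree with a leaf if the leaf's predicted errors do
    not exceed the sum of predicted errors over the subtree's leaves.
    leaf_stats: list of (errors, cases) pairs for the subtree's leaves."""
    as_leaf = predicted_errors(node_e, node_n, cf)
    as_subtree = sum(predicted_errors(e, n, cf) for e, n in leaf_stats)
    return as_leaf <= as_subtree

# subtree leaves cover 6, 9, and 1 cases with no errors; collapsing the
# subtree into one leaf gives 16 cases with 1 error
print(ebp_should_prune(1, 16, [(0, 6), (0, 9), (0, 1)]))
```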
-------------EBP-end--------------------
-------------MEP-start--------------------
Two Examples of Minimum Error Pruning (reprint)
MEP (Minimum Error Pruning) principle with Python implementation
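A minimal sketch of the 1991 m-estimate version of MEP described in 《On estimating probabilities in tree pruning》 (function names and the worked numbers are ours):

```python
def m_static_error(class_counts, priors, m):
    """Expected classification error at a node, using the m-probability
    estimate p(c) = (n_c + p_a(c) * m) / (n + m) of Cestnik and Bratko.
    class_counts: dict class -> training cases of that class at the node
    priors      : dict class -> prior probability p_a(c)"""
    n = sum(class_counts.values())
    best = max((class_counts.get(c, 0) + priors[c] * m) / (n + m) for c in priors)
    return 1.0 - best

def mep_should_prune(node_counts, child_counts, priors, m):
    """Prune when the node's static error does not exceed the backed-up
    error: the children's errors weighted by their share of the cases."""
    n = sum(node_counts.values())
    static = m_static_error(node_counts, priors, m)
    backed_up = sum(sum(cc.values()) / n * m_static_error(cc, priors, m)
                    for cc in child_counts)
    return static <= backed_up

# two classes with uniform priors; m is the tunable noise parameter
priors = {"+": 0.5, "-": 0.5}
print(mep_should_prune({"+": 8, "-": 4},
                       [{"+": 6, "-": 1}, {"+": 2, "-": 3}],
                       priors, m=2.0))
```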
-----------------MEP-end--------------------
-----------------CVP-start--------------------
Proof of the chi-square statistic used in CVP
CVP (Critical Value Pruning) illustration, with the principle explained in detail
CVP (Critical Value Pruning) examples with Python implementation
An error in a paper about CVP
Contingency table computation in Python and analysis of experimental results
Chi-square distribution computation in Python
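A minimal sketch of the contingency-table chi-square computation that CVP thresholds against a critical value, using scipy (the table values are made up):

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

# rows: branches of the split; columns: classes
table = np.array([[30, 10],
                  [ 5, 25]])

stat, p_value, dof, expected = chi2_contingency(table, correction=False)
critical = chi2.ppf(0.95, dof)  # critical value at the 5% level

print(stat, p_value, dof)
print("keep the split" if stat > critical else "prune: split not significant")
```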
-----------------CVP-end--------------------
------------------------For CART-------------------------------------------------------
Notes from 《Classification and Regression Trees》
--------CCP-start---------------------
Drawing the decision tree on p. 59 of 《統計學習方法》 (Statistical Learning Methods) with sklearn
CCP (Cost Complexity Pruning) on sklearn with Python implementation
A theoretical defect in selecting the best pruned tree from CCP with cross-validation
Details of the 1-SE rule in CCP pruning of CART
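A minimal sketch of CCP with sklearn's built-in cost-complexity pruning path, which produces Breiman's nested subtree sequence (the iris dataset is just a stand-in):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# alphas of the nested subtree sequence T1 > T2 > ... from Breiman's pruning
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_tr, y_tr)

for alpha in path.ccp_alphas:
    clf = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_tr, y_tr)
    print(f"alpha={alpha:.4f}  leaves={clf.get_n_leaves()}  "
          f"test acc={clf.score(X_te, y_te):.3f}")
```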
--------CCP-end---------------------
--------ECP-start---------------------
Explaining the difference between model trees and regression trees with examples
Error Complexity Pruning for sklearn's regression tree with Python implementation
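Since sklearn's DecisionTreeRegressor exposes the same ccp_alpha machinery with squared error in place of misclassification cost, an ECP-style sketch looks the same (the synthetic data is ours):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=200)

# error-complexity sequence: larger alpha penalizes leaves more heavily
path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X, y)
for alpha in path.ccp_alphas[::5]:  # sample every 5th alpha for brevity
    reg = DecisionTreeRegressor(random_state=0, ccp_alpha=alpha).fit(X, y)
    print(f"alpha={alpha:.4f}  leaves={reg.get_n_leaves()}")
```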
--------ECP-end---------------------
Note:
Critical Value Pruning can be used in both pre-pruning and post-pruning.
In pre-pruning, the IM (Information Measure) is replaced with the chi-square statistic.
When post-pruning with the $\chi^2$ test, you need an independent dataset.
Of course, if you grow a tree with CVP, there is no point in post-pruning it with CVP on the same data that was used to grow the tree.
Table about whether you need extra validation sets when pruning:
In the "Reference and Quotation" column of the following table, the word "test" means a set with class labels, so it actually means a "validation dataset".
| Algorithm | Extra validation set needed? | Reference and Quotation |
| --- | --- | --- |
| REP (Reduced Error Pruning) | Yes | 《An Empirical Comparison of Pruning Methods for Decision Tree Induction》, Section 2.2.4: "The pruned node will often make fewer errors on the test data than the sub-tree makes." |
| CCP (Cost Complexity Pruning) | Pruning stage: no. Selecting the best pruned tree: ① (small datasets) cross-validation: yes; ② (large datasets) 1-SE rule: no | 《Simplifying Decision Trees》, part 2.1: "Finally, the procedure requires a test set distinct from the original training set" |
| ECP (Error Complexity Pruning) | Pruning stage: no. Selecting the best pruned tree: ① (small datasets) cross-validation: yes; ② (large datasets) 1-SE rule: no | 《An Empirical Comparison of Pruning Methods for Decision Tree Induction》, part 2.2.1, page 230: "Instead, each of the pruned trees is used to classify an independent test data set" |
| CVP (Critical Value Pruning) | Pre-pruning: no; post-pruning: yes | 《Expert Systems-Rule Induction with Statistical Data》 (pre-pruning): "Initial runs were performed using chi-square purely as a means of choosing attributes - not of judging their significance - on the two years separately and on the data as a whole." |
| MEP (Minimum Error Pruning) | Depends on how you set m | 《On Estimating Probabilities in Tree Pruning》, Section 1: "m can be adjusted to some essential properties of the learning domain, such as the level of noise in the learning data." Section 2: "m can be set so as to maximise the classification accuracy on an independent data set" |
| PEP (Pessimistic Error Pruning) | No | 《Simplifying Decision Trees》, Section 2.3: "Unlike these methods, it does not require a test set separate from the cases in the training set from which the tree was constructed." |
| EBP (Error Based Pruning) | No | 《C4.5: Programs for Machine Learning》, page 40: "The approach taken in C4.5 belongs to the second family of techniques that use only the training set from which the tree was built." |
Attention:
When your dataset is small, you need cross-validation in CCP/ECP to produce K candidate models; each candidate model is then pruned repeatedly to get its pruned-tree sequence (K such sequences in total), and the best pruned tree is finally chosen from among the K sequences.
When your dataset is large, you use the 1-SE rule in CCP/ECP to select the best pruned tree.
In the GitHub link above, we use method ②.
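For illustration, a minimal sketch of the 1-SE selection rule; Breiman's formulation applies it on top of per-alpha error estimates, combined here with cross-validated scores (variable names are ours): take the largest alpha, i.e. the simplest tree, whose mean score is within one standard error of the best mean.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
alphas = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y).ccp_alphas

means, ses = [], []
for a in alphas:
    scores = cross_val_score(DecisionTreeClassifier(random_state=0, ccp_alpha=a), X, y, cv=5)
    means.append(scores.mean())
    ses.append(scores.std(ddof=1) / np.sqrt(len(scores)))  # standard error of the mean

best = int(np.argmax(means))
threshold = means[best] - ses[best]
# largest alpha (simplest tree) whose mean score is still within 1 SE of the best
chosen = max(i for i, m in enumerate(means) if m >= threshold)
print(f"best alpha={alphas[best]:.4f}, 1-SE alpha={alphas[chosen]:.4f}")
```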
Pruning Style Table of Algorithms:
| Algorithm | Pruning style | Reference |
| --- | --- | --- |
| REP (Reduced Error Pruning) | Bottom-up | Not mentioned in the earliest article about it, 《Simplifying Decision Trees》 |
| CCP (Cost Complexity Pruning) | Bottom-up | 《Classification and Regression Trees》, Leo Breiman, Chapter 3.3 Minimal Cost-Complexity Pruning: "Thus, the pruning process produces a finite sequence of subtrees T1, T2, T3, ... with progressively fewer terminal nodes" |
| ECP (Error Complexity Pruning) | Bottom-up | 《Classification and Regression Trees》, Leo Breiman, Chapter 8.5.1: "Now minimal error-complexity pruning is done exactly as minimal cost-complexity pruning in classification." |
| CVP (Critical Value Pruning) | Top-down in pre-pruning; bottom-up in post-pruning | Pre-pruning, 《Expert Systems-Rule Induction with Statistical Data》: "The ID3 algorithm, with the enhancements mentioned previously, was modified to calculate $\chi^2$ instead of IM". Post-pruning: https://www.cs.rit.edu/~rlaz/prec2010/slides/DecisionTrees.pdf |
| MEP (Minimum Error Pruning) | Bottom-up | 《The effects of pruning methods on the predictive accuracy of induced decision rules》: "Niblett and Bratko [26] proposed a bottom-up approach for searching a single tree that minimizes the expected error rate." |
| PEP (Pessimistic Error Pruning) | Top-down | 《Top-Down Induction of Decision Trees Classifiers - A Survey》, part E: "The pessimistic pruning procedure performs top-down traversing over the internal nodes." |
| EBP (Error Based Pruning) | Bottom-up | 《C4.5: Programs for Machine Learning》, page 39: "Start from the bottom of the tree and examine each nonleaf sub-tree." |
Do the above pruning methods tend to over-prune or under-prune?
According to the following two articles:
《Simplifying Decision Trees by Pruning and Grafting: New Results》
《Top-Down Induction of Decision Trees Classifiers - A Survey》
the results are as follows:
| Algorithm | Tendency |
| --- | --- |
| REP | over-pruning, or not significant |
| PEP | under-pruning |
| EBP | under-pruning |
| MEP | under-pruning |
| CVP | under-pruning |
| CCP | under-pruning (from my own experiment) |
| ECP | under-pruning (from my own experiment) |
Markdown table generation tool: https://tool.lu/tables