Decision Trees | Part 3: Visualization in Python

Published: 2024/9/19

This article was first published on the WeChat official account AlgorithmDeveloper, which focuses on machine learning and Python, programming and algorithms, and life.

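1. Introduction

In "Decision Trees | Part 2: Building a Decision Tree in Python" we trained a decision tree model from a given dataset, but the model was represented as a nested dictionary, which hides the main strength of decision trees: they are intuitive and easy to read. The goal of this article is to draw the trained model as a tree diagram.

The given dataset (the same one built by createDataSet in Section 4):

The decision tree model in dictionary form:

{'人品': {'好': '见 ', '差': {'富有': {'没钱': '不见', '有钱': {'外貌': {'漂亮': '见 ', '丑': '不见'}}}}}}

Here 人品 is character (好 good, 差 bad), 富有 is wealth (有钱 rich, 没钱 poor), 外貌 is looks (漂亮 pretty, 丑 ugly), and the class labels are 见 (meet) and 不见 (don't meet).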
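Even before any plotting, the nesting of the dictionary can be made easier to follow with a plain-text outline. The following is a minimal sketch and not part of the original series; the helper name outline and the two-space indentation are assumptions made for illustration.

```python
# Hypothetical helper (not from the original article): render the nested
# dictionary as an indented text outline.
def outline(tree, indent=0):
    """Return a list of text lines describing the tree."""
    lines = []
    pad = "  " * indent
    for feature, branches in tree.items():
        # A decision node: the feature being tested
        lines.append(f"{pad}[{feature}]")
        for value, child in branches.items():
            if isinstance(child, dict):
                # The branch leads to another decision node
                lines.append(f"{pad}  {value} ->")
                lines.extend(outline(child, indent + 2))
            else:
                # The branch ends in a class label
                lines.append(f"{pad}  {value} -> {child.strip()}")
    return lines


sample_tree = {'人品': {'好': '见 ', '差': {'富有': {'没钱': '不见',
               '有钱': {'外貌': {'漂亮': '见 ', '丑': '不见'}}}}}}
lines = outline(sample_tree)
print("\n".join(lines))
```

The outline shows the structure, but not the even spacing and arrows that the matplotlib version in the rest of the article provides.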

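2. Getting the Leaf Count and Depth of the Tree

To keep the plotted tree from becoming distorted as the number of nodes or the depth changes, the x-axis is divided evenly by the number of leaf nodes and the y-axis by the depth of the tree, so that the diagram is spread evenly across the canvas.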

# Count the leaf nodes of the decision tree
def getNumLeafs(tree):
    numLeafs = 0
    # Feature tested at the root of this (sub)tree
    firstFeat = list(tree.keys())[0]
    # Branches under firstFeat (a dict: feature value -> subtree or class label)
    secondDict = tree[firstFeat]
    # Visit every branch under firstFeat
    for key in secondDict.keys():
        # A dict means the branch is itself a subtree, so call getNumLeafs recursively
        if isinstance(secondDict[key], dict):
            numLeafs += getNumLeafs(secondDict[key])
        # Otherwise the branch ends in a leaf node
        else:
            numLeafs += 1
    return numLeafs

# Get the depth of the decision tree
def getTreeDepth(tree):
    maxDepth = 0
    # Feature tested at the root of this (sub)tree
    firstFeat = list(tree.keys())[0]
    # Branches under firstFeat (a dict: feature value -> subtree or class label)
    secondDict = tree[firstFeat]
    # Visit every branch under firstFeat and keep the largest depth found
    for key in secondDict.keys():
        # A dict means the branch is itself a subtree, so call getTreeDepth
        # recursively to get the depth of that subtree
        if isinstance(secondDict[key], dict):
            thisDepth = 1 + getTreeDepth(secondDict[key])
        else:
            thisDepth = 1
        if thisDepth > maxDepth:
            maxDepth = thisDepth
    return maxDepth
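As a quick check of the two functions, the sketch below applies compact restatements of the same recursions to the sample tree. The names count_leaves and tree_depth are illustrative, and the code assumes every decision node has exactly one feature key and at least one branch.

```python
# Compact restatements of getNumLeafs and getTreeDepth (illustrative names).
def count_leaves(tree):
    branches = next(iter(tree.values()))
    # A dict child is a subtree; anything else is a leaf
    return sum(count_leaves(c) if isinstance(c, dict) else 1
               for c in branches.values())


def tree_depth(tree):
    branches = next(iter(tree.values()))
    # Depth counts edges from this node down to its deepest leaf
    return max(1 + tree_depth(c) if isinstance(c, dict) else 1
               for c in branches.values())


sample_tree = {'人品': {'好': '见 ', '差': {'富有': {'没钱': '不见',
               '有钱': {'外貌': {'漂亮': '见 ', '丑': '不见'}}}}}}
print(count_leaves(sample_tree))  # 4
print(tree_depth(sample_tree))    # 3
```

With 4 leaves and a depth of 3, the x-axis is cut into slots of width 1/4 and the y-axis into steps of height 1/3.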

3. Plotting the Decision Tree

3.1 Drawing Nodes

# Plot the decision tree
import matplotlib.pyplot as plt

def createPlot(tree):
    # Create a figure with a white background
    fig = plt.figure(1, facecolor='white')
    # Clear the figure
    fig.clf()
    # Hide the x- and y-axis ticks
    xyticks = dict(xticks=[], yticks=[])
    # frameon: whether to draw the rectangular frame of the axes
    createPlot.pTree = plt.subplot(111, frameon=False, **xyticks)
    # Number of leaf nodes in the tree (sets the horizontal spacing)
    plotTree.totalW = float(getNumLeafs(tree))
    # Depth of the tree (sets the vertical spacing)
    plotTree.totalD = float(getTreeDepth(tree))
    # x-coordinate of the most recently drawn leaf node
    plotTree.xOff = -0.5/plotTree.totalW
    # Depth currently being drawn: the y-coordinate
    plotTree.yOff = 1.0
    # (0.5, 1.0) is the coordinate of the root node
    plotTree(tree, (0.5, 1.0), '')
    plt.show()

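# Box styles for decision nodes and leaf nodes. boxstyle is the shape of the
# text box (sawtooth: a jagged outline); fc is the face (fill) color, here a gray level
decisionNode = dict(boxstyle="sawtooth", fc="0.5")
leafNode = dict(boxstyle="round4", fc="0.5")
# Arrow style. The value was cut off in the source text; "<-" is assumed here,
# which puts the arrowhead at the text (child) end of the line
arrow_args = dict(arrowstyle="<-")

# nodeText: text to display; centerPt: center of the text box, where the arrowhead lands;
# parentPt: the point the arrow starts from; nodeType: box style of the node
# ha='center', va='center': center the text horizontally and vertically; bbox: box properties
# arrowprops: arrow properties
# xycoords, textcoords: coordinate system; in "axes fraction", (0, 0) is the
# lower-left corner of the axes and (1, 1) is the upper-right corner
def plotNode(nodeText, centerPt, parentPt, nodeType):
    createPlot.pTree.annotate(nodeText, xy=parentPt, xycoords="axes fraction",
                              xytext=centerPt, textcoords='axes fraction',
                              va='center', ha='center', bbox=nodeType, arrowprops=arrow_args)

# Write midText at the midpoint of the line between centerPt and parentPt
def plotMidText(centerPt, parentPt, midText):
    xMid = (parentPt[0] - centerPt[0])/2.0 + centerPt[0]
    yMid = (parentPt[1] - centerPt[1])/2.0 + centerPt[1]
    createPlot.pTree.text(xMid, yMid, midText)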

Each call to plotNode draws one arrow and one node; plotMidText draws the text at the midpoint of the connecting line.

3.2 Drawing the Tree Recursively

The overall idea of drawing the decision tree recursively is:

(1) draw the current node;

(2) if a child of the current node is not a leaf node, recurse into it;

(3) if a child of the current node is a leaf node, draw it.
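The three steps can be sketched without any drawing by recording what would be drawn and in which order. This is only an illustration of the recursion; the function name visit and the tuple layout are assumptions, not part of the original code.

```python
# Sketch of the recursion only: record what would be drawn, in order.
def visit(tree, edge_label="", depth=0, out=None):
    out = [] if out is None else out
    feature = next(iter(tree))
    # (1) "draw" the current decision node
    out.append(("decision", edge_label, feature, depth))
    for value, child in tree[feature].items():
        if isinstance(child, dict):
            # (2) the child is a subtree: recurse into it
            visit(child, str(value), depth + 1, out)
        else:
            # (3) the child is a leaf: "draw" it
            out.append(("leaf", str(value), child, depth + 1))
    return out


sample_tree = {'人品': {'好': '见 ', '差': {'富有': {'没钱': '不见',
               '有钱': {'外貌': {'漂亮': '见 ', '丑': '不见'}}}}}}
order = visit(sample_tree)
for step in order:
    print(step)
```

Each tuple holds the node kind, the label on the edge from the parent, the node text, and the depth. The real drawing function additionally has to turn the visiting order and depth into x and y coordinates.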

def plotTree(tree, parentPt, nodeTxt):
    # Number of leaf nodes under the current node
    numLeafs = getNumLeafs(tree)
    # Feature tested at the current node
    firstFeat = list(tree.keys())[0]
    # Position of the current node: x is centered over its leaves, y is the current depth
    centerPt = (plotTree.xOff + (1.0 + float(numLeafs))/2.0/plotTree.totalW, plotTree.yOff)
    # Draw the current node and the label on the edge from its parent
    plotMidText(centerPt, parentPt, nodeTxt)
    plotNode(firstFeat, centerPt, parentPt, decisionNode)
    secondDict = tree[firstFeat]
    # Move the drawing depth one level down
    plotTree.yOff -= 1.0/plotTree.totalD
    for key in secondDict.keys():
        # The child is not a leaf node: recurse
        if isinstance(secondDict[key], dict):
            plotTree(secondDict[key], centerPt, str(key))
        # The child is a leaf node: draw it
        else:
            # plotTree.xOff changes only when a leaf node is drawn
            plotTree.xOff += 1.0/plotTree.totalW
            plotNode(secondDict[key], (plotTree.xOff, plotTree.yOff), centerPt, leafNode)
            plotMidText((plotTree.xOff, plotTree.yOff), centerPt, str(key))
    # Move back up one level before returning to the parent
    plotTree.yOff += 1.0/plotTree.totalD

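The canvas is divided evenly according to the number of leaf nodes and the depth of the tree, and both the x-axis and the y-axis have a total length of 1, as shown in the figure below:

Please forgive my drawing skills.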

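3.2.1 In the createPlot function:

plotTree.totalW: the number of leaf nodes, so the distance between two adjacent leaf nodes in the figure above is 1/plotTree.totalW;

plotTree.totalD: the depth of the decision tree;

plotTree.xOff: the x-coordinate of the most recently drawn leaf node, updated only when a leaf node is drawn. Its initial value, -0.5/plotTree.totalW, is the position of the dashed circle in the figure, so every later leaf position is obtained by adding an integer multiple of 1/plotTree.totalW;

plotTree.yOff = 1.0: the depth currently being drawn, initialized to the y-coordinate of the root node.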
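The effect of starting xOff half a slot to the left of the canvas can be checked with a few lines of arithmetic. The helper leaf_x_positions below is a hypothetical name used only for this illustration; it repeats the update that plotTree applies each time it draws a leaf.

```python
# Illustration of how the leaf x-coordinates follow from the initial xOff.
def leaf_x_positions(total_leaves):
    # Start half a slot to the left of the canvas, as createPlot does
    x_off = -0.5 / total_leaves
    xs = []
    for _ in range(total_leaves):
        # Each leaf moves the offset one full slot to the right
        x_off += 1.0 / total_leaves
        xs.append(x_off)
    return xs


print(leaf_x_positions(4))  # [0.125, 0.375, 0.625, 0.875]
```

For the 4 leaves of the sample tree, each leaf lands in the middle of its own slot of width 0.25, with a margin of half a slot on both sides of the canvas.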

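3.2.2 In the plotTree function:

# x-coordinate of the current node
centerPt = (plotTree.xOff + (1.0 + float(numLeafs))/2.0/plotTree.totalW, plotTree.yOff)

To determine the x-coordinate of the current node, we only need the number of leaf nodes under it: the node is placed at half of the width occupied by those leaves, that is, float(numLeafs)/2.0/plotTree.totalW;

Because plotTree.xOff marks the center of the last drawn leaf (initially -0.5/plotTree.totalW), which is half a slot to the left of where the current node's leaves begin, a further 0.5/plotTree.totalW has to be added. Together this gives (1.0 + float(numLeafs))/2.0/plotTree.totalW.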
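To see the formula at work, the sketch below computes the position of every node of the sample tree without drawing anything. It is a restatement under assumptions, not the article's code: the names layout and place are hypothetical, a local variable replaces the function attribute plotTree.xOff, and the integer level is reported instead of the y-coordinate (which would be 1 - level/totalD).

```python
# Hypothetical layout-only version of plotTree: returns positions instead of drawing.
def count_leaves(tree):
    branches = next(iter(tree.values()))
    return sum(count_leaves(c) if isinstance(c, dict) else 1
               for c in branches.values())


def layout(tree):
    """Return (text, x, level) for every node, in drawing order."""
    total_w = float(count_leaves(tree))
    placed = []
    x_off = -0.5 / total_w  # plays the role of plotTree.xOff

    def place(subtree, level):
        nonlocal x_off
        feature = next(iter(subtree))
        # Center the decision node over the leaves below it
        x = x_off + (1.0 + count_leaves(subtree)) / 2.0 / total_w
        placed.append((feature, x, level))
        for child in subtree[feature].values():
            if isinstance(child, dict):
                place(child, level + 1)
            else:
                # Only leaves advance the horizontal offset
                x_off += 1.0 / total_w
                placed.append((child, x_off, level + 1))

    place(tree, 0)
    return placed


sample_tree = {'人品': {'好': '见 ', '差': {'富有': {'没钱': '不见',
               '有钱': {'外貌': {'漂亮': '见 ', '丑': '不见'}}}}}}
positions = layout(sample_tree)
for text, x, level in positions:
    print(level, x, text)
```

The node 外貌 ends up at x = 0.75, midway between its two leaves at 0.625 and 0.875, and the root 人品 at x = 0.5, the center of the canvas.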

4. Visualizing the Decision Tree

# Let the node text be displayed in Chinese
# (Microsoft YaHei ships with Windows; pick another CJK font on other systems)
import matplotlib as mpl
mpl.rcParams["font.sans-serif"] = ["Microsoft YaHei"]
mpl.rcParams['axes.unicode_minus'] = False

# Create the dataset
def createDataSet():
    dataSet = [['有钱', '好', '漂亮', '见 '],
               ['有钱', '差', '漂亮', '见 '],
               ['有钱', '差', '丑', '不见'],
               ['没钱', '好', '丑', '见 '],
               ['没钱', '差', '漂亮', '不见'],
               ['没钱', '好', '漂亮', '见 ']]
    labels = ['富有', '人品', '外貌']
    return dataSet, labels

dataSet, dataLabels = createDataSet()
# Build the decision tree (createDecideTree is not defined in this part;
# it is presumably the tree-building function from Part 2)
myTree = createDecideTree(dataSet, dataLabels)
print(myTree)
# Plot the decision tree
createPlot(myTree)

The decision tree in dictionary form:

{'人品': {'好': '见 ', '差': {'富有': {'没钱': '不见', '有钱': {'外貌': {'漂亮': '见 ', '丑': '不见'}}}}}}

The decision tree as a tree diagram:

5. Using the Decision Tree

Given that the other person is rich, has poor character, and is good-looking, use the decision tree trained above to decide: meet or not?!

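# Classify a sample with the decision tree
def classify(tree, feat, featValue):
    firstFeat = list(tree.keys())[0]
    secondDict = tree[firstFeat]
    # Position of the tested feature in the sample's feature list
    featIndex = feat.index(firstFeat)
    # Stays None if the sample's value matches no branch of the tree
    classLabel = None
    for key in secondDict.keys():
        if featValue[featIndex] == key:
            # The branch is a subtree: keep descending
            if isinstance(secondDict[key], dict):
                classLabel = classify(secondDict[key], feat, featValue)
            # The branch is a leaf: its value is the class label
            else:
                classLabel = secondDict[key]
    return classLabel

feat = ['富有', '人品', '外貌']
featValue = ['有钱', '差', '漂亮']
print(classify(myTree, feat, featValue))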

Decision result (见 means "meet"):

见
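The same lookup can also be written as a loop that walks down one branch per feature. The sketch below is an alternative formulation, not the article's implementation; classify_iter and its default parameter are hypothetical names, and the default is returned when the sample contains a feature value that has no branch in the tree.

```python
# Hypothetical iterative variant of classify with an explicit fallback.
def classify_iter(tree, feat, featValue, default=None):
    node = tree
    # Keep descending while the current node is a decision node (a dict)
    while isinstance(node, dict):
        firstFeat = next(iter(node))
        value = featValue[feat.index(firstFeat)]
        branches = node[firstFeat]
        if value not in branches:
            # The tree has no branch for this feature value
            return default
        node = branches[value]
    # A non-dict node is a leaf, i.e. the class label
    return node


sample_tree = {'人品': {'好': '见 ', '差': {'富有': {'没钱': '不见',
               '有钱': {'外貌': {'漂亮': '见 ', '丑': '不见'}}}}}}
feat = ['富有', '人品', '外貌']
print(classify_iter(sample_tree, feat, ['有钱', '差', '漂亮']))
print(classify_iter(sample_tree, feat, ['有钱', '一般', '漂亮'], default='unknown'))
```

The first call follows 人品=差, 富有=有钱, 外貌=漂亮 and reaches the label 见; the second call hits a value of 人品 that has no branch and falls back to the default.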

6. Saving the Decision Tree Model

Building a decision tree takes considerable time, especially on large datasets, so once the model has been trained it is worth saving for later use. Python's pickle module solves this by serializing objects: a serialized object can be stored on disk and read back when needed.

# Save the decision tree model
import pickle

def saveTree(tree, fileName):
    # The with block closes the file even if pickle.dump raises
    with open(fileName, 'wb') as fw:
        pickle.dump(tree, fw)

# Load the decision tree model
def loadTree(fileName):
    with open(fileName, 'rb') as fr:
        return pickle.load(fr)

saveTree(myTree, 'myTree.txt')
print(loadTree('myTree.txt'))

{'人品': {'差': {'富有': {'有钱': {'外貌': {'丑': '不见', '漂亮': '见 '}}, '没钱': '不见'}}, '好': '见 '}}
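Because this particular tree is a nested dictionary whose keys and labels are all strings, it could also be stored as JSON, which stays human-readable and avoids unpickling files from untrusted sources. The sketch below shows this alternative; it is not what the article uses, and the names save_tree_json and load_tree_json as well as the file name myTree.json are assumptions.

```python
# Hypothetical JSON-based alternative to the pickle functions.
import json
import os
import tempfile


def save_tree_json(tree, file_name):
    # ensure_ascii=False keeps the Chinese text readable in the file
    with open(file_name, 'w', encoding='utf-8') as f:
        json.dump(tree, f, ensure_ascii=False, indent=2)


def load_tree_json(file_name):
    with open(file_name, encoding='utf-8') as f:
        return json.load(f)


sample_tree = {'人品': {'好': '见 ', '差': {'富有': {'没钱': '不见',
               '有钱': {'外貌': {'漂亮': '见 ', '丑': '不见'}}}}}}

# Round trip through a temporary directory so no file is left behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'myTree.json')
    save_tree_json(sample_tree, path)
    restored = load_tree_json(path)

print(restored == sample_tree)
```

This only works as a drop-in replacement while every key is a string: JSON converts non-string keys such as integer feature values to strings, so a tree built from numeric features would not survive the round trip unchanged, whereas pickle would preserve it.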

Coding Your Ambition！
