
Kaggle Graduate Admissions (Part 2)

Published: 2024/10/8

Last time, we trained regression models on this data.

Since most candidates in the dataset have an admission chance above 70%, the regression models predicted the unsuccessful candidates poorly. The histogram below shows this skew in the target variable.

df["Chance of Admit"].plot(kind='hist', bins=200, figsize=(6,6))
plt.title("Chance of Admit")
plt.xlabel("Chance of Admit")
plt.ylabel("Frequency")
plt.show()


Preparing the Data for Classification

If a candidate's chance of admission is greater than 80%, the candidate gets label 1.
If a candidate's chance of admission is less than or equal to 80%, the candidate gets label 0.
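The thresholding rule above can also be written with `np.where` instead of a list comprehension; a minimal sketch with made-up chance values (not the actual dataset):

```python
import numpy as np

# Hypothetical "Chance of Admit" values, just to show the labeling rule.
chance = np.array([0.92, 0.76, 0.81, 0.65, 0.80])
labels = np.where(chance > 0.8, 1, 0)  # 1 only if strictly greater than 0.8
print(labels)  # [1 0 1 0 0]
```

Note that 0.80 itself gets label 0, because the rule is strictly greater than.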

# reading the dataset
df = pd.read_csv("../input/Admission_Predict.csv", sep=",")

# it may be needed in the future.
serialNo = df["Serial No."].values
df.drop(["Serial No."], axis=1, inplace=True)

y = df["Chance of Admit"].values
x = df.drop(["Chance of Admit"], axis=1)

# separating train (80%) and test (20%) sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.20, random_state=42)

# normalization
from sklearn.preprocessing import MinMaxScaler
scalerX = MinMaxScaler(feature_range=(0, 1))
x_train[x_train.columns] = scalerX.fit_transform(x_train[x_train.columns])
x_test[x_test.columns] = scalerX.transform(x_test[x_test.columns])

y_train_01 = [1 if each > 0.8 else 0 for each in y_train]
y_test_01 = [1 if each > 0.8 else 0 for each in y_test]

# list to array
y_train_01 = np.array(y_train_01)
y_test_01 = np.array(y_test_01)

Logistic Regression

from sklearn.linear_model import LogisticRegression
lrc = LogisticRegression()
lrc.fit(x_train, y_train_01)
print("score: ", lrc.score(x_test, y_test_01))
print("real value of y_test_01[1]: " + str(y_test_01[1]) + " -> the predict: " + str(lrc.predict(x_test.iloc[[1],:])))
print("real value of y_test_01[2]: " + str(y_test_01[2]) + " -> the predict: " + str(lrc.predict(x_test.iloc[[2],:])))

# confusion matrix
from sklearn.metrics import confusion_matrix
cm_lrc = confusion_matrix(y_test_01, lrc.predict(x_test))
# print("y_test_01 == 1 :" + str(len(y_test_01[y_test_01==1]))) # 29

# cm visualization
import seaborn as sns
import matplotlib.pyplot as plt
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_lrc, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.title("Test for Test Dataset")
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.show()

from sklearn.metrics import precision_score, recall_score
print("precision_score: ", precision_score(y_test_01, lrc.predict(x_test)))
print("recall_score: ", recall_score(y_test_01, lrc.predict(x_test)))

from sklearn.metrics import f1_score
print("f1_score: ", f1_score(y_test_01, lrc.predict(x_test)))

score: 0.9
real value of y_test_01[1]: 0 -> the predict: [0]
real value of y_test_01[2]: 1 -> the predict: [1]


precision_score: 0.9565217391304348
recall_score: 0.7586206896551724
f1_score: 0.8461538461538461
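The printed precision, recall, and F1 scores can be recovered directly from the four entries of the confusion matrix. A small self-contained check with made-up labels (not the actual test set) shows the relationship:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true/predicted labels, only to illustrate the metric formulas.
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
precision = tp / (tp + fp)  # of predicted 1s, how many were truly 1
recall    = tp / (tp + fn)  # of true 1s, how many were found
f1        = 2 * precision * recall / (precision + recall)

print(precision, recall, f1)  # 0.75 0.75 0.75
```

The same arithmetic applied to the heatmap cells above reproduces sklearn's `precision_score`, `recall_score`, and `f1_score` output.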

Test for Train Dataset:

cm_lrc_train = confusion_matrix(y_train_01, lrc.predict(x_train))
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_lrc_train, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.title("Test for Train Dataset")
plt.show()

SVC

from sklearn.svm import SVC
svm = SVC(random_state=1)
svm.fit(x_train, y_train_01)
print("score: ", svm.score(x_test, y_test_01))
print("real value of y_test_01[1]: " + str(y_test_01[1]) + " -> the predict: " + str(svm.predict(x_test.iloc[[1],:])))
print("real value of y_test_01[2]: " + str(y_test_01[2]) + " -> the predict: " + str(svm.predict(x_test.iloc[[2],:])))

# confusion matrix
from sklearn.metrics import confusion_matrix
cm_svm = confusion_matrix(y_test_01, svm.predict(x_test))
# print("y_test_01 == 1 :" + str(len(y_test_01[y_test_01==1]))) # 29

# cm visualization
import seaborn as sns
import matplotlib.pyplot as plt
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_svm, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.title("Test for Test Dataset")
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.show()

from sklearn.metrics import precision_score, recall_score
print("precision_score: ", precision_score(y_test_01, svm.predict(x_test)))
print("recall_score: ", recall_score(y_test_01, svm.predict(x_test)))

from sklearn.metrics import f1_score
print("f1_score: ", f1_score(y_test_01, svm.predict(x_test)))

score: 0.9
real value of y_test_01[1]: 0 -> the predict: [0]
real value of y_test_01[2]: 1 -> the predict: [1]

precision_score: 0.9565217391304348
recall_score: 0.7586206896551724
f1_score: 0.8461538461538461

Test for Train Dataset

cm_svm_train = confusion_matrix(y_train_01, svm.predict(x_train))
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_svm_train, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.title("Test for Train Dataset")
plt.show()

Naive Bayes

from sklearn.naive_bayes import GaussianNB
nb = GaussianNB()
nb.fit(x_train, y_train_01)
print("score: ", nb.score(x_test, y_test_01))
print("real value of y_test_01[1]: " + str(y_test_01[1]) + " -> the predict: " + str(nb.predict(x_test.iloc[[1],:])))
print("real value of y_test_01[2]: " + str(y_test_01[2]) + " -> the predict: " + str(nb.predict(x_test.iloc[[2],:])))

# confusion matrix
from sklearn.metrics import confusion_matrix
cm_nb = confusion_matrix(y_test_01, nb.predict(x_test))
# print("y_test_01 == 1 :" + str(len(y_test_01[y_test_01==1]))) # 29

# cm visualization
import seaborn as sns
import matplotlib.pyplot as plt
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_nb, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.title("Test for Test Dataset")
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.show()

from sklearn.metrics import precision_score, recall_score
print("precision_score: ", precision_score(y_test_01, nb.predict(x_test)))
print("recall_score: ", recall_score(y_test_01, nb.predict(x_test)))

from sklearn.metrics import f1_score
print("f1_score: ", f1_score(y_test_01, nb.predict(x_test)))

score: 0.9625
real value of y_test_01[1]: 0 -> the predict: [0]
real value of y_test_01[2]: 1 -> the predict: [1]


precision_score: 0.9333333333333333
recall_score: 0.9655172413793104
f1_score: 0.9491525423728815

Test for Train Dataset:

cm_nb_train = confusion_matrix(y_train_01, nb.predict(x_train))
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_nb_train, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.title("Test for Train Dataset")
plt.show()

Decision Tree

from sklearn.tree import DecisionTreeClassifier
dtc = DecisionTreeClassifier()
dtc.fit(x_train, y_train_01)
print("score: ", dtc.score(x_test, y_test_01))
print("real value of y_test_01[1]: " + str(y_test_01[1]) + " -> the predict: " + str(dtc.predict(x_test.iloc[[1],:])))
print("real value of y_test_01[2]: " + str(y_test_01[2]) + " -> the predict: " + str(dtc.predict(x_test.iloc[[2],:])))

# confusion matrix
from sklearn.metrics import confusion_matrix
cm_dtc = confusion_matrix(y_test_01, dtc.predict(x_test))
# print("y_test_01 == 1 :" + str(len(y_test_01[y_test_01==1]))) # 29

# cm visualization
import seaborn as sns
import matplotlib.pyplot as plt
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_dtc, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.title("Test for Test Dataset")
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.show()

from sklearn.metrics import precision_score, recall_score
print("precision_score: ", precision_score(y_test_01, dtc.predict(x_test)))
print("recall_score: ", recall_score(y_test_01, dtc.predict(x_test)))

from sklearn.metrics import f1_score
print("f1_score: ", f1_score(y_test_01, dtc.predict(x_test)))

score: 0.9375
real value of y_test_01[1]: 0 -> the predict: [0]
real value of y_test_01[2]: 1 -> the predict: [1]

precision_score: 0.9615384615384616
recall_score: 0.8620689655172413
f1_score: 0.9090909090909091

Test for Train Dataset

cm_dtc_train = confusion_matrix(y_train_01, dtc.predict(x_train))
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_dtc_train, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.title("Test for Train Dataset")
plt.show()

Random Forest

from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier(n_estimators=100, random_state=1)
rfc.fit(x_train, y_train_01)
print("score: ", rfc.score(x_test, y_test_01))
print("real value of y_test_01[1]: " + str(y_test_01[1]) + " -> the predict: " + str(rfc.predict(x_test.iloc[[1],:])))
print("real value of y_test_01[2]: " + str(y_test_01[2]) + " -> the predict: " + str(rfc.predict(x_test.iloc[[2],:])))

# confusion matrix
from sklearn.metrics import confusion_matrix
cm_rfc = confusion_matrix(y_test_01, rfc.predict(x_test))
# print("y_test_01 == 1 :" + str(len(y_test_01[y_test_01==1]))) # 29

# cm visualization
import seaborn as sns
import matplotlib.pyplot as plt
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_rfc, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.title("Test for Test Dataset")
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.show()

from sklearn.metrics import precision_score, recall_score
print("precision_score: ", precision_score(y_test_01, rfc.predict(x_test)))
print("recall_score: ", recall_score(y_test_01, rfc.predict(x_test)))

from sklearn.metrics import f1_score
print("f1_score: ", f1_score(y_test_01, rfc.predict(x_test)))

score: 0.9375
real value of y_test_01[1]: 0 -> the predict: [0]
real value of y_test_01[2]: 1 -> the predict: [1]

precision_score: 0.9615384615384616
recall_score: 0.8620689655172413
f1_score: 0.9090909090909091

Test for Train Dataset

cm_rfc_train = confusion_matrix(y_train_01, rfc.predict(x_train))
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_rfc_train, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.title("Test for Train Dataset")
plt.show()

kNN

from sklearn.neighbors import KNeighborsClassifier

# finding k value
scores = []
for each in range(1, 50):
    knn_n = KNeighborsClassifier(n_neighbors=each)
    knn_n.fit(x_train, y_train_01)
    scores.append(knn_n.score(x_test, y_test_01))

plt.plot(range(1, 50), scores)
plt.xlabel("k")
plt.ylabel("accuracy")
plt.show()

knn = KNeighborsClassifier(n_neighbors=3)  # n_neighbors = k
knn.fit(x_train, y_train_01)
print("score of 3 :", knn.score(x_test, y_test_01))
print("real value of y_test_01[1]: " + str(y_test_01[1]) + " -> the predict: " + str(knn.predict(x_test.iloc[[1],:])))
print("real value of y_test_01[2]: " + str(y_test_01[2]) + " -> the predict: " + str(knn.predict(x_test.iloc[[2],:])))

# confusion matrix
from sklearn.metrics import confusion_matrix
cm_knn = confusion_matrix(y_test_01, knn.predict(x_test))
# print("y_test_01 == 1 :" + str(len(y_test_01[y_test_01==1]))) # 29

# cm visualization
import seaborn as sns
import matplotlib.pyplot as plt
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_knn, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.title("Test for Test Dataset")
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.show()

from sklearn.metrics import precision_score, recall_score
print("precision_score: ", precision_score(y_test_01, knn.predict(x_test)))
print("recall_score: ", recall_score(y_test_01, knn.predict(x_test)))

from sklearn.metrics import f1_score
print("f1_score: ", f1_score(y_test_01, knn.predict(x_test)))


score of 3 : 0.9375
real value of y_test_01[1]: 0 -> the predict: [0]
real value of y_test_01[2]: 1 -> the predict: [1]


precision_score: 0.9285714285714286
recall_score: 0.896551724137931
f1_score: 0.912280701754386

Test for Train Dataset:

cm_knn_train = confusion_matrix(y_train_01, knn.predict(x_train))
f, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cm_knn_train, annot=True, linewidths=0.5, linecolor="red", fmt=".0f", ax=ax)
plt.xlabel("predicted y values")
plt.ylabel("real y values")
plt.title("Test for Train Dataset")
plt.show()


All of the classification algorithms scored around 90%. The most successful was Gaussian Naive Bayes, with a score of about 96%.

y = np.array([lrc.score(x_test, y_test_01), svm.score(x_test, y_test_01), nb.score(x_test, y_test_01),
              dtc.score(x_test, y_test_01), rfc.score(x_test, y_test_01), knn.score(x_test, y_test_01)])
# x = ["LogisticRegression","SVM","GaussianNB","DecisionTreeClassifier","RandomForestClassifier","KNeighborsClassifier"]
x = ["LogisticReg.", "SVM", "GNB", "Dec.Tree", "Ran.Forest", "KNN"]

plt.bar(x, y)
plt.title("Comparison of Classification Algorithms")
plt.xlabel("Classification")
plt.ylabel("Score")
plt.show()
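Rather than reading the winner off the bar chart, the best model can be picked programmatically with `np.argmax`. A self-contained sketch using the test scores reported above (hardcoded here so it runs on its own):

```python
import numpy as np

names  = ["LogisticReg.", "SVM", "GNB", "Dec.Tree", "Ran.Forest", "KNN"]
scores = np.array([0.9, 0.9, 0.9625, 0.9375, 0.9375, 0.9375])  # test scores from above

best = int(np.argmax(scores))  # index of the highest score
print(names[best], scores[best])  # GNB 0.9625
```

In the notebook itself, the `y` array built above could be passed in place of the hardcoded `scores`.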


The previous article covered regression algorithms; this one covered classification.

