Hypothesis Testing: Code Edition

Published: 2024/9/27

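Common hypothesis tests include the one-sample t-test, the two-sample t-test, the paired t-test, and analysis of variance (ANOVA). The code below works through each of them, plus a chi-squared test, computing every statistic by hand and then checking it against scipy.stats.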

from scipy import stats
from scipy.stats import chi2, chi2_contingency
import pandas as pd

# 1 One-Sample T-Test
# H0: the blood pressure of the resident female doctors does not differ
# significantly from that of the general population (mean 120).
female_doctor_bps = [128, 127, 118, 115, 144, 142, 133, 140, 132, 131,
                     111, 132, 149, 122, 139, 119, 136, 129, 126, 128]
d = pd.DataFrame(female_doctor_bps)
d.columns = ["amt"]
d_ref = 120
d_std = d["amt"].std()           # sample standard deviation (ddof=1)
d_n = d.shape[0]                 # degrees of freedom would be d_n - 1
d_se = d_std / (d_n ** 0.5)      # standard error of the mean
d_tvalue = (d["amt"].mean() - d_ref) / d_se
print("one-sampe T-test:\tT values is:" + str(d_tvalue))
print(stats.ttest_1samp(female_doctor_bps, 120))
# p is about 0.0002, well below the usual 0.05 or 0.01 threshold, so we reject H0:
# the resting systolic blood pressure of the female doctors differs from the general population.

# 2 Two-sample T-test (reuses female_doctor_bps from above)
male_consultant_bps = [118, 115, 112, 120, 124, 130, 123, 110, 120, 121,
                       123, 125, 129, 130, 112, 117, 119, 120, 123, 128]
d_femal = pd.DataFrame(female_doctor_bps)
d_male = pd.DataFrame(male_consultant_bps)
d_femal_mean = d_femal.mean()[0]
d_male_mean = d_male.mean()[0]
d_femal_var = d_femal.var()[0]
d_male_var = d_male.var()[0]
d_femal_n = d_femal.shape[0]
d_male_n = d_male.shape[0]
# Pooled variance of the two samples
d_sp = ((d_femal_n - 1) * d_femal_var + (d_male_n - 1) * d_male_var) / (d_femal_n + d_male_n - 2)
d_t = (d_femal_mean - d_male_mean) / ((d_sp * (1 / d_femal_n + 1 / d_male_n)) ** 0.5)
print("Two-sample T-test:\tT values is:" + str(d_t))
print(stats.ttest_ind(female_doctor_bps, male_consultant_bps))
# p is about 0.0012, below the 0.05 or 0.01 threshold, so we reject H0: the resting systolic
# blood pressure of the female doctors and the male consultants differs significantly.

# 3 Paired T-Test
control = [8.0, 7.1, 6.5, 6.7, 7.2, 5.4, 4.7, 8.1, 6.3, 4.8]
treatment = [9.9, 7.9, 7.6, 6.8, 7.1, 9.9, 10.5, 9.7, 10.9, 8.2]
d_control = pd.DataFrame(control)
d_treatment = pd.DataFrame(treatment)
d_diff = d_treatment - d_control          # element-wise paired differences
d_mean = d_diff.mean()[0]
d_diff_std = d_diff.std()[0]              # standard deviation of the differences
d_treatment_n = d_treatment.shape[0]
d_t = d_mean / (d_diff_std / (d_treatment_n ** 0.5))
print("Paired T-Test:\tT values is:" + "\t" + str(d_t))
# scipy computes control - treatment here, so its statistic has the opposite sign.
print(stats.ttest_rel(control, treatment))
# p is about 0.0055, below the 0.05 or 0.01 threshold, so we reject H0:
# the sleeping pill changes the sleep duration.

# 4 Analysis of Variance (ANOVA)
# Full data set (10 observations per group):
# ctrl = [4.17, 5.58, 5.18, 6.11, 4.5, 4.61, 5.17, 4.53, 5.33, 5.14]
# trt1 = [4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69]
# trt2 = [6.31, 5.12, 5.54, 5.5, 5.37, 5.29, 4.92, 6.15, 5.8, 5.26]
# The groups have equal sizes here, but in general each group may have a different size.
ctrl = [4.17, 5.58, 5.18]
trt1 = [4.81, 4.17, 4.41]
trt2 = [6.31, 5.12, 5.54]
d_group = 3
d_ctr1 = pd.DataFrame(ctrl)
d_trt1 = pd.DataFrame(trt1)
d_trt2 = pd.DataFrame(trt2)
d_ctr1_mean = d_ctr1.mean()[0]
d_trt1_mean = d_trt1.mean()[0]
d_trt2_mean = d_trt2.mean()[0]
d_ctr1_n = d_ctr1.shape[0]
d_trt1_n = d_trt1.shape[0]
d_trt2_n = d_trt2.shape[0]
d_total_n = d_ctr1_n + d_trt1_n + d_trt2_n
# Grand mean: sum of all observations divided by the total sample size (9 here)
d_total_mean = (d_ctr1.sum()[0] + d_trt1.sum()[0] + d_trt2.sum()[0]) / d_total_n
# Between-group sum of squares (SSA)
d_ssa = ((d_ctr1_mean - d_total_mean) ** 2 * d_ctr1_n +
         (d_trt1_mean - d_total_mean) ** 2 * d_trt1_n +
         (d_trt2_mean - d_total_mean) ** 2 * d_trt2_n)
# Within-group sum of squares (SSE)
d_sse = sum((x - m) ** 2
            for grp, m in ((ctrl, d_ctr1_mean), (trt1, d_trt1_mean), (trt2, d_trt2_mean))
            for x in grp)
# Total sum of squares (SST); equals SSA + SSE
d_sst = sum((x - d_total_mean) ** 2 for x in ctrl + trt1 + trt2)
# Between-group mean square (MSA) = SSA / degrees of freedom
d_msa = d_ssa / (d_group - 1)
# Within-group mean square (MSE) = SSE / degrees of freedom
d_mse = d_sse / (d_total_n - d_group)
# MSA is also called the between-group variance, MSE the within-group variance
d_f = d_msa / d_mse
print("Analysis of Variance (ANOVA) f values:\t" + str(d_f))
print(stats.f_oneway(ctrl, trt1, trt2))

# 5 chi-squared test
table = [[10, 20, 30],
         [6, 9, 17]]
stat, p, dof, expected = chi2_contingency(table)
print('dof=%d' % dof)   # degrees of freedom: (rows - 1) * (cols - 1)
# print(expected) would show the expected count of every cell:
# [10.43478261 18.91304348 30.65217391]
# [ 5.56521739 10.08695652 16.34782609]
# Expected count for row 1, column 1 (the label printed below means exactly that)
print("第一行第一列期望值:\t" + str('%.8f' % ((10 + 6) / (10 + 6 + 20 + 9 + 30 + 17) * (10 + 20 + 30))))
# Chi-squared statistic computed by hand from the rounded expected counts above
print('卡方值:\t' + str('%.10f' % (
    (10 - 10.43478261) ** 2 / 10.43478261 + (20 - 18.91304348) ** 2 / 18.91304348 +
    (30 - 30.65217391) ** 2 / 30.65217391 + (6 - 5.56521739) ** 2 / 5.56521739 +
    (9 - 10.08695652) ** 2 / 10.08695652 + (17 - 16.34782609) ** 2 / 16.34782609)))
prob = 0.95
critical = chi2.ppf(prob, dof)
print('probability=%.3f, critical=%.3f, stat=%.8f' % (prob, critical, stat))
# Here p is greater than 0.05, so we fail to reject H0: there is no evidence of an
# association between the row and column variables (this test does not compare means).
# interpret the test statistic
if abs(stat) >= critical:
    print('Dependent (reject H0)')
else:
    print('Independent (fail to reject H0)')
# interpret p-value
alpha = 1.0 - prob
print('significance=%.3f, p=%.3f' % (alpha, p))
if p <= alpha:
    print('Dependent (reject H0)')
else:
    print('Independent (fail to reject H0)')

Execution output:

"F:\Python37\python.exe" E:/hypothesistest.py
one-sampe T-test:    T values is:4.512403659336718
Ttest_1sampResult(statistic=4.512403659336718, pvalue=0.00023838063630967753)
Two-sample T-test:    T values is:3.5143256412718564
Ttest_indResult(statistic=3.5143256412718564, pvalue=0.0011571376404026158)
Paired T-Test:    T values is:    3.624485995178213
Ttest_relResult(statistic=-3.6244859951782136, pvalue=0.0055329408161001415)
Analysis of Variance (ANOVA) f values:    3.23528624933119
F_onewayResult(statistic=3.2352862493311934, pvalue=0.11137675915188745)
dof=2
第一行第一列期望值:    10.43478261
卡方值:    0.2715746509
probability=0.950, critical=5.991, stat=0.27157465
Independent (fail to reject H0)
significance=0.050, p=0.873
Independent (fail to reject H0)

Process finished with exit code 0

Related figures (formula images not preserved):

1 One-Sample T-Test

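Note: SE stands for standard error, i.e. the standard error of the sample mean (the sample standard deviation divided by the square root of n).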
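As a minimal sketch of the note above, the standard error can also be obtained with scipy.stats.sem, and the two-sided p-value then follows from the t distribution with n - 1 degrees of freedom. The data and the reference mean of 120 come from the code section; the names mu0, se, t_value and p_value are illustrative only.

```python
import numpy as np
from scipy import stats

female_doctor_bps = [128, 127, 118, 115, 144, 142, 133, 140, 132, 131,
                     111, 132, 149, 122, 139, 119, 136, 129, 126, 128]
mu0 = 120  # reference mean of the general population

x = np.asarray(female_doctor_bps, dtype=float)
n = x.size
se = stats.sem(x)                      # s / sqrt(n), with s computed using ddof=1
t_value = (x.mean() - mu0) / se        # one-sample t statistic
p_value = 2 * stats.t.sf(abs(t_value), df=n - 1)   # two-sided p-value

print(t_value, p_value)
```

Both numbers should agree with what stats.ttest_1samp reports in the execution output.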

2 Two-sample T-test

Note: used to test whether the means of two independent samples differ.
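The manual calculation in the code section (d_sp and d_t) corresponds to the pooled-variance form of the statistic, which is also what stats.ttest_ind uses by default (equal_var=True). Written out, with the subscripts 1 and 2 standing for the two samples (notation chosen here for illustration):

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
```

The statistic is compared against a t distribution with n_1 + n_2 - 2 degrees of freedom.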

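3 Paired T-Test

Note 1: the numerator is the difference between the two paired sample means (equivalently, the mean of the paired differences), and s is the standard deviation of the element-wise differences between the two samples.

Note 2: used to compare two related samples, for example measurements before and after a treatment.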
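One way to see what the notes describe: a paired t-test is a one-sample t-test on the element-wise differences, tested against a mean of zero. The sketch below uses the control and treatment data from the code section; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

control = [8.0, 7.1, 6.5, 6.7, 7.2, 5.4, 4.7, 8.1, 6.3, 4.8]
treatment = [9.9, 7.9, 7.6, 6.8, 7.1, 9.9, 10.5, 9.7, 10.9, 8.2]

# Element-wise paired differences (treatment minus control)
diff = np.asarray(treatment) - np.asarray(control)

paired = stats.ttest_rel(treatment, control)   # paired test on the two samples
one_sample = stats.ttest_1samp(diff, 0.0)      # one-sample test on the differences

print(paired.statistic, paired.pvalue)
print(one_sample.statistic, one_sample.pvalue)
```

Passing the samples as (treatment, control) gives a positive statistic; swapping the order, as in the code section, only flips the sign and leaves the p-value unchanged.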

4 Analysis of Variance (ANOVA)

See the code section for the full calculation.
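The same calculation can be sketched more compactly with NumPy, in a form that also works when the groups have different sizes. This is an illustrative rewrite under that assumption, not the original author's code; it uses the three-observation groups from the code section and checks the identity SST = SSA + SSE.

```python
import numpy as np
from scipy import stats

groups = [np.array([4.17, 5.58, 5.18]),   # ctrl
          np.array([4.81, 4.17, 4.41]),   # trt1
          np.array([6.31, 5.12, 5.54])]   # trt2

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)            # number of groups
n_total = all_values.size  # total number of observations

# Between-group (SSA), within-group (SSE) and total (SST) sums of squares
ssa = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
sst = ((all_values - grand_mean) ** 2).sum()

msa = ssa / (k - 1)          # between-group mean square
mse = sse / (n_total - k)    # within-group mean square
f_value = msa / mse
p_value = stats.f.sf(f_value, k - 1, n_total - k)

print(f_value, p_value)
```

The result should match stats.f_oneway on the same three groups.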

5 chi-squared test

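Expected value T for each observed value A: (column total / grand total) * row total

The chi-squared statistic 0.27157465 lies between 0.21 and 0.45, the tabulated values for 2 degrees of freedom at p = 0.90 and p = 0.80, so the p-value lies between 0.80 and 0.90. The computed p-value is 0.873.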
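The expected-count rule and the p-value lookup above can be reproduced directly. This is a minimal sketch using the contingency table from the code section; names such as observed and row_totals are illustrative.

```python
import numpy as np
from scipy.stats import chi2

observed = np.array([[10, 20, 30],
                     [6, 9, 17]], dtype=float)

row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
grand_total = observed.sum()

# Expected count of each cell: row total * column total / grand total
expected = np.outer(row_totals, col_totals) / grand_total

# Chi-squared statistic and degrees of freedom, (rows - 1) * (cols - 1)
stat = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

p_value = chi2.sf(stat, dof)      # upper-tail probability of the statistic
critical = chi2.ppf(0.95, dof)    # critical value at the 5% significance level

print(expected)
print(stat, dof, p_value, critical)
```

Because the statistic is far below the critical value, the two interpretations in the code section (by critical value and by p-value) reach the same conclusion.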

6 References

Ben Alex Keen, Comparative Statistics in Python using SciPy

張俊紅, 方差分析 (Analysis of Variance), CSDN blog

ludan_xia, 卡方檢驗(詳解) (The Chi-Squared Test in Detail), CSDN blog

A Gentle Introduction to the Chi-Squared Test for Machine Learning
