Pandas: Common Statistical Methods (describe, quantile, sum, mean, median, count, max, min, idxmax, idxmin, mad, var, std, cumsum)
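Theory:
describe(): a quick statistical summary of each numeric column. It reports the following statistics:
count: number of non-null values
mean: arithmetic mean
std: standard deviation
min: minimum
25%: first quartile (25th percentile)
50%: second quartile (50th percentile, i.e. the median)
75%: third quartile (75th percentile)
max: maximum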
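As a minimal sketch of the summary described above, the following builds a small frame and reads two entries out of the describe() result. The column names and values are invented for illustration and are not taken from the article's dataset.

```python
import pandas as pd

# Hypothetical toy data (not from the article's dataset).
df = pd.DataFrame({
    "score": [1.0, 2.0, 3.0, 4.0, None],  # one missing value
    "rank": [5, 4, 3, 2, 1],
})

summary = df.describe()
print(summary)

# describe() returns a DataFrame indexed by statistic name,
# so individual entries can be looked up with .loc.
score_count = summary.loc["count", "score"]   # missing value is not counted
score_median = summary.loc["50%", "score"]    # median of 1, 2, 3, 4
```

Because the result is an ordinary DataFrame, it can be sliced, rounded, or exported like any other frame.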
quantile(q):
Returns the value at quantile q. The default is q=0.5 (the median), and q must lie in the range [0, 1].
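The behaviour of q can be sketched on a small Series. The data here is made up; the only assumption is pandas' default linear interpolation between neighbouring values.

```python
import pandas as pd

# Hypothetical data for illustration.
s = pd.Series([1, 2, 3, 4, 5])

median = s.quantile()                     # q defaults to 0.5
q1 = s.quantile(0.25)                     # first quartile
quartiles = s.quantile([0.25, 0.5, 0.75])  # a list of q values returns a Series

print(median, q1)
print(quartiles)
```

Passing a list for q is a convenient way to get several percentiles in one call; the result is indexed by the requested q values.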
Common statistical methods:
sum(): sum
mean(): mean
median(): median
count(): number of non-null values
Note: by default these methods skip missing values (NaN) instead of including them in the calculation.
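The note about missing values can be checked with a short sketch. The Series below is invented; skipna is the standard pandas parameter that controls this behaviour.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value.
s = pd.Series([1.0, np.nan, 3.0])

total = s.sum()      # NaN is skipped: 1.0 + 3.0
average = s.mean()   # mean of the two non-null values
middle = s.median()  # median of the two non-null values
n = s.count()        # number of non-null values

# With skipna=False the missing value propagates and the result is NaN.
strict_total = s.sum(skipna=False)

print(total, average, middle, n, strict_total)
```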
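max(): maximum
min(): minimum
idxmax(): index label of the maximum value
idxmin(): index label of the minimum value
Note: in older pandas releases Series.argmax() and Series.argmin() were deprecated aliases of idxmax() and idxmin(). In pandas 1.0 and later they return the integer position of the extreme value rather than its label, so use idxmax() and idxmin() when you want the label.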
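To make the label-versus-position distinction concrete, here is a small sketch on a Series with string labels. The data is invented, and the argmax() result assumes pandas 1.0 or later.

```python
import pandas as pd

# Hypothetical Series with non-integer labels.
s = pd.Series([3.2, 7.5, 1.1], index=["a", "b", "c"])

label_of_max = s.idxmax()   # index label of the largest value
label_of_min = s.idxmin()   # index label of the smallest value
position_of_max = s.argmax()  # integer position (pandas >= 1.0)

print(label_of_max, label_of_min, position_of_max)
```

With the default RangeIndex the label and the position coincide, which is why the difference is easy to overlook.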
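mad(): mean absolute deviation, the average absolute distance of the values from their mean; it is one measure of how spread out the values are. Note that mad() was deprecated in pandas 1.5 and removed in pandas 2.0.
var(): variance (sample variance, ddof=1 by default)
std(): standard deviation (sample standard deviation, ddof=1 by default)
cumsum(): cumulative sum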
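The dispersion measures above can be sketched on a tiny Series. Since mad() is not available in recent pandas versions, the mean absolute deviation is computed by hand here; this manual formula is an illustration, not a call from the original notebook.

```python
import pandas as pd

# Hypothetical data; the mean is 4.0.
s = pd.Series([2.0, 4.0, 4.0, 6.0])

# Mean absolute deviation written out explicitly.
mean_abs_dev = (s - s.mean()).abs().mean()

sample_var = s.var()            # divides by n - 1 (ddof=1)
population_var = s.var(ddof=0)  # divides by n
sample_std = s.std()            # square root of the sample variance

running_total = s.cumsum()      # same length as the input

print(mean_abs_dev, sample_var, population_var, sample_std)
print(running_total)
```

Unlike the other methods, cumsum() is not a reduction: it returns one value per input row.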
Section 15. Common statistical methods (1): describe, quantile
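In [1]:
import pandas as pd
In [2]:
# Local path from the original notebook; point it at your own copy of the dataset.
data = pd.read_csv(r'C:\Users\ML Learning\Projects\第四章-數據分析預習內容\第四章-數據分析預習內容\第一節-數據分析工具pandas基礎\lesson_05\lesson_05\examples\datasets\2021_happiness.csv')
data.head()
Out[2]:
| Country | Region | Happiness Rank | Happiness Score | Lower Confidence Interval | Upper Confidence Interval | Economy (GDP per Capita) | Family | Health (Life Expectancy) | Freedom | Trust (Government Corruption) | Generosity | Dystopia Residual |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |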
| Denmark | Western Europe | 1 | 7.526 | 7.460 | 7.592 | 1.44178 | 1.16374 | 0.79504 | 0.57941 | 0.44453 | 0.36171 | 2.73939 |
| Switzerland | Western Europe | 2 | 7.509 | 7.428 | 7.590 | 1.52733 | 1.14524 | 0.86303 | 0.58557 | 0.41203 | 0.28083 | 2.69463 |
| Iceland | Western Europe | 3 | 7.501 | 7.333 | 7.669 | 1.42666 | 1.18326 | 0.86733 | 0.56624 | 0.14975 | 0.47678 | 2.83137 |
| Norway | Western Europe | 4 | 7.498 | 7.421 | 7.575 | 1.57744 | 1.12690 | 0.79579 | 0.59609 | 0.35776 | 0.37895 | 2.66465 |
| Finland | Western Europe | 5 | 7.413 | 7.351 | 7.475 | 1.40598 | 1.13464 | 0.81091 | 0.57104 | 0.41004 | 0.25492 | 2.82596 |
In [3]:
data.describe()
Out[3]:
|  | Happiness Rank | Happiness Score | Lower Confidence Interval | Upper Confidence Interval | Economy (GDP per Capita) | Family | Health (Life Expectancy) | Freedom | Trust (Government Corruption) | Generosity | Dystopia Residual |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| count | 157.000000 | 157.000000 | 157.000000 | 157.000000 | 157.000000 | 157.000000 | 157.000000 | 157.000000 | 157.000000 | 157.000000 | 157.000000 |
| mean | 78.980892 | 5.382185 | 5.282395 | 5.481975 | 0.953880 | 0.793621 | 0.557619 | 0.370994 | 0.137624 | 0.242635 | 2.325807 |
| std | 45.466030 | 1.141674 | 1.148043 | 1.136493 | 0.412595 | 0.266706 | 0.229349 | 0.145507 | 0.111038 | 0.133756 | 0.542220 |
| min | 1.000000 | 2.905000 | 2.732000 | 3.078000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.817890 |
| 25% | 40.000000 | 4.404000 | 4.327000 | 4.465000 | 0.670240 | 0.641840 | 0.382910 | 0.257480 | 0.061260 | 0.154570 | 2.031710 |
| 50% | 79.000000 | 5.314000 | 5.237000 | 5.419000 | 1.027800 | 0.841420 | 0.596590 | 0.397470 | 0.105470 | 0.222450 | 2.290740 |
| 75% | 118.000000 | 6.269000 | 6.154000 | 6.434000 | 1.279640 | 1.021520 | 0.729930 | 0.484530 | 0.175540 | 0.311850 | 2.664650 |
| max | 157.000000 | 7.526000 | 7.460000 | 7.669000 | 1.824270 | 1.183260 | 0.952770 | 0.608480 | 0.505210 | 0.819710 | 3.837720 |
In [4]:
# numeric_only=True skips the text columns (Country, Region); recent pandas versions raise an error without it.
data.quantile(q=0.5, numeric_only=True)
Out[4]:
Happiness Rank                   79.00000
Happiness Score                   5.31400
Lower Confidence Interval         5.23700
Upper Confidence Interval         5.41900
Economy (GDP per Capita)          1.02780
Family                            0.84142
Health (Life Expectancy)          0.59659
Freedom                           0.39747
Trust (Government Corruption)     0.10547
Generosity                        0.22245
Dystopia Residual                 2.29074
Name: 0.5, dtype: float64
In [5]:
data.quantile(q=0.25, numeric_only=True)
Out[5]:
Happiness Rank                   40.00000
Happiness Score                   4.40400
Lower Confidence Interval         4.32700
Upper Confidence Interval         4.46500
Economy (GDP per Capita)          0.67024
Family                            0.64184
Health (Life Expectancy)          0.38291
Freedom                           0.25748
Trust (Government Corruption)     0.06126
Generosity                        0.15457
Dystopia Residual                 2.03171
Name: 0.25, dtype: float64
In [6]:
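import pandas as pd
In [7]:
# Local path from the original notebook; point it at your own copy of the dataset.
data = pd.read_csv(r'C:\Users\ML Learning\Projects\第四章-數據分析預習內容\第四章-數據分析預習內容\第一節-數據分析工具pandas基礎\lesson_05\lesson_05\examples\datasets\log.csv')
data.head()
Out[7]:
| time | user | video | playback position | paused | volume |
| --- | --- | --- | --- | --- | --- |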
| 1469974424 | cheryl | intro.html | 5 | False | 10.0 |
| 1469974454 | cheryl | intro.html | 6 | NaN | NaN |
| 1469974544 | cheryl | intro.html | 9 | NaN | NaN |
| 1469974574 | cheryl | intro.html | 10 | NaN | NaN |
| 1469977514 | bob | intro.html | 1 | NaN | NaN |
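The log data above mixes text columns (user, video) with numeric ones, which affects how whole-frame aggregations behave. The following is a minimal sketch of restricting an aggregation to numeric columns; the frame is a hypothetical stand-in with invented values, not the real log.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical frame shaped like the log data (values invented).
log = pd.DataFrame({
    "user": ["cheryl", "cheryl", "bob"],
    "playback position": [5, 6, 1],
    "volume": [10.0, np.nan, 5.0],
})

# Option 1: ask the reduction itself to ignore non-numeric columns.
numeric_means = log.mean(numeric_only=True)

# Option 2: select the numeric columns first, then aggregate.
numeric_sums = log.select_dtypes(include="number").sum()

# count() works on every column and reports non-null entries.
non_null = log.count()

print(numeric_means)
print(numeric_sums)
print(non_null)
```

Selecting columns first keeps the intent explicit and behaves the same way across pandas versions, which is why it is often the safer choice for mixed-type frames.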
In [8]:
data.sum()  # sum; text columns are concatenated
Out[8]:
time                                                       48509194942
user                 cherylcherylcherylcherylbobbobbobbobcherylcher...
video                intro.htmlintro.htmlintro.htmlintro.htmlintro....
playback position                                                  429
paused                                                               1
volume                                                              35
dtype: object
In [9]:
# mean; the output below comes from an older pandas that silently dropped the text columns.
# On pandas 2.0 and later this call raises a TypeError unless the text columns are excluded first.
data.mean()
Out[9]:
time                 1.469976e+09
playback position    1.300000e+01
paused               3.333333e-01
volume               8.750000e+00
dtype: float64
In [10]:
data.median()  # median; the same caveat about text columns applies
Out[10]:
time                 1.469975e+09
playback position    1.000000e+01
paused               0.000000e+00
volume               1.000000e+01
dtype: float64
In [11]:
data.count()  # number of non-null values per column
Out[11]:
time                 33
user                 33
video                33
playback position    33
paused                3
volume                4
dtype: int64
In [12]:
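import pandas as pd
In [13]:
# Local path from the original notebook; point it at your own copy of the dataset.
data = pd.read_csv(r'C:\Users\ML Learning\Projects\第四章-數據分析預習內容\第四章-數據分析預習內容\第一節-數據分析工具pandas基礎\lesson_05\lesson_05\examples\datasets\2021_happiness.csv')
data.head()
Out[13]:
| Country | Region | Happiness Rank | Happiness Score | Lower Confidence Interval | Upper Confidence Interval | Economy (GDP per Capita) | Family | Health (Life Expectancy) | Freedom | Trust (Government Corruption) | Generosity | Dystopia Residual |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |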
| Denmark | Western Europe | 1 | 7.526 | 7.460 | 7.592 | 1.44178 | 1.16374 | 0.79504 | 0.57941 | 0.44453 | 0.36171 | 2.73939 |
| Switzerland | Western Europe | 2 | 7.509 | 7.428 | 7.590 | 1.52733 | 1.14524 | 0.86303 | 0.58557 | 0.41203 | 0.28083 | 2.69463 |
| Iceland | Western Europe | 3 | 7.501 | 7.333 | 7.669 | 1.42666 | 1.18326 | 0.86733 | 0.56624 | 0.14975 | 0.47678 | 2.83137 |
| Norway | Western Europe | 4 | 7.498 | 7.421 | 7.575 | 1.57744 | 1.12690 | 0.79579 | 0.59609 | 0.35776 | 0.37895 | 2.66465 |
| Finland | Western Europe | 5 | 7.413 | 7.351 | 7.475 | 1.40598 | 1.13464 | 0.81091 | 0.57104 | 0.41004 | 0.25492 | 2.82596 |
In [14]:
data.max()
Out[14]:
Country                                Zimbabwe
Region                           Western Europe
Happiness Rank                              157
Happiness Score                           7.526
Lower Confidence Interval                  7.46
Upper Confidence Interval                 7.669
Economy (GDP per Capita)                1.82427
Family                                  1.18326
Health (Life Expectancy)                0.95277
Freedom                                 0.60848
Trust (Government Corruption)           0.50521
Generosity                              0.81971
Dystopia Residual                       3.83772
dtype: object
In [15]:
data.min()
Out[15]:
Country                                        Afghanistan
Region                           Australia and New Zealand
Happiness Rank                                           1
Happiness Score                                      2.905
Lower Confidence Interval                            2.732
Upper Confidence Interval                            3.078
Economy (GDP per Capita)                                 0
Family                                                   0
Health (Life Expectancy)                                 0
Freedom                                                  0
Trust (Government Corruption)                            0
Generosity                                               0
Dystopia Residual                                  0.81789
dtype: object
In [17]:
data['Happiness Score'].idxmax()
Out[17]:
0
In [18]:
data['Happiness Score'].idxmin()
Out[18]:
156
In [21]:
data.mad()  # mean absolute deviation (this method was removed in pandas 2.0)
Out[21]:
Happiness Rank                   39.254899
Happiness Score                   0.955256
Lower Confidence Interval         0.957480
Upper Confidence Interval         0.953032
Economy (GDP per Capita)          0.342828
Family                            0.211727
Health (Life Expectancy)          0.188426
Freedom                           0.119887
Trust (Government Corruption)     0.084441
Generosity                        0.102143
Dystopia Residual                 0.413041
dtype: float64
In [22]:
data.var(numeric_only=True)  # variance; numeric_only skips Country and Region
Out[22]:
Happiness Rank                   2067.159889
Happiness Score                     1.303418
Lower Confidence Interval           1.318002
Upper Confidence Interval           1.291617
Economy (GDP per Capita)            0.170235
Family                              0.071132
Health (Life Expectancy)            0.052601
Freedom                             0.021172
Trust (Government Corruption)       0.012329
Generosity                          0.017891
Dystopia Residual                   0.294003
dtype: float64
In [23]:
data.std(numeric_only=True)  # standard deviation
Out[23]:
Happiness Rank                   45.466030
Happiness Score                   1.141674
Lower Confidence Interval         1.148043
Upper Confidence Interval         1.136493
Economy (GDP per Capita)          0.412595
Family                            0.266706
Health (Life Expectancy)          0.229349
Freedom                           0.145507
Trust (Government Corruption)     0.111038
Generosity                        0.133756
Dystopia Residual                 0.542220
dtype: float64
In [24]:
data.cumsum()  # cumulative sum; text columns are concatenated row by row
Out[24]:
| Country | Region | Happiness Rank | Happiness Score | Lower Confidence Interval | Upper Confidence Interval | Economy (GDP per Capita) | Family | Health (Life Expectancy) | Freedom | Trust (Government Corruption) | Generosity | Dystopia Residual |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Denmark | Western Europe | 1 | 7.526 | 7.460 | 7.592 | 1.44178 | 1.16374 | 0.79504 | 0.57941 | 0.44453 | 0.36171 | 2.73939 |
| DenmarkSwitzerland | Western EuropeWestern Europe | 3 | 15.035 | 14.888 | 15.182 | 2.96911 | 2.30898 | 1.65807 | 1.16498 | 0.85656 | 0.64254 | 5.43402 |
| DenmarkSwitzerlandIceland | Western EuropeWestern EuropeWestern Europe | 6 | 22.536 | 22.221 | 22.851 | 4.39577 | 3.49224 | 2.52540 | 1.73122 | 1.00631 | 1.11932 | 8.26539 |
| DenmarkSwitzerlandIcelandNorway | Western EuropeWestern EuropeWestern EuropeWest... | 10 | 30.034 | 29.642 | 30.426 | 5.97321 | 4.61914 | 3.32119 | 2.32731 | 1.36407 | 1.49827 | 10.93004 |
| DenmarkSwitzerlandIcelandNorwayFinland | Western EuropeWestern EuropeWestern EuropeWest... | 15 | 37.447 | 36.993 | 37.901 | 7.37919 | 5.75378 | 4.13210 | 2.89835 | 1.77411 | 1.75319 | 13.75600 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| DenmarkSwitzerlandIcelandNorwayFinlandCanadaNe... | Western EuropeWestern EuropeWestern EuropeWest... | 11778 | 832.366 | 817.188 | 847.544 | 148.28013 | 124.10506 | 86.33722 | 57.62264 | 21.15342 | 36.91896 | 357.94872 |
| DenmarkSwitzerlandIcelandNorwayFinlandCanadaNe... | Western EuropeWestern EuropeWestern EuropeWest... | 11932 | 835.726 | 820.476 | 850.976 | 148.66240 | 124.21543 | 86.51066 | 57.78694 | 21.22454 | 37.23164 | 360.09430 |
| DenmarkSwitzerlandIcelandNorwayFinlandCanadaNe... | Western EuropeWestern EuropeWestern EuropeWest... | 12087 | 839.029 | 823.668 | 854.390 | 148.94363 | 124.21543 | 86.75877 | 58.13372 | 21.34041 | 37.40681 | 362.22970 |
| DenmarkSwitzerlandIcelandNorwayFinlandCanadaNe... | Western EuropeWestern EuropeWestern EuropeWest... | 12243 | 842.098 | 826.604 | 857.592 | 149.69082 | 124.36409 | 87.38871 | 58.20284 | 21.51274 | 37.89078 | 363.04759 |
| DenmarkSwitzerlandIcelandNorwayFinlandCanadaNe... | Western EuropeWestern EuropeWestern EuropeWest... | 12400 | 845.003 | 829.336 | 860.670 | 149.75913 | 124.59851 | 87.54618 | 58.24604 | 21.60693 | 38.09368 | 365.15163 |
157 rows × 13 columns
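As the output above shows, cumsum() on a mixed frame concatenates the text columns, which is rarely what is wanted. A minimal sketch of computing a running total on numeric data only follows; the frame is a hypothetical stand-in with invented values.

```python
import pandas as pd

# Hypothetical frame with one text and one numeric column (values invented).
df = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Happiness Score": [7.5, 7.0, 6.5],
})

# Running total of a single column.
running_score = df["Happiness Score"].cumsum()

# Running totals of every numeric column, leaving text columns out.
numeric_running = df.select_dtypes(include="number").cumsum()

print(running_score)
print(numeric_running)
```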