
pandas (2) -- Creating and Indexing a DataFrame

Published: 2025/1/21


Creating a DataFrame

1. From a dict of arrays/lists

    import pandas as pd

    data1 = {'a': [1, 2, 3], 'b': [3, 4, 5], 'c': [5, 6, 7]}
    df1 = pd.DataFrame(data1)
    print(df1)

Output

       a  b  c
    0  1  3  5
    1  2  4  6
    2  3  5  7

Adding an index

df1 = pd.DataFrame(data1,index = ['f1','f2','f3'])
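As a minimal sketch of the difference the index argument makes, the snippet below builds the same dict twice, once with the default integer index and once with row labels. The names df_default and df_labeled are illustrative and do not come from the original post.

```python
import pandas as pd

data1 = {'a': [1, 2, 3], 'b': [3, 4, 5], 'c': [5, 6, 7]}

# Without index=, pandas assigns the default integer labels 0, 1, 2
df_default = pd.DataFrame(data1)

# With index=, the three list positions get the given row labels
df_labeled = pd.DataFrame(data1, index=['f1', 'f2', 'f3'])

print(list(df_default.index))     # [0, 1, 2]
print(list(df_labeled.index))     # ['f1', 'f2', 'f3']
print(df_labeled.loc['f2', 'b'])  # 4
```

The dict keys become the column names in both cases; only the row labels change.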

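2. From a dict of Series

    import numpy as np

    # Give each Series explicit labels so that rows can be aligned by label
    data1 = {'one': pd.Series(np.random.rand(2), index=['a', 'b']),
             'two': pd.Series(np.random.rand(3), index=['a', 'b', 'c'])}
    df1 = pd.DataFrame(data1)
    print(df1)

The shorter Series is padded with NaN (not 0)

            one       two
    a  0.065605  0.217466
    b  0.973106  0.908904
    c       NaN  0.663079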
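The sketch below, using fixed values instead of random ones, illustrates how a dict of Series is aligned by label. It also shows a case that is easy to get wrong: a Series that keeps its default integer index shares no labels with ['a', 'b', 'c'], so passing that list as index yields only NaN. The variable names are illustrative.

```python
import pandas as pd

one = pd.Series([0.1, 0.2], index=['a', 'b'])
two = pd.Series([0.3, 0.4, 0.5], index=['a', 'b', 'c'])

# Rows are matched by label; 'one' has no label 'c', so that cell is NaN
aligned = pd.DataFrame({'one': one, 'two': two})
print(aligned)

# A Series with the default 0..n-1 index has none of the labels 'a', 'b', 'c',
# so reindexing to those labels leaves every cell NaN
unlabeled = pd.DataFrame({'one': pd.Series([0.1, 0.2])}, index=['a', 'b', 'c'])
print(unlabeled)
```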

3. Directly from a 2-D array

    ar = np.random.rand(9).reshape(3, 3)
    df1 = pd.DataFrame(ar)
    df2 = pd.DataFrame(ar, index=['a', 'b', 'c'], columns=['one', 'two', 'three'])
    print(df1)
    print(df2)

Output

              0         1         2
    0  0.339401  0.773847  0.253083
    1  0.281513  0.028760  0.751607
    2  0.347467  0.252451  0.689796

            one       two     three
    a  0.339401  0.773847  0.253083
    b  0.281513  0.028760  0.751607
    c  0.347467  0.252451  0.689796
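For a reproducible variant of the same idea, the sketch below replaces the random array with np.arange so the values are predictable. That substitution is an assumption made for illustration only.

```python
import numpy as np
import pandas as pd

# Deterministic 3x3 array: [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
ar = np.arange(9).reshape(3, 3)

# No labels given: rows and columns both get default integer labels
df1 = pd.DataFrame(ar)

# index= labels the rows, columns= labels the columns
df2 = pd.DataFrame(ar, index=['a', 'b', 'c'], columns=['one', 'two', 'three'])

print(list(df1.columns))        # [0, 1, 2]
print(df2.loc['b', 'three'])    # 5
```

The lengths of index and columns have to match the array's shape, here 3 and 3.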

DataFrame indexing and slicing

Column indexing and row indexing
df

                   a          b          c          d
    one    94.473099  30.077407  70.953102   9.416436
    two    41.958628  15.709462  47.400670  56.909647
    three  14.539075   8.398997  80.139084  83.250374

1. Column indexing
Select columns by column name. Selecting a single column returns a Series; selecting several columns returns a DataFrame.

    data1 = df['a']
    data2 = df[['a', 'c']]

data1 = df['a']

    one      94.473099
    two      41.958628
    three    14.539075
    Name: a, dtype: float64
    <class 'pandas.core.series.Series'>

data2 = df[['a','c']]

                   a          c
    one    94.473099  70.953102
    two    41.958628  47.400670
    three  14.539075  80.139084
    <class 'pandas.core.frame.DataFrame'>
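A small sketch of the Series-versus-DataFrame rule, rebuilding df from the values printed above. The one_col_frame case is an addition for illustration: a one-element list still returns a DataFrame, because the return type follows the kind of key, not the number of columns.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [94.473099, 41.958628, 14.539075],
     'b': [30.077407, 15.709462, 8.398997],
     'c': [70.953102, 47.400670, 80.139084],
     'd': [9.416436, 56.909647, 83.250374]},
    index=['one', 'two', 'three'])

single = df['a']            # string key -> Series
several = df[['a', 'c']]    # list key -> DataFrame
one_col_frame = df[['a']]   # one-element list key -> still a DataFrame

print(type(single).__name__)         # Series
print(type(several).__name__)        # DataFrame
print(type(one_col_frame).__name__)  # DataFrame
print(one_col_frame.shape)           # (3, 1)
```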

2. Row indexing
Select rows by index. Selecting a single row returns a Series; selecting several rows returns a DataFrame.
df.loc['one']: index by label
df.iloc[0]: index by position

    data3 = df.loc['one']            # single label
    data4 = df.loc[['one', 'two']]   # list of labels, selected one by one
    data5 = df.loc['one':'three']    # label slice; the end label is included

data3

    a    94.473099
    b    30.077407
    c    70.953102
    d     9.416436
    Name: one, dtype: float64
    <class 'pandas.core.series.Series'>

data4

                 a          b          c          d
    one  94.473099  30.077407  70.953102   9.416436
    two  41.958628  15.709462  47.400670  56.909647
    <class 'pandas.core.frame.DataFrame'>

data5

                   a          b          c          d
    one    94.473099  30.077407  70.953102   9.416436
    two    41.958628  15.709462  47.400670  56.909647
    three  14.539075   8.398997  80.139084  83.250374

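df.iloc[] - row indexing by integer position
- Single position: df.iloc[0], same as df.loc['one']
- Multiple positions: df.iloc[[2,1]], same as df.loc[['three','two']]
- Slice: df.iloc[0:2], same as df.loc['one':'two'] (a position slice excludes its end, a label slice includes it)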
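The correspondence between position-based and label-based row selection can be checked directly. The sketch below rebuilds the three-row df from the values shown earlier and compares each pair with DataFrame.equals; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [94.473099, 41.958628, 14.539075],
     'b': [30.077407, 15.709462, 8.398997],
     'c': [70.953102, 47.400670, 80.139084],
     'd': [9.416436, 56.909647, 83.250374]},
    index=['one', 'two', 'three'])

# Single row: position 0 is the row labelled 'one'
same_single = df.iloc[0].equals(df.loc['one'])

# Several rows: the label form needs a list inside the brackets
same_multi = df.iloc[[2, 1]].equals(df.loc[['three', 'two']])

# Slices: iloc stops before position 2, loc includes the end label 'two'
same_slice = df.iloc[0:2].equals(df.loc['one':'two'])

# A label slice up to 'three' therefore returns all three rows
n_rows = len(df.loc['one':'three'])

print(same_single, same_multi, same_slice, n_rows)  # True True True 3
```

Note that df.loc['three', 'two'] without the inner list would be read as row 'three', column 'two', which does not exist in this frame.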

3. Boolean indexing
df

                   a          b          c          d
    one    52.462365  92.336489  95.512607  85.587735
    two    34.853185  12.887189  69.575950  79.705655
    three  90.755125  98.826032  12.686749  99.404063
    four   75.758254  97.520349  36.782117  18.956917

Boolean matrix indexing
b1 = df < 20 yields a boolean matrix with the same shape as df

               a      b      c      d
    one    False  False  False  False
    two    False   True  False  False
    three  False  False   True  False
    four   False  False  False   True

Indexing with the boolean matrix, df[b1], gives NaN where the matrix is False and the original value where it is True

            a          b          c          d
    one   NaN        NaN        NaN        NaN
    two   NaN  12.887189        NaN        NaN
    three NaN        NaN  12.686749        NaN
    four  NaN        NaN        NaN  18.956917
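Indexing a DataFrame with a boolean DataFrame of the same shape behaves like DataFrame.where. The sketch below rebuilds df from the printed values and checks that equivalence; masked is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [52.462365, 34.853185, 90.755125, 75.758254],
     'b': [92.336489, 12.887189, 98.826032, 97.520349],
     'c': [95.512607, 69.575950, 12.686749, 36.782117],
     'd': [85.587735, 79.705655, 99.404063, 18.956917]},
    index=['one', 'two', 'three', 'four'])

b1 = df < 20          # boolean DataFrame, same shape as df
masked = df[b1]       # keeps values where b1 is True, NaN elsewhere

print(masked.equals(df.where(b1)))       # True
print(int(masked.notna().sum().sum()))   # 3 values are below 20
print(masked.shape == df.shape)          # True: the shape never changes
```

Because the shape is preserved, this form is a mask, not a filter; no rows or columns are dropped.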

Boolean Series / row indexing
b2 = df['a'] > 50 yields a boolean Series

    one       True
    two      False
    three     True
    four      True
    Name: a, dtype: bool
    <class 'pandas.core.series.Series'>

df[b2]

Indexing with a boolean Series returns the rows where the Series is True

                   a          b          c          d
    one    52.462365  92.336489  95.512607  85.587735
    three  90.755125  98.826032  12.686749  99.404063
    four   75.758254  97.520349  36.782117  18.956917
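A short sketch of row filtering with a boolean Series, using the same values as above. The combined condition at the end is an illustrative extension, not something the original post covers; each condition needs its own parentheses because & binds more tightly than the comparisons.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [52.462365, 34.853185, 90.755125, 75.758254],
     'b': [92.336489, 12.887189, 98.826032, 97.520349],
     'c': [95.512607, 69.575950, 12.686749, 36.782117],
     'd': [85.587735, 79.705655, 99.404063, 18.956917]},
    index=['one', 'two', 'three', 'four'])

b2 = df['a'] > 50        # boolean Series labelled by the row index
rows = df[b2]            # keeps only the rows where b2 is True
same = df.loc[b2]        # the explicit .loc form selects the same rows

# Two conditions combined element-wise with &
combined = df[(df['a'] > 50) & (df['c'] < 50)]

print(list(rows.index))      # ['one', 'three', 'four']
print(rows.equals(same))     # True
print(list(combined.index))  # ['three', 'four']
```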

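The same approach cannot be used to test a row. A condition on a row yields a boolean Series labelled by the column names, and those labels cannot be matched against the DataFrame's index ['one','two','three','four'].

pandas reports: Boolean Series key will be reindexed to match DataFrame index.
When the condition is applied through row indexing instead, df keeps the original value where the row condition is true and is NaN everywhere else.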
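The original expression for the row-based form is not shown, so the following is only one possible reading, under stated assumptions. It uses a threshold of 90 (chosen for illustration) on row 'one' and shows two things: selecting columns with .loc[:, condition], and building a full-shape mask with NumPy broadcasting so that DataFrame.where keeps values only in that row.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [52.462365, 34.853185, 90.755125, 75.758254],
     'b': [92.336489, 12.887189, 98.826032, 97.520349],
     'c': [95.512607, 69.575950, 12.686749, 36.782117],
     'd': [85.587735, 79.705655, 99.404063, 18.956917]},
    index=['one', 'two', 'three', 'four'])

# Condition on one row: a boolean Series labelled by COLUMN names
row_cond = df.loc['one'] > 90

# Column labels line up with the column axis, so use them there
cols = df.loc[:, row_cond]

# Full-shape mask: True only in row 'one' and only where row_cond holds
row_is_one = (df.index == 'one')
mask = row_is_one[:, None] & row_cond.to_numpy()[None, :]
kept = df.where(mask)   # original value where mask is True, NaN elsewhere

print(list(cols.columns))              # ['b', 'c']
print(int(kept.notna().sum().sum()))   # 2
```

Passing row_cond directly as df[row_cond] is what triggers the message quoted above, since its labels are column names and not row labels.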

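Multi-column and multi-row indexing

    b3 = df[['a','b']] > 50  # two columns

               a      b
    one     True   True
    two    False  False
    three   True   True
    four    True   True
    <class 'pandas.core.frame.DataFrame'>

df[b3]
Values appear only where the boolean DataFrame is True; everything else, including the columns not covered by b3, is NaN. Indexing with several rows has the same effect.

                   a          b   c   d
    one    52.462365  92.336489 NaN NaN
    two          NaN        NaN NaN NaN
    three  90.755125  98.826032 NaN NaN
    four   75.758254  97.520349 NaN NaN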
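The sketch below reproduces the two-column mask and, as an illustrative alternative not taken from the original post, reduces the same condition to one boolean per row with all(axis=1). That second form filters whole rows and keeps every column intact.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [52.462365, 34.853185, 90.755125, 75.758254],
     'b': [92.336489, 12.887189, 98.826032, 97.520349],
     'c': [95.512607, 69.575950, 12.686749, 36.782117],
     'd': [85.587735, 79.705655, 99.404063, 18.956917]},
    index=['one', 'two', 'three', 'four'])

b3 = df[['a', 'b']] > 50   # boolean DataFrame covering columns a and b only
masked = df[b3]            # columns missing from b3 are treated as False

# One boolean per row: True when both a and b exceed 50
both = b3.all(axis=1)
filtered = df[both]        # whole rows, all four columns preserved

print(int(masked.notna().sum().sum()))   # 6
print(masked[['c', 'd']].isna().all().all())  # True
print(list(filtered.index))              # ['one', 'three', 'four']
```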

Combined indexing: for example, selecting rows and columns at the same time
df['a'].loc[['one','three']]

    one      52.462365
    three    90.755125
    Name: a, dtype: float64

    print(df[['b','c','d']].iloc[::2])  # rows one and three of columns b, c, d

                   b          c          d
    one    92.336489  95.512607  85.587735
    three  98.826032  12.686749  99.404063
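The chained selections above can also be written as a single .loc or .iloc call that takes the row selector and the column selector together. The sketch below checks that both spellings return the same data; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [52.462365, 34.853185, 90.755125, 75.758254],
     'b': [92.336489, 12.887189, 98.826032, 97.520349],
     'c': [95.512607, 69.575950, 12.686749, 36.782117],
     'd': [85.587735, 79.705655, 99.404063, 18.956917]},
    index=['one', 'two', 'three', 'four'])

# Chained form: column first, then rows
chained_series = df['a'].loc[['one', 'three']]
# Single call: rows and column together
direct_series = df.loc[['one', 'three'], 'a']

# Chained form: columns first, then every second row by position
chained_frame = df[['b', 'c', 'd']].iloc[::2]
# Single call by label
direct_frame = df.loc[['one', 'three'], ['b', 'c', 'd']]
# Single call by position (columns b, c, d are positions 1 to 3)
positional_frame = df.iloc[::2, 1:4]

print(chained_series.equals(direct_series))   # True
print(chained_frame.equals(direct_frame))     # True
print(chained_frame.equals(positional_frame)) # True
```

For reading data the two forms are interchangeable. For assignment, the single-call form is generally the safer choice, because a chained selection may operate on a temporary copy.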
