
Renaming Excel columns with pandas, P9: working with Excel files in Python pandas

Published: 2023/12/15


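The pandas library in Python has two main data structures: Series and DataFrame.

A DataFrame is a two-dimensional table, similar to an Excel sheet.

A Series is a one-dimensional array built on NumPy's ndarray. By default pandas uses 0 to n-1 as the index of a Series, but you can also supply your own index (think of the index as the keys of a dict).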
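As a minimal sketch of the two structures described above, the snippet below builds a Series with the default index, a Series with a custom index, and a small DataFrame. The variable names and sample values are illustrative and are not taken from the article's data.

```python
import pandas as pd

# Without an explicit index, pandas labels the values 0..n-1.
s_default = pd.Series([100, 200, 300])
print(list(s_default.index))       # [0, 1, 2]

# With an explicit index, values are looked up by label, much like dict keys.
s_labeled = pd.Series([100, 200, 300], index=["x", "y", "z"])
print(s_labeled["y"])              # 200

# A DataFrame is a 2-D table; selecting one column gives back a Series.
df = pd.DataFrame({"id": [1, 2, 3], "score": [100, 200, 300]})
print(type(df["score"]).__name__)  # Series
print(df.shape)                    # (3, 2)
```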

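I. Install pandas and xlrd first

pip3 install pandas

pip3 install xlrd

Note: xlrd 2.0 and later reads only .xls files. For the .xlsx files used below, pandas relies on openpyxl, which can be installed with pip3 install openpyxl.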

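II. The pandas DataFrame() function

1. Creating a new Excel file directly from a pandas DataFrame

# -*- coding: utf-8 -*-
import pandas as pd
df = pd.DataFrame({'id':[1,2,3],'name':['劉備','關羽','張飛']})
df.to_excel("D:/用python寫代碼/output.xlsx")
print("Done!")

Result: output.xlsx contains the id and name columns, preceded by an unnamed column holding the default index 0 to 2.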

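2. Using id as the index when creating the Excel file

# -*- coding: utf-8 -*-
import pandas as pd
df = pd.DataFrame({'id':[1,2,3],'name':['劉備','關羽','張飛']})
df = df.set_index('id')
df.to_excel("D:/用python寫代碼/output1.xlsx")
print("Done!")

Result: id is written as the index column, so output1.xlsx has no separate 0-to-2 index column.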
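To see what set_index changes without opening Excel, the following sketch inspects the DataFrame in memory. It reuses the article's sample values; reset_index is shown only as the reverse operation and is not part of the original walkthrough.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "name": ["劉備", "關羽", "張飛"]})

indexed = df.set_index("id")      # returns a new DataFrame; df itself is unchanged
print(list(df.columns))           # ['id', 'name']
print(list(indexed.columns))      # ['name']
print(indexed.index.name)         # id
print(indexed.loc[2, "name"])     # 關羽  (rows are now addressed by id)

restored = indexed.reset_index()  # moves the index back into a regular column
print(list(restored.columns))     # ['id', 'name']
```

If the goal is only to keep the default index out of the file, to_excel also accepts index=False.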

3. Reading an Excel file

# -*- coding: utf-8 -*-
import pandas as pd
renyuan = pd.read_excel('D:/用python寫代碼/test1/Day1-7/ryb.xlsx')

print(renyuan.shape)  # number of rows, number of columns

Result: (20, 7)

print(renyuan.columns)  # column names

Result: Index(['id', 'name', 'sex', 'sfz', 'birth_date', 'zhicheng', 'zcsj'], dtype='object')

print(renyuan.head(2))  # first two rows

Result:

   id name sex                 sfz  birth_date zhicheng     zcsj
0   1   劉備   男  58829919701006****     1970.10    高級工程師  2008.08
1   2   曹操   男  21079819880119****     1988.01      工程師  2015.12

print(renyuan.tail(1))  # last row

Result:

    id name sex                 sfz  birth_date zhicheng     zcsj
19  20   貂蟬   女  11826119901008****      1990.1      工程師  2009.11

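Setting header=1 makes pandas use the second row as the header:

# -*- coding: utf-8 -*-
import pandas as pd
renyuan = pd.read_excel('D:/用python寫代碼/test1/Day1-7/ryb.xlsx',header=1)
print(renyuan.columns)  # column names

Result:

Index(['001', '劉備', '男', '58829919701006****', '1970.10', '高級工程師', '2008.08'], dtype='object')

Notes:

1) In read_excel, header defaults to 0 (a row index); change it as needed.

2) With the default header, if the leading rows are blank, print(xx.columns) shows that pandas skips them and uses the first non-blank row as the header.

3) renyuan = pd.read_excel("D:/用python寫代碼/test1/Day1-7/ryb.xlsx",skiprows=3,usecols="C:F"): here skiprows=3 skips the first three (blank) rows, and usecols="C:F" reads only columns C through F, that is, it skips columns A and B.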
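The parameters in the notes above can be tried on a small throwaway file. This is a sketch under stated assumptions: the sheet layout (one title row above the real header), the file name demo.xlsx, and the variable names are hypothetical, and writing .xlsx assumes openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sheet: a title row sits above the real header row.
raw = pd.DataFrame([
    ["staff list", None, None],
    ["id", "name", "sex"],
    [1, "劉備", "男"],
    [2, "曹操", "男"],
])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.xlsx")
    raw.to_excel(path, header=False, index=False)  # write the cells exactly as laid out

    by_header = pd.read_excel(path, header=1)               # second row is the header
    by_skip = pd.read_excel(path, skiprows=1)               # same effect: drop one row first
    subset = pd.read_excel(path, skiprows=1, usecols="B:C") # keep only Excel columns B to C

print(list(by_header.columns))  # ['id', 'name', 'sex']
print(list(by_skip.columns))    # ['id', 'name', 'sex']
print(list(subset.columns))     # ['name', 'sex']
```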

4. Adding a header row to a table that has none

# -*- coding: utf-8 -*-
import pandas as pd
renyuan = pd.read_excel('D:/用python寫代碼/test1/Day1-7/ryb.xlsx',header=None)
renyuan.columns = ['id','name','sex','sfz','birth_date','zhicheng','zcsj']
print(renyuan.columns)  # column names
renyuan.to_excel("D:/用python寫代碼/test1/Day1-7/ryb.xlsx")
print("Done!")

Result:

Index(['id', 'name', 'sex', 'sfz', 'birth_date', 'zhicheng', 'zcsj'], dtype='object')

Done!
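Assigning to columns replaces every name at once. When only some names need to change, DataFrame.rename with a mapping is an alternative. The sketch below shows both on in-memory data; the sample rows and the new name gender are illustrative and do not come from the article.

```python
import pandas as pd

# Hypothetical headerless data: the columns are simply numbered 0, 1, 2.
df = pd.DataFrame([[1, "劉備", "男"], [2, "曹操", "男"]])

# Option 1: replace every column name at once (the list length must match).
df.columns = ["id", "name", "sex"]

# Option 2: rename selected columns with an old-name -> new-name mapping.
renamed = df.rename(columns={"sex": "gender"})

print(list(df.columns))       # ['id', 'name', 'sex']
print(list(renamed.columns))  # ['id', 'name', 'gender']
```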

The id column can also be made the index, so that row numbering starts at 1. The refined version:

# -*- coding: utf-8 -*-
import pandas as pd
renyuan = pd.read_excel('D:/用python寫代碼/test1/Day1-7/ryb.xlsx',header=None)
renyuan.columns = ['id','name','sex','sfz','birth_date','zhicheng','zcsj']
renyuan.set_index('id',inplace=True)
print(renyuan.columns)  # column names
renyuan.to_excel("D:/用python寫代碼/test1/Day1-7/ryb.xlsx")
print("Done!")

Result:

Index(['name', 'sex', 'sfz', 'birth_date', 'zhicheng', 'zcsj'], dtype='object')

Done!

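Notes:

1) After this change, print(renyuan.columns) no longer lists id, because id is now the index rather than a column. When the file is read back, however, id is treated as an ordinary column again and a new default index is added automatically.

2) Many pandas functions take an inplace parameter, which controls whether the original object is modified. With inplace=True no new object is created and the original is modified directly. With inplace=False the modification is applied to a new object, which is returned. The default is False.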
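The difference between the two inplace settings can be checked directly. This minimal sketch uses set_index on a two-row DataFrame; the sample data is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["劉備", "曹操"]})

# inplace=False (the default): the original is untouched, a new object is returned.
new_df = df.set_index("id")
print(df.index.name)      # None
print(new_df.index.name)  # id

# inplace=True: the original is modified and the call itself returns None.
result = df.set_index("id", inplace=True)
print(result)             # None
print(df.index.name)      # id
```

Because the in-place call returns None, writing df = df.set_index("id", inplace=True) would discard the DataFrame.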

Reading the file back shows id as an ordinary column again:

# -*- coding: utf-8 -*-
import pandas as pd
renyuan = pd.read_excel('D:/用python寫代碼/test1/Day1-7/ryb.xlsx')
print(renyuan.columns)  # column names
print(renyuan.head(2))

Result:

Index(['id', 'name', 'sex', 'sfz', 'birth_date', 'zhicheng', 'zcsj'], dtype='object')

   id name sex                 sfz  birth_date zhicheng     zcsj
0   1   劉備   男  58829919701006****     1970.10    高級工程師  2008.08
1   2   曹操   男  21079819880119****     1988.01      工程師  2015.12
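To get id back as the index when reading, read_excel accepts index_col, which the article itself uses in a later example. The round trip below is a sketch on a temporary file; the file name demo.xlsx and the two sample rows are hypothetical, and writing .xlsx assumes openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["劉備", "曹操"]}).set_index("id")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.xlsx")
    df.to_excel(path)  # the index is written as the first column, headed "id"

    plain = pd.read_excel(path)                     # id comes back as a regular column
    restored = pd.read_excel(path, index_col="id")  # id comes back as the index

print(list(plain.columns))     # ['id', 'name']
print(list(plain.index))       # [0, 1]
print(list(restored.columns))  # ['name']
print(restored.index.name)     # id
```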

III. The pandas Series() function

1. Creating a Series from lists

# -*- coding: utf-8 -*-
import pandas as pd
l1 = [100,200,300]
l2 = ['x','y','z']
s1 = pd.Series(l1,index=l2)

Or, equivalently:

s1 = pd.Series([100,200,300],index=['x','y','z'])
print(s1)

Result:

x    100
y    200
z    300
dtype: int64

2. Converting Series into a DataFrame

# -*- coding: utf-8 -*-
import pandas as pd
s1 = pd.Series([1,2,3],index=[1,2,3],name='A')
s2 = pd.Series([10,20,30],index=[1,2,3],name='B')
s3 = pd.Series([100,200,300],index=[1,2,3],name='C')
df = pd.DataFrame({s1.name:s1,s2.name:s2,s3.name:s3})
print(df)

Result:

   A   B    C
1  1  10  100
2  2  20  200
3  3  30  300

Building the DataFrame from a list of the Series, for example pd.DataFrame([s1,s2,s3]), places each Series in a row instead:

Result:

     1    2    3
A    1    2    3
B   10   20   30
C  100  200  300
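The two layouts above, and the way pandas lines Series up by index label, can be sketched as follows. The third Series (with index 2 to 4) is a hypothetical addition used only to show alignment; it is not part of the article's example.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=[1, 2, 3], name="A")
s2 = pd.Series([10, 20, 30], index=[1, 2, 3], name="B")

by_column = pd.DataFrame({s1.name: s1, s2.name: s2})  # each Series becomes a column
by_row = pd.DataFrame([s1, s2])                       # each Series becomes a row

print(by_column.shape)      # (3, 2)
print(by_row.shape)         # (2, 3)
print(list(by_row.index))   # ['A', 'B']
print(by_row.loc["B", 2])   # 20

# Series are aligned by index label, not by position: labels missing
# from one Series are filled with NaN in the combined table.
s_shifted = pd.Series([200, 300, 400], index=[2, 3, 4], name="C")
aligned = pd.DataFrame({"A": s1, "C": s_shifted})
print(list(aligned.index))           # [1, 2, 3, 4]
print(int(aligned["C"].isna().sum()))  # 1
```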

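IV. Examples

1) String concatenation and conversion

Read the Excel file and convert the str field "birth_date" (for example 1988.08) into the date format yyyy-mm-dd. Because birth_date holds only the year and month, the day is taken from characters 12 to 13 of the ID number (sfz).

The source table is ryb.xlsx, with the columns id, name, sex, sfz, birth_date, zhicheng and zcsj.

Code 1:

import pandas as pd
df = pd.read_excel("D:/用python寫代碼/test1/Day1-7/ryb.xlsx",index_col='id',dtype={'birth_date':str,'sfz':str})
birth1 = df['birth_date'].str[:4]   # year
birth2 = df['birth_date'].str[-2:]  # month
sfz = df['sfz'].str[12:14]          # day, from the ID number
print(birth1+"-"+birth2+"-"+sfz)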
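The concatenation above yields strings that merely look like dates. One way to turn them into real datetime values is pd.to_datetime, sketched here on two in-memory rows copied from the article's sample output. Using to_datetime is a suggested extra step, not something the original code does.

```python
import pandas as pd

# Two sample rows in the same shape as the article's sheet.
df = pd.DataFrame({
    "birth_date": ["1970.10", "1988.01"],
    "sfz": ["58829919701006****", "21079819880119****"],
})

year = df["birth_date"].str[:4]
month = df["birth_date"].str[-2:]
day = df["sfz"].str[12:14]  # day-of-month digits of the ID number

as_text = year + "-" + month + "-" + day
as_date = pd.to_datetime(as_text, format="%Y-%m-%d")  # parse into datetime64 values

print(as_text.tolist())        # ['1970-10-06', '1988-01-19']
print(as_date.dt.year.tolist())  # [1970, 1988]
print(as_date.dt.day.tolist())   # [6, 19]
```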

V. Supplement

1. The time module: getting the current date

import time

p_time = time.strftime('%Y-%m-%d',time.localtime(time.time()))
print(p_time)

Result:

2020-09-01
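For comparison, the same yyyy-mm-dd string can be produced with the datetime module. This is an alternative sketch using only the standard library; the variable names are illustrative.

```python
import datetime
import time

# time.localtime() with no argument already means "now".
via_time = time.strftime("%Y-%m-%d", time.localtime())

# date.today().isoformat() also yields YYYY-MM-DD.
via_date = datetime.date.today().isoformat()

print(via_time)
print(via_date)
```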

2. The datetime module: strptime versus strftime

strptime: the p stands for parse. Given a time string and a format pattern, strptime returns a time object.

strftime: the f stands for format. It is the reverse of strptime: given a time object and an output format, it returns a time string.

Converting a str into a date:

import time
import datetime
p_time = time.strftime('%Y-%m-%d',time.localtime(time.time()))
pp_time = datetime.datetime.strptime(p_time,'%Y-%m-%d').date()
print(pp_time)

Result: 2020-09-01

Example: computing the actual age from a str-typed birth date

import pandas as pd
import time
import datetime
df = pd.read_excel("D:/用python寫代碼/test1/Day1-7/ryb.xlsx",index_col='id',dtype={'birth_date':str,'sfz':str})
birth1 = df['birth_date'].str[:4]
birth2 = df['birth_date'].str[-2:]
sfz = df['sfz'].str[12:14]
bir = (birth1+"-"+birth2+"-"+sfz).str[0:11]
p_time = time.strftime('%Y-%m-%d',time.localtime(time.time()))  # current date as a string
p2_time = datetime.datetime.strptime(p_time,'%Y-%m-%d').date()  # convert p_time into a date object
for i in range(1,20):  # ids 1 to 19, looked up by index label
    p1_time = datetime.datetime.strptime(bir[i],'%Y-%m-%d').date()
    p = (p2_time-p1_time).days
    print(round(p/365,1))  # approximate age in years

Result:

50.0
32.7
25.4
29.8
38.4
41.0
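Dividing the day count by 365 ignores leap years, so it gives only an approximate age. As a sketch of an alternative, the hypothetical helper age_in_years below counts whole calendar years instead. The reference date is fixed at 2020-09-01 (the date shown in the article's output) so the result is repeatable.

```python
import datetime


def age_in_years(birth: datetime.date, today: datetime.date) -> int:
    """Whole years elapsed between birth and today."""
    # Subtract one year if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (birth.month, birth.day)
    return today.year - birth.year - int(before_birthday)


birth = datetime.datetime.strptime("1970-10-06", "%Y-%m-%d").date()  # str -> date
today = datetime.date(2020, 9, 1)  # fixed reference date for a repeatable result

print(age_in_years(birth, today))            # 49
print(round((today - birth).days / 365, 1))  # 49.9 (the article's approximation)
print(today.strftime("%Y-%m-%d"))            # 2020-09-01 (date -> str)
```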
