python matplotlib, plt.pie, plt.bar (analysis of the bike data)

Published: 2024/3/12, category: python

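A note up front: this is a courseware notebook from an in-person 七月在线 course. If it infringes on anything, contact me and I will take it down; if you want to learn the material, you can buy the course through the link. The runtime environment is a notebook. I put it on the blog so that it is easy to read on a phone.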

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from datetime import datetime
from pandas import Series, DataFrame

get_ipython().run_line_magic('matplotlib', 'inline')

In[5]:

Data source: https://s3.amazonaws.com/tripdata/index.html

To keep things short, only the most recent 6 months of data are processed.

# DataFrame.append was removed in pandas 2.0, so the monthly files are stacked with pd.concat
months = ['201704', '201705', '201706', '201707', '201708', '201709']
bike_df = pd.concat(
    [pd.read_csv('data/citibike/JC-%s-citibike-tripdata.csv' % m) for m in months],
    ignore_index=True)  # renumber rows instead of repeating each file's own index
print(bike_df.shape)
bike_df.head()
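
The stacking step above can be sketched on its own with `pd.concat`, which is the supported way to combine frames now that `DataFrame.append` is gone from pandas 2.0. This is only a minimal sketch: the two in-memory CSV strings and their values are made up for illustration and stand in for the real monthly files.

```python
import io

import pandas as pd

# Made-up stand-ins for two monthly CSV files (not real Citi Bike data).
csv_april = "tripduration,bikeid\n300,101\n600,102\n"
csv_may = "tripduration,bikeid\n900,101\n"

# Read each "file" into its own DataFrame, then stack them row-wise.
frames = [pd.read_csv(io.StringIO(text)) for text in (csv_april, csv_may)]
combined = pd.concat(frames, ignore_index=True)  # ignore_index renumbers rows 0..n-1

print(combined.shape)
print(combined)
```

Building the list of frames first and calling `pd.concat` once also avoids copying the growing table on every iteration.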

In[3]:

print('Stations: %d' % len(bike_df['start station name'].unique()))
print('Bikes ridden: %d' % len(bike_df['bikeid'].unique()))
print('Trips: %d' % bike_df.shape[0])
print('Average trip duration (minutes): %.2f' % float(bike_df['tripduration'].sum() / bike_df['bikeid'].count() / 60))
print('Trips per bike: %.2f' % float(bike_df['bikeid'].count() / len(bike_df['bikeid'].unique())))
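
To make the five summary figures concrete, here is a small sketch on a hand-made table. The column names follow the ones used above; the rows themselves are invented so the arithmetic can be checked by eye. `nunique()` is used as a shorthand for `len(...unique())`.

```python
import pandas as pd

# Invented miniature of the trip table: six trips made on three bikes.
trips = pd.DataFrame({
    "bikeid": [101, 101, 102, 103, 103, 103],
    "start station name": ["A", "B", "A", "C", "A", "B"],
    "tripduration": [300, 600, 900, 1200, 300, 300],  # seconds
})

n_stations = trips["start station name"].nunique()   # distinct start stations
n_bikes = trips["bikeid"].nunique()                   # distinct bikes that were ridden
n_trips = len(trips)                                  # one row per trip
avg_minutes = trips["tripduration"].mean() / 60       # average trip length in minutes
trips_per_bike = n_trips / n_bikes                    # how often each bike was rented

print(n_stations, n_bikes, n_trips, avg_minutes, trips_per_bike)
```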

In[4]:

Monthly statistics

bike_df['starttime'] = pd.to_datetime(bike_df['starttime'])
bike_df = bike_df.set_index('starttime')  # convert to a time series
print(bike_df.head())
# 'M' means month-end bins; newer pandas releases spell this alias 'ME'
bike_df_by_month = bike_df.resample('M').apply(len)
bike_df_by_month = bike_df_by_month['bikeid']
bike_df_by_month
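
The monthly count can also be sketched without `resample`, by grouping on the calendar month of each timestamp. The timestamps below are invented, and `to_period('M')` is swapped in here as an alternative to the resampling call above, not as the method the course used.

```python
import pandas as pd

# Invented start times spread over three months.
starts = pd.to_datetime([
    "2017-04-03 08:15", "2017-04-20 17:40",
    "2017-05-01 09:00", "2017-05-15 12:30", "2017-05-31 23:59",
    "2017-06-10 07:05",
])
toy = pd.DataFrame({"bikeid": [1, 2, 1, 3, 2, 1]}, index=starts)

# Map every timestamp to its month, then count the rows that fall in each month.
monthly = toy.groupby(toy.index.to_period("M")).size()
print(monthly)
```

Unlike `resample`, this grouping only lists months that actually contain trips, so a month with no data would be missing rather than shown as zero.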

In[5]:

Plot the number of trips per month

plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
plt.rc('font', family='SimHei', size=15)
plt.plot(bike_df_by_month, 'r8', bike_df_by_month, 'g-', linewidth=1, markeredgewidth=5, alpha=0.8)
plt.xlabel('Month')
plt.ylabel('Number of trips')
plt.title('Citi Bike trips per month over the last six months')
plt.show()

In[6]:

Distribution by quarter

# 'Q' means quarter-end bins; newer pandas releases spell this alias 'QE'
bike_df_by_quarter = bike_df.resample('Q').apply(len)
bike_df_by_quarter = bike_df_by_quarter['bikeid']
print(bike_df_by_quarter)
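
Since April to September covers exactly two calendar quarters, the quarterly totals are just the monthly totals added up three at a time. A minimal sketch with made-up monthly counts (the numbers are not from the real data):

```python
import pandas as pd

# Made-up monthly trip counts for April to September 2017.
monthly = pd.Series(
    [10, 20, 30, 40, 50, 60],
    index=pd.period_range("2017-04", periods=6, freq="M"),
)

# Group the months by the quarter number they belong to and add them up.
# This is enough for a single year; across years the year would have to be part of the key.
quarterly = monthly.groupby(monthly.index.quarter).sum()
print(quarterly)
```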

In[7]:

# Bar chart of the number of trips per quarter
plt.bar([10, 15], bike_df_by_quarter, alpha=0.8, align='center', edgecolor='white')
plt.xlabel('Quarter')
plt.ylabel('Number of trips')
plt.title('Citi Bike trips per quarter over the last 2 quarters')
plt.legend(['Trips'], loc='upper right')
plt.grid(color='#95a5a6', linestyle='--', linewidth=1, axis='y', alpha=0.4)
plt.xlim(5, 20)
plt.ylim(60000, 120000)
plt.xticks([10, 15], ('Q2', 'Q3'))
plt.show()
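
The same bar chart can be sketched with Matplotlib's object-oriented interface, deriving the bar positions from the data instead of hard-coding 10 and 15. The counts below are made up, and the `Agg` backend line is only there so the sketch runs without a display; in a notebook it would be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove this line inside a notebook
import matplotlib.pyplot as plt

quarter_labels = ["Q2", "Q3"]
trip_counts = [90000, 110000]            # made-up values, not the real totals
positions = list(range(len(trip_counts)))

fig, ax = plt.subplots()
bars = ax.bar(positions, trip_counts, alpha=0.8, align="center", edgecolor="white")
ax.set_xticks(positions)                 # one tick under each bar
ax.set_xticklabels(quarter_labels)
ax.set_ylim(60000, 120000)               # truncated axis, as in the cell above
ax.set_xlabel("Quarter")
ax.set_ylabel("Number of trips")
ax.grid(color="#95a5a6", linestyle="--", linewidth=1, axis="y", alpha=0.4)

heights = [bar.get_height() for bar in bars]
print(heights)
```

Note that starting the y-axis at 60000 exaggerates the visual gap between the two bars; that is a property of the chosen limits, not of the data.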

In[8]:

Gender distribution

bike_df_by_gender = bike_df.groupby('gender')['bikeid'].agg(len) / bike_df['bikeid'].count() * 100
print(bike_df_by_gender)

plt.pie(bike_df_by_gender, labels=['Unknown', 'Male', 'Female'], colors=['red', 'blue', 'green'], explode=(0, 0, 0), startangle=60, autopct='%1.1f%%')
plt.title('Citi Bike users by gender')
plt.legend(['Unknown', 'Male', 'Female'], loc='upper left')
plt.show()
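
The percentages fed to `plt.pie` can be sketched with `value_counts(normalize=True)`. The ten gender codes below are invented, and the mapping 0 = unknown, 1 = male, 2 = female is an assumption inferred from the order of the pie labels above.

```python
import pandas as pd

# Invented gender codes for ten trips.
gender = pd.Series([1, 1, 2, 0, 1, 2, 1, 1, 2, 1])
labels = {0: "unknown", 1: "male", 2: "female"}  # assumed meaning of the codes

# Share of trips per code, in percent, ordered by code so it lines up with the labels.
share = gender.value_counts(normalize=True).sort_index() * 100
share.index = share.index.map(labels)
print(share)
```

Attaching the labels to the index keeps each percentage tied to its category, which guards against a mismatch if one code happens to be absent from the data.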

In[9]:

Age distribution

# Ages over 100 are implausible, but they are kept for now; the observed range is (15, 129).
# The reference year is 2016 although the files cover 2017, so ages come out about a year low.
bike_df['age'] = 2016 - bike_df['birth year']
bins = [0, 18, 30, 50, 131]
age_group = ['Teen', 'Young adult', 'Middle-aged', 'Senior']
bike_df['age_group'] = pd.cut(bike_df['age'], bins, labels=age_group)
bike_df_by_ag = bike_df.groupby('age_group', observed=False).size()  # trips per age group

mean_ages = bike_df.groupby('age_group', observed=False)['age'].mean()

print(bike_df_by_ag)

plt.bar([1, 2, 3, 4], bike_df_by_ag, color='red', alpha=0.8, align='center', edgecolor='white')
plt.xlabel('Age group')
plt.ylabel('Number of trips')
plt.title('Citi Bike users by age group')
plt.legend(['Trips'], loc='upper right')
plt.grid(color='green', linestyle='--', linewidth=1, axis='y', alpha=0.4)
plt.xticks([1, 2, 3, 4], ('Teen', 'Young adult', 'Middle-aged', 'Senior'))
plt.show()
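
How `pd.cut` assigns ages to the four groups is easiest to see at the bin edges. The sketch below uses the same bins as above on a handful of invented ages; by default each bin includes its right edge, so 18 still counts as the first group and 19 starts the second.

```python
import pandas as pd

# Invented ages chosen to sit on both sides of every bin edge.
ages = pd.Series([15, 18, 19, 30, 31, 50, 51, 129])
bins = [0, 18, 30, 50, 131]
labels = ["teen", "young adult", "middle-aged", "senior"]

# Intervals are (0, 18], (18, 30], (30, 50], (50, 131].
groups = pd.cut(ages, bins, labels=labels)
counts = groups.value_counts(sort=False)

print(groups.tolist())
print(counts)
```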

In[10]:

Average speed calculation

# Function that computes distance from latitude and longitude; versions of it are easy to find online.
def haversine(lon1, lat1, lon2, lat2):  # longitude 1, latitude 1, longitude 2, latitude 2 (decimal degrees)
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians (vectorized, one value per trip)
    lon1 = np.radians(np.asarray(lon1, dtype=float))
    lat1 = np.radians(np.asarray(lat1, dtype=float))
    lon2 = np.radians(np.asarray(lon2, dtype=float))
    lat2 = np.radians(np.asarray(lat2, dtype=float))
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = (np.sin(dlat / 2) ** 2) + np.cos(lat1) * np.cos(lat2) * (np.sin(dlon / 2) ** 2)
    c = 2 * np.arcsin(np.sqrt(a))
    r = 6371  # mean radius of the Earth, in kilometers
    return c * r * 1000  # distance in meters

start_lon = bike_df['start station longitude']
start_lat = bike_df['start station latitude']
end_lon = bike_df['end station longitude']
end_lat = bike_df['end station latitude']
bike_df['meter'] = haversine(start_lon, start_lat, end_lon, end_lat)
bike_df['duration_hour'] = bike_df['tripduration'] / 3600.0  # convert to hours
bike_df['speed'] = bike_df['meter'] / 1000.0 / bike_df['duration_hour']
total_km = bike_df['meter'].sum() / 1000.0
total_hour = bike_df['tripduration'].sum() / 3600.0
speed = total_km / total_hour
print('%.2f' % speed)
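
Two things about this calculation are worth checking on small numbers: that the haversine function returns sensible distances, and how the overall speed (total distance over total time) differs from the average of per-trip speeds. The sketch below is a compact rewrite for illustration only; `haversine_m`, the coordinates, and the two trips are all invented here.

```python
import numpy as np

def haversine_m(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance in meters between points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lon1, lat1, lon2, lat2)
    )
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * 1000 * np.arcsin(np.sqrt(a))

# Sanity checks: one degree of latitude is roughly 111 km, and a trip that
# ends where it started has zero straight-line distance.
one_degree = haversine_m(0.0, 0.0, 0.0, 1.0)
round_trip = haversine_m(-74.04, 40.72, -74.04, 40.72)  # illustrative coordinates

# Two invented trips: distance in km and duration in hours.
km = np.array([2.0, 3.0])
hours = np.array([0.1, 0.5])
mean_of_speeds = (km / hours).mean()    # every trip counts equally
overall_speed = km.sum() / hours.sum()  # long trips weigh more, as in the cell above

print(one_degree, round_trip, mean_of_speeds, overall_speed)
```

Because the distance is measured in a straight line between stations, the resulting speed is a lower bound on how fast people actually ride, and round trips pull it down further since they add time but no distance.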
