Scraping Shanghai High School Entrance Exam (Zhongkao) Cutoff Scores and Visualizing Them with plotly
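The Zhongkao is coming up, so to ride the wave of interest I wrote a scraper that compares Shanghai's Zhongkao cutoff scores over recent years. Schools within each district are compared with bar charts, and each school's cutoff score over the years is shown as a line chart.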
Data source: the WeChat mini program 升學查分

Data fetching code
#----------------- Imports -----------------
import requests
import pandas as pd
from urllib.parse import quote

#----------------- Constants -----------------
dict = {}
df = pd.DataFrame()
res = requests.session()
token = ''  # obtain this yourself by capturing the mini program's traffic
h = {
    "API-CITY": quote('上海市'),
    "API-TOKEN": token,
    "Accept-Encoding": "gzip,compress,br,deflate",
    "Connection": "keep-alive",
    "Host": "xiaokedou.xkd100.com",
    "Referer": "https://servicewechat.com/wxd588a54f779b2090/43/page-frame.html",
    "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 14_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 MicroMessenger/8.0.6(0x18000632) NetType/WIFI Language/zh_CN",
    "content-type": "application/json",
}

#----------------- Functions -----------------

def getcode(year, vol, area):
    # Fetch the cutoff scores for one year, volunteer type and district
    url = 'https://xiaokedou.xkd100.com/api/mid/search?year=' + year + '&volunteer=' + vol + '&area=' + area
    s = res.get(url, headers=h).json()['data']['list']
    dd = {}
    for i in s:
        sname = i['school_name']
        scode = i['recruit_code']
        sline = i['score_line']
        stype = i['volunteer_type']
        dd.update({sname: {'stype': stype, 'scode': scode, 'sline': sline}})
    return dd


def getlist():
    # Fetch the lists of years, volunteer types and districts, then build the nested data
    url = 'https://xiaokedou.xkd100.com/api/mid/where'
    s = res.get(url, headers=h).json()['data']['where']
    years = s['years']
    volunteers = s['volunteers']
    areas = s['areas']
    for area in areas:
        dic = {}
        for vol in volunteers:
            ys = {}
            for year in years:
                x = getcode(str(year), vol, area)
                ys.update({year: x})
            dic.update({vol: ys})
        dict.update({area: dic})
        print(area)


if __name__ == '__main__':
    getlist()
    pt = df.from_dict(dict)  # convert the dict to a DataFrame
    pt.to_json('data.json')  # save it as a JSON file

Data visualization code
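The scraper nests its results as district, then volunteer type, then year, then school. As a rough illustration of that layout, here is a minimal sketch that flattens such a structure into plain rows and pulls out one school's score history, which is the kind of input a line chart needs. The sample values, the `flatten` helper and the assumption that `sline` arrives as a numeric string are all made up for illustration; they are not part of the original scripts.

```python
# Hypothetical sample mirroring the nesting built in getlist():
# district -> volunteer type -> year -> school -> fields.
# All names and scores below are invented for illustration.
sample = {
    "District A": {
        "Type 1": {
            "2019": {
                "School X": {"stype": "Type 1", "scode": "001", "sline": "600.5"},
            },
            "2020": {
                "School X": {"stype": "Type 1", "scode": "001", "sline": "610"},
                "School Y": {"stype": "Type 1", "scode": "002", "sline": "590"},
            },
        }
    }
}


def flatten(nested):
    """Turn the nested dict into a list of flat row dicts (hypothetical helper)."""
    rows = []
    for area, by_vol in nested.items():
        for vol, by_year in by_vol.items():
            for year, schools in by_year.items():
                for school, info in schools.items():
                    rows.append({
                        "area": area,
                        "volunteer": vol,
                        "year": str(year),
                        "school": school,
                        # Assumes sline can be parsed as a number
                        "score": float(info["sline"]),
                    })
    return rows


rows = flatten(sample)

# Score history of one school, ordered by year
history = sorted((r["year"], r["score"]) for r in rows if r["school"] == "School X")
print(history)
```

A flat list like this is easier to filter by district or school than the nested dict, at the cost of repeating the district and volunteer type on every row.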
#----------------- Imports -----------------
import os
import pandas as pd
import plotly.graph_objects as go
import plotly.io as po

#----------------- Constants -----------------
df = pd.DataFrame()
x = pd.read_json('data.json')

#----------------- Functions -----------------
def zhu(year, zhiyuan, area, dd, dic):
    # Draw a bar chart from the data and save it as a jpg image
    line = go.Bar(x=dd, y=dic)
    layout = go.Layout(title=year + area + zhiyuan)
    fg = go.Figure(line, layout)
    path = 'tmp/' + area + '/' + zhiyuan + '/'
    os.makedirs(path, exist_ok=True)
    po.write_image(fg, path + year + '.jpg', width=1920, height=1080)


def li(school, zhiyuan, area, dd, dic):
    # Draw a line chart from the data and save it as a jpg image
    line = go.Scatter(x=dd, y=dic)
    layout = go.Layout(title=school + area + zhiyuan)
    fg = go.Figure(line, layout)
    path = 'score/' + area + '/' + zhiyuan + '/'
    os.makedirs(path, exist_ok=True)
    po.write_image(fg, path + school + '.jpg', width=1920, height=1080)


def compare():
    # Cross-sectional comparison: cutoff scores of every school in each district, per year
    for area in x:
        s = df.from_dict(x[area]).T
        for zhiyuan in s:
            ss = s[zhiyuan].iloc[0]
            for year in ss:
                dic = []
                dd = []
                for j in ss[year]:
                    dd.append(j)
                    dic.append(float(ss[year][j]['sline']))
                zhu(year, zhiyuan, area, dd, dic)


def getline():
    # Longitudinal comparison: each school's cutoff score over the years, per volunteer type
    for area in x:
        s = df.from_dict(x[area]).T
        for zhiyuan in s:
            ss = s[zhiyuan].iloc[0]
            schools = ss['2020'].keys()
            for school in schools:
                dd = []
                dic = []
                for i in ss:
                    try:
                        score = float(ss[i][school]['sline'])
                        dd.append(i)
                        dic.append(score)
                    except (KeyError, TypeError, ValueError):
                        # Skip years where the school has no usable score
                        pass
                li(school, zhiyuan, area, dd, dic)


if __name__ == '__main__':
    getline()
    compare()

Final results
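Link: https://pan.baidu.com/s/1QXbLiPCaSNByiyJNdVUzXg Password: vows
Finally, best of luck to all the exam candidates!
This is Lao Wei's WeChat official account, which posts scraping case studies and notes. Everyone is welcome to join the discussion.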