Scraping Baidu Migration data and exporting it to Excel

Published: 2023/12/10

Baidu Migration crawler

    • 1. Background
    • 2. Partial code
    • 3. Results
    • 4. Executable (.exe) download link

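1. Background

Someone on the school's confession wall offered a paid job to scrape Baidu Migration data, so I took it.
The script scrapes each day's data and generates three Excel files as required.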

2. Partial code

# Partial code: `code` (a dict mapping city name -> Baidu region ID, such as the
# id=440100 in the sample URL below) and `date_constant` (the date folder name)
# are defined elsewhere in the full script.
import json
import time

import requests
import xlwings as xw


def JsonTextConvert(text):
    # Strip the JSONP wrapper "callback( ... )" and return the JSON inside.
    # json.loads decodes \uXXXX escapes itself, so no unicode_escape pass is needed,
    # and only the final ")" is dropped so parentheses inside the data survive.
    head, sep, tail = text.partition('(')
    return tail.rpartition(')')[0]


def UrlFormate(rankMethod, dt, name, migrationType, date):
    # date is 'YYYYMMDD'; the callback name embeds local midnight as a Unix timestamp.
    timeArray = time.strptime(date, "%Y%m%d")
    timeUnix = int(time.mktime(timeArray))
    ID = code[name]
    if migrationType in ('in', 'out') or rankMethod == 'historycurve':
        url = 'http://huiyan.baidu.com/migration/{0}.jsonp?dt={1}&id={2}&type=move_{3}&date={4}&callback=jsonp_{5}000_0000000'.format(
            rankMethod, dt, ID, migrationType, date, timeUnix)
    elif rankMethod == 'internalflowhistory':
        url = 'http://huiyan.baidu.com/migration/{0}.jsonp?dt={1}&id={2}&date={3}&callback=jsonp_{4}000_0000000'.format(
            rankMethod, dt, ID, date, timeUnix)
    else:
        # Previously `url` was left unbound here.
        raise ValueError('unsupported rankMethod/migrationType combination')
    return url


def GetData(cityName, moveType, date, rankMethod):
    # rankMethod: 'cityrank', 'historycurve' or 'internalflowhistory'
    response = requests.get(UrlFormate(rankMethod, 'city', cityName, moveType, date), timeout=10)
    rawData = json.loads(JsonTextConvert(response.text))
    if rawData['errno'] == 501:
        return 501
    return rawData['data']['list']


def write_Excel(data, data_time, move_type):
    name = 'd:/qianxi/' + date_constant + '/' + data_time + "_" + move_type + ".xlsx"
    app = xw.App(visible=True, add_book=False)
    wb = app.books.add()
    sht = wb.sheets[0]  # first sheet, whatever its name
    sht.range('A1').value = data  # a nested list is written as a table starting at A1
    print(sht.range('A1').value)
    wb.save(name)
    wb.close()  # close the workbook
    app.quit()  # quit Excel


def function_cityrank(rankMethod, move_type, date_time):
    # Header row: blank corner cell followed by every city name.
    result = [[''] + list(code)]
    for a in code:
        tags = GetData(a, move_type, date_time, rankMethod)
        if tags == 501:
            return 501
        list_data = {tag['city_name']: tag['value'] for tag in tags}
        # One row per city: its name, then the value for each city (0 if absent).
        result.append([a] + [list_data.get(i, 0) for i in code])
    print(result)
    write_Excel(result, date_time, move_type)


def function_historycurve(date_time):
    result = [['city_name', 'move_in', 'move_out', 'internal']]
    # Sample URL:
    # http://huiyan.baidu.com/migration/internalflowhistory.jsonp?dt=city&id=440100&date=20201114&callback=jsonp_1605340876623_8581344
    for a in code:
        tags_in = GetData(a, 'in', date_time, 'historycurve')
        tags_out = GetData(a, 'out', date_time, 'historycurve')
        # internalflowhistory takes no move direction (the original passed the builtin `type`).
        tips = GetData(a, None, date_time, 'internalflowhistory')
        list_name = [a]
        for series in (tags_in, tags_out, tips):
            # Check each series against itself (the original tested tags_in three times);
            # use 0 when the date is missing or the request returned 501.
            list_name.append(series.get(date_time, 0) if isinstance(series, dict) else 0)
        result.append(list_name)
    print(result)
    write_Excel(result, date_time, "規模")


def create_document(date_time):
    tag = function_cityrank('cityrank', 'in', date_time)
    if tag == 501:
        print('No data for this date')
        return 501
    print('in_excel generated')
    function_cityrank('cityrank', 'out', date_time)
    print('out_excel generated')
    function_historycurve(date_time)
    print('guimo_excel generated')
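The table-building step in function_cityrank can be tried offline, without calling the Baidu endpoint. The sketch below is an illustration only, not the original script: strip_jsonp and pivot_rows are hypothetical helper names, the city names "A" and "B" are placeholders, and SAMPLE is a made-up response shaped the way the code above expects (an errno field plus data.list entries with city_name and value).

```python
import json


def strip_jsonp(text):
    """Return the JSON payload inside a JSONP wrapper such as cb({...})."""
    start = text.index('(') + 1
    end = text.rindex(')')
    return text[start:end]


def pivot_rows(cities, ranks_by_city):
    """Build a header row of cities, then one row of values per city.

    ranks_by_city maps a city to a list of {'city_name', 'value'} dicts;
    cities that do not appear in a list get 0.
    """
    table = [[''] + list(cities)]
    for city in cities:
        values = {r['city_name']: r['value'] for r in ranks_by_city.get(city, [])}
        table.append([city] + [values.get(other, 0) for other in cities])
    return table


# Made-up sample response, for illustration only (not real Baidu data).
SAMPLE = ('jsonp_1605340800000_0000000('
          '{"errno":0,"data":{"list":[{"city_name":"B","value":12.5}]}})')

payload = json.loads(strip_jsonp(SAMPLE))
table = pivot_rows(['A', 'B'], {'A': payload['data']['list']})
print(table)
```

The resulting nested list has the same shape as the one passed to write_Excel: the first row and first column hold city names, and every other cell is a value or 0.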

3. Results


Under D:, a qianxi folder is created, and inside it a folder for the date that holds the three required Excel files.


xxxx_in.xlsx


規模.xlsx (migration scale)
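The folder layout described in this section (a root folder, a date folder inside it, then three workbooks) can be sketched with pathlib. This is a minimal sketch under assumptions: output_paths is a hypothetical helper that is not part of the original script, and the demo writes into a temporary directory rather than D:/qianxi so that it runs on any machine.

```python
import tempfile
from pathlib import Path


def output_paths(root, date):
    """Create root/date/ if needed and return the three workbook paths for that day."""
    day_dir = Path(root) / date
    day_dir.mkdir(parents=True, exist_ok=True)
    # Suffixes follow the file names used by the script: in, out, and 規模 (scale).
    return [day_dir / f"{date}_{suffix}.xlsx" for suffix in ("in", "out", "規模")]


# Demo in a throwaway directory; the real script targets d:/qianxi.
with tempfile.TemporaryDirectory() as tmp:
    paths = output_paths(Path(tmp) / "qianxi", "20201114")
    names = [p.name for p in paths]
    created = paths[0].parent.is_dir()

print(names)
```

Creating the date folder before saving matters because saving a workbook into a folder that does not exist fails.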

4. Executable (.exe) download link
