
Scraping 51job with Python

Published: 2024/1/1 python 22 豆豆


Without further ado, here is the code.

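import re
import time
import urllib.request

from bs4 import BeautifulSoup

'''
Project goal: scrape job title, company, location, salary, and posting date from 51job.
1. Fetch the whole results page from its URL.
2. Extract the required fields from the fetched page.
3. Store each record in a dict and append it to a file.
'''


class python_job():
    def __init__(self):
        # Dict that holds the record currently being built
        self.date = {}

    # Fetch one page of 51job search results by page number
    def get_content(self, namber):
        url = 'https://search.51job.com/list/200200,000000,0000,00,9,99,python,2,'
        # Append the page number to build the full URL
        new_url = url + str(namber) + '.html'
        # Request headers carrying a browser User-Agent
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) '
                          'AppleWebKit/537.36 (KHTML, like Gecko) '
                          'Chrome/76.0.3809.100 Safari/537.36'
        }
        # Build the request object
        request = urllib.request.Request(url=new_url, headers=headers)
        # Fetch the page
        content = urllib.request.urlopen(request)
        return self.get_response(content)

    # Locate the job table in the results page
    def get_response(self, content):
        # Build the soup object
        soup = BeautifulSoup(content, 'lxml')
        # heat_ret: header row with the column names; body_ret: the table rows
        heat_ret = soup.find_all('div', class_='el title')
        body_ret = soup.select('.dw_table > .el')
        # Drop the first match, presumably the header row, which also has class "el"
        body_ret.pop(0)
        return self.get_head_body(heat_ret, body_ret)

    # Combine the column names with each row's values and save them
    def get_head_body(self, heat_ret, body_ret):
        # Column names taken from the header row
        head = heat_ret[0].find_all('span')
        # Append to the output file; "with" closes it even if a row fails
        with open('python.txt', 'a', encoding='utf8') as fp:
            for i in body_ret:
                body_head = i.find('a')
                # Spans whose class starts with "t" plus a digit hold the remaining fields
                body_body = i.find_all('span', class_=re.compile(r'^t\d'))
                # Job title
                self.date[head[0].text] = str(body_head.text).strip()
                # Company name
                self.date[head[1].text] = str(body_body[0].text).strip()
                # Location
                self.date[head[2].text] = str(body_body[1].text).strip()
                # Salary
                self.date[head[3].text] = str(body_body[2].text).strip()
                # Posting date
                self.date[head[4].text] = str(body_body[3].text).strip()
                # One record per line
                fp.write(str(self.date) + '\n')
        # Pause between pages to avoid hammering the site
        time.sleep(5)
        print('Page downloaded.')

    # Entry point for downloading page i
    def first(self, i):
        print('Page %s: download started...' % i)
        return self.get_content(i)


if __name__ == '__main__':
    staet_num = int(input('Enter the start page: '))
    end_num = int(input('Enter the end page: '))
    a = python_job()
    for i in range(staet_num, end_num + 1):
        a.first(i)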
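
The script above saves each record with str(self.date), which writes a Python dict literal that other tools cannot easily read back. As a possible alternative, here is a minimal sketch that stores one JSON object per line using only the standard library. The helper names append_records and load_records, the file name python.jsonl, and the sample rows are all made up for illustration and are not part of the original script.

```python
import json
import os
import tempfile


def append_records(path, records):
    """Append each dict to the file as one JSON object per line."""
    with open(path, 'a', encoding='utf8') as fp:
        for record in records:
            # ensure_ascii=False keeps Chinese text readable in the file
            fp.write(json.dumps(record, ensure_ascii=False) + '\n')


def load_records(path):
    """Read the file back into a list of dicts, skipping blank lines."""
    with open(path, encoding='utf8') as fp:
        return [json.loads(line) for line in fp if line.strip()]


if __name__ == '__main__':
    # Hypothetical sample rows, not real 51job data; the keys follow
    # the field names used in the script's comments.
    sample = [
        {'職位名': 'Python developer', '公司名': 'Example Co. A',
         '工作地點': 'Xi\'an', '薪資': '', '發布時間': '01-01'},
        {'職位名': 'Data engineer', '公司名': 'Example Co. B',
         '工作地點': 'Xi\'an', '薪資': '', '發布時間': '01-01'},
    ]
    path = os.path.join(tempfile.mkdtemp(), 'python.jsonl')
    append_records(path, sample)
    # The round trip should give back exactly what was written
    print(load_records(path) == sample)
```

To try this in the scraper, get_head_body could collect a copy of each row's dict in a list and pass that list to append_records in place of the fp.write call. Treat this as one option, not the author's method.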
