Logging in to Douyu with Python: Simulating a Douyu Login with Python 3 and Selenium, Extracting Data, and Saving It to a Database

Published: 2023/12/8 python 26 豆豆

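This article, collected by 生活随笔, shows how to use Python 3 and Selenium to open Douyu in a browser, extract live-room data from the listing pages, and save it to a database. It is shared here as a reference.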

# coding=utf-8
import json
import time

import pymongo
from selenium import webdriver
from selenium.webdriver.common.by import By


class Douyu:
    def __init__(self):
        self.driver = webdriver.Chrome()
        # Request the listing page
        self.driver.get("https://www.douyu.com/directory/all")
        self.host = '127.0.0.1'
        self.port = 27017
        self.DBname = 'douyu'

    def get_content(self):
        # Fixed wait so the page has time to render
        time.sleep(3)
        # find_elements(By.XPATH, ...) replaces find_elements_by_xpath,
        # which was removed in Selenium 4
        li_list = self.driver.find_elements(By.XPATH, '//ul[@id="live-list-contentbox"]/li')
        # print(li_list)
        contents = []
        # Iterate over the room list
        for i in li_list:
            item = {}
            # Room image
            item['img'] = i.find_element(By.XPATH, './a//img').get_attribute("src")
            # Room title
            item['title'] = i.find_element(By.XPATH, './a').get_attribute("title")
            # Room category
            item['category'] = i.find_element(By.XPATH, './a/div[@class="mes"]/div/span').text
            # Streamer name
            item['name'] = i.find_element(By.XPATH, "./a/div[@class='mes']/p/span[1]").text
            # Viewer count
            item['watch_num'] = i.find_element(By.XPATH, "./a/div[@class='mes']/p/span[2]").text
            # print(item)
            contents.append(item)
        return contents

    # Save to MongoDB
    def save_content(self, contents):
        # insert_many() rejects an empty list, so skip empty pages
        if not contents:
            return
        # Create the MongoDB connection
        client = pymongo.MongoClient(host=self.host, port=self.port)
        # Select the database, then a collection with the same name
        mdb = client[self.DBname]
        self.post = mdb[self.DBname]
        # Collection.insert() was removed in PyMongo 4; insert_many() takes a list of documents
        self.post.insert_many(contents)

    # Save to a local file instead
    # def save_content(self, contents):
    #     with open("douyu.json", "a", encoding="utf-8") as f:
    #         for content in contents:
    #             json.dump(content, f, ensure_ascii=False, indent=2)
    #             f.write(',\n')

    def run(self):
        # 1. Request the listing page (done in __init__)
        # 2. Extract the first page
        contents = self.get_content()
        # 3. Save it
        self.save_content(contents)
        # 4. Keep clicking the next-page button until no element
        #    with the class "shark-pager-next" is left
        while True:
            # find_elements() returns an empty list when nothing matches,
            # whereas find_element() would raise NoSuchElementException
            next_buttons = self.driver.find_elements(By.CLASS_NAME, "shark-pager-next")
            if not next_buttons:
                break
            # 5. Click the next-page button
            next_buttons[0].click()
            # 6. Extract the next page
            contents = self.get_content()
            # 7. Save it
            self.save_content(contents)
        # Close the browser once the last page has been saved
        self.driver.quit()


if __name__ == '__main__':
    douyu = Douyu()
    douyu.run()
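
The script stores `watch_num` as the raw text of the page element. If numeric values are easier to sort or query, the text could be normalized before saving. The following is a minimal sketch under a stated assumption: that counts appear either as plain digits or with a 万 (ten thousand) suffix, for example `1.2万`. Both that format and the helper name `parse_watch_num` are hypothetical and do not come from the original script.

```python
from decimal import Decimal, InvalidOperation


def parse_watch_num(text):
    """Convert a viewer-count string to an int, or return None if unparseable.

    Assumed (hypothetical) formats: "3456" or "1.2万", where 万 means 10,000.
    """
    text = text.strip()
    multiplier = 1
    if text.endswith("万"):
        multiplier = 10000
        text = text[:-1]
    try:
        # Decimal avoids binary floating-point rounding on inputs like "1.2"
        return int(Decimal(text) * multiplier)
    except (InvalidOperation, ValueError, OverflowError):
        # Empty strings, non-numeric text, NaN, and infinity all end up here
        return None
```

In `get_content`, a helper like this could wrap the `.text` value before it is assigned to `item['watch_num']`.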

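The commented-out local-file variant of `save_content` appends pretty-printed objects separated by `,\n`, so the resulting file cannot be read back with `json.load`. One alternative is the JSON Lines layout, with one object per line. The sketch below uses only the standard library; the helper names `append_jsonl` and `read_jsonl` and the `.jsonl` file name are illustrative assumptions, not part of the original script.

```python
import json


def append_jsonl(path, contents):
    """Append each dict in contents to path as one JSON object per line."""
    with open(path, "a", encoding="utf-8") as f:
        for content in contents:
            # ensure_ascii=False keeps Chinese titles readable in the file
            f.write(json.dumps(content, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read the file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Because every line is a complete JSON document, pages can be appended one after another during the pagination loop and the file stays parseable at every step.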