[Python Crawler] Writing a crawler that visits your own blog posts, which can boost the view count
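Preparation
Install the external packages:
pip install bs4
pip install requests
pip install virtualenv    (this one is probably unnecessary)
pip install lxml
Step 1: Crawl the blog post links from your own home page
Code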
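A minimal sketch of this step, assuming the post links on the home page are ordinary `<a href>` elements whose URLs contain `/article/details/` followed by a numeric ID (the pattern seen in the post URLs used in step 2). The names `ArticleLinkParser`, `extract_article_links`, and `SAMPLE_HTML` are hypothetical, and the real page structure may differ. To stay runnable without extra packages, the sketch uses the standard library's `html.parser` in place of BeautifulSoup.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: post URLs look like .../article/details/<digits>
ARTICLE_PATTERN = re.compile(r"/article/details/\d+")


class ArticleLinkParser(HTMLParser):
    """Collects post links from <a href="..."> tags, without duplicates."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the home page URL
        url = urljoin(self.base_url, href)
        if ARTICLE_PATTERN.search(url) and url not in self.links:
            self.links.append(url)


def extract_article_links(html, base_url):
    """Return the post URLs found in html, in page order."""
    parser = ArticleLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical stand-in for the HTML of a home page
SAMPLE_HTML = """
<div>
  <a href="/sinat_42483341/article/details/91826523">Post A</a>
  <a href="https://blog.csdn.net/sinat_42483341/article/details/89931215">Post B</a>
  <a href="/sinat_42483341/article/details/91826523">Post A again</a>
  <a href="/about">About</a>
</div>
"""

if __name__ == "__main__":
    for url in extract_article_links(SAMPLE_HTML, "https://blog.csdn.net/"):
        print(url)
```

For a real page, the HTML string could come from `requests.get(home_url, headers=headers, timeout=10).text`, where `home_url` is the address of your own home page.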
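Step 2: Visit these links with requests
This only sends the requests and does nothing with the responses, which is equivalent to visiting each of your own blog posts once.
Code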
# coding: utf-8
import requests

# Browser-like User-Agent sent with every request
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Ubuntu Chromium/44.0.2403.89 '
                  'Chrome/44.0.2403.89 '
                  'Safari/537.36'
}


def get_page(blog_urls):
    # Request each post once; the response body is not used
    for blog_url in blog_urls:
        try:
            requests.get(blog_url, headers=HEADERS, timeout=10)
            print("url=" + blog_url)
        except requests.RequestException as e:
            # Report the failure and continue with the next link
            print("failed: " + blog_url + " (" + str(e) + ")")


def write_to_file(content):
    # Append text to article.txt (not called in this step)
    with open('article.txt', 'a', encoding='utf-8') as f:
        f.write(content)


if __name__ == '__main__':
    blog_urls = [
        'https://blog.csdn.net/sinat_42483341/article/details/91826523',
        'https://blog.csdn.net/sinat_42483341/article/details/89931215',
        'https://blog.csdn.net/sinat_42483341/article/details/89034286',
        'https://blog.csdn.net/sinat_42483341/article/details/88849892',
        'https://blog.csdn.net/sinat_42483341/article/details/95871910',
        'https://blog.csdn.net/sinat_42483341/article/details/95768679',
        'https://blog.csdn.net/sinat_42483341/article/details/95495296',
        'https://blog.csdn.net/sinat_42483341/article/details/95043847',
        'https://blog.csdn.net/sinat_42483341/article/details/95014941',
        'https://blog.csdn.net/sinat_42483341/article/details/94969983',
        'https://blog.csdn.net/sinat_42483341/article/details/94492282',
        'https://blog.csdn.net/sinat_42483341/article/details/94443619',
        'https://blog.csdn.net/sinat_42483341/article/details/94388710',
        'https://blog.csdn.net/sinat_42483341/article/details/94296696',
        'https://blog.csdn.net/sinat_42483341/article/details/94133323',
        'https://blog.csdn.net/sinat_42483341/article/details/94053208',
        'https://blog.csdn.net/sinat_42483341/article/details/94050774',
        'https://blog.csdn.net/sinat_42483341/article/details/93769801',
        'https://blog.csdn.net/sinat_42483341/article/details/93746360',
        'https://blog.csdn.net/sinat_42483341/article/details/93739451',
    ]
    get_page(blog_urls)

Step 3: (to be written)
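To refresh the view count automatically, you obviously cannot paste the URL strings into the list by hand every time. The script should collect all the links into the list automatically and then visit them one by one. I have not gotten around to writing that part in Python, but a Java version is already finished and can serve as a reference:
[Java Crawler] Writing my own crawler for practice, boosting CSDN view counts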
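The idea described above (collect the links automatically, then visit them one by one) might be sketched in Python as follows. This is not the author's Java implementation. The names `collect_and_visit` and `fetch_with_requests`, the regular expression, and the one-second delay are all assumptions made for illustration. The `fetch` argument is any function that takes a URL and returns the page text, which allows the loop to be tried out without network access.

```python
import re
import time

# Assumption: post URLs appear in the home page HTML in this absolute form
ARTICLE_URL = re.compile(r"https://blog\.csdn\.net/\w+/article/details/\d+")


def collect_and_visit(home_url, fetch, delay_seconds=1.0):
    """Fetch the home page, collect post URLs, then request each one.

    Returns the list of URLs that were fetched without raising an error.
    """
    home_html = fetch(home_url)
    # dict.fromkeys drops duplicates while keeping page order
    blog_urls = list(dict.fromkeys(ARTICLE_URL.findall(home_html)))

    visited = []
    for blog_url in blog_urls:
        try:
            fetch(blog_url)
            visited.append(blog_url)
        except Exception as exc:
            # Skip a failing link instead of aborting the whole run
            print("failed:", blog_url, exc)
        # Pause between requests so the server is not hit in a burst
        time.sleep(delay_seconds)
    return visited


def fetch_with_requests(url):
    """One possible fetch function, built on the requests package."""
    import requests  # imported here so the loop above has no hard dependency

    response = requests.get(
        url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10
    )
    response.raise_for_status()
    return response.text
```

A run against a real blog would then look like `collect_and_visit(home_url, fetch_with_requests)`, with `home_url` set to your own home page address.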