

Python: Scraping Images at the Click of a Button (Multithreaded Image Scraping in Python)

Published: 2023/12/9 python 32 豆豆
This article, collected and organized by 生活随笔, covers multithreaded image scraping in Python. The editor found it useful and is sharing it here for reference.

Run it from cmd:

> python untitled2.py [URL of the page containing the images]

import requests
import threading
from bs4 import BeautifulSoup
import sys
import os

if len(sys.argv) != 2:
    print("Usage : ")
    # sys.argv[0] is the script name as it was typed on the command line
    print("        python " + sys.argv[0] + " [URL]")
    sys.exit(1)

# config-start
url = sys.argv[1]
threadNumber = 20  # maximum number of threads alive at once
# config-end

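def getContent(url):
    try:
        response = requests.get(url)
        response.raise_for_status()
        # guess the encoding from the body instead of trusting the headers
        response.encoding = response.apparent_encoding
        return response.text
    except Exception as e:
        print(e)
        # the error text stands in for the page, so the caller still gets a string
        return str(e)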

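def getTitle(soup):
    # soup.title is None when the page has no <title>, and .string can be None too
    if soup.title is not None and soup.title.string:
        return soup.title.string
    return "UnTitled"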

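def getImageLinks(soup):
    imgs = soup.find_all("img")
    result = []
    for img in imgs:
        src = img.get("src")
        if not src:
            continue  # skip <img> tags that have no src attribute
        if src.startswith("http"):
            result.append(src)
        else:
            # domain (set at module level) already ends with "/",
            # so drop leading slashes from src to avoid "//"
            result.append(domain + src.lstrip("/"))
    return result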

def makeDirectory(dicName):
    if not os.path.exists(dicName):
        os.mkdir(dicName)

def downloadImage(imgUrl, savePath):
    # use the last path segment of the URL as the file name
    local_filename = imgUrl.split('/')[-1]
    local_filename = formatFileName(local_filename)
    r = requests.get(imgUrl, stream=True)
    if not savePath.endswith("/"):
        savePath += "/"
    # the with-block closes the file even if a chunk fails to download
    with open(savePath + local_filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)

def formatFileName(fileName):
    # replace characters that are not allowed in Windows file names, plus spaces
    fileName = fileName.replace("/","_")
    fileName = fileName.replace("\\","_")
    fileName = fileName.replace(":","_")
    fileName = fileName.replace("*","_")
    fileName = fileName.replace("?","_")
    fileName = fileName.replace("\"","_")
    fileName = fileName.replace(">","_")
    fileName = fileName.replace("<","_")
    fileName = fileName.replace("|","_")
    fileName = fileName.replace(" ","_")
    return fileName

def threadFunction(imgSrc, directoryName):
    downloadImage(imgSrc, directoryName)

class myThread(threading.Thread):
    def __init__(self, imgSrc, directoryName):
        threading.Thread.__init__(self)
        self.imgSrc = imgSrc
        self.directoryName = directoryName

    def run(self):
        threadFunction(self.imgSrc, self.directoryName)

def getPrefix(url):
    # http://domain/xxx.jpg
    return ''.join(i+"/" for i in url.split("/")[0:4])

def getDomain(url):
    # scheme and host with a trailing slash, e.g. "http://domain/"
    return ''.join(i+"/" for i in url.split("/")[0:3])

content = getContent(url)
prefix = getPrefix(url)
domain = getDomain(url)
soup = BeautifulSoup(content, "html.parser")
images = getImageLinks(soup)
title = getTitle(soup)
title = formatFileName(title)
print("Page title : ", title)
print("Number of images on this page :", len(images))
print("Creating a folder to save all the images")
makeDirectory(title)

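threads = []
for image in images:
    print("Image URL : " + image)
    threads.append(myThread(image, title))
for t in threads:
    t.start()
    # wait until fewer than threadNumber threads are alive before starting the next one
    while True:
        if(len(threading.enumerate()) < threadNumber):
            break
print("All images have been queued for download! Downloading...")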
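The script above builds image URLs by gluing the site root onto each src value, and it caps concurrency by polling threading.enumerate() in a loop. As a rough sketch of an alternative, not the original author's method, the standard library's urljoin can resolve relative, root-relative and protocol-relative src values, and ThreadPoolExecutor can enforce the thread limit without polling. The names resolve_image_urls, download_all and fetch are hypothetical and exist only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urljoin


def resolve_image_urls(page_url, srcs):
    """Turn raw src values into absolute URLs, relative to page_url."""
    resolved = []
    for src in srcs:
        if not src or src.startswith("data:"):
            continue  # nothing to download for missing or inline images
        resolved.append(urljoin(page_url, src))
    return resolved


def download_all(urls, fetch, max_workers=20):
    """Call fetch(url) for every URL on at most max_workers threads.

    Returns a list of (url, error) pairs for the calls that raised.
    """
    failures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, u): u for u in urls}
        for future, u in futures.items():
            error = future.exception()  # blocks until this call has finished
            if error is not None:
                failures.append((u, error))
    return failures
```

Under these assumptions, the last part of the script could become download_all(resolve_image_urls(url, [img.get("src") for img in soup.find_all("img")]), lambda u: downloadImage(u, title), threadNumber). Passing fetch in as a parameter keeps the pool logic separate from the network code, so it can be tried out with a stand-in function first.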

Summary

That is the full text of the article on multithreaded image scraping in Python as collected by 生活随笔. Hopefully it helps you solve the problem you ran into.
