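Reading files in a folder with Python multithreading: is it possible to use Python multithreading to read a number of files from a folder and process them to get a combined result?...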

Published: 2025/3/20 python 24 豆豆

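I think the easiest way to learn to use threads is the ThreadPoolExecutor class in the concurrent.futures module, because it takes only a few more lines than the usual synchronous for-loop. This applies to Python 3 in particular, but it can be adapted to Python 2.7.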

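Basically, you have a pool (a bunch) of threads waiting for work. Work is usually just a method/function that is sent to the pool together with its arguments, and the pool handles everything else (assigning tasks to available resources and scheduling them).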
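To make the "function plus arguments" idea concrete, here is a minimal sketch. The `square` function and the `max_workers=2` setting are placeholders chosen for illustration only; they are not part of the original answer.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    # Stand-in for real work; any callable can be submitted to the pool.
    return n * n


with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() takes the function followed by its arguments and returns a Future.
    future = executor.submit(square, 7)
    # result() blocks until a worker thread has finished the call.
    answer = future.result()

print(answer)
```

The caller never creates or joins a thread directly; the pool picks an idle worker and runs the call there.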

Suppose my log directory is structured like this:

    ~ $ tree log
    log
    ├── 1.log
    ├── 2.log
    ├── 3.log
    └── schedules
        ├── 1.log
        ├── 2.log
        └── 3.log

    1 directory, 6 files

So, first get the list of files (Python 3).

[The code listing for this step is missing from the source.]
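One possible way to build such a list is sketched below. It assumes the files of interest end in `.log` and that subdirectories such as `schedules` should be included; the helper name `list_log_files` is hypothetical and not taken from the original answer.

```python
from pathlib import Path


def list_log_files(root):
    """Return every *.log file under root, including subdirectories, as sorted strings."""
    # rglob() walks the directory tree recursively; is_file() skips directories.
    return sorted(str(path) for path in Path(root).rglob("*.log") if path.is_file())
```

For the directory tree shown earlier, `list_of_files = list_log_files("log")` would give six path strings, one per log file.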

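Each file (now just a string variable) is what you want a thread to process. So you write a generic function that takes a file argument and finds the string of interest in that file. It is basically the same as what you would write in a normal Python program, for example:

    def find_string(file):
        # insert your specific code to find your string
        # including opening the file and such
        # returning values also possible see further down
        print(file)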

Now you just send this work to the thread pool.

    from concurrent.futures import ThreadPoolExecutor

    # We can use a with statement to ensure threads are cleaned up promptly
    with ThreadPoolExecutor() as executor:
        # Basically the same as if you did the normal for-loop
        for file in list_of_files:
            # But you submit your method to the Pool instead.
            future = executor.submit(find_string, file)  # see future.result() too

    # Leaving the with block waits for all submitted tasks to finish
    print("All tasks complete")
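Since the question asks for a combined result, here is a sketch of how the return values could be collected with `future.result()`. The helpers `count_matches` and `count_in_files` are hypothetical, and counting lines that contain a substring is only an assumed stand-in for whatever per-file processing you actually need.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def count_matches(path, needle):
    """Count the lines in one file that contain needle (hypothetical helper)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if needle in line)


def count_in_files(paths, needle):
    """Search all files in a thread pool and merge the per-file counts."""
    per_file = {}
    with ThreadPoolExecutor() as executor:
        # Map each Future back to the file it was submitted for.
        futures = {executor.submit(count_matches, p, needle): p for p in paths}
        # as_completed() yields futures as they finish, in no particular order.
        for future in as_completed(futures):
            # result() returns the worker's value, or re-raises its exception.
            per_file[futures[future]] = future.result()
    return per_file, sum(per_file.values())
```

Keeping a dictionary from future to file name makes it easy to tell which file each result belongs to, because completion order is not the submission order.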

There is a good complete example here (search for "ThreadPoolExecutor Example"); it opens a list of websites and prints the size of each in bytes. You can adapt it to search files.

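With a large number of files, the bottleneck here will probably be disk read speed. Having the log files spread across multiple disks would be one way around that.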

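Another point: multithreading is usually used for network operations or I/O, so reading files is a good use for it. However, you are also doing some processing. Depending on how CPU-intensive that is, you may want to look at ProcessPoolExecutor, which uses the multiprocessing module. It shares the same interface as ThreadPoolExecutor.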
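As a rough sketch of how small the switch is, the following uses ProcessPoolExecutor with `map()`. The `cpu_heavy` function is a made-up placeholder for CPU-bound work, and `run_in_processes` and `max_workers=2` are likewise assumptions for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def cpu_heavy(n):
    """Stand-in for CPU-bound processing of one file's contents."""
    return sum(i * i for i in range(n))


def run_in_processes(inputs):
    # Same interface as ThreadPoolExecutor; map() returns results in input order.
    with ProcessPoolExecutor(max_workers=2) as executor:
        return list(executor.map(cpu_heavy, inputs))
```

Call `run_in_processes(...)` from under an `if __name__ == "__main__":` guard in a script, since worker processes may re-import the main module on some platforms. The worker function must also be defined at module level so that it can be pickled and sent to the workers.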

Hope this makes sense.
