Python's re Module: compile() and findall() Explained

Published: 2023/12/20

First, let's look at what the official documentation says about compile():

re.compile(pattern, flags=0)

    Compile a regular expression pattern into a regular expression object, which can be used for matching using its match() and search() methods, described below.

    The expression’s behaviour can be modified by specifying a flags value. Values can be any of the following variables, combined using bitwise OR (the | operator).

    The sequence

        prog = re.compile(pattern)
        result = prog.match(string)

    is equivalent to

        result = re.match(pattern, string)

    but using re.compile() and saving the resulting regular expression object for reuse is more efficient when the expression will be used several times in a single program.

    Note: The compiled versions of the most recent patterns passed to re.compile() and the module-level matching functions are cached, so programs that use only a few regular expressions at a time needn’t worry about compiling regular expressions.
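The equivalence described in the documentation can be checked with a small sketch. The pattern and the sample strings here are illustrative assumptions, not part of the documentation:

```python
import re

# Hypothetical pattern and inputs, chosen only for illustration.
pattern = r"\d+"
lines = ["order 66", "no digits here", "42 is the answer"]

# Compile once, then reuse the pattern object for every input.
prog = re.compile(pattern)
compiled_results = [prog.match(line) for line in lines]

# Module-level call: same result, but the pattern is passed in on every call.
direct_results = [re.match(pattern, line) for line in lines]

# match() only matches at the start of the string, so only the last line matches.
compiled_groups = [m.group() if m else None for m in compiled_results]
direct_groups = [m.group() if m else None for m in direct_results]

print(compiled_groups)                   # [None, None, '42']
print(compiled_groups == direct_groups)  # True
```

Both forms give the same matches; keeping the compiled object mainly pays off when the same pattern is applied many times.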

Here is the description of the DOTALL flag:

re.DOTALL

    Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.
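A minimal sketch of the difference DOTALL makes, using a made-up two-line string (the text and patterns are assumptions for illustration):

```python
import re

# Hypothetical input containing a newline.
text = "first line\nsecond line"

# Without DOTALL, '.' stops at the newline.
without_dotall = re.findall(r"first.*", text)
print(without_dotall)  # ['first line']

# With DOTALL, '.' also matches '\n', so '.*' runs to the end of the string.
with_dotall = re.findall(r"first.*", text, re.DOTALL)
print(with_dotall)     # ['first line\nsecond line']

# Flags are combined with bitwise OR, as the compile() description says.
prog = re.compile(r"FIRST.*line", re.DOTALL | re.IGNORECASE)
combined = prog.findall(text)
print(combined)        # ['first line\nsecond line']
```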

》》》》》》》》》》》》》》》》》》》》

Here is the description of findall():

re.findall(pattern, string, flags=0)

    Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
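The "non-overlapping" and "left-to-right" parts of this description can be seen in a short sketch. The input strings are invented for illustration:

```python
import re

# Non-overlapping: after 'aa' matches at index 0, scanning resumes at index 2,
# so the five-character string yields two matches, not four.
non_overlapping = re.findall(r"aa", "aaaaa")
print(non_overlapping)  # ['aa', 'aa']

# Matches come back in the order they appear in the string, not sorted.
in_order = re.findall(r"\d+", "b12 a7 c305")
print(in_order)         # ['12', '7', '305']

# When nothing matches, the result is an empty list rather than None.
no_match = re.findall(r"\d+", "no digits")
print(no_match)         # []
```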

》》》》》》》》》》》》》》》》》》》》

Here is an example to illustrate:

>>> import re
>>> s = "adfad asdfasdf asdfas asdfawef asd adsfas "
>>> # Two groups (one nested in the other): each element is a tuple
>>> reObj1 = re.compile(r'((\w+)\s+\w+)')
>>> reObj1.findall(s)
[('adfad asdfasdf', 'adfad'), ('asdfas asdfawef', 'asdfas'), ('asd adsfas', 'asd')]
>>> # One group: each element is the text captured by that group
>>> reObj2 = re.compile(r'(\w+)\s+\w+')
>>> reObj2.findall(s)
['adfad', 'asdfas', 'asd']
>>> # No groups: each element is the whole match
>>> reObj3 = re.compile(r'\w+\s+\w+')
>>> reObj3.findall(s)
['adfad asdfasdf', 'asdfas asdfawef', 'asd adsfas']

[Figure from the original post illustrating the code above; image not available]


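From the code above, we can see the following.

findall() always returns a list of all matches of the regular expression in the string. The point of interest here is how each "result" in that list is presented, that is, what each element of the returned list contains.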

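1. When the regular expression contains more than one pair of capturing parentheses, each element of the list is a tuple of strings. The tuple has one string per pair of parentheses, each string is the text matched by the sub-expression inside that pair, and the strings are ordered by where each opening parenthesis appears in the pattern.

2. When the regular expression contains exactly one pair of capturing parentheses, each element of the list is a string holding the text matched by the sub-expression inside the parentheses (not the text matched by the whole regular expression).

3. When the regular expression contains no parentheses, each element of the list is a string holding the text matched by the whole regular expression.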
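The three rules above raise a practical question: what if parentheses are needed for grouping, but the whole match is still wanted? The sketch below shows two standard options from the re module that the original example does not cover: a non-capturing group (?:...) and re.finditer(). It reuses the sample string from the example:

```python
import re

s = "adfad asdfasdf asdfas asdfawef asd adsfas "

# A non-capturing group (?:...) groups without capturing, so findall()
# returns whole matches, as if the pattern had no parentheses (rule 3).
whole = re.findall(r"(?:\w+)\s+\w+", s)
print(whole)
# ['adfad asdfasdf', 'asdfas asdfawef', 'asd adsfas']

# finditer() yields match objects, which expose both the whole match
# (group 0) and each capturing group.
pairs = [(m.group(0), m.group(1)) for m in re.finditer(r"(\w+)\s+\w+", s)]
print(pairs)
# [('adfad asdfasdf', 'adfad'), ('asdfas asdfawef', 'asdfas'), ('asd adsfas', 'asd')]
```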

《《《《《《《《《《《《《《《《《

The list returned by re.compile(...).findall(data) can be turned into a str, either by indexing into the list or by calling str.join(), which makes the result easier to process. Here is the str.join() documentation from Python 3.5:

str.join(iterable)

    Return a string which is the concatenation of the strings in the iterable iterable. A TypeError will be raised if there are any non-string values in iterable, including bytes objects. The separator between elements is the string providing this method.
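A short sketch of both approaches, applied to the sample string used earlier. The separators and variable names are arbitrary choices for illustration:

```python
import re

s = "adfad asdfasdf asdfas asdfawef asd adsfas "

# One group: findall() returns a list of strings.
words = re.compile(r"(\w+)\s+\w+").findall(s)
first = words[0]          # pick one element by index
joined = ",".join(words)  # or concatenate them all with a separator
print(first)   # adfad
print(joined)  # adfad,asdfas,asd

# Two groups: findall() returns a list of tuples, so joining the list
# directly raises TypeError, because the elements are not strings.
rows = re.compile(r"((\w+)\s+\w+)").findall(s)
try:
    ",".join(rows)
    join_failed = False
except TypeError:
    join_failed = True
print(join_failed)  # True

# Join each tuple separately instead.
lines = [" | ".join(row) for row in rows]
print(lines)
# ['adfad asdfasdf | adfad', 'asdfas asdfawef | asdfas', 'asd adsfas | asd']
```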

With this background, writing the regular expressions for a web crawler should be much easier.
