
Python web scraping: lxml XPath tests

Published: 2025/3/11 · python · 豆豆

XPath test 1:
main.py

"""=== coding: UTF8 ==="""
from lxml import etree

xml = """
<book>
    <id>1</id>
    <name>春風</name>
    <price>1.56</price>
    <nick>萬里</nick>
    <author>
        <nick id="10086">周大慶</nick>
        <nick id="10010">黃天山</nick>
        <nick class="joy">周談浩</nick>
        <div>
            <nick>嘟嘟</nick>
        </div>
        <span>
            <nick>珊瑚</nick>
        </span>
    </author>
</book>
"""

"""
========================================
Main function test
========================================
"""
if __name__ == '__main__':
    tree = etree.XML(xml)
    # result = tree.xpath("/book")  # / expresses hierarchy; the leading / is the root node
    # result = tree.xpath("/book/name/text()")  # text() returns the text content
    # result = tree.xpath("/book/author//nick/text()")  # // matches descendants at any depth
    result = tree.xpath("/book/author/*/nick/text()")  # * is a wildcard for any single node
    print(result)
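To make the difference between the four expressions easier to see, here is a minimal sketch that runs them side by side. The trimmed-down `XML` sample, the `run_queries` helper, and the result labels are assumptions made for illustration; they are not part of the original script.

```python
from lxml import etree

# Hypothetical, shortened copy of the sample document used in test 1.
XML = """
<book>
    <name>春風</name>
    <nick>萬里</nick>
    <author>
        <nick id="10086">周大慶</nick>
        <div><nick>嘟嘟</nick></div>
        <span><nick>珊瑚</nick></span>
    </author>
</book>
"""


def run_queries(xml_text):
    """Run the four XPath expressions from test 1 and collect the results."""
    tree = etree.XML(xml_text)
    return {
        # Absolute path to the root element; report tag names for readability.
        "root": [el.tag for el in tree.xpath("/book")],
        # text() returns the text content of the matched nodes.
        "name": tree.xpath("/book/name/text()"),
        # // matches nick elements at any depth below author.
        "descendants": tree.xpath("/book/author//nick/text()"),
        # * matches exactly one intermediate element of any name.
        "wildcard": tree.xpath("/book/author/*/nick/text()"),
    }


results = run_queries(XML)
for label, value in results.items():
    print(label, value)
# root ['book']
# name ['春風']
# descendants ['周大慶', '嘟嘟', '珊瑚']
# wildcard ['嘟嘟', '珊瑚']
```

Note that `//nick` under `author` also picks up the direct child 周大慶, while `*/nick` only returns the nicks that sit exactly one level deeper. The top-level nick 萬里 is outside `author`, so neither expression matches it.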

XPath test 2:
test.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>測試</title>
</head>
<body>
    <ul>
        <li><a href="http://www.baidu.com">百度</a></li>
        <li><a href="http://www.google.com">谷歌</a></li>
        <li><a href="http://www.sogou.com">搜狗</a></li>
    </ul>
    <ol>
        <li><a href="feiji">飛機</a></li>
        <li><a href="dapao">大炮</a></li>
        <li><a href="huoche">火車</a></li>
    </ol>
    <div class="job">高凡爾</div>
    <div class="common">劉珂</div>
</body>
</html>

main.py

"""=== coding: UTF8 ==="""
from lxml import etree

"""
========================================
Main function test
========================================
"""
if __name__ == '__main__':
    parser = etree.HTMLParser(encoding='utf-8')
    tree = etree.parse("test.html", parser=parser)
    # result = tree.xpath("/html")  # / expresses hierarchy; the leading / is the root node
    # result = tree.xpath("/html/body/ul/li/a/text()")  # text() returns the text content
    # result = tree.xpath("/html/body/ul/li[1]/a/text()")  # XPath positions start at 1; [] selects by index
    # result = tree.xpath("/html/body/ol/li/a[@href='dapao']/text()")  # [@attr='value'] filters by attribute
    # print(result)
    ol_li_list = tree.xpath("/html/body/ol/li")
    for li in ol_li_list:
        # Extract the link text from each li
        result = li.xpath("./a/text()")  # relative query: keep searching inside this li
        print(result)
        result = li.xpath("./a/@href")  # @attr returns the attribute value
        print(result)
    print(tree.xpath("/html/body/ul/li/a/@href"))
    print(tree.xpath("/html/body/div[1]/text()"))
    print(tree.xpath("/html/body/ol/li/a/text()"))
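The same queries can be tried without a test.html file on disk. The sketch below is an illustration under a few assumptions: it parses a shortened copy of the page from a string with `etree.HTML` instead of `etree.parse`, and the variable names and the `div[@class='job']` filter are additions that do not appear in the original script.

```python
from lxml import etree

# Hypothetical, shortened copy of test.html held in a string.
HTML = """
<html>
<body>
    <ul>
        <li><a href="http://www.baidu.com">百度</a></li>
        <li><a href="http://www.google.com">谷歌</a></li>
    </ul>
    <ol>
        <li><a href="feiji">飛機</a></li>
        <li><a href="dapao">大炮</a></li>
    </ol>
    <div class="job">高凡爾</div>
    <div class="common">劉珂</div>
</body>
</html>
"""

# etree.HTML parses a string and returns the root <html> element.
tree = etree.HTML(HTML)

# Positional index: XPath counts from 1, so li[1] is the first list item.
first_link = tree.xpath("/html/body/ul/li[1]/a/text()")

# Attribute filters: keep only elements whose attribute equals the value.
by_href = tree.xpath("/html/body/ol/li/a[@href='dapao']/text()")
by_class = tree.xpath("/html/body/div[@class='job']/text()")

# Relative queries: "./" starts from the current li, not the document root.
pairs = []
for li in tree.xpath("/html/body/ol/li"):
    text = li.xpath("./a/text()")
    href = li.xpath("./a/@href")
    pairs.append((text[0], href[0]))

print(first_link)  # ['百度']
print(by_href)     # ['大炮']
print(by_class)    # ['高凡爾']
print(pairs)       # [('飛機', 'feiji'), ('大炮', 'dapao')]
```

Every call to `xpath` returns a list, even when only one node matches, which is why the loop takes element `[0]` before pairing the text with its href.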
