Parsing XML from a URL in Python: how do I read an XML file from a URL in Python?
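The title elements are not found because of the namespace: the elements belong to the urn:hl7-org:v3 namespace, so each tag name passed to find or findall has to be qualified with it.
Below is sample code that finds the title in the "document" tag and the titles from the inner "component" tags.

import xml.etree.ElementTree as ET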
import urllib.request
url = 'https://dailymed.nlm.nih.gov/dailymed/services/v2/spls/fe9e8b7d-61ea-409d-84aa-3ebd79a046b5.xml'
response = urllib.request.urlopen(url).read()
tree = ET.fromstring(response)
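# Titles that are direct children of the root "document" element
for docTitle in tree.findall('{urn:hl7-org:v3}title'):
    print(docTitle.text)
# Titles at any depth, including those inside "component" elements
for compTitle in tree.findall('.//{urn:hl7-org:v3}title'):
    print(compTitle.text)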
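As a small illustration of the namespace point, ElementTree also accepts a prefix-to-URI mapping as the second argument of find and findall, which avoids repeating the `{urn:hl7-org:v3}` literal in every path. The sketch below runs against a made-up, heavily reduced stand-in for the DailyMed document; the `hl7` prefix and the sample titles are assumptions chosen for the example, not taken from the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced stand-in for the SPL document (not real data).
SAMPLE = """<document xmlns="urn:hl7-org:v3">
  <title>Document title</title>
  <component>
    <section>
      <title>Section title</title>
    </section>
  </component>
</document>"""

# The prefix "hl7" is arbitrary; only the URI has to match the document.
NSMAP = {"hl7": "urn:hl7-org:v3"}

root = ET.fromstring(SAMPLE)

# Without the namespace nothing matches, which is the original problem.
no_ns = root.findall("title")

# Direct children of the root only.
doc_titles = [t.text for t in root.findall("hl7:title", NSMAP)]

# Every title at any depth, the document title included.
all_titles = [t.text for t in root.findall(".//hl7:title", NSMAP)]

print(no_ns)
print(doc_titles)
print(all_titles)
```

Note that the `.//` search also returns the document-level title, so the second loop in the answer prints that title again before the component titles.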
Update
Example:
[code sample missing from the source]
This example prints the name of the author with ID 829076996.
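The code for this example is not preserved here, so the following is only a sketch of one way such a lookup could work, not the original answer's code. It assumes the ID is stored in the `extension` attribute of an `id` element that sits next to a `name` element, which is how the later code in this answer reads IDs and names. The helper `find_name_by_id`, the inline XML, and the organization name in it are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "{urn:hl7-org:v3}"

# Hypothetical reduced document; the organization name is invented.
SAMPLE = """<document xmlns="urn:hl7-org:v3">
  <author>
    <assignedEntity>
      <representedOrganization>
        <id extension="829076996"/>
        <name>Example Author Org</name>
      </representedOrganization>
    </assignedEntity>
  </author>
</document>"""


def find_name_by_id(root, extension):
    """Return the text of the <name> that shares a parent with the
    <id extension="..."> element, or None if no such id exists."""
    for parent in root.iter():
        id_el = parent.find(NS + "id")  # direct child only
        if id_el is not None and id_el.get("extension") == extension:
            return parent.findtext(NS + "name")
    return None


root = ET.fromstring(SAMPLE)
print(find_name_by_id(root, "829076996"))
```

Scanning every element with `iter()` keeps the sketch independent of the exact nesting depth; with a known path, a single `find` on that path would be more direct.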
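Update 2
You can use the findall method to process all assignedEntity tags easily.
Each of them can have multiple products, so a second findall is needed (see the example below).

# Namespace in the {uri} form ElementTree expects, the same one used in the first example
NS = '{urn:hl7-org:v3}'
xPathAssignedEntities = ''.join([
    ".//",
    NS, "author/",
    NS, "assignedEntity/",
    NS, "representedOrganization/",
    NS, "assignedEntity/",
    NS, "assignedOrganization/",
    NS, "assignedEntity"
])
xPathProdCode = ''.join([
    NS, "actDefinition/",
    NS, "product/",
    NS, "manufacturedProduct/",
    NS, "manufacturedMaterialKind/",
    NS, "code"
])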
# GET ALL assignedEntity TAGS
for assignedEntity in tree.findall(xPathAssignedEntities):
    # GET ID AND NAME OF assignedEntity
    org_id = assignedEntity.find(NS + 'assignedOrganization/' + NS + 'id').get('extension')
    name = assignedEntity.find(NS + 'assignedOrganization/' + NS + 'name').text
    # FOR EACH assignedEntity WE CAN HAVE MULTIPLE performance TAGS
    for performance in assignedEntity.findall(NS + 'performance'):
        actCode = performance.find(NS + 'actDefinition/' + NS + 'code').get('displayName')
        prodCode = performance.find(xPathProdCode).get('code')
        print(org_id, '\t', name, '\t', actCode, '\t', prodCode)
The result is:
829084545 Pfizer Pharmaceuticals LLC ANALYSIS 0049-0050
829084545 Pfizer Pharmaceuticals LLC ANALYSIS 0049-4900
829084545 Pfizer Pharmaceuticals LLC ANALYSIS 0049-4910
829084545 Pfizer Pharmaceuticals LLC ANALYSIS 0049-4940
829084545 Pfizer Pharmaceuticals LLC ANALYSIS 0049-4960
829084545 Pfizer Pharmaceuticals LLC API MANUFACTURE 0049-0050
829084545 Pfizer Pharmaceuticals LLC API MANUFACTURE 0049-4900
829084545 Pfizer Pharmaceuticals LLC API MANUFACTURE 0049-4910
829084545 Pfizer Pharmaceuticals LLC API MANUFACTURE 0049-4940
829084545 Pfizer Pharmaceuticals LLC API MANUFACTURE 0049-4960
829084545 Pfizer Pharmaceuticals LLC MANUFACTURE 0049-4900
829084545 Pfizer Pharmaceuticals LLC MANUFACTURE 0049-4910
829084545 Pfizer Pharmaceuticals LLC MANUFACTURE 0049-4960
829084545 Pfizer Pharmaceuticals LLC PACK 0049-4900
829084545 Pfizer Pharmaceuticals LLC PACK 0049-4910
829084545 Pfizer Pharmaceuticals LLC PACK 0049-4960
618054084 Pharmacia and Upjohn Company LLC ANALYSIS 0049-0050
618054084 Pharmacia and Upjohn Company LLC ANALYSIS 0049-4940
829084552 Pfizer Pharmaceuticals LLC PACK 0049-4900
829084552 Pfizer Pharmaceuticals LLC PACK 0049-4910
829084552 Pfizer Pharmaceuticals LLC PACK 0049-4960
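One caveat with chaining `find(...).get(...)` is that `find` returns None when an element is absent, so a single incomplete entry raises AttributeError. The sketch below is a more defensive variant of the same traversal that collects rows instead of printing them. The helpers `ns_path`, `attr`, and `collect_rows` and all values in the inline XML are hypothetical; only the element paths follow the answer's code.

```python
import xml.etree.ElementTree as ET

NS = "{urn:hl7-org:v3}"


def ns_path(*tags, prefix=""):
    """Join tag names into an ElementTree path, qualifying each with NS."""
    return prefix + "/".join(NS + tag for tag in tags)


ENTITY_PATH = ns_path(
    "author", "assignedEntity", "representedOrganization",
    "assignedEntity", "assignedOrganization", "assignedEntity",
    prefix=".//",
)
PRODUCT_CODE_PATH = ns_path(
    "actDefinition", "product", "manufacturedProduct",
    "manufacturedMaterialKind", "code",
)


def attr(element, path, name):
    """Return attribute `name` of the first match for `path`, or None."""
    found = element.find(path)
    return None if found is None else found.get(name)


def collect_rows(root):
    """Return (id, name, activity, product code) tuples, one per performance."""
    rows = []
    for entity in root.findall(ENTITY_PATH):
        org_id = attr(entity, ns_path("assignedOrganization", "id"), "extension")
        org_name = entity.findtext(ns_path("assignedOrganization", "name"))
        for performance in entity.findall(NS + "performance"):
            act = attr(performance, ns_path("actDefinition", "code"), "displayName")
            prod = attr(performance, PRODUCT_CODE_PATH, "code")
            rows.append((org_id, org_name, act, prod))
    return rows


# Invented sample: the second performance deliberately has no product.
SAMPLE = """<document xmlns="urn:hl7-org:v3">
 <author><assignedEntity><representedOrganization><assignedEntity>
  <assignedOrganization>
   <assignedEntity>
    <assignedOrganization>
     <id extension="000000001"/>
     <name>Example Org</name>
    </assignedOrganization>
    <performance>
     <actDefinition>
      <code displayName="ANALYSIS"/>
      <product><manufacturedProduct><manufacturedMaterialKind>
       <code code="0000-0001"/>
      </manufacturedMaterialKind></manufacturedProduct></product>
     </actDefinition>
    </performance>
    <performance>
     <actDefinition><code displayName="PACK"/></actDefinition>
    </performance>
   </assignedEntity>
  </assignedOrganization>
 </assignedEntity></representedOrganization></assignedEntity></author>
</document>"""

rows = collect_rows(ET.fromstring(SAMPLE))
for row in rows:
    # Print missing values as empty strings, tab separated.
    print("\t".join("" if value is None else value for value in row))
```

Returning tuples rather than printing makes it straightforward to pass the rows on to the csv module or another consumer.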