I was asked about reading XML in Python, so I looked into it. Some XML tags have the form namespace:tag_name, and I had trouble finding tags that carry a namespace, so I've put together some code that does it.
I had forgotten to include the URL of the sample code, so I have added it. Sample code
In PHP, the XML can be read with simplexml_load_string. In PHP's case the tags are converted to the form namespace_tag_name, so you can search for nodes without worrying about namespaces.
In Python, a prefixed search only works if you store the namespaces in a dictionary (prefix to URI) in advance and pass that dictionary when you call the find function.
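As a minimal sketch of that difference: the tiny XML string and the variable names below are made up for illustration and are not part of the sample file. A prefixed path only resolves when the prefix-to-URI dictionary is passed to find.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document, used only for this illustration.
xml_text = (
    '<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml">'
    '<gml:boundedBy><gml:Envelope srsName="EPSG:4612"/></gml:boundedBy>'
    '</gml:FeatureCollection>'
)
root = ET.fromstring(xml_text)

# An unprefixed name does not match, because the parsed tag is really
# '{http://www.opengis.net/gml}boundedBy', not 'boundedBy'.
without_ns = root.find('boundedBy')
print(without_ns)  # None

# A prefixed name without the dictionary cannot be resolved at all.
try:
    root.find('gml:boundedBy')
except SyntaxError as err:
    print("no namespace dictionary:", err)

# With the prefix-to-URI dictionary, the 'gml:' prefix is resolved.
ns = {'gml': 'http://www.opengis.net/gml'}
with_ns = root.find('gml:boundedBy/gml:Envelope', ns)
print(with_ns.attrib['srsName'])  # EPSG:4612
```

The prefixes in the dictionary are only local aliases; what has to match the document is the URI.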
So, here is how to read XML with namespaces.
First, here is an excerpt of the XML to be read, showing its tag structure.
MESH03622.gml
<?xml version="1.0" encoding="UTF-8"?>
<gml:FeatureCollection xmlns:fme="http://www.safe.com/gml/fme" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:gml="http://www.opengis.net/gml" xsi:schemaLocation="http://www.safe.com/gml/fme MESH03622.xsd">
  <gml:boundedBy>
    <gml:Envelope srsName="EPSG:4612" srsDimension="2">
      <gml:lowerCorner>24.4333333329177 122.924999999934</gml:lowerCorner>
      <gml:upperCorner>24.4833333336014 123.000000000218</gml:upperCorner>
    </gml:Envelope>
  </gml:boundedBy>
  <gml:featureMember>
    <fme:MESH03622 gml:id="id060c8300-20cf-4750-b6a6-6b9730bc8fb3">
      <fme:FID>0</fme:FID>
      <fme:KEN_ID>47</fme:KEN_ID>
      <fme:KEY_CODE>36225734</fme:KEY_CODE>
      <fme:MESH1_ID>3622</fme:MESH1_ID>
      <fme:MESH2_ID>57</fme:MESH2_ID>
      <fme:MESH3_ID>34</fme:MESH3_ID>
      <gml:surfaceProperty>
        <gml:Surface srsName="EPSG:4612" srsDimension="2">
          <gml:patches>
            <gml:PolygonPatch>
              <gml:exterior>
                <gml:LinearRing>
                  <gml:posList>24.4416666664184 122.925000000467 24.4500000000747 122.924999999934 24.4500000002353 122.937499999765 24.4416666669395 122.937499999636 24.4416666664184 122.925000000467</gml:posList>
                </gml:LinearRing>
              </gml:exterior>
            </gml:PolygonPatch>
          </gml:patches>
        </gml:Surface>
      </gml:surfaceProperty>
    </fme:MESH03622>
  </gml:featureMember>
</gml:FeatureCollection>
Python script to read that XML
ReadXMLSample.py
from pprint import pprint
import os
import xml.etree.ElementTree as ET

# Register each namespace declared in the XML as prefix -> URI.
# For example, xmlns:fme="http://www.safe.com/gml/fme" becomes 'fme': 'http://www.safe.com/gml/fme'.
ns = {'gml': 'http://www.opengis.net/gml', 'fme': 'http://www.safe.com/gml/fme', 'xlink': 'http://www.w3.org/1999/xlink'}

# Assumption: MESH03622.gml sits next to this script (the original listing does not show how fileName is set).
fileName = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'MESH03622.gml')
tree = ET.parse(fileName)
root = tree.getroot()

# Get the srsName attribute of gml:Envelope
pprint("envelope = " + root.find('gml:boundedBy/gml:Envelope', ns).attrib['srsName'].strip())

# Get every gml:featureMember directly under the root
fmeNodes = root.findall('gml:featureMember', ns)
records = []
for itemNode in fmeNodes:
    # To get the value of a node further down, write the path starting from the current node.
    # Don't forget to pass the namespace dictionary to find!
    pprint("KEN_ID = " + itemNode.find('fme:MESH03622/fme:KEN_ID', ns).text.strip())
    # The same value can be found with './/', which searches all descendants of the current itemNode
    pprint("KEN_ID = " + itemNode.find('.//fme:KEN_ID', ns).text.strip())
    # To get the posList, write out the path in the same way.
    pprint("posList = " + itemNode.find('fme:MESH03622/gml:surfaceProperty/gml:Surface/gml:patches/gml:PolygonPatch/gml:exterior/gml:LinearRing/gml:posList', ns).text.strip())
    # Stop after the first record so that the output does not scroll too much
    break
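The script above writes the namespace dictionary by hand. As a hedged alternative, the declarations can also be collected from the file itself with the 'start-ns' event of ET.iterparse. The helper name collect_namespaces and the shortened in-memory sample are my own assumptions for illustration, not part of the original sample code.

```python
import io
import xml.etree.ElementTree as ET

def collect_namespaces(source):
    """Return {prefix: uri} for every xmlns declaration found in source.

    Hypothetical helper: if a prefix is declared more than once,
    the last declaration wins.
    """
    namespaces = {}
    # For 'start-ns' events, iterparse yields (event, (prefix, uri)).
    for _event, (prefix, uri) in ET.iterparse(source, events=['start-ns']):
        namespaces[prefix] = uri
    return namespaces

# Shortened stand-in for MESH03622.gml, kept in memory for this sketch.
sample = io.BytesIO(b'''<?xml version="1.0" encoding="UTF-8"?>
<gml:FeatureCollection xmlns:fme="http://www.safe.com/gml/fme" xmlns:gml="http://www.opengis.net/gml">
<gml:featureMember><fme:MESH03622><fme:KEN_ID>47</fme:KEN_ID></fme:MESH03622></gml:featureMember>
</gml:FeatureCollection>''')

ns = collect_namespaces(sample)
print(ns)

# Rewind and parse normally, then search with the collected dictionary.
sample.seek(0)
root = ET.parse(sample).getroot()
print(root.find('.//fme:KEN_ID', ns).text)  # 47
```

This reads the file twice, which is fine for small files. For a real file, a path such as 'MESH03622.gml' can be passed instead of the BytesIO object.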
There are several ways to write the search, but the way to think about it seems to be this: starting from the element you have already read, specify the target either with the full path of child tags or with a relative search such as .//.
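To make those options concrete, here is a small sketch that reaches the same KEN_ID node in four ways. The XML string is a trimmed-down stand-in for the sample file and the variable names are my own. The '{uri}tag' and '{*}tag' forms do not appear in the script above; they are further options that ElementTree accepts.

```python
import xml.etree.ElementTree as ET

# Hypothetical trimmed-down version of the sample, for illustration only.
xml_text = (
    '<gml:FeatureCollection xmlns:fme="http://www.safe.com/gml/fme" '
    'xmlns:gml="http://www.opengis.net/gml">'
    '<gml:featureMember><fme:MESH03622>'
    '<fme:KEN_ID>47</fme:KEN_ID>'
    '</fme:MESH03622></gml:featureMember>'
    '</gml:FeatureCollection>'
)
root = ET.fromstring(xml_text)
ns = {'gml': 'http://www.opengis.net/gml', 'fme': 'http://www.safe.com/gml/fme'}
item = root.find('gml:featureMember', ns)

# 1. Step-by-step path of child tags from the current element.
a = item.find('fme:MESH03622/fme:KEN_ID', ns).text
# 2. './/' searches all descendants of the current element.
b = item.find('.//fme:KEN_ID', ns).text
# 3. '{uri}tag' notation spells out the namespace, so no dictionary is needed.
c = item.find('.//{http://www.safe.com/gml/fme}KEN_ID').text
# 4. '{*}tag' matches the tag in any namespace (Python 3.8 or later).
d = item.find('.//{*}KEN_ID').text

print(a, b, c, d)  # 47 47 47 47
```

The prefixed forms (1 and 2) are the most readable when the namespaces are known in advance. The '{*}' form is handy for quick exploration, but it can match tags from a namespace you did not intend.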