Reading namespaced XML in Python

A question about reading XML in Python came up, so I looked into it. Some XML tags have the form namespace:tag_name, and I had trouble finding tags that carry a namespace, so I put together some code that does it.

I had forgotten to include the URL of the sample code, so I have added it: Sample code

In PHP, the file can be read with simplexml_load_string. In PHP's case the tags are converted to the form namespace_tag_name, so you can search for nodes without worrying about the namespace.

In Python, a search by prefix only works if you store the namespaces in a dictionary (prefix to URI) in advance and pass that dictionary when you call find.
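As a minimal sketch of that difference, the snippet below parses a shortened stand-in for the GML file (the inline string is an assumption made for illustration, not the real data) and compares a bare tag name, a prefixed path with a namespace dictionary, and the `{uri}tag` form that ElementTree uses internally.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the real file (illustrative only).
xml_text = """<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml" xmlns:fme="http://www.safe.com/gml/fme">
  <gml:featureMember>
    <fme:MESH03622>
      <fme:KEN_ID>47</fme:KEN_ID>
    </fme:MESH03622>
  </gml:featureMember>
</gml:FeatureCollection>"""

root = ET.fromstring(xml_text)

# Prefix -> URI mapping, mirroring the xmlns declarations on the root element.
ns = {"gml": "http://www.opengis.net/gml", "fme": "http://www.safe.com/gml/fme"}

# A bare tag name finds nothing: the parsed tag is "{uri}featureMember".
without_ns = root.find("featureMember")

# A prefixed path works once the mapping is passed to find().
with_ns = root.find("gml:featureMember", ns)

# The same element can be reached without a mapping by spelling out the URI.
with_uri = root.find("{http://www.opengis.net/gml}featureMember")

print(root.tag)
print(without_ns, with_ns is with_uri)
```

The prefixes in the dictionary are only local labels for the search; what has to match the document is the URI.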

So, here is how to read XML that uses namespaces.

First, here is an excerpt of the XML to be read, showing its tag structure.

MESH03622.gml


<?xml version="1.0" encoding="UTF-8"?>
<gml:FeatureCollection xmlns:fme="http://www.safe.com/gml/fme" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:gml="http://www.opengis.net/gml" xsi:schemaLocation="http://www.safe.com/gml/fme MESH03622.xsd">
<gml:boundedBy>
<gml:Envelope srsName="EPSG:4612" srsDimension="2">
<gml:lowerCorner>24.4333333329177 122.924999999934</gml:lowerCorner>
<gml:upperCorner>24.4833333336014 123.000000000218</gml:upperCorner>
</gml:Envelope>
</gml:boundedBy>
<gml:featureMember>
<fme:MESH03622 gml:id="id060c8300-20cf-4750-b6a6-6b9730bc8fb3">
<fme:FID>0</fme:FID>
<fme:KEN_ID>47</fme:KEN_ID>
<fme:KEY_CODE>36225734</fme:KEY_CODE>
<fme:MESH1_ID>3622</fme:MESH1_ID>
<fme:MESH2_ID>57</fme:MESH2_ID>
<fme:MESH3_ID>34</fme:MESH3_ID>
<gml:surfaceProperty>
<gml:Surface srsName="EPSG:4612" srsDimension="2">
<gml:patches>
<gml:PolygonPatch>
<gml:exterior>
<gml:LinearRing>
<gml:posList>24.4416666664184 122.925000000467 24.4500000000747 122.924999999934 24.4500000002353 122.937499999765 24.4416666669395 122.937499999636 24.4416666664184 122.925000000467</gml:posList>
</gml:LinearRing>
</gml:exterior>
</gml:PolygonPatch>
</gml:patches>
</gml:Surface>
</gml:surfaceProperty>
</fme:MESH03622>
</gml:featureMember>
<!-- excerpt ends here; the rest of the file is omitted -->
</gml:FeatureCollection>
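The xmlns declarations on the root element above are what the namespace dictionary has to mirror. Rather than typing them by hand, they can probably be collected from the document itself. The sketch below uses the `start-ns` event of `ET.iterparse`; the helper name `collect_namespaces` and the shortened sample bytes are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Shortened stand-in for MESH03622.gml: only the root element with its declarations.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<gml:FeatureCollection xmlns:fme="http://www.safe.com/gml/fme" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:gml="http://www.opengis.net/gml">
</gml:FeatureCollection>"""


def collect_namespaces(source):
    """Return a {prefix: uri} dict built from the xmlns declarations in source.

    source may be a file name or a binary file object.
    """
    namespaces = {}
    # Each "start-ns" event carries a (prefix, uri) tuple.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        namespaces[prefix] = uri
    return namespaces


ns = collect_namespaces(io.BytesIO(SAMPLE))
print(ns)
```

Note that this reads through the whole document once, so for a large file a hand-written dictionary may still be the simpler choice.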

Python script to read that XML

ReadXMLSample.py


import os
import xml.etree.ElementTree as ET

# Assumption: MESH03622.gml sits in the same directory as this script
fileName = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'MESH03622.gml')

# Register each namespace declared in the XML as prefix -> URI
# e.g. xmlns:fme="http://www.safe.com/gml/fme" becomes 'fme': 'http://www.safe.com/gml/fme'
ns = {'gml': 'http://www.opengis.net/gml', 'fme': 'http://www.safe.com/gml/fme', 'xlink': 'http://www.w3.org/1999/xlink'}
tree = ET.parse(fileName)
root = tree.getroot()

# Get the srsName attribute of gml:Envelope
print("envelope = " + root.find('gml:boundedBy/gml:Envelope', ns).attrib['srsName'].strip())

# Get the gml:featureMember elements directly under the root
fmeNodes = root.findall('gml:featureMember', ns)

for itemNode in fmeNodes:
    # To get the value of a node below itemNode, write out the path from itemNode
    # Don't forget to pass the namespace dictionary when you call find!
    print("KEN_ID = " + itemNode.find('fme:MESH03622/fme:KEN_ID', ns).text.strip())
    # The same node can also be found with a relative descendant search from itemNode
    print("KEN_ID = " + itemNode.find('.//fme:KEN_ID', ns).text.strip())
    # The posList is obtained the same way, by writing out the path
    print("posList = " + itemNode.find('fme:MESH03622/gml:surfaceProperty/gml:Surface/gml:patches/gml:PolygonPatch/gml:exterior/gml:LinearRing/gml:posList', ns).text.strip())
    # Stop after the first feature to keep the output short
    break
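The script reads srsName, an attribute without a prefix. A prefixed attribute such as gml:id on fme:MESH03622 behaves differently: the namespace dictionary is only used for paths, so the attribute key has to be written in `{uri}name` form. A small sketch, using a cut-down copy of one featureMember as an assumed input:

```python
import xml.etree.ElementTree as ET

GML_URI = "http://www.opengis.net/gml"
ns = {"gml": GML_URI, "fme": "http://www.safe.com/gml/fme"}

# Cut-down copy of one featureMember (illustrative only).
xml_text = """<gml:featureMember xmlns:gml="http://www.opengis.net/gml" xmlns:fme="http://www.safe.com/gml/fme">
<fme:MESH03622 gml:id="id060c8300-20cf-4750-b6a6-6b9730bc8fb3">
<fme:KEN_ID>47</fme:KEN_ID>
</fme:MESH03622>
</gml:featureMember>"""

member = ET.fromstring(xml_text)
mesh = member.find("fme:MESH03622", ns)

# The prefixed name is not a key in attrib, so this returns None.
by_prefix = mesh.get("gml:id")

# The key is stored with the namespace URI in braces.
by_uri = mesh.get("{%s}id" % GML_URI)

print(by_prefix)
print(by_uri)
```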

There are several ways to read the values, but the basic idea seems to be that you specify either a relative path or a full path, starting from the element you have already read.
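To make the two path styles concrete, here is a hedged sketch that checks that both reach the same element and then loops over every featureMember instead of stopping after the first. The `read_records` helper, the shortened sample string, and the splitting of posList into number pairs are assumptions added for illustration; they are not part of the original script.

```python
import xml.etree.ElementTree as ET

NS = {"gml": "http://www.opengis.net/gml", "fme": "http://www.safe.com/gml/fme"}

# Shortened stand-in for MESH03622.gml with a two-point posList (illustrative only).
SAMPLE = """<gml:FeatureCollection xmlns:fme="http://www.safe.com/gml/fme" xmlns:gml="http://www.opengis.net/gml">
<gml:featureMember>
<fme:MESH03622>
<fme:KEN_ID>47</fme:KEN_ID>
<gml:surfaceProperty><gml:Surface><gml:patches><gml:PolygonPatch><gml:exterior><gml:LinearRing>
<gml:posList>24.4416666664184 122.925000000467 24.4500000000747 122.924999999934</gml:posList>
</gml:LinearRing></gml:exterior></gml:PolygonPatch></gml:patches></gml:Surface></gml:surfaceProperty>
</fme:MESH03622>
</gml:featureMember>
</gml:FeatureCollection>"""

root = ET.fromstring(SAMPLE)
member = root.find("gml:featureMember", NS)

# Full path from the current element versus a descendant search from it.
full = member.find("fme:MESH03622/fme:KEN_ID", NS)
relative = member.find(".//fme:KEN_ID", NS)


def read_records(root):
    """Collect KEN_ID and posList (as pairs of floats) for every featureMember."""
    records = []
    for item in root.findall("gml:featureMember", NS):
        ken_id = item.find(".//fme:KEN_ID", NS).text.strip()
        values = [float(v) for v in item.find(".//gml:posList", NS).text.split()]
        # Pair the flat list up two values at a time, in the order they appear.
        pairs = list(zip(values[0::2], values[1::2]))
        records.append({"KEN_ID": ken_id, "posList": pairs})
    return records


records = read_records(root)
print(full is relative)
print(records)
```

The descendant search is shorter to write, while the full path makes the expected structure explicit and will not pick up a same-named tag nested somewhere else.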
