Wednesday, November 2, 2011

XML parsing in Python/Jython/WLST: lxml, libxml2 and minidom

You have the option of using libxml2 (http://users.skynet.be/sbi/libxml-python/),

but I have read better things about lxml (http://lxml.de/). To use lxml you must first install libxml2 from http://users.skynet.be/sbi/libxml-python/.


To determine which version of Python you are running in WLST, run:

import sys
sys.version_info


For instance, with WLS 10.3.5 I get:

(2, 2, 1, 'final', 0)

So the most recent version of libxml2 available for Python 2.2.1 is 2.6.9:

http://users.skynet.be/sbi/libxml-python/libxml2-python-2.6.9.win32-py2.2.exe

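Anyway, lxml requires Python 2.4 as a minimum, so with 2.2.1 you are out of luck. In any case WLST runs on Jython, not CPython, and Jython cannot load native C extensions such as libxml2 or lxml, so again you are out of luck unless you install a separate Python.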
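The two checks above (version at least 2.4, and not Jython) can be sketched as a small helper. This is only an illustration: the function name can_use_lxml is made up, the 2.4 threshold is simply the minimum quoted above, and it assumes that Jython reports a sys.platform value starting with 'java'.

```python
import sys

def can_use_lxml(version_info, platform):
    # Hypothetical helper, not part of WLST or lxml.
    # Assumption: on Jython, sys.platform starts with 'java'.
    is_jython = platform.startswith('java')
    # Compare only (major, minor) against the 2.4 minimum quoted in the post.
    new_enough = tuple(version_info[:2]) >= (2, 4)
    return (not is_jython) and new_enough

# Check the interpreter this script is actually running on.
print(can_use_lxml(sys.version_info, sys.platform))
```

With the WLS 10.3.5 values shown earlier, (2, 2, 1, 'final', 0) on a Jython platform, both conditions fail, which is why minidom is the fallback.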


Alternatively, use minidom, which is supported in WLST:

from xml.dom.minidom import parse
dom1 = parse('ALSBCustomizationFile.xml')

dom1 is an xml.dom.minidom.Document; its API is described in the Python xml.dom.minidom documentation.

Then you can do:

envValueAssignments=dom1.getElementsByTagName('cus:envValueAssignments')

and you get a NodeList (see http://docs.python.org/library/xml.dom.html).

If you do:


print envValueAssignments[0].__class__

it prints
xml.dom.minidom.Element
which is documented in the same Python xml.dom reference.


You can also print the element serialized back to XML:

print envValueAssignments[0].toxml()
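To go one step further than toxml(), here is a minimal sketch of pulling values out of each element in the NodeList. The XML below is a made-up sample, not the real ALSBCustomizationFile.xml layout: the namespace URI and the cus:name / cus:value children are assumptions for illustration, and child_text is a hypothetical helper. It uses parseString so that it runs without any file on disk.

```python
from xml.dom.minidom import parseString

# Made-up sample document; the real customization file will look different.
SAMPLE = """<cus:Customizations xmlns:cus="urn:example:cus">
  <cus:envValueAssignments>
    <cus:name>endpoint</cus:name>
    <cus:value>http://localhost:7001/service</cus:value>
  </cus:envValueAssignments>
  <cus:envValueAssignments>
    <cus:name>timeout</cus:name>
    <cus:value>30</cus:value>
  </cus:envValueAssignments>
</cus:Customizations>"""

def child_text(element, tag_name):
    # Hypothetical helper: text of the first descendant named tag_name, or None.
    matches = element.getElementsByTagName(tag_name)
    if len(matches) == 0:
        return None
    parts = []
    for node in matches[0].childNodes:
        # Keep only text nodes; nested elements are ignored.
        if node.nodeType == node.TEXT_NODE:
            parts.append(node.data)
    return ''.join(parts)

dom = parseString(SAMPLE)
assignments = {}
for element in dom.getElementsByTagName('cus:envValueAssignments'):
    assignments[child_text(element, 'cus:name')] = child_text(element, 'cus:value')

print(assignments)
```

Note that getElementsByTagName matches on the qualified name as written in the file, prefix included, so the 'cus:' prefix has to match whatever the document actually uses.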


Honestly, parsing is very painful with a DOM-based approach.
I would rather use Groovy, which has a much more intuitive approach to XML parsing.
