Wednesday, November 2, 2011

Groovy: parsing XML with namespaces

99% of the examples on the Internet show how to parse XML without namespaces.

Unfortunately, in real life 99% of XML DOES have namespaces :o(

Here is an example. The source XML, saved as ALSBCustomizationFile.xml, is:

<?xml version="1.0" encoding="UTF-8"?>
<cus:Customizations xmlns:cus="http://www.bea.com/wli/config/customizations" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xt="http://www.bea.com/wli/config/xmltypes">
  <cus:customization xsi:type="cus:EnvValueCustomizationType">
    <cus:description/>
    <cus:envValueAssignments>
      <xt:envValueType>UDDI Auto Publish</xt:envValueType>
      <xt:location xsi:nil="true"/>
      <xt:owner>
        <xt:type>ProxyService</xt:type>
        <xt:path>OSBProject1/ProxyService1</xt:path>
      </xt:owner>
      <xt:value xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema">false</xt:value>
    </cus:envValueAssignments>
    <cus:envValueAssignments>
      <xt:envValueType>Service URI</xt:envValueType>
      <xt:location xsi:nil="true"/>
      <xt:owner>
        <xt:type>ProxyService</xt:type>
        <xt:path>OSBProject1/ProxyService1</xt:path>
      </xt:owner>
      <xt:value xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema">/OSBProject1/ProxyServicePippo</xt:value>
    </cus:envValueAssignments>
  </cus:customization>
  <cus:customization xsi:type="cus:FindAndReplaceCustomizationType">
    <cus:description/>
    <cus:query>
      <xt:resourceTypes>ProxyService</xt:resourceTypes>
      <xt:envValueTypes>UDDI Auto Publish</xt:envValueTypes>
      <xt:envValueTypes>Service URI</xt:envValueTypes>
      <xt:refsToSearch xsi:type="xt:ResourceRefType">
        <xt:type>ProxyService</xt:type>
        <xt:path>OSBProject1/ProxyService1</xt:path>
      </xt:refsToSearch>
      <xt:includeOnlyModifiedResources>false</xt:includeOnlyModifiedResources>
      <xt:searchString>Search String</xt:searchString>
      <xt:isCompleteMatch>false</xt:isCompleteMatch>
    </cus:query>
    <cus:replacement>Replacement String</cus:replacement>
  </cus:customization>
  <cus:customization xsi:type="cus:ReferenceCustomizationType">
    <cus:description/>
  </cus:customization>
</cus:Customizations>


The Groovy XmlParser way is:

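// On Groovy 4+, add "import groovy.xml.XmlParser" (the class moved out of groovy.util)
def customizations = new XmlParser().parse("ALSBCustomizationFile.xml")
// One Namespace object per namespace URI; ns.name then builds a QName
def cus = new groovy.xml.Namespace("http://www.bea.com/wli/config/customizations")
def xt = new groovy.xml.Namespace("http://www.bea.com/wli/config/xmltypes")
def xsi = new groovy.xml.Namespace("http://www.w3.org/2001/XMLSchema-instance")

customizations[cus.customization].each {
    // Namespaced attributes are keyed by QName as well, hence xsi.type
    if (it.attributes()[xsi.type] == 'cus:EnvValueCustomizationType') {
        println "FOUND!"
    }

    // print shows each Node's toString(); value.text() would give just the content
    def values = it[cus.envValueAssignments][xt.envValueType]
    for (value in values) {
        print value
    }
}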


Result:

FOUND!
{http://www.bea.com/wli/config/xmltypes}envValueType[attributes={}; value=[UDDI Auto Publish]]{http://www.bea.com/wli/config/xmltypes}envValueType[attributes={}; value=[Service URI]]
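
The output above is the toString() of each Node, because print is handed the Node itself. Here is a minimal sketch of pulling out just the text with Node.text(). It assumes a trimmed-down copy of the file inlined as a string (the xml and envValueTypes names are only for illustration) and Groovy 3 or later for the groovy.xml.XmlParser import:

```groovy
// Groovy 3+; on Groovy 2.x drop this import (XmlParser is in groovy.util, a default import)
import groovy.xml.XmlParser
import groovy.xml.Namespace

// Assumption: a shortened version of the customization file from the post
def xml = '''<?xml version="1.0" encoding="UTF-8"?>
<cus:Customizations xmlns:cus="http://www.bea.com/wli/config/customizations" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xt="http://www.bea.com/wli/config/xmltypes">
  <cus:customization xsi:type="cus:EnvValueCustomizationType">
    <cus:envValueAssignments>
      <xt:envValueType>UDDI Auto Publish</xt:envValueType>
    </cus:envValueAssignments>
    <cus:envValueAssignments>
      <xt:envValueType>Service URI</xt:envValueType>
    </cus:envValueAssignments>
  </cus:customization>
</cus:Customizations>'''

def root = new XmlParser().parseText(xml)
def cus = new Namespace("http://www.bea.com/wli/config/customizations")
def xt = new Namespace("http://www.bea.com/wli/config/xmltypes")

// Walk down with QNames, then take the text of every matching node
def envValueTypes = root[cus.customization][cus.envValueAssignments][xt.envValueType]*.text()
println envValueTypes   // prints [UDDI Auto Publish, Service URI]
```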


The Groovy XmlSlurper way is:

def customizations = new XmlSlurper().parse("ALSBCustomizationFile.xml").declareNamespace(
    xt: 'http://www.bea.com/wli/config/xmltypes',
    xsi: 'http://www.w3.org/2001/XMLSchema-instance',
    cus: 'http://www.bea.com/wli/config/customizations')

// Prints the concatenated text content of the whole document
println customizations

customizations.'cus:customization'.each {
    println "ONE"
    // Namespaced attributes use the same quoted prefix:name form, with a leading @
    if (it.'@xsi:type' == "cus:EnvValueCustomizationType") {
        println "FOUND"
    }
}
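
The XmlSlurper snippet above only checks the xsi:type attribute. As a rough sketch, the nested xt:envValueType values can be read with the same quoted prefix:name syntax. This assumes a shortened inline copy of the file and Groovy 3 or later for the import; the xml and envValueTypes names are made up for the example:

```groovy
// Groovy 3+; on Groovy 2.x drop this import (XmlSlurper is in groovy.util, a default import)
import groovy.xml.XmlSlurper

// Assumption: a shortened version of the customization file from the post
def xml = '''<?xml version="1.0" encoding="UTF-8"?>
<cus:Customizations xmlns:cus="http://www.bea.com/wli/config/customizations" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xt="http://www.bea.com/wli/config/xmltypes">
  <cus:customization xsi:type="cus:EnvValueCustomizationType">
    <cus:envValueAssignments>
      <xt:envValueType>UDDI Auto Publish</xt:envValueType>
    </cus:envValueAssignments>
    <cus:envValueAssignments>
      <xt:envValueType>Service URI</xt:envValueType>
    </cus:envValueAssignments>
  </cus:customization>
</cus:Customizations>'''

// Map each prefix used in the path expressions to its namespace URI
def root = new XmlSlurper().parseText(xml).declareNamespace(
    cus: 'http://www.bea.com/wli/config/customizations',
    xt: 'http://www.bea.com/wli/config/xmltypes')

// Each quoted 'prefix:name' step selects the children with that qualified name
def envValueTypes = root.'cus:customization'.'cus:envValueAssignments'.'xt:envValueType'.collect { it.text() }
println envValueTypes   // prints [UDDI Auto Publish, Service URI]
```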


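The very annoying difference between XmlParser and XmlSlurper is that in the first you use ns.part (a QName built from a groovy.xml.Namespace object, as in it[cus.customization]) and in the other the quoted string 'ns:part' (as in it.'cus:customization', after calling declareNamespace).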

2 comments:

Luciano said...

You can also initialize XmlSlurper to ignore Namespaces: http://groovy.codehaus.org/api/groovy/util/XmlSlurper.html#XmlSlurper(boolean, boolean)

new XmlSlurper(false, false)

Same goes for XmlParser:
http://groovy.codehaus.org/api/groovy/util/XmlParser.html#XmlParser(boolean, boolean)

new XmlParser(false, false)

In this way you can reference a node without having to declare the namespaces.
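
A small sketch of what that suggestion could look like, assuming Groovy 3 or later for the imports and a cut-down inline document (the xml, parserTypes and slurperCount names are hypothetical). With namespace awareness switched off, the prefix simply becomes part of the element or attribute name, so the lookups depend on the literal prefix used in the file rather than on the namespace URI:

```groovy
// Groovy 3+; on Groovy 2.x drop these imports (both classes are in groovy.util)
import groovy.xml.XmlParser
import groovy.xml.XmlSlurper

// Assumption: a cut-down version of the customization file from the post
def xml = '''<cus:Customizations xmlns:cus="http://www.bea.com/wli/config/customizations" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <cus:customization xsi:type="cus:EnvValueCustomizationType"/>
  <cus:customization xsi:type="cus:ReferenceCustomizationType"/>
</cus:Customizations>'''

// Constructor arguments are (validating, namespaceAware)
def parsed = new XmlParser(false, false).parseText(xml)

// Element and attribute names are now plain strings such as "cus:customization"
def parserTypes = parsed.'cus:customization'.collect { it.'@xsi:type' }
println parserTypes

// Same idea with XmlSlurper, and no declareNamespace call
def slurped = new XmlSlurper(false, false).parseText(xml)
def slurperCount = slurped.'cus:customization'.size()
println slurperCount
```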

Unknown said...

Luciano: good point. Can you please add a working example of your idea when adding a node with namespace in it is involved? thanks.