I have to parse an XML document that looks like this:

 <?xml version="1.0" encoding="UTF-8" ?> 
 <m:OASISReport xmlns:m="http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd" 
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd">
  <m:MessagePayload>
   <m:RTO>
    <m:name>CAISO</m:name> 
    <m:REPORT_ITEM>
     <m:REPORT_HEADER>
      <m:SYSTEM>OASIS</m:SYSTEM> 
      <m:TZ>PPT</m:TZ> 
      <m:REPORT>AS_RESULTS</m:REPORT> 
      <m:MKT_TYPE>HASP</m:MKT_TYPE> 
      <m:UOM>MW</m:UOM> 
      <m:INTERVAL>ENDING</m:INTERVAL> 
      <m:SEC_PER_INTERVAL>3600</m:SEC_PER_INTERVAL> 
     </m:REPORT_HEADER>
     <m:REPORT_DATA>
      <m:DATA_ITEM>NS_PROC_MW</m:DATA_ITEM> 
      <m:RESOURCE_NAME>AS_SP26_EXP</m:RESOURCE_NAME> 
      <m:OPR_DATE>2010-11-17</m:OPR_DATE> 
      <m:INTERVAL_NUM>1</m:INTERVAL_NUM> 
      <m:VALUE>0</m:VALUE> 
     </m:REPORT_DATA>

The problem is that the namespace URI "http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd" is not always the same. I want to ignore the namespace completely and just get my data from the MessagePayload element downward.

The code I am using so far is:

String[] namespaces = new String[1];
  String[] namespaceAliases = new String[1];

  namespaceAliases[0] = "ns0";
  namespaces[0] = "http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd";

  File inputFile = new File(inputFileName);

  Map namespaceURIs = new HashMap();

  // This query will return all of the ASR records.
  String xPathExpression = "/ns0:OASISReport"
                         + "/ns0:MessagePayload"
                         + "/ns0:RTO"
                         + "/ns0:REPORT_ITEM"
                         + "/ns0:REPORT_DATA";
  xPathExpression += "|/ns0:OASISReport"
                   + "/ns0:MessagePayload"
                   + "/ns0:RTO"
                   + "/ns0:REPORT_ITEM"
                   + "/ns0:REPORT_HEADER";

  // Load the raw XML file. Stripping whitespace-only text and merging
  // adjacent text nodes reduces the size of the DOM tree.
  SAXReader reader = new SAXReader();
  reader.setStripWhitespaceText(true);
  reader.setMergeAdjacentText(true);
  Document inputDocument = reader.read(inputFile);

  // Relate the aliases with the namespaces
  if (namespaceAliases != null && namespaces != null)
  {
   for (int i = 0; i < namespaceAliases.length; i++)
   {
    namespaceURIs.put(namespaceAliases[i], namespaces[i]);
   }
  }

  // Cache the expression using the supplied namespaces.
  XPath xPath = DocumentHelper.createXPath(xPathExpression);
  xPath.setNamespaceURIs(namespaceURIs);

  List asResultsNodes = xPath.selectNodes(inputDocument.getRootElement());

It works fine if the namespace never changes but that is obviously not the case. What do I need to do to make it ignore the namespace? Or if I know the set of all possible namespace values, how can I pass them all to the XPath instance?

@user452103: XPath is XML Names compliant, so it will never ignore namespaces. You can use an expression that selects nodes regardless of namespace. If the namespace URI changes that often, then it is the wrong URI. A namespace URI is supposed to indicate that an element belongs to a specific XML vocabulary. – user357812 Dec 9 '10 at 19:49
    
@user452103: Keep this formatting, it's more clear. – user357812 Dec 9 '10 at 19:54
@Alejandro: thanks for the formatting, it does look better now. What expression can I use to select nodes regardless of namespace? – lukegf Dec 9 '10 at 20:15
    
Good question, +1. See my answer for a single XPath 1.0 expression that selects exactly the wanted nodes. :) – Dimitre Novatchev Dec 9 '10 at 20:32
up vote 30 down vote accepted

Use:

/*/*/*/*/*
        [local-name()='REPORT_DATA' 
       or 
         local-name()='REPORT_HEADER'
        ]
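
For anyone who wants to try the expression without dom4j on the classpath, here is a minimal sketch that uses the JDK's built-in javax.xml.xpath API in place of the dom4j calls from the question. The class name, the trimmed-down sample document and its urn:example namespace URI are made up for illustration. The parser is set to be namespace-aware so that local-name() sees the part of the name after the prefix.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocalNameDemo {
    // Hypothetical, trimmed-down version of the report; the namespace URI is arbitrary on purpose.
    static final String SAMPLE =
        "<m:OASISReport xmlns:m=\"urn:example:any-namespace\">"
      + "<m:MessagePayload><m:RTO><m:name>CAISO</m:name><m:REPORT_ITEM>"
      + "<m:REPORT_HEADER><m:SYSTEM>OASIS</m:SYSTEM></m:REPORT_HEADER>"
      + "<m:REPORT_DATA><m:VALUE>0</m:VALUE></m:REPORT_DATA>"
      + "</m:REPORT_ITEM></m:RTO></m:MessagePayload></m:OASISReport>";

    // The expression from this answer: five element steps, filtered by local name only.
    static final String EXPR =
        "/*/*/*/*/*[local-name()='REPORT_DATA' or local-name()='REPORT_HEADER']";

    // Returns the local names of the selected elements.
    static List<String> selectLocalNames(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for namespace-correct local names
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(EXPR, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getLocalName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectLocalNames(SAMPLE));
    }
}
```

No namespace map is registered anywhere, so the same expression should keep working when the namespace URI in the document changes.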
do you mean to use that as the value of xPathExpression in the code above? – lukegf Dec 9 '10 at 20:42
    
@user452103: Yes, exactly. This is the XPath expression to use. – Dimitre Novatchev Dec 9 '10 at 21:59
    
so, just to clarify, should it be like this now: String xPathExpression = "/*/*/*/*/*[local-name()='REPORT_DATA' or local-name()='REPORT_HEADER']"; – lukegf Dec 10 '10 at 14:55
@user452103: Yes. Why don't you just try it? This expression selects the two wanted nodes in the provided XML document. – Dimitre Novatchev Dec 10 '10 at 15:33
@ClaraOnager, this selects any element on the 4th level below the top element whose local-name() is either 'REPORT_DATA' or 'REPORT_HEADER'. – Dimitre Novatchev Jul 3 '13 at 14:24

This is a FAQ (but I'm too lazy to search for duplicates today).

In XPath 1.0

//*[local-name()='name']

Selects any element with "name" as local-name.

In XPath 2.0 you can use:

//*:name
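
The JDK's built-in XPath engine only implements XPath 1.0, so the local-name() form is the one to use there. If the set of possible namespace URIs is known in advance (the second part of the question), one option is to keep the local-name() test and add a namespace-uri() check to the predicate. A rough sketch, using javax.xml.xpath rather than the dom4j API from the question; the class name, the helper names and the urn:example URIs are hypothetical, and the helper assumes a non-empty list of URIs that contain no apostrophes.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class KnownNamespacesDemo {
    // Builds an XPath 1.0 expression that matches the two element names,
    // but only in one of the given namespaces.
    // Assumption: uris is non-empty and no URI contains an apostrophe.
    static String buildExpression(List<String> uris) {
        StringBuilder nsTest = new StringBuilder();
        for (String uri : uris) {
            if (nsTest.length() > 0) {
                nsTest.append(" or ");
            }
            nsTest.append("namespace-uri()='").append(uri).append("'");
        }
        return "//*[(local-name()='REPORT_DATA' or local-name()='REPORT_HEADER')"
             + " and (" + nsTest + ")]";
    }

    // Counts the elements selected by the generated expression.
    static int countMatches(String xml, List<String> uris) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(buildExpression(uris), doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical miniature report bound to the given namespace URI.
    static String sample(String uri) {
        return "<r:OASISReport xmlns:r=\"" + uri + "\"><r:REPORT_ITEM>"
             + "<r:REPORT_HEADER/><r:REPORT_DATA/></r:REPORT_ITEM></r:OASISReport>";
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("urn:example:v1", "urn:example:v2");
        System.out.println(countMatches(sample("urn:example:v2"), known));
        System.out.println(countMatches(sample("urn:example:v3"), known));
    }
}
```

Compared with a bare local-name() test, this rejects documents whose elements happen to share the names but live in an unrelated namespace.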
A FAQ on the one hand, a hell of an overengineering on the other, or so one would say. – n611x007 Sep 21 '15 at 11:59
