XPath Replacement Task

This section relates to the XpathReplaceTask as used in the orchestration plugin.

Definitions

XPath, the XML Path Language, is a query language for selecting nodes from an XML document. In addition, XPath may be used to compute values (e.g., strings, numbers, or Boolean values) from the content of an XML document. XPath was defined by the World Wide Web Consortium (W3C).

XPATH Tutorial

Usage scenario

Use XPATH when a standard search/replace algorithm is unsuitable for your requirements. For example, you may want to replace a value in XML where the tag occurs many times, but replace only one of the values rather than all of them (as would happen with a global search/replace). See the examples below.

Implementation

  1. Add an XpathReplaceTask to your orchestration
  2. Specify the directoryToSearch
  3. In the Entries section at the bottom, add one XPATH line for each search string.

Example XPATH entries in the task

Example 1

Here is an example of XPATH entries. It is in the format of a standard XPATH expression:

Type    Key                                                                                                                                     Value

XPATH   //section[normalize-space(text())='ERROR_LOGGER']/tag[normalize-space(text())='ENABLED']/value/text()           20

You may have one or more entry lines, each representing an xpath expression.

Consider the following XML file. The above expression will replace the value "1" in the ERROR_LOGGER section in the ENABLED tag with the value "20". All other values will be left unchanged.

<?xml version="1.0"?>
<root>
<section>STATE_LOGGER                          
                <tag>
                        APPENDER_TYPE
                        <value>2</value>                
                </tag>
                <tag>
                        APPENDER_ID
                        <value>SL_2</value>            
                </tag>

                <tag>
                        ENABLED
                        <value>1</value>                
                </tag>
        </section>
        <section>ERROR_LOGGER          
                <tag>
                        APPENDER_TYPE
                        <value>2</value>                
                </tag>
                <tag>
                        APPENDER_ID
                        <value>ER_2</value>            
                </tag>
                <tag>
                        ENABLED
                        <value>1</value>        <!-- This will become <value>20</value> -->    
                </tag>
        </section>
        <section>ER_2
                <tag>
                        LAYOUT_PATTERN            
                        <value>%d{yyyyMMdd.HHMMss.SSS} %c %D %N %o %A %x %r %U %I %i %R %p %V %s %E %a %Y %b %v %C %P %n</value>        
                </tag>
                <tag>
                        FILE_NAME                                      
                        <value>logs/mvprojectLogs/IBErrorLog.log</value>                      
                </tag>
                <tag>
                        FILE_ROLLOVER_SIZE                                      
                        <value>10MB</value>                  
                </tag>
                <tag>
                        FILE_ROLLOVER_COUNT                                    
                        <value>5</value>                      
                </tag>
        </section>      
</root>
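
The effect of the Example 1 expression can be reproduced outside the task with the standard Java XPath API. The following is a minimal sketch, not the task's actual implementation: the class name XPathReplaceSketch and the method replace are hypothetical, and the XML is a shortened version of the file above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of the orchestration plugin.
class XPathReplaceSketch {

    /** Sets every node matched by the expression to newValue and returns the serialized XML. */
    public static String replace(String xml, String expression, String newValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList matches = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            for (int i = 0; i < matches.getLength(); i++) {
                // The expression ends in text(), so each match is a text node.
                matches.item(i).setNodeValue(newValue);
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XPath replacement failed", e);
        }
    }

    public static void main(String[] args) {
        // Surrounding whitespace is why the expression uses normalize-space().
        String xml = "<root>"
                + "<section>STATE_LOGGER  <tag> ENABLED <value>1</value></tag></section>"
                + "<section>ERROR_LOGGER  <tag> ENABLED <value>1</value></tag></section>"
                + "</root>";
        String expression = "//section[normalize-space(text())='ERROR_LOGGER']"
                + "/tag[normalize-space(text())='ENABLED']/value/text()";
        System.out.println(replace(xml, expression, "20"));
    }
}
```

Only the value under ERROR_LOGGER/ENABLED is changed; the identically named ENABLED tag under STATE_LOGGER is not selected by the expression and keeps its value.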

Example 2

In the following example we search for specific author nodes under the book node anywhere in the hierarchy. On finding the given author node, we replace the title text (at the same level) with the given text. This is an example of a replacement that would be much less elegant using regular expressions alone.

The entries look like:

Type    Key                                                             Value

XPATH   //book[author='Larry Niven']/title/text()               Beeblewigs and other British Pond Life
XPATH   //book[author='Terry Venables']/title/text()            My Life in Football
XPATH   //book[author='Neal Stephenson']/title/text()           Wayne Rooney Style Guide

The xml file before/after:
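
As a stand-in illustration, the sketch below applies the Example 2 entries to a small hypothetical books file using the standard Java XPath API. The file layout (library/shelf/book), the starting titles, the class name BookTitleSketch and the method applyEntries are all assumptions made for this example; this is not the task's own code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
class BookTitleSketch {

    /** Applies each (expression, value) entry in order and returns the serialized result. */
    public static String applyEntries(String xml, Map<String, String> entries) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            for (Map.Entry<String, String> entry : entries.entrySet()) {
                NodeList matches = (NodeList) xpath.evaluate(
                        entry.getKey(), doc, XPathConstants.NODESET);
                // An entry with no matching node simply changes nothing.
                for (int i = 0; i < matches.getLength(); i++) {
                    matches.item(i).setNodeValue(entry.getValue());
                }
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XPath replacement failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input: book nodes nested at an arbitrary depth.
        String xml = "<library><shelf>"
                + "<book><author>Larry Niven</author><title>Ringworld</title></book>"
                + "<book><author>Neal Stephenson</author><title>Snow Crash</title></book>"
                + "<book><author>A. N. Other</author><title>Unchanged Title</title></book>"
                + "</shelf></library>";
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("//book[author='Larry Niven']/title/text()",
                "Beeblewigs and other British Pond Life");
        entries.put("//book[author='Terry Venables']/title/text()",
                "My Life in Football");
        entries.put("//book[author='Neal Stephenson']/title/text()",
                "Wayne Rooney Style Guide");
        System.out.println(applyEntries(xml, entries));
    }
}
```

In this hypothetical file there is no book by Terry Venables, so that entry matches nothing, and the book by the third author keeps its title.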

Example 3

Here we replace the FuseResetTime value of 30 with 30000.

<?xml version="1.0" encoding="UTF-8"?><MetaData>
        <Item type="FuseResetTime" value="30"/><!-- Fuse Reset time in minutes -->
        <Item type="maxResTime" value="100000"/><!-- Maximum response time in ms(milli seconds) -->
        <Item type="FailureInterval" value="10"/><!-- Maximum failure count  -->
        <Item type="FailureTimeSpan" value="5"/><!-- Span for Failure in seconds -->
</MetaData>

The XPATH entries:

Note that attribute names are prefixed with "@".

Type    Key                                                                             Value

XPATH   (/MetaData/Item[@type='FuseResetTime']/@value)[1]               30000
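
The attribute replacement in Example 3 can be sketched with the standard Java XPath API as follows. The class name AttributeReplaceSketch and its methods are hypothetical, and this is not presented as the task's implementation; the point is only that an expression ending in @value selects an attribute node, whose value can then be set directly.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
class AttributeReplaceSketch {

    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Sets every matched node to newValue and returns how many nodes were changed. */
    public static int setAll(Document doc, String expression, String newValue) {
        try {
            NodeList matches = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            for (int i = 0; i < matches.getLength(); i++) {
                // For an @attribute step, each match is an attribute node.
                matches.item(i).setNodeValue(newValue);
            }
            return matches.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    /** Returns the string value of the first node matched by the expression. */
    public static String read(Document doc, String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<MetaData>"
                + "<Item type=\"FuseResetTime\" value=\"30\"/>"
                + "<Item type=\"maxResTime\" value=\"100000\"/>"
                + "</MetaData>";
        String expression = "(/MetaData/Item[@type='FuseResetTime']/@value)[1]";
        Document doc = parse(xml);
        int changed = setAll(doc, expression, "30000");
        System.out.println(changed + " node(s) changed, value is now " + read(doc, expression));
    }
}
```

The trailing [1] around the parenthesised expression limits the replacement to the first matching attribute in document order, even if several Item elements shared the same type.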

Example 4

XML file with a namespace.

An XML file with a namespace, such as a web.xml, will not work with XPATH unless you either specify the namespace or use namespace-agnostic XPATH expressions. For example, take this XML file:

<?xml version="1.0" encoding="UTF-8"?>

<weblogic-web-app xmlns="http://www.bea.com/ns/weblogic/90">
  <jsp-descriptor>
    <page-check-seconds>1</page-check-seconds>
    <verbose>true</verbose>
  </jsp-descriptor>
</weblogic-web-app>

This expression will not find the value (1) to replace:

//jsp-descriptor/page-check-seconds/text()

A namespace-agnostic expression will work:

//*[local-name() = 'jsp-descriptor']/*[local-name() = 'page-check-seconds']/text()

So the XPATH entry (to replace the page-check-seconds value with 2) is:

Type    Key                                                                                                             Value

XPATH   //*[local-name() = 'jsp-descriptor']/*[local-name() = 'page-check-seconds']/text()              2
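
The difference between the two expressions in Example 4 can be checked with the standard Java XPath API, provided the parser is namespace aware. This is a minimal sketch under that assumption; the class name NamespaceAgnosticSketch and its methods are hypothetical and are not part of the task.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
class NamespaceAgnosticSketch {

    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Assumption: namespace-aware parsing, so the default xmlns is honoured.
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static NodeList select(Document doc, String expression) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<weblogic-web-app xmlns=\"http://www.bea.com/ns/weblogic/90\">"
                + "<jsp-descriptor>"
                + "<page-check-seconds>1</page-check-seconds>"
                + "<verbose>true</verbose>"
                + "</jsp-descriptor>"
                + "</weblogic-web-app>";
        String plain = "//jsp-descriptor/page-check-seconds/text()";
        String agnostic = "//*[local-name() = 'jsp-descriptor']"
                + "/*[local-name() = 'page-check-seconds']/text()";

        Document doc = parse(xml);
        System.out.println("plain matches:    " + select(doc, plain).getLength());
        System.out.println("agnostic matches: " + select(doc, agnostic).getLength());

        // Replace the matched text node, as the XPATH entry above would.
        NodeList matches = select(doc, agnostic);
        for (int i = 0; i < matches.getLength(); i++) {
            matches.item(i).setNodeValue("2");
        }
        System.out.println("new value: " + select(doc, agnostic).item(0).getNodeValue());
    }
}
```

In XPath 1.0 an unprefixed element name only matches elements in no namespace, which is why the plain expression selects nothing here while the local-name() form ignores the namespace and finds the node.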

DOCTYPE declaration and SAXParseException

The DOCTYPE declaration is handled as follows:

If the DOCTYPE refers to a local or remote DTD that cannot be resolved, a number of exceptions may be generated. These include FileNotFoundException, UnknownHostException and SocketTimeoutException. The SocketTimeoutException can also cause the replacement activity to stall while it waits for the socket timeout.

These situations are handled as below:

  • DTDs are only used/amended if the XML file in question has an XPATH match. Otherwise the file is left untouched.
  • Remote DTDs are ignored.
  • Local (resolvable) DTDs are used in entity expansion.
  • Local DTDs that cannot be resolved are ignored and left in the resulting output file. If there is an entity relying on the DTD then a SAXParseException is thrown. This results in the XML file being skipped, with no XPATH replacement attempted.
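
One common way to make a Java XML parser skip remote DTDs, and so avoid the network lookups behind UnknownHostException and SocketTimeoutException, is to install an org.xml.sax.EntityResolver that answers remote system identifiers with an empty document. The sketch below shows that general technique only; it is not claimed to be how the task is implemented. The class name RemoteDtdSketch, the method name, and the rule that any http:// or https:// identifier counts as remote are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
class RemoteDtdSketch {

    /** Parses the XML without fetching any DTD whose system identifier is an http(s) URL. */
    public static Document parseIgnoringRemoteDtds(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver((publicId, systemId) -> {
                boolean remote = systemId != null
                        && (systemId.startsWith("http://") || systemId.startsWith("https://"));
                if (remote) {
                    // Supply an empty DTD so no network connection is attempted.
                    return new InputSource(new StringReader(""));
                }
                // Returning null lets the parser resolve local DTDs in the default way.
                return null;
            });
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The host below is deliberately unresolvable; the resolver means it is never contacted.
        String xml = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE config SYSTEM \"http://dtd.example.invalid/config.dtd\">"
                + "<config><item>1</item></config>";
        Document doc = parseIgnoringRemoteDtds(xml);
        System.out.println("Parsed root element: " + doc.getDocumentElement().getTagName());
    }
}
```

With this resolver in place, a file that uses entities declared only in the skipped DTD has no declarations to draw on, which corresponds to the unresolved-entity situation described in the list above.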

XML Parsing

Any XML Parse errors that cannot be resolved as above will result in the XML file being skipped for XPATH substitution.

Performance

Performing a large number of XPATH replacements, or performing replacements in very large XML files, can consume significant CPU and memory and take an excessive amount of time. The time is spent evaluating the file and applying the XPATH expressions to it. If there are many expressions and the file is large, this can take a long time, because the file must be evaluated once for each expression.

The time can be improved by increasing the heap size of the agent or SSH process and making sure there is enough free memory and CPU on the target server.

For example, here is the debug log for an XML file, mvrdtest.xml:

2014-08-06 17:55:17,407 [Thread-242] DEBUG com.midvision.rapiddeploy.utilities.xpath.XmlUpdater - Processing file: mvrdtest.xml
2014-08-06 17:55:17,407 [Thread-242] DEBUG com.midvision.rapiddeploy.utilities.xpath.XmlFileContext - XmlFileContext
2014-08-06 17:55:17,589 [Thread-242] DEBUG com.midvision.rapiddeploy.utilities.xpath.XmlFileContext - got builder ...
2014-08-06 17:55:17,589 [Thread-242] DEBUG com.midvision.rapiddeploy.utilities.xpath.XmlFileContext - Parser
2014-08-06 17:55:17,974 [Thread-242] DEBUG com.midvision.rapiddeploy.utilities.xpath.XmlFileContext - Parser complete
2014-08-06 17:55:17,974 [Thread-242] DEBUG com.midvision.rapiddeploy.utilities.xpath.XmlUpdater - Eval Expressions
2014-08-06 17:55:34,742 [Thread-242] DEBUG com.midvision.rapiddeploy.utilities.xpath.XmlUpdater - Eval Expressions - complete

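The log shows that the time is spent in expression evaluation and application: parsing completes in under half a second (17:55:17,589 to 17:55:17,974), while evaluating the expressions takes almost 17 seconds (17:55:17,974 to 17:55:34,742). This XML file has more than 500,000 lines, is 9.2 MB in size, and has over 40 XPATH expressions run against it.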
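
The cost pattern described in this section (one parse, then one evaluation pass over the document per expression) can be sketched with the standard Java XPath API. This is an illustration under assumptions, not the task's code: the class name ExpressionCostSketch, the synthetic file layout, and the chosen sizes are hypothetical, and the timing printed depends entirely on the machine it runs on.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
class ExpressionCostSketch {

    /** Builds a synthetic document with the given number of Item elements. */
    public static String syntheticXml(int items) {
        StringBuilder sb = new StringBuilder("<MetaData>");
        for (int i = 0; i < items; i++) {
            sb.append("<Item type=\"t").append(i)
              .append("\" value=\"").append(i).append("\"/>");
        }
        return sb.append("</MetaData>").toString();
    }

    /** Parses once, then evaluates every expression against the whole tree. */
    public static int applyAll(String xml, Map<String, String> entries) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            int changed = 0;
            for (Map.Entry<String, String> entry : entries.entrySet()) {
                // Each evaluation here is a separate pass over the parsed document.
                NodeList matches = (NodeList) xpath.evaluate(
                        entry.getKey(), doc, XPathConstants.NODESET);
                for (int i = 0; i < matches.getLength(); i++) {
                    matches.item(i).setNodeValue(entry.getValue());
                    changed++;
                }
            }
            return changed;
        } catch (Exception e) {
            throw new IllegalStateException("XPath replacement failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = syntheticXml(20000);
        Map<String, String> entries = new LinkedHashMap<>();
        for (int i = 0; i < 40; i++) {
            entries.put("//Item[@type='t" + i + "']/@value", "changed");
        }
        long start = System.nanoTime();
        int changed = applyAll(xml, entries);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Elapsed time covers both the single parse and all expression evaluations.
        System.out.println(changed + " node(s) changed in " + elapsedMs + " ms");
    }
}
```

Increasing either the number of items or the number of entries in main gives a rough feel for how evaluation time grows with file size and expression count.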