JavaScript interface for Saxon XSLT processor and Mozilla Rhino

Monday, September 12, 2011 Posted by Ruslan Matveev
In one of my previous posts I explained how to set up Apache Ant and Mozilla Rhino to create a platform-independent JavaScript environment. I also mentioned that you can extend JavaScript by accessing Java packages and classes. Last time I showed how to use the java.io.File class to list the files in a directory; that simple example may have made you wonder what else you can do with it. So today I'm going to focus a bit more on this essential part of the Mozilla Rhino JavaScript engine by showing how to perform an XSLT transformation using the Saxon XSLT processor.

Prerequisites

Do the initial setup by following this article, or download it from the repository (you'll find all the necessary information at the end of this post). This is what you'll need to add to your build.xml:

<!-- path to the Saxon processor -->
<pathelement location="lib/java/saxon9.jar" />
<!-- path to the Saxon DOM classes -->
<pathelement location="lib/java/saxon9-dom.jar" />

Your build script will then look similar to this one (make sure that you put all the jar files in the correct locations):

<?xml version="1.0" encoding="UTF-8"?>
<project name="build" default="build" basedir=".">

    <!-- define classpath -->
    <path id="classpath">
        <!-- path to Mozilla Rhino JavaScript engine -->
        <pathelement location="lib/java/js.jar" />
        <!-- path to the Saxon processor -->
        <pathelement location="lib/java/saxon9.jar" />
        <!-- path to the Saxon DOM classes -->
        <pathelement location="lib/java/saxon9-dom.jar" />
    </path>

    <!-- main target -->
    <target name="build" description="build">
        <exec executable="java">
            <arg value="-classpath" />
            <arg value="${toString:classpath}" />
            <arg line="org.mozilla.javascript.tools.shell.Main" />
            <arg line="build.js" />
        </exec>
    </target>

</project>

Running XSLT transformation

Put XSLTProcessor.class (or XSLTProcessor.js if you want to run it from source) into the lib/java (or lib/js) folder and make sure that you have the following snippet somewhere in your build.js:

// include XSLTProcessor
load('lib/java/XSLTProcessor.class');

Then create an XSLTProcessor instance:

// create instance of XSLTProcessor
var xsltProcessor = new XSLTProcessor(xsltStyleSheet);

The XSLTProcessor constructor takes a single argument, xsltStyleSheet, which can be defined in two different ways. The first one is to use the E4X extension:

ECMAScript for XML (E4X) is a programming language extension that adds native XML support to ECMAScript (which includes ActionScript, DMDScript, JavaScript, and JScript). The goal is to provide an alternative to DOM interfaces that uses a simpler syntax for accessing XML documents. It also offers a new way of making XML visible.

The following snippet gives you an idea of how to use E4X to define an XSLT stylesheet:

// create instance of XSLTProcessor
var xsltProcessor = new XSLTProcessor(
    // use E4X XML document as XSLT stylesheet
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:template match="/">
            <xsl:message>hello world</xsl:message>
        </xsl:template>
    </xsl:stylesheet>
);

The second possibility is to define your XSLT stylesheet in an external file and then use it like this:

// create instance of XSLTProcessor
var xsltProcessor = new XSLTProcessor('stylesheet.xsl');

Now we can perform the XSLT transformation by calling transform() on the xsltProcessor instance:

// transform XML document defined as E4X object
var result = xsltProcessor.transform(
    <note>
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
    </note>
);

In the above example we used an XML document defined as an E4X object, but you can also run the transformation on an XML document stored in your file system:

// transform XML document stored in document.xml
var result = xsltProcessor.transform('document.xml');

In addition, you can let XSLTProcessor construct the XML document out of a JSON object or any other valid JavaScript value:

// transform XML document constructed out of JSON object
var result = xsltProcessor.transform({
    'note': {
        'to': 'Tove',
        'from': 'Jani',
        'heading': 'Reminder',
        'body': 'Don\'t forget me this weekend!'
    }
});

In the above code snippet, the JSON object will be converted into the following XML document:

<root>
    <note>
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
    </note>
</root>

As a fallback, it works similarly with other JavaScript types:

// transform XML document constructed out of simple value
xsltProcessor.transform(1234); // <root>1234</root>
xsltProcessor.transform(true); // <root>true</root>
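
The conversion described above can be sketched in a few lines of plain JavaScript. This is only an illustration of the documented behaviour (object keys become elements, primitive values become text, everything is wrapped in a root element), not the actual XSLTProcessor source. The function names escapeText, toXmlContent and toXmlDocument are made up for this example, and arrays or keys that are not valid XML names are deliberately not handled:

```javascript
// Hypothetical helpers, not part of XSLTProcessor: they only mirror the
// conversion described above for plain objects and primitive values.

// Escape the characters that are not allowed in XML text content.
function escapeText(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Serialize the content of a value: objects become child elements named
// after their keys, everything else becomes escaped text.
function toXmlContent(value) {
    if (value === null || typeof value !== 'object') {
        return escapeText(value);
    }
    var result = '';
    for (var key in value) {
        if (Object.prototype.hasOwnProperty.call(value, key)) {
            result += '<' + key + '>' + toXmlContent(value[key]) + '</' + key + '>';
        }
    }
    return result;
}

// Wrap the converted content into a single <root> element.
function toXmlDocument(value) {
    return '<root>' + toXmlContent(value) + '</root>';
}
```

Calling toXmlDocument with the note object from the example returns the same structure as the XML document shown above, just without the indentation.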

Specifying output properties

You can control various output properties either from within the XSLT stylesheet itself (by setting attributes on the xsl:output element) or externally, in your JavaScript code, using the setOutputProperty method. In the example below, the xsl:output element is used to set the output type to indented XML:

var xsltProcessor = new XSLTProcessor(
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="xml" indent="yes" />
        <xsl:template match="/">
            <root>
                hello world
            </root>
        </xsl:template>
    </xsl:stylesheet>
);

This would normally produce the following result:

<?xml version="1.0" encoding="UTF-8"?>
<root>hello world</root>

Now, by calling setOutputProperty and setting the output method to text, we can change the way the XSLT processor produces the result: instead of outputting an XML document it will produce "hello world":

xsltProcessor.setOutputProperty('method', 'text');

Output properties can also be used the other way around, meaning that you can set them in the stylesheet and then read them from the JavaScript code. Note that you can work with the standard output properties or define your own. If you're planning to define your own output property, you'll have to put it in a separate namespace. The following example shows how:

var xsltProcessor = new XSLTProcessor(
    <xsl:stylesheet version="2.0"
        xmlns:ac="http://www.angrycoding.com/"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <!-- define custom output property -->
        <xsl:output ac:myprop="do-something-else" />
        <xsl:template match="/">
            <root>
                hello world
            </root>
        </xsl:template>
    </xsl:stylesheet>
);

In the stylesheet above we have defined our own output property called "myprop" in the "http://www.angrycoding.com/" namespace. To read it from JavaScript, wrap the namespace URI in curly brackets and put the property name after it (note that you don't need the namespace URI if you're reading one of the standard output properties):

// will produce "do-something-else"
print(xsltProcessor.getOutputProperty(
    '{http://www.angrycoding.com/}myprop'
));
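
The "{namespace URI}name" form used above is easy to build and take apart by hand. The two helpers below are a minimal sketch in plain JavaScript; qualifiedName and parseQualifiedName are hypothetical names invented for this example and are not claimed to be part of XSLTProcessor:

```javascript
// Hypothetical helpers for illustration only; XSLTProcessor does not
// necessarily expose anything like them.

// Build a property name in "{namespace-uri}local-name" form. Standard
// output properties have no namespace, so the name is returned as is.
function qualifiedName(namespaceUri, localName) {
    return namespaceUri ? '{' + namespaceUri + '}' + localName : localName;
}

// Split a property name back into its namespace URI and local name.
function parseQualifiedName(name) {
    var match = /^\{([^}]*)\}(.+)$/.exec(name);
    return match
        ? { namespaceUri: match[1], localName: match[2] }
        : { namespaceUri: '', localName: name };
}
```

With these, qualifiedName('http://www.angrycoding.com/', 'myprop') gives exactly the string passed to getOutputProperty in the snippet above, while qualifiedName('', 'method') simply gives 'method'.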

Passing parameters

XSLTProcessor lets you pass parameters to the stylesheet at run time. A parameter is a value that is passed to the stylesheet and can then be used inside it. The following code snippet shows how to define a parameter called "foo" in an XSLT stylesheet:

var xsltProcessor = new XSLTProcessor(
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <!-- define external parameter -->
        <xsl:param name="foo" />
        <xsl:template match="/">
            <root>
                <!-- output parameter value -->
                <xsl:copy-of select="$foo" />
            </root>
        </xsl:template>
    </xsl:stylesheet>
);

Use setParameter() to pass the value into the XSLT stylesheet:

// set "foo" parameter value to "bar"
xsltProcessor.setParameter('foo', 'bar');
// will produce <root>bar</root>
print(xsltProcessor.transform());

You can also pass an XML document as the parameter value (similar to the transform() method):

// pass E4X XML document object as parameter value
xsltProcessor.setParameter('foo', (
    <hello>world</hello>
));
// will produce <root><hello>world</hello></root>
print(xsltProcessor.transform());

// pass JSON object (will be converted to XML) as parameter value
xsltProcessor.setParameter('foo', {
    'foo': 'bar'
});
// will produce <root><root><foo>bar</foo></root></root>
print(xsltProcessor.transform());

Resolving external resources

There are several cases in which the XSLT processor might need to resolve the URI of an external resource:

  • when using the functions document, doc-available, unparsed-text and unparsed-text-available
  • when using the instructions xsl:import or xsl:include
  • when a custom extension needs to resolve an external resource URI

In all of these cases the XSLT processor calls the URI resolver function before taking any further steps. The standard URI resolver resolves URIs relative to the working directory (the directory of the java process). You can override the default URI resolver with your own using the setUriResolver() function:

var xsltProcessor = new XSLTProcessor(
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output indent="yes" />
        <xsl:template match="/">
            <notes>
                <!-- copy contents of note.xml -->
                <xsl:copy-of select="document('note.xml')" />
            </notes>
        </xsl:template>
    </xsl:stylesheet>
);

// set custom URI resolver
xsltProcessor.setUriResolver(function(href, base) {
    print('about to resolve', href, 'relative to', base);
    // return resolved URI
    return ('http://www.w3schools.com/xml/' + href);
});

// print out transformation result
print(xsltProcessor.transform());
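
The resolver above always prepends the same prefix. A slightly more general resolver could look like the sketch below. Only the function(href, base) signature is taken from the example; the createResolver factory, its fallbackBase argument and the example.org addresses are assumptions made for illustration. The sketch does not normalise "../" segments and would mistake a Windows drive letter for a URI scheme:

```javascript
// Hypothetical resolver factory for illustration; only the
// function(href, base) signature comes from the example above.
function createResolver(fallbackBase) {
    return function(href, base) {
        // Leave URIs that already carry a scheme (http:, file:, ...) alone.
        if (/^[a-zA-Z][a-zA-Z0-9+.\-]*:/.test(href)) {
            return href;
        }
        // Use the supplied base when there is one, else the fallback.
        var effectiveBase = base || fallbackBase;
        // Drop everything after the last slash (the base document name).
        var directory = effectiveBase.replace(/[^\/]*$/, '');
        return directory + href;
    };
}
```

Assuming setUriResolver() accepts any function with this signature, as in the example above, it could be installed with xsltProcessor.setUriResolver(createResolver('http://example.org/xml/')).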

Additional information

You can check out the project from this repository; it includes the source code, examples and the build script (which compiles XSLTProcessor.js into a Java class that runs much faster than the source).

You might also be interested in a different approach that allows you to run JavaScript from within the XSLT stylesheet, called XSR. You can get more information about it in this article (in Russian), the examples and this presentation.
