Saturday, January 24, 2009

In Defense of XSLT

I recently had a conversation with a Java programmer about why he doesn't like XSLT. The following describes his objections and how I handled them as an XSLT salesman.

XSLT is hard to read and debug

This is a matter of using the right tool for the job. While an Eclipse plug-in like EclipseXSLT is better than a simple text editor, a full-blown XML IDE like OxygenXML can greatly increase productivity.

OxygenXML can help you navigate both the input XML document (XML input document view) and the XSLT transform (XSLT template view). The contextual XPath and XSLT content assistants are very helpful given the large number of XSLT 2.0 elements and XPath 2.0 functions now available. The XSLT refactoring feature allows you to turn a selection into a named template or an included XSLT fragment. Finally, the XSLT debugging perspective supports execution with breakpoints and provides an XSLT call stack, XPath watch views, and many other goodies.

In addition, the XSLT 2.0 specification itself defines constructs to facilitate the debugging process. You can use the select attribute (an XPath expression) of the <xsl:message> instruction, or the content of <xsl:message> as a sequence constructor, to output useful information. Where the message ends up is implementation-defined: typically the console (Saxon writes messages to standard error by default), but it can also be directed to a log file. The <xsl:comment> instruction and the XPath 2.0 trace() function can be helpful in debugging as well.
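As a rough illustration of these debugging constructs, here is a minimal sketch. The input vocabulary (an order element with an id attribute and item children carrying a price attribute) is made up for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: an order element with item children -->
  <xsl:template match="order">
    <!-- select form: the value of the XPath expression becomes the message -->
    <xsl:message select="concat('Processing order ', @id)"/>

    <!-- sequence constructor form: mix literal text and instructions -->
    <xsl:message>
      <xsl:text>Item count: </xsl:text>
      <xsl:value-of select="count(item)"/>
    </xsl:message>

    <!-- xsl:comment leaves a marker in the result tree itself -->
    <xsl:comment>generated from order <xsl:value-of select="@id"/></xsl:comment>

    <!-- trace() returns its first argument unchanged and reports it
         under the given label, so it can wrap any sub-expression -->
    <total>
      <xsl:value-of select="sum(trace(item/@price, 'prices'))"/>
    </total>
  </xsl:template>

</xsl:stylesheet>
```

Because trace() passes its input through untouched, it can be added to or removed from an expression without changing the result of the transform.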

XSLT does not support the reusability and maintainability of code

In addition to XSLT named templates inherited from XSLT 1.0, you can now create your own custom functions in XSLT 2.0. While the content models of <xsl:template> and <xsl:function> are the same, <xsl:function> is preferred for computing new values and selecting nodes (because functions are called from XPath expressions), while <xsl:template> is preferred for constructing new nodes. The "as" attribute, which can be specified on <xsl:template>, <xsl:function>, and their <xsl:param> children, allows you to constrain the types of the returned value and the input parameters.
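The division of labor between functions and templates might look like the sketch below. The my: namespace URI, the function and template names, and the item element with its price and quantity attributes are all assumptions made for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <!-- A function computes a value; "as" constrains both the
       parameters and the result -->
  <xsl:function name="my:line-total" as="xs:decimal">
    <xsl:param name="price" as="xs:decimal"/>
    <xsl:param name="quantity" as="xs:integer"/>
    <xsl:sequence select="$price * $quantity"/>
  </xsl:function>

  <!-- A named template constructs new nodes; "as" declares
       that it returns exactly one line element -->
  <xsl:template name="render-item" as="element(line)">
    <xsl:param name="item" as="element(item)"/>
    <line total="{my:line-total($item/@price, $item/@quantity)}"/>
  </xsl:template>

  <xsl:template match="item">
    <xsl:call-template name="render-item">
      <xsl:with-param name="item" select="."/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

With untyped input, the attribute values are converted to the declared parameter types when the function is called, so a missing or non-numeric price fails with a type error at the call instead of silently propagating.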

For reusability and maintainability, the <xsl:include>, <xsl:import>, and <xsl:apply-imports> elements inherited from XSLT 1.0 are still available. When used properly, <xsl:import> provides capabilities that are similar to inheritance in object-oriented languages like Java. The <xsl:apply-imports> instruction and the XSLT 2.0 <xsl:next-match> instruction are reminiscent of a call to super() in Java.
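The inheritance analogy can be sketched as follows. The file names custom.xsl and base.xsl, and the title element with its draft attribute, are hypothetical; base.xsl is assumed to contain its own template rule for title.

```xml
<!-- custom.xsl: a hypothetical stylesheet layered on a hypothetical base.xsl -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- xsl:import must come first; imported rules have lower
       import precedence than the rules declared here -->
  <xsl:import href="base.xsl"/>

  <!-- Overrides the rule that base.xsl defines for title,
       then delegates to it, much like calling super() -->
  <xsl:template match="title">
    <div class="title">
      <xsl:apply-imports/>
    </div>
  </xsl:template>

  <!-- xsl:next-match invokes the next best matching rule, which
       may live in this same stylesheet (here, the rule above) -->
  <xsl:template match="title[@draft = 'true']" priority="2">
    <xsl:comment>draft</xsl:comment>
    <xsl:next-match/>
  </xsl:template>

</xsl:stylesheet>
```

The difference between the two instructions is scope: <xsl:apply-imports> only considers rules in imported stylesheets, while <xsl:next-match> also considers lower-priority rules in the current one.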

There is no way to integrate XSLT with my Java libraries

With Saxon, you can represent a Java class as the namespace URI of a function and you can call Java methods and constructors directly from your XSLT transform. This allows you to reuse existing pieces of application logic built in Java without rewriting them in XSLT.
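A minimal sketch of this mechanism, using two JDK classes so that no application code is needed. It relies on Saxon's "java:" namespace convention for binding a prefix to a class; whether this feature is enabled depends on the Saxon edition and version you run, so treat that as an assumption to check.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:math="java:java.lang.Math"
    xmlns:date="java:java.util.Date"
    exclude-result-prefixes="math date">

  <xsl:template match="/">
    <result>
      <!-- Static method: Math.sqrt(double) -->
      <root>
        <xsl:value-of select="math:sqrt(16e0)"/>
      </root>
      <!-- Constructor via new(), then an instance method that
           takes the object as its first argument -->
      <generated>
        <xsl:value-of select="date:toString(date:new())"/>
      </generated>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Your own classes are bound the same way, provided they are on the classpath when the transform runs.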

There is no type checking and no way to verify that the result of a transformation is valid against a schema

Again, this argument is no longer valid with XSLT 2.0. You can now validate both the input and output by using a schema-aware XSLT processor. I strongly recommend the schema-aware version of Saxon. This allows you to root out errors and correct bugs early. In addition to the built-in XML Schema types such as xs:decimal and xs:dateTime, you can define custom types in an XSD. You can then write a template that matches all elements of a certain type. XSD type hierarchies and substitution groups are fully supported as well.
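The following is a sketch of what a schema-aware stylesheet might look like. The po.xsd schema, its namespace, the AddressType type, and the labels, label, street, and city elements are all hypothetical. It assumes a schema-aware processor and that the source document is validated when it is loaded, so that its nodes carry type annotations.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:po="http://example.com/po">

  <!-- Makes the schema's element declarations and types known
       to the stylesheet (po.xsd is a hypothetical schema) -->
  <xsl:import-schema namespace="http://example.com/po"
                     schema-location="po.xsd"/>

  <!-- Ask the processor to validate the result tree; this assumes
       po.xsd also declares the labels and label elements -->
  <xsl:template match="/">
    <xsl:result-document validation="strict">
      <po:labels>
        <xsl:apply-templates select="//element(*, po:AddressType)"/>
      </po:labels>
    </xsl:result-document>
  </xsl:template>

  <!-- Matches any element whose type annotation is AddressType
       or a type derived from it, whatever the element is called -->
  <xsl:template match="element(*, po:AddressType)">
    <po:label>
      <xsl:value-of select="po:street, po:city" separator=", "/>
    </po:label>
  </xsl:template>

</xsl:stylesheet>
```

Matching on type rather than name means a single rule covers, say, both a shipping and a billing address element, as long as the schema gives them the same type.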

After schema validation, a Post-Schema-Validation Infoset (PSVI) is generated and each node is assigned a typed value (which can be obtained using the data() function) and a type annotation (which is the schema type used to validate the node). To turn a plain string into a value of a given type, constructor functions are available for built-in and custom types, as in xs:date("2009-01-24").

Saxon supports “optimistic static type-checking”. The following is an excerpt from the Saxon FAQ:

Saxon does not do static type-checking in the sense that the term is used in the W3C language specifications (this refers to pessimistic type checking, in which any construct that might fail at run-time is rejected at compile time). This is an optional feature of the W3C specifications. Saxon does however perform optimistic static analysis of queries and stylesheets, in which an error is reported only for constructions that must always fail at run-time. The information derived from this static analysis is also used to optimize the run-time code.

Unlike Java, XSLT 2.0 lacks a try/catch feature. However, you can use the handy "castable as" and "instance of" operators to detect programming errors as well as data errors in the input document.
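Here is one way these two operators might be used defensively. The invoice element, its date attribute, and the my:describe function are invented for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <!-- "castable as" guards the cast, standing in for try/catch -->
  <xsl:template match="invoice">
    <xsl:choose>
      <xsl:when test="@date castable as xs:date">
        <issued year="{year-from-date(xs:date(@date))}"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message select="concat('Invalid or missing date: ', @date)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- "instance of" checks the dynamic type of a value -->
  <xsl:function name="my:describe" as="xs:string">
    <xsl:param name="value" as="item()"/>
    <xsl:sequence select="
      if ($value instance of xs:integer) then 'integer'
      else if ($value instance of node()) then 'node'
      else 'other'"/>
  </xsl:function>

</xsl:stylesheet>
```

Since "castable as xs:date" is false for an empty sequence, the same test catches a missing date attribute as well as a malformed one.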

With XSLT, it is "run and pray": there is no unit and functional testing framework like in Java

Automated unit and functional testing are essential in agile software development. Type checking (or schema-aware XSLT) can help you reduce the number of unit tests needed to fully test your XSLT transform, but it may not be enough to detect all errors, such as business rule violations in the output.

Unit testing is performed by testing individual XSLT functions and templates. You can try Jeni Tennison's XSpec framework, which is inspired by the Ruby RSpec framework, itself based on a Behavior Driven Development (BDD) approach.
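To give a feel for the BDD style, here is a sketch of an XSpec test file. The stylesheet name invoice.xsl and the function my:line-total (assumed to multiply a decimal price by an integer quantity) are hypothetical, and element details may vary between XSpec versions.

```xml
<x:description xmlns:x="http://www.jenitennison.com/xslt/xspec"
               xmlns:my="http://example.com/my-functions"
               stylesheet="invoice.xsl">

  <!-- Each scenario describes a situation and states what is expected -->
  <x:scenario label="when computing a line total">
    <!-- Call the hypothetical function with a price and a quantity -->
    <x:call function="my:line-total">
      <x:param select="2.50"/>
      <x:param select="4"/>
    </x:call>
    <x:expect label="it multiplies the price by the quantity"
              select="10.0"/>
  </x:scenario>

</x:description>
```

The labels read as sentences, so the generated report doubles as a plain-language description of what the stylesheet is supposed to do.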

Functional testing consists of testing whole outputs of the XSLT transform. One way of doing this is to run a Schematron schema which contains XPath 2.0-based assertions against the output of the XSLT transform. Since the Schematron schema validation process is itself based on XSLT, you can chain all these transformations together with a build tool like Ant which produces an HTML report with friendly diagnostic messages. With XML Schema 1.1, you will be able to build these assertions directly into your schema, although you will have less control over the diagnostic messages that are produced by the XSD validator.
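A business rule check of this kind could be sketched in ISO Schematron as follows. The output vocabulary (an invoice element with a total child and line children carrying an amount attribute) is made up for the example.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">

  <!-- Declare the xs prefix so XPath 2.0 constructor functions can be used -->
  <sch:ns prefix="xs" uri="http://www.w3.org/2001/XMLSchema"/>

  <sch:pattern>
    <!-- Hypothetical rule: the invoice total must equal the sum of its lines -->
    <sch:rule context="invoice">
      <sch:assert test="xs:decimal(total) = sum(line/xs:decimal(@amount))">
        The total <sch:value-of select="total"/> of this invoice does not
        match the sum of its line amounts.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

</sch:schema>
```

The text inside the assert element is the diagnostic shown when the test fails, which is where Schematron gives you full control over the wording.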

Conclusion

In a world marked up in XML, XSLT 2.0 is a very powerful language available to developers. However, to take full advantage of it, you need to use the right tools and take the time to explore its full capabilities, particularly its code reuse and type checking features. Finally, to bring XSLT into the realm of agile software development, unit and functional testing should become an integrated part of the XSLT development process.