My XSLT Toolbox – Recursive XSLT templates

Recursion is one of the core concepts in programming. It’s valuable not only as a technique for writing programs, but as a general concept for solving problems. XSLT provides many useful elements such as for-each (and apply-templates), but occasionally you will run into a problem which must be solved with recursion. Let’s take a look at a real-world (no Fibonacci!!) example, where we have to operate on a simple string of numbers separated by commas. We’ll take a step-by-step approach to writing a recursive template.

Let’s say we have the following source document, short and sweet. We want to take each number, and wrap it with an element.

<?xml version="1.0" encoding="UTF-8"?> <comma>1,2,3,4,5,6,7,88,99,100</comma>

The easy way to do this is to use the EXSLT str:tokenize function, which takes a string and some delimiters and splits the string based on those delimiters. All we do is add the xmlns:str and extension-element-prefixes attributes to our xsl:stylesheet declaration, and then call the str:tokenize function.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:str="http://exslt.org/strings" extension-element-prefixes="str"> <xsl:template match="/> <xsl:for-each select="str:tokenize( comma, ',')"> <xsl:copy-of select="."/> </xsl:for-each> </xsl:template> </xsl:stylesheet>

The result is, (new-lines added for readability):

<?xml version="1.0"?> <token>1</token> <token>2</token> <token>3</token> <token>4</token> <token>5</token> <token>6</token> <token>7</token> <token>88</token> <token>99</token> <token>100</token>

Excellent. But let’s say that we don’t have access to the EXSLT functions, and we have to write a template to perform the same thing.

So now we think up a recursive algorithm. Let’s look at a simplified list with three numbers, such as “1,2,3”. First, we print the “1”, the value before the first comma, and then we discard the first comma. At that point, our list will be “2,3” and we repeat, printing the new first value, and discarding the new first comma. Finally, the list becomes only “3”. There is no comma, so we simply print out the rest of the list, “3”. So we will be recursing over the string printing the first number, and then popping off the first number and first comma. This technique will work with a three number list, or a million-number list (though your processor will probably run out of memory before that).

XPath’s “substring-before”, “substring-after”, and “contains” functions are all of the tools that we’ll need to implement our algorithm. “substring-before” lets us obtain the number before the first comma. “substring-after” lets us discard the first number and first comma, and “contains” allows us figure out the last, comma-less case.

Our function starts in the same manner as all recursive functions, dealing with the last case, and then all of the cases before it. The last case will be the comma-less case from our algorithm. So here’s our template skeleton.

Advantages of push-style XSLT over pull-style

2008-11-25 3 min read Programming Xslt Eddie

Working with more than a few new-hires over the last few weeks, I’ve noticed that new XSLT developers often write pull-style XSLTs by default. However, this tends to defy XSLT’s functional heritage, and is not as useful as the opposite form, push-style XSLTs.

Pull-style XSLTs reach into the source document and pull out the data they need to transform. The pull-style is similar to template systems like those found in Rails or Django, or inserting PHP commands between HTML elements. For example, given the trivial input:

<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book>
        <title>The Scheme Programming Language</title>
        <author>R. Kent Dybvig</author>
    </book>
    <book>
        <title>Essentials of Programming Languages</title>
        <author>Daniel P. Friedman</author>
    </book>
    <book>
        <title>An Introduction to Information Theory</title>
        <author>John R. Pierce</author>
    </book>
</books>

an XSLT novice will produce a stylesheet like the following (note lines 11 and 12 which reach into the source and grab the data):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="/">
        <html>
            <head>
                <title>books</title>
            </head>
            <body>
                <dl>
                    <xsl:for-each select="books/book">
                        <dt><xsl:value-of select="title"/></dt>
                        <dd><xsl:value-of select="author"/></dd>
                    </xsl:for-each>
                </dl>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>

which transforms into:

<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <title>books</title>
   </head>
   <body>
      <dl>
         <dt>The Scheme Programming Language</dt>
         <dd>R. Kent Dybvig</dd>
         <dt>Essentials of Programming Languages</dt>
         <dd>Daniel P. Friedman</dd>
         <dt>An Introduction to Information Theory</dt>
         <dd>John R. Pierce</dd>
      </dl>
   </body>
</html>

The real power of XSLT, however, is defining templates for the elements found within the source document. These are push-style XSLTs. They have two main advantages. First, push-style gracefully handles complex source structures, including recursively nested elements. It would be near impossible to handle the following source document using pull-style,

<pre lang="xml">
<div><div><div>a</div></div></div>

if you didn’t know how deep the recursive divs would go. A push-style solution, though, is incredibly simple.

<pre lang="xml">
<template match="div">
     * <apply-templates></apply-templates> *
</template>

Will transform the previous source into the following.

* * * a * * *

In addition to handling complex source structures, push-style allows code reuse. This is of course an ideal of any programming language. Push-style XSLTs have a greater ability to be reused, because the individual templates can be reused. When you only have one template, it is quite difficult to make it general without resorting to numerous choose-when statements. Here is an example of code reuse, where we extend a previously written template with the xsl:apply-imports rule.

Given the input,

<images>
    <image>
        <url>http://www.filmjunkie.com/drinks/blixa/blixa.jpg</url>
        <alt>Blixa!</alt>
    </image>
</images>

and the XSLTs,

    <xsl:import href="imageformat.xsl"/>
 
    <xsl:template match="image">
        <div class="wrapper">
            <xsl:apply-imports/>
        </div>
    </xsl:template>

and the rule in “imageformat.xsl” (the template being extended in this case),