My XSLT Toolbox – 5 Favorite XSLT Books

I love reading programming books, especially to learn a new programming language. While learning XSLT I read a large number of books, as there are quite a few available. The quality of the XSLT books struck me as particularly uneven: some were quite good, while others weren’t even worth the time to skim. So I’m throwing together a simple list of my current collection of XSLT references, which happen to be my favorites of the bunch. Each book is geared towards a specific audience (beginners, advanced users, etc.), so I’ve noted the audience for each.

XSLT – Mastering XML Transformations, Doug Tidwell
This is my favorite XSLT book. Mr. Tidwell did a great job of combining an introduction to the language, a tutorial on how to write XSLTs, and a reference all into one book. On top of that, I found it to be written in the clearest, most conversational style I’ve found in many a programming book. I find this book covers 90% of my day-to-day needs, and when I forget how something works, this book usually answers my questions. (Plus, hey, you can get the 1.0 version for about $3 used.)
XSLT: Programmer’s Reference, Michael Kay
If Mr. Tidwell’s book covers 90%, this book covers all 100%, and then some. Mr. Kay (who wrote the Saxon processor, if you weren’t aware) presents what amounts to an annotated specification in book form. One of my co-workers calls this book the XSLT dictionary, and I can’t argue with that. This book is probably best for advanced programmers.
XSLT and XPath On The Edge, Jeni Tennison
Once you’ve got the basics of the language down, you’ve got to use it to write real-world code. I found this book helps to smooth down the rough edges of working with the language. This book requires a mid-level familiarity with the language.
XSLT Cookbook, Second Edition, Salvatore Mangano
I reach for this book whenever I’ve got to do something weird. I use it to find the solution to some odd edge case, or for my “can I do this with XSLT” questions. The book covers everything from faking regular expressions, to set operations on different node-sets, to functional programming with XSLT. I don’t use it often, but it’s like gold when I do. This book is mostly for advanced users.
XPath and XPointer, John E. Simpson
The content in this book is totally covered in each of the other books, and it isn’t really XSLT, because it only covers XPath. But this book is my simple reference to 90% of the XPath questions I have. It is a nice little book that I could live without, but it certainly makes my life easier having it around. I think new users will likely get the most out of this book. (Another book that can be had for about $3 used.)

(For disclosure, I did make the links Amazon referrals. I feel kinda weird about it, but figured why not. I don’t expect any results, but if I got some, it’d go straight to buying a new book.)

My XSLT Toolbox – Recursive XSLT templates

2008-12-28 4 min read Programming Xslt Eddie

Recursion is one of the core concepts in programming. It’s valuable not only as a technique for writing programs, but as a general concept for solving problems. XSLT provides many useful elements such as for-each (and apply-templates), but occasionally you will run into a problem which must be solved with recursion. Let’s take a look at a real-world (no Fibonacci!!) example, where we have to operate on a simple string of numbers separated by commas. We’ll take a step-by-step approach to writing a recursive template.

Let’s say we have the following source document, short and sweet. We want to take each number, and wrap it with an element.

<?xml version="1.0" encoding="UTF-8"?>
<comma>1,2,3,4,5,6,7,88,99,100</comma>

The easy way to do this is to use the EXSLT str:tokenize function, which takes a string and some delimiters and splits the string based on those delimiters. All we do is add the xmlns:str and extension-element-prefixes attributes to our xsl:stylesheet declaration, and then call the str:tokenize function.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
version="1.0" xmlns:str="http://exslt.org/strings" 
extension-element-prefixes="str">
 
    <xsl:template match="/">
        <xsl:for-each select="str:tokenize( comma, ',')">
            <xsl:copy-of select="."/>
        </xsl:for-each>
    </xsl:template>
 
</xsl:stylesheet>

The result is (newlines added for readability):

<?xml version="1.0"?>
<token>1</token>
<token>2</token>
<token>3</token>
<token>4</token>
<token>5</token>
<token>6</token>
<token>7</token>
<token>88</token>
<token>99</token>
<token>100</token>

Excellent. But let’s say that we don’t have access to the EXSLT functions, and we have to write a template to perform the same thing.

So now we think up a recursive algorithm. Let’s look at a simplified list with three numbers, such as “1,2,3”. First, we print the “1”, the value before the first comma, and then we discard the first comma. At that point, our list will be “2,3” and we repeat, printing the new first value and discarding the new first comma. Finally, the list becomes only “3”. There is no comma, so we simply print out the rest of the list, “3”. So we recurse over the string, printing the first number and then popping off the first number and first comma. This technique will work with a three-number list, or a million-number list (though your processor will probably run out of memory before that).

XPath’s “substring-before”, “substring-after”, and “contains” functions are all of the tools that we’ll need to implement our algorithm. “substring-before” lets us obtain the number before the first comma, “substring-after” lets us discard the first number and first comma, and “contains” allows us to detect the last, comma-less case.
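To make the three functions concrete, here is a small sketch that applies each of them to the literal strings from the “1,2,3” walkthrough. It can be run against any source document, since it never reads from the input; the text output method and the literal strings are just choices made for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <!-- The value before the first comma: 1 -->
        <xsl:value-of select="substring-before('1,2,3', ',')"/>
        <xsl:text>&#10;</xsl:text>
        <!-- Everything after the first comma: 2,3 -->
        <xsl:value-of select="substring-after('1,2,3', ',')"/>
        <xsl:text>&#10;</xsl:text>
        <!-- The last case has no comma left: false -->
        <xsl:value-of select="contains('3', ',')"/>
    </xsl:template>
</xsl:stylesheet>
```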

Our template starts the same way as any recursive function: we deal with the last case first, and then all of the cases before it. The last case will be the comma-less case from our algorithm. So here’s our template skeleton.
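A minimal sketch of how such a template might look once filled in, following the algorithm described above. The template name “tokenize”, the parameter name “list”, and the “token” wrapper element (chosen to mirror the str:tokenize output) are all assumptions made for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <xsl:template match="/">
        <!-- Start the recursion with the full comma-separated string -->
        <xsl:call-template name="tokenize">
            <xsl:with-param name="list" select="string(comma)"/>
        </xsl:call-template>
    </xsl:template>

    <!-- "tokenize" and "list" are hypothetical names -->
    <xsl:template name="tokenize">
        <xsl:param name="list"/>
        <xsl:choose>
            <!-- Earlier cases: emit the value before the first comma,
                 then recurse on everything after that comma -->
            <xsl:when test="contains($list, ',')">
                <token><xsl:value-of select="substring-before($list, ',')"/></token>
                <xsl:call-template name="tokenize">
                    <xsl:with-param name="list" select="substring-after($list, ',')"/>
                </xsl:call-template>
            </xsl:when>
            <!-- Last case: no comma left, so emit whatever remains -->
            <xsl:otherwise>
                <token><xsl:value-of select="$list"/></token>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

</xsl:stylesheet>
```

Each recursive call receives a strictly shorter string, so the recursion always reaches the comma-less case and stops.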

My XSLT Toolbox – copy and copy-of

2008-12-27 4 min read Programming Xslt Eddie

Using XSLT to copy elements is extremely common when you’re transforming a source document of a certain type (XML, HTML, etc.) to the same type. Often, you need an exact copy of an element verbatim, but other times you need to selectively choose certain elements to copy and others to discard. XSLT makes this process quite elegant using its xsl:copy-of and xsl:copy elements. The following is a step-by-step tutorial on how these elements are used.

When you need an exact copy of an element and its children, you use the xsl:copy-of element, which makes an exact copy of the selected element and its children. Given the following XML data, which represents a (trivial) inventory of a store, let’s say you want an exact copy of any items with the name “XSLT”.

<?xml version="1.0" encoding="UTF-8"?>
<inventory>
    <item id="1">
        <name>The Little Schemer</name>
        <type>book</type>
        <author>Friedman</author>
        <author>Felleisen</author>
        <list-price>29.95</list-price>
        <sell-price>26.99</sell-price>
        <cost>17.92</cost>
    </item>
    <item id="2">
        <name>XSLT</name>
        <type>book</type>
        <author>Tidwell</author>
        <list-price>49.95</list-price>
        <sell-price>34.99</sell-price>
        <cost>22.92</cost>
    </item>
    <item id="3">
        <name>Romeo and Juliet</name>
        <type>compact disc</type>
        <conductor>Rostropovich</conductor>
        <list-price>18.98</list-price>
        <sell-price>13.99</sell-price>
        <cost>9.92</cost>
    </item>
</inventory>

You simply apply the following XSLT stylesheet to your source document:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <xsl:copy-of select="inventory/item[name = 'XSLT']"/>
    </xsl:template>
</xsl:stylesheet>

Which gives you exactly what you were looking for, the “item” with the name “XSLT”.

<?xml version="1.0" encoding="utf-8"?>
<item id="2">
    <name>XSLT</name>
    <type>book</type>
    <author>Tidwell</author>
    <list-price>49.95</list-price>
    <sell-price>34.99</sell-price>
    <cost>22.92</cost>
</item>

That was easy, so now let’s say you want to do a little more with your inventory document. Your boss wants a copy of it to look at the numbers and do some accounting. She doesn’t care about the authors or conductors, so she’d like that information left out. Also, she would like an additional piece of information for each item, the amount of profit off each item sold, the difference between the sell-price and the cost.

Because we are adding a piece of information and getting rid of elements that don’t affect the accounting, we can’t use an xsl:copy-of, because that would output an exact copy of the item element, its attribute nodes, and its child nodes. This exact copy is called a deep copy, because it copies not only the element, but all of its children as well. The solution is to use xsl:copy, which performs a shallow copy: it copies only the current node, and ignores all child and attribute nodes.

Since xsl:copy only copies one node at a time, you need to explicitly specify that you want to continue copying attribute nodes and child nodes. xsl:apply-templates gives us the leverage to write a template that accomplishes that. The following template starts by matching attribute and child nodes, then copies the node, and recursively applies itself to any attribute or child nodes found in the source tree.
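A minimal sketch of that approach, extended to cover the accounting request described above. The first template is the widely used identity-template idiom; the “profit” element name and the use of format-number to round the difference to two decimals are assumptions made for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <!-- Identity template: shallow-copy the current node, then apply
         templates to its attributes and children so they are copied too -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- An empty template drops authors and conductors from the output -->
    <xsl:template match="author|conductor"/>

    <!-- Copy each item as usual, then append the profit per item sold.
         "profit" is a hypothetical element name. -->
    <xsl:template match="item">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
            <profit>
                <xsl:value-of select="format-number(sell-price - cost, '0.00')"/>
            </profit>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

The more specific “item” and “author|conductor” patterns take priority over the generic identity template, so only the nodes that need special handling get it; everything else is copied through unchanged.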

Affecting your situation

2008-12-18 1 min read Cello Classical Music Eddie

I don’t typically link to other blogs/articles, nor do I mention classical music particularly often, but I found this article and blog entry so interesting and thought-provoking that they deserve a re-post.

First, a moving blog entry from David Finlayson, trombonist in the New York Philharmonic, and second, the New York Times article describing the background, as well as referencing the blog post.

While I’ve never had a specifically parallel experience, I can relate to the concept of “fakes” in a particular industry. I find Mr. Finlayson’s reaction (that all musicians must take responsibility and blame for the situation) to be both bold and… well, correct. It takes a strong person to identify a stormy situation clearly and react in an appropriate fashion. I only hope that I would react the same way given the circumstances.

Advantages of push-style XSLT over pull-style

2008-11-25 3 min read Programming Xslt Eddie

Working with more than a few new-hires over the last few weeks, I’ve noticed that new XSLT developers often write pull-style XSLTs by default. However, this tends to defy XSLT’s functional heritage, and is not as useful as the opposite form, push-style XSLTs.

Pull-style XSLTs reach into the source document and pull out the data they need to transform. The pull-style is similar to template systems like those found in Rails or Django, or inserting PHP commands between HTML elements. For example, given the trivial input:

<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book>
        <title>The Scheme Programming Language</title>
        <author>R. Kent Dybvig</author>
    </book>
    <book>
        <title>Essentials of Programming Languages</title>
        <author>Daniel P. Friedman</author>
    </book>
    <book>
        <title>An Introduction to Information Theory</title>
        <author>John R. Pierce</author>
    </book>
</books>

an XSLT novice will produce a stylesheet like the following (note lines 11 and 12 which reach into the source and grab the data):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="/">
        <html>
            <head>
                <title>books</title>
            </head>
            <body>
                <dl>
                    <xsl:for-each select="books/book">
                        <dt><xsl:value-of select="title"/></dt>
                        <dd><xsl:value-of select="author"/></dd>
                    </xsl:for-each>
                </dl>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>

which transforms into:

<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <title>books</title>
   </head>
   <body>
      <dl>
         <dt>The Scheme Programming Language</dt>
         <dd>R. Kent Dybvig</dd>
         <dt>Essentials of Programming Languages</dt>
         <dd>Daniel P. Friedman</dd>
         <dt>An Introduction to Information Theory</dt>
         <dd>John R. Pierce</dd>
      </dl>
   </body>
</html>

The real power of XSLT, however, is defining templates for the elements found within the source document. These are push-style XSLTs. They have two main advantages. First, push-style gracefully handles complex source structures, including recursively nested elements. It would be near impossible to handle the following source document using pull-style,

<div><div><div>a</div></div></div>

if you didn’t know how deep the recursive divs would go. A push-style solution, though, is incredibly simple.

<xsl:template match="div">
     * <xsl:apply-templates/> *
</xsl:template>

This transforms the previous source into the following (ignoring whitespace).

* * * a * * *
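For comparison, the pull-style books stylesheet from earlier could be restructured push-style along the following lines. This is only a sketch: the split into one template per element and the use of xsl:strip-space (so that whitespace-only text nodes in the source are not pushed through the built-in rules) are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Ignore whitespace-only text nodes in the source document -->
    <xsl:strip-space elements="*"/>

    <!-- Root: build the page shell, then push the children through the templates -->
    <xsl:template match="/">
        <html>
            <head>
                <title>books</title>
            </head>
            <body>
                <xsl:apply-templates/>
            </body>
        </html>
    </xsl:template>

    <!-- The list container -->
    <xsl:template match="books">
        <dl>
            <xsl:apply-templates/>
        </dl>
    </xsl:template>

    <!-- A book produces no markup of its own; its children do the work -->
    <xsl:template match="book">
        <xsl:apply-templates/>
    </xsl:template>

    <xsl:template match="title">
        <dt><xsl:value-of select="."/></dt>
    </xsl:template>

    <xsl:template match="author">
        <dd><xsl:value-of select="."/></dd>
    </xsl:template>
</xsl:stylesheet>
```

Nothing in this version reaches into the source with a path like books/book; each template only describes what to do when its element comes along.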

In addition to handling complex source structures, push-style allows code reuse, which is of course an ideal of any programming language. Push-style XSLTs are easier to reuse because the individual templates can be reused. When you only have one template, it is quite difficult to make it general without resorting to numerous choose-when statements. Here is an example of code reuse, where we extend a previously written template with the xsl:apply-imports element.

Given the input,

<images>
    <image>
        <url>http://www.filmjunkie.com/drinks/blixa/blixa.jpg</url>
        <alt>Blixa!</alt>
    </image>
</images>

and the XSLTs,

    <xsl:import href="imageformat.xsl"/>
 
    <xsl:template match="image">
        <div class="wrapper">
            <xsl:apply-imports/>
        </div>
    </xsl:template>

and the rule in “imageformat.xsl” (the template being extended in this case),
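A minimal sketch of what such an imported rule could look like, assuming it simply renders each image as an HTML img element. The contents of “imageformat.xsl” shown here are an assumption made for illustration, not the original file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical imageformat.xsl; the real file's contents are not known -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <!-- Render each image as an img element, filling the attributes
         from the url and alt children via attribute value templates -->
    <xsl:template match="image">
        <img src="{url}" alt="{alt}"/>
    </xsl:template>

</xsl:stylesheet>
```

With a rule like this in place, xsl:apply-imports in the importing stylesheet applies the imported template to the same image node, so the img element ends up inside the div wrapper.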
