22.2. Quick Tour

22.2.1. Hello World

XSLT is a transformation language and as such not meant to print "Hello World". However, we can show the required scaffolding by defining a transformation which turns any XML input into our friendly message.

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">Hello World</xsl:template>
</xsl:stylesheet>

We put this transformation (or "stylesheet") into a file called hello.xsl. To run the program, that is, to apply this transformation, we enter any XML document into a file called hello.xml, for example,

<?xml version="1.0"?>
<hello/>

and run the XSLT processor with the stylesheet and the XML document as the two arguments:

> xsltproc hello.xsl hello.xml
Hello World

We have seen many "Hello World" programs by now, but this is definitely different. First, an XSLT stylesheet is itself an XML document. Similar to Lisp, this means that the same tools can be used for the data and the programs manipulating the data. We can, for example, use XSLT to generate or modify other XSLT stylesheets. On the downside, this also means that we have to deal with a syntax that was not invented for programs. A good XML editor such as James Clark's nxml mode for Emacs (which I'm using to write this book) or one of the many graphical XML tools is indispensable.

After the XML declaration <?xml version="1.0"?>, the root element tells the processor that we are defining an XSLT stylesheet and introduces the associated namespace, whose prefix is almost always xsl (unless you generate another stylesheet).

The body of the stylesheet consists of a number of global definitions followed by a list of rules. In our example, there is exactly one of each. The output statement, <xsl:output method="text"/>, determines how the output is formatted. The default output method is xml, since XSLT's main purpose is to transform one XML document into another. The xml method ensures that the output is well-formed XML, including proper nesting of elements and the escaping of special characters. Since we want to generate plain text, we use the text method. The text method is often used in XSLT-based code generators, for example, to generate Java classes from XML schemas.

The template element defines the transformation rule. As usual, a rule consists of two parts: when to apply the rule and what to do in case the rule is applied. In XSLT, a rule is applied to all the XML nodes matching the XPath expression in the match attribute of the template. We will dive into XPath in the next section. Our example uses the simplest possible XPath expression, /, which matches the root node of an XML document (the node that contains the document's top-level element).

Now, how do we get anything to the output stream? When the XSLT processor encounters a non-XSL element or text node in a template, the element or text is copied to the output document. In other words, the one and only rule of our "Hello World" stylesheet is applied to the root node of the XML document, which causes the "Hello World" message to be copied to the output stream.
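
The same copying applies to non-XSL elements, not just to text. The following variant is a minimal sketch: it assumes that the xsl:output element is dropped so that the default xml method applies, and the greeting element is a hypothetical name chosen only for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- No xsl:output element: the default xml output method is used. -->
  <xsl:template match="/">
    <!-- "greeting" is not in the XSLT namespace, so the element
         and its text content are copied to the output as they are. -->
    <greeting>Hello World</greeting>
  </xsl:template>
</xsl:stylesheet>
```

Applied to hello.xml, this should yield an XML declaration followed by <greeting>Hello World</greeting>.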

22.2.2. XPath

Much of the power of XSLT derives from the ability to extract information from XML documents in a very concise way using XPath expressions. As test data for the following examples we take a small bibliography (as used in the DocBook source of this book) and store it in biblio.xml.

<?xml version="1.0" ?>
<bibliography id="biblio.xslt">
  <title>References</title>
  <biblioentry id="Eckstein01">
    <authorgroup>
      <author>
	<firstname>Robert</firstname>
	<surname>Eckstein</surname>
      </author>
      <author>
	<firstname>Michel</firstname>
	<surname>Casabianca</surname>
      </author>
    </authorgroup>
    <isbn>0596001339</isbn>
    <publisher>
      <publishername>O'Reilly</publishername>
    </publisher>
    <pubdate>2001</pubdate>
    <title>XML Pocket Reference</title>
  </biblioentry>

  <biblioentry id="Kay03">
    <author>
      <firstname>Michael</firstname>
      <surname>Kay</surname>
    </author>
    <isbn>0764543814</isbn>
    <publisher>
      <publishername>John Wiley</publishername>
    </publisher>
    <pubdate>2003</pubdate>
    <title>XSLT</title>
  </biblioentry>
</bibliography>

To start with, we apply an empty stylesheet to this data.

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>

To our surprise, the output is not empty, but contains all the text (that is, the contents of the text nodes) of the document.

<?xml version="1.0"?>
  References
	Robert
	Eckstein
	Michel
	Casabianca
    0596001339
      O'Reilly
    2001
    XML Pocket Reference
      Michael
      Kay
    0764543814
      John Wiley
    2003
    XSLT

Whenever the XSLT processor encounters a node for which no template is defined in the stylesheet, it applies a default template. For an element node (and for the root node), the default template applies templates to all children of the node. The default template for text nodes copies the contents to the output. Together, this explains why we see all the text of the XML input document when applying the empty stylesheet.
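
Written out explicitly, the two default templates described above correspond to the built-in rules given in the XSLT 1.0 specification. The stylesheet below is a sketch that should behave like the empty stylesheet; note that the second rule also covers attribute nodes, which never show up in our output because the first rule only visits children, and attributes are not children.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Root node and element nodes: continue with the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text (and attribute) nodes: copy the string value to the output. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```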

Like paths in a file system, XPath expressions denote nodes in an XML document. In the simplest case, an XPath expression looks exactly like a (UNIX) directory path. The main difference is that a path may refer to multiple nodes, since an XML element may contain multiple children with the same name. Here is a stylesheet which extracts the titles from our bibliography.
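
A minimal sketch of such a stylesheet could look as follows. It is one possible version rather than a definitive listing, and it assumes that we only want the titles of the individual entries, so the path /bibliography/biblioentry/title deliberately skips the title of the bibliography itself.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Start at the root node and select every title of every entry. -->
  <xsl:template match="/">
    <xsl:apply-templates select="/bibliography/biblioentry/title"/>
  </xsl:template>

  <!-- Print the text of each selected title, followed by a newline. -->
  <xsl:template match="title">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Saved under a name of our choice, say titles.xsl, and applied to biblio.xml, it should print the two lines "XML Pocket Reference" and "XSLT". The single path expression selects both title elements, which illustrates how a path may refer to multiple nodes.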

22.2.3. Conditions and Loops

22.2.4. Functions

The template element is also used to define functions.
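
As a hedged illustration, XSLT 1.0 lets a template carry a name attribute and be invoked with xsl:call-template, with parameters declared by xsl:param and passed by xsl:with-param. The names greet and name below are hypothetical and serve only as an example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- A named template plays the role of a function.
       The parameter "name" defaults to the string 'World'. -->
  <xsl:template name="greet">
    <xsl:param name="name" select="'World'"/>
    <xsl:text>Hello </xsl:text>
    <xsl:value-of select="$name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Call the "function" twice: once with the default, once with an argument. -->
  <xsl:template match="/">
    <xsl:call-template name="greet"/>
    <xsl:call-template name="greet">
      <xsl:with-param name="name" select="'XSLT'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

Applied to any XML document, such as hello.xml, this should print "Hello World" followed by "Hello XSLT".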