...making Linux just a little more fun!
By Daniel Guerrero
eXtensible Stylesheet Language Transformations (XSLT) is mostly used to transform XML data into HTML, but with XSLT we can transform XML (or anything that uses XML namespaces, like RDF) into whatever we need, including other XML vocabularies and plain text.
The W3C defines XSL (eXtensible Stylesheet Language) as consisting of three parts: XSLT; XPath, an expression language used by XSLT to access or refer to parts of an XML document; and XSL Formatting Objects, an XML vocabulary for specifying formatting semantics.
First of all, we need to declare that our XML document is an XSL stylesheet and bind the xsl prefix to the XSLT namespace:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  ...
</xsl:stylesheet>
After that, the main element we will use is xsl:template. Its match attribute holds an XPath pattern, and the template is called whenever a node of the XML document matches that pattern:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- '/' is taken from XPath and matches the root of the document -->
    <!-- do something with the nodes and attributes below it -->
  </xsl:template>
</xsl:stylesheet>
Inside an xsl:template, we can get the value of a node with the xsl:value-of element, whose select attribute names the element or attribute to read. Let's first write an example XML document with some information:
<!-- hello.xml -->
<hello>
  <text>Hello World!</text>
</hello>
And this is the XSLT that extracts the text element inside the root element (hello):
<!-- hello.xsl -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head>
        <title>Extracting <xsl:value-of select="//text"/> </title>
        <!-- in this case '//text' is 'hello/text', but because
             I'm a lazy person I shorten it with XPath -->
      </head>
      <body>
        <p>
          The <b>text</b> of the root element is: <b><xsl:value-of select="//text"/></b>
        </p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
The HTML output is:
<!-- hello.html -->
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Extracting Hello World! </title>
  </head>
  <body>
    <p>
      The <b>text</b> of the root element is: <b>Hello World!</b>
    </p>
  </body>
</html>
In XPath, @att selects the attribute named att. For example:
<!-- hello_style.xml -->
<hello>
  <text color="red">Hello World!</text>
</hello>
And the XSLT:
<!-- hello_style.xsl -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head>
        <title>Extracting <xsl:value-of select="//text"/> </title>
      </head>
      <body>
        <p>
          The <b>text</b> of the root element is: <b><xsl:value-of select="//text"/></b>
          and its <b>color</b> attribute is: <xsl:value-of select="//text/@color"/>
        </p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
The HTML output will be:
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Extracting Hello World! </title>
  </head>
  <body>
    <p>
      The <b>text</b> of the root element is: <b>Hello World!</b>
      and its <b>color</b> attribute is: red
    </p>
  </body>
</html>
If you are thinking of using this information to display the text Hello World! in red, yes, it's possible, in two ways: storing the value in a variable and using it in the attributes of a font element, for example, or using the xsl:attribute element.
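The xsl:attribute route can be sketched as follows. This is only a minimal illustration, assuming the hello_style.xml document shown above; the file name hello_style_attribute.xsl is hypothetical and not part of the original examples.

```xml
<!-- hello_style_attribute.xsl (hypothetical file name) -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <p>
          <font>
            <!-- xsl:attribute adds an attribute to the enclosing element;
                 it has to come before any child content of that element -->
            <xsl:attribute name="color">
              <xsl:value-of select="//text/@color"/>
            </xsl:attribute>
            <xsl:value-of select="//text"/>
          </font>
        </p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With hello_style.xml as input, the font element should come out with color="red" around the text Hello World!.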
Variables can hold constants or the value of an element.
Assigning a constant is simple:
<!-- variables.xsl -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- definition of the variable -->
    <xsl:variable name="path">https://somedomain/tmp/xslt</xsl:variable>
    <html>
      <head>
        <title>Examples of Variables</title>
      </head>
      <body>
        <p>
          <a href="{$path}/photo.jpg">Photo of my latest travel</a>
        </p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
The HTML output:
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Examples of Variables</title>
  </head>
  <body>
    <p><a href="https://somedomain/tmp/xslt/photo.jpg">Photo of my latest travel</a></p>
  </body>
</html>
You can also set the value of a variable by selecting it from the elements or attributes of a node:
<!-- variables_select.xsl -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head>
        <title>Examples of Variables</title>
      </head>
      <body>
        <xsl:apply-templates select="//photo"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="photo">
    <!-- definition of the variables -->
    <xsl:variable name="path">https://somedomain/tmp/xslt</xsl:variable>
    <xsl:variable name="photo" select="file"/>
    <p>
      <a href="{$path}/{$photo}"><xsl:value-of select="description"/></a>
    </p>
  </xsl:template>
</xsl:stylesheet>
And the XML source (I didn't include images of myself, because I don't want to scare you :-) ):
<!-- variables_select.xml -->
<album>
  <photo>
    <file>mountains.jpg</file>
    <description>me at the mountains</description>
  </photo>
  <photo>
    <file>congress.jpg</file>
    <description>me at the congress</description>
  </photo>
  <photo>
    <file>school.jpg</file>
    <description>me at the school</description>
  </photo>
</album>
And the HTML output:
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Examples of Variables</title>
  </head>
  <body>
    <p><a href="https://somedomain/tmp/xslt/mountains.jpg">me at the mountains</a></p>
    <p><a href="https://somedomain/tmp/xslt/congress.jpg">me at the congress</a></p>
    <p><a href="https://somedomain/tmp/xslt/school.jpg">me at the school</a></p>
  </body>
</html>
Notice that the template matching photo is called three times because of xsl:apply-templates: every time XSLT finds an element selected by it, the xsl:template whose match pattern fits that element is called.
OK, so are you impatient to make the text of hello_style.xml red? Try to do it with variables; if you can't, open this page: misc/danguer/hello_style_variables.xsl
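One possible solution with a variable looks like the sketch below. It is not the author's linked file, whose contents are not shown here; the variable name color is an assumption chosen for illustration, and hello_style.xml is assumed as input.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- store the value of the color attribute in a variable -->
    <xsl:variable name="color" select="//text/@color"/>
    <html>
      <body>
        <p>
          <!-- {$color} is replaced by the value of the variable -->
          <font color="{$color}"><xsl:value-of select="//text"/></font>
        </p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The curly braces work the same way as in the href attribute of the variables.xsl example: inside an attribute of a literal result element, the expression between them is evaluated and its value is written out.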
XSLT can sort the order in which XML elements are processed with <xsl:sort select="sort_by_this_attribute"/>. This element is placed inside an xsl:apply-templates element (it is also allowed inside xsl:for-each). You can sort by an XML element or attribute, in ascending or descending order, and you can also specify the case order (whether lower case comes before upper case, or vice versa).
I will use the album example and add only the sort element:
<xsl:apply-templates select="//photo">
  <xsl:sort select="file" order="descending"/>
</xsl:apply-templates>
This only changes the order in which the photos are written to the HTML. In fact, XSLT first sorts all the photo elements of our XML and then sends them to the matching template in that order; that's why the xsl:sort element goes inside xsl:apply-templates.
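As a further sketch of the sort options mentioned above, the stylesheet below sorts the album by description instead of by file name and states the case order explicitly. It assumes the variables_select.xml document shown earlier; the choice of sort key and the simplified photo template are illustrative only.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="//photo">
          <!-- sort by the description element, ascending,
               with lower case placed before upper case -->
          <xsl:sort select="description" order="ascending"
                    case-order="lower-first"/>
        </xsl:apply-templates>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="photo">
    <p><xsl:value-of select="description"/></p>
  </xsl:template>
</xsl:stylesheet>
```

To sort by an attribute rather than a child element, the select value would use the @ form, for example select="@name" for a hypothetical name attribute.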
The XSL and HTML files are in the examples; you can get them with these links:
There will be cases when you need to output some text if an XML element (or attribute) appears, or other text if it doesn't; the xsl:if element does this for you. Let's imagine you have a page with documents (this example is taken from my 'tests' at the TLDP-ES project) and you know whether the sources of each document were converted to PDF, PS or HTML format. This information is in your XML, so you can test whether the PDF file was generated and put a link to it:
<xsl:if test="format/@pdf = 'yes'">
  <a href="{$doc_path}/{$doc_subpath}/{$doc_subpath}.pdf">PDF</a>
</xsl:if>
If the pdf attribute of the document is yes, as in this example:
<document>
  <title>Bellatrix Library and Semantic Web</title>
  <author>Daniel Guerrero</author>
  <module>bellatrix</module>
  <format pdf="yes" ps="yes" html="yes"/>
</document>
then a link to the document in PDF format is written. If the attribute is 'no', or any other value the XML's DTD allows, no link is written. If you want to check all the XSL and XML documents, they are in:
If you check the XML document of the previous example, you will see that the first document has three authors separated by commas. Obviously, a better way to separate the authors is to put them in separate <author> tags:
<document>
  <title>Donantonio: bibliographic system for automatic distribuited publication. Specifications of Software Requeriments</title>
  <author>Ismael Olea</author>
  <author>Juan Jose Amor</author>
  <author>David Escorial</author>
  <module>donantonio</module>
  <format pdf="yes" ps="no" html="yes"/>
</document>
You might think of using an xsl:apply-templates and an xsl:template to put every name in a separate row, for example. This can be done, but you can also use the xsl:for-each statement:
<xsl:for-each select="author">
  <tr>
    <td>
      Author: <xsl:apply-templates />
    </td>
  </tr>
</xsl:for-each>
In this case, the processor goes through all the authors of the document. If you are wondering which template I wrote to process the authors, there is none: with no matching template, the processor falls back on its built-in rules, so the apply-templates element acts like a 'print' of the text of the element selected by the for-each element.
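The same loop can be written with an explicit xsl:value-of, which makes the 'print' visible. The following is a minimal sketch, assuming document elements like the one above; the surrounding table and the root template are illustrative additions.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="//document"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="document">
    <table>
      <xsl:for-each select="author">
        <tr>
          <!-- '.' is the current node, here the author being visited -->
          <td>Author: <xsl:value-of select="."/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

For the document with three author elements, this should write three rows, one per author.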
The last XSLT element I will show you is xsl:choose, which works like the switch statement of popular languages like C.
First you declare an xsl:choose element, and then you put all the options in xsl:when elements. If none of the when tests is satisfied, you can put an xsl:otherwise element as the default:
<xsl:variable name="even" select="position() mod 2"/>
<xsl:choose>
  <xsl:when test="$even = 1">
    <![CDATA[<table width="100%" bgcolor="#cccccc">]]>
  </xsl:when>
  <xsl:when test="$even = 0">
    <![CDATA[<table width="100%" bgcolor="#99b0bf">]]>
  </xsl:when>
  <xsl:otherwise>
    <![CDATA[<table width="100%" bgcolor="#ffffff">]]>
  </xsl:otherwise>
</xsl:choose>
The position() function returns the position of the element being processed; in the case of the documents, the number increases with each document. Here we only want to know whether a document is even or odd, so we can use one table color for the even numbers and another for the odd numbers. I put the xsl:otherwise only to illustrate its use; in practice there will never be a table with a white background in our library.
Why did I use a CDATA section? Because without it, the processor would demand the closing tag (</table>) inside the same xsl:when, but the table is closed further down; so the closing tag needs a CDATA section as well.
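Text coming from a CDATA section is normally escaped when the result is written out, so the approach above may depend on the processor or on disabling output escaping. An alternative sketch, which is not the author's code, is to let xsl:choose compute only the color, store it in a variable, and write a normal, balanced table element. The variable name bgcolor and the table content are assumptions for illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <!-- selecting only document elements makes position() count documents -->
        <xsl:apply-templates select="//document"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="document">
    <!-- pick the color for odd and even documents -->
    <xsl:variable name="bgcolor">
      <xsl:choose>
        <xsl:when test="position() mod 2 = 1">#cccccc</xsl:when>
        <xsl:otherwise>#99b0bf</xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <table width="100%" bgcolor="{$bgcolor}">
      <tr>
        <td><xsl:value-of select="title"/></td>
      </tr>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

Because the table element is opened and closed inside the same template, no CDATA section is needed for either tag.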
Once again, I had to shorten the code; if you want to see all of it, see these documents:
Saxon is an XSLT processor written in Java. I'm using version 6.5.2, and the following instructions are for this version; for other versions, check the corresponding documentation for running Saxon.
After you have downloaded the Saxon zip file, unzip it:
[danguer@perseo xslt]$ unzip saxon6_5_2.zip
After this, you must include the saxon.jar file in your class path; you can pass the path of the jar to java with the -cp path option.
I will put saxon.jar under the directory xslt. You must tell Java which class to run; in my Saxon version (6.5.2) the class is:
com.icl.saxon.StyleSheet
You also pass as arguments the XML document and the XSLT stylesheet you will use. For example:
[danguer@perseo xslt]$ java -cp saxon.jar com.icl.saxon.StyleSheet document.xml transformation.xsl
This sends the output of the transformation to standard output; you can redirect it to a file with:
[danguer@perseo xslt]$ java -cp saxon.jar com.icl.saxon.StyleSheet document.xml transformation.xsl > file_processed.html
For example, we will transform an XSLT example with Saxon:
[danguer@perseo xslt]$ java -cp saxon.jar com.icl.saxon.StyleSheet cards.xml cards.xsl > cards.html
The hello example from the beginning of this article is processed the same way:
[danguer@perseo xslt]$ java -cp saxon.jar com.icl.saxon.StyleSheet hello.xml hello.xsl > hello.html
xsltproc comes with all the major distributions; its syntax is similar to Saxon's, but the stylesheet comes before the XML document:
[danguer@perseo xslt]$ xsltproc hello.xsl hello.xml > hello.html
I know there are other XSLT processors, like Sablotron, but I haven't used them, so I can't recommend any ;-).
I'm trying to finish my Bachelor's degree at BUAP in Puebla, Mexico. I'm involved with the TLDP-ES project, which has led me to learn all about these technologies; now I'm learning about the Semantic Web.