In our previous issue in the Developer Center, we talked about User Defined Functions (UDFs). Now let's put them to use with XML. Building on UDF functionality, Granularity Information Architecture, Inc., has created the Granularity XML UDF Library to help programmers with the most basic XML tasks. In this article, G. Hussain Chinoy, Chief Architect at the company, explains how to perform basic XML functions, describes the concept behind the XML UDF Library, and walks through some basic examples.

This article was reprinted with the permission of Granularity Information Architecture, Inc. (xml.granularity.com), a consulting firm and website dedicated to using and learning all about XML with the Macromedia server products JRun and ColdFusion.

Introducing the Granularity XML UDF Library
Version 5 of Macromedia's ColdFusion application server brought a greater level of stability and introduced some very useful functionality. One of the most useful new features is the ability to extend the power of ColdFusion by creating User Defined Functions (UDFs). Using the existing CFSCRIPT syntax, ColdFusion programmers can now reuse blocks of CFSCRIPT code as functions, adding their own functions to the base set of CFSCRIPT functions and commands. Granularity Information Architecture, Inc. has created a library of functions to assist ColdFusion programmers in dealing with XML, called the Granularity XML UDF Library, which complements Granularity's giaX™ Suite of custom tags for XML manipulation. This article goes over the basic XML functions programmers require, the concept behind the XML UDF Library, and some basic examples.

Major XML tasks covered
ColdFusion 5.0's scripting language currently lacks direct tags and functions to deal with XML documents that are not WDDX documents. WDDX documents can be created and parsed, but the built-in <cfwddx /> tag cannot read generic XML documents. The most common tasks that face programmers when dealing with XML documents not in WDDX format are:
1. Converting the XML document into a logical structure that the ColdFusion programmer can use within ColdFusion (also known as "parsing the XML document"),
2. Transforming an XML document into another document (HTML, text, PDF, other XML, etc.) via an XSLT stylesheet, and
3. Selecting a portion or portions of the XML document via an XPath expression.

With these three tasks in mind, the Granularity XML UDF Library was born.

The Granularity XML UDF Library contains these functions:

transform()           Enables the transformation of an XML document with an XSLT stylesheet
xpath()               Evaluates an XPath expression
parse()               Parses an XML document via DOM (internal function)
getDocumentElement()  Helper for DOM parsing (internal function)
getNodeType()         Helper for DOM parsing (internal function)
getNodeName()         Helper for DOM parsing (internal function)
getAttributes()       Helper for DOM parsing (internal function)
getLength()           Helper for DOM parsing (internal function)
getChildNodes()       Helper for DOM parsing (internal function)
getChild()            Helper for DOM parsing (internal function)
saxparse()            Parses an XML document via SAX (internal function)
giaX()                Complement function; parses XML into WDDX
XercesVersion()       Displays the current Apache Xerces version
XalanVersion()        Displays the current Apache Xalan version

 

Using the Granularity XML UDF Library
To use a UDF Library in ColdFusion 5.0, the programmer simply includes the library within the CFML template they're using, as follows:

        <cfinclude template="_xml.cfm" />

 

Now the additional extension functions are available within the rest of the CFML template.

It's important that the ColdFusion Administrator be configured with a connection to an existing JVM and that the required XML and XSLT classes be in place. The Granularity XML UDF Library relies on the Apache Software Foundation's Xerces XML parser and Xalan XSLT processor, both written in Java, to integrate XML seamlessly and extend the Macromedia CFSCRIPT language. Because these components are Java-based, the XML UDF Library runs on every ColdFusion 5 platform: Windows, Linux, Solaris, and HP-UX.
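
One way to confirm that a JVM can see an XML parser and an XSLT processor is to ask JAXP which implementations it resolves. The sketch below is an illustration only, written in plain Java rather than CFML. The class name XmlStackCheck is hypothetical and not part of the library, and on a current JDK it reports the bundled implementations rather than separately installed Xerces and Xalan jars.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper, not part of the Granularity XML UDF Library.
final class XmlStackCheck {

    /** Class name of the DOM parser factory that JAXP resolves on this JVM. */
    static String parserImplementation() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    /** Class name of the XSLT processor factory that JAXP resolves on this JVM. */
    static String transformerImplementation() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("DOM parser factory: " + parserImplementation());
        System.out.println("XSLT factory:       " + transformerImplementation());
    }
}
```

If either factory cannot be found, newInstance() fails with a configuration error, which usually means the parser or processor classes are missing from the class path.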

We'll focus on the common uses for XML - transformation, selecting, and parsing - and hopefully, in future articles, expand on advanced uses for XML in ColdFusion.

transform(): Transforming XML with XSLT
Presented with an XML document containing data from a third party, a ColdFusion programmer may want to display the data in a friendlier HTML format. A typical way to do this is to write an XSLT stylesheet that transforms the XML document into HTML. All the ColdFusion programmer needs to accomplish this task is an XSLT processor engine.

Let's consider these two documents, first an XML document:

[ LISTING 1 ]

Next, an XSLT document that defines a transformation:

[ LISTING 2 ]

With the GIA XML UDF Library, the ColdFusion programmer can apply the XSLT stylesheet to the XML document and output HTML with a single call to the transform() function:


Syntax:

transform( URI to XML document, URI to XSLT document );

Returns a string of the transformed result.

Example:

<cfoutput>#transform( "http://127.0.0.1/kitten.xml", "http://127.0.0.1/cats-html.xsl" )#</cfoutput>

Will display:

Name     Gender  Disposition
Abigail  female  smallest in the world
Geordi   male
Jenghis  male    nicest in the world
Newt     female
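
For readers curious about what a function like transform() has to do on the Java side, here is a minimal sketch using the standard JAXP transformation API. It is not the library's source: the class name TransformSketch and its constants are hypothetical, it takes the documents as strings instead of URIs so it can run on its own, and it uses a reduced stylesheet that emits plain text instead of an HTML table.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical stand-in for transform(); not the library's actual source.
final class TransformSketch {

    static final String XML = "<cats><cat name='Newt'/><cat name='Abigail'/></cats>";

    // Reduced stylesheet: sorts the cats by name and emits each name followed by ';'.
    static final String XSL =
          "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='/cats/cat'>"
        + "<xsl:sort order='ascending' select='@name'/>"
        + "<xsl:value-of select='@name'/><xsl:text>;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the document and returns the result as a string. */
    static String transform(String xml, String xsl) {
        try {
            // Compile the stylesheet, then run the document through it.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL)); // prints "Abigail;Newt;"
    }
}
```

As in the table above, the xsl:sort instruction is what puts Abigail ahead of Newt even though Newt comes first in the source document.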

xpath(): Selecting a portion of the XML document via XPath
Sometimes only a subset of an existing XML document is needed. An XPath expression selects that subset, and the GIA XML UDF Library provides the xpath() function to evaluate it.

Syntax:

  xpath( URI to XML document, XPath Expression );

Returns a string of the XML nodes found.

Example:

  <cfoutput>#xpath( "http://127.0.0.1/kitten.xml", "//name" )#</cfoutput>

Will display:

  <name>Jenghis</name><name>Newt</name> <name>Geordi</name><name>Abigail</name>

The result of the xpath() function can be further analyzed via existing ColdFusion string or list functions or tags, or reprocessed via lower level XML parsing functions.
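
The following sketch shows one way a function like xpath() could be built on the Java side: evaluate the expression to a node set, then serialize each matched node back to markup. It is an assumption-laden illustration, not the library's code. The class name XPathSketch is hypothetical, the document is passed as a string rather than a URI, and it uses the javax.xml.xpath API of current JDKs rather than Xalan's own classes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for xpath(); not the library's actual source.
final class XPathSketch {

    static final String XML = "<cats>"
        + "<cat><name>Jenghis</name></cat>"
        + "<cat><name>Newt</name></cat>"
        + "</cats>";

    /** Evaluates the expression and returns the matched nodes serialized as one string. */
    static String xpath(String xml, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);

            // An identity transformer writes each matched node back out as markup.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            for (int i = 0; i < nodes.getLength(); i++) {
                serializer.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
            }
            return out.toString();
        } catch (XPathExpressionException | TransformerException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints the two matched name elements, in document order.
        System.out.println(xpath(XML, "//name"));
    }
}
```

Returning markup rather than node objects keeps the result usable by string and list functions, at the cost of reparsing if the structure is needed again.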


Low-level parsing of an XML document
More often than not, a ColdFusion programmer is presented with an XML document that they need to access but has neither the background nor the time to learn the XSLT language or the XPath syntax. Often, too, a programmer requires custom access to the XML document's data that the facilities in XSLT or XPath can't readily provide.

In order to directly access the data within an XML document and preserve the structure of the document, some in-depth knowledge about XML and the two most common APIs used to traverse XML documents will be necessary. This sort of study is unavoidable in the current version of the XML UDF Library and is arguably a more involved learning process than either XPath or XSLT.

The first API used to access XML documents in a granular fashion is the Document Object Model, or DOM. Once an XML DOM parser reads an XML document, a DOM structure is created. The second common API is called the Simple API for XML, or SAX. SAX is an event-based API, which means that as the XML document is parsed via a SAX XML parser, the programmer must catch and process each portion (elements, attributes, characters and spaces) of the document.
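
To make the event-based model concrete, here is a small Java sketch of SAX parsing with the standard JAXP API. The article does not show the signature of saxparse(), so this does not attempt to reproduce it. The class name SaxSketch and the event log format are hypothetical, and the document is supplied as a string so the example runs on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration of SAX events; not the library's saxparse().
final class SaxSketch {

    static final String XML = "<cats><cat name=\"Newt\"><name>Newt</name></cat></cats>";

    /** Parses the document and records each SAX callback as a line of text. */
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start " + qName + " attrs=" + atts.getLength());
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Ignore whitespace-only runs; keep real text content.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text " + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String event : events(XML)) {
            System.out.println(event);
        }
    }
}
```

Unlike DOM, nothing is retained after a callback returns, so any structure the program needs has to be assembled by the handler itself.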

The Granularity XML UDF Library contains two internal functions to enable DOM and SAX parsing. They're called "internal functions" not only because transform() and xpath() (above) rely on them, but also because they are lower-level functions, and using them requires a deeper understanding of XML APIs.

parse(): DOM Parsing
The Document Object Model represents a document as a hierarchy of nodes in parent-child-sibling relationships. Each element is a node, which may or may not have child nodes. In particular, the "document" node has a single element child, the "root" element, which may in turn have child nodes. A DOM programmer typically iterates through the nodes of the XML document and can pull out particulars of each node, such as the element name, the text contents of the element, the attribute names and their values, and the list of the node's children.

There are many portions of an XML document that the DOM can access. We will concentrate on the most commonly used portions: document, element, attribute, and text.

Since the DOM gives access to XML data and structure in an anonymous parent-child relationship, parsing an XML document also requires knowledge of the document or the DTD (Document Type Definition) to locate and/or extract the particular information needed.

Yes, parsing an XML document via the DOM can be a little complex, but it's the building block to doing some really interesting things with XML, especially higher-level, ease-of-use parsing of documents. We provide an example of DOM parsing and leave the rest to your imagination!

Syntax:

parse( URL to XML document );
    Returns a Java Document object

getDocumentElement( Java Document object );
    Returns a Java Node object

getNodeType( Java Node object );
    Returns a string naming the node type, one of the following types defined by the W3C DOM Level 2 Java bindings:

ELEMENT_NODE
ATTRIBUTE_NODE
TEXT_NODE
CDATA_SECTION_NODE
ENTITY_REFERENCE_NODE
ENTITY_NODE
PROCESSING_INSTRUCTION_NODE
COMMENT_NODE
DOCUMENT_NODE
DOCUMENT_TYPE_NODE
DOCUMENT_FRAGMENT_NODE
NOTATION_NODE

getNodeName( Java Node object );
    Returns a string containing the Node's name.

getAttributes( Java Node object );
    Returns a Java NamedNodeMap, or attribute list object

getChildNodes( Java Node object );
    Returns a Java NodeList object, or a child list

getLength( listObject );
    Returns an integer of the number of children or attributes in a list object

getChild( Java Node object, index );
    Returns a Java Node object at the specified index
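
In the Java DOM bindings themselves, Node.getNodeType() returns a numeric code rather than a string, so a helper that reports names such as ELEMENT_NODE needs a mapping step. The sketch below shows one plausible way to write that mapping. The class name NodeTypeNames and the "UNKNOWN" fallback are hypothetical, and this is not taken from the library's source.

```java
import org.w3c.dom.Node;

// Hypothetical mapping from DOM node type codes to their constant names.
final class NodeTypeNames {

    /** Returns the W3C DOM constant name for a node type code. */
    static String nameOf(short type) {
        switch (type) {
            case Node.ELEMENT_NODE:                return "ELEMENT_NODE";
            case Node.ATTRIBUTE_NODE:              return "ATTRIBUTE_NODE";
            case Node.TEXT_NODE:                   return "TEXT_NODE";
            case Node.CDATA_SECTION_NODE:          return "CDATA_SECTION_NODE";
            case Node.ENTITY_REFERENCE_NODE:       return "ENTITY_REFERENCE_NODE";
            case Node.ENTITY_NODE:                 return "ENTITY_NODE";
            case Node.PROCESSING_INSTRUCTION_NODE: return "PROCESSING_INSTRUCTION_NODE";
            case Node.COMMENT_NODE:                return "COMMENT_NODE";
            case Node.DOCUMENT_NODE:               return "DOCUMENT_NODE";
            case Node.DOCUMENT_TYPE_NODE:          return "DOCUMENT_TYPE_NODE";
            case Node.DOCUMENT_FRAGMENT_NODE:      return "DOCUMENT_FRAGMENT_NODE";
            case Node.NOTATION_NODE:               return "NOTATION_NODE";
            default:                               return "UNKNOWN"; // assumed fallback
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOf(Node.DOCUMENT_NODE)); // prints "DOCUMENT_NODE"
    }
}
```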

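Example
The DOM helper functions from the XML UDF Library used below are parse(), getNodeType(), getNodeName(), getDocumentElement(), getAttributes(), getChildNodes(), getLength(), and getChild().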

<cfscript>

// Use the Granularity XML UDF Library to parse the document (DOM)
document = parse("http://127.0.0.1/test/kitten.xml");

/**
 * Specific DOM tree traversal
 */
// Document (document)
WriteOutput("(" & getNodeType(document) & ") " & getNodeName(document) & "<br />");

// Root element (rootElement), its attributes, and its child nodes
rootElement = getDocumentElement(document);
rootElementAttrs = getAttributes(rootElement);
children = getChildNodes(rootElement);

WriteOutput("(" & getNodeType(rootElement) & ") <code>" & getNodeName(rootElement) & "</code> ");
WriteOutput("(attrs: " & getLength(rootElementAttrs) & ", children: " & getLength(children) & ") <br />");

// Children of the root element (thisChild, grandchildren)
WriteOutput("<ol>");
for (i = 0; i LT getLength(children); i = i + 1) {
    thisChild = getChild(children, i);
    thisChildAttrs = getAttributes(thisChild);
    grandchildren = getChildNodes(thisChild);

    // Report element nodes only; skip text and other node types
    if (getNodeType(thisChild) is "ELEMENT_NODE") {
        WriteOutput("<li>");
        WriteOutput("(" & getNodeType(thisChild) & ") <code>" & getNodeName(thisChild) & "</code> ");
        WriteOutput("(attrs: " & getLength(thisChildAttrs) & ", children: " & getLength(grandchildren) & ") <br />");
        WriteOutput("</li>");
    }
}
WriteOutput("</ol>");
</cfscript>

Will output:

(DOCUMENT_NODE) #document
(ELEMENT_NODE) cats (attrs: 0, children: 4)
1. (ELEMENT_NODE) cat (attrs: 2, children: 2)
2. (ELEMENT_NODE) cat (attrs: 2, children: 1)
3. (ELEMENT_NODE) cat (attrs: 2, children: 1)
4. (ELEMENT_NODE) cat (attrs: 2, children: 2)
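
The ELEMENT_NODE test inside the loop is worth a closer look. With a default JAXP DOM parser, line breaks and indentation between tags are reported as TEXT_NODE children, so a raw child count can be larger than the number of child elements. The Java sketch below demonstrates that underlying DOM behavior only. The class name DomChildrenSketch and its sample document are hypothetical, and it makes no claim about how the library's getLength() treats whitespace.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical illustration of whitespace text nodes in a DOM tree.
final class DomChildrenSketch {

    // Two cat elements, separated by line breaks and indentation.
    static final String XML =
          "<cats>\n"
        + "  <cat name=\"Newt\" gender=\"female\"><name>Newt</name></cat>\n"
        + "  <cat name=\"Geordi\" gender=\"male\"><name>Geordi</name></cat>\n"
        + "</cats>";

    /** Parses the string with a default DOM parser and returns the root element. */
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    /** Counts every child node of the root, whitespace text nodes included. */
    static int allChildren(String xml) {
        return root(xml).getChildNodes().getLength();
    }

    /** Counts only the children of the root that are elements. */
    static int elementChildren(String xml) {
        NodeList children = root(xml).getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("all children:     " + allChildren(XML));     // 5
        System.out.println("element children: " + elementChildren(XML)); // 2
    }
}
```

Filtering on the node type, as the CFSCRIPT example does, keeps the traversal correct whether or not the source document is indented.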


Wrapping up
With ColdFusion 5's new user-defined function support and the Granularity XML UDF Library, using XML in ColdFusion becomes straightforward. The library contains the functions needed for the most basic XML tasks: parse(), transform(), and xpath().

The Granularity XML UDF Library contains other functions such as saxparse() for SAX parsing (the alternative to DOM parsing), giaX() to hook into Granularity's tag-based XML parser, as well as helper functions to make using XML and ColdFusion easier.

We hope to add more powerful and easier-to-use functions in the next release of the XML UDF Library, so we welcome readers' feedback on how they use XML and what they'd like to see.

Code listings

Listing 1: kitten.xml

<?xml version="1.0"?>
<cats>
    <cat name="Jenghis" gender="male">
        <name>Jenghis</name>
        <disposition>nicest in the world</disposition>
    </cat>
    <cat name="Newt" gender="female">
        <name>Newt</name>
    </cat>
    <cat name="Geordi" gender="male">
        <name>Geordi</name>
    </cat>
    <cat name="Abigail" gender="female">
        <name>Abigail</name>
        <disposition>smallest in the world</disposition>
    </cat>
</cats>


Listing 2: cats-html.xsl

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

    <xsl:output method="html" indent="yes"/>

    <xsl:template match="/">
        <xsl:call-template name="HTMLFormat"/>
    </xsl:template>

    <xsl:template name="HTMLFormat">
        <html>
            <head>
                <title>Cats!</title>
            </head>
            <body>

                <table border="1" cellspacing="0" cellpadding="1">
                    <tr bgcolor="#F0F0F0">
                        <td>Name</td>
                        <td>Gender</td>
                        <td>Disposition</td>
                    </tr>

                    <xsl:for-each select="/cats/cat">
                        <xsl:sort order="ascending" select="@name" />
                        <tr>
                            <td><xsl:value-of select="@name" /></td>
                            <td><xsl:value-of select="@gender" /></td>
                            <td><xsl:value-of select="./disposition" /></td>
                        </tr>
                    </xsl:for-each>

                </table>

            </body>
        </html>
    </xsl:template>

</xsl:stylesheet>


Granularity Information Architecture, Inc.
1006 Robertson Street
Building 1A Suite 104
Fort Collins, CO 80524
Phone 970.224.1329
Fax 801.697.4180
xmltechnologies@granularity.net
http://xml.granularity.com/