Kurt Cagle

Related Topics: XML Magazine

XML Query and the Next Generation Web

Anyone who has ever done a search query on the Internet is familiar with the phenomenon in which a single query pulls up more than a million possible search matches. This has to do with the fact that information is ultimately not linear, but rather is linked and interrelated in ways that can't be quantified easily through text searches.

When I was in college at the University of Illinois (way too many years ago to want to dwell on here), I spent an inordinate amount of time in the library - sometimes researching things for class, more often just researching things for my own interest. The library there was something of a marvel, with row upon row of cards in their card catalog, designed in such a way that you could - if you knew the author or title of a book - determine where it was in the million-plus book stacks.

Of course, this works only when the books have titles that correspond to the information you're looking for, and even then only if the first word of the title has sufficiently relevant information. Thus (accounting for the dropping of obvious keywords) A Survey of Theory on Musical Trends and Survey Techniques for Municipalities would be in the same general proximity, even though they have remarkably little in common.

While the card catalog is still there as part of the atmosphere of the library, it's now years out of date. In its stead, computers with browser interfaces communicate with a large centralized database capable of storing relevant abstracts and keywords, information that was in the cards in the first place but couldn't be indexed easily. The result, however, is that one method of indexing that probably missed too much information has been replaced by another that returns too much.

The Web is inexorably moving to an XML basis, though perhaps not in ways that anyone would have predicted in 1996 when XML was born. Back then, the vision seemed to be that content would exist in XML documents that could be referenced through a series of pointers that included a way of navigating to specific elements. XPointer still exists, but it's been adopted only marginally, in great part because a significant amount of the XML that now exists on the Web comes not from static documents but rather from large databases or autogenerated server applications.

XPath, on the other hand, has become an integral part of XSLT (though not, with the exception of the Microsoft parser, DOM). The reason has to do with the need to be able to retrieve sets of nodes from a given XML document. XSLT in fact is a very simple language by itself - it provides a logical structure to tie together matches, defines variables and parameter methods, and handles the generation of output. However, without the ability that XPath brings - either to generate or to manipulate sets of nodes - XSLT would be virtually worthless.

Unfortunately, however, XPath itself has some serious limitations. It is fundamentally document oriented. A path in general extends from a single root node, can become fairly complex to write, and by itself can't retain state information - it's dependent on XSLT to do that. Also, in the current implementation, you can't use XSLT to generate functions that can be referenced from within XPath, which means that XSLT implementations of complex functions can become extremely verbose. Finally, XPath can get fairly cryptic, especially when an XPath expression has to reference an item indirectly through an IDREF or similar construct.

Introducing XML Query
The database world faced a similar problem in the mid-1980s. Relational databases had been slowly but steadily replacing flat databases for some time, but every database company had its own set of commands and terminology to access the database. This in turn meant that you had to educate your programmers every time you changed databases, and it also meant that the only people who could access your database directly were the programmers.

After some wary circling on the parts of the various combatants...er, participants, the SQL standard finally emerged as a common format for performing database operations, regardless of the vendor. While considerable variation exists among the various implementations of SQL, these variations are still relatively minor compared to having to learn a completely different programming language for each database you support. The initial version, adopted as an ANSI standard in 1986 and as an ISO standard in 1987, covered most of the basic operations for both manipulating data and creating the corresponding data structures. SQL-89 included several fixes that made the model more robust after a few years of experimentation.

The requirements for XML Query are roughly analogous to those that resulted in the later SQL-89 specification, with a typical XML twist. One of the primary issues that many people who work with XPath have with the language is that it can often be incredibly terse and cryptic, requiring an intimate association with the data that, while powerful, is often confusing. For instance, suppose you have two XML documents, one giving authors' names and the ISBN numbers for each book they've published, the second consisting of book titles with their associated ISBN numbers. You want to produce a list showing the titles of a given set of books by a specific author. With XPath, to get the set of such books can require a fairly cryptic expression:

<xsl:variable name="bookTitles" select=
"document('books.xml')//book[isbn =
(document('authors.xml')//author[.='Kurt Cagle']/isbn)]/title"/>
While such an expression is reasonably compact, it is, to put it mildly, more than a little hard to follow (especially for people who've spent a lot of time working with SQL-like expressions). XML Query is intended to make such a query more readable, and to produce simple node-set, text, and data results on the side. The foregoing expression, for instance, may be rendered as follows in XML Query:

FOR $book IN document("books.xml")//book,
$author IN document("authors.xml")//author
WHERE
$author="Kurt Cagle" AND
$book/isbn = $author/isbn
RETURN
$book/title
This still uses XPath (which is considered a part of XML Query), but the syntax is considerably more legible, if somewhat more verbose. Indeed, you can see echoes of SQL here:

SELECT title
FROM books.book, authors.author
WHERE authors.author.isbn = books.book.isbn
AND authors.author = 'Kurt Cagle'
The result of the query operation returns a set of XML nodes, though, unlike XPath, the possible set of items returned can be considerably more complex - not only scalar types such as dates, numbers, monetary amounts, or strings, but also more formal complex types that are definable within XML Schema. It is this ability to return more formal types that makes XML Query attractive to database developers, especially given that both XPath and XSLT are essentially type-agnostic languages.
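For readers more at home in a general-purpose language, the two-document join above can be sketched in Python with the standard library. This is only an illustration: the article never shows books.xml or authors.xml, so the element layout (including the name child on each author) and the helper titles_by_author are assumptions, and the code mimics the FOR/WHERE/RETURN shape rather than implementing XML Query.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for books.xml and authors.xml (structure assumed).
JOIN_BOOKS_XML = (
    "<books>"
    "<book><isbn>111</isbn><title>First Title</title></book>"
    "<book><isbn>222</isbn><title>Second Title</title></book>"
    "<book><isbn>333</isbn><title>Third Title</title></book>"
    "</books>"
)
JOIN_AUTHORS_XML = (
    "<authors>"
    "<author><name>Kurt Cagle</name><isbn>111</isbn><isbn>333</isbn></author>"
    "<author><name>Another Author</name><isbn>222</isbn></author>"
    "</authors>"
)


def titles_by_author(books_root, authors_root, author_name):
    """Return titles of books whose ISBN is listed under the named author."""
    titles = []
    for book in books_root.iter("book"):            # FOR $book IN ...//book
        for author in authors_root.iter("author"):  # $author IN ...//author
            author_isbns = {e.text for e in author.findall("isbn")}
            # WHERE the author matches AND the ISBNs agree
            if (author.findtext("name") == author_name
                    and book.findtext("isbn") in author_isbns):
                titles.append(book.findtext("title"))  # RETURN $book/title
    return titles


kurt_titles = titles_by_author(
    ET.fromstring(JOIN_BOOKS_XML), ET.fromstring(JOIN_AUTHORS_XML), "Kurt Cagle"
)
```

The nested loops make the cost of a naive join visible; a query processor would be free to index the ISBNs instead.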

XML Query actually consists of a number of different languages that interrelate with one another. The first is actually an extended version of XPath that can be used to reference any specific node or set of nodes in a document tree. In addition to the base functions of XPath, however, XML Query also uses two new operators, the RANGE operator and the dereference operator.

RANGE is primarily a shortcut. For instance, the expression:

document("books.xml")//book[RANGE 2 to 5]/title
will retrieve the title of the second through fifth book in the node-set selected by the XPath expression, and is basically the same as:

document("books.xml")//book[position() >= 2 and position() <= 5]/title
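The equivalence between RANGE and the position() test amounts to an inclusive, 1-based slice. A minimal Python sketch, assuming a hypothetical books.xml in which all book elements share one parent (the helper name titles_in_range is also made up for illustration):

```python
import xml.etree.ElementTree as ET

# Hypothetical books.xml with seven sibling <book> elements.
RANGE_BOOKS_XML = (
    "<books>"
    + "".join(f"<book><title>Book {n}</title></book>" for n in range(1, 8))
    + "</books>"
)


def titles_in_range(root, first, last):
    """Titles of the first-th through last-th book, counting from 1."""
    books = root.findall(".//book")
    # XPath positions start at 1 and the range includes both ends,
    # so RANGE first TO last maps to the slice [first - 1 : last].
    return [book.findtext("title") for book in books[first - 1:last]]


range_titles = titles_in_range(ET.fromstring(RANGE_BOOKS_XML), 2, 5)
```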
The dereference operator ("->"), on the other hand, is a powerful new construct that makes it possible to use a reference to retrieve a specific node in a manner similar to that used by the id() function or the set of id and idref attributes. For instance, suppose you had a catalog that had the structure given in Listing 1. You could then retrieve a pointer from a given author to a list of all possible book titles:

//author[.='Kurt Cagle']//isbn/@refid->book/title
In essence, such a pointer works in the same way that the id() function does in XPath expressions, but is designed to be a little more legible, especially when more than one dereference is involved. Note that it does, however, require a schema declaring the @refid and @isbn attributes as being of type IDREF and ID, respectively.
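Conceptually, dereferencing is an ID lookup followed by ordinary navigation. The Python sketch below illustrates that idea only. Listing 1 is not reproduced here, so the catalog layout (isbn elements carrying a refid attribute, book elements carrying an isbn attribute that acts as the ID) is an assumption, as is the helper name dereference_titles.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; the real Listing 1 may be structured differently.
DEREF_CATALOG_XML = (
    "<catalog>"
    "<author><name>Kurt Cagle</name>"
    "<isbn refid='b1'/><isbn refid='b3'/></author>"
    "<book isbn='b1'><title>First Title</title></book>"
    "<book isbn='b2'><title>Second Title</title></book>"
    "<book isbn='b3'><title>Third Title</title></book>"
    "</catalog>"
)


def dereference_titles(root, author_name):
    """Follow each isbn/@refid of the named author to its book, return titles."""
    # Index every book by its ID attribute, as a schema-aware processor might.
    books_by_id = {book.get("isbn"): book for book in root.iter("book")}
    titles = []
    for author in root.iter("author"):
        if author.findtext("name") != author_name:
            continue
        for ref in author.iter("isbn"):
            target = books_by_id.get(ref.get("refid"))  # the "->" step
            if target is not None:
                titles.append(target.findtext("title"))  # .../title
    return titles


deref_titles = dereference_titles(ET.fromstring(DEREF_CATALOG_XML), "Kurt Cagle")
```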

Pretty FLWRs
If XML Query were simply an extension of XPath, it would seem to make more sense to produce an XPath 2.0 iteration. The primary reason for developing XML Query is still legibility, which it achieves by creating a structure that bears a fair resemblance to SQL. Specifically, XML Query uses four new key structures: FOR, LET, WHERE, and RETURN, which together are known as the FLWR (pronounced "flower") language.

With FLWR you can define variables that contain node-sets, scalar or complex data types that let you do a certain amount of intermediate processing. Additionally, you can perform conditional tests, use the core function set of XPath along with a number of additions, and even define your own functions. Finally, FLWR structures are defined to produce output, and in that respect they perform in a manner comparable to simple XSLT templates.

FLWR gives you two operators for defining variables. The LET operator lets you define either a node-set or a scalar variable, something analogous to an XSLT <xsl:variable> call. For instance, to retrieve all books in the file books.xml, you may have an expression that looks like:

LET $books := document("books.xml")//book
Note that such an assignment means that the variable $books now contains a node-set and doesn't perform any explicit iteration. You can, however, iterate through each book in a set of books using the FOR operator:

FOR $book IN document("books.xml")//book
In this case the function iterates over each node in the XPath expression and assigns it to the variable (here $book), which can then be manipulated to produce output:

FOR $book IN document("books.xml")//book
RETURN $book/title
LET and FOR can be combined:

LET $books := document("books.xml")//book
FOR $book IN $books
RETURN $book/title

Here, $books contains a node-set of book nodes that are then iterated via the FOR $book IN $books statement. Note that this is basically analogous to the <xsl:for-each> statement in XSLT.
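The difference between the two binding forms may be easier to see in a procedural language. In this rough Python analogy (the XML content and variable names are assumptions, since books.xml is not shown), LET corresponds to a single assignment of the whole list, while FOR corresponds to a loop that binds one node at a time:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for books.xml.
LET_FOR_XML = (
    "<books>"
    "<book><title>First Title</title></book>"
    "<book><title>Second Title</title></book>"
    "<book><title>Third Title</title></book>"
    "</books>"
)
let_for_root = ET.fromstring(LET_FOR_XML)

# LET $books := document("books.xml")//book
# One binding: the name refers to the whole list, and nothing is iterated yet.
let_books = let_for_root.findall(".//book")

# FOR $book IN $books RETURN $book/title
# One binding per node: the body is evaluated once for each book.
for_titles = [book.findtext("title") for book in let_books]
```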

The WHERE statement in turn performs much the same function as a predicate expression in XPath or a WHERE statement in SQL. It effectively provides a way to qualify the set, pruning out those nodes that don't satisfy the given condition. The WHERE statement is optional, under the assumption that if it isn't included, all nodes defined in the preceding FOR or LET statements are then passed to the RETURN operator.

The RETURN portion of an XML Query operator handles the actual generation of the resultant code. Typically, an XML Query will itself be contained within the body of an XML document, depending on the parsing processor, and the RETURN body in turn performs a function analogous to an XSL template. The following code, for example, will generate a list of books that contain XML somewhere in the title of the book:

<catalog>
FOR $book IN document("books.xml")//book
WHERE contains($book/title,'XML')
RETURN $book
</catalog>
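As a loose analogy, the same filter-and-wrap pattern can be sketched in Python: iterate (FOR), test the title (WHERE), and append each surviving node to a new catalog element (RETURN). The sample data and the function name xml_catalog are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for books.xml.
FILTER_BOOKS_XML = (
    "<books>"
    "<book><title>XML Basics</title></book>"
    "<book><title>Gardening at Home</title></book>"
    "<book><title>Querying XML Data</title></book>"
    "</books>"
)


def xml_catalog(root, keyword):
    """Build a <catalog> holding every book whose title contains keyword."""
    catalog = ET.Element("catalog")          # the literal wrapper element
    for book in root.iter("book"):           # FOR $book IN ...//book
        title = book.findtext("title") or ""
        if keyword in title:                 # WHERE contains(title, keyword)
            catalog.append(book)             # RETURN $book
    return catalog


filtered_catalog = xml_catalog(ET.fromstring(FILTER_BOOKS_XML), "XML")
filtered_titles = [b.findtext("title") for b in filtered_catalog]
```

Calling ET.tostring(filtered_catalog, encoding="unicode") would serialize the result as a single catalog document.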
It's worth mentioning that you can have additional FLWR expressions within the RETURN portion of a given FLWR expression. For example, the code in Listing 2 would list the books for each unique publisher, sorted by price in descending order.

This example, taken from the XML Query working draft, illustrates the use of a number of functions - distinct(), which eliminates duplicate nodes, and the SORTBY operator, which sorts elements based on a specific element or attribute (here, price for $book and name for $publisher). Note that such a query is still a well-formed XML expression.
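Since Listing 2 is not reproduced here, the following Python sketch only approximates the behavior described: an outer loop over distinct publishers sorted by name, with an inner loop over each publisher's books sorted by descending price. The publisher names, prices, and the function books_by_publisher are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical bibliography data.
NESTED_BOOKS_XML = (
    "<bib>"
    "<book><title>A</title><publisher>Beta Books</publisher><price>39.99</price></book>"
    "<book><title>B</title><publisher>Alpha Press</publisher><price>29.95</price></book>"
    "<book><title>C</title><publisher>Beta Books</publisher><price>49.99</price></book>"
    "<book><title>D</title><publisher>Alpha Press</publisher><price>34.95</price></book>"
    "</bib>"
)


def books_by_publisher(root):
    """Group books under each distinct publisher, priciest first."""
    all_books = list(root.iter("book"))
    # distinct(...) plus SORTBY(name): a sorted set of publisher names
    publishers = sorted({b.findtext("publisher") for b in all_books})
    result = []
    for publisher in publishers:                          # outer FOR
        own = [b for b in all_books
               if b.findtext("publisher") == publisher]   # inner FOR + predicate
        # SORTBY(price DESCENDING); prices are compared as numbers
        own.sort(key=lambda b: float(b.findtext("price")), reverse=True)
        result.append(
            (publisher,
             [(b.findtext("title"), float(b.findtext("price"))) for b in own])
        )
    return result


grouped_books = books_by_publisher(ET.fromstring(NESTED_BOOKS_XML))
```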

You can also use conditional expressions in your output, with IF/THEN/ELSE (see Listing 3).

Data Types and Functions
Two of the charges frequently leveled at XPath are its inability to define internal functional notation and its extremely limited type support (in essence, node-sets, nodes, strings, and numbers). This is in fact one of the primary motivators for an XML Query language, though it's likely that XPath 2.0 will also support both types and user functions.

XML Query will lean heavily on the XML Schema specifications (especially XML Schema Part 2: Datatypes) for working with data types, though that support is currently one of the least well defined areas in the specification. Because XML Query performs sorting and comparison operations, it's up to the particular schema to define the ordering mechanism for a given type - whether one expression is "less than" or "greater than" another expression for the specified type.

In addition to making it easier to work with types such as dates, this mechanism will also make it possible to create casts that are analogous to casting from one type to another in a strongly typed language such as C++. At a minimum, this will make it easier to compare an integer and a floating point number, but it will really come into its own when it becomes possible for a given set of XML to inherit a complex data type and then cast back to an abstract type. For instance, both an American and an English address block can be cast back to a generic address for purposes of generating queries.

XML Query will also make it possible to create functions that can be invoked from within XQuery/XPath expressions. As an example, the depth() function could be defined to ascertain the maximum depth (the number of nodes from the root to the farthest leaf) for a given node (see Listing 4).

Notice that this particular function is recursive; it calls itself with the children of the current element until no node is found that has any more children. Such a definition is highly procedural in its form, complete with the return data type given as an integer. There are provisions in the language to handle the output to user-defined types as well, via schemas defined within the XQuery call.
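The recursion described above translates almost directly into other languages. A minimal Python sketch of the same idea, written against the standard library's ElementTree rather than XML Query (Listing 4 itself is not reproduced, so the details are an assumption):

```python
import xml.etree.ElementTree as ET


def depth(element):
    """Number of elements on the longest path from element down to a leaf."""
    children = list(element)
    if not children:
        # Base case: an element with no child elements has depth 1.
        return 1
    # Recursive case: one level for this element plus the deepest child.
    return 1 + max(depth(child) for child in children)


# Hypothetical sample tree: a -> b -> c is the longest path.
depth_sample = ET.fromstring("<a><b><c/></b><d/></a>")
sample_depth = depth(depth_sample)
```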

XML Query vs XSLT?
XML Query is a language for performing queries against XML data sources and generating result node-sets in XML or related languages. XSLT is a language for processing XML data sources and generating result node-sets in XML or related languages. In other words, there's a lot of overlap between the two specifications, a point that has caused no little consternation among the users of XSLT.

In both cases there is a requirement for a processor that actually performs the filtering or transformation based on the XSLT or XQuery code. An interesting exercise would be to use something like a regular expression engine to convert XQuery into augmented XSLT code, but it's likely that XQuery will quickly be adopted by most of the major SQL vendors.

The two operate within similar niches. The primary differences between the two stem from the way programming is done with them. XML Query is a very procedural solution to XML manipulation, although, like XSLT, it operates on sets of nodes rather than on single instances of data. XSLT, on the other hand, is very much oriented around pattern-matching algorithms, making for programs that are comparatively terse yet very powerful.

Perhaps the best way to explain the difference between the two would be to see XML Query as the C++ solution to XML programming, oriented toward solving problems in a very linear fashion, while XSLT is much more akin to working with regular expressions in a language like Perl. One regular expression, especially applied recursively, can typically perform as much work as a thousand-line program written in C++. However, this power comes at the cost of terseness, difficulty in writing, and, occasionally, performance. The same thing holds true of XSLT, which is capable of incredibly sophisticated behavior but often at the cost of being unintelligible to the average user.

I suspect that the relationship of XML Query to XSLT will in fact develop much the same basic synergy as Visual Basic has with C++. For simple transformations (those that typically involve fairly linear mapping) XML Query will probably end up being the preferred mechanism, especially when working against a database. XSLT, however, will likely become the power tool application of the Internet, handling routing, sophisticated transformations of XML sources, and the creation of complex applications. Either way, the upshot will likely be even more development moving into the realm of XML from both the COM and Java sides.

More Stories By Kurt Cagle

Kurt Cagle is a developer and author, with nearly 20 books to his name and several dozen articles. He writes about Web technologies, open source, Java, and .NET programming issues. He has also worked with Microsoft and others to develop white papers on these technologies. He is the owner of Cagle Communications and a co-author of Real-World AJAX: Secrets of the Masters (SYS-CON books, 2006).
