URL-rewriting and dispatching in XQuery

I've always wanted to create websites with those cool REST-style URIs, but I've been stuck with HTML-form-style query strings, which also expose the script name. High time to do something about this weakness. I have an associated problem with dispatching: calling the appropriate XQuery function depending on the incoming parameters gets messy.

Background

eXist provides a powerful framework for URL rewriting which also provides pipelining. Rewriting is configured in the eXist configuration files and by controllers which can reside in the database. I have used this approach in the past, but of itself it doesn't address the problem of matching resource paths to functions in the code, and I have found it a bit tricky to configure. At XML Prague this year, Adam Retter showed an alternative approach using the XQuery 3.0 feature of annotations.

This blog post describes the approach I'm experimenting with, just for HTTP GET operations at present. No doubt other developers have their own styles which it would be interesting to compare.

Example Site

Here is a toy example site developed by Maia (age 6) to improve her maths.  (ignore the funny domain - it was just one I had hanging around)

http://ourstreet.info/math

The application is intended to provide multiple sets of exercises with randomly-generated variants and some attempt at diagnosing mistakes. It is tailored to Maia's family and her favourite colour!

Apache

I thought I'd look at a solution using Apache, which I'm already using for virtual hosting (Apache2 on Ubuntu), so the initial rewriting will be done with mod_rewrite. Since the site uses a number of different paths, it would be tedious and inflexible to rewrite all the possible URLs this way, so I decided to pass the whole path as a parameter to a main script, appending the original query string.

The Apache virtual host file created for the domain contains the Proxy directives and rewrite rules:



   ......
   ProxyPass / http://localhost:8080/exist/rest/db/apps/Maths/
   ...
   RewriteMap escape int:escape
   RewriteRule ^/$ /math/ [R]
   RewriteRule ^/math$ /math/ [R]
   RewriteRule ^/math/(.*) /xquery/home.xq?_path=${escape:$1} [QSA,P]
 

The application resides in the collection /db/apps/Maths, with a main script called home.xq in the xquery subdirectory. The first two rules normalise short paths to /math/. The third captures the path after /math/ and passes it, after escaping, as the parameter _path. The query string on the original URL is appended, as directed by the QSA flag.

So a URL like

http://ourstreet.info/math/set/1/exercise/2

will be rewritten as

http://localhost... Math/xquery/home.xq?_path=set/1/exercise/2

and

http://ourstreet.info/math/set/1/exercise/2/variant/7,4/response?answer=7

will be rewritten as

http://localhost... ..Math/xquery/home.xq?_path=set/1/exercise/2/variant/7,4/response&answer=7

XQuery database structure 

These paths are mapped into a sub-collection Maths of the apps collection in the eXist database.

/db/apps/Maths/
    xquery  -- XQuery scripts including home.xq
    lib     -- XQuery modules
    data    -- data files such as the exercises
    system  -- configuration files etc.
    css, jscript etc.

/db/apps/Maths/math is a virtual collection on the same level as the other Maths collections.  Common XQuery libraries are placed in /db/lib/ 

The context object

A common pattern I use in XQuery applications is to gather all the parameters and any other environment variables of interest into a context object (sorry, node) to pass into functions. This may seem a bit heavy but it's very flexible. Query parameters are converted to child elements of the context. Now we also need to parse the _path string. To simplify parsing, I've assumed that types and type values alternate in the path, and that a trailing type with no value becomes an empty element. So



set/1  => <set>1</set>


set    =>  <set/>


set/1/exercise/2/variant/12,4/response  =>


    <set>1</set>
    <exercise>2</exercise>
    <variant>12,4</variant>
    <response/>
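
As a rough illustration of the alternating name/value convention, here is a minimal sketch of how the path could be turned into context elements. The function name local:path-to-elements and the wrapping context element are my own placeholders, not taken from the application's code, and the sketch assumes every name in the path is a valid XML element name.

```xquery
xquery version "3.0";

(: Hypothetical helper, not the application's own code.
   Odd-numbered steps are treated as element names, the step after each as its value. :)
declare function local:path-to-elements($path as xs:string) as element()* {
    (: drop empty tokens caused by leading, trailing or doubled slashes :)
    let $steps := tokenize($path, "/")[. ne ""]
    for $name at $i in $steps
    let $value := $steps[$i + 1]   (: empty sequence when the path ends on a name :)
    where $i mod 2 = 1
    return element { $name } { $value }
};

<context>{ local:path-to-elements("set/1/exercise/2/variant/12,4/response") }</context>
```

For the path used above this should give the same set, exercise, variant and empty response elements, wrapped in a context element. A step in a name position that is not a legal element name (a number, say) would raise an error, so a real version would need to validate the path first.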


Dispatching

The appropriate function to generate an HTML page can be selected on the basis of a 'signature' derived from the path by replacing the value parts of the path with "*":

set/1/exercise/2  => set/*/exercise/*
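
The derivation of a signature can be sketched in a few lines. The function name local:signature is a placeholder of mine, and the sketch relies on the same assumption as the rest of the post: names and values strictly alternate, starting with a name.

```xquery
xquery version "3.0";

(: Hypothetical helper: replace every even-numbered step (a value) with "*". :)
declare function local:signature($path as xs:string) as xs:string {
    let $steps := tokenize($path, "/")[. ne ""]
    return string-join(
        (for $step at $i in $steps
         return if ($i mod 2 = 0) then "*" else $step),
        "/")
};

(
    local:signature("set/1/exercise/2"),
    local:signature("set/1/exercise/2/variant/12,4/response")
)
```

The two calls should yield set/*/exercise/* and set/*/exercise/*/variant/*/response, and an empty path yields the empty string, which is the signature the home page is dispatched on.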

An XQuery dispatch function calls the appropriate function:



(: Map the signature derived from the request path to the function that builds the page. :)
declare function tm:dispatch($signature, $context) {
    if ($signature eq '') then tm:home()
    else if ($signature eq 'set') then tm:home()
    else if ($signature eq 'doc/*') then tm:doc($context)
    else if ($signature eq 'set/*') then tm:set($context)
    else if ($signature eq 'set/*/exercise') then tm:set($context)
    else if ($signature eq 'set/*/exercise/*') then tm:exercise($context)
    else if ($signature eq 'set/*/exercise/*/variant/*/response') then tm:answer($context)
    else if ($signature eq 'set/*/worksheet-form') then tm:worksheet-form($context)
    else if ($signature eq 'set/*/worksheet') then tm:worksheet($context)
    else ()  (: unrecognised signature: return nothing :)
};



The dispatcher can contain multiple signatures for the same function to support alternative paths to the same endpoint.

Thus the function tm:exercise($context) could be associated with  the signature exercise/*/set/* as well as the signature set/*/exercise/* 

[Aside] I originally used an XML table of signatures and functions, and either selected from it and called util:eval(), or generated the text of the function above from the table. On reflection, the cost of the machinery of these approaches doesn't seem worthwhile. The switch expression of XQuery 3.0 will improve the dispatch code.
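
To give an idea of what the XQuery 3.0 version might look like, here is a sketch of a cut-down dispatcher written with a switch expression. The local: functions are hypothetical stand-ins for the tm: page functions so that the fragment runs on its own, and only a few of the signatures are shown.

```xquery
xquery version "3.0";

(: Hypothetical stand-ins for tm:home(), tm:set() and tm:exercise(). :)
declare function local:home() { <page>home</page> };
declare function local:set($context as element()) { <page>set {$context/set/string()}</page> };
declare function local:exercise($context as element()) { <page>exercise {$context/exercise/string()}</page> };

declare function local:dispatch($signature as xs:string, $context as element()) as element()? {
    switch ($signature)
        (: several case clauses can share one return, which covers alternative paths :)
        case "" case "set" return local:home()
        case "set/*" case "set/*/exercise" return local:set($context)
        case "set/*/exercise/*" case "exercise/*/set/*" return local:exercise($context)
        default return ()
};

local:dispatch("set/*/exercise/*", <context><set>1</set><exercise>2</exercise></context>)
```

Grouping case clauses keeps alternative signatures for the same endpoint on one line, which reads more like the table I started with than the if/else chain does.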

Absolute or Relative URIs

The straightforward approach is to use absolute URIs when generating links to related resources in the application. For example, a breadcrumb-style menu:



let $set := $tm:sets[@id=$context/set]
 return 
 <div class="menu">
   Sets
   {$set/title/string()}
   Another?
 </div>


Absolute URIs are also used in the HTML to link to CSS and JavaScript:

    <link rel="stylesheet" type="text/css" href="/css/screen.css" media="screen" ></link>

Alternatively, we could use relative URLs, but I found these require much more care to get right.

However, neither approach meets the need to run alternative versions of the script for testing on the same server. So I append a _root parameter to the rewritten URL and then prefix all URIs with /{$context/_root}. The modified mod_rewrite rules become:



RewriteRule ^/math/(.*) /xquery/home.xq?_root=math&_path=${escape:$1} [QSA,P]
RewriteRule ^/test/(.*) /xquery/home-2.xq?_root=test&_path=${escape:$1} [QSA,P]


The full configuration file is here
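
Prefixing every link with the root can be wrapped in a small helper. This is only a sketch: the name local:uri is a placeholder of mine, and it assumes the context holds exactly one _root element containing the first path segment (math or test).

```xquery
xquery version "3.0";

(: Hypothetical helper: build an absolute URI below whichever root the request came in on. :)
declare function local:uri($context as element(), $path as xs:string) as xs:string {
    concat("/", $context/_root, "/", $path)
};

let $context := <context><_root>test</_root><set>1</set></context>
return
    <a href="{ local:uri($context, concat('set/', $context/set)) }">Set { $context/set/string() }</a>
```

With _root set to test, the link should point at /test/set/1, so pages generated by the test script keep linking back into the test version rather than the live one.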

HTML forms and REST-style URIs

HTML forms produce query strings and not resource paths. Forms are needed to gather inputs, for example the child's answer to a question. The action part of the form is the absolute resource path, whilst the form creates the additional query string.




<form action="/{$context/_root}/set/{$model/set}/exercise/{$model/exercise}/variant/{$model/variant}/response" 
      method="get">  
    ...
  <input type="text" name="answer" id="answer" size="4"/>


Thus the path to a specific answer to a question will look like

/math/set/1/exercise/3/variant/10,3/response?answer=5

Code

The code is browsable and the full application is available as a zip file. 

A brief explanation of the application itself

Each exercise in a set is parameterised by expressions at points in the exercise definition. When a question is selected, the var expressions defining variables are executed, typically to generate random values. These values are used to complete the question, and the sequence of values defines a variant of the exercise. When a response is returned, the value elements are computed using the variant's values and an appropriate response is created. The transformation of the exercise XML to HTML is performed by a recursive function guided by the current context, which includes the sequence of variables that define a variant of an exercise. var and value elements are evaluated with util:eval().

Here is an example exercise:



<exercise id="2" use-words="true">
    <title>The runaway chickens</title>
    <question>There are <var>tm:random(5,13)</var> chickens in the yard and <var>tm:random(1,5)</var> chickens ran away. How many chickens are now in the yard?</question>
    <answer>
        <correct>Yes would you believe it, there are now <value>$var[1] - $var[2]</value> chickens left in the yard!</correct>
        <wrong>Actually there are really only <value>$var[1] - $var[2]</value> chickens left in the yard.</wrong>
        <alternative>
            <value>$var[1] + $var[2]</value>
            <diagnostic>You have added the numbers rather than subtracted them</diagnostic>
        </alternative>
    </answer>
    <hint>This is a subtraction problem. You have to subtract the number of chickens who ran away from the number in the yard.</hint>
</exercise>


So what did Maia think of it? 

Maia liked the generated worksheets, which she could complete and take to school. However, the interactive version didn't compare very well with her current obsession - Moshi Monsters. So my next improvement is to add an intensely annoying, repetitive and addictive soundtrack.

 

First, you've done it again, Chris. A tour de force of XQuery application development and documentation. You've created a nice app that serves a good purpose, and you've explained beautifully from motivation to execution. Your "browsable" code link above is incredible: a dynamically generated visualization of an XQuery module, linked to clear documentation about each function. You've raised the bar for everyone who writes about XQuery.

Second, I agree that eXist's mechanism for URL rewriting is complex, so it's great that you've found a mechanism that's straightforward and better suited to your hosting environment. I think many more people know Apache and mod_rewrite than who know about eXist's URL rewriting, so your documentation will be very helpful for others. I too am excited about Adam Retter's XQuery 3.0 annotation-based system for RESTful URLs - very promising.

Of the many nice features of eXist's URL rewriting (e.g., being internal to eXist, it "knows the database", so routing decisions can be based on database contents and XQuery logic), the biggest for people like me is the fact that it's "pure eXist" and doesn't have any external dependencies. So for teaching purposes, and for people who want to remain neutral about the container they put eXist into, the built-in URL rewriting route is great to have. As a side note, I think some of the complexity of eXist's URL rewriting is being lifted with the release of eXist 2.0: no longer is it necessary to edit the controller-config.xml file just to point URLs at the database (since the "/apps" URL maps to the database).

Again, great post. Thanks, as always!