Joe Wicentowski posted a nice example of the power of XQuery to do text processing. The problem concerns converting an indented list to a tree.
I find that eXist's util:parse() can be helpful with this kind of transformation. I used this approach in a function to convert from JSON to XML: first generate the XML as a string, then parse the string to XML.
As in Joe's approach, the indented list is first converted to a sequence of line elements ($lines), each of which has a level attribute, e.g.
<line level="0">The President left at 8:48 am</line>
<line level="1">Administration recommendations on Capitol Hill</line>
<line level="1">Improvements</line>
<line level="1">Richardson’s trip to New York</line>
<line level="1">Health programs</line>
<line level="2">Goals</line>
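Joe's own conversion step is not reproduced here. As a rough sketch only, assuming the source text uses one tab character per indentation level, a sequence like $lines might be produced as follows. The function name local:lines and the sample text are hypothetical, not taken from Joe's post.

```xquery
xquery version "3.0";

(: Hypothetical helper: turn tab-indented text into <line level="n"> elements.
   Assumes exactly one tab character per indentation level. :)
declare function local:lines($text as xs:string) as element(line)* {
    (: split on newlines and skip blank lines :)
    for $raw in tokenize($text, "\n")[normalize-space(.) ne ""]
    (: the level is the number of leading tabs that get stripped :)
    let $stripped := replace($raw, "^\t+", "")
    let $level := string-length($raw) - string-length($stripped)
    return
        <line level="{$level}">{normalize-space($stripped)}</line>
};

(: Made-up sample input; &#9; is a tab and &#10; a newline. :)
let $text := string-join((
    "The President left at 8:48 am",
    "&#9;Improvements",
    "&#9;Health programs",
    "&#9;&#9;Goals"
), "&#10;")
return local:lines($text)
```

For this sample the four lines come out with levels 0, 1, 1 and 2. Text indented with spaces would need a different rule for counting the level.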
A recursive function generates the nested lists and items as a string, using one-item look-ahead:
declare function local:nest($lines) {
    let $this := $lines[1]
    let $next := $lines[2]
    return
        if (empty($next))
        then concat("<item>", $this, "</item>",
                    string-pad("</list></item>", $this/@level))
        else if ($this/@level = $next/@level) (: in the same list :)
        then concat("<item>", $this, "</item>",
                    local:nest(subsequence($lines, 2)))
        else if ($this/@level < $next/@level) (: going down :)
        then concat("<item>", $this,
                    string-pad("<list>", $next/@level - $this/@level),
                    local:nest(subsequence($lines, 2)))
        else (: ($this/@level > $next/@level) so going up :)
             concat("<item>", $this, "</item>",
                    string-pad("</list></item>", $this/@level - $next/@level),
                    local:nest(subsequence($lines, 2)))
};
Finally the string is parsed to XML to create the tree:
util:parse(local:nest($lines))
So Joe's text converted looks like this
The function string-pad($s, $n), deprecated in eXist's XPath namespace, creates a string of $n copies of $s. You need to be careful to ensure that the string is well-formed XML (I forgot that level changes may not be just +/- 1), so it's a bit tricky to build the string correctly, but at least this kind of processing is very fast in eXist.
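Since string-pad is deprecated, here is a sketch of the same idea using only standard functions. It assumes a processor that supports XQuery 3.0's fn:parse-xml in place of util:parse. The helper local:repeat is a hypothetical stand-in for string-pad, and the sample lines are shortened for illustration. Like the function above, it assumes the level never increases by more than one between consecutive lines, and that the line text contains no characters such as & or < that would need escaping.

```xquery
xquery version "3.0";

(: Hypothetical stand-in for string-pad: $n copies of $s. :)
declare function local:repeat($s as xs:string, $n as xs:integer) as xs:string {
    string-join(for $i in 1 to $n return $s, "")
};

(: Same one-item look-ahead as above, with levels cast to integers
   so that comparisons and subtraction are numeric. :)
declare function local:nest($lines as element(line)*) as xs:string {
    if (empty($lines)) then ""
    else
        let $this := $lines[1]
        let $next := $lines[2]
        let $rest := subsequence($lines, 2)
        let $level := xs:integer($this/@level)
        let $open := concat("<item>", $this)
        return
            if (empty($next)) then
                (: last line: close every list still open :)
                concat($open, "</item>", local:repeat("</list></item>", $level))
            else
                let $delta := xs:integer($next/@level) - $level
                return
                    if ($delta eq 0) then (: in the same list :)
                        concat($open, "</item>", local:nest($rest))
                    else if ($delta gt 0) then (: going down :)
                        concat($open, local:repeat("<list>", $delta), local:nest($rest))
                    else (: going up, possibly by several levels :)
                        concat($open, "</item>",
                               local:repeat("</list></item>", -$delta),
                               local:nest($rest))
};

let $lines := (
    <line level="0">The President left at 8:48 am</line>,
    <line level="1">Improvements</line>,
    <line level="1">Health programs</line>,
    <line level="2">Goals</line>
)
return parse-xml(local:nest($lines))
```

For these four lines the generated string is a single top-level item containing a list with two items, the second of which holds a nested list with the Goals item.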
@Mark: Glad to hear these are helpful, Mark.
@Joe: I don't think string-pad($s, $n) is any more than
string-join(for $i in 1 to $n return $s,"")