These are algorithms, and implementations of them, for extracting the part of a node that lies between two milestones,
for example between page-break (tei:pb) elements in TEI.
The common example is taken from the Punch case study developed by Joe Wicentowski.
The timings generated in this framework seem to flatter the XQuery scripts. A more direct comparison,
which times only the fragment-extraction code, shows that the Java code is about 5 times faster than my algorithm, which in turn is about 4 times faster than David Sewell's algorithm.
For details see the TEI Wiki. The version used here is the later version, which preserves namespaces.
The util:get-fragment-between() function is based on David Sewell's algorithm and is implemented in Java. It works only on documents stored in the database and fails on in-memory documents (in 1.4.2), whereas both XQuery functions also work on in-memory documents.
Ignorant of David Sewell's work, I wrote this algorithm for the University of Richmond.
This version adds attributes (in a different namespace) to indicate where nodes have been trimmed.
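The general idea of extracting the content between two milestones and marking trimmed elements can be sketched as follows. This is a minimal illustration in Python, not the XQuery or Java code discussed on this page. The function name fragment_between, the marker namespace http://example.org/trim and the attribute name trimmed are all assumptions made for the example. The sketch also assumes that milestones are empty elements and that the opening milestone is kept in the result.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and attribute name for the trim marker (assumptions).
TRIM_NS = "http://example.org/trim"
TRIMMED = "{%s}trimmed" % TRIM_NS


def fragment_between(root, start, end=None):
    """Copy root, keeping only content after `start` and before `end`.

    Elements cut at either milestone are rebuilt with a marker attribute
    whose value is "start", "end" or "both". Returns None if `start` is
    not found under `root`.
    """
    phase = "before"  # document-order state: before -> inside -> after

    def copy(node):
        nonlocal phase
        if node is start:
            phase = "inside"
            # Milestones are assumed to be empty, so a shallow copy suffices.
            return ET.Element(node.tag, dict(node.attrib))
        if node is end:
            phase = "after"
            return None
        on_entry = phase
        clone = ET.Element(node.tag, dict(node.attrib))
        if on_entry == "inside":
            clone.text = node.text
        for child in node:
            kept = copy(child)
            if kept is not None:
                # Text following the child is kept only while still inside.
                if phase == "inside":
                    kept.tail = child.tail
                clone.append(kept)
        on_exit = phase
        if on_entry == on_exit and on_entry != "inside":
            return None  # entirely before the start or after the end
        cut_start = on_entry == "before"
        cut_end = on_exit == "after"
        if cut_start and cut_end:
            clone.set(TRIMMED, "both")
        elif cut_start:
            clone.set(TRIMMED, "start")
        elif cut_end:
            clone.set(TRIMMED, "end")
        return clone

    return copy(root)


# Usage: extract the page between the first and second page break.
doc = ET.fromstring(
    '<div><p>aaa<pb n="1"/>bbb</p><p>ccc</p><p>ddd<pb n="2"/>eee</p></div>'
)
pbs = doc.findall(".//pb")
page = fragment_between(doc, pbs[0], pbs[1])
print(ET.tostring(page, encoding="unicode"))
```

Because the marker attribute lives in its own namespace, a consumer can strip it or style on it without colliding with attributes of the source vocabulary.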