
XQuery and TEI

Through my friend Dan McCreary I was lucky to get involved in a project at the University of Richmond in Virginia. The Proceedings of the Virginia State Convention of 1861, held at the time of the state's secession from the Union, have been encoded in TEI, and a web site to browse and search the contents is being developed. eXist is being used as the database and the retrieval queries are written in XQuery.

This was my first encounter with the world of TEI (the Text Encoding Initiative), despite the fact that Wolfgang Meier's impetus for developing eXist was processing TEI. Some of the TEI sites are really wonderful. Van Gogh's letters is a prime example of the possibilities for presenting a corpus of documents, and of the editorial, cross-referencing and technical effort involved over a 15-year period. The raw TEI is not, however, available. The NZ Electronic Text Centre is a very large collection of historical documents encoded in TEI and transformable to a variety of formats, including PHP and ePub. The TEI is available and was a good example of TEI markup. Metadata is comprehensive, and semantics are added for people, with external references, and for dates. The presence of dates supports timelines, and an example of an XQuery script is described in the Wikibook. The time-line for Beaglehole's book on the Discovery of New Zealand is an example.

 

TEI loose structure - local corpus-centric markup

Lucene search

KWIC context extraction - bound to the context of exist:match elements, but the code is XQuery, so it can be used to build your own context when it is not the result of a match
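
As a rough illustration of the Lucene search and KWIC notes above, here is a minimal sketch. It assumes a Lucene index has been configured on tei:p in the collection's collection.xconf. The collection path /db/convention and the search term are hypothetical and are not taken from the Richmond project.

```xquery
xquery version "1.0";

(: Sketch only: the collection path and the search term are assumptions. :)
import module namespace kwic = "http://exist-db.org/xquery/kwic";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

let $term := "secession"
(: ft:query needs a Lucene index defined on tei:p :)
for $hit in collection("/db/convention")//tei:p[ft:query(., $term)]
order by ft:score($hit) descending
return
    (: one summary per match, with up to 40 characters of context each side :)
    kwic:summarize($hit, <config width="40"/>)
```

Because the kwic module is itself written in XQuery, it can be copied and adapted when the context wanted is not the one around a match.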

Getting pages (fragment between milestones)
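
One way to get a page as the fragment between two milestones can be sketched in plain XQuery. This is an illustrative approach, not necessarily the one used on the project: it copies the tree, keeping only nodes that fall after the starting pb and before the next one, plus the ancestors needed to keep the result well-formed. The function names local:inside and local:slice and the inline sample text are invented for the example.

```xquery
xquery version "1.0";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: true if $n comes after $start and, when $end is given, before $end :)
declare function local:inside($n as node(), $start as node(), $end as node()?) as xs:boolean {
    $n >> $start and (empty($end) or $n << $end)
};

(: copy $n, dropping everything outside the range $start .. $end :)
declare function local:slice($n as node(), $start as node(), $end as node()?) as node()? {
    if ($n instance of element()) then
        (: keep an element if it is in range or contains something in range :)
        if (local:inside($n, $start, $end)
            or $n/descendant::node()[local:inside(., $start, $end)]) then
            element { node-name($n) } {
                $n/@*,
                for $c in $n/node() return local:slice($c, $start, $end)
            }
        else ()
    else if (local:inside($n, $start, $end)) then $n
    else ()
};

(: made-up sample; a real query would read the text from the database :)
let $text :=
    <text xmlns="http://www.tei-c.org/ns/1.0">
        <body>
            <p>First page. <pb n="2"/>Second page starts</p>
            <p>and continues. <pb n="3"/>Third page.</p>
        </body>
    </text>
let $start := ($text//tei:pb[@n = "2"])[1]
let $end := ($start/following::tei:pb)[1]
return local:slice($text, $start, $end)
```

The result should be a copy of text, body and the two p elements holding only the content of page 2. The descendant test is repeated at every level, so this is a sketch for small documents rather than something tuned for a large corpus.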

 

TEI transformation to XHTML - pure XQuery or XSLT -  customisation, re-structuring

Tuning and Query profiling in the eXist admin interface

Node references using eXist ids
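
A minimal sketch of node references using eXist's internal ids, assuming a stored document at the hypothetical path /db/convention/vol1.xml. util:node-id returns the internal id of a stored node as a string, which can be passed in a URL and resolved later with util:node-by-id.

```xquery
xquery version "1.0";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: the document path is an assumption for illustration :)
let $doc := doc("/db/convention/vol1.xml")
let $para := ($doc//tei:p)[1]
(: the internal id of the stored node, as a string :)
let $id := util:node-id($para)
(: later, e.g. in another request, get the same node back from its id :)
let $same := util:node-by-id($doc, $id)
return
    <ref id="{$id}" resolved="{$same is $para}"/>
```

These ids are internal to the database and may change when a document is updated or reloaded, so they are probably better suited to short-lived links than to permanent identifiers.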

Blueprint CSS

Eric Meyer's Pure CSS popup

 

Problems

Adding markup to XML, e.g. highlighting text in a document

Highlighting proximity matches

A TEI cookbook for eXist