Pipelines

Looks like it's going to be the year of 'Pipes' in my teaching. I came across Yahoo Pipes earlier this year and decided to restructure my second-level module, Data Schemas and Applications, around the 'web as database', before diving into local database technologies like relational and XML databases. Yahoo Pipes provides an approachable starting point and has the added benefit of being a technology which none of my students, from a diverse collection of programmes, have yet encountered. Also, those slinky pipes and typed ports are -so- seductive.

However, the visual editor soon becomes awkward to use, which leads naturally to using a scripting language instead. I started with XSLT expecting to move quickly into XQuery, but I've been surprised to find how much can be done, especially with XSLT 2.0. For a transformation engine, I've set up a service (using Saxon8 via XQuery on eXist-db). This has allowed us to implement most of the Yahoo pipes we'd written, and also searches over an XML file using a single script containing both the form and the search results.

Although many of the steps in a Yahoo pipeline can be handled within a single XSLT script, some of what I want to demonstrate involves HTML pages which are not XHTML, so I needed a tidy service too, and a way to pipeline the two together.

So... I need a pipeline language, a way of visualizing the pipeline and an engine to execute the pipeline. Naturally I started to write my own, based mainly on XPL, which Erik Bruchez introduced me to at XML Prague. A tentative first step using an XQuery script is described in an XQuery WikiBook article.
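To make the idea of an engine concrete, here is a minimal sketch in Python rather than the XQuery I actually used. It treats a pipeline as an ordered list of steps, each taking a document string and returning a new one. The names `tidy`, `extract_items` and `run_pipeline` are made up for illustration: `tidy` is only a crude stand-in for a real tidy service, and `extract_items` stands in for an XSLT transformation.

```python
from functools import reduce
import xml.etree.ElementTree as ET


def tidy(html: str) -> str:
    """Hypothetical stand-in for a tidy service: only self-closes <br> tags."""
    return html.replace("<br>", "<br/>")


def extract_items(xhtml: str) -> str:
    """Hypothetical stand-in for an XSLT step: copy each <li> text into an <item>."""
    root = ET.fromstring(xhtml)
    out = ET.Element("items")
    for li in root.iter("li"):
        ET.SubElement(out, "item").text = (li.text or "").strip()
    return ET.tostring(out, encoding="unicode")


def run_pipeline(steps, document: str) -> str:
    """Feed the document through each step in order; each output is the next input."""
    return reduce(lambda doc, step: step(doc), steps, document)


# The second step would fail on the raw input, because <br> is not well-formed XML.
result = run_pipeline([tidy, extract_items], "<ul><li>one<br></li><li>two</li></ul>")
print(result)
```

The point is only that the order of the steps matters and that each step sees the previous step's output, which is the part a pipeline language has to describe.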

Of course this is fine as play, but I need to join the real world of pipeline languages. I suppose the main contenders are:

NetKernel looks very interesting both theoretically and practically and, what's more, 1060 Research are a locally based spin-off from HP Labs next door.