Hacking an API to do Readability Analysis

Somehow I found time to knock up an API to the readability analysis provided by Juicy Studio.

The Juicy Studio page posts the URL entered in a form to a server script, which returns a full HTML page. The aim is to turn this into a service which can be called with a URL and which returns just the analysis, as XML or an HTML fragment.

It seems only fair to use it to analyse this blog:

The service is a short XQuery script running on eXist.  The key function uses the httpclient module:

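declare function assess:readability($url) {
  (: build the form data: a single field named "url" carrying the page address :)
  let $form :=
    element httpclient:fields {
      element httpclient:field {
        attribute name {"url"},
        attribute value {escape-html-uri($url)},
        attribute type {"string"}
      }
    }
  (: POST the form; false() means the HTTP state is not kept between calls :)
  let $response := httpclient:post-form(xs:anyURI(""), $form, false(), ())
  return
    (: on success, keep only the results table from the returned page :)
    if ($response/@statusCode eq "200")
    then $response/httpclient:body//table[@summary="Table to display the readability results"]
    else ()
};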

With a bit of code to produce the interface, we have an API.
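
The interface code isn't shown here, so what follows is only a rough sketch of how such a wrapper might look as an eXist main module, not the script actually used. The request parameter name "url", the readability and error wrapper elements, and the move of the function into the local: namespace are all assumptions made for illustration. The service address is left blank, as it is above.

```xquery
xquery version "1.0";

(: Built-in eXist modules :)
import module namespace request = "http://exist-db.org/xquery/request";
import module namespace httpclient = "http://exist-db.org/xquery/httpclient";

(: Copy of the key function, placed in local: so this module stands alone.
   The service address is left blank, as in the post. :)
declare function local:readability($url as xs:string) as element()* {
  let $form :=
    element httpclient:fields {
      element httpclient:field {
        attribute name {"url"},
        attribute value {escape-html-uri($url)},
        attribute type {"string"}
      }
    }
  let $response := httpclient:post-form(xs:anyURI(""), $form, false(), ())
  return
    if ($response/@statusCode eq "200")
    then $response/httpclient:body//table[@summary = "Table to display the readability results"]
    else ()
};

(: "url" is an assumed parameter name; take the first value if several are sent :)
let $url := string((request:get-parameter("url", ()))[1])
return
  if ($url eq "")
  then <error>Supply a page address in the url parameter</error>
  else
    let $table := local:readability($url)
    return
      if (exists($table))
      then <readability url="{$url}">{$table}</readability>  (: hypothetical wrapper element :)
      else <error>No readability table was returned for {$url}</error>
```

Stored on the server, a script like this would be called with the page address as a query parameter and would return either the results table wrapped in XML or a short error element.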

I took the same function and built it into a script which iterated over all the pages the students had produced and generated a full results table.  Now to see if these measures really tell us anything about the quality of the pages.
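
The batch script isn't reproduced here either. As a hedged sketch of the idea, the function below builds one table from a list of page addresses. It assumes a processor with XQuery 3.0 function items, and the names local:results-table and $assess, the example addresses, and the stub are all hypothetical. The stub stands in for the real call so the sketch runs without touching the service.

```xquery
xquery version "3.0";

(: Build one summary table from a list of page addresses.
   $assess is whatever function produces the analysis for a single page. :)
declare function local:results-table(
  $pages as xs:string*,
  $assess as function(xs:string) as element()*
) as element(table) {
  <table>
    <tr><th>Page</th><th>Readability</th></tr>
    {
      for $page in $pages
      let $result := $assess($page)
      return
        <tr>
          <td>{$page}</td>
          <td>{ if (exists($result)) then $result else "no result" }</td>
        </tr>
    }
  </table>
};

(: Stub in place of the real analysis: returns a one-cell table per page :)
let $stub := function($url as xs:string) as element()* {
  <table summary="stub"><tr><td>{string-length($url)}</td></tr></table>
}
return
  local:results-table(("http://example.org/a", "http://example.org/b"), $stub)
```

In a module where the key function is in scope, passing assess:readability#1 instead of the stub would plug in the real analysis. Nesting each result table inside a cell is crude; picking out individual scores would depend on the layout of the returned table.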

Of course, I'm sailing a bit close to the wind by using the website without, as yet, the author's agreement.