
Visualization of visits and the joys of persistence in XQuery

I posted an earlier blog item to Twitter via a bitly URL to see if I could gather better data on visits. The experiment wasn't entirely successful because I had few visitors to that blog item, and fewer still via the bitly URL. If you want to see how few, the visits are shown on a Simile Timeline (scroll back a bit) and a GoogleMap. If you click on this bitly link, you will appear on the Timeline and, more slowly thanks to caching, on the map.

One thing I really like about working in XQuery is that data persistence is transparent: you just update the data structure in place, with no need to convert to and from a storage format. This feature is nicely illustrated by the IP geo-coding code.
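For readers new to eXist, the transparency comes from eXist's XQuery update extension: an update insert expression modifies the stored document directly, and the change is durable with no explicit save step. A minimal sketch (the document path and element names here are hypothetical, not part of the logger):

```xquery
(: append a record to a stored document; /db/example/visits.xml is a
   hypothetical document created beforehand with an empty <visits/> root :)
let $visits := doc("/db/example/visits.xml")/visits
return update insert <visit dateTime="2009-11-20T10:15:00Z"/> into $visits
```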

As each request is logged, it now calls a function to get the IP address data. The function first checks the cache; on a miss, it calls the geo-coding service at http://ipinfodb.com, constructs an address record and adds it to the cache.



declare variable $log:ipcache := doc("/db/cache/ipcache.xml")/cache;

(: look up an IP address in the cache; on a miss, call the geo-coding
   service, build an address record, add it to the cache and return it :)
declare function log:geocode-ip($ip as xs:string?) as element(address)? {
    let $address := $log:ipcache/address[@ip = $ip]
    return
        if ($address)
        then $address
        else if (empty($ip) or $ip eq "")
        then ()
        else
            let $response := doc(concat("http://ipinfodb.com/ip_query.php?ip=", $ip))/Response
            let $address :=
                element address {
                    attribute ip {$ip},
                    attribute latitude {$response/Latitude},
                    attribute longitude {$response/Longitude},
                    attribute country {$response/CountryName},
                    if ($response/City) then attribute city {$response/City} else ()
                }
            (: persist the new record in place - no serialisation step needed :)
            let $update := update insert $address into $log:ipcache
            return $address
};
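For reference, the cache is just a stored XML document with a cache root element, created once by hand; each miss appends an address record via the update insert above. After a few lookups it might look like this (the IP and coordinates are illustrative):

```xml
<!-- /db/cache/ipcache.xml -->
<cache>
   <address ip="192.0.2.10" latitude="51.4545" longitude="-2.5879"
            country="United Kingdom" city="Bristol"/>
</cache>
```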


I added this code after the first set of requests had been received, but since the same function is used in the visualisation scripts, the missing IP addresses were geo-coded the first time the scripts were run. The script to generate the Simile Timeline XML becomes:



import module namespace log = "http://www.cems.uwe.ac.uk/xmlwiki/log" at "../lib/log.xqm";

let $logfile := request:get-parameter("logfile", ())
let $login := xmldb:login(..)
let $log := doc(concat(.., $logfile))/log
return
<data date-time-format="iso8601">
   {for $logrecord in $log/logrecord
    let $address := log:geocode-ip($logrecord/@ip)
    return
        <event start="{$logrecord/@dateTime}" title="{($address/@city, $address/@country, $logrecord/@referer)[1]}">
           {string($logrecord)}
        </event>
   }
</data>
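Assuming a logrecord carries dateTime and ip attributes and some descriptive text, the generated document has the shape Simile Timeline expects from an XML event source (values illustrative):

```xml
<data date-time-format="iso8601">
   <event start="2009-11-20T10:15:00Z" title="Bristol">GET /blog/item123</event>
</data>
```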

The approach is probably OK for low levels of traffic, particularly if there are repeat visitors. I've realised that a better use of this logger is to add it to XQuery scripts whose usage I want to track. Blog visit tracking is probably best left to Google Analytics, but I don't like the latency that it imposes on page loads.