The Twitter photo wall needs server-side caching for several reasons:
In version 2 of the photo wall, an authorised user can create, modify and delete photo walls. A wall may be moderated or unmoderated. If moderated, a moderation screen shows all unmoderated photos, the oldest at the top. The moderator can accept or reject a photo by clicking the appropriate button. The moderation element of the photo in the cache is then updated and the photo removed from the moderation screen. The public page of a moderated wall shows only accepted photos. Each photo is timestamped so that screens can be updated with those photos acquired since the last refresh.
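The moderation rules above can be sketched roughly as follows. This is an illustration in Python rather than the prototype's own code, and the `Photo` fields, the status values and the separate `moderated` timestamp are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Photo:
    url: str
    acquired: datetime                    # when the photo entered the cache
    status: str = "unmoderated"           # assumed values: unmoderated, accepted, rejected
    moderated: Optional[datetime] = None  # assumed: when the decision was made


def moderation_queue(photos: List[Photo]) -> List[Photo]:
    """Unmoderated photos, oldest first, as on the moderation screen."""
    pending = [p for p in photos if p.status == "unmoderated"]
    return sorted(pending, key=lambda p: p.acquired)


def moderate(photo: Photo, accept: bool, now: datetime) -> None:
    """Record the moderator's decision on the cached photo."""
    photo.status = "accepted" if accept else "rejected"
    photo.moderated = now


def public_photos(photos: List[Photo], moderated_wall: bool,
                  since: Optional[datetime] = None) -> List[Photo]:
    """Photos for the public page, optionally only those new since `since`."""
    result = []
    for p in photos:
        if moderated_wall and p.status != "accepted":
            continue
        # On a moderated wall a photo becomes visible when it is accepted;
        # otherwise it is visible as soon as it is acquired.
        visible_at = p.moderated if (moderated_wall and p.moderated) else p.acquired
        if since is None or visible_at > since:
            result.append(p)
    return result
```

Passing the time of the last refresh as `since` gives a screen only the photos it has not yet shown.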
This architecture is more efficient since the Twitter search is done only once, so each public page only has to access the cached data. However, there are now three lags in the system: the interval between searches of the Twitter stream, the pause whilst a human operator moderates the photos, and the interval between refreshes of the wall.
You can view a few walls which have been created but are most likely currently stopped. If anyone wants a login to use the prototype, drop me a line.
The data for each wall is the query description and a sequence of photo descriptors. This structure is held in an XML file in the database. It is updated in situ using the eXist-db XQuery update extensions when new photos are acquired, when photos are moderated and when the query parameters are updated.
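To make the shape of that file concrete, here is a hypothetical wall document and the three kinds of in-place update, sketched in Python with ElementTree rather than XQuery update. The element and attribute names are invented for the example and are not the prototype's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical wall document; every element and attribute name is illustrative.
WALL_XML = """
<wall id="demo" moderated="true">
  <query refresh-seconds="60">
    <term>#example</term>
  </query>
  <photos>
    <photo id="1" acquired="2012-01-01T12:00:00Z" status="unmoderated">
      <src>http://example.org/1.jpg</src>
    </photo>
  </photos>
</wall>
"""


def add_photo(wall, photo_id, src, acquired):
    """Append a photo descriptor, as happens when new photos are acquired."""
    photo = ET.SubElement(
        wall.find("photos"), "photo",
        {"id": photo_id, "acquired": acquired, "status": "unmoderated"},
    )
    ET.SubElement(photo, "src").text = src
    return photo


def set_status(wall, photo_id, status):
    """Change one photo's moderation status in place."""
    photo = wall.find(f"./photos/photo[@id='{photo_id}']")
    if photo is None:
        raise KeyError(photo_id)
    photo.set("status", status)


def set_query_term(wall, term):
    """Update the query parameters in place."""
    wall.find("./query/term").text = term


wall = ET.fromstring(WALL_XML)
add_photo(wall, "2", "http://example.org/2.jpg", "2012-01-01T12:01:00Z")
set_status(wall, "1", "accepted")
```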
When a wall is created, the XML file is created containing the initial query. The Twitter stream is searched for the first time using the query parameters, and matching photos are added to the XML file, ignoring duplicates. In addition, a task is scheduled to re-run the search at the defined refresh rate.
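The search-and-reschedule cycle might look something like this in Python. The `search` callable is a hypothetical stand-in for the Twitter search, and treating two photos as duplicates when their URLs match is an assumption. The timer is only a convenient substitute for the database's scheduler.

```python
import threading


def merge_new_photos(cache, results):
    """Add search results to the cache, ignoring photos already present.

    `cache` maps photo URL to descriptor (duplicate detection by URL is an
    assumption). Returns the descriptors that were newly added.
    """
    added = []
    for photo in results:
        if photo["url"] not in cache:
            cache[photo["url"]] = photo
            added.append(photo)
    return added


def start_wall(query, search, cache, refresh_seconds):
    """Run the first search now, then re-run it at the refresh rate.

    `search` is a hypothetical callable taking the query and returning a
    list of photo descriptors. Returns an Event; set it to stop the wall.
    """
    stopped = threading.Event()

    def run():
        if stopped.is_set():
            return
        merge_new_photos(cache, search(query))
        # Schedule the next run; a daemon timer will not keep the process alive.
        timer = threading.Timer(refresh_seconds, run)
        timer.daemon = True
        timer.start()

    run()
    return stopped
```

Calling `set()` on the returned event corresponds to stopping a wall: the next scheduled run does nothing and schedules no further searches.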
The moderation page uses AJAX both to fetch the set of all unmoderated photos repeatedly and, as each moderation decision is made, to update the moderation status of the photo in the database. The public page also uses AJAX to fetch newly acquired or moderated photos.
Short URLs like Twitter's own t.co need to be converted to their unabbreviated form to detect which photo service is being used, if any. The usual approach with curl is to tell it not to follow the Location header of the redirect, but the httpclient module in eXist does not have this option. This means that the page has to be fetched and then analysed to see whether it is an image service and, if so, which one, which is messy.
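For comparison, this is roughly what the no-follow approach looks like where the HTTP client allows it. The sketch is in Python, whose `http.client` never follows redirects by itself, and it is not something the eXist httpclient module can do. The table of photo hosts is purely illustrative, and some shorteners may not answer a HEAD request as assumed here.

```python
import http.client
from urllib.parse import urlsplit

# Hypothetical host-to-service table; the hosts are examples only.
PHOTO_HOSTS = {
    "twitpic.com": "twitpic",
    "yfrog.com": "yfrog",
    "instagram.com": "instagram",
}


def photo_service(url):
    """Return the service name for a full URL, or None if the host is unknown."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return PHOTO_HOSTS.get(host)


def expand_once(short_url, timeout=10):
    """Return the Location header of a redirect without following it, else None."""
    parts = urlsplit(short_url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # HEAD asks for the headers only, so no page body is downloaded.
        conn.request("HEAD", path)
        response = conn.getresponse()
        if response.status in (301, 302, 303, 307, 308):
            return response.getheader("Location")
        return None
    finally:
        conn.close()
```

With the expanded URL in hand, `photo_service` only has to look at the host name, so no page needs to be fetched and analysed.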