The Wallace Line.

RDF and Missing Persons

I've been reviewing a number of missing persons sites in preparation for a student assignment.

An existing XML schema, PFIF (People Finder Interchange Format), is used in the GoogleApp-based Person Finder and is described in the developer notes. The schema was developed after Hurricane Katrina and has recently been revised to version 1.2. The developer forum is very active and makes interesting reading, particularly on the difficulties of overlapping and competing applications, and on data sharing with the International Committee of the Red Cross, which also provides a service.

Much detailed work has gone into the XML schema, but there are a few areas where I think it could be improved:

Person properties and person roles are confused, as indicated by a variety of element names for the same property and by differences in level of detail:

first_name,last_name (of missing person)
author_name
source_name

Person identifiers are almost URIs but not quite: salesforce.com/a0030000001TRYR.
Uncertainty in values is represented by non-atomic values, e.g. age 30-40.
Relationships are expressed partly by containment (person/note) and partly by reference (person-record-id and linked-person-record-id).
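
To make the identifier and range points concrete, here is a minimal Python sketch of how such values might be normalised. The function names are my own, and the idea that prefixing a scheme is enough to turn a record id into a URI is an assumption for illustration, not something the PFIF spec states.

```python
import re
from typing import NamedTuple


class AgeRange(NamedTuple):
    minimum: int
    maximum: int


def parse_age(text: str) -> AgeRange:
    """Split an age such as '30-40' (or a single '35') into atomic bounds."""
    match = re.fullmatch(r"\s*(\d+)\s*(?:-\s*(\d+)\s*)?", text)
    if match is None:
        raise ValueError(f"unrecognised age value: {text!r}")
    low = int(match.group(1))
    # A single number is treated as a range whose bounds coincide.
    high = int(match.group(2)) if match.group(2) is not None else low
    if low > high:
        raise ValueError(f"lower bound exceeds upper bound: {text!r}")
    return AgeRange(low, high)


def record_id_to_uri(record_id: str, scheme: str = "http") -> str:
    """Hypothetical helper: assumes 'domain/local-id' plus a scheme forms a URI."""
    domain, separator, local_id = record_id.partition("/")
    if not separator or not domain or not local_id:
        raise ValueError(f"expected 'domain/local-id', got {record_id!r}")
    return f"{scheme}://{domain}/{local_id}"


age = parse_age("30-40")
uri = record_id_to_uri("salesforce.com/a0030000001TRYR")
```

With the bounds stored separately, a query such as "could this person be 35?" becomes a simple comparison rather than string parsing.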

After briefly reviewing this spec and other missing persons sites, particularly that of the International Centre for Missing & Exploited Children, the core types seem to be Person, Place and Report, all with many optional attributes. There are a variety of relationships between persons (abductor, parent, sibling, friend, sameAs), between Person and Place (born, missingFrom, seenAt, foundAt) and between Person and Report.
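
As a rough sketch of how these types and relationships might look as RDF-style triples, the Python below holds them as plain tuples and writes them out as N-Triples. Only the FOAF and RDF namespaces are real; the `MP` vocabulary, the `DATA` URIs, and the `about` property linking a Report to a Person are hypothetical names I have made up for illustration.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
MP = "http://example.org/missing#"    # hypothetical vocabulary
DATA = "http://example.org/data/"     # hypothetical instance data

# Every subject, predicate and object here is a URI, so relationships
# are always by reference, never by containment.
triples: List[Triple] = [
    (DATA + "person/1", RDF_TYPE, FOAF + "Person"),
    (DATA + "person/2", RDF_TYPE, FOAF + "Person"),
    (DATA + "place/1", RDF_TYPE, MP + "Place"),
    (DATA + "report/1", RDF_TYPE, MP + "Report"),
    (DATA + "person/1", MP + "sibling", DATA + "person/2"),
    (DATA + "person/1", MP + "missingFrom", DATA + "place/1"),
    (DATA + "report/1", MP + "about", DATA + "person/1"),
]


def objects(data: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in data if s == subject and p == predicate]


def to_ntriples(data: List[Triple]) -> str:
    """Serialise URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in data)


missing_from = objects(triples, DATA + "person/1", MP + "missingFrom")
ntriples_text = to_ntriples(triples)
```

Adding a new relationship such as seenAt is then just another triple, with no change to a schema.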

Person is complicated by the fact that descriptive data is uncertain and changes over time. A height, for example, needs at least to be described as a range (min, value, max), and descriptions need to be dated, since they must be updated as time moves on. I'm not sure how much reuse of existing vocabularies such as FOAF would be possible.
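
A ranged, dated description could be sketched in Python roughly as follows. The class and field names are my own invention and the measurements are made-up sample data, so treat this as one possible shape rather than a proposed schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class RangedValue:
    """An uncertain measurement: bounds, optional best estimate, and a date."""
    minimum: float
    maximum: float
    unit: str
    observed_on: date
    value: Optional[float] = None  # best estimate, if one is known

    def __post_init__(self) -> None:
        if self.minimum > self.maximum:
            raise ValueError("minimum must not exceed maximum")
        if self.value is not None and not (self.minimum <= self.value <= self.maximum):
            raise ValueError("value must lie within [minimum, maximum]")

    def could_be(self, candidate: float) -> bool:
        """True if `candidate` falls inside the recorded range."""
        return self.minimum <= candidate <= self.maximum


def latest(descriptions: List[RangedValue]) -> RangedValue:
    """Pick the most recently dated description."""
    return max(descriptions, key=lambda d: d.observed_on)


# Made-up sample data: two height estimates recorded a year apart.
heights = [
    RangedValue(150, 160, "cm", date(2009, 1, 1)),
    RangedValue(160, 170, "cm", date(2010, 3, 1), value=165),
]
current_height = latest(heights)
```

Keeping older descriptions alongside the latest one preserves the history, which matters when a sighting has to be matched against how the person looked at the time.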

This all points to the value of an open RDF schema as a more flexible data model. Surprisingly, RDF has not been mentioned on the mailing list. I'm wondering if this would be a worthwhile project for a #crisisBristol group or even for the rather quiet semanticweb-southwest group.