What do you do when you find a restriction that has more than one owl:onProperty value attached? Indeed, what do you do with a restriction that has multiple owl:someValuesFrom values attached? Until further notice, I'm going with the highly underrated "throw it out" approach.
Apologies to whoever lovingly crafted this data.
Wednesday, October 15, 2008
Tuesday, October 7, 2008
The God Entity
Here's a seemingly innocuous hex string:
08445a31a78661b5c746feff39a9db6e4e2cc5cf
One may wonder why Google returns about 16,600 results for this high-entropy hex string (Oct. 2008).
Upon further investigation, one may again wonder why the results are all FOAF...
And with a few more clicks, one may wonder why the value is so popular for foaf:mbox_sha1sum...
And then one might wonder why one might care...
The aforementioned value is the SHA-1 hash of the bare string 'mailto:', presumably produced by FOAF exporters from empty email input forms. Unfortunately, foaf:mbox_sha1sum is inverse functional, meaning that its value should uniquely identify an entity: in this case, a person. From a reasoner's perspective, only one person can have that particular value for the property; therefore, if you find two, they must be the same person! Now we have a problem. All of the descriptions of these people get merged into one super-description for one super-person. A reasoner will now see one person with tens of thousands of names, interests, emails, etc.
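As a rough sketch of how such a value could arise, assume a hypothetical exporter function, here called mbox_sha1sum, that simply prepends 'mailto:' to whatever the form contained and hashes the result. Neither the function name nor the example address comes from any real exporter; they are illustrations only.

```python
import hashlib

# The popular value quoted in this post.
GOD_HASH = "08445a31a78661b5c746feff39a9db6e4e2cc5cf"


def mbox_sha1sum(email: str) -> str:
    """Hypothetical exporter logic: SHA-1 of the mailto: URI for an address.

    No check is made for an empty address, which is the suspected bug.
    """
    uri = "mailto:" + email
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


# A filled-in form gives a hash specific to that address.
print(mbox_sha1sum("alice@example.org"))

# An empty form gives the hash of the bare 'mailto:' string,
# which is the same for every user who left the field blank.
print(mbox_sha1sum(""))

# Compare against the value quoted in the post.
print(mbox_sha1sum("") == GOD_HASH)
```

If the explanation above is right, the last line prints True, and a one-line guard in the exporter (skip the property when the address is empty) would have avoided the collision.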
Of course, there are other such values which contribute, such as the SHA-1 hash of the empty string:
da39a3ee5e6b4b0d3255bfef95601890afd80709
...not to mention other inverse functional properties such as foaf:weblog, which is often used to point at shared weblogs (so everyone who shares one is the same person).
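To make the merging concrete, here is a minimal sketch of smushing over inverse functional properties using a union-find structure. The triples, blank node labels, weblog URL and the all-ones hash are invented toy data, and the smush function is an illustration of the general idea rather than how any particular reasoner works.

```python
from collections import defaultdict

# Properties treated as inverse functional in this sketch.
IFPS = {"foaf:mbox_sha1sum", "foaf:weblog"}

# Invented toy data: (subject, property, value).
TRIPLES = [
    ("_:alice", "foaf:name", "Alice"),
    ("_:alice", "foaf:mbox_sha1sum", "08445a31a78661b5c746feff39a9db6e4e2cc5cf"),
    ("_:bob", "foaf:name", "Bob"),
    ("_:bob", "foaf:mbox_sha1sum", "08445a31a78661b5c746feff39a9db6e4e2cc5cf"),
    ("_:carol", "foaf:name", "Carol"),
    ("_:carol", "foaf:weblog", "http://example.org/teamblog"),
    ("_:dave", "foaf:name", "Dave"),
    ("_:dave", "foaf:weblog", "http://example.org/teamblog"),
    ("_:erin", "foaf:name", "Erin"),
    ("_:erin", "foaf:mbox_sha1sum", "1" * 40),  # placeholder for a distinct hash
]


def smush(triples, ifps):
    """Group subjects that share a value for any inverse functional property."""
    parent = {}

    def find(x):
        # Find the representative of x, registering x on first sight.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (property, value) -> first subject seen with it
    for subject, prop, value in triples:
        find(subject)
        if prop in ifps:
            key = (prop, value)
            if key in first_seen:
                union(first_seen[key], subject)
            else:
                first_seen[key] = subject

    groups = defaultdict(set)
    for subject in list(parent):
        groups[find(subject)].add(subject)
    return sorted(sorted(group) for group in groups.values())


for group in smush(TRIPLES, IFPS):
    print(group)
```

On this toy data, Alice and Bob collapse into one entity through the shared hash, Carol and Dave collapse through the shared weblog, and Erin stays separate. Scale the first group up to every exporter user who left the email field blank and you get the super-person described above.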
To clarify, this is not a criticism of FOAF but rather an observation that people will not stick to the semantics hidden away in an RDFS/OWL description. They will see a label for a property or class, project their needs onto it and use it, even though it doesn't fit the bill.
The problem becomes serious where the identity of what is described is at stake. More specifically, problems with identity -- relating to the assignment of URIs, and to the lack or misuse of same-as, inverse-functional, functional or cardinality-of-1 properties -- are among the largest stumbling blocks at the moment for building a "web of entities".
In human language, a word's definition follows its usage to a certain extent. The question is, should FOAF change the definitions of its terms to match how people use them? Should it loosen definitions to say that foaf:weblog can apply to communal weblogs?
Finally, where would this post be without one of the finest examples of the chaos in RDF web data?
EDIT (15/10/08): Indeed, I am new to blogging (and indeed reluctantly at that), and I missed an opportunity for flagrant self-promotion! For more on the issue of identity on the Web and smushing through inverse-functional properties, see this paper from 2007:
Aidan Hogan, Andreas Harth, Stefan Decker. "Performing Object Consolidation on the Semantic Web Data Graph". Proceedings of I3: Identity, Identifiers, Identification. Workshop at 16th International World Wide Web Conference (WWW2007), Banff, Alberta, Canada, 2007.