Tuesday, October 7, 2008

The God Entity

Here's a seemingly innocuous hex string:
08445a31a78661b5c746feff39a9db6e4e2cc5cf

One may wonder why Google returns about 16,600 results for this high-entropy hex string (Oct. 2008).

Upon further investigation, one may again wonder why the results are all FOAF...

And with a few more clicks, one may wonder why the value is so popular for foaf:mbox_sha1sum...

And then one might wonder why one might care...

The aforementioned hash is the SHA-1 of the bare string 'mailto:' with no address after it, presumably produced by FOAF exporters from empty email input forms. Unfortunately, foaf:mbox_sha1sum is an inverse-functional property, meaning that its value should uniquely identify an entity: in this case, a person. From a reasoner's perspective, only one person can have a particular value for the property; if you find two, they must be the same person! Now we have a problem. All of the descriptions for these people get merged into one super-description for one super-person. A reasoner will now see a single person with tens of thousands of names, interests, emails, etc.
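To make the merging concrete, here is a minimal sketch of how a consolidation step might group subjects that share a value for an inverse-functional property. The `smush` function, the blank-node names and the input layout are all hypothetical and chosen for illustration; this is not how any particular reasoner is implemented.

```python
from collections import defaultdict

def smush(descriptions, ifp="foaf:mbox_sha1sum"):
    """Group subjects that share a value for an inverse-functional property.

    `descriptions` maps a subject id to a dict of property -> list of values.
    Returns a list of sets; each set holds subjects treated as one entity.
    (Hypothetical helper, illustration only.)
    """
    parent = {s: s for s in descriptions}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # IFP value -> first subject seen with that value
    for subject, props in descriptions.items():
        for value in props.get(ifp, []):
            if value in first_seen:
                # Same IFP value: the two subjects are inferred to be identical.
                parent[find(subject)] = find(first_seen[value])
            else:
                first_seen[value] = subject

    groups = defaultdict(set)
    for s in descriptions:
        groups[find(s)].add(s)
    return list(groups.values())

# Made-up data: two unrelated people whose exporter hashed an empty form,
# and one person with a placeholder (not real) hash value.
EMPTY_MAILTO = "08445a31a78661b5c746feff39a9db6e4e2cc5cf"
people = {
    "_:alice": {"foaf:name": ["Alice"], "foaf:mbox_sha1sum": [EMPTY_MAILTO]},
    "_:bob":   {"foaf:name": ["Bob"],   "foaf:mbox_sha1sum": [EMPTY_MAILTO]},
    "_:carol": {"foaf:name": ["Carol"], "foaf:mbox_sha1sum": ["c" * 40]},
}
groups = smush(people)
print(groups)
```

With this input, Alice and Bob end up in the same group while Carol stays on her own; scale the first group up to tens of thousands of subjects and you have the super-person.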

Of course, other such values contribute too, for example the SHA-1 of the empty string:
da39a3ee5e6b4b0d3255bfef95601890afd80709

...not to mention other inverse-functional properties such as foaf:weblog, which is often used to point at shared weblogs (so that, to a reasoner, everyone who shares one is the same person).
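The two hash values quoted above can be checked with a few lines of Python. This is only a sketch of what an exporter presumably does; the helper names `mbox_sha1sum` and `raw_sha1sum` are made up here, and the only assumption is that the exporter hashes the 'mailto:' URI (or, in the second case, the raw form input) with SHA-1.

```python
import hashlib

def mbox_sha1sum(email: str) -> str:
    """SHA-1 hex digest of the 'mailto:' URI for an address (hypothetical helper)."""
    return hashlib.sha1(("mailto:" + email).encode("utf-8")).hexdigest()

def raw_sha1sum(value: str) -> str:
    """SHA-1 hex digest of the raw input, with no 'mailto:' prefix (hypothetical helper)."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()

# The two values quoted in the post.
MAILTO_ONLY = "08445a31a78661b5c746feff39a9db6e4e2cc5cf"
EMPTY_STRING = "da39a3ee5e6b4b0d3255bfef95601890afd80709"

# An empty email field, hashed with and without the 'mailto:' prefix.
print(mbox_sha1sum("") == MAILTO_ONLY)
print(raw_sha1sum("") == EMPTY_STRING)
```

An exporter could avoid the problem by skipping the property when the input is empty, or a consumer could keep a small blacklist of such degenerate values and ignore them when consolidating.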

To clarify, this is not a criticism of FOAF but more an observation that people will not stick to the semantics hidden away in an RDFS/OWL description. They will see a label for a property or class, project their needs onto it and use it, even though it doesn't fit the bill.

The problem becomes serious where the identity of what is described is at stake. More specifically, problems with identity (relating to the assignment of URIs, and the lack or misuse of same-as, inverse-functional, functional or cardinality-of-1 properties) are among the largest stumbling blocks at the moment for building a "web of entities".

In human language, a word's definition follows its usage to a certain extent. The question is, should FOAF change the definitions of its terms to match how people use them? Should it loosen definitions to say that foaf:weblog can apply to communal weblogs?

Finally, where would this post be without one of the finest examples of the chaos in RDF web data?

EDIT (15/10/08): Indeed, I am new to blogging (and indeed reluctantly at that), and I missed an opportunity for flagrant self-promotion! For more on the issue of identity on the Web and smushing through inverse-functional properties, see this paper from 2007:

Aidan Hogan, Andreas Harth, Stefan Decker. "Performing Object Consolidation on the Semantic Web Data Graph". Proceedings of I3: Identity, Identifiers, Identification. Workshop at 16th International World Wide Web Conference (WWW2007), Banff, Alberta, Canada, 2007.
