trenchant.org

by adam mathes

Implicit Faceted Classification

[This is one of those nerdy essays that will bore most people to tears. Sorry. I should probably put them somewhere else. I’ll try to get back to writing about more important things like cartoons and video games and my inability to interact with the real world soon.]

One of the interesting things about the uncontrolled free tagging that occurs in folksonomies is that the tags seem to be implicitly faceted.

The term facet is often used in the information architecture context in a loose fashion, and I’m probably guilty of that in my folksonomies paper as well. Rather than attempt to figure out what S. R. Ranganathan and other Library Science pioneers meant – I have my hands full with that in my classification systems seminar, and it’s being taught by someone who studied under Ranganathan – for simplicity’s sake I’m considering a facet to be a property of something.

Although tags are often used to describe the subject of documents, this is not their only use. I mentioned this briefly in the folksonomies paper, noting, for example, that proper place names are some of the most popular tags on Flickr. But I didn’t really delve into it. I’m not going to treat it with any academic rigor now either, but I think it’s an important concept to explore.

One way to describe tagging is that it is simply “labeling.” In the most simplistic view of tagging, you are associating a keyword with a document, nothing more, nothing less. That is what is going on in the most literal sense, but both the people creating tags and the people looking at tagged documents have an implicit understanding of those tags as one half of a property-value pair (to borrow a term from the RDF semantic web nerds – have I mentioned I’m taking a course in formal ontology as well?).

For example, as I write this, one of the front entries on Del.icio.us is:

Relay control
Concise, pragmatic and complete tutorial showing how to activate a relay through the parallel port
to electronics parallel.port relays tutorials by josealvarez

While the first three tags can be seen as subjects of the linked document, the last tag, “tutorials,” clearly is not. It is a “genre” or “form” or “type.” Arguing about which one is right isn’t the point – the important thing is we can agree it’s not adequately described by a “subject” relationship.

Note that it’s not necessary for anyone to point out that “tutorials” is unlikely to be the subject of a document – it has a different relationship to the document, or represents a different facet, depending on your perspective. While it may seem obvious and trivial, I think it’s important and interesting to note that this relationship is implicit and easily understood by humans. (Not necessarily computers, but with the right algorithms and analysis… who knows.)
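To make the property-value reading concrete, here is a minimal Python sketch of the Relay control entry above. The facet names ("subject", "genre") and the helper functions are my own illustrative assumptions; nothing like this is stored by Del.icio.us, which only keeps the flat list.

```python
# Flat tags, as they appear on the Del.icio.us entry quoted above.
flat_tags = ["electronics", "parallel.port", "relays", "tutorials"]

# Hypothetical explicit version: each tag becomes a (property, value) pair.
# The property names are assumptions chosen for illustration only.
faceted_tags = [
    ("subject", "electronics"),
    ("subject", "parallel.port"),
    ("subject", "relays"),
    ("genre", "tutorials"),
]


def values_for(facet, pairs):
    """Return the values recorded under one facet, in their original order."""
    return [value for prop, value in pairs if prop == facet]


def flatten(pairs):
    """Drop the property half of each pair, recovering the plain tag list."""
    return [value for _, value in pairs]


print(values_for("subject", faceted_tags))
print(values_for("genre", faceted_tags))
print(flatten(faceted_tags) == flat_tags)
```

Flattening the pairs gives back exactly what the tagger typed, which is one way to see that the property half is the part humans supply implicitly when reading.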

Of course, like anything that is implicit rather than explicit, there will be ambiguity.

As always, it’s important in any good tagging example to involve Matt Haughey. Matt is known online by his nickname “mathowie.”

One use of the mathowie tag on del.icio.us is for documents written by Matt. But there are also documents where Matt is the subject, such as an article in Discover that talks about Matt’s now infamous success paying his mortgage by redirecting his constant babbling about TiVo from annoyed friends and family to the whole web.

And then there’s this:

Crump-lah! Crippy Duck
this is that laptop bag that mathowie linked to and i am in the market
to bag laptop mathowie wishlist by kfan

The facet analysis here is more complex, but “recommended by” would be how I would describe the relationship between “mathowie” and the Crippy Duck.
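The three uses of the mathowie tag can be sketched the same way. In this rough Python example the "relation" field and its values ("author", "subject", "recommended_by") are hypothetical annotations of mine, and the first two titles are placeholder descriptions rather than real bookmark titles.

```python
# Three bookmarks that all carry the flat tag "mathowie".
# "relation" is a hypothetical annotation; the real bookmarks have no such field.
bookmarks = [
    {"title": "(a post written by Matt)", "tag": "mathowie", "relation": "author"},
    {"title": "(Discover article about Matt)", "tag": "mathowie", "relation": "subject"},
    {"title": "Crump-lah! Crippy Duck", "tag": "mathowie", "relation": "recommended_by"},
]


def search_flat(tag, items):
    """What a tag page does today: match on the tag alone."""
    return [item["title"] for item in items if item["tag"] == tag]


def search_faceted(tag, relation, items):
    """What an explicit system could do: match on the tag and its relation."""
    return [
        item["title"]
        for item in items
        if item["tag"] == tag and item["relation"] == relation
    ]


print(search_flat("mathowie", bookmarks))
print(search_faceted("mathowie", "recommended_by", bookmarks))
```

The flat search returns all three bookmarks and leaves the reader to sort out which relationship applies; the faceted search can only separate them because someone had to fill in the extra field.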

I have a hunch that making these relationships explicit would require some sort of cognitive jump from the ambiguity and implicit nature of things now, and that the flexible nature of current tools leveraging folksonomies avoids that jump. I’m certainly not advocating that these things need to be made explicit. On the other hand, most librarians would probably not be content with a system that could not distinguish between works by William Shakespeare and those about William Shakespeare. But you can’t do that with web search engines now, and I’m not sure how many people are complaining about that.

Many of the brightest minds in the classification world have been saying for years that faceted classification is the future – traditional hierarchical structures could not and would not scale as the world of knowledge increased and changed. While faceted classification has found some traction in the information architecture community and increasing use on commercial sites, it’s really interesting to see what is developing organically on Del.icio.us and Flickr from that perspective.
