Folksonomy, Tag Collisions, and Tag Spam

by Joe Loong on March 31, 2009


This past weekend was the Government 2.0 Camp unconference; I’ll be talking more about it this week (with some recaps and notes), but it reminded me of a few tag-related issues I’ve been mulling over.

During the event, the hashtag #gov20camp became one of the top tags on Twitter, which is probably not surprising when you get a few hundred social media and Twitter users together to liveblog, tweet, and otherwise overshare about a common event.

It made for a very active TwitterFall display.

I’m not an expert on taxonomy and folksonomy, so consider this a layman’s charmingly simplistic thoughts on user tags and social tags — specifically, the things people do to keep tags useful and relevant in the real world.

Because so many people were using the “#gov20camp” tag, session leaders were asked to come up with a tag specific to their session, to provide some categorization so that people could find info on specific sessions.
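
For the code-minded, here is a rough sketch of what that two-level scheme buys you: one event tag, plus an optional session tag, is enough to sort a noisy stream into buckets. The sample posts, the "#g2csocial" session tag, and the function name are all made up for illustration; this is not how Twitter or any particular tool does it.

```python
import re
from collections import defaultdict

# Hypothetical sample posts; the "#g2csocial" session tag is invented for this sketch.
POSTS = [
    "Great opening remarks #gov20camp",
    "Notes from the social media session #gov20camp #g2csocial",
    "Slides are up! #gov20camp #g2csocial",
    "Lunch line is long #gov20camp",
]

HASHTAG = re.compile(r"#(\w+)")


def group_by_session(posts, event_tag):
    """Group posts that carry the event tag by any additional (session) tags."""
    groups = defaultdict(list)
    for post in posts:
        tags = {t.lower() for t in HASHTAG.findall(post)}
        if event_tag not in tags:
            continue  # not part of this event's stream
        session_tags = tags - {event_tag}
        # Posts with only the event tag land in a catch-all bucket.
        for tag in session_tags or {"(general)"}:
            groups[tag].append(post)
    return dict(groups)


groups = group_by_session(POSTS, "gov20camp")
```

Anyone who only cares about one session can then read `groups["g2csocial"]` and skip the rest of the firehose.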

This is similar to something Daniel Terdiman, over at CNET’s Webware blog, wondered a few weeks back: whether tag saturation was diminishing the value of Twitter at the SXSW conference (probably not, as he noted in a follow-up).

Part of the problem of tag overuse reminds me of the old Yogi Berra quote: “No one goes there any more; it’s too crowded.”

Another problem with popular tags is the potential for tag spam. There are a few different types of tag spam, but here I’m talking about intentionally glomming on to a popular tag (like #sxsw or #gov20camp) for your unrelated content, simply because you know a lot of people are looking for it. It’s a way to game the system, like we saw in the recent Skittles Twitter experiment.

And of course, since there’s no master tag licensing authority, we see other kinds of tag conflicts. There are tag collisions, where people intentionally or accidentally use the same tag for equally valid yet unrelated content. There’s improper tag recycling, where people reuse a tag when they should create a new, unique one so you can distinguish between versions (e.g. a generic “Wrestlemania” tag instead of an event-specific “Wrestlemania92” tag). And there’s tag sprawl, where, absent a plan for how you’re going to use tags, you end up with a meandering mess… which is fine, unless you’re using those tags as a category system, as we see in many blogs.
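
To make two of those problems concrete, here is a minimal sketch, assuming tags are plain strings and that variants differ only in case and punctuation. The helper names (`normalize`, `find_sprawl`, `event_tag`) are hypothetical. It flags tags spelled more than one way (sprawl) and builds an event-specific tag from a generic one (to avoid recycling). It does nothing about collisions, which need context rather than string matching.

```python
import re
from collections import defaultdict


def normalize(tag):
    """Lowercase a tag and drop '#', punctuation, and whitespace."""
    return re.sub(r"[^a-z0-9]", "", tag.lower())


def find_sprawl(tags):
    """Return normalized keys that show up under more than one spelling."""
    variants = defaultdict(set)
    for tag in tags:
        variants[normalize(tag)].add(tag)
    return {key: sorted(forms) for key, forms in variants.items() if len(forms) > 1}


def event_tag(generic, year):
    """Build an event-specific tag by appending a two-digit year to a generic tag."""
    return f"{normalize(generic)}{year % 100:02d}"


sprawl = find_sprawl(["#Gov20Camp", "gov20camp", "gov-20-camp", "sxsw"])
specific = event_tag("Wrestlemania", 1992)
```

Here `sprawl` groups the three spellings of the camp tag under a single key, and `specific` is the year-stamped version of the generic tag.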

It seems to me (why do I feel like Andy Rooney all of a sudden?) that any folksonomic scheme, where users can label content however they want with no master categorization plan, invariably develops, once it gets big enough, toward something that looks and functions like a traditional, hierarchical, categorized taxonomy. And if it doesn’t, then structures develop around it to help perform those functions, or else it ceases to be useful.

I’m betting there’s a named postulate in the field of information theory or library science or something that I’m unknowingly paraphrasing.

Anyway, the real-world challenge to folksonomies is that people tend not to think like librarians or information architects, so we look to technology to help fill in the blanks: Groups. Context. More metadata. Human editors. Better filtering. Smarter machines. Semantic tags. Or returning to purposeful tagging, like the way you can tag people in Facebook photos, where it links back to a specific person’s profile.

Basically, the world is a complicated place, and any tagging scheme will eventually grow to reflect that complexity, or die.
