...giving birth to learning...
  Monday, August 22, 2005

  Link love: lists, clouds and action points

I was thinking of commenting on the unfolding discussion on link love since BlogHer, but couldn't find time to write it up properly (which for me required going through the fast-growing number of posts). Don't think I'll do it properly now, but given our work was referenced a couple of times I feel responsible enough to do it...

I'm in the Feedster top 500 (as some friends nicely point out). So what?

  • I don't have people knocking on my door asking me to speak at conferences or wanting to place ads in my weblog - being in the list doesn't mean that you are in the inner circle (I suspect that A-list is not something defined by whatever top-X list anyway).
  • I do not see any personal value in being in this list or in using it to find others. The only thing it brings is egosmiling - ha, I'm in the list - me having some fun registering the fact. If I disappear from it tomorrow I'd smile again and go on.

These are my personal indicators that lists of popular blogs do not work.

A few things could work. Smart combinations of blog metrics, or better visualizations of conversation clouds because I guess we are more interested in finding the cloudmakers and connecting with them...

[Image: Lilia Efimova (Blog posts 2004)]

[Image: visualization of the political blog network]

I guess there is already some understanding in the community of what is needed. Probably something like those visualizations.

Available for you and me. For our own weblogs or topics we are interested in, not only for those that researchers choose to study. Trusted and clickable.

From what to how

I'm not sure that the problem is in the lack of algorithms. At least those that come from research are published. I think it's pretty much about the teasing data.

It's not enough to come up with a great formula. You have to test it - to see what comes out, to try it on different data sets, to implement it as a tool, to make the tools open to the public, to make sure all of this scales...

But it starts with the data. And the data is not public.

I cannot speak for others, but I can talk about the problems we have with the data needed for our research (which addresses some of the "link love" aspects). What we need to develop algorithms and tools is pretty simple: blog content in "full-text RSS quality" via APIs...
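To make "full-text RSS quality" a bit more concrete: links between posts live in the body of each post, so they are only recoverable when the feed carries the full text. Below is a minimal Python sketch of that idea, using only the standard library. The sample feed, the `LinkCollector` class and the `outgoing_links` function are all made up for illustration; this is not the spider mentioned in this post, and a real feed would need more care (namespaces, `content:encoded`, broken markup).

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def outgoing_links(rss_xml):
    """Map each item's permalink to the links found in its full-text body."""
    root = ET.fromstring(rss_xml)
    result = {}
    for item in root.iter("item"):
        permalink = item.findtext("link", default="")
        # Assumption: the full post body is in <description>, as escaped HTML.
        body = item.findtext("description", default="")
        collector = LinkCollector()
        collector.feed(body)
        collector.close()
        result[permalink] = collector.links
    return result


# Hypothetical sample feed, invented for this example.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example weblog</title>
<item>
<link>http://example.org/post-1</link>
<description>&lt;p&gt;See &lt;a href="http://example.com/a"&gt;this&lt;/a&gt; and &lt;a href="http://example.net/b"&gt;that&lt;/a&gt;.&lt;/p&gt;</description>
</item>
</channel></rss>"""

links = outgoing_links(SAMPLE_FEED)
```

With a summary-only feed the same code would return empty link lists, which is exactly why excerpts are not enough for this kind of analysis.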

We tried many of the current blog indexing tools: no luck (those that are pretty close to what we need, BlogPulse, Technorati and Bloglines, either consider the data they collect commercial or do not have APIs to access it). As a result, Anjo is working on a weblog spider instead of a community discovery algorithm.

I know other researchers working on weblog spidering instead of working on algorithms to process and visualise weblog data. I wonder how many other people out there would play with the data if it were accessible without any threshold. I believe there are many.

I was very sad to hear last week that upflux didn't gain much support from players in the blog indexing market. I wonder if open access to weblog data is a "nice to have, but never real" dream. And I wonder if Mary's effort will turn it into reality...

Btw, are there any Technorati tags for this conversation?

