Updated: 1/19/2006; 9:53:37 PM.

Mathemagenic


...giving birth to learning...



  Tuesday, December 20, 2005


  Topics and terms (categorisations and text analysis) for weblog conversations

Anjo, in What is a topic?

The most mysterious term that I encountered a lot recently is topic. I have no idea how to define it and, neither seem the weblog research proposals that suggest finding the topic of a post is something worth doing. Being on holiday currently, and given it was raining and snowing outside, I tried to apply the notion of "topic finding" to weblog conversations (see also: here, and here).

Anjo goes on to give an example of "unique" terms extracted from three weblog conversations (more details in the post). Although those terms give a good picture of what the conversations are about, they do not really answer the question of what the topic of each conversation is.

Which makes me think of my own experiences around the issue...

One of the things we planned to do this year, but didn't get to, was looking at personal categorisations. To be more specific, the idea was to compare the categories (~tags, ~topics) that a blogger assigns to her posts with the results of text analysis of those posts, to see if there is any correlation between the language used and the conceptual categories. [I still think it's an experiment worth doing, but I'm not sure I personally can devote serious time to it. Anyone interested?]

Thinking of my own weblog, I can imagine that for some of the topics I use (I call them topics ;) the correlation should be present (e.g. posts related to an event are likely to be labelled with it and to mention it in the text).

However, there are others, where I assign a topic to organise my ideas on ill-structured themes (=I feel that those posts belong together, but I don't know why yet, or I don't have a good label for them). Examples of this second type are posts on life, knowledge mapping or transparency.
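A very crude way to start on the experiment described above is to check, per category, how often the category label itself appears in the text of the posts filed under it. The sketch below is only an illustration under stated assumptions: the post structure, the function name `label_mention_rate` and the sample posts (including the event name "ExampleConf") are all made up, and a plain substring match is a stand-in for real text analysis.

```python
def label_mention_rate(posts):
    """For each category, return the share of its posts whose text mentions the label.

    `posts` is assumed to be a list of dicts with "categories" (list of str)
    and "text" (str); this structure is hypothetical.
    """
    totals, mentions = {}, {}
    for post in posts:
        text = post["text"].lower()
        for category in post["categories"]:
            totals[category] = totals.get(category, 0) + 1
            # Crude check: case-insensitive substring match on the label.
            if category.lower() in text:
                mentions[category] = mentions.get(category, 0) + 1
    return {c: mentions.get(c, 0) / totals[c] for c in totals}


# Made-up sample data: one event-like category, one ill-structured one.
posts = [
    {"categories": ["ExampleConf"], "text": "Notes from the first day of ExampleConf"},
    {"categories": ["ExampleConf"], "text": "People I met at ExampleConf"},
    {"categories": ["life"], "text": "A rainy Sunday and too much coffee"},
    {"categories": ["life"], "text": "Thinking about what matters"},
]

rates = label_mention_rate(posts)
print(rates)
```

In this toy data the event category is always mentioned in the text, while the "life" posts never use the word, which is the pattern I would expect for the two types of topics.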

Which brings me to the reason I started to write this post. I think that topics are conceptual categories used to characterise a group of connected pieces (conversations with others, conversations with self, or something in between) and to give that group a nametag. The common name makes sense - it makes it easier to remember that those pieces belong together, to retrieve them, and to communicate about them.

The problem is that conceptual categories are subjective. They depend on a person, a group or even groupthink (as with the pressure to use certain tags in order to appear in the right places in Technorati, not because they make more sense than others). So I suspect that once we define the topic of a conversation there will be someone who says that it's about something else (referring to Anjo's examples - it could be "not about Skype, but about presence").

That said, I still think that defining the topic of a conversation makes sense. Personally, I'd prefer to have a Sigmund picture (~frequent terms and relations between them) for a conversation, as some kind of ontological fingerprint of what the conversation is about. Alternatively, there are a number of ways to select one of the terms from the "unique term list" for a conversation:

  • by further selecting the "least unique" terms from the subset (i.e. terms used by the highest number of participants in the conversation)
  • by selecting terms that match categories that some of the participants assign to their posts
  • by selecting terms that match a predefined ontology/folksonomy/keyword list
  • by selecting terms that most of the participants are likely to agree on (don't ask me how to do that :)
  • by selecting terms that most closely resemble those of an external "customer" for the analysis, or those that a non-participant is likely to understand
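
The first option in the list above is the easiest one to sketch. This is a minimal illustration, not Anjo's method: the function name `least_unique_term`, the input format (a mapping from participant to the words they used) and the sample conversation are assumptions made up for the example.

```python
from collections import Counter


def least_unique_term(unique_terms, words_by_participant):
    """Pick the term from `unique_terms` used by the most participants.

    `words_by_participant` is assumed to map a participant name to the list
    of words from their posts in the conversation (hypothetical format).
    Returns None if `unique_terms` is empty.
    """
    counts = Counter()
    for words in words_by_participant.values():
        used = set(words)  # count each participant once per term
        for term in unique_terms:
            if term in used:
                counts[term] += 1
    # Most participants first; ties are broken alphabetically.
    ranked = sorted(unique_terms, key=lambda t: (-counts[t], t))
    return ranked[0] if ranked else None


# Made-up conversation, loosely following the Skype/presence example.
conversation = {
    "anna": ["skype", "presence", "chat"],
    "ben": ["presence", "status"],
    "carl": ["presence", "skype"],
}

topic = least_unique_term(["skype", "presence", "status"], conversation)
print(topic)
```

Here "presence" wins because all three participants use it, even though "skype" might be the more obvious label, which is exactly the kind of disagreement about topics mentioned earlier.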

Or we just have to find a way of matching personal categorisations. Given where the tools are going, this shouldn't be that far off...




© Copyright 2002-2006 Lilia Efimova.

This weblog is my learning diary. Sometimes I write about things related to my work, but the views expressed here are personal and do not necessarily reflect the views of my employer.

 








