...giving birth to learning...
  Friday, January 28, 2005

  Ontological fingerprinting: documents or people

Anjo gives a bit of insight into our internal discussions on uses of ontologies:

Andy Boyd came up with a wonderful new term: "ontological fingerprinting" and to illustrate how imaginative he is: zero hits on Google! Suppose one has an ontology (lexicon, thesaurus) and some software that can determine whether the terms in the ontology are present in a document. Applying the software, one gets a "fingerprint" of the concepts in the ontology for a given document. Comparing fingerprints for different documents, such is the assumption, provides a better metric of the similarity between these documents than comparing plain words. Ideas like this simply have to be tested in practice. Fortunately, Andy is making available a lot of real data to try it.

I like the term, but find it a bit misleading: usually documents do not have fingers :)

I'd associate the term with people - you may think of the "ontological fingerprint" of a person, which could be something like the conceptualisations produced by Sigmund from an analysis of someone's weblog posts, the set of personal categories someone uses to classify a document, or a mapping of one's documents to a shared ontology. Then you can look for others with similar "fingerprints" (this was one of the uses I imagined for Sigmund, but I didn't have such a nice term to talk about it :).

Maybe we should rather talk about an "ontological abstract" in the case of documents...
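
To make the idea concrete, here is a minimal sketch of the fingerprint comparison Anjo describes. Everything in it is an assumption for illustration: the toy ONTOLOGY, the function names fingerprint and similarity, and the choice of cosine similarity as the metric (the quote doesn't say how fingerprints would be compared).

```python
import math
import re
from collections import Counter

# Hypothetical toy ontology: each concept maps to the terms that signal it.
ONTOLOGY = {
    "learning": {"learning", "training", "course"},
    "knowledge": {"knowledge", "expertise"},
    "weblog": {"weblog", "blog", "blogger"},
}


def fingerprint(text, ontology=ONTOLOGY):
    """Count, per concept, how many words in the text match one of its terms."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for word in words:
        for concept, terms in ontology.items():
            if word in terms:
                counts[concept] += 1
    return counts


def similarity(fp_a, fp_b):
    """Cosine similarity of two fingerprints (assumed metric; 0.0 if either is empty)."""
    dot = sum(fp_a[c] * fp_b[c] for c in fp_a)
    norm_a = math.sqrt(sum(v * v for v in fp_a.values()))
    norm_b = math.sqrt(sum(v * v for v in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


doc_a = "A blogger writes about learning and knowledge."
doc_b = "This course builds expertise; the blog documents it."
print(fingerprint(doc_a))
print(similarity(fingerprint(doc_a), fingerprint(doc_b)))
```

The two sample texts share no ontology terms, yet they touch the same concepts, so their fingerprints come out alike. The same functions would work for a person's "fingerprint" if you feed in all of that person's weblog posts as one text.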

  Researcher vs. blogger: researcher influence

Inna Kouper on disadvantages of participant observation as a research method (in relation to reading Milroy, 1987):

The researcher may be unable to fit the data in a wider context without additional broader studies. Participant observations can be very demanding in time, energy and emotional involvement. There might be a lot of "unanalyzable" data because the researcher has to record everything and then sort it out. Personal characteristics play an essential role and can skew the sample (e.g. males attracted to a female researcher). There is a chance of data distortion from the researcher's side (who may unconsciously influence communication) and from the side of the studied community, referred to in sociolinguistics as the "observer's paradox." The paradox was formulated by Labov in his works of the late 60s and early 70s as follows:

"... researchers want to find out how people talk when they are not being systematically observed; yet we can only obtain these data by systematic observation..."

It is usually ignored in blog studies because we're studying "publicly available" behavior. But if people know there is a blog researcher in their community and that they're being observed, do they change their behavior?

I started to articulate my concerns regarding this issue in Hard choices: researcher vs. blogger?, but I guess I can make a bit more fine-grained distinctions of my influence:

  1. bloggers I study may change their behaviour as a result of knowing that they are being observed (I guess this is what Inna refers to)
  2. by participating in the community I influence behaviours of others

Before I get into the details, I'd like to specify that what I say applies to "my community", broadly defined as "KM/learning/Internet research bloggers". I do not have an objective way to describe it yet, but this is one of my goals in our work with Stephanie on defining weblog community boundaries.

Now to the points...

Bloggers I study may change their behaviour as a result of knowing that they are being observed

Sure, there is a risk of that, but:

Blogging is a bit exhibitionistic anyway - you write in public and you are likely to know that you are "being observed" by your readers. I guess knowing that a family member/friend/colleague/potential employer may read your weblog would influence what is said, and how, more than knowing that someone may study it for research purposes. Of course, it depends on a blogger's awareness of the public nature of blogging, which may be lacking in some groups, but this is definitely not an issue in "my community".

Another reason I don't think I have influence of this kind is the longitudinal nature of my study. You may be aware of a researcher around you for a week or a month, but then life takes its own course (as Dina puts it, you can't be consistently fake). It may have the same effect as a video camera that people stop noticing after being videotaped for some time (btw, does anyone know of scientific evidence for it?).

Now to the second part - by participating in the community I influence behaviours of others.

The simple answer would be that it's part of my design, since what I do is pretty close to action research. And I'm still working on the complex answer, since what I do is closer to research through active participation in the sense Torill uses in her dissertation (as a side remark - this is a nice example of how sharing good Italian food and your research problems with another blogger influences your own research :)

The "complex answer" is still hard to articulate and I don't have all the ingredients, but as an indication, here are the questions I'm trying to tackle as part of it:

  • My community: What is "my community"? Is it one or several? Characteristics? Boundaries? Who are the members?
  • Me and others: What is my role in the community? Is it different from others' roles? How?

This post also appears on channel weblog research

