Updated: 6/25/2007; 10:18:31 PM.


on personal productivity in knowledge-intensive environments, weblog research, knowledge management, PhD, serendipity and lack of work-life balance...

Earlier | Home | Later

  Tuesday, June 12, 2007

  Tools to find similarity between two texts (weblog and papers)

I'm playing with an idea of comparing (parts of) my weblog with some of my published papers (and with the dissertation as a whole when I'm done). So far I'm interested in two things:

  • how much of the text is reused
  • how conceptually close two texts (weblog and a paper) are

Thought of a couple of ways to do so:

  • One way would be to use the various weblog analysis tools from Anjo. One difficulty there is figuring out how to find similarities between weblog text, which consists of relatively self-contained microcontent pieces, and linear academic papers that build upon what was said previously.
  • Another option would be to use some plagiarism detection tools. I only wonder whether those can be configured to compare a target paper with a specific weblog, rather than with "everything published".
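
For a rough first look at both questions without any special tool, something like the sketch below might do. It is only an illustration under my own assumptions, not how Anjo's tools or any plagiarism detector work: text reuse is approximated by the share of the paper's five-word runs that also occur in the weblog, and conceptual closeness by cosine similarity of plain word counts. The function names and the choice of n=5 are made up for this example.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(words, n=5):
    """Set of all runs of n consecutive words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def reuse_share(paper, weblog, n=5):
    """Share of the paper's n-word runs that also appear in the weblog.

    The run length n=5 is an arbitrary assumption, not a recommended value.
    """
    paper_runs = shingles(tokens(paper), n)
    if not paper_runs:
        return 0.0
    weblog_runs = shingles(tokens(weblog), n)
    return len(paper_runs & weblog_runs) / len(paper_runs)


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between raw word-count vectors (no weighting)."""
    a, b = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0
```

Because weblog posts are self-contained, it probably makes more sense to score each post against the paper separately than to glue the whole weblog into one text. Raw word counts are a crude stand-in for "conceptual" closeness, since common words dominate the score; dropping stopwords or weighting rare words more heavily would likely help.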

Any ideas?

  Blogger thought group and attributing ideas

Browsing my archives and realising that I'd better quote those comments to Context and attribution (12 Feb 2004!) in a blogpost, which is easier to find later.

By Alex Halavais (#):

This is, arguably, easy enough with words, but much harder when it comes to ideas. I came up with some thoughts that, I will assert, are my own. Someone noted that these followed closely some things you had written about in your blog. I am a regular reader of your blog, and I think it is likely that these entries--at the very least--prompted my thinking in a particular direction. This tendency to remember the ideas but forget their source--the "sleeper effect"--has been shown in communication research several times over the last 50 years.

You actually know about this, because someone else made the connection and hyperlinked it. But otherwise, I would have been absconding with your ideas without due credit. As interested as I am in encouraging hyperlinking as attribution, there has to be a limit.

I wonder whether a standing set of citations (your "Regular reads/dialogues") constitutes a kind of "thought group"--an indication that your ideas are at least in some part attributable to the people you communicate with every day?

By Piers Young (#):

Crikey - all sounds like we're beginning to enter the murky world of Intellectual Property Rights. Have a few brief comments: 1) that this trail is happening at all is a good thing. It underlines the fact that there is value (however intangible) in blogging. 2) I don't think the "thought group" idea is quite enough. Most, or at least many, blogs have a "thought group" anyway: a blogroll. Most, or at least many, bloggers have diverse interests: they may be into KM and skiing, KM and whiskey or KM and needlecraft or - you get the picture. One of the great things about links is that they allow me to get an idea of which blogs most interest me. Without specific citations, I - as let's say a needlecraft aficionado - would have to wade through a whole load of stuff on marketing, whiskey and skiing. Links, along with a whole load of other good things, help you filter. 3) That said, I agree there has to be a limit. In many cases it just isn't practical to search all the citations and make all the links. But surely you do as much as you've got time for? And with the joys of trackback, bookmarklets etc, you almost by definition have time for one.

Alternating between typing, reading, browsing my weblog and walking around (usually means writing flow :)


© Copyright 2002-2007 Lilia Efimova.

This weblog is my learning diary. Sometimes I write about things related to my work, but the views expressed here are personal and do not necessarily reflect the views of my employer.


Edublog award 2004 as Best Research Based Blog.
