June 12th 2007 06:00 pm

Tools to find similarity between two texts (weblog and papers)

I’m playing with the idea of comparing (parts of) my weblog with some of my published papers (and with the dissertation as a whole when I’m done). So far I’m interested in two things:

  • how much of the text is reused
  • how conceptually close two texts (weblog and a paper) are

Thought of a couple of ways to do so:

  • One way would be to use the various weblog analysis tools from Anjo. One difficulty there would be figuring out how to find similarities between weblog text, which consists of relatively self-contained microcontent pieces, and linear “build upon what was previously said” academic papers.
  • Another option would be to use some plagiarism detection tools. I only wonder whether those can be configured to compare a target paper with a specific weblog, rather than with “everything published”.
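Just to make the two questions concrete, here is a rough sketch in Python of what could be measured. It is not what any of the tools above actually do. It assumes both texts are available as plain strings, and all function names are made up for illustration. Text reuse is approximated as the share of a paper's five-word sequences that also appear in the weblog. Conceptual closeness is approximated, very crudely, as cosine similarity of word counts, which only catches shared vocabulary.

```python
import math
import re
from collections import Counter


def words(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(text, n=5):
    """Return the set of all n-word sequences (shingles) in the text."""
    tokens = words(text)
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def reuse_ratio(source, target, n=5):
    """Fraction of the target's n-word shingles that also occur in the source."""
    target_shingles = shingles(target, n)
    if not target_shingles:
        return 0.0
    return len(target_shingles & shingles(source, n)) / len(target_shingles)


def cosine_similarity(text_a, text_b):
    """Cosine similarity between raw word-count vectors of the two texts."""
    a, b = Counter(words(text_a)), Counter(words(text_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0
```

With the weblog as `source` and a paper as `target`, `reuse_ratio(weblog_text, paper_text)` would give the fraction of the paper that overlaps word-for-word with the weblog, and `cosine_similarity(weblog_text, paper_text)` a rough score between 0 and 1 for shared vocabulary. Self-contained weblog posts could also be compared one by one against each section of a paper instead of as a single block.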

Any ideas?

Archived version of this entry is available at http://blog.mathemagenic.com/2007/06/12.html#a1909; comments are here.
