[Corpora-List] Frequency list of transformations

Stefan Th. Gries STGries at sitkom.sdu.dk
Fri Jan 21 10:01:15 UTC 2005


Dear Marijke

This is not quite what you're looking for, but maybe it's still useful
in some connection. In R, the approximate pattern matching function
agrep uses the Levenshtein string edit distance, which you could use to
at least determine "the total number of insertions, deletions and
substitutions required to transform
one string into another"; in the help file it also says that this
function is "a simple interface to the apse library developed by Jarkko
Hietaniemi (also used in the Perl String::Approx module)".
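
For illustration only, the edit distance described above can be sketched in a few lines. This is a minimal dynamic-programming version in Python, not the agrep or apse implementation; the function name levenshtein is made up for the example, and every insertion, deletion and substitution is assumed to cost 1.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the insertions, deletions and substitutions needed to turn a into b.

    Illustrative sketch with unit costs; not the code behind R's agrep.
    """
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep) ca
        prev = curr
    return prev[-1]
```

Calling levenshtein("kitten", "sitting") counts two substitutions and one insertion. Note that this gives only the number of edits, not which edits they were.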
    If you are also interested in a way to compute the
similarity of two strings to each other, let me know and I'll send you
an R program I have written. It takes as input a list of strings and
outputs the following pairwise similarity measures: Dice, a weighted
version of Dice, XDice, a weighted version of XDice, absolute and
relative longest common subsequence, mean and minimum longest common
subsequence.
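
To give a rough idea of two of these measures, here is a small sketch in Python. It is not the R program mentioned above: the function names are invented for the example, Dice is assumed to be computed over character bigrams, and "relative" longest common subsequence is assumed to mean the length divided by the length of the longer string. The weighted and XDice variants are not covered.

```python
from collections import Counter
from itertools import combinations


def bigrams(s: str) -> list:
    """Return the overlapping character bigrams of s."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (assumed definition)."""
    ba, bb = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:  # both strings shorter than two characters
        return 0.0
    shared = sum((ba & bb).values())  # multiset intersection
    return 2 * shared / total


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence (absolute LCS)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def relative_lcs(a: str, b: str) -> float:
    """LCS length divided by the longer string's length (assumed normalisation)."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 0.0


def pairwise(strings):
    """Yield (a, b, dice, absolute LCS, relative LCS) for every pair of strings."""
    for a, b in combinations(strings, 2):
        yield a, b, dice(a, b), lcs_length(a, b), relative_lcs(a, b)
```

For a list such as ["night", "nacht", "nicht"], list(pairwise(...)) returns one tuple per pair of strings.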
Best,
STG

Stefan Th. Gries
----------------------------------------
IFKI, Southern Denmark University
http://people.freenet.de/Stefan_Th_Gries
----------------------------------------





