[Corpora-List] Frequency list of transformations
Viktor Pekar
v.pekar at wlv.ac.uk
Fri Jan 21 09:39:06 UTC 2005
Hi Marijke,
Here is a Perl module that reports which letters need to be removed,
inserted, or substituted in one word to get the other:
http://cs.haifa.ac.il/~shlomo/talks/edit_distance/slides/Brew.pm.html
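For readers without Perl at hand, the general idea can be sketched in Python. This is only an illustration of a standard edit-distance table with a backtrace, not the interface or code of the module linked above; the function name edit_operations and the tuple layout are made up for this example.

```python
def edit_operations(source, target):
    """Return one minimal list of (operation, source_char, target_char)
    tuples that turns `source` into `target`. Names are illustrative."""
    n, m = len(source), len(target)
    # dist[i][j] = edit distance between source[:i] and target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # remove a letter
                             dist[i][j - 1] + 1,         # insert a letter
                             dist[i - 1][j - 1] + cost)  # keep or substitute

    # Walk back from the bottom-right corner to recover one optimal script.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if source[i - 1] == target[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                if cost:
                    ops.append(("substitute", source[i - 1], target[j - 1]))
                i -= 1
                j -= 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", source[i - 1], ""))
            i -= 1
        else:
            ops.append(("insert", "", target[j - 1]))
            j -= 1
    ops.reverse()
    return ops


print(edit_operations("life", "live"))
print(edit_operations("great", "geat"))
```

When several edit scripts have the same cost, this sketch simply returns the first one the backtrace finds, so the grouping of operations may differ from what a linguist would choose.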
Viktor
----- Original Message -----
From: "Marijke Koster" <marijke at polderland.nl>
To: <CORPORA at UIB.NO>
Sent: Friday, January 21, 2005 8:44 AM
Subject: [Corpora-List] Frequency list of transformations
Dear corpora list members,
Does anyone have a suggestion for a simple method or script to extract
a frequency list of transformations from a list of spelling errors and
their corrections?
For example, take this tab-separated list:
wrong       correct
-----       -------
occurence   occurrence
occosion    occasion
commputer   computer
live        life
heavie      heavy
geat        great
save        safe
After applying the method, the result should be something like this:
1 rr -> r
1 a -> o
1 m -> mm
2 f -> v
1 y -> ie
1 r -> ()
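One possible way to get such a list, sketched in Python under a strong assumption: each wrong/correct pair differs in a single contiguous region, so stripping the common prefix and suffix leaves the transformation. This is a simpler technique than a full edit-distance alignment and will lump several separate errors in one word into a single rule. The function names and the doubled-letter handling (which turns "r -> ()" into "rr -> r" when the neighbouring letter is the same) are choices made for this example only.

```python
from collections import Counter


def transformation(correct, wrong):
    """Return (correct_part, wrong_part) for the one region where the
    two words differ. Assumes a single contiguous error per word."""
    shortest = min(len(correct), len(wrong))
    # Length of the common prefix.
    p = 0
    while p < shortest and correct[p] == wrong[p]:
        p += 1
    # Length of the common suffix, not overlapping the prefix.
    s = 0
    while (s < shortest - p
           and correct[len(correct) - 1 - s] == wrong[len(wrong) - 1 - s]):
        s += 1
    c_part = correct[p:len(correct) - s]
    w_part = wrong[p:len(wrong) - s]
    # A single letter was added or dropped next to an identical letter:
    # include that neighbour so the rule reads "rr -> r", not "r -> ()".
    changed = c_part or w_part
    if p > 0 and len(changed) == 1 and not (c_part and w_part):
        if correct[p - 1] == changed:
            c_part = correct[p - 1] + c_part
            w_part = wrong[p - 1] + w_part
    return c_part, w_part


def count_transformations(pairs):
    """Count transformations over an iterable of (wrong, correct) pairs."""
    counts = Counter()
    for wrong, correct in pairs:
        if wrong != correct:
            counts[transformation(correct, wrong)] += 1
    return counts


# The example list from this message, as (wrong, correct) pairs.
pairs = [("occurence", "occurrence"), ("occosion", "occasion"),
         ("commputer", "computer"), ("live", "life"),
         ("heavie", "heavy"), ("geat", "great"), ("save", "safe")]

for (c_part, w_part), count in count_transformations(pairs).most_common():
    print(count, c_part or "()", "->", w_part or "()")
```

Reading the pairs from a real tab-separated file would only need a loop that splits each line on the tab character before calling count_transformations.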
Thanks in advance,
Marijke Koster
______________________________________
Marijke Koster, linguistic engineer
Polderland Language & Speech Technology BV
The Netherlands
http://www.polderland.nl
Phone: +31.24.352 28 66
Fax: +31.24.352 28 60