[Corpora-List] fast string replacement
Piao, Songlin
s.piao at lancaster.ac.uk
Fri Mar 11 16:44:05 UTC 2005
Hi Jörg,
I have put a freely downloadable Java tool on my webpage, which has a function for this purpose:
http://www.lancs.ac.uk/staff/piaosl/research/download/download.htm
You can use it for your purpose as follows:
1) Replace the commas in your rules with tabs (the program uses a tab as the separator).
2) List your rules, one rule per line, as shown below:
books books/v:3:pres;n:plur
nice nice/adj
3) Go to the menu "Tools" --> "Convert Codes" and click on it to open a file chooser.
4) Choose one or more files that you want to convert.
The program will then replace all matching items in the files with their corresponding substitutes.
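Steps 1 and 2 can also be scripted. The following is only a minimal Python sketch, assuming that each rule has the form "source, target" and that only the first comma on a line is the separator (the target itself may contain further punctuation). The function name `comma_rules_to_tabs` is made up for illustration and is not part of the tool.

```python
def comma_rules_to_tabs(lines):
    """Turn 'source, target' rule lines into 'source<TAB>target' lines.

    Assumption: only the first comma on each line separates source and target.
    """
    converted = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        source, target = line.split(",", 1)
        # Whitespace around the comma is dropped; whether the tool would
        # tolerate it is unknown, so the sketch removes it to be safe.
        converted.append(f"{source.strip()}\t{target.strip()}")
    return converted


sample_rules = ["books, books/v:3:pres;n:plur", "nice, nice/adj"]
print("\n".join(comma_rules_to_tabs(sample_rules)))
```

The resulting lines can be saved to a plain-text file, one rule per line.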
Since it is a Java program, it should run on Linux.
I tried it with your sample rules and sentence, and I got exactly the result you were hoping for.
Scott Piao
________________________________
From: owner-corpora at lists.uib.no on behalf of js at cis.uni-muenchen.de
Sent: Fri 11/03/2005 14:43
To: CORPORA at hd.uib.no
Subject: [Corpora-List] fast string replacement
Hello,
I am looking for a program that
- takes as input a string (!) rewriting dictionary and a corpus
- applies all rewriting rules to all lines of the corpus
- is fast, stable and free
- works under Linux
Example:
Some rewriting rules:
books, books/v:3:pres;n:plur
nice, nice/adj
A "corpus" before transduction:
John reads nice books.
The same corpus after transduction:
John reads nice/adj books/v:3:pres;n:plur
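For illustration only, the transduction described above can be sketched in a few lines of Python. This is not the tool discussed in this thread; it assumes plain substring matching (no word boundaries), a single left-to-right pass so that already rewritten text is not rewritten again, and that the longest source string wins when several could match at the same position. The helper names `load_rules`, `compile_rules` and `rewrite` are hypothetical.

```python
import re


def load_rules(lines, sep=","):
    """Parse 'source<sep> target' lines into a dict (first separator only)."""
    rules = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        source, target = line.split(sep, 1)
        rules[source.strip()] = target.strip()
    return rules


def compile_rules(rules):
    """Build one alternation pattern; longer sources are tried first."""
    sources = sorted(rules, key=len, reverse=True)
    return re.compile("|".join(re.escape(s) for s in sources))


def rewrite(text, rules):
    """Apply all rules in a single pass over the text."""
    if not rules:
        return text
    pattern = compile_rules(rules)
    return pattern.sub(lambda m: rules[m.group(0)], text)


rules = load_rules(["books, books/v:3:pres;n:plur", "nice, nice/adj"])
print(rewrite("John reads nice books.", rules))
```

Because the sketch copies every unmatched character through unchanged, the sentence-final period of the input is kept in the output. For a very large dictionary a dedicated multi-pattern matcher would probably be faster than a regular-expression alternation; the sketch only shows the intended behaviour.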
Does anyone know such a program?
Jörg Schuster