[Corpora-List] fast string replacement

Piao, Songlin s.piao at lancaster.ac.uk
Fri Mar 11 16:44:05 UTC 2005


Hi Jörg,
 
I have put a freely downloadable Java tool on my webpage, which has a function for this purpose:
http://www.lancs.ac.uk/staff/piaosl/research/download/download.htm
 
You can use it for your purpose as follows:
 
1) Replace the commas in your rules with tabs (the program uses tabs as the separator).
2) List your rules, with each rule on a separate line, as shown below:
  books   books/v:3:pres;n:plur
  nice      nice/adj
 
3) Go to the "Tools" menu and click on "Convert Codes" to open a file chooser.
4) Choose one or more files that you want to convert.
 
The program will then replace all matching items in the files with their corresponding substitutes.
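
For illustration only, the same kind of rule-based replacement can be sketched in a few lines of Python. This is not the code of the Java tool; the function names (load_rules, compile_rules, rewrite) are made up for this example, and it assumes that each rule is a whole word followed by a tab and its substitute. All rules are compiled into one regular expression, so each line is scanned once and text that has already been substituted is not rewritten again.

```python
import re

def load_rules(lines):
    """Parse 'source<TAB>substitute' lines into a dict (hypothetical helper)."""
    rules = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        source, target = line.split("\t", 1)
        rules[source.strip()] = target.strip()
    return rules

def compile_rules(rules):
    """Build one alternation over all rule keys, longest key first."""
    keys = sorted(rules, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in keys)
    # Assumption: keys are whole words, so \b word boundaries are appropriate.
    return re.compile(r"\b(?:" + alternation + r")\b")

def rewrite(text, rules, pattern):
    """Replace every match with its substitute in a single pass."""
    return pattern.sub(lambda m: rules[m.group(0)], text)

rule_lines = ["books\tbooks/v:3:pres;n:plur", "nice\tnice/adj"]
rules = load_rules(rule_lines)
pattern = compile_rules(rules)
print(rewrite("John reads nice books.", rules, pattern))
# prints: John reads nice/adj books/v:3:pres;n:plur.
```

Sorting the keys by length means that, where two rules share a prefix, the longer one is tried first. For a very large dictionary a different approach (for example, splitting the text into tokens and looking each one up in a hash table) may well be faster; the sketch above has not been benchmarked.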
 
Since it is a Java program, it should run under Linux.
 
I tried it with your sample rules and sentence, and I got exactly the result you hoped for.
 
Scott Piao

 

________________________________

From: owner-corpora at lists.uib.no on behalf of js at cis.uni-muenchen.de
Sent: Fri 11/03/2005 14:43
To: CORPORA at hd.uib.no
Subject: [Corpora-List] fast string replacement



Hello,

I am looking for a program that

- takes as input a string (!) rewriting dictionary and a corpus
- applies all rewriting rules to all lines of the corpus
- is fast, stable and free
- works under Linux

Example:

Some rewriting rules:

 books, books/v:3:pres;n:plur
 nice, nice/adj

A "corpus" before transduction:

 John reads nice books.

The same corpus after transduction:

 John reads nice/adj books/v:3:pres;n:plur

Does anyone know such a program?

Jörg Schuster


