Corpora: Iggy Roca's query about retrieving from a word list

Geoffrey Sampson geoffs at cogs.susx.ac.uk
Wed Oct 10 14:33:51 UTC 2001


My immediate response to Iggy's query is that for the first few things
he asks about, if his list has one word per line (or can be reformatted to
have one word per line), searching for "all words that end in a vowel", or
"which have a vowel 3rd letter from end", etc., is very easily done with
grep, so that it would be bootless to look at more out-of-the-way software.
Clearly grep alone will not say where the syllable boundaries come, but if
this is decidable mechanically at all, presumably it would be easy to write
a routine to insert syllable-boundary symbols, and then use grep as before.
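
As a rough illustration of the kind of grep search meant here, the sketch
below runs the two example queries over a small made-up list. The file
names, the sample words, the vowel set a/e/i/o/u, and the use of "." as a
syllable-boundary symbol are all assumptions for the sake of the example,
not anything taken from Iggy's data.

```shell
# Hypothetical sample list, one word per line (not from the original query).
printf '%s\n' banana crisp radio tempo sky > words.txt

# All words that end in a vowel (vowels assumed to be a, e, i, o, u).
grep -i '[aeiou]$' words.txt

# Words with a vowel as the third letter from the end:
# a vowel, then exactly two more characters, then end of line.
grep -i '[aeiou]..$' words.txt

# Assumption: some separate routine has already marked syllable
# boundaries with "." (the marking here is typed by hand).
printf '%s\n' ba.na.na crisp ra.di.o tem.po sky > syllabified.txt

# With boundaries marked, grep works as before, e.g. words whose
# first syllable ends in a vowel.
grep -i '^[^.]*[aeiou]\.' syllabified.txt
```

On this sample the first search prints banana, radio and tempo, the second
prints banana and crisp, and the third prints ba.na.na and ra.di.o. A list
with accented vowels would need those characters added to the bracket
expression, and a file with Windows line endings would need them stripped
first, since the carriage return would otherwise count as the last character.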


G.R. Sampson, Professor of Natural Language Computing

School of Cognitive & Computing Sciences
University of Sussex
Falmer, Brighton BN1 9QH, GB

e-mail geoffs at cogs.susx.ac.uk
tel. +44 1273 678525
fax  +44 1273 671320
web http://www.grsampson.net


