Secondary entries (was Re: [Lexicog] Query on how to deal with coined words)
Mike Maxwell
maxwell at LDC.UPENN.EDU
Tue Apr 10 14:33:42 UTC 2007
Vincent `Bentong` S. Isles wrote:
> I would like to know that "complex procedure". I had spent half of
> yesterday and the whole of today trying to understand the Consistent
> Changes program, and I do very well think of CC when you wrote
> "complex" :)
It might be useful to say what you're hoping to use cc for. Cebuano
morphology is very complex, with reduplication and infixing, and if
you're trying to go from stems or inflected words to roots, I don't
think I would recommend cc. What you really need is a morphological parser.
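
To make the difficulty concrete, here is a toy sketch in Python (not the parser mentioned below, and not cc). It guesses roots by undoing one infix or one full reduplication. The affix list, the function name candidate_roots, and the example words are illustrative assumptions only, not a description of Cebuano grammar.

```python
# Toy root guesser: undoes one infix or one full reduplication.
# INFIXES is an illustrative assumption, not a complete affix inventory.
INFIXES = ("in", "um")
VOWELS = "aeiou"


def candidate_roots(word):
    """Return the word plus every string reachable by one 'undo' step."""
    candidates = {word}

    # Infix directly after an initial consonant: s-in-ulat -> sulat
    if len(word) > 3 and word[0] not in VOWELS:
        for infix in INFIXES:
            if word[1:1 + len(infix)] == infix:
                candidates.add(word[0] + word[1 + len(infix):])

    # Full reduplication, written with or without a hyphen: balay-balay -> balay
    plain = word.replace("-", "")
    half = len(plain) // 2
    if half >= 2 and len(plain) % 2 == 0 and plain[:half] == plain[half:]:
        candidates.add(plain[:half])

    return candidates


if __name__ == "__main__":
    for w in ("sinulat", "balay-balay", "minuto"):
        print(w, sorted(candidate_roots(w)))
```

Without a lexicon the guesser cannot tell a real infix from an accidental letter sequence: given "minuto" it also proposes the spurious "muto". Checking candidates against a list of known roots, and handling affixes that combine, is the work a morphological parser does.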
It happens that I wrote a morphological parser for Cebuano several years
ago. I won't claim that it does everything right--I basically wrote it
in a couple of hours, and tweaked it a little after that--but it would
probably work better than one could do with cc. I did it while I was working for
the Linguistic Data Consortium (LDC), and I'll have to ask them whether
it's sharable. It uses the Xerox finite state tools, for which you have
to pay $40 (the CD comes in a book from U of Chicago press). If that's
of interest, let me know and I'll see what the LDC says.
On the other hand, if you're trying to fix spelling errors, a parser
won't help you fix them (it might help you find them). A program like
cc might be usable to fix some classes of errors, provided you have some
notion of what common errors are (substituting a 'c' for a 'k', for
example).
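
As a rough illustration of that kind of fix, here is a minimal Python sketch of ordered substitution rules. It is not cc syntax. The single rule, the names RULES and apply_rules, and the sample word are assumptions made up for the example. A real rule set would come from looking at the errors that actually occur in the data.

```python
import re

# Hypothetical rule list: (compiled pattern, replacement), applied in order.
# The one rule here rewrites 'c' as 'k' only before a, o, or u.
RULES = [
    (re.compile(r"c(?=[aou])"), "k"),
]


def apply_rules(text, rules=RULES):
    """Apply each substitution rule in turn and return the changed text."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    for word in ("cabayo", "ciudad"):
        fixed = apply_rules(word)
        marker = "changed" if fixed != word else "unchanged"
        print(f"{word} -> {fixed} ({marker})")
```

Printing the before and after forms rather than overwriting the data makes it easy to review each change, since a blanket rule can also alter words that were spelled correctly.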
Also, there are other programs that do more or less the same thing that
cc does. These come largely from the Unix/Linux world, but are also
available on DOS and in the Windows command prompt. These are programs
like sed and awk (and more recently, Unicode-compatible versions of such
programs, often written in Perl). Depending on where you are (at a
university, for example), you may be able to find people who can help
you with these programs more easily than with cc.
--
Mike Maxwell
maxwell at ldc.upenn.edu