Secondary entries (was Re: [Lexicog] Query on how to deal with coined words)

Mike Maxwell maxwell at LDC.UPENN.EDU
Tue Apr 10 14:33:42 UTC 2007


Vincent `Bentong` S. Isles wrote:
> I would like to know that "complex procedure". I had spent half of
> yesterday and the whole of today trying to understand the Consistent
> Changes program, and I do very well think of CC when you wrote
> "complex" :)

It might be useful to say what you're hoping to use cc for.  Cebuano 
morphology is very complex, with reduplication and infixing, and if 
you're trying to go from stems or inflected words to roots, I don't 
think I would recommend cc.  What you really need is a morphological parser.

It happens that I wrote a morphological parser for Cebuano several years 
ago.  I won't claim that it does everything right (I basically wrote it 
in a couple of hours, and tweaked it a little after that), but it would 
probably work better than anything one could do with cc.  I did it while 
I was working for 
the Linguistic Data Consortium (LDC), and I'll have to ask them whether 
it's sharable.  It uses the Xerox finite state tools, for which you have 
to pay $40 (the CD comes in a book from U of Chicago press).  If that's 
of interest, let me know and I'll see what the LDC says.

On the other hand, if you're trying to fix spelling errors, a parser 
won't help you fix them (it might help you find them).  A program like 
cc might be usable to fix some classes of errors, provided you have some 
notion of what common errors are (substituting a 'c' for a 'k', for 
example).
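
As a rough illustration of that kind of fix, here is a minimal sketch in 
Python of an ordered change table, in the spirit of what cc or sed would 
do.  The rules, the sample words, and the names RULES and apply_rules are 
all made up for the example; a real table would have to come from the 
errors you actually see in your data, and this one only handles 
lower-case text.

```python
import re

# Hypothetical change table: (pattern, replacement) pairs, applied in order.
# These rules are illustrative assumptions, not a vetted list of errors.
RULES = [
    (re.compile(r"c(?=[aou])"), "k"),   # 'c' before a, o, u  -> 'k'
    (re.compile(r"qu(?=[ei])"), "k"),   # 'qu' before e, i    -> 'k'
]

def apply_rules(text, rules=RULES):
    """Apply each substitution rule to the text, in table order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text

if __name__ == "__main__":
    for word in ["cabayo", "quinse", "cebu"]:
        print(word, "->", apply_rules(word))
```

Because the rules run in order, an earlier rule can feed a later one, 
which is also how a cc change table behaves; rules that are too broad 
will happily "correct" words that were fine, so it is worth checking the 
output against a word list before trusting it.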

Also, there are other programs that do more or less the same thing that 
cc does.  These come largely from the Unix/Linux world, but are 
available on DOS and in the Windows command prompt.  These are programs 
like sed and awk (and more recently, Unicode-compatible versions of such 
programs, often written in Perl).  Depending on where you are (at a 
university, for example), you may be able to find people who can help 
you with these programs more easily than with cc.
-- 
	Mike Maxwell
	maxwell at ldc.upenn.edu


 