[Corpora-List] celex plus

j_kurjian at hotmail.com
Mon Jul 3 00:47:27 UTC 2006


Hi all,
I was wondering if anyone has a revised CELEX list, in particular a revised
list of the CELEX words split by morpheme.  I was planning to use CELEX as a
gold standard to test my morphological analyzer.  However, when I extracted
the CELEX words split by morpheme, I found many cases where the segmentation
seems too coarse for my purpose, e.g.
wrongheadedness --> wrongheaded-ness
vs. what I'd like: wrong+head+ed+ness
wistful --> wistful
vs. wist+ful
whitening --> whitening
vs. white+n+ing or whit+en+ing
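
To make the mismatch concrete, here is a minimal sketch of how a segmentation
might be scored against a gold standard by comparing morpheme boundary
positions. It is only an illustration: the function names are hypothetical,
"+" is assumed as the separator on both sides (the "-" in the extracted forms
above would need normalizing first), and it is not taken from CELEX or from
any particular analyzer.

```python
def boundaries(segmentation, sep="+"):
    """Return the set of character offsets where a morpheme boundary falls.

    Offsets are counted in the unsegmented word, so "wist+ful" -> {4}.
    """
    positions = set()
    offset = 0
    morphs = segmentation.split(sep)
    # Every morph except the last one is followed by a boundary.
    for morph in morphs[:-1]:
        offset += len(morph)
        positions.add(offset)
    return positions


def boundary_scores(gold, predicted, sep="+"):
    """Return (precision, recall) of predicted boundaries against gold ones.

    Assumption: an empty boundary set scores 1.0, since nothing was missed
    (recall) or nothing was wrongly proposed (precision).
    """
    gold_b = boundaries(gold, sep)
    pred_b = boundaries(predicted, sep)
    hits = len(gold_b & pred_b)
    precision = hits / len(pred_b) if pred_b else 1.0
    recall = hits / len(gold_b) if gold_b else 1.0
    return precision, recall


if __name__ == "__main__":
    # Coarse gold segmentation vs. the finer analysis I would like.
    print(boundary_scores("wrongheaded+ness", "wrong+head+ed+ness"))
    print(boundary_scores("wistful", "wist+ful"))
```

With a coarse gold standard, the finer analysis keeps full recall but is
penalized on precision for every extra boundary, which is why an
under-segmented list is a problem for this kind of test.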

Thanks!
Jerry


