Corpora: Quest: compound words

Veronique Gendner gendner at
Tue Feb 29 15:37:52 UTC 2000


I am looking for references to work on the definition of compound
words and their identification in context.

I am working on French data and am particularly interested in
grammatical words (like "en face de":P; "en fait":ADV, ...) but also
common nouns.

My goals are to:

 - determine what definition of *word* is pertinent for different
applications (extraction of classes according to syntactic
information, language modeling for speech recognition, ...)

 - define guidelines for corpus annotation (i.e. tests that can be
applied to determine whether a sequence is a compound word or not)

Thank you in advance for any useful pointers.

Veronique GENDNER
TALaNa / Lattice, Paris VII
Limsi, Orsay
