[Corpora-List] Uses of N-grams?
Kilian Evang
maschinenraum at texttheater.net
Thu Jul 18 08:35:26 UTC 2013
Dear Cedric,
n-grams are also widely used in computational linguistics to process
natural language automatically, based on statistical information.
A simple example is a language model: given the beginning of a sentence,
which word is likely to appear next? Counting n-grams in a large corpus
can help predict this. To some extent, this can be used for fluency
ranking, i.e. automatically assessing how "natural" a text sounds, for
example one produced by a language learner or by a machine translation system.
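To make the counting idea concrete, here is a minimal sketch of a bigram model in Python. The three-sentence corpus is made up for illustration, and the names predict_next and fluency_score are my own, not from any library. The "fluency" score here is just the fraction of bigrams seen before, a crude stand-in for the smoothed probabilities a real language model would use.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration; a real model needs far more text.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]

# bigram_counts[prev][word] = how often `word` directly followed `prev`
bigram_counts = defaultdict(Counter)
for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    for prev, word in zip(tokens, tokens[1:]):
        bigram_counts[prev][word] += 1


def predict_next(prev):
    """Return the word most often seen after `prev`, or None if `prev` is unseen."""
    followers = bigram_counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def fluency_score(sentence):
    """Fraction of the sentence's bigrams that occur in the corpus (0.0 to 1.0)."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    seen = sum(1 for prev, word in pairs
               if bigram_counts.get(prev, Counter())[word] > 0)
    return seen / len(pairs)


print(predict_next("the"))
print(fluency_score("the cat sat on the rug"))
print(fluency_score("rug the on sat cat the"))
```

With this toy corpus, the well-ordered sentence scores higher than the scrambled one, even though both contain exactly the same words.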
Another example is part-of-speech tagging: the word "cap" can be either
a verb or a noun, but the context should disambiguate it. The n-grams in
which the word appears may provide such context.
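As a rough sketch of how such context could be used, the Python snippet below picks a tag for "cap" from the word that precedes it. The hand-tagged sentences, the tag names and the function guess_tag are all invented for illustration; real taggers use much larger annotated corpora and more elaborate models.

```python
from collections import Counter, defaultdict

# Tiny hand-tagged sample, invented for illustration (not from a real corpus).
tagged = [
    [("he", "PRON"), ("wore", "VERB"), ("a", "DET"), ("cap", "NOUN")],
    [("the", "DET"), ("cap", "NOUN"), ("fell", "VERB")],
    [("they", "PRON"), ("will", "AUX"), ("cap", "VERB"), ("prices", "NOUN")],
    [("we", "PRON"), ("must", "AUX"), ("cap", "VERB"), ("spending", "NOUN")],
]

# context_counts[(prev, word)][tag] = how often `word` carried `tag` after `prev`
context_counts = defaultdict(Counter)
for sentence in tagged:
    prev = "<s>"
    for word, tag in sentence:
        context_counts[(prev, word)][tag] += 1
        prev = word


def guess_tag(prev, word):
    """Return the most frequent tag of `word` after `prev`, or None if unseen."""
    counts = context_counts.get((prev, word))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


print(guess_tag("a", "cap"))
print(guess_tag("will", "cap"))
```

The same word form gets a different tag depending on the bigram it appears in; a context never seen in the sample yields None rather than a guess.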
Best,
Kilian