[Corpora-List] Reducing n-gram output

Dahlmann Irina aexid at nottingham.ac.uk
Mon Oct 27 20:07:59 UTC 2008


Dear all,

I was wondering whether anybody is aware of ideas and/or automated
processes for reducing n-gram output by addressing the common problem that
shorter n-grams can be fragments of larger structures (e.g. the 5-gram
'at the end of the' as part of the 6-gram 'at the end of the day').

I am only aware of Paul Rayson's work on c-grams (collapsed-grams).
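
To make the problem concrete, here is a minimal sketch of one naive way to
filter such fragments. It is not Paul Rayson's c-gram method, only an
illustration under a simple assumption: an n-gram counts as a fragment if it
occurs exactly as often as some (n+1)-gram that contains it. The function
names count_ngrams and drop_fragments are hypothetical, and the token list is
invented toy data.

```python
from collections import Counter


def count_ngrams(tokens, min_n, max_n):
    """Count all n-grams of length min_n..max_n in a list of tokens."""
    counts = Counter()
    for n in range(min_n, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def drop_fragments(counts):
    """Drop every n-gram whose frequency equals that of an (n+1)-gram
    containing it (assumption: such an n-gram never occurs on its own)."""
    fragments = set()
    for gram, freq in counts.items():
        if len(gram) < 2:
            continue
        # The two n-grams embedded in an (n+1)-gram: without the last
        # token and without the first token.
        for sub in (gram[:-1], gram[1:]):
            if counts.get(sub) == freq:
                fragments.add(sub)
    return {g: f for g, f in counts.items() if g not in fragments}


# Toy data, invented for illustration only.
tokens = "at the end of the day we left at the end of the day".split()
counts = count_ngrams(tokens, 5, 6)
reduced = drop_fragments(counts)
```

In this toy example the 5-gram 'at the end of the' is removed because it never
occurs outside the 6-gram 'at the end of the day'. If the text also contained
'at the end of the week', the 5-gram would be more frequent than either 6-gram
and would be kept.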

Many thanks,

Irina Dahlmann
 
PhD student
School of English Studies
University of Nottingham
aexid at nottingham.ac.uk



_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


