[Corpora-List] Reducing n-gram output
svetlana sheremetyeva
linklana at yahoo.com
Tue Oct 28 10:15:26 UTC 2008
Hi, Irina
I have just made a tool for keyword extraction (LanA-Key) which includes collapsing n-grams. It outputs up to 4-grams, but it can be updated to any "n".
The tool can be downloaded for a 3-day free trial from
http://lanaconsult.com
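For illustration only, and not a description of how LanA-Key works internally, here is a minimal Python sketch of one simple way to collapse n-grams. It assumes that a shorter n-gram can be dropped when it occurs as a contiguous part of a longer n-gram with exactly the same frequency, since it then never appears outside that longer structure. The function names (count_ngrams, is_fragment, collapse_ngrams) and the min_freq cut-off are hypothetical, chosen just for this example.

```python
from collections import Counter


def count_ngrams(tokens, max_n):
    """Count every contiguous n-gram of length 1..max_n in a token list."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def is_fragment(short, long):
    """True if `short` is a strictly shorter contiguous part of `long`."""
    n = len(short)
    if n >= len(long):
        return False
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def collapse_ngrams(counts, min_freq=2):
    """Drop n-grams that only ever occur inside a longer, equally frequent n-gram.

    Assumption for this sketch: equal frequency means the shorter n-gram
    has no independent occurrences, so it is treated as a mere fragment.
    """
    kept = {gram: freq for gram, freq in counts.items() if freq >= min_freq}
    collapsed = {}
    for gram, freq in kept.items():
        absorbed = any(
            other_freq == freq and is_fragment(gram, other)
            for other, other_freq in kept.items()
        )
        if not absorbed:
            collapsed[gram] = freq
    return collapsed


if __name__ == "__main__":
    text = "at the end of the day we met at the end of the day"
    result = collapse_ngrams(count_ngrams(text.split(), max_n=6))
    for gram, freq in sorted(result.items(), key=lambda item: -len(item[0])):
        print(" ".join(gram), freq)
```

With the example from your message, the 5-gram 'at the end of the' is absorbed by the 6-gram 'at the end of the day' because both occur equally often, while a word such as 'the' survives because it is more frequent than any longer n-gram containing it. The pairwise comparison is quadratic in the number of n-grams, so a real corpus would need an index rather than this brute-force loop.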
Regards,
Svetlana Sheremetyeva
--- On Mon, 10/27/08, Dahlmann Irina <aexid at nottingham.ac.uk> wrote:
From: Dahlmann Irina <aexid at nottingham.ac.uk>
Subject: [Corpora-List] Reducing n-gram output
To: CORPORA at uib.no
Date: Monday, October 27, 2008, 1:07 PM
Dear all,
I was wondering whether anybody is aware of ideas and/or automated
processes to reduce n-gram output by solving the common problem that
shorter n-grams can be fragments of larger structures (e.g. the 5-gram
'at the end of the' as part of the 6-gram 'at the end of the day').
I am only aware of Paul Rayson's work on c-grams (collapsed-grams).
Many thanks,
Irina Dahlmann
PhD student
School of English Studies
University of Nottingham
aexid at nottingham.ac.uk
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora