[Corpora-List] complete list of closed-class words in English

Eric Atwell csc6ea at leeds.ac.uk
Tue Nov 22 23:13:26 UTC 2011


It depends which grammatical categories/tags you consider "closed
class". For example, "adverb" is presumably an open class, but
some subclasses of adverb contain only a few words and are hence "closed".
The AMALGAM website lists the PoS-tag sets used in 8 English 
corpus tagging schemes (Brown, ICE, LLC, LOB, Parts, PoW, SEC, Penn)
http://www.comp.leeds.ac.uk/ccalas/tagsets/tagmenu.html
and lists words with each tag - so you can choose the tags you 
want as "closed" and select the words which have these tags.
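
As a rough illustration of that selection step, here is a minimal Python
sketch. It assumes you already have (word, tag) pairs from a tagged corpus
or from the AMALGAM word lists; the sample data, the choice of Penn-style
tags, and the names TAGGED_SAMPLE, CLOSED_TAGS and closed_class_words are
all hypothetical, made up for this example rather than taken from the
AMALGAM site.

```python
from collections import defaultdict

# Hypothetical hand-tagged sample using Penn Treebank-style tags.
# In practice these pairs would come from a tagged corpus or word list.
TAGGED_SAMPLE = [
    ("the", "DT"), ("cat", "NN"), ("sat", "VBD"), ("on", "IN"),
    ("a", "DT"), ("mat", "NN"), ("and", "CC"), ("it", "PRP"),
    ("could", "MD"), ("not", "RB"), ("move", "VB"), ("The", "DT"),
]

# Assumption: which tags count as "closed" is your own decision.
# RB (adverb) is left out here, so "not" is not selected.
CLOSED_TAGS = {"DT", "IN", "CC", "PRP", "MD"}


def closed_class_words(tagged_tokens, closed_tags):
    """Group lowercased word forms by tag, keeping only the chosen tags."""
    words_by_tag = defaultdict(set)
    for word, tag in tagged_tokens:
        if tag in closed_tags:
            words_by_tag[tag].add(word.lower())
    # Sort each word set so the output is stable and easy to read.
    return {tag: sorted(words) for tag, words in words_by_tag.items()}


if __name__ == "__main__":
    result = closed_class_words(TAGGED_SAMPLE, CLOSED_TAGS)
    for tag in sorted(result):
        print(tag, " ".join(result[tag]))
```

Changing CLOSED_TAGS is the whole point: add or drop tags (for example a
small adverb subclass, where the tagset distinguishes one) and the word
list changes accordingly.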

But I disagree with your assumption that Wikipedia is not
"authentic" - Wikipedia has sophisticated mechanisms for 
fostering and monitoring supervised collaboration, producing 
a resource which is arguably more authoritative and unbiased 
than a single-authored source; e.g. see IBM research paper:

Viegas, F., Wattenberg, M., Kriss, J., and van Ham, F. 2007.
Talk Before You Type: Coordination in Wikipedia. In
Proceedings of HICSS.
http://wiki.nus.sg/download/attachments/57742900/Proceedings+of+HICSS+2007+Viegas.pdf?version=1&modificationDate=1263642365168


Eric Atwell, Leeds University


On Tue, 22 Nov 2011, Siddhartha Jonnalagadda wrote:

> Does anyone have such a reference readily available? I want a source that is more authentic than Wikipedia.
> 
> Sincerely,
> Siddhartha Jonnalagadda, Ph.D.
> sjonnalagadda.wordpress.com

-- 
Eric Atwell, Senior Lecturer, Language Processing research group,
  I-AIBS Institute for Artificial Intelligence and Biological Systems
  School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
  Leeds LS2 9JT, England.        TEL: 0113-3435430  FAX: 0113-3435468
  WWW: http://www.comp.leeds.ac.uk/eric
       http://www.comp.leeds.ac.uk/nlp



