[Corpora-List] Distribution of tokens by POS in BNC or COCA

Mark Davies Mark_Davies at byu.edu
Sun Mar 23 15:07:50 UTC 2014


Paul Rayson mentioned:

http://ucrel.lancs.ac.uk/bncfreq/flists.html

I agree with Paul that this is probably the easiest way to get word lists by general part of speech (nouns, verbs, etc.). But as Khurshid mentioned in the OP, it is sometimes useful to search by a narrower PoS tag (nn2, vvg, jjr, etc.), and I don't believe the BNC lists at the above URL provide that.

This would be possible, however, via the BYU interface, e.g.:

(for COCA):

[vvg*] http://corpus.byu.edu/coca/?c=coca&q=29371680
[nn2*] http://corpus.byu.edu/coca/?c=coca&q=29371693
[jjr*] http://corpus.byu.edu/coca/?c=coca&q=29371711

You can also show the frequency by genre, e.g.:

[jjr*] http://corpus.byu.edu/coca/?c=coca&q=29371965

The same could be done for BYU-BNC, and I would imagine that similar functionality is available for the BNC via BNCweb and SketchEngine as well.
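
For readers who have a locally tagged text rather than access to one of these interfaces, the idea behind a narrow-tag query such as [vvg*] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (count_by_tag, tag_distribution) and the toy SAMPLE data are hypothetical, the input is assumed to be (word, tag) pairs with CLAWS-style tags, and nothing here reflects how the BYU interface, BNCweb, or SketchEngine work internally.

```python
from collections import Counter
from fnmatch import fnmatchcase


def count_by_tag(tagged_tokens, pattern):
    """Count word forms whose POS tag matches a wildcard pattern, e.g. 'vvg*'.

    The pattern is written without the square brackets used in the
    web interface, because fnmatch treats [...] as a character class.
    """
    counts = Counter()
    for word, tag in tagged_tokens:
        if fnmatchcase(tag.lower(), pattern.lower()):
            counts[word.lower()] += 1
    return counts


def tag_distribution(tagged_tokens):
    """Count tokens per POS tag: the overall distribution by tag."""
    return Counter(tag.lower() for _, tag in tagged_tokens)


# Toy data invented for illustration; not taken from the BNC or COCA.
SAMPLE = [
    ("Bigger", "jjr"), ("dogs", "nn2"), ("are", "vbr"), ("chasing", "vvg"),
    ("smaller", "jjr"), ("cats", "nn2"), ("running", "vvg"), ("dogs", "nn2"),
]

if __name__ == "__main__":
    print(count_by_tag(SAMPLE, "nn2*").most_common())
    print(tag_distribution(SAMPLE).most_common())
```

A per-genre breakdown could be obtained along the same lines by running the same counts separately over each genre's tokens.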

MD

============================================
Mark Davies
Professor of Linguistics / Brigham Young University
http://davies-linguistics.byu.edu/

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================

________________________________________
From: corpora-bounces at uib.no [corpora-bounces at uib.no] on behalf of Rayson, Paul [p.rayson at lancaster.ac.uk]
Sent: Sunday, March 23, 2014 4:21 AM
To: Khurshid Ahmad; corpora at uib.no
Subject: Re: [Corpora-List] Distribution of tokens by POS in BNC or COCA

Dear Khurshid,

For the BNC this information is available in the chapter 5 frequency lists, see:

http://ucrel.lancs.ac.uk/bncfreq/
http://ucrel.lancs.ac.uk/bncfreq/flists.html

Regards,
Paul.

Dr. Paul Rayson
Director of UCREL and Senior Lecturer in Computer Science
Faculty of Science and Technology Director of International Teaching Partnerships
School of Computing and Communications, InfoLab21, Lancaster University, Lancaster, LA1 4WA, UK.
Web: http://www.comp.lancs.ac.uk/~paul/
Tel: +44 1524 510357 Fax: +44 1524 510492


-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Khurshid Ahmad
Sent: 22 March 2014 18:15
To: corpora at uib.no
Subject: [Corpora-List] Distribution of tokens by POS in BNC or COCA

Dear All
Is there a table which lists the contents of BNC or COCA by POS - NN,
NNP, JJ, VB and their variations?
Apologies for using the bandwidth for such a simple query.

--
Best wishes

Khurshid Ahmad.
Professor of Computer Science
School of Computer Science and Statistics
Trinity College
Dublin 2
IRELAND

Phone: 00353 1 896 8429 (Labs: 00 353 1 8968435)
Fax 353 1 677 2204
Webpage: www.cs.tcd.ie/khurshid.ahmad

_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora



