[Corpora-List] Merging info from the BNC and WordNet

Mark Davies Mark_Davies at byu.edu
Tue Nov 4 13:41:30 UTC 2003


Is anyone aware of projects that have created some type of database that merges the semantic information from WordNet with the frequency and distributional information from the BNC?  
 
For example, a user could query the database for all lemmas that occur with a particular frequency in certain registers of English or in certain collocations (info from the BNC), and that are also hyponyms of a particular word or member meronyms of a given word (info from WordNet).
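
To make the kind of query concrete, here is a minimal sketch in Python using the standard-library sqlite3 module in place of SQL Server. Everything in it is an assumption for illustration only: the table names (bnc_freq, wn_relation), the column names, the relation label "hyponym_of", and the toy rows, whose frequencies are invented placeholders and not real BNC counts. A real merged database would also need to handle word senses and transitive hyponymy, which this sketch ignores.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical BNC-derived table: one row per lemma and register.
        CREATE TABLE bnc_freq (lemma TEXT, register TEXT, per_million REAL);
        -- Hypothetical WordNet-derived table: one row per semantic relation.
        CREATE TABLE wn_relation (lemma TEXT, relation TEXT, target TEXT);
    """)
    # Invented placeholder frequencies, NOT actual BNC figures.
    conn.executemany("INSERT INTO bnc_freq VALUES (?, ?, ?)", [
        ("dog", "spoken", 120.0),
        ("dog", "academic", 15.0),
        ("poodle", "spoken", 2.0),
        ("cat", "spoken", 90.0),
        ("table", "spoken", 200.0),
    ])
    # Relations stored already flattened, purely for the example.
    conn.executemany("INSERT INTO wn_relation VALUES (?, ?, ?)", [
        ("dog", "hyponym_of", "animal"),
        ("poodle", "hyponym_of", "animal"),
        ("cat", "hyponym_of", "animal"),
        ("table", "hyponym_of", "furniture"),
    ])
    return conn


def lemmas_by_freq_and_relation(conn, register, min_pm, relation, target):
    """Return (lemma, per_million) rows meeting both the frequency
    condition and the semantic-relation condition, most frequent first."""
    return conn.execute("""
        SELECT f.lemma, f.per_million
        FROM bnc_freq AS f
        JOIN wn_relation AS r ON r.lemma = f.lemma
        WHERE f.register = ? AND f.per_million >= ?
          AND r.relation = ? AND r.target = ?
        ORDER BY f.per_million DESC
    """, (register, min_pm, relation, target)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # Hyponyms of "animal" with at least 50 per million in the spoken register.
    print(lemmas_by_freq_and_relation(conn, "spoken", 50.0, "hyponym_of", "animal"))
```

The join on lemma is the whole idea: the frequency and register conditions filter the corpus-derived table, and the relation conditions filter the WordNet-derived table. Note that the result contains only lemmas and counts, no corpus text.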
 
I was considering working on such a project, since I already have both the BNC and WordNet in relational database form (SQL Server), but I didn't want to proceed much further if I'd just be reinventing the wheel. (BTW, the output would not contain actual sentences and paragraphs from the BNC [licensing issues], but would probably just be tables containing info on lemmas, frequency, distribution, and semantic relationships.)
 
I'd be happy to summarize the responses, if there is sufficient interest.  Thanks in advance.
 
Mark Davies
 
=================================================
Mark Davies
Assoc. Prof., Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906
http://davies-linguistics.byu.edu

** Corpus design and use // Web-database scripting **
** Historical linguistics // Functional-typological grammar **
** Spanish and Portuguese historical and dialectal syntax **
================================================= 

