[Corpora-List] Common connectors
Xiao, Zhonghua
z.xiao at lancaster.ac.uk
Sat Apr 23 14:54:22 UTC 2005
Hi Wallace,
I don't think there is an established statistical norm for what should count as "common". One option is to take into account the two factors underlying Mike Scott's idea of "key keyword": frequency and dispersion. If an item is frequent and also occurs in a large number of the genres and/or texts in your corpus, it can be considered "common". The cut-off points for frequency and coverage will, of course, depend on how many connectors you want to include in your study.
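To make the frequency-plus-dispersion idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the corpus is taken to be already tokenised into one token list per text, the function name `common_connectors` and both thresholds (`min_per_million`, `min_coverage`) are hypothetical, and the toy tokens are pinyin placeholders rather than real data. Coverage here is simply the share of texts containing the item, not WordSmith's own procedure.

```python
from collections import Counter


def common_connectors(texts, connectors, min_per_million=50.0, min_coverage=0.5):
    """Return connectors that are both frequent and widely dispersed.

    texts: list of token lists, one per text (assumed pre-tokenised).
    connectors: set of connector strings to look for.
    The two thresholds are arbitrary placeholders, not established norms.
    """
    total_tokens = sum(len(tokens) for tokens in texts)
    freq = Counter()       # total occurrences of each connector
    doc_count = Counter()  # number of texts in which each connector occurs

    for tokens in texts:
        hits = Counter(tok for tok in tokens if tok in connectors)
        freq.update(hits)
        doc_count.update(hits.keys())  # count each text at most once

    selected = {}
    for conn in connectors:
        per_million = freq[conn] / total_tokens * 1_000_000 if total_tokens else 0.0
        coverage = doc_count[conn] / len(texts) if texts else 0.0
        if per_million >= min_per_million and coverage >= min_coverage:
            selected[conn] = (per_million, coverage)
    return selected


# Toy data: four tiny "texts" and three candidate connectors (hypothetical).
toy_texts = [
    ["danshi", "wo", "danshi"],
    ["suoyi", "ta"],
    ["danshi", "ta"],
    ["ni", "hao"],
]
toy_connectors = {"danshi", "suoyi", "raner"}
result = common_connectors(toy_texts, toy_connectors,
                           min_per_million=0.0, min_coverage=0.5)
for conn, (pm, cov) in sorted(result.items()):
    print(f"{conn}: {pm:.0f} per million, in {cov:.0%} of texts")
```

Raising or lowering the two thresholds changes how many connectors survive, which is the sense in which the cut-off points depend on how many items the study should include. The same counting could be done per genre instead of per text by passing one token list per genre.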
Best,
Richard
________________________________
From: owner-corpora at lists.uib.no on behalf of Wallace Chen
Sent: Fri 22/04/2005 23:09
To: CORPORA at UIB.NO
Subject: [Corpora-List] Common connectors
Dear Corpora colleagues,
I am currently doing research on Chinese connectors, which broadly include conjunctions and sentence adverbs and comprise around 270 types. These are derived from a five-million-word corpus of contemporary Chinese. My question is how to determine which ones are "common". Are there statistical criteria (e.g. a cut-off point) for identifying "common connectors" in such a list? Should I look at their frequencies or their rankings? I would appreciate it if anyone could help me answer these questions or direct me to relevant resources. Thanks in advance for all your help!
Wallace Chen