[Lexicog] corpus + cognitive linguistics. WAS: Interesting lexical discoveries
Peter Kirk
peterkirk at QAYA.ORG
Tue Feb 3 17:33:13 UTC 2004
On 03/02/2004 06:56, Patrick Hanks wrote:
> Murray:
>
> > Would another hypothesis be that you are working with a skewed corpus?
> > Journalism has its own peculiar style and genre, where the use of
> > abstracts in general (and, especially, of abstracts as subjects of
> > verbs) would be much more common than in everyday speech or in many
> > other written genres.
>
> Good point. BNC is _supposedly_ "balanced and representative" (and
> certainly it contains a wide variety of text types, not just
> journalism.) However, within BNC, the "[[Abstract Entity]] total
> {QUANT [[Numerical Value]]}" sense has a skewed distribution in favor
> of journalism.
>
> I expect -- but have not checked -- that large spoken corpora may
> favor the insurance sense of total/V.
>
This is surely the point. There are different registers of language,
including journalistic, official/administrative and colloquial. The
average person can understand several registers but is most used to
producing one register, the colloquial. And when asked to produce a
sample sentence, they tend to prefer one from the register they are most
used to producing. The colloquial register is where the motor vehicle
usage of "total" comes from, and so most people produce sample sentences
from it. The accountancy usage comes from the official/administrative
register, and a number of people are used to producing that, and produce
sample sentences from it perhaps because they feel they ought to avoid
colloquialisms. But rather few people are used to producing the
journalistic register (and even those people probably produce more in
the colloquial register - journalists don't talk like newspapers to
their families and friends!), and so few people produce sample sentences
from it.

Try a corpus of soap opera scripts. I guess you will get a very
different distribution there. Even such scripts are not fully
representative, though, at least in that they avoid certain words and
usages which are very common, but not acceptable to all, in genuine
spoken language.
--
Peter Kirk
peter at qaya.org (personal)
peterkirk at qaya.org (work)
http://www.qaya.org/