[Lexicog] stereotypical beliefs and lexicography
Patrick Hanks
hanks at BBAW.DE
Tue Feb 22 17:28:37 UTC 2005
Thapelo Otlogetswe said:
>What are good corpora for lexicography?
Fred Jelinek (or was it Bob Mercer? -- one of those guys at IBM)
used to say, "More data is better data."
I think we're still at the stage where we need more data,
even in English, for which big corpora, both "balanced" and
"unbalanced", exist. For example, we still need many more
texts for historical corpora.
> Hanks takes a position that is common amongst corpora-dependent
> lexicographers - if it's very rare or doesn't exist in broad-based
> corpora like the BNC one would be inclined not to include it as
> a dictionary entry
Actually I take a sort of Popperian variant of this position, viz.:
> if it's very commonly used (in a corpus, or in conversation, or ...
> anywhere), one may be reluctantly forced to include it
> as a dictionary entry. Otherwise, I'd prefer to leave it out.
Patrick