Corpora: Question about a Brown Corpus tag

Andrew Harley aharley at cambridge.org
Thu Aug 17 16:20:24 UTC 2000


>Some tag
>definitions in Brown were clearly decided by what TAGGIT found computable;
>I *guess* linguistic inconsistencies in tagging some words may be down to
>drawing boundaries on grounds of computational tractability rather than
>purely linguistic reasons (or, to be more fair, when two or more
>conflicting linguistic criteria were available (eg form v function),
>computational tractability was a deciding factor)

This explains how so many taggers can claim 95% or higher success rates!

I also know of taggers that tagged IN as "preposition or conjunction" on the
same grounds.
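
To make the arithmetic behind that concrete, here is a rough sketch of how
merging two hard-to-separate tags into one can raise a measured success rate.
Everything in it is an assumption for illustration: the tag sequences are
invented (not Brown data, not TAGGIT output), and PREP and CONJ are
hypothetical fine-grained labels standing in for the two readings that a
single IN tag would cover.

```python
# Toy illustration only: the tags below are invented, not taken from Brown.
# "PREP" and "CONJ" are hypothetical fine-grained tags; "IN" is the merged tag.

def accuracy(gold, predicted):
    """Fraction of positions where the predicted tag equals the gold tag."""
    matches = sum(g == p for g, p in zip(gold, predicted))
    return matches / len(gold)

def collapse(tags, merged):
    """Map each fine-grained tag to its merged tag; other tags are unchanged."""
    return [merged.get(t, t) for t in tags]

# Hypothetical gold standard and tagger output for ten tokens.
# The tagger confuses PREP and CONJ at two positions (indices 2 and 4).
gold      = ["PREP", "NN", "CONJ", "VB", "PREP", "NN", "CONJ", "NN", "VB", "PREP"]
predicted = ["PREP", "NN", "PREP", "VB", "CONJ", "NN", "CONJ", "NN", "VB", "PREP"]

# Merging the two tags makes the PREP/CONJ confusions invisible to the score.
MERGED = {"PREP": "IN", "CONJ": "IN"}

fine_score = accuracy(gold, predicted)
coarse_score = accuracy(collapse(gold, MERGED), collapse(predicted, MERGED))

print(f"fine-grained tagset: {fine_score:.0%}")
print(f"merged tagset:       {coarse_score:.0%}")
```

The tagger's output is identical in both cases; only the tagset used for
scoring changes, so the two figures are not comparable.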

Different users have different needs. For lexicographers, the difference
between these forms for WDT and IN is important, as are the less commonly
distinguished VB-transitive and VB-intransitive tags.

Andrew Harley
Systems Development Manager
English Language Teaching & Dictionaries
Cambridge University Press
http://www.cambridge.org/elt

Direct line: (01223)325880

Cambridge Dictionaries Online (4 million searches since August 1999):
http://dictionary.cambridge.org

Cambridge International Dictionary of English on CD-ROM:
http://www.cambridge.org/elt/cide


