[Corpora-List] Google searches as linguistic evidence
maxwell at ldc.upenn.edu
Thu Dec 7 15:05:33 UTC 2006
Quoting Ramesh Krishnamurthy <r.krishnamurthy at aston.ac.uk>:
> I don't know of many websites who use professional proof-readers...
I'm sure most readers of this list have already seen this, but just in case:
Christoph Ringlstetter, Klaus U. Schulz and Stoyan Mihov: Orthographic
Errors in Web Pages - Towards Cleaner Web Corpora. Computational
Linguistics, September 2006, Vol. 32(3), pp. 295-340.
One useful output is a classification of websites into those with
more misspellings and those with fewer.
Mike Maxwell
CASL/ U MD