16.968, Disc: New: Re: 16.961: Grammar Checker

LINGUIST List linguist at linguistlist.org
Thu Mar 31 04:36:29 UTC 2005


LINGUIST List: Vol-16-968. Wed Mar 30 2005. ISSN: 1068 - 4875.

Subject: 16.968, Disc: New: Re: 16.961: Grammar Checker

Moderators: Anthony Aristar, Wayne State U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org)
        Sheila Collberg, U of Arizona
        Terry Langendoen, U of Arizona

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Naomi Fox <fox at linguistlist.org>
================================================================

To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.html.


===========================Directory==============================

1)
Date: 30-Mar-2005
From: Mike Maxwell < maxwell at ldc.upenn.edu >
Subject: A Word to the unwise -- program's grammar checker

	
-------------------------Message 1 ----------------------------------
Date: Wed, 30 Mar 2005 23:26:45
From: Mike Maxwell < maxwell at ldc.upenn.edu >
Subject: A Word to the unwise -- program's grammar checker



In LL 16.961 (http://linguistlist.org/issues/16/16-961.html), John Lawler
discusses a recent Seattle Post-Intelligencer article concerning the
reportedly bad performance of Microsoft's grammar checker.  (I'd suggest
that before replying on this topic, you read the original article at
http://seattlepi.nwsource.com/business/217802_grammar28.asp).

As most of us know (but the business prof who started the crusade may
not know), grammar checking is a tradeoff between precision (flagging only
ungrammatical sentences, i.e. letting all grammatical ones through) and
recall (flagging all ungrammatical sentences).  And of course in the case of
grammar judgments, there are issues of inter-annotator agreement which make
it impossible to even agree on what is grammatical or not.
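
To make the tradeoff concrete, here is a minimal sketch of how the two
scores could be computed for a checker, under the usual convention that a
flag counts as a positive prediction: precision is the share of flagged
sentences that really are ungrammatical, and recall is the share of
ungrammatical sentences that got flagged.  The function name and the toy
judgments are illustrative assumptions, not data from any real checker.

```python
def precision_recall(gold, flagged):
    """Score a checker against human judgments.

    gold[i]    -- True if a human judged sentence i ungrammatical
    flagged[i] -- True if the checker flagged sentence i
    """
    pairs = list(zip(gold, flagged))
    tp = sum(1 for g, f in pairs if g and f)        # bad sentence, flagged
    fp = sum(1 for g, f in pairs if not g and f)    # good sentence, flagged
    fn = sum(1 for g, f in pairs if g and not f)    # bad sentence, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (made up for illustration): five sentences.
gold = [True, True, False, False, True]
flagged = [True, False, True, False, True]
precision, recall = precision_recall(gold, flagged)
```

A checker that flags almost nothing can score high precision with poor
recall, and one that flags nearly everything does the reverse, which is why
both numbers need to be reported together.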

Having said this, it occurs to me that it would be great fun--and might
even advance the state of the art--to have a web site where sentences both
grammatical and un- could be posted, and the output of grammar checkers
displayed in an interlinear format.  [BTW, take that last sentence and
parse it...]  I suppose the way it would work is that you could download
the sentences, pass them through your favorite checker or parser, and send
the results back in some agreed-on format (perhaps XML) to the owners of
the site, who could post the results.
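
As a rough sketch of what an agreed-on XML format might look like, the
following builds and reads back a results file using Python's standard
xml.etree.ElementTree module.  The element and attribute names ("results",
"sentence", "checker", "id", "flagged") are purely hypothetical; no such
format has actually been agreed on.

```python
import xml.etree.ElementTree as ET


def build_results(checker_name, judgments):
    """Serialize (sentence_id, flagged, note) triples to an XML string.

    The schema here is a hypothetical illustration only.
    """
    root = ET.Element("results", checker=checker_name)
    for sent_id, is_flagged, note in judgments:
        sent = ET.SubElement(
            root, "sentence",
            id=str(sent_id),
            flagged="yes" if is_flagged else "no",
        )
        if note:
            sent.text = note  # free-text diagnosis from the checker
    return ET.tostring(root, encoding="unicode")


def parse_results(xml_text):
    """Read the XML back into a dict mapping sentence id to a flagged bool."""
    root = ET.fromstring(xml_text)
    return {s.get("id"): s.get("flagged") == "yes"
            for s in root.iter("sentence")}


xml_text = build_results(
    "demo-checker",
    [(1, True, "subject-verb agreement"), (2, False, "")],
)
parsed = parse_results(xml_text)
```

Keying everything on a sentence id would let the site line up the output
of several checkers under the same sentence for the interlinear display.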

Hopefully no one would be tempted to cheat by adjusting their parser's
results.  If need be, there could be safeguards against that.

I think such a site should use individual sentences, not whole paragraphs
or texts, because there is probably no grammar checker or parser around
that could flag errors at the paragraph level.  Of course, it would be nice
to be proved wrong!

There should also be an annotation line giving a human-produced indication
of errors, possibly with levels of acceptability and/or inter-annotator
agreement indicated.  This would serve as the standard against which the
machine-produced results would be judged.
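
One common way to put a number on inter-annotator agreement is Cohen's
kappa, which corrects the raw agreement rate between two annotators for the
agreement expected by chance.  The sketch below is only one possible choice
of measure, and the labels and judgments in it are invented for
illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same sentences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of sentences given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Invented judgments for four sentences.
annotator_1 = ["ok", "bad", "bad", "ok"]
annotator_2 = ["ok", "bad", "ok", "ok"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

A value near 1 means the annotators agree far more often than chance would
predict; a value near 0 means the annotation line is not much of a standard
to judge a machine against.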

As for where one would get the sentences to be tested, I'm sure any teacher
could provide lots of bad examples.  And good sentences could be lifted
from lots of places.


Linguistic Field(s): Computational Linguistics
                     General Linguistics





-----------------------------------------------------------
LINGUIST List: Vol-16-968	

	


