9.276, Disc: NLP and Syntax



LINGUIST List:  Vol-9-276. Tue Feb 24 1998. ISSN: 1068-4875.

Subject: 9.276, Disc: NLP and Syntax

Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at linguistlist.org>
            Helen Dry: Eastern Michigan U. <hdry at linguistlist.org>

Review Editor:     Andrew Carnie <carnie at linguistlist.org>

Editors:  	    Brett Churchill <brett at linguistlist.org>
		    Martin Jacobsen <marty at linguistlist.org>
		    Elaine Halleck <elaine at linguistlist.org>
                    Anita Huang <anita at linguistlist.org>
                    Ljuba Veselinova <ljuba at linguistlist.org>
		    Julie Wilson <julie at linguistlist.org>

Software development: John H. Remmers <remmers at emunix.emich.edu>
                      Zhiping Zheng <zzheng at online.emich.edu>

Home Page:  http://linguistlist.org/


Editor for this issue: Martin Jacobsen <marty at linguistlist.org>

=================================Directory=================================

1)
Date:  Sun, 22 Feb 1998 15:59:56 +0900
From:  John Phillips <john at po.yb.cc.yamaguchi-u.ac.jp>
Subject:  Re: 9.255, Disc: NLP and syntax

2)
Date:  Sun, 22 Feb 1998 09:56:12 -1000
From:  Anne Sing <annes at htdc.org>
Subject:  Re: 9.255, Disc: NLP and syntax

3)
Date:  Mon, 23 Feb 1998 12:51:34 +0000 (GMT)
From:  jock at ccl.umist.ac.uk
Subject:  re: NLP and the best theory of syntax

-------------------------------- Message 1 -------------------------------

Date:  Sun, 22 Feb 1998 15:59:56 +0900
From:  John Phillips <john at po.yb.cc.yamaguchi-u.ac.jp>
Subject:  Re: 9.255, Disc: NLP and syntax

> ... the best theory of syntax must necessarily be the one that
> demonstrates itself to be most completely implemented in a
> programming language.  ... the best independent and objective
> measure of a theory of syntax' overall effectiveness is its ability
> to generate, in a computer program, standard grammatical structures
> ...  ... 2) that any theory that can not be fully implemented in a
> programming language as described in the standards outlined above,
> is flawed in some way; and 3) that the best independent and
> objective measure of a theories scope, efficiency, and effectiveness
> is the degree to which it can be implemented in a programming
> language.  (Of course, the basis for judgement will be the Penn
> Treebank II guidelines and the standards described above).

I'd like to make some points about Phil Bralich's recent message,
summarised above in his own words.

1. He seems to take it that the goal of NLP, and syntactic theory in
general, is to produce syntactic analyses, and that the only valid
type of analysis is the same as, or equivalent to, that of the Penn
Treebank. This is itself a "theory of syntax", and a controversial
one.  For a human being using language, the only function of syntax is
to fit word meanings together to make sentence meanings: syntax has no
value on its own, it is just a key to the semantics. This is true for
most NLP applications too.  Grammatical formalisms which do not
analyse sentences in terms of trees or labelled bracketings (e.g.
some types of categorial, dependency, and systemic grammar) are
capable of producing accurate semantic interpretations and are
computationally implementable and efficient.
    In this sense, Phil Bralich's characterisation of the ideal parser
as one which produces Penn Treebank syntactic structures quickly and
accurately covers only part of the field, and not necessarily the part
which will turn out to describe language most successfully in the long
run.
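
    Just to make the contrast concrete, here is a toy sketch (Python;
    the relations, words, and function names are mine, purely for
    illustration, not from any system under discussion) of how a
    dependency-style analysis with no tree or labelled bracketing at
    all can feed a simple predicate-argument reading:

    # Toy sketch only: a dependency-style analysis given as
    # head-dependent-relation triples, mapped straight to a
    # predicate-argument reading, with no phrase-structure tree
    # or labelled bracketing built in between.

    dependencies = [
        ("chased", "dog", "subj"),
        ("chased", "cat", "obj"),
        ("dog", "the", "det"),
        ("cat", "a", "det"),
    ]

    def predicate_argument(deps):
        """Collect each head's core arguments directly from the triples."""
        frames = {}
        for head, dependent, relation in deps:
            if relation in ("subj", "obj"):
                frames.setdefault(head, {})[relation] = dependent
        return frames

    print(predicate_argument(dependencies))
    # {'chased': {'subj': 'dog', 'obj': 'cat'}}

    The point is only that the route from analysis to semantics need
    not pass through a Penn Treebank-style structure.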

2. Suitability for implementation in a programming language is not the
only measure of the success of a grammatical formalism. Language is
(presumably) designed to work with human brains. We don't know how
brains work, but they are obviously not the same as serial computers.
The most natural description of the grammar of a language, then, is
not necessarily the one that works best on a serial computer.  Even
implementability itself is not always relevant: e.g. a syntax which
has a 30-word sentence producing 30 million analyses is
unimplementable with current techniques; but in a system which
analysed incrementally, one or a few words at a time, excluded 99.99%
of analyses on semantic grounds before their structure was built, and
represented the remaining ambiguities as vagueness in a single
analysis (all of which the brain perhaps does), it would be fine.
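
    As a toy illustration of that incremental strategy (again Python;
    the two-way ambiguity and the "semantic" filter are invented, and
    no real parser works from this sketch), something like the
    following keeps the number of live analyses small:

    # Extend analyses one word at a time and prune implausible ones
    # immediately, so the set of live analyses never approaches the
    # full combinatorial space.

    def extend(analysis, word):
        """Every word is (artificially) two ways ambiguous here."""
        return [analysis + [(word, reading)]
                for reading in ("literal", "idiomatic")]

    def plausible(analysis):
        """Stand-in for a semantic filter: allow at most one idiom."""
        return sum(1 for _, r in analysis if r == "idiomatic") <= 1

    def parse(words):
        live = [[]]                      # one empty partial analysis
        for word in words:
            grown = [a for old in live for a in extend(old, word)]
            live = [a for a in grown if plausible(a)]
        return live

    words = "the old man the boats".split()
    print(len(parse(words)))             # 6 analyses survive, not 2**5 = 32

    Pruning before structure-building keeps the workload bounded,
    which is what the paragraph above suggests the brain may be doing.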

3. One of the contributors to this list may well have the ideal
grammatical formalism half worked out on their desk, but lack the time
and programming skill (or the money to employ people) to compile the
50,000 word dictionary and full-scale grammar needed to pass Bralich's
test. As has often been said about machine translation, any half-way
decent syntactic formalism will serve as the base for a good system,
given the time and money for development. Bralich's argument here
seems to me only partly relevant to linguistics. He's talking largely
about programming skill and availability of resources, not about the
adequacy of theories of linguistic description.

4. I'm sure many others too will have taken exception to the
following:
> computational linguistics departments who do not mention these
> tools or use tools of this calibre are remiss in their duty
I requested a copy of the Bracket Doctor to use in my classes, but it
seems a Unix version does not exist - and like most University CL
departments, we use Unix machines here.

John Phillips
Dept. of Linguistics
Yamaguchi University


-------------------------------- Message 2 -------------------------------

Date:  Sun, 22 Feb 1998 09:56:12 -1000
From:  Anne Sing <annes at htdc.org>
Subject:  Re: 9.255, Disc: NLP and syntax

At 03:59 PM 2/22/98 +0900, John Phillips wrote:

>I'd like to make some points about Phil Bralich's recent message,
>summarised above in his own words.  1. He seems to take it that the
>goal of NLP, and syntactic theory in general, is to produce syntactic
>analyses, and that the only valid type of analysis is the same as, or
>equivalent to, that of the Penn Treebank. This is itself a "theory of
>syntax", and a controversial one.

You have the point rather backward here.  I am saying that if a theory
of syntax (whatever other things might be done in NLP) is to consider
itself a mature theory of syntax, it should be able to produce programs
that meet those minimum standards I have outlined (appended at the end
of this message).  Look at those standards very closely.  I am sure
you will agree that most people have believed that current theories
can already be programmed to meet those very basic requirements, but
in reality they cannot.  After reviewing those standards, ask yourself
if it is reasonable to expect theories of syntax to handle this
minimal level of programming after 35 years of work (and millions of
dollars in grants, salaries and other expenses in academia and in
industry).  Then ask yourself, if there is ONLY ONE theory that can
meet those standards, what that means for other theories of syntax.
(I chose to say "35 years" based on the number of years of the
existence of the Association for Computational Linguistics.)

The Penn Treebank II guidelines are widely established as the standard
for this area of linguistics.  Researchers in both academia and
industry always ask me about our ability to handle the Penn Treebank
styles.  I assume they ask one another for the same proof of ability
in this area.  It is for that reason that we created programs that
could generate those trees and labeled brackets, so that we could use
a standard developed in this field to demonstrate our abilities.  By
the way, there is no one else doing this at this time except Satoshi
Sekine at NYU, whose parser handles only the simplest of sentences.
The fact that the major theories, which are supported by hundreds of
researchers and sizable grants, cannot do this is a cause for serious
concern on the part of the entire field of linguistics.  Institutions
like MIT and Stanford are not lacking in researchers, funds, or
graduate students.  Why can they not meet this standard, which the
field itself has established?

The controversial nature of the Penn Treebank styles should not be a
problem.  Certainly the theory that underlies our parser is very
different, but the fact that we have thoroughly worked out our theory
and our algorithms means that it is a simple matter to translate our
output into the Penn Treebank style.  Any other theory that can meet
the standards that are outlined in my post (appended below) should be
able to do the same.  That is, they should be able to translate their
output into the Penn Treebank styles with relative ease.
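
To illustrate why such a translation is mechanical once the analysis
itself exists, here is a generic sketch in Python (this is not our
parser's actual representation or code; the nested format is
hypothetical): if a theory's output can be put into nested
(label, children) form, emitting a Penn Treebank-style labelled
bracketing is a short recursive walk.

    # Generic illustration only; the internal format is hypothetical.
    def to_bracketing(node):
        """Render a (label, content) node as a labelled bracket."""
        label, content = node
        if isinstance(content, str):           # terminal: (TAG word)
            return "(%s %s)" % (label, content)
        return "(%s %s)" % (label,
                            " ".join(to_bracketing(child) for child in content))

    tree = ("S", [("NP", [("DT", "the"), ("NN", "dog")]),
                  ("VP", [("VBD", "chased"),
                          ("NP", [("DT", "a"), ("NN", "cat")])])])

    print(to_bracketing(tree))
    # (S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT a) (NN cat))))

A full translation would of course have to deal with the complete Penn
Treebank II tag set, function tags, and empty categories, but the
mechanics are the same.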

Again, the fact that these standards and the Penn Treebank guidelines
are not being met by Stanford, MIT, Microsoft, IBM, and so on is a
cause for serious concern not only among linguists but also among
those who fund such projects and the stockholders of those companies.
This is all the more so given that one theory has succeeded in this
area where they have not.  They cannot argue that it is impossible,
and they cannot argue that they have already done it if they cannot
show it.

>For a human being using language, the only function of syntax is to
>fit word meanings together to make sentence meanings: syntax has no
>value on its own, it is just a key to the semantics. This is true for
>most NLP applications too.  Grammatical formalisms which do not
>analyse sentences in terms of trees or labelled bracketings (e.g.
>some types of categorial, dependency, and systemic grammar) are
>capable of producing accurate semantic interpretations and are
>computationally implementable and efficient.

This simply has not been shown.  As a matter of fact, since no NLP
device (based on semantics, syntax, or anything else) before now has
been able to produce the Penn Treebank trees or meet those very
minimal standards that I have proposed, we can state confidently that
the opposite is true: the above clearly are NOT computationally
implementable and efficient.

>    In this sense, Phil Bralich's characterisation of the ideal
>parser as one which produces Penn Treebank syntactic structures
>quickly and accurately covers only part of the field, and not
>necessarily the part which will turn out to describe language most
>successfully in the long run.

Let's not forget that any characterization of a string or sentence or
set of strings and sentences is going to have to be able to account
for some very regular, very predictable facts of structure.  As a
matter of fact no description of semantics, pragmatics, or discourse
is going to be properly grounded until such time as the basic facts of
structure have been properly understood.  This is a reality that all
of linguistics has to live with.  We can not talk about quantum
physics without first having a thorough understanding of atoms,
electrons, protons, and so on.  In a similar manner the linguistics
field is not ready to talk about these other areas until the basic
facts are understood.  I am not saying that they cannot talk about or
research these areas I am only saying that until basic structure is
understood sufficiently, those other studies will never be more than
speculation--a lot like the work in alchemy could not evolve into
modern chemistry until the basic building blocks of nature were
understood.  They had every right to hypothesize and speculate but it
was not until the basics were understood that we saw the advances that
were possible from chemistry.  This will be true for linguistics as
well.  As soon as the basic building blocks and their basic
relationships are understood, then and only then will we see major
advances in other areas of linguistics.

>2. Suitability for implementation in a programming language is not
>the only measure of the success of a grammatical formalism. Language
>is (presumably) designed to work with human brains. We don't know how
>brains work, but they are obviously not the same as serial computers.
>The most natural description of the grammar of a language, then, is
>not necessarily the one that works best on a serial computer.  Even
>implementability itself is not always relevant: e.g. a syntax which
>has a 30-word sentence producing 30 million analyses is
>unimplementable with current techniques; but in a system which
>analysed incrementally, one or a few words at a time, excluded 99.99%
>of analyses on semantic grounds before their structure was built, and
>represented the remaining ambiguities as vagueness in a single
>analysis (all of which the brain perhaps does), it would be fine.

Yes, but take a look at those standards and the theoretical mechanisms
of current theories of syntax and see if you can find any principled
reason why they cannot be brought together to form a proving ground of
a theory's scope and efficiency.  Surely you aren't going to argue
that there must be better theories of math because existing math can
be programmed into calculators?  Nor are you going to argue that there
is a better theory of mathematics out there that takes more of the
brain into account, and that the proof of its validity is the fact
that it cannot be programmed into a calculator?

>3. One of the contributors to this list may well have the ideal
>grammatical formalism half worked out on their desk, but lack the
>time and programming skill (or the money to employ people) to compile
>the 50,000 word dictionary and full-scale grammar needed to pass
>Bralich's test. As has often been said about machine translation, any
>half-way decent syntactic formalism will serve as the base for a good
>system, given the time and money for development. Bralich's argument
>here seems to me only partly relevant to linguistics. He's talking
>largely about programming skill and availability of resources, not
>about the adequacy of theories of linguistic description.

The dictionary can be obtained from the Linguistic Data Consortium for
about $2,500.  And there should be plenty of grad students and
programmers around who would see this as a good resume building
project.  After 35 years this should be common knowledge and there
should have been plenty of opportunity to have tried it.  If there are
new and untried theories out there, we at Ergo would be interested in
looking at them with an eye toward joint development efforts.

More important, though, is the fact that MIT, Stanford, Microsoft,
and IBM are not saddled with these financial and personnel problems,
yet they also have not produced any devices that can meet the
standards I have proposed.  This is in spite of the fact that most
people (even you, it seems) have been led to believe these standards
are easily met and have already been met.

>4. I'm sure many others too will have taken exception to the
>following:
>> computational linguistics departments who do not mention these
>>tools or use tools of this calibre are remiss in their duty
>I requested a copy of the Bracket Doctor to use in my classes, but
>it seems a Unix version does not exist - and like most University CL
>departments, we use Unix machines here.

I think any CL department should be able to afford at least one
Windows 95 machine.  Maybe you can find one at the University library.
The executable fits on one disk and is installed from a standard
setup program, so just copy it to a disk and take it to the nearest
WIN95 machine if you want to try it.  We decided to use Windows 95
because it was the easiest and most convenient way to get these tools
to students, researchers, programmers, marketers, and so on in both
academia and industry.  We cannot limit ourselves to academia alone in
the current intellectual climate.

The original discussion can be found at Linguist 9.255

Philip A. Bralich, President
Ergo Linguistic Technologies
2800 Woodlawn Drive, Suite 175
Honolulu, HI 96822
tel:(808)539-3920
fax:(808)539-3924


-------------------------------- Message 3 -------------------------------

Date:  Mon, 23 Feb 1998 12:51:34 +0000 (GMT)
From:  jock at ccl.umist.ac.uk
Subject:  re: NLP and the best theory of syntax


Although mindful of the risk of exposing myself to Saint Anthony's
Fire, I wish to respond to the criticism of the EAGLES initiative made
by Dr Bralich. I do so in my function as co-Chief Editor of EAGLES.

There is a simple reason why EAGLES does not mention the criteria
espoused by Dr Bralich: EAGLES has not so far concerned itself with
proposing standards in the area of parsers.

That is the short answer; interested readers, please read on.


It is unfair to criticise us for not doing something we had not
included in our (public) programme of work. One may criticise the
initial selection of topics, however. The topics retained were those
where there was wide agreement that some kind of useful consensus
could be obtained in the near term. The set of topics we actually
worked on was furthermore constrained by factors such as the
availability of voluntary labour.

We have worked on the following topics of immediate relevance to the
current debate:

* morphosyntactic annotation of text corpora
* syntactic annotation of text corpora
* morphosyntactic description of lexical items
* syntactic subcategorisation of lexical items (verbs)
* comparative survey of implemented computational linguistic
	formalisms
* linguistic adequacy of CL formalisms
* development of an evaluation framework for NLP products

At present, we are working on, among other topics, semantic
subcategorisation of lexical items, pragmatic annotation of corpora
(text and spoken language) and developing proposals that complement
ISO 9126 from the point of view of NLP products and quality in use.

As Dr Bralich has found it difficult to find appropriate discussions
in the EAGLES literature and as there are presumably others who have
experienced such difficulty, let me point out a few areas of
relevance, which also indicate the limits of our work.

The introduction to our document on syntactic annotation of corpora
states:

"The scope of this report is syntactic annotation of corpora. At first
glance, a study of such annotation practices is difficult to
distinguish from a study of parsers, parsing, grammars, the
representation of parses, and the formalisms adopted for such
representations. Clearly, the syntactic annotation of corpora has a
close interrelation with parsing (indeed, a major function of a
syntactically annotated corpus is to provide a test-bed or a
training-bed for wide-coverage parsers). This cannot be ignored in the
report: but what we are ultimately interested in is the parsing
schemes in use to date (i.e. the set of symbols used in the annotation
scheme and guidelines for their application), although how the corpus
is parsed (the parsing system) is relevant, albeit indirectly, to our
task."

As we are also working in a multilingual environment, where different
languages have different linguistic representational traditions and
needs, we find that there are issues in practical application of
guidelines for syntactic representation:

"Since the approach to syntactic annotation is to a large extent
influenced by the language to be annotated, our guidelines do not give
any preference either to a phrase structure annotation or to a
dependency annotation. The phrase structure annotation, however, is in
certain ways the more demanding of the two, which is why this report
covers phrase structure in more detail. This should not be construed,
however, as expressing a preference for phrase structure annotation.
We will propose notations for both approaches."

In their work, the corpus group took into consideration various
projects including UPenn Treebank, ULancaster Treebank and the SUSANNE
corpus, all of which heavily influenced the shape and content of our
proposals.

This same concern with linguistic representation is met with
throughout the EAGLES reports. Here is an extract from the document on
syntactic subcategorisation, for example:


"The most important concern for EAGLES is linguistic substance.
Consequently, the group is building on the results of the ET-7
feasibility study (Heid & McNaught, 1991) which recommended the
following methodology: to break up the complex descriptive devices
into `minimal observable facts' in order to arrive at the most
fine-grained, common set of features underlying different theoretical
frameworks or systems. EAGLES results are therefore based on a careful
and detailed analysis of different linguistic theories and frameworks,
but aiming at reaching a consensus at the level of these `minimal
observable facts'.

Connected with this basic objective is the approach chosen towards its
achievement, an approach which can be defined as looking for an edited
union (a term due to Gazdar) of the features proposed in the various
major theories and systems. This approach tries to capture all the
relevant distinctions made by the major lexical theories/systems,
without taking a theoretical stand, thereby giving to features labels
which are as neutral as possible.

In an attempt to be as theory-compatible as possible, there are a few
points where choices were left open, especially for those aspects of
grammatical description which tend to be more theory-bound (e.g.
grammatical relations and control). There are practical drawbacks to
this decision -- especially with regard to the implementation of the
proposed standard -- but, at least in this first phase, more
importance was given to avoiding commitment to specific theories of
lexical description. We recognise that there is a tension between the
decision to be flexible and open to more than one choice and the real
and effective usability of the proposed standards. Without abandoning
the principle of flexibility and openness, we provide an indication of
usage by exemplifying the implementation of critical choices.

In general, the EAGLES results are achieved in a dynamic way, with a
cyclical process of revision after one or more phases of testing and
feedback, possibly in large projects. The difference between the
European approach and other approaches to standards should be pointed
out here, to be taken as a description of a general tendency. While
in, say, the USA, a sort of de facto standard is somehow made
available to the community through the provision of publicly available
data, in Europe we try to arrive at consensually agreed
standards. This implies a considerable effort in trying to involve the
relevant experts in the different areas of concern, either in the
phase of producing the standards, or at least in the successive phases
of testing the proposals and providing feedback.  This approach also
involves a large amount of overheads in terms of activities and work
necessary to arrive at a consensus as well as a slower process of
arriving at the aimed-for results."


I also draw attention to preparatory work carried out on computational
linguistic formalisms. In particular, the group charged with this work
organised two workshops that brought together numerous practitioners
from industry and academia. One concentrated on intensive comparison
of implemented formalisms: a common framework for comparison was
agreed on and systems were put through their paces as part of the
information gathering process.  The other focussed on the linguistic
adequacy of implemented formalisms. I stress that this work was
preparatory and was not continued in phase II of EAGLES, for reasons
of little interest in this context. However, the results did reveal a
great degree of convergence among formalisms and gave indications of
how grammars associated with one formalism could be rendered reusable
by another.

The work on evaluation did not specifically include parsers: it was
oriented towards developing a general framework for evaluation of NLP
products and focussed initially on adequacy evaluation. There has been
increasing cooperation between EAGLES and those responsible for ISO
9126, as EAGLES has been instrumental in providing guidelines that
neatly complement the ISO work.  The recent book on NLP evaluation by
Sparck Jones & Galliers recognised the contribution of EAGLES to
evaluation.

Lastly, I note that numerous projects and initiatives in Europe have
chosen to adopt EAGLES guidelines especially for the representation
and annotation of text corpora and dictionaries.  ELRA also works with
EAGLES guidelines. The initiative thus receives constant feedback from
such widespread take-up of its results. (EAGLES also has an important
activity in the development of guidelines for spoken language
resources and processing, and the speech community has responded
warmly to our efforts in this area.)

Those who wish to find out more about how EAGLES guidelines are being
used, debated, developed, applied, etc., are welcome to attend (or to
acquire the proceedings of) the First International Conference on
Language Resources and Evaluation (Granada, May 1998) where many
papers and workshops will refer to EAGLES results, and in a
constructively critical way into the bargain.  (Conference URL:
http://www.icp.inpg.fr/ELRA/conflre.html)

The EAGLES initiative will produce a new round of publications in the
4th quarter of 1998, which will be available from
http://www.ilc.pi.cnr.it/EAGLES/home.html

I trust this explanation has served to put EAGLES work and results in
context. One may naturally disagree with our approach; no-one is
imposing standards on anyone in EAGLES. However, we are encouraged
that large numbers of people from industry and academia have become
involved in this initiative and have given freely of their time to
develop recommendations and guidelines that, by most accounts, meet
with widespread approval and adoption. We must be doing something
right...

JMcN
-

John McNaught		     jock at ccl.umist.ac.uk
(Co-Chief Editor, EAGLES)
Centre for Computational
 Linguistics 		
Department of Language Engineering
UMIST			
PO Box 88
Sackville Street
Manchester, UK         	     tel: +44.161.200.3098 (direct)
M60 1QD                      fax: +44.161.200.3099

---------------------------------------------------------------------------
LINGUIST List: Vol-9-276


