[HPSG-L] relation between Formal Grammar of Human Language and CFG

Thomas Graf hpsg-l at thomasgraf.net
Fri Aug 5 18:29:10 UTC 2022


Formal grammarian/computational linguist here who can't resist chiming 
in. Advance warning: I tried to keep this short, but it's a long read...

>While CCG and TAG people will tell you that this is way too 
>powerful, HPSGians usually have a different view. The formalism 
>should not be constraining but the theory formulated within the 
>formalism has to be as restrictive as possible. Carl Pollard argued 
>for this in a paper.

CCG, TAG, MGs, and others worry about the "too powerful" part for two 
distinct reasons. My impression is that the HPSG community tends to 
focus on the first reason when it is actually the second reason that 
motivates formal grammarians' search for restricted formalisms.


Reason 1
========

Linguistics is in the business of characterizing what separates
the patterns we find in natural languages from arbitrary mathematical 
patterns, e.g. dependencies that involve the ability to distinguish 
prime numbers from numbers that aren't prime (this is within the class 
of recursively enumerable string languages).
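
To make the contrast concrete, here is a minimal Python sketch of one 
such unnatural pattern: strings over a single symbol whose length is 
prime. The function names are hypothetical and only there for 
illustration; the point is that the pattern is trivial to define 
mathematically yet looks nothing like a natural language dependency.

```python
def is_prime(n: int) -> bool:
    """Trial division; slow but sufficient for illustration."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def in_prime_length_language(s: str) -> bool:
    """Accept exactly the strings a^p where p is a prime number."""
    return set(s) <= {"a"} and is_prime(len(s))
```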

For this goal, it really does not matter whether one puts the 
restrictions in the formalism or the theory. For example, no language 
seems to have cowardly islands, i.e. cases where an island loses its 
island status unless there are at least n other islands in the sentence. 
"HPSG the formalism" could define such islands, but they would still 
look very odd within "HPSG the theory".

Quite simply, what we want is explanations, and it's not really that 
important whether those come from the formalism or the theory stated 
within the formalism. HPSG is justified in doing this purely through 
the theory, but CCG, TAG, MGs and other communities are equally 
justified in doing this (largely, but rarely exclusively) through the 
formalism.

Reason 2
========

We don't just want grammatical descriptions, we also want parsing and 
learning algorithms with safety guarantees and guaranteed computational 
bounds. And you get those guarantees and bounds by exploiting computational 
limitations of the formalism. If, for instance, your syntactic 
derivations have a finite-state backbone, that opens up a whole 
toolbox of tricks you can rely on for parsing and learning.
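
As a rough string-level analogue (derivations would call for tree 
automata rather than string automata), here is a minimal sketch of 
the kind of guarantee finite-stateness buys. The toy automaton and 
its category labels are assumptions made up for this example. 
Recognition is a single left-to-right pass with one table lookup per 
symbol, so the running time is linear in the input length by 
construction.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy automaton: a determiner, any number of adjectives,
# then a noun. States and labels are invented for illustration.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("start", "Det"): "np",
    ("np", "Adj"): "np",
    ("np", "N"): "done",
}
ACCEPTING: Set[str] = {"done"}


def accepts(tags: Iterable[str]) -> bool:
    """Run the automaton once over the input; one lookup per symbol."""
    state = "start"
    for tag in tags:
        nxt = TRANSITIONS.get((state, tag))
        if nxt is None:
            # No transition defined: reject immediately.
            return False
        state = nxt
    return state in ACCEPTING
```

Nothing about the bound depends on what the automaton encodes; it 
follows from the format of the device alone, which is exactly what 
makes it exploitable.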

Now in principle this should work just as well no matter whether we 
start with a restricted formalism or a formalism restricted by a 
theory, but in practice it simply doesn't. The restrictions of 
linguistic theories are not readily exploited in parsing and learning 
algorithms, and they are difficult to translate into computational 
properties that could be exploited this way.

In particular, it is not obviously true that we can feasibly 
reaxiomatize HPSG into a more restricted formalism once we have 
absolute certainty about the restrictions of the theory. The fact that 
such a reaxiomatization exists does not mean that we can find it, or 
that we can prove that this reaxiomatization is correct and covers all 
of HPSG as a theory. To wit, it's unclear whether the TAG 
implementation of HPSG in Kasper et al (1995) actually accommodates 
"HPSG the theory".

If you care deeply about these issues, starting with a restricted, 
well-understood formalism that gets expanded as needed is the safer 
route.


A few other points
==================

Despite what I just said above, I think the difference between the 
HPSG approach and the formal grammar approach isn't actually all that 
big.

The modus operandi of HPSG is to take an unrestricted description 
language and then state within that the linguistic restrictions that 
we care about. When restrictions turn out to be too strong, we loosen 
them; when it turns out there's empirical slack, we tighten them.

Well, that's exactly what formal grammar does. Our unrestricted 
description language is math itself, and with that language we state 
the linguistic restrictions we care about --- these restrictions are 
what's called a formalism. Just like HPSG theory isn't set in stone, 
formalisms aren't set in stone: we can restrict TAG, expand it, design 
a completely new formalism, and so on.

To me, the central advantage of the formal grammar approach is that 
you can have two types of restrictions: fine-grained linguistic 
restrictions that explain the empirical lay of the land, and the broad 
computational restrictions that can be relied on in parsing and 
learning. These broad restrictions also provide new hooks for 
artificial language learning experiments; they close the gap to 
neuroscience where we do not have the means yet to test our 
fine-grained linguistic restrictions; they provide the foundation for 
comparisons to animal "language", and much more.  Whenever we only 
need to be roughly in the right ballpark rather than exactly right, 
computational restrictions are an invaluable asset.

In a nutshell: Computational restrictions don't limit linguistic 
research, they foster it.

Cheers,
   Thomas


On 2022/08/05  17:11, Stefan Müller wrote:
>Dear Roussanka,
>
>The general insight from the eighties was that human language is not 
>context free. Some people call it mildly context-sensitive. Since GPSG 
>was constructed to be context free, people were looking for something 
>more powerful and abandoned it. This taken together 
>with a more lexical view resulted in HPSG, which has exactly the right 
>generative capacity: It has Turing power. =:-)
>
>While CCG and TAG people will tell you that this is way too powerful, 
>HPSGians usually have a different view. The formalism should not be 
>constraining but the theory formulated within the formalism has to be 
>as restrictive as possible. Carl Pollard argued for this in a paper.
>
>The whole discussion and references can be found in the HPSG handbook 
>and in my Grammar Theory textbook. I have a (brief) chapter on 
>generative power (Chapter 17):
>
>https://langsci-press.org/catalog/book/259
>
>https://langsci-press.org/catalog/book/287
>
>Geoff Pullum has a good paper on the history of the discussion. All 
>referenced from within the book.
>
>Best
>
>    Stefan
>
>
>Am 05.08.22 um 14:24 schrieb Roussanka Loukanova:
>>Dear All,
>>
>>What is the verdict on the relations between Formal Grammar of Human
>>Language and Context-Free Grammar (CFG) of Chomsky hierarchy on formal
>>grammars and languages?
>>
>>I would appreciate very much, opinions, points to research and, especially,
>>bibliographical references.
>>
>>Best Regards,
>>Roussanka
>>_______________________________________________
>>HPSG-L mailing list
>>HPSG-L at listserv.linguistlist.org
>>https://listserv.linguistlist.org/cgi-bin/mailman/listinfo/hpsg-l

-- 
Thomas Graf
Stony Brook University
Department of Linguistics
mail at thomasgraf.net
http://thomasgraf.net
