9.1241, Disc: POS is well-formed or not well-formed?

LINGUIST Network linguist at linguistlist.org
Tue Sep 8 17:31:09 UTC 1998


LINGUIST List:  Vol-9-1241. Tue Sep 8 1998. ISSN: 1068-4875.

Subject: 9.1241, Disc: POS is well-formed or not well-formed?

Moderators: Anthony Rodrigues Aristar: Wayne State U. <aristar at linguistlist.org>
            Helen Dry: Eastern Michigan U. <hdry at linguistlist.org>

Review Editor: Andrew Carnie: U. of Arizona <carnie at linguistlist.org>

Associate Editors:  Ljuba Veselinova <ljuba at linguistlist.org>
                    Martin Jacobsen <marty at linguistlist.org>
                    Brett Churchill <brett at linguistlist.org>

Assistant Editors:  Elaine Halleck <elaine at linguistlist.org>
                    Julie Wilson <julie at linguistlist.org>

Software development: John H. Remmers <remmers at emunix.emich.edu>
                      Zhiping Zheng <zzheng at online.emich.edu>

Home Page:  http://linguistlist.org/


Editor for this issue: Brett Churchill <brett at linguistlist.org>

=================================Directory=================================

1)
Date:  Tue, 8 Sep 1998 09:23:34 +0300 (EET DST)
From:  Deborah D K Ruuskanen <druuskan at cc.helsinki.fi>
Subject:  Re: 9.1235, Disc: POS is well-formed or not well-formed?

2)
Date:  Tue, 08 Sep 1998 02:16:20 +0300
From:  "Christopher A. Brewster" <brewster at upatras.gr>
Subject:  Re: 9.1235, Disc: POS is well-formed or not well-formed?

-------------------------------- Message 1 -------------------------------

Date:  Tue, 8 Sep 1998 09:23:34 +0300 (EET DST)
From:  Deborah D K Ruuskanen <druuskan at cc.helsinki.fi>
Subject:  Re: 9.1235, Disc: POS is well-formed or not well-formed?

The comparison of linguistic segmentation with molecular chemistry is
not valid IMO, precisely because the rules for segmentation differ for
different languages. Consider:

	brot   -  her
	
	broth  -  er

The first example follows the rules for segmentation used for Finnish,
the second the rules for English - at least for written text.  There is
never going to be any agreement on these rules like the agreement for
defining atoms in chemistry, because the symbol/sound systems are so
different. Finnish, BTW, does not leave 'space' between the segments
(words?) so that the post-positions are glued on to the main segment
(head), which is the case in all agglutinative languages. So all we have
left is the distribution-based statistical theory of well-formedness that
is in the textbooks. Unless you all can come up with something better,
which is what I assume this discussion is all about.
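
To make the brot-her / broth-er contrast concrete, here is a minimal
sketch, not a real hyphenator. It assumes that the Finnish-style break
comes from the basic syllabification rule (a boundary falls before every
consonant + vowel sequence once a vowel has been seen), and it treats the
English break as a plain dictionary lookup. The function names and the
one-entry lexicon are hypothetical, made up for illustration only.

```python
VOWELS = set("aeiouyäö")

def split_cv_rule(word):
    """Break before each consonant+vowel pair that follows an earlier vowel.

    Assumption: this stands in for the Finnish-style rule in the example.
    """
    pieces, start = [], 0
    seen_vowel = False
    for i, ch in enumerate(word):
        next_is_vowel = i + 1 < len(word) and word[i + 1] in VOWELS
        if ch not in VOWELS and next_is_vowel and seen_vowel:
            pieces.append(word[start:i])
            start = i
        if ch in VOWELS:
            seen_vowel = True
    pieces.append(word[start:])
    return pieces

# Hypothetical one-entry lexicon standing in for English written conventions.
ENGLISH_LEXICON = {"brother": ["broth", "er"]}

def split_by_lexicon(word, lexicon):
    """Look the break up; leave the word whole if it is not listed."""
    return lexicon.get(word, [word])

print("-".join(split_cv_rule("brother")))                      # rule-based break
print("-".join(split_by_lexicon("brother", ENGLISH_LEXICON)))  # listed break
```

The same string gets two different segmentations depending on which
rule set is applied, which is the point of the example above.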
Cheers, DKR
-
Deborah D. Kela Ruuskanen
Leankuja 1, FIN-01420 Vantaa
druuskan at cc.helsinki.fi



-------------------------------- Message 2 -------------------------------

Date:  Tue, 08 Sep 1998 02:16:20 +0300
From:  "Christopher A. Brewster" <brewster at upatras.gr>
Subject:  Re: 9.1235, Disc: POS is well-formed or not well-formed?

There are aspects of this discussion which are reminiscent of the post-Bloomfield period
in linguistics where supposedly one could only go strictly bottom up and no
'cheating' was allowed. Everything had to be determined purely on distributional
criteria. In fact, of course, everyone used their common sense in deciding what a
word was, or a morpheme etc.

While there may be theoretically a large number of possible POS systems for a given
language, there are such factors as theoretical elegance, common sense and seeing
what works. We all know that language has never fit our neat theoretical containers
very well, but that is why we search for better theoretical accounts.

I would like in addition to ask whether anyone has applied language modelling
methods such as those described by Brown et al. 1992 'Class-based n-gram models of
natural language' or McMahon & Smith 1996 'Improving Statistical Language Model
Performance with Automatically Generated Word Hierarchies' to a language like
Chinese. The approaches described in these papers result in very significant
POS-type classification structures for languages like English, while the criteria
they use are quite reasonable.
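
As a rough illustration of the purely distributional criterion behind
such methods, here is a minimal sketch. It is not the algorithm of either
paper: Brown et al. merge word classes greedily so as to lose as little
average mutual information as possible, whereas this toy only compares
words by their immediate left and right neighbours. The corpus and the
function names are invented for the example, and sentence.split() assumes
whitespace-delimited tokens, which is exactly what unsegmented Chinese
text does not give you.

```python
from collections import Counter, defaultdict
from math import sqrt

def context_vectors(sentences):
    """Count each word's left and right neighbours (toy distributional profile)."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            vectors[tokens[i]][("L", tokens[i - 1])] += 1
            vectors[tokens[i]][("R", tokens[i + 1])] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two neighbour-count profiles."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Invented four-sentence corpus, for illustration only.
corpus = ["the cat sleeps", "the dog sleeps", "a cat runs", "a dog runs"]
vecs = context_vectors(corpus)

print(cosine(vecs["cat"], vecs["dog"]))     # identical contexts
print(cosine(vecs["cat"], vecs["sleeps"]))  # no shared contexts
```

Words that occur in the same environments ("cat", "dog") come out as
similar and would fall into one class; words that never share an
environment ("cat", "sleeps") do not.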

Christopher Brewster
University of Patras
brewster at upatras.gr

---------------------------------------------------------------------------
LINGUIST List: Vol-9-1241


