6.625, Qs: Lithuanian phonology, Greek VOT values, Parser results

The Linguist List linguist at tam2000.tamu.edu
Sun Apr 30 04:59:08 UTC 1995


----------------------------------------------------------------------
LINGUIST List:  Vol-6-625. Sat 29 Apr 1995. ISSN: 1068-4875. Lines: 186
 
Subject: 6.625, Qs: Lithuanian phonology, Greek VOT values, Parser results
 
Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at tam2000.tamu.edu>
            Helen Dry: Eastern Michigan U. <hdry at emunix.emich.edu>
 
Asst. Editors: Ron Reck <rreck at emunix.emich.edu>
               Ann Dizdar <dizdar at tam2000.tamu.edu>
               Ljuba Veselinova <lveselin at emunix.emich.edu>
               Annemarie Valdez <avaldez at emunix.emich.edu>
 
                           REMINDER
[We'd like to remind readers that the responses to queries are usually
best posted to the individual asking the question. That individual is
then strongly encouraged to post a summary to the list. This policy was
instituted to help control the huge volume of mail on LINGUIST; so we
would appreciate your cooperating with it whenever it seems appropriate.]
 
-------------------------Directory-------------------------------------
 
1)
Date: Wed, 26 Apr 95 22:06:22 EDT
From: MICHEL PLATT (m200754 at er.uqam.ca)
Subject: Need info Lithuanian Phonology
 
2)
Date: Thu, 27 Apr 95 12:58:02 +1000
From: (Liz Francis) (E.Francis at unsw.EDU.AU)
Subject: Greek VOT Values
 
3)
Date:   Wed, 26 Apr 1995 21:27:36 -1000
From: Phil Bralich (bralich at uhunix.uhcc.Hawaii.Edu)
Subject: Parser results
 
-------------------------Messages--------------------------------------
1)
Date: Wed, 26 Apr 95 22:06:22 EDT
From: MICHEL PLATT (m200754 at er.uqam.ca)
Subject: Need info Lithuanian Phonology
 
I would like to know who has recently written articles, etc., on
Lithuanian phonology, and what is currently available on the subject.
I am also looking for an e-mail address for the Lithuanian Summer School
in Vilnius, if anyone has it.
 
Michel (Platukis) Platt
 
--------------------------------------------------------------------------
2)
Date: Thu, 27 Apr 95 12:58:02 +1000
From: (Liz Francis) (E.Francis at unsw.EDU.AU)
Subject: Greek VOT Values
 
Dear Linguist Subscribers,
I am interested in obtaining information or references dealing with the
following:
 
VOT values (and any other acoustic characteristics) of the bilabial and
alveolar-dental stops of Modern Greek.
 
If you have any information about this, can you please write to me at:
e.francis at unsw.edu.au
 
Thank You
 
Elizabeth Francis
 
--------------------------------------------------------------------------
3)
Date:   Wed, 26 Apr 1995 21:27:36 -1000
From: Phil Bralich (bralich at uhunix.uhcc.Hawaii.Edu)
Subject: Parser results
 
I have been involved in creating a new parser for English using a
theory of syntax developed by myself and Professor Derek Bickerton
of the University of Hawaii.  We have had good initial success and
would like to ask for comments and suggestions on what we have
accomplished.  Specifically, we would like to receive comments about
lexical items and structure types that readers believe would pose
problems for us.  At the end of this message is a list of sentences that
indicate the extent of our success so far.  If you recognize structure
types or lexical items that we are missing, we will try to include
them in future stages of the parser's development.
 
The parser is still in its infancy, but here are the specs for a rather
complete version of it.  It is based on a series of algorithms
that have been four years in the making, but the programming required
to create this parser has taken only 300 hours using C++.  The
approximately 3000 lines of code compile to a 150k executable on
disk.  About 100k of RAM is required to run the parser.  30k on disk is
required for a 300-word dictionary.  An average sentence takes under
4 seconds to process on a 486 IBM compatible.  Since this is only a
development version, we expect these numbers to change.  To date, no
optimization has been done, and we expect to significantly shrink the
dictionary disk usage and the execution time.
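 
As a rough sanity check on the figures above, the derived ratios can be
worked out as follows.  This is only an illustrative calculation in Python
(not the parser's own C++); it assumes that "k" means 1024 bytes and treats
"under 4 seconds" as an upper bound, and all variable names are invented
for the example.

```python
# Figures as quoted in the posting; "k" is ASSUMED to mean 1024 bytes.
KILOBYTE = 1024

dictionary_bytes = 30 * KILOBYTE   # "30k on disk"
dictionary_words = 300             # "300 word dictionary"
lines_of_code = 3000               # "approximately 3000 lines"
programming_hours = 300            # "300 hours using C++"
seconds_per_sentence = 4           # upper bound: "under 4 seconds"

# Average disk space used per dictionary entry.
bytes_per_entry = dictionary_bytes / dictionary_words

# Average rate at which the code was written.
lines_per_hour = lines_of_code / programming_hours

# Lower bound on throughput, since each sentence takes under 4 seconds.
min_sentences_per_minute = 60 / seconds_per_sentence

print(f"{bytes_per_entry:.1f} bytes per dictionary entry")
print(f"{lines_per_hour:.1f} lines of code per hour")
print(f"at least {min_sentences_per_minute:.0f} sentences per minute")
```

Under these assumptions each dictionary entry occupies roughly 100 bytes
on disk, which gives a sense of the room available for shrinking it.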
 
The final demonstration version of the parser (due to be finished in a
few weeks) will: 1) identify sentences as correct, incorrect but parsable
(e.g. John likes herself), or unparsable (e.g. John up red the), 2)
identify parts of speech as appropriate for context (correctly separating
ambiguous words such as 'can' the verb and 'can' the noun), 3) identify
parts of the sentence such as subject, verb, and object (including complex
subjects and objects as in "What John knows is scary", "John's pictures
of himself are good", or "That John likes Mary is shocking"), 4) change
active sentences to passive and passive to active, and 5) change noun
clauses to questions and questions to noun clauses.  The parser will
also be able to identify appropriate referents for reflexives and
pronouns.  Finally, the parser will be able to respond to statements and
answer questions based on a text that you create from the 200-word
dictionary.
 
In the sentences that follow, we are of course aware that negatives and
coreference are not yet handled.  Those are somewhat separate issues,
and we will be posting our results with those in about a week.  We
would like to receive responses in three areas: 1) the basic sentences
posted in this message, 2) the facts of coreference to follow in the next
message, and 3) problems with negatives and quantifiers which will be
posted still later.
 
BASIC SENTENCES
Currently the parser can assign a '+' to acceptable sentences, a '*' to
parsable but unacceptable sentences, and an 'F' to sentences that fail to
parse.  The structure types are these:  statements, topicalized
sentences, yes/no questions, wh-word questions, the that-trace effect,
and control sentences.  Samples (1) - (30) illustrate the results that we
are getting.  The user inputs the string and then the program returns a
judgement about the string.  The sentences we chose are meant to
illustrate the program's ability to handle a full range of syntactic
phenomena.  The current lexicon we are dealing with has about 300
words, but we are limiting vocabulary choice in this posting to focus
on the structures.  Frequently parsers are aimed at a robust ability to
handle large corpora.  However, the problems involved in such a
project are largely lexical.  The focus of this project is to develop a
parser that can take individual sentences and provide a very complete
treatment of the syntax: give acceptability judgements, label parts of
speech, label parts of the sentence, identify problems, and manipulate
structures (e.g. passive to active or question to noun clause).  We are
unlikely to apply this work to large corpora until it is fully capable of
providing a full syntactic treatment of individual sentences.
 
1.   +John put the car in the garage
2.   +Did John put the car in the garage
3.   +Who put the car in the garage
4.   +What did John put in the garage
5.   *What did John put the car in the garage
6.   *What did who put in the garage
7.   *who what did John put in the garage
8.   *who did what did John put in the garage
9.   *what did John forget who bought
10.  +this car John put in the garage
11.  *this car who put in the garage
12.  *where this car John put
13.  NP +the car that John put in the garage
14.  +John wants someone to work for ('John' is subject of 'to work';
     'someone' is object of 'for')
15.  +John wants someone to work for him ('someone' is subject of
     'to work')
16.  +Who does John want someone to work for ('who' is object of
     'for'; 'someone' is subject of 'to work')
17.  +who does John want to work for Bob ('who' is subject of 'to
     work')
18.  *who does John want Bob to work for Mary
19.  +John knows that Bob likes Mary
20.  *John knows that likes Mary
21.  +John tried to go
22.  *John tried Bob to go
23.  +Who does John want to go
24.  *who does John try to go
25.  *John can going
26.  *John has going
27.  *John does gone
28.  *John knows the who Bob saw
29.  *John wonders John is a student
30.  *John knows who is that man
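 
The numbered list above has a regular enough shape to be read back
mechanically, which may be handy for anyone wanting to run the same items
against another parser.  The following is a minimal sketch in Python and is
not part of the parser described in this posting: the names Judgement,
Sample, parse_sample and tally are hypothetical, the handling of an 'F'
marker is an assumption (no item in the list carries one), and the
parenthetical annotations on items 14-17 are not handled.

```python
import re
from collections import Counter
from enum import Enum
from typing import NamedTuple, Optional


class Judgement(Enum):
    ACCEPTABLE = "+"    # acceptable sentence
    UNACCEPTABLE = "*"  # parsable but unacceptable sentence
    FAILED = "F"        # fails to parse (line format ASSUMED)


class Sample(NamedTuple):
    number: int
    label: Optional[str]  # e.g. "NP" on item 13, otherwise None
    judgement: Judgement
    sentence: str


# Item number, optional period, optional upper-case label such as "NP",
# then the judgement marker and the sentence itself.
SAMPLE_LINE = re.compile(
    r"^\s*(\d+)\.?\s+(?:([A-Z]+)\s+)?([+*]|F(?=\s))\s*(.+?)\s*$"
)


def parse_sample(line: str) -> Optional[Sample]:
    """Parse one list line; return None for lines that do not fit."""
    match = SAMPLE_LINE.match(line)
    if match is None:
        return None
    number, label, marker, sentence = match.groups()
    return Sample(int(number), label, Judgement(marker), sentence)


def tally(lines) -> Counter:
    """Count how many parsed items carry each judgement."""
    samples = (parse_sample(line) for line in lines)
    return Counter(s.judgement for s in samples if s is not None)


# A few unannotated items copied from the list above.
ITEMS = [
    "1.   +John put the car in the garage",
    "5.   *What did John put the car in the garage",
    "13.  NP +the car that John put in the garage",
    "30   *John knows who is that man",
]

if __name__ == "__main__":
    for item in ITEMS:
        print(parse_sample(item))
    print(tally(ITEMS))
```

Keeping the optional label separate from the marker lets item 13, which is
judged as a noun phrase rather than a sentence, be read the same way as
the others.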
 
As I said earlier, our results with coreference for anaphora and pronouns and
our results with negation and quantification will be posted in two future
messages (about one week each).
 
Phil Bralich
bralich at uhccux.uhcc.Hawaii.edu
 
--------------------------------------------------------------------------
LINGUIST List: Vol-6-625.


