6.1438, Qs: Text wanted, Data acquisition, AGR, Dictionary

Tue Oct 17 14:05:52 UTC 1995

---------------------------------------------------------------------------
LINGUIST List:  Vol-6-1438. Tue Oct 17 1995. ISSN: 1068-4875. Lines:  247

Subject: 6.1438, Qs: Text wanted, Data acquisition, AGR, Dictionary

Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at tam2000.tamu.edu>
            Helen Dry: Eastern Michigan U. <hdry at emunix.emich.edu>

Associate Editor:  Ljuba Veselinova <lveselin at emunix.emich.edu>
Assistant Editors: Ron Reck <rreck at emunix.emich.edu>
                   Ann Dizdar <dizdar at tam2000.tamu.edu>
                   Annemarie Valdez <avaldez at emunix.emich.edu>

Software development: John H. Remmers <remmers at emunix.emich.edu>

Editor for this issue: dseely at emunix.emich.edu (T. Daniel Seely)
                           REMINDER
[We'd like to remind readers that the responses to queries are usually
best posted to the individual asking the question. That individual is
then strongly encouraged to post a summary to the list. This policy was
instituted to help control the huge volume of mail on LINGUIST; so we
would appreciate your cooperating with it whenever it seems appropriate.]

---------------------------------Directory-----------------------------------
1)
Date:  Mon, 16 Oct 1995 15:55:57 EDT
From:  messing at tiger.asel.udel.edu (Lynn Messing)
Subject:  Q: text with representative frequency distribution of letters

2)
Date:  Mon, 16 Oct 1995 16:42:51 EDT
From:  SDFNCR at ritvax.isc.rit.edu ("Dr. Word")
Subject:  Data acquisition hardware and software

3)
Date:  Mon, 16 Oct 1995 20:43:28 CDT
From:  howard at mailhost.tcs.tulane.edu (Harry Howard)
Subject:  AGR+ morphology

4)
Date:  Tue, 17 Oct 1995 02:46:22 BST
From:  lxalvarz at unica.udc.es (Celso Alvarez-Caccamo)
Subject:  Query: A Universal Electronic Dictionary of Language Names

---------------------------------Messages------------------------------------
1)
Date:  Mon, 16 Oct 1995 15:55:57 EDT
From:  messing at tiger.asel.udel.edu (Lynn Messing)
Subject:  Q: text with representative frequency distribution of letters

We are doing research which involves training neural networks
to recognize fingerspelling. For training purposes, we would like
to have a relatively small text which approximates the
frequency distribution of the letters as used in a typical
English text. Does anyone know whether such a piece has been
composed? BTW, we realize that there will be different frequency
distributions depending on the genre; we would be happy for a
text which represents the frequency distributions from any
genre or from a composite of genres.
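
One way to judge how closely a candidate text approximates a target
distribution is to compare relative letter frequencies directly. The
following is only a minimal sketch: it assumes the reference is any
larger English sample you supply yourself, and the function names and
the choice of total variation distance as the measure are illustrative
assumptions, not something taken from an existing tool.

```python
from collections import Counter
import string

LETTERS = string.ascii_lowercase

def letter_distribution(text):
    """Relative frequency of each letter a-z; case and non-letters are ignored."""
    counts = Counter(ch for ch in text.lower() if ch in LETTERS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text contains no letters a-z")
    return {letter: counts[letter] / total for letter in LETTERS}

def total_variation(p, q):
    """Half the summed absolute differences: 0 means identical, 1 means disjoint."""
    return 0.5 * sum(abs(p[letter] - q[letter]) for letter in LETTERS)

def best_candidate(candidates, reference_text):
    """Return the candidate whose letter distribution is closest to the reference."""
    ref = letter_distribution(reference_text)
    return min(
        candidates,
        key=lambda text: total_variation(letter_distribution(text), ref),
    )
```

Given several short candidate passages and a large reference sample,
best_candidate(candidates, reference_text) returns the passage with the
smallest distance; the distance itself indicates how good the match is.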

Any help would be greatly appreciated.

Cheers,
Lynn Messing

___________________________________________

Lynn Messing  (messing at asel.udel.edu)
Applied Science and Engineering Laboratories
The Alfred I. duPont Institute and The University of Delaware

    Alfred I. duPont Institute        voice: (302) 651-6846
    P.O. Box 269                      tdd:   (302) 651-6834
    1600 Rockland Rd.                 fax:   (302) 651-6895
    Wilmington, DE 19899
http://www.asel.udel.edu/~messing/home.html

------------------------------------------------------------------------
2)
Date:  Mon, 16 Oct 1995 16:42:51 EDT
From:  SDFNCR at ritvax.isc.rit.edu ("Dr. Word")
Subject:  Data acquisition hardware and software

Faculty in our department may have an opportunity to get some new equipment to
generate experimental stimuli and run experiments dealing with visual
perception of language.  We have a RISC-based Mac already available.  One
option is to beef it up with video cards and extra memory, but some people in
our department are under the impression that for collecting data (e.g.,
reaction time, randomizing stimuli, ...) there are more packages available in
the PC world than in the Mac world.  We would be interested in your expert
feedback on the following questions:

1)  Are there equivalent packages for the Mac?  If so, are they any good?
2)  Since Macs are generally thought to be better for graphics than PCs, would
it be possible to network a Pentium with the above-mentioned beefed-up Mac so
that the Pentium did the data acquisition while the Mac did the presentation of
the data?  If so, have you done it yourself, and did it require a great deal of
programming expertise?

I would be happy to summarize any responses for the net.  Please address
replies directly to me, as I have been unable for some reason to receive any
messages from Linguist.  Thank you in advance for your help.

Susan Fischer                             |  Internet: sdfncr at rit.edu
National Technical Institute for the Deaf |  Phone:  (716) 475-6558
Rochester Institute of Technology         |  Fax:    (716) 475-6500
52 Lomb Memorial Drive                    |
Rochester, NY  14623-5604                 |  Microsoft Works is an oxymoron

------------------------------------------------------------------------
3)
Date:  Mon, 16 Oct 1995 20:43:28 CDT
From:  howard at mailhost.tcs.tulane.edu (Harry Howard)
Subject:  AGR+ morphology

Fellow linguists,

        I am looking for a morpheme which agrees with a noun phrase (e.g.
in person, number, and/or gender) and which adds some other - preferably
non-redundant - information about its 'antecedent'. In a right-branching
language, an illustrative sequence would be something like:

                            Mary ... 3sf+

where '3sf' stands for a third singular feminine morpheme associated with
'Mary' and the plus sign indicates that the morpheme adds some more
information about Mary.
        One obvious candidate for '+' would be case, but I would like to
exclude this possibility on two grounds. The first is that I already know
of AGR+ morphemes that add Case, namely, the Spanish clitic pronouns. When
they double full NPs, they add the information that the doubled phrase is
accusative or dative. The other reason is that, at least in structure-based
theories, if the antecedent occupies its base position, this position
already encodes a given Case so '+' would not count as adding non-redundant
information.
        So, what other features could '+' be? Focus, (in)definiteness?

        I ask this question for two reasons. The first is that the
reanalysis of Burzio (1991) undertaken in Franks & Schwartz (1994) suggests
that AGR+ morphology should not exist. Franks & Schwartz propose the
following simplification of Burzio:

If A binds B, then B agrees with A.      (11)
B agrees with A iff B is non-distinct from A in phi features.     (12)
"... the non-distinctness criterion will have to be understood
directionally: the target of agreement B cannot be more specified than the
source A, so that any phi feature specified in B must be similarly
specified in A." (p.235)

A phi feature is a grammatical feature like person, number, and gender. My
own take on Franks & Schwartz's proposal is the following observation:

A dependent element does not have a superset of its antecedent's phi features.

This seems to be taken for granted in research on morphosyntactic
dependencies, but are there really no counterexamples?
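
The directional reading of (12) can be made concrete with a small
sketch. It assumes feature matrices are flat Python dicts; the names
PHI_FEATURES and agrees are hypothetical, chosen for illustration only.
It checks just the condition quoted above: every phi feature specified
in the target B must be specified with the same value in the source A.

```python
PHI_FEATURES = ("PER", "NUM", "GEN")  # person, number, gender

def agrees(target, source, features=PHI_FEATURES):
    """True if every listed feature that the target specifies is
    specified with the same value in the source (directional reading)."""
    return all(
        feat in source and source[feat] == target[feat]
        for feat in features
        if feat in target
    )

antecedent = {"PER": "3", "NUM": "s", "GEN": "f"}
underspecified = {"PER": "3", "NUM": "s"}  # no gender specified

print(agrees(underspecified, antecedent))  # target less specified than source
print(agrees(antecedent, underspecified))  # target more specified than source
```

The first call succeeds and the second fails, which is the asymmetry
the quotation describes. Note that with the default feature list a
non-phi feature such as focus is simply not inspected; whether it
should be is the question being asked here.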
        A second reason for asking is that unification-based theories would
seem to have no problem in formulating AGR+ rules. Let us assume that the
putative AGR+ morpheme adds a feature for focus, say [FOC:+]. If the matrix
for 'Mary' is as on the left below, it can unify with the matrix for the
putative morpheme on the right to give the matrix underneath:

   [REF:Mary, PER:3, NUM:s, GEN:f...] [PER:3, NUM:s, GEN:f, FOC:+]
                 [REF:Mary, PER:3, NUM:s, GEN:f, FOC:+ ...]

If such constructions are not attested, then ruling them out would be an
interesting challenge, though my knowledge of the unification literature is
too scanty to know whether this issue has been addressed or not.
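
The unification step above can be sketched in a few lines. This is a
deliberately simplified illustration that assumes flat feature
matrices represented as Python dicts; it ignores nested structures and
reentrancy, which real unification formalisms handle, and the function
name unify is hypothetical.

```python
def unify(a, b):
    """Unify two flat feature matrices.

    Returns the merged matrix, or None if the two matrices assign
    different values to the same feature.
    """
    result = dict(a)
    for feat, value in b.items():
        if feat in result and result[feat] != value:
            return None  # feature clash: unification fails
        result[feat] = value
    return result

mary = {"REF": "Mary", "PER": "3", "NUM": "s", "GEN": "f"}
agr_plus = {"PER": "3", "NUM": "s", "GEN": "f", "FOC": "+"}

print(unify(mary, agr_plus))
```

Nothing in this operation blocks the extra [FOC:+] feature: the result
contains everything from both matrices, so any ban on AGR+ morphology
would have to be stated as a separate constraint.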

Please reply to me personally, and I will post a summary to the list.

Harry

Burzio, Luigi. 1991. The morphological basis of anaphora. Journal of
Linguistics. 27:81-105.
Franks, Steven & Linda Schwartz. 1994. Binding and non-distinctness: a
reply to Burzio. Journal of Linguistics. 30:227-243.

***************************************************************************
Harry Howard, Ph.D.                                     voice: 504/862-3417
Dept. of Spanish & Portuguese                             fax: 504/862-8752
302 Newcomb Hall                                    Harry.Howard at tulane.edu
Tulane University                            howard at mailhost.tcs.tulane.edu
New Orleans, LA  70118-5698
USA
                          http://spgr.sppt.tulane.edu/Span+Port/HHHome.html

------------------------------------------------------------------------
4)
Date:  Tue, 17 Oct 1995 02:46:22 BST
From:  lxalvarz at unica.udc.es (Celso Alvarez-Caccamo)
Subject:  Query: A Universal Electronic Dictionary of Language Names

Continuing with the litany of curious queries (*not* for a research paper
.-)), I wonder if a tool such as the following exists anywhere: An
on-line, interactive (preferably in hypertext) multilingual catalogue or
dictionary of language names in all languages.

At times, particularly when reading linguistic literature in a foreign
language, it is hard to recognize a given language name, or to
adapt/translate the name into yet another language.  Given the prominence
of English in the field, speakers of languages other than English are
often confronted with the dilemma of whether to simply adopt the English name
or to try to adapt it according to its likely pronunciation.  This is
both unfair and unnecessary in the age of technology.

The tool I'm proposing would consist of something like a matrix or grid of
x columns by x rows of languages in alphabetical order, including the IPA
representation of the standard pronunciation of the language name.  An
example follows:

             IPA         ENGLISH      FRANC,AIS    PORTUGUE^S
English      'iNglIS     English      anglais      ingle^s
franc,ais    fRo~'sE     French       franc,ais    france^s
portugue^s   purtu'GeS   Portuguese   portugais    portugue^s

Language names (first column) would be represented in their respective
standard scripts where these exist (if not, only in IPA). Of course, a 5,000 x
5,000 grid (or so) wouldn't fit entirely on any screen in the world.  The
hypertext tool should allow for navigating the database, selecting a set
of target languages (first row) and/or language names for comparison,
listings, quick searches, etc.  The database would be ever-expanding;
cells would be filled gradually and columns/rows added as information
becomes available.

The database would be useful for questions such as:
-What is the native term and pronunciation for "Estonian"?
-How do you write "Wolof" in Korean?
-What language is "abexim"?
-How have languages similar to my own adapted "Swahili"?
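
As a rough illustration of the data structure, the grid could be held
as a nested mapping and queried in both directions. The sketch below
uses only the three sample rows given earlier, in their ASCII
transliteration; the names GRID, name_in, and identify are hypothetical,
and a real tool would need a proper database and Unicode support.

```python
# GRID[language][column] = the cell for that language in that column.
# Rows and columns are exactly those of the sample table.
GRID = {
    "English": {
        "IPA": "'iNglIS", "ENGLISH": "English",
        "FRANC,AIS": "anglais", "PORTUGUE^S": "ingle^s",
    },
    "franc,ais": {
        "IPA": "fRo~'sE", "ENGLISH": "French",
        "FRANC,AIS": "franc,ais", "PORTUGUE^S": "france^s",
    },
    "portugue^s": {
        "IPA": "purtu'GeS", "ENGLISH": "Portuguese",
        "FRANC,AIS": "portugais", "PORTUGUE^S": "portugue^s",
    },
}

def name_in(language, column):
    """Look up one cell; returns None while the cell is still unfilled."""
    return GRID.get(language, {}).get(column)

def identify(name):
    """Return (language, column) pairs whose name cell matches, ignoring case."""
    wanted = name.lower()
    return [
        (language, column)
        for language, row in GRID.items()
        for column, cell in row.items()
        if column != "IPA" and cell.lower() == wanted
    ]

print(name_in("franc,ais", "PORTUGUE^S"))
print(identify("anglais"))
```

name_in answers questions of the form "how do you write X in Y?", and
identify answers "what language is this name?"; a name not yet in the
grid yields an empty list rather than an error, which suits a database
that is filled in gradually.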

So, that's my Babelian proposal.  The Internet is a good place to start
building the tool.  There should be some team of computer-literate
lunatics out there willing to build it.  But non-commercially, please.

Celso Alvarez-Caccamo
Departamento de Linguistica Geral e Teoria da Literatura
Universidade da Corunha, Galiza - Spain
Tel: 34-81-100457, ext. 1758
FAX: 34-81-102459
lxalvarz at udc.es
------------------------------------------------------------------------
LINGUIST List: Vol-6-1438.