Typological studies based on original texts

Matthew Dryer dryer at BUFFALO.EDU
Tue Apr 4 01:26:16 UTC 2006

While I think Grev is right that there are many typological questions that 
require a large body of texts to answer them for a particular language, I 
also think that there are other questions that do not.  In fact, it is my 
experience that one can tell a tremendous amount about a language from just 
a few pages of text.

Probably about 1000 of the datapoints (dots) on my maps in the WALS atlas 
are based on my examination of texts, rather than on a claim in a 
descriptive grammar.  For high frequency things like the order of subject 
and verb or the order of object and verb, one can get a pretty good idea 
from a few pages of texts (though there may be issues about the 
applicability of the notions 'subject' and 'object' that will not be easily 
answered from texts).  And there are certain things that I have found I 
could determine MORE dependably from texts than from a grammar.  For 
example, there are many languages in which the demonstrative has 
grammaticized as a definite marker, where this becomes clear from texts and 
may not even be mentioned in a grammar.  The extent to which independent 
pronouns are used in subject position is something that I have found is 
often much clearer in texts than from anything said in a grammar, and many 
of the datapoints in my chapter in the WALS atlas that deals with this are 
based on examination of texts.  In general, grammars will often say that 
something is possible (such as a less frequent order, or leaving out 
subject pronouns) but it is only by examining texts that one gets an idea 
of HOW frequent these phenomena are.

In fact, I would say that any descriptive grammar is inadequate without 
some texts.  In some cases, I have found that the evidence from texts shows 
that a claim in the grammar itself is inaccurate.

Matthew Dryer

--On Monday, April 3, 2006 10:26 AM +0100 Greville Corbett 
<g.corbett at SURREY.AC.UK> wrote:

> Nick makes a good point. To back up with some numbers: in the SMG we
> investigated how various typological claims would work out in a one
> million word corpus of a single language. Gathering the data from an
> existing corpus took a person-year of work. (The language was the exotic
> but not totally unknown language Russian.) One of the outputs, which like
> Nick's example you may not want to count, appeared in the Bybee/Hopper
> volume Frequency and the emergence of linguistic structure (2001) - I'll
> send details if you want them. While we found out a lot of what we wanted
> to know, for one of our key questions one million words proved
> insufficient. So there's also the converse of Nick's problem: corpus work
> may require extensive information for each single language.
> Best wishes
> Greville Corbett
> On 3/4/06 10:08, "Nick Evans" <nrde at UNIMELB.EDU.AU> wrote:
>> Berni, an interesting question, and I agree it's
>> very important, though my preference would be to
>> lower the number of languages: large sample sizes
>> can often get in the way of perceptiveness about
>> what's going on, and the most interesting things
>> happening in texts often require you to have the
>> sort of detailed knowledge of language structure
>> that isn't possible to extend to large numbers of
>> languages.
>> An example which I don't know if you'd count is
>> Nikolaus Himmelmann's 1997 Deiktikon, Artikel,
>> Nominalphrase. Zur Emergenz syntaktischer Struktur
>> which really takes advantage of a text based
>> approach to identify emergent structures, in this
>> case the NP or emergent precursor thereof.
>> Best, Nick Evans
>>> Dear colleagues
>>> Does anybody know of any typological investigation based mainly or in
>>> substantial part on the material of original texts in a large number of
>>> languages (say, 20 or more)? There are by now many typological studies
>>> based on reference grammars, and even some based on questionnaires,
>>> parallel texts, and story stimuli (Pear stories, Frog stories), but it
>>> seems to me--I would be very pleased to be wrong--that there are
>>> virtually no large-scale studies based mainly or exclusively on original
>>> texts. One study I am aware of is the following:
>>> Güldemann, Tom. (2001). Quotative constructions in African languages: a
>>> synchronic and diachronic survey. Habilitationsschrift Leipzig.
>>> Unpublished [based on texts in 39 African languages]
>>> In a way it seems strange that there are few such studies, because
>>> Greenberg, who was so influential in other respects, made some pilot
>>> studies in this direction:
>>> Greenberg, Joseph H. (1960). A quantitative approach to the
>>> morphological typology of languages. International Journal of
>>> American Linguistics 26: 178-194.
>>> Greenberg, Joseph H. & O'Sullivan, Chris. (1974). Frequency, marking and
>>> discourse styles with special reference to substantival categories in
>>> the Romance languages. Working Papers on Language Universals 16: 47-72.
>>> Connected to the scarcity of typological studies based on original
>>> texts is the low prestige associated with careful editions of texts
>>> (with translations and glosses). As a consequence of the intensive
>>> typological work based on reference grammars, it seems that reference
>>> grammars have acquired a higher status during the last decades in ever
>>> more places (more libraries buy them, more linguists write and publish
>>> grammars, and a grammar has become a possible topic for a Ph.D. thesis
>>> in more and more universities). The same does not hold for text
>>> collections (most libraries do not buy them, most universities will not
>>> accept an edited text collection as a Ph.D. thesis, and many linguists
>>> never publish their collected texts, or publish only a small portion).
>>> Please send me references to typological studies based mainly on
>>> original texts in more than 20 languages and including at least some
>>> non-European languages. If there are any answers, I'll compile a summary.
>>> Bernhard Waelchli
>>> Max Planck Institute for Evolutionary Anthropology
>>> Department of Linguistics
>>> Deutscher Platz 6
>>> 04103 Leipzig
>>> Germany
>>> --
>>> ========================================
>>> Bernhard Waelchli
>>> University of Berne
>>> bernhard.waelchli at isw.unibe.ch
> --
> Greville G. Corbett
> Surrey Morphology Group
> CMC,
> School of Arts, Communication and Humanities
> University of Surrey
> Guildford                             email: g.corbett at surrey.ac.uk
> Surrey, GU2 7XH                       FAX:   +44 1483 686201
> Great Britain                         phone:  +44 1483 682849
> http://www.surrey.ac.uk/LIS/SMG/
