Typological databases and the reading public

Sat Apr 21 10:00:35 UTC 2007

Thanks, Frans, for the extensive reaction from the board of LT.
Although the first line (Databases are databases, and journal articles are journal articles)
seems to suggest that you think there is no place for discussing databases as such in
LT and the like, the rest of the contribution gives mainly arguments for
the opposite. What I suggested a couple of days ago was
short notes on web-accessible databases which have already been used for completed
(and ongoing) research presented in printed form. This should be done
according to a short, clear format, which immediately gives an impression of the
contents and the quality of the data in the database.
Just as book reviews (in LT) give us a shortcut to new literature in the field,
a note on an available database (with scores on criteria crucial for
successful use by third parties) could help us expand our typological resources and
stand on the shoulders of those who entered some linguistic domain before us.
Work done on the combined databases of the WALS atlas is a clear example
of the latter. 
As Dan pointed out, Childes is indeed a perfect example of sharing
and combining data resources to the benefit of the community, in this case mainly
the language acquisition community. That field was lucky to agree on a fixed format early on.
There are several reasons why this is much more complex in the case of typology,
and why this field seems to be lagging behind in that respect by several decades. The reasons
are partly technical (lack of standardization in formats and software) and, more importantly,
theoretical: there is no unifying framework which defines the categories and
makes the interpretation of other typologists' data more or less straightforward.
Childes databases may be interpreted by any user, including non-linguists, since
the meaning of the data is quite straightforward. More often than not, it concerns one
language, or very few, typically well-known languages, or even the native language of
the researcher herself. In that sense they may be compared directly to corpora
(BNC, CGN etc) as nowadays widely used by all kinds of linguists, theoretical
and applied. Typologists, on the other hand, typically deal with hundreds of languages,
some well known to them but others hardly at all, for which they often have to rely on
descriptions by others, and must make their own interpretations and categorizations.
No existing theoretical framework provides the complete descriptive apparatus to date. 
It is precisely the work of typologists that may help theories to expand their empirical coverage, 
to the extent that some theories can be bothered about language data in the first place.
In that sense typology has a long way to go; the TDB project that was mentioned
by Martin (and myself) is a first example of bringing together a number of
existing databases under the umbrella of one interface and (an attempt at) a
common vocabulary.
So, LT, or others, keep your pages open for database notes, and help the
typological community to share and unite their resources, at the same time
giving some deserved credit to all your Unknown Database Contributors.

Best,

Dik

Dik Bakker 
Dept. of General Linguistics 
Universities of Amsterdam & Lancaster
tel (+44) 1524 64975 & (+31) 20 5253864
http://home.medewerker.uva.nl/d.bakker/

Societas Linguistica Europaea 
Secretary/Treasurer
http://cf.hum.uva.nl/sle/
