publishing typological databases

Stuart Robinson stuart at ZAPATA.ORG
Mon Apr 16 21:32:47 UTC 2007

Ashild has a good point. Part of the problem is the culture of descriptive
linguistics, where there is still a fair bit of indifference and even
hostility towards the technological investment required to support
sustainable digital fieldwork data. I'm thinking, for example, of Bob
Dixon's statement on this list when he received the Leonard Bloomfield
"A word addressed to junior colleagues who think that it will
improve their work to immerse it in the latest electronic technology.
Don't. Because it won't. I worked on the Jarawara grammar as I did on
previous grammars of Dyirbal, of Yidiɲ, of Boumaa Fijian (and of
English). I used pencil, pen and spiral-bound notebooks, plus a couple of
good-quality tape recorders. No video camera (to have employed
one would have compromised my role in the community). No lap-top. No
shoebox or anything of that nature. And no also grammatical
elicitation from the lingua franca."
This passed without comment when it was posted roughly a year ago, but if
people are serious about recognizing the value of electronic data, it
shouldn't have.

Stuart Robinson

On Mon, 16 Apr 2007, Ashild Naess wrote:

> Dear Martin,
> the question you raise is just as relevant for descriptive linguistics; 
> properly annotated corpora of descriptive data require an enormous 
> amount of analysis work, but are generally not recognised as research 
> output by those who count such things. Finding ways of having electronic 
> data sets recognised as publications would be a great benefit to the 
> whole field.
> There was some discussion of the question at a recent conference in 
> Sydney on electronic data collection, annotation and archiving. The 
> following paper from the conference proceedings may be of interest:
> Coleman, Ross. 2006. Field, file, data, conference: Towards new modes of 
> scholarly publication. In Linda Barwick and Nicholas Thieberger (eds): 
> Sustainable data from digital fieldwork. Sydney: Sydney University 
> Press. 163-174.
> The paper is available online at 
> Best,
> Åshild
> On 13.04.2007 16:21, Martin Haspelmath wrote:
> > Dear typologists,
> > 
> > Last week at an informal meeting of the European Typology Network in 
> > Leipzig, we discussed the issue of publishing typological databases. In 
> > the past, this was a practical problem, because journals and book 
> > publishers were reluctant to print many pages of tabular data. The basic 
> > practical problem has disappeared with modern information technology, 
> > but many problems remain, and it would be good if typologists made a 
> > joint effort to address them.
> > 
> > Traditional paper publication simultaneously fulfills at least four 
> > distinct functions:
> > 
> > (i) giving *recognition* (or even prestige) to a researcher's work, so 
> > that they can list it on their CV as the visible outcome of their work
> > 
> > (ii) *citability*, i.e. allowing users of published work to build on 
> > this work without having to vouch for it personally, without having to 
> > mention all the details, etc.
> > 
> > (iii) *accessibility*, i.e. allowing users in many different places (in 
> > principle, at any institution devoted to research, and beyond) to access 
> > the results of the work
> > 
> > (iv) *standardization*, i.e. things like uniform glossing, 
> > bibliographical references, section organization, or even uniform 
> > terminology (in some particular context, e.g. an edited volume)
> > 
> > All of these functions are important also for typological databases, but 
> > while some progress has been made with regard to (iii) (accessibility), 
> > the other requirements (recognition, citability, and standardization) 
> > still need a lot of thinking and work on our part. You can access some 
> > typological databases such as the Surrey morphology databases, the
> > Berlin-Utrecht Reciprocals Survey, and the Graz Reduplication
> > database, but these websites
> > generally don't say how to cite data from these databases, so they do 
> > not give enough recognition to the authors.
> > 
> > Standardization has been addressed by the Typological Database System,
> > and this project additionally aims
> > for a fifth function, *cross-searchability*, that was not possible with 
> > traditional paper publication at all.
> > 
> > Another problem is how to divide databases into units: Some databases 
> > (such as the database of the World Atlas of Language Structures, which 
> > will become available on the web in 2008) are aggregates of datasets 
> > contributed by many different authors, which should be citable 
> > separately. Also for the databases created by a smaller team, it may be 
> > desirable to specify more precisely which author did what. In
> > traditional paper publications, we had two kinds of units, articles and 
> > books, which could be single-authored or multi-authored (occasionally 
> > with some ranking of the authors). Maybe it would be desirable to allow 
> > more different units, and more different roles (e.g. content provider 
> > vs. database designer?).
> > 
> > Any ideas how typologists should go about solving these problems?
> > 
> > Martin
> > 

| Stuart Robinson                        |
| Email: stuart at zapata dot org        |
| Homepage: |
