Typological databases and the reading public

Martin Haspelmath haspelmath at EVA.MPG.DE
Sun Apr 22 09:52:06 UTC 2007

Frans Plank wrote:
> Databases are databases, and journal articles are journal articles.
Yes, this describes the conventional wisdom.

But now that journals are increasingly electronic and there are no space 
limitations, I see no reason why a database could not be a journal 
publication, just like a conventional prose article. I fully understand 
that the editor(s) of "Linguistic Typology" (LT) may not want to change 
LT's profile, but one could start a new journal specifically for databases.
> Naturally, those interested in your findings may also want to 
> ascertain that what you publicly claim you have found is valid, in 
> terms of both (i) your own data and (ii) other evidence.  You should 
> therefore be prepared to make your own data publicly available, at 
> least if challenged.  I'm not sure this implies that all one's data 
> (i.e., full databases) ought to be published together with one's 
> findings:  sometimes this may be sensible and viable;  but it may also 
> suffice if one's data can be inspected upon special request (from the 
> reading public, or from reviewers asked to decide on the publication 
> of one's findings).
Yes, in the past this was all that one could reasonably ask for, but 
haven't the external conditions changed? Wouldn't it now be reasonable, 
for instance, for a funding body to require that the researchers it 
funds publish not only the interpretations of the newly gathered data, 
but also the data themselves?
> While the data in databases, in flux or complete, await analysis and 
> theorising it may be useful to share them (and there are initiatives 
> to improve communication in the typological database scene);  having 
> them published in a journal or book would not seem to serve any 
> obvious purpose. 
I find the purpose it would serve obvious: Other researchers could look 
at every individual case, go back to the original source, see the 
crucial examples, etc. Of course, published databases are used somewhat 
differently from published articles, but I see no reason why one would 
object to publishing (rather than just "sharing") databases. To remind 
Lingtyp readers: Unlike informal "sharing", formal "publication" 
implies citability, recognition, permanent accessibility, and ideally 
standardization as well as cross-searchability.

Dan Slobin wrote:
> I think it should be pointed out that the international child language 
> research community has functioned with a open, accessible database for 
> decades.  There's a standardized format for submission, programs for 
> search through transcripts, ethical guidelines, etc.  The typology 
> community is far behind, and could learn from this.  Here's the url:  
> http://childes.psy.cmu.edu/
Yes, CHILDES is a great resource, but it appears to fall short of actual 
*publication*. It's a great way to share annotated text corpora, but 
maybe even this resource could be improved by turning it into a 
full-fledged *corpus journal*, with full peer review of the 
contributions. We read that "The basic principle behind TalkBank is that 
researchers would like to share their data, because they think they are 
important and can interest others." This seems to appeal to researchers' 
conscience, but the idea of full-fledged publication of data is that 
gathering and annotating data is a scientific activity that deserves 
the same recognition as the publication of interpretations of these 
data. The issue mentioned at the end ("Untenured faculty should not 
share data until they have published the basic findings. In reality, we 
have never seen a case in which a person's ability to publish findings 
has been limited by contributing data. In any case, this is only a 
concern for faculty without tenure") could be resolved by moving from 
"sharing data" to "publishing data".


Martin Haspelmath (haspelmath at eva.mpg.de)
Max-Planck-Institut fuer evolutionaere Anthropologie, Deutscher Platz 6	
D-04103 Leipzig      
Tel. (MPI) +49-341-3550 307, (priv.) +49-341-980 1616
