Typological databases and the reading public
Frans.Plank at UNI-KONSTANZ.DE
Fri Apr 20 11:24:52 UTC 2007
Typological databases and the reading public, especially that reading LT
(from the Editorial Board of LT)
Databases are databases, and journal articles are journal articles.
You compile a database -- on your own or in collaboration, which may
include sharing data from separate databases -- in order to find out
something new, or also in order to confirm or disconfirm claims or
theories that have been around. If you do find out something that
you consider worth making public, you try to get it published in a
journal (or book) or you publicise it otherwise (on your web site or
in a letter that you send around to your friends and foes, thereby
perhaps reaching a wider audience).
Naturally, those interested in your findings may also want to
ascertain that what you publicly claim you have found is valid, in
terms of both (i) your own data and (ii) other evidence. You should
therefore be prepared to make your own data publicly available, at
least if challenged. I'm not sure this implies that all one's data
(i.e., full databases) ought to be published together with one's
findings: sometimes this may be sensible and viable; but it may
also suffice if one's data can be inspected upon special request
(from the reading public, or from reviewers asked to decide on the
publication of one's findings).
While the data in databases, in flux or complete, await analysis and
theorising, it may be useful to share them (and there are initiatives
to improve communication in the typological database scene); having
them published in a journal or book would not seem to serve any
obvious purpose. (Although the line may be hard to draw, the
publication of text collections would seem to serve purposes which
the publication of typological databases doesn't.)
Databases, then, are tools: interesting primarily for what you can
do with them -- furthering knowledge, in our case knowledge about
linguistic diversity and unity.
Since there is an obvious communal interest in having the best tools
possible and in these tools being used in the most expert way
possible, tool design and tool use are questions of considerable
interest for everybody keen on furthering knowledge. They are
important questions which merit informed and prompt discussion in
scholarly journals, whether old-fashioned easy-to-read p or
new-fashioned technology-intensive e, whether specialising in some
limited field of empirical enquiry or generally dedicated to
questions of the methodology and philosophy of science.
Were it not the whole point of this message, it would almost be
needless to add that, for linguistic typology, LT, this field's
dedicated journal, continues to warmly invite scholarly information
about, and scholarly debate on, typological methodology, obviously
including database methodology.
As to the different question, also raised in this current lingtyp
exchange, whether typologists are to be trusted with data, and
generally are good for anything, readers of LT may rest assured, and
writers in LT will confirm, that editors and reviewers for this
journal have always seen to it, to the best of their individual and
collective abilities, that specific data, as well as specific
analyses and specific ideas, are properly credited. (Sure, the line
may sometimes be hard to draw between what is specific and needs to
be attributed and what is or has become common knowledge. But that
is another question.)
I think it should be pointed out that the international child
language research community has functioned with an open, accessible
database for decades. There's a standardized format for submission,
programs for searching through transcripts, ethical guidelines, etc.
The typology community is far behind, and could learn from this.
Here's the URL: http://childes.psy.cmu.edu/
These are the ground principles:
The basic principle behind TalkBank is that researchers would like to
share their data, because they think they are important and can
interest others. However, apart from this basic consideration there
are several additional reasons to share data and some reasons not to.
First, the reasons to share data:
Principles of scientific integrity require that ideas be put to a
test. In order to test your ideas about your data, you need to open
them up to others who will either support or challenge your ideas.
Some types of claims can only be tested against large data sets or
against comparisons of somewhat similar data sets. To make these
analyses, we often need more and more data.
Much of the work in science is conducted using public funds. We have
an obligation to the public to make maximally efficient use of these
data. For example, the NIH has now issued
guidelines on this issue.
But there are also two reasons not to share data:
Data should not be shared if you have not secured informed consent
from your subjects.
Untenured faculty should not share data until they have
published the basic findings. In reality, we have never seen a case
in which a person's ability to publish findings has been limited by
contributing data. In any case, this is only a concern for faculty