A query...

Brian MacWhinney macw at cmu.edu
Tue Oct 24 14:19:12 UTC 2006


Folks,
      Dan's points about data charity are very good.  The idea of  
creating a standard open-access, open-source repository for  
linguistic data is one that has motivated much of my own research for  
25 years.  This work began with the establishment of the Child  
Language Data Exchange System (CHILDES) database in 1984 and  
continued with the establishment of TalkBank in 1999.  Catherine Snow  
collaborated on CHILDES and Steven Bird and Mark Liberman  
collaborated on TalkBank.  The web sites are childes.psy.cmu.edu and  
talkbank.org.
      I totally agree with Dan about the Jesuits and the issue of  
data charity.  I believe that it is crucial to make the original data  
available in all linguistic work.  However, I think it is crucial  
that individual projects should not do this on their own without  
regard to universal access and universal coding standards.  For that  
reason, we provide full web-based access and downloadability for all  
TalkBank corpora and a standardized XML-based schema that covers and  
translates between all current coding systems.
     This facility is now the default data sharing and storage
mechanism for child language, aphasia, child phonology, classroom
discourse, bilingualism, and much of conversational analysis.
     Unfortunately, field linguistics is not making use of this  
facility and that is a real pity.  Moreover, it is not clear that any  
parallel archive/sharing facility has arisen for field linguistics.   
Dan refers to the activity of NSF and agencies in Europe.  But, in
reality, there is no publicly available system outside of TalkBank  
for this type of sharing and TalkBank is not being used by field  
linguists.  To be honest, a lot of the problem here is my time.  I  
have so much funded support for child language, conversation  
analysis, and aphasia that adding on a project for archiving/sharing
in field linguistics is not possible, given my current system for  
project organization.  However, all of the TalkBank tools are totally  
open and it would be easy for someone to take the model, get the  
funding, and apply the system to data from field linguistics.  I  
really wish someone would do this.  Dan, are you interested?

--Brian MacWhinney

On Oct 24, 2006, at 9:01 AM, Daniel L. Everett wrote:

> Thanks to Alex for pointing this out.
>
> There are some obvious dissimilarities between linguistics and  
> fields that use lots of glass and metal. First, linguistic grants  
> tend to be smaller, especially in the US (compared to the UK and  
> EU). Second, linguists don't usually have labs with lots of  
> postdocs. Third, as Mark Line says, data is often not as important  
> to many linguists. (Take it third-hand or fourth-hand and just use  
> it as an illustration at the appropriate times of your main  
> theoretical point).
>
> On the other hand, there are similarities. Some researchers do get  
> large grants in linguistics, with large teams (e.g. Peter  
> Ladefoged's grants in many of his years at UCLA). And many of the  
> more important research projects, grammars & documentation  
> projects, produce data that will be cited for years, perhaps  
> centuries to come. In the case of the Jesuits in the 16th and 17th  
> centuries all our evidence suggests that the integrity of their  
> data-collection and presentation is first-rate, an example that has  
> produced useful data for research on American Indian languages for
> centuries. On the other hand, I think that there is a
> strong possibility that in some more modern grammars a 'principle  
> of charity' might have guided what data to present, where 'charity'  
> refers to how the author would like the data to look for other  
> points they want to make. Perhaps not falsification, but omission  
> of problematic results. And failure to follow-up with experiments.
>
> Solutions to this kind of thing include peer-review (I believe that  
> it fails a lot, but it is still vital), making data available, and  
> replication of results. In today's fieldwork, for example, I would  
> like to see every fieldworker (with appropriate permissions from  
> native speakers, governments, etc.) make their data available
> online, field notes, sound files, etc. To do this, future grants
> would need to have funds for digitization of data and storage of  
> data, following guidelines that are now becoming standard in the  
> field.
>
> Funding agencies in Europe are beginning to require this kind of  
> documentation. I think that the NSF should too, certainly in field  
> research projects.
>
> Dan
>
> On Oct 23, 2006, at 9:05 PM, Alexander Gross2 wrote:
>
>> It may well be that I am a bit overwrought about this, but I am
>> curious to learn if anyone here besides myself detected any  
>> similarity, however remote, between an article in yesterday's New  
>> York Times Sunday Magazine and the field of Linguistics.
>>
>> In case you missed it, you'll find the article at:
>>
>> http://www.nytimes.com/2006/10/22/magazine/22sciencefraud.html?pagewanted=all
>>
>> with best wishes and apologies in advance,
>>
>> alex
>
> **********************
> Daniel L. Everett, Professor of Linguistics & Anthropology and Chair,
> Department of Languages, Literatures, and Cultures
> Campus Box 4300
> Illinois State University
> Normal, Illinois 61790-4300
> OFFICE: 309-438-3604
> FAX: 309-438-8038
> Dept: http://www.llc.ilstu.edu/default.asp
> Recursion: http://www.llc.ilstu.edu/rechul/
> Personal: http://www.llc.ilstu.edu/dlevere/
>
> and
>
> Honorary Professor of Linguistics
> University of Manchester
> Manchester, UK
>
>
>



More information about the Funknet mailing list