[Tibeto-burman-linguistics] new data on Kusunda

Thu Sep 26 21:13:49 UTC 2019

Just a note from San Diego-
Here, we are moving to open data and working on the ethics of trust,
confidentiality, etc. involved in this. The latest statement on data sharing
by the Research Ethics Program, UC San Diego, is below.
The request for Kusunda data is a reasonable one. David Watters's and his
colleagues' data is available through Tribhuvan University's Linguistics
Dept, at least it was 5 years ago when I had access. Given the time and
money needed to make the data available online, maybe Mark Donohue and colleagues
could offer their plans or ideas for future data availability. Using open
data repositories is a relatively new thing and we will all be thinking and
planning the details.
Best wishes for fall teaching & research,
Jana Fortier

Sharing of Data

Federal agencies, particularly the NIH (2003) and NSF (2010), have made funding
contingent on plans to share research data and products, particularly after
publication. An open data policy reflects positively on those who share and
benefits science by increasing the likelihood for new insights,
collaboration, and reciprocal sharing. Although sharing of data is
generally in the best interests of science and the individual, it is clear
that such sharing can place an individual scientist at risk:

   - Sharing data before publication could result in loss of credit or
   opportunity
   - Exposure of data to the prejudiced scrutiny of competitors or
   detractors
   - Risk of compromising confidentiality of human subjects
   - Expense of time and resources to meet requests for sharing of data

However, reasonable strategies to minimize potential problems should make
it possible to choose sharing over secrecy. Before publication, it is best
to maintain an open data policy with appropriate caution. After publication,
be prepared to grant reasonable access to the raw data; that is, honor
requests that are in the interest of scientific inquiry and can be
accomplished without inordinate expense or delay. In 2003, the National
Institutes of Health put out a Final NIH Statement on Sharing Research
Data. This document addresses some of the concerns listed above, and makes
clear that data sharing is a crucial and necessary part of responsible
conduct in research.
http://research-ethics.org/topics/data-management/#regulations-and-guidelines

================ END=====================

On Wed, Sep 25, 2019 at 4:23 AM Edward Garrett <heacu.mcintine at gmail.com>
wrote:

> One small correction to Nathan's message -
>
> At ELAR, during the period led by David Nathan some years ago, and when I
> was its software developer, access was defined by the URCS schema:
>
> User - Researcher - Community - Subscriber
>
> In addition, there was a Closed category, so 5 categories in total.
>
> User - open to all - was widely adopted. Researcher - open only to the
> researcher themself - was virtually unattested. Community - open to the
> language community and the researcher, and Subscriber - open to those who
> explicitly requested permission, were both in common use. Some resources
> were closed, but a small minority I believe.
>
> Although I no longer have statistics from that period, my recollection is
> that many researchers were attracted to the Subscriber category. I took
> that to mean that they didn't want to close their materials entirely, they
> just wanted to know, and have a relationship with, those who wished to
> access their materials. Whether that is appropriate or not is another
> question, but it is certainly different from "closed to the researcher
> alone".
>
> Regards,
> Edward
>
>
>
>
> On Wed, Sep 25, 2019 at 5:50 AM Nathan Hill <nh36 at soas.ac.uk> wrote:
>
>> Dear Kristine,
>>
>> Certainly much valuable research was done at a time when it was
>> impractical to share underlying research data. Those days are now gladly
>> behind us. So why do linguists now still sit on their data like Fafnir on
>> the Nibelungenhort? I suspect mostly it is a phobia that others will work
>> on "their" language (as if that would be a bad thing?!?!).
>>
>> In ELAR there were six levels of access, to take account of all possible
>> ethical situations in terms of relations with the community. In fact only
>> two were ever used, fully open and closed to the researcher alone.  Seeing
>> this happen at SOAS confirmed my intuition that researchers hide behind
>> their communities in order to protect their own personal prerogatives.
>>
>> To take the case of Kusunda, Mark Donohue received $834,827.00 of
>> Australian taxpayer money and accumulated a 20 hour corpus of Kusunda
>> (N.B. the grant was not just for Kusunda). Some 10 minutes maximum have
>> been made available. The grant is long over and any hope that the public
>> will receive more than these ten minutes would be naive. In contrast, Uday
>> and Tim received circa €5,000 of mostly private money and made a corpus
>> of the same size available to posterity. The contrast is striking and
>> suggests that the Australian taxpayer (not to mention science and the
>> community itself) is not getting good value for money. Clearly the Kusunda
>> community, if they were happy for the second corpus to be released, are not
>> themselves the ones holding up the release of the first. I do not intend to
>> pick on Donohue personally; I think cases like his are woefully common.
>>
>> The superb work of Lauren Gawne shows that there is another path. She
>> always assiduously cites her data and publishes it in open repositories
>> BEFORE citing it.
>>
>> Nathan
>>
>> --
>> Dr Nathan W. Hill
>> Reader in Tibetan and Historical Linguistics
>> Research Coordinator, East Asian Languages and Cultures
>> UK Director, London Confucius Institute
>> SOAS, University of London
>> Thornhaugh Street, Russell Square, London WC1H 0XG, UK
>> Tel: +44 (0)20 7898 4512
>> Room 456
>> --
>> Profile -- http://www.soas.ac.uk/staff/staff46254.php
>> Tibetan Studies at SOAS -- http://www.soas.ac.uk/cia/tibetanstudies/
>> --
>>
>>
>> On Tue, Sep 24, 2019 at 8:54 PM Kristine Hildebrandt <khildeb at siue.edu>
>> wrote:
>>
>>> Mark Post is correct: My clarification was based on the “published their
>>> data” language used by Nathan in the original email. It struck me as
>>> odd.  Some of the references noted in the email exchanges are not even in
>>> .pdf references in the report.
>>>
>>>
>>> There are a couple of other things in Nathan’s more recent response (and I
>>> say this from a stance of great respect for his work) that I wanted to
>>> comment on more broadly, although I am aware that this list-serv is not
>>> really the appropriate place to get bogged down in this, and it is not
>>> really a place for personal quibbles, so I’ll make my comments and then
>>> drop it:
>>>
>>>
>>> 1. To quote Nathan: “The information in a publication cannot be verified
>>> unless the data is released.”
>>>
>>>
>>> If this is understood as categorically true, there is a risk of
>>> nullifying some of the most important published works in our field. Even if
>>> one has access to sound and video files and field notes through a data set
>>> publication or release, there are still potentially matters of contention
>>> and debate in a description or analysis. Even field notebooks (the standard
>>> piece of documentation equipment in the event that recorders are not
>>> available) are ultimately a derived representation of actual language use.
>>> Not everything in linguistic analysis is subject to instrumental
>>> verification or can easily be “check(ed) (for) accuracy.” Publications that
>>> do not have a companion full data-set (that is publicly available) can
>>> still be extremely valuable (for example, many of the exhaustive reference
>>> grammars written and published before the advent of online, public
>>> repositories, or even based on impressionistic researcher observations when
>>> recording equipment was hard to come by, and what ultimately makes it into
>>> a grammar is still probably just a sliver of what the researcher acquired
>>> by whatever means in their fieldwork). That is (hopefully) what a strong
>>> methodology section in the publication, and also the peer review process
>>> help to ensure.
>>>
>>>
>>> 2. “As for Donohue, why he is so reluctant to publish his data is a
>>> mystery to me, despite my having asked him this question myself….”
>>>
>>>
>>> I don’t think this is appropriate or fair—as I noted above, there are
>>> (and have been) many valuable descriptive and analytical contributions in
>>> our field in which the entire data set is not publicly stored in an archive
>>> or repository. If we want to call out one individual for a reluctance or
>>> inability (or whatever) to put everything into some online repository, then
>>> we should also call out any other field researcher who has gathered data
>>> recently who has also not fully released all materials (I guess I would be
>>> one of those people, too!)…. I’m not comfortable with that. If scholars
>>> decide, for example, that they don’t want to trust Mark’s or David’s
>>> published descriptions and analyses because there is no companion data-set
>>> available, then that is their own call, and we can also be grateful for the
>>> presence of Tim and Uday’s materials.
>>>
>>>
>>> 3. “When the public funds research, it is a mystery to me why
>>> researchers do not want the public to check the accuracy of their work and
>>> build upon it…”
>>>
>>>
>>> Actually, I’m the President of the Endangered Language Fund, which
>>> partly supported Uday and Tim’s work on Kusunda. We are not a “public fund”
>>> (funded by taxes) in the same way that a U.S. federal agency like the NSF
>>> or NEH is, or that AHRC or ESRC are in the United Kingdom. ELF grants do
>>> require archiving of the material collected by our funded work, *but it
>>> does not have to be publicly available*. That is a decision that the
>>> community makes in communication and partnership with the researchers. And,
>>> a community and researcher can even decide to keep some/most/all of a
>>> corpus private but still allow publications on aspects of that language,
>>> using selected examples or pieces.
>>>
>>>
>>> ELF offers its grant awardees three levels of access to materials:
>>> Public, ELF-internal for administrative and reporting purposes only, and a
>>> 5 year embargo in which internal will transition to public (NSF offers
>>> similar options). I cannot speak for what the donors on their GoFundMe
>>> campaign did or did not want. In the case of Tim and Uday, the data found
>>> in the Zenodo repository all look to be fully accessible to whomever wants
>>> to use them, and for whatever purposes. And, although I myself have never
>>> made use of it for my own materials, Zenodo looks to be a trustworthy and
>>> sustainably managed repository, which satisfies the needs of ELF. That’s
>>> all fine, and we understand this to be a reflection of an agreement between
>>> them and the (participating) Kusunda community (or the remaining
>>> representatives, since it is moribund). But not all communities want to
>>> have collected materials available to the public in the same ways, and as
>>> researchers, we should also respect that. Furthermore, not all people
>>> who are interested in documenting their/a language have access to the same
>>> types of resources or knowledge about places to put data. Things are
>>> starting to change, and I think that the research community can play a
>>> leading role in being the change that we want to see, but it has to be done
>>> in a way that respects the integrity and desires of the speech community
>>> and is open and encouraging to scholars and those relationships.
>>>
>>>
>>> In the end, I’m grateful that Nathan shared Tim and Uday’s report, which
>>> has all of these links to valuable materials. Also, I am not ignorant of
>>> the bigger concerns here: Kusunda is almost vanished. Despite Tim and
>>> Uday’s valuable data and Mark and co-authors and David’s valuable
>>> publications, there is still much we don’t know about this important
>>> isolate spoken in the midst of so many Indo-European and Tibeto-Burman
>>> languages. And, having benefitted from, and having served the NSF in
>>> various ways (and now serving for ELF), I am a strong advocate for
>>> researchers making use of archival storage and standard metadata encoding
>>> schemes and for ethically sharing whatever can be shared. And in fact,
>>> University of California’s e-Scholarship publishing platform (on which *Himalayan
>>> Linguistics* is hosted) is also in the planning stages of their own
>>> data-set repository feature, which will have a certain level of peer
>>> review, and I think it will be a great addition to introduce to the journal
>>> in the near future. I am happy to see these approaches and standards being
>>> used more and more for the languages of the region. But it is not a
>>> one-size-fits-all situation, in my opinion.
>>>
>>>
>>> Thank you, all,
>>>
>>> On Mon, Sep 23, 2019 at 10:58 PM Mark W. Post <markwpost at gmail.com>
>>> wrote:
>>>
>>>> No, what you said is that "neither of them *published* their data".
>>>> Hence the misunderstanding.
>>>>
>>>> ------ Original Message ------
>>>> From: "Nathan Hill" <nh36 at soas.ac.uk>
>>>> To: "Kristine Hildebrandt" <khildeb at siue.edu>
>>>> Cc: "tibeto-burman-linguistics" <
>>>> tibeto-burman-linguistics at listserv.linguistlist.org>
>>>> Sent: 24/09/2019 12:08:24 AM
>>>> Subject: Re: [Tibeto-burman-linguistics] new data on Kusunda
>>>>
>>>> Dear All,
>>>>
>>>> Since my statement has required two "clarifications," let me clarify
>>>> that I said that neither Watters nor Donohue had released their DATA. The
>>>> information in a publication cannot be verified unless the DATA is
>>>> released. Also, it is very hard for new research to be done on a language
>>>> if there is not DATA available. Kusunda will be dead in a few years, and if
>>>> no DATA is available on it, then the works of Watters and Donohue will be
>>>> the last things ever written about the language.
>>>>
>>>> Watters is unfortunately no longer with us, so his DATA is probably
>>>> lost forever. As for Donohue, why he is so reluctant to publish his DATA is
>>>> a mystery to me, despite my having asked him this question myself in Sydney
>>>> this summer. Particularly when the public funds research, it is a mystery
>>>> to me why researchers do not want the public to check the accuracy of their
>>>> work and build upon it.
>>>>
>>>> In my original posting I did not make a mistake. Watters and Donohue
>>>> have not published their DATA and Uday and Tim have.
>>>>
>>>> thank you,
>>>> Nathan
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Sep 23, 2019 at 3:32 PM Kristine Hildebrandt <khildeb at siue.edu>
>>>> wrote:
>>>>
>>>>> Hi again all,
>>>>>
>>>>> I wanted to also just clarify that Mark Donohue's work on Kusunda is
>>>>> also out there in many publications (co-authored with Bhoj Raj Gautam and
>>>>> Madhav Prasad Pokharel), and so Uday Raj and Tim Bodt's publicly released
>>>>> data set can be seen as complementing both David and Mark's published
>>>>> descriptions/analyses:
>>>>>
>>>>> Donohue, Mark, and Bhoj Raj Gautam. 2013. Evidence and stance in
>>>>> Kusunda. Nepalese Linguistics 28: 38-47.
>>>>> http://www.digitalhimalaya.com/collections/journals/nepling/
>>>>> Donohue, Mark, Bhoj Raj Gautam and Madhav Prasad Pokharel. 2014.
>>>>> Negation and nominalization in Kusunda. *Language* 90 (3):
>>>>> 737-745. DOI 10.1353/lan.2014.0054
>>>>> <https://doi.org/10.1353/lan.2014.0054>
>>>>> Gautam, Bhoj Raj, and Mark Donohue. 2014. Deixis in Kusunda. Nepalese
>>>>> Linguistics 29: 152-157.
>>>>> http://www.digitalhimalaya.com/collections/journals/nepling/
>>>>> Donohue, Mark, and Bhoj Raj Gautam. 2016. Quantification in Kusunda.
>>>>> In Denis Paperno and Ed. Keenan, eds., Quantification in Natural Language.
>>>>> https://www.springer.com/gp/book/9789400726802
>>>>>
>>>>> On Wed, Sep 18, 2019 at 6:56 AM Nathan Hill <nh36 at soas.ac.uk> wrote:
>>>>>
>>>>>> Dear Colleague,
>>>>>>
>>>>>> I just wanted to let people know about the new data on Kusunda that
>>>>>> Uday Raj and Tim Bodt have publicly filed on Zenodo. Kusunda is a language
>>>>>> isolate with only two speakers. In the last decades some work was done on
>>>>>> it by the late David E. Watters, and by Mark Donohue, but neither of them
>>>>>> published their research data.
>>>>>>
>>>>>> Uday and Tim have now put more than 20 hours of material online open
>>>>>> access, as described in the attached .pdf file.
>>>>>>
>>>>>> very best,
>>>>>> Nathan
>>>>>> _______________________________________________
>>>>>> Tibeto-burman-linguistics mailing list
>>>>>> Tibeto-burman-linguistics at listserv.linguistlist.org
>>>>>>
>>>>>> http://listserv.linguistlist.org/mailman/listinfo/tibeto-burman-linguistics
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Orche
>>>>> ('Thanks' in Manange)
>>>>>
>>>>> *Kristine A. Hildebrandt*
>>>>> *Professor, Department of English Language & Literature
>>>>> <http://www.siue.edu/artsandsciences/english/>*
>>>>> *President, Endangered Language Fund
>>>>> <http://www.endangeredlanguagefund.org/>*
>>>>>
>>>>> *Secretary, Association for Linguistic Typology
>>>>> <http://www.linguistic-typology.org/>*
>>>>> *Editor, Himalayan Linguistics
>>>>> <http://escholarship.org/uc/himalayanlinguistics>*
>>>>>
>>>>> *Southern Illinois University Edwardsville*
>>>>>
>>>>>
>>>>> *Box 1431, Edwardsville, IL 62026 U.S.A.*
>>>>> *618-650-3991 (department voicemail)*
>>>>>
>>>>> *khildeb at siue.edu*
>>>>> *http://www.siue.edu/~khildeb*
>>>>>
>>>
>>>
>>>

-- 
New and forthcoming!
A Comparative Dictionary of Raute and Rawat: Tibeto-Burman Languages of the
Himalayas. Harvard Univ. Press, Harvard Oriental Series
<http://www.hup.harvard.edu/catalog.php?isbn=9780674984349>

Tou: Under a Sacred Sky
<https://www.academia.edu/35302918/Tou_Under_a_Sacred_Sky> -  about Raute
and Rawats' sky deities

Lecturer, Dept of Anthropology
Social Sciences Bldg, Rm 210
University of California San Diego
La Jolla, CA 92093-0532
*http://southasia.ucsd.edu/faculty/ <http://southasia.ucsd.edu/faculty/>*
academia.edu/JanaFortier <https://ucsd.academia.edu/JanaFortier>