[Lingtyp] Workshop on cross-linguistic databases at ALT: call for abstracts

Hedvig Skirgård hedvig.skirgard at gmail.com
Tue Mar 7 22:58:11 UTC 2017


Dear all,

At the meeting of the Association for Linguistic Typology this year in
Canberra, there will also be several workshops. We (Martin Haspelmath,
Hannah Haynie, Robert Forkel and myself) are organising one on Design
principles and comparisons of typological databases. If you are interested,
do get in touch and submit an abstract.

Interested people should submit abstracts for both the general session and
the workshops using the same form:
http://www.dynamicsoflanguage.edu.au/alt-conference-2017/call-for-abstracts/

The deadline for papers, in both the general session and the workshops, is
the 31st of March.

Please remember that, for those with funding problems, a limited number of
scholarships for researchers are available
<http://www.dynamicsoflanguage.edu.au/alt-conference-2017/scholarships/scholarship-application-form/>;
applications are also due 31 March 2017.

Below follows a longer description of our workshop.


*Design principles and comparisons of typological databases*
What are the shared challenges and opportunities facing databases of
language diversity? What kinds of databases are out there, and what can
they be used for? These are questions we would like to address in this
workshop, bringing together researchers who compile this kind of data and
those who use it.

There are quite a few existing databases of grammatical features of
languages, and several more are under construction. They differ in their
design and in the kinds of research questions they aim to answer. Some are
created to investigate the particular history of a certain region or family
(e.g. van Gijn 2014), others a particular set of traits in a global set of
languages (Stassen 1997), and so on. Despite these differences, there is
often the possibility of sharing data or design between different
typological surveys.

We would like to take this opportunity during ALT to bring together
scholars who are working on designing typological databases and end users
of such databases, to discuss comparisons and possible opportunities for
co-ordination. We’re interested in the design decisions that go into the
construction of a database, what consequences they have for what it can
be used for, and whether it can be linked to other similar databases.

Within MPI-SHH's Glottobank project (http://glottobank.org/), there have
been discussions of how different typological databases relate to each
other and what their different aims and uses are. We would like to engage
the broader typology community in these discussions and hear viewpoints
from other database designers and end-users.

We are also interested in discussing design principles in relation to
end-users of the data. There are many different kinds of end-users of this
data, and the methods with which they approach the material carry with them
certain assumptions and prerequisites. For phylogenetic studies, for
example, it is best if the features are logically independent of each other
and associated with a confidence value. What does the data that is
available today look like, and what should future surveys look like?

This is not only a question of adjusting to certain end-users' preferences,
but also a matter of clearly communicating what the data looks like, how it
was designed and why. This will make it clear which research questions the
data is suited for, and which questions it should not be applied to.

For example, WALS (Dryer & Haspelmath 2013) was constructed using already
existing data from a number of well-known typologists. There were also
core samples of languages (100 and 200) that all/most of the chapters
covered, but there were still significant gaps in the database's coverage of
features per language. This renders certain kinds of analysis impossible.
In WALS, there was most likely greater consistency per feature as opposed
to per language since that was how labour was divided. This can be
contrasted with APiCS (Michaelis et al. 2013), where each language was
represented by experts who corresponded with the APiCS editorial team to
answer a typological questionnaire. In the case of APiCS, we expect greater
consistency over each language instead of over each feature. APiCS also
allows for languages to be represented with several values for one feature,
whereas WALS only allows for one. These design choices have consequences for
the nature of the data and are interesting to discuss in relation to
databases under construction, end users and comparison.

We would like to take this opportunity to invite researchers who are
working on constructing typological databases of structural/grammatical
features to discuss the questions below and related ones. We would also
like to invite end-users who are engaging with this kind of data to present
findings and engage in discussions on what the limitations and
possibilities of the databases are.

The workshop aims at discussing these questions, but is also open to other
related questions:

   - What kind of questions do we want to answer with our data, and which
   questions do we need to admit we cannot answer?
   - What does it mean if we are comparing doculects instead of languages?
   - What do linguistic descriptions, globally, enable us to research and
   what do they not?
   - What other feasible sources of information besides descriptions can we
   use?
   - What do we gain and lose by designing our features to be logically
   independent from each other (or conversely by including non-independent
   items in questionnaires)?
   - How do the circumstances of data collection (e.g. coding by feature or
   by language) affect the use and comparability of data from different
   surveys?
   - Can data from regionally oriented questionnaires be coordinated with
   globally oriented surveys to fruitfully build better sets of information on
   the world's languages? How do data design limitations impact this
   enterprise?
   - What elements need to be considered and what information needs to be
   documented when mapping between grammatical/typological datasets? (i.e.
   setting the stage for the grammaticon/getting input from other database
   designers on this concept)
   - How do we implement measures of inter-coder reliability in more
   databases and in comparisons between them?


Best,

*Hedvig Skirgård*


PhD Candidate
The Wellsprings of Linguistic Diversity

ARC Centre of Excellence for the Dynamics of Language

School of Culture, History and Language
College of Asia and the Pacific

Rm 4203, H.C. Coombs Building (#9)
The Australian National University

Acton ACT 2601

Australia

Co-chair of Public Relations

Board of the International Olympiad of Linguistics

www.ioling.org

Blogger at Humans Who Read Grammars
http://humans-who-read-grammars.blogspot.