Corpora

Trevor Johnston Trevor.Johnston at ling.mq.edu.au
Tue Sep 18 11:15:24 UTC 2007


Yes, we have two corpora in Australia. One is a corpus from a project by
Adam Schembri and myself to study sociolinguistic variation (modelled on
the approach taken by Ceil Lucas and colleagues for ASL). The second is
a corpus project as explained by Adam Schembri in his last posting
(other details can be gleaned from the website mentioned by Inge
Zwitserlood:
http://www.let.kun.nl/sign-lang/corpusngt/scientific/index.html). The
second corpus was collected between 2004 and 2006 and will be deposited
with the Endangered Languages Documentation Program, SOAS, University of
London, as part of their endangered languages archive at the end of this
year or very early next. Once the corpus is deposited, full details of
the project will be available on the Auslan Signbank site, which is
currently being updated and migrated to a new host.
 
The corpus will consist of over 100 hours of digital movies and
associated ELAN annotation files. Annotators have already been working
on the corpus for two years (and will continue for the next 10!). The archive is
intended to be internet accessible (but there will be an initial period
of restricted access).
 
I’d like to add a word of caution: a corpus is not just a collection of
videos (digital or otherwise). There is a lot more to it than that. If
it is not machine-readable in some way (hence ELAN), it is not a corpus
in the sense meant by linguists today. Simply making recordings,
without annotations, would not greatly advance empirical signed
language research.

Trevor Johnston

_______________________________________________
SLLING-L mailing list
SLLING-L at majordomo.valenciacc.edu
http://majordomo.valenciacc.edu/mailman/listinfo/slling-l


