Sign language corpora
Adam C Schembri
a.schembri at ucl.ac.uk
Tue Sep 18 10:27:34 UTC 2007
There are many of us who follow this particular 'gospel'. :-) I have
jokingly referred to it as the 'gospel according to Trevor and Ceil'.
Ceil Lucas and her colleagues were perhaps the first to
systematically collect a naturalistic corpus of sign language data
balanced for age/gender/region/ethnicity etc (Lucas, Bayley & Valli,
2001), and Trevor Johnston and colleagues were the first - to my
knowledge - to actually begin to build a 'corpus' in the contemporary
sense of the term (i.e., a machine-readable, annotated collection of
language recordings), filming 3-hour data collection sessions from
100 native and near-native signers in 5 regions across Australia. The
NGT project has taken this further by building web accessibility
into their corpus project, but I believe the Australian team
do hope to make the Auslan corpus more widely available at a later
stage. Certainly, the new project that Inge refers to here in the UK
plans to do something similar to the NGT project (for those of you
who don't know, my colleagues and I have just been awarded a major
£1.2 million grant from the Economic and Social Research Council for
the 'British Sign Language Corpus Project'. For more information,
visit DCAL's news page: http://www.dcal.ucl.ac.uk/news/news.html).
Adam
Adam C Schembri, PhD
Senior Research Fellow
Deafness, Cognition and Language (DCAL) Research Centre
University College London
49 Gordon Square
London WC1H 0PD
United Kingdom
Tel: +44 20 7679 8680
http://www.dcal.ucl.ac.uk/team/adam_schembri.html
On 18 Sep 2007, at 08:04, I.E.P. Zwitserlood wrote:
> Talking about gospels: so it is ours!
> In the Netherlands, at the Radboud University Nijmegen, a corpus is
> currently being compiled for NGT (Sign Language of the
> Netherlands). My colleagues and I aim to record 75 hours of
> elicited and (semi-)spontaneous data, collected from 100 native
> signers. All video data, as well as a translation and (for a small
> subset of the data) an annotation, will be made available on
> the internet. (Similar projects have been/will be undertaken in
> Australia, the UK and Ireland, although the data are not so easily
> available.) If anyone is interested in making a corpus for his/her
> sign language, we'll be happy to inform/support you with our
> experiences. For more information, see our website:
> http://www.let.kun.nl/sign-lang/corpusngt/scientific/index.html
>
> Best,
> Inge Zwitserlood
>
> ----- Original Message -----
> From: Dan Parvaz <dparvaz at gmail.com>
> Date: Monday, September 17, 2007 6:30 pm
> Subject: Re: [SLLING-L] An avator doing bfi
>
> Sorry, but I can't stop going on about corpora -- it's the gospel I
> preach :-)
>
> Perhaps the best way to kick-start this is to round up all the
> usual suspects, and get a governmental agency (US or EU, it doesn't
> much matter to me) to coordinate recording and transcribing 50
> hours of data for everyone to use (I know, it isn't enough by
> spoken-language standards, but it's so much more than we've ever
> had). Then we have a fighting chance of pushing the state of the
> art in all these areas...
>
> -Dan.
>
> On 9/17/07, Sara Morrissey <sara.morrissey2 at mail.dcu.ie> wrote:
> Oh dear. Don't talk to me about corpora! I'm working in the arena
> of Data-Driven Machine Translation and working with people who have
> millions of sentences for their spoken language translation in
> comparison to my 600 for sign language work!! Finding parallel data
> within a closed domain is a difficult task. Nevertheless, progress
> is being made and results are promising :)
>
> Thanks for your input :o)
> Sara
>
>
> On 17/09/2007, Dan Parvaz <dparvaz at gmail.com > wrote:
> I'm sure the one thing standing between the Tunisian Deaf Community
> and achieving their potential is the lack of a signing avatar :-)
> Still, it is potentially cool research with good dividends,
> particularly if it means the development of a real Tunisian SL
> dictionary (as opposed to the previous effort, which was a glossary
> meant to contribute to the perennial Pan-Arab SL movement),
> grammar, etc.
>
> A major chunk of the problem here rests with the lack of
> substantial corpora of any kind, let alone parallel corpora.
>
> -Dan.
>
>
>
>
> On 9/17/07, Sara Morrissey <sara.morrissey2 at mail.dcu.ie > wrote:
> All work in this area is a long way from being a translation
> service, I can assure you of that following 3 years of PhD research on
> the topic of Machine Translation of Sign Languages. Sadly, most of
> the work that I've come across in this area is similar to the work
> described in the BBC article in that it is just a small project. I
> have seen very little consistent work in this area, with most of it
> being satellite projects related to other work, so it never gets
> very far. Also, sadly, many groups that work in this area have
> little to no knowledge of the languages they are dealing with and
> often little contact with Deaf communities or colleagues and are
> more interested in the computing aspects. I am aware of the
> forthcoming FP7 project which does seem to intend spending a few
> years of research in this area:
> http://www.ideal-ist.net/Countries/TN/PS-TN-1590
> Well, I hope so at least, I've applied for a postdoc
> position with them!!
>
> I'd be interested in hearing anyone's opinion on both this project
> and any other sign language machine translation projects they've
> come across. I intend to continue working in this area so all input
> is valuable :o)
>
> Namaste,
> Sara
>
> ************************************
> Sara Morrissey,
> PhD Researcher,
> National Centre for Language Technology,
> School of Computing,
> Dublin City University,
> Dublin 9,
> Ireland.
> ***********************************
>
>
>
> On 15/09/2007, Dan Parvaz <dparvaz at gmail.com > wrote:
> Sigh. Every time some student on their Amazing Journey Of
> Self-Discovery<tm> "reinvents" a piece of deaf-related technology
> (datagloves for reading fingerspelling, signing avatars, etc.),
> some ignorant journalist is ready to hail it as a breakthrough.
>
> This was put together in a few months by a student intern. As far
> as I can tell (those knowing BSL please look at the video and
> correct me if I'm wrong), this is yet another relatively
> straightforward marriage of speech recognition and 3D animation.
> There's no indication that space, classifiers, etc., which would be
> part of a natural SL, are being used here. As it stands, it's less
> useful than commercially available speech-to-text systems
> (DragonDictate, ViaVoice, etc.).
>
> Don't surplus your interpreters just yet :-)
>
> Cheers,
>
> -Dan
>
>
> On 9/15/07, GerardM < gerard.meijssen at gmail.com > wrote:
> Hoi,
> I read this article on the BBC website about a translation service
> created by IBM that uses an avatar to translate into British Sign
> Language (bfi). Such technology could in principle also produce
> SignWriting.
> Thanks,
> Gerard
>
> http://news.bbc.co.uk/2/hi/technology/6993326.stm
>
> _______________________________________________
> SLLING-L mailing list
> SLLING-L at majordomo.valenciacc.edu
> http://majordomo.valenciacc.edu/mailman/listinfo/slling-l
>
> --
> Blessed are the flexible, for they shall not be bent out of shape.