Sign language corpora

Adam C Schembri a.schembri at ucl.ac.uk
Tue Sep 18 10:27:34 UTC 2007


There are many of us who follow this particular 'gospel'. :-) I have  
jokingly referred to it as the 'gospel according to Trevor and Ceil'.  
Ceil Lucas and her colleagues were perhaps the first to  
systematically collect a naturalistic corpus of sign language data  
balanced for age/gender/region/ethnicity etc (Lucas, Bayley & Valli,  
2001), and Trevor Johnston and colleagues were the first - to my  
knowledge - to actually begin to build a 'corpus' in the contemporary  
sense of the term (i.e., a machine-readable, annotated collection of  
language recordings), filming 3-hour data collection sessions from
100 native and near-native signers in 5 regions across Australia. The
NGT project has taken this further by building web accessibility
into the project itself, but I believe the Australian team
do hope to make the Auslan corpus more widely available at a later
stage. Certainly, the new project that Inge refers to here in the UK
plans to do something similar to the NGT project (for those of you
who don't know, my colleagues and I have just been awarded a major
£1.2 million grant from the Economic and Social Research Council for
the 'British Sign Language Corpus Project'; for more information,
visit DCAL's news page: http://www.dcal.ucl.ac.uk/news/news.html).

Adam

Adam C Schembri, PhD
Senior Research Fellow
Deafness, Cognition and Language (DCAL) Research Centre
University College London
49 Gordon Square
London WC1H 0PD
United Kingdom
Tel: +44 20 7679 8680
http://www.dcal.ucl.ac.uk/team/adam_schembri.html



On 18 Sep 2007, at 08:04, I.E.P. Zwitserlood wrote:

> Talking about gospels: so it is ours!
> In the Netherlands, at the Radboud University Nijmegen, a corpus is  
> currently being compiled for NGT (Sign Language of the  
> Netherlands). My colleagues and I aim at recording 75 hours of
> elicited and (semi-)spontaneous data, collected from 100 native
> signers. All video data, as well as a translation and (for a small
> subset of the data) an annotation, will be made available on
> the internet. (Similar projects have been/will be undertaken in
> Australia, the UK and Ireland, although the data are not so easily
> available.) If anyone is interested in making a corpus for his/her
> sign language, we'll be happy to inform/support you with our  
> experiences. For more information, see our website:
> http://www.let.kun.nl/sign-lang/corpusngt/scientific/index.html
>
> Best,
> Inge Zwitserlood
>
> ----- Original Message -----
> From: Dan Parvaz <dparvaz at gmail.com>
> Date: Monday, September 17, 2007 6:30 pm
> Subject: Re: [SLLING-L] An avator doing bfi
>
> Sorry, but I can't stop going on about corpora -- it's the gospel I  
> preach :-)
>
> Perhaps the best way to kick-start this is to round up all the  
> usual suspects, and get a governmental agency (US or EU, it doesn't  
> much matter to me) to coordinate recording and transcribing 50  
> hours of data for everyone to use (I know, it isn't enough by
> spoken-language standards, but it's so much more than we've ever  
> had). Then we have a fighting chance of pushing the state of the  
> art in all these areas...
>
> -Dan.
>
> On 9/17/07, Sara Morrissey <sara.morrissey2 at mail.dcu.ie> wrote:
> Oh dear. Don't talk to me about corpora! I'm working in the arena  
> of Data-Driven Machine Translation and working with people who have  
> millions of sentences for their spoken language translation in  
> comparison to my 600 for sign language work!! Finding parallel data  
> within a closed domain is a difficult task. Nevertheless progress  
> is being made and results are promising :)
>
> Thanks for your input :o)
> Sara
>
>
> On 17/09/2007, Dan Parvaz <dparvaz at gmail.com > wrote:
> I'm sure the one thing standing between the Tunisian Deaf Community  
> and achieving their potential is the lack of a signing avatar :-)   
> Still, it is potentially cool research with good dividends,  
> particularly if it means the development of a real Tunisian SL  
> dictionary (as opposed to the previous effort, which was a glossary  
> meant to contribute to the perennial Pan-Arab SL movement),  
> grammar, etc.
>
> A major chunk of the problem here rests with the lack of  
> substantial corpora of any kind, let alone parallel corpora.
>
> -Dan.
>
>
>
>
> On 9/17/07, Sara Morrissey <sara.morrissey2 at mail.dcu.ie > wrote:
> All work in this area is a long way from being a translation
> service; I can assure you of that after 3 years of PhD research on
> the topic of Machine Translation of Sign Languages. Sadly, most of
> the work that I've come across in this area is similar to the work
> described in the BBC article in that it is just a small project. I
> have seen very little consistent work in this area; most of it
> consists of satellite projects related to other work, so it never
> gets very far. Also, sadly, many groups that work in this area have
> little to no knowledge of the languages they are dealing with,
> often have little contact with Deaf communities or colleagues, and
> are more interested in the computing aspects. I am aware of the
> forthcoming FP7 project, which does seem to intend to spend a few
> years on research in this area:
> http://www.ideal-ist.net/Countries/TN/PS-TN-1590
> Well, I hope so at least; I've applied for a postdoc position with
> them!!
>
> I'd be interested in hearing anyone's opinion on both this project  
> and any other sign language machine translation projects they've  
> come across. I intend to continue working in this area so all input  
> is valuable :o)
>
> Namaste,
> Sara
>
> ************************************
> Sara Morrissey,
> PhD Researcher,
> National Centre for Language Technology,
> School of Computing,
> Dublin City University,
> Dublin 9,
> Ireland.
> ***********************************
>
>
>
> On 15/09/2007, Dan Parvaz <dparvaz at gmail.com > wrote:
> Sigh. Every time some student on their Amazing Journey Of
> Self-Discovery<tm> "reinvents" a piece of deaf-related technology
> (datagloves for reading fingerspelling, signing avatars, etc.),
> some ignorant journalist is ready to hail it as a breakthrough.
>
> This was put together in a few months by a student intern. As far  
> as I can tell (those knowing BSL please look at the video and  
> correct me if I'm wrong), this is yet another relatively  
> straightforward marriage of speech recognition and 3D animation.  
> There's no indication that space, classifiers, etc., which would be
> part of a natural SL, are being used here. As it stands, it's less
> useful than commercially available speech-to-text systems
> (DragonDictate, Via Voice, etc.).
>
> Don't surplus your interpreters just yet :-)
>
> Cheers,
>
> -Dan
>
>
> On 9/15/07, GerardM < gerard.meijssen at gmail.com > wrote:
> Hoi,
> I read this article on the BBC website about a translation service
> created by IBM that uses an avatar to translate into British Sign
> Language (bfi). Such technology could in principle also produce
> SignWriting.
> Thanks,
>      Gerard
>
> http://news.bbc.co.uk/2/hi/technology/6993326.stm
>
> _______________________________________________
> SLLING-L mailing list
> SLLING-L at majordomo.valenciacc.edu
> http://majordomo.valenciacc.edu/mailman/listinfo/slling-l
>
> -- 
> Blessed are the flexible, for they shall not be bent out of shape.
