Sign language corpora

Lorraine Leeson leesonl at gmail.com
Tue Sep 18 11:35:23 UTC 2007


Here in Ireland, we have also been working on the development of a
corpus of Irish Sign Language. Working on a shoestring budget, we have
data for 40 signers from across Ireland (24 female, 16 male), aged
18-70 plus years. Data was collected in 2004 and we have had a small
but diligent team of annotators working with ELAN and the ECHO
project's conventions (and adapting them when we needed to) to create
a tool that is proving invaluable in terms of research, but also
teaching. The corpus includes self-selected narratives (i.e. the Deaf
person decided what personal story they were happy to tell), versions
of the Frog Story and Volterra et al.'s picture elicitation task. We
would like to secure funding to enlarge our pool of data to include
ISL signers from the north-west and Northern Ireland, a more robust
representation of elderly signers, the inclusion of child signers, and
the inclusion of dialogues.

At present the Signs of Ireland corpus is not available on the web,
but we are exploring ways of sharing our data more widely.

Lorraine



On 18/09/2007, Adam C Schembri <a.schembri at ucl.ac.uk> wrote:
>
> There are many of us who follow this particular 'gospel'. :-) I have
> jokingly referred to it as the 'gospel according to Trevor and Ceil'. Ceil
> Lucas and her colleagues were perhaps the first to systematically collect a
> naturalistic corpus of sign language data balanced for
> age/gender/region/ethnicity etc (Lucas, Bayley & Valli, 2001), and Trevor
> Johnston and colleagues were the first - to my knowledge - to actually begin
> to build a 'corpus' in the contemporary sense of the term (i.e., a
> machine-readable, annotated collection of language recordings), filming 3
> hour data collection sessions from 100 native and near-native signers in 5
> regions across Australia. The NGT project has taken this further by building
> in web-accessibility of their corpus into their project, but I believe the
> Australian team do hope to make the Auslan corpus more widely available at a
> later stage. Certainly, the new project that Inge refers to here in the UK
> plans to do something similar to the NGT project ( for those of you who
> don't know, my colleagues and I have just been awarded a major £1.2 million
> grant from the Economic and Social Research Council for the 'British Sign
> Language Corpus Project'. For more information, visit DCAL's news
> page: http://www.dcal.ucl.ac.uk/news/news.html ).
>
> Adam
>
> Adam C Schembri, PhD
> Senior Research Fellow
> Deafness, Cognition and Language (DCAL) Research Centre
> University College London
> 49 Gordon Square
> London WC1H0PD
> United Kingdom
> Tel: +44 20 7679 8680
> http://www.dcal.ucl.ac.uk/team/adam_schembri.html
>
>
>
>
> On 18 Sep 2007, at 08:04, I.E.P. Zwitserlood wrote:
>
> Talking about gospels: so it is ours!
> In the Netherlands, at the Radboud University Nijmegen, a corpus is
> currently being compiled for NGT (Sign Language of the Netherlands). My
> colleagues and I aim at recording 75 hours of elicited and (semi-)spontaneous
> data, collected from 100 native signers. All video data, as well as a
> translation and (for a small subset of the data) an annotation, will be made
> available on the internet. (Similar projects have been/will be undertaken in
> Australia, the UK and Ireland, although the data are not so easily
> available). If anyone is interested in making a corpus for his/her sign
> language, we'll be happy to inform/support you with our experiences. For
> more information, see our website:
> http://www.let.kun.nl/sign-lang/corpusngt/scientific/index.html
>
> Best,
> Inge Zwitserlood
>
> ----- Original Message -----
> From: Dan Parvaz <dparvaz at gmail.com>
> Date: Monday, September 17, 2007 6:30 pm
> Subject: Re: [SLLING-L] An avator doing bfi
>
> Sorry, but I can't stop going on about corpora -- it's the gospel I preach
> :-)
>
> Perhaps the best way to kick-start this is to round up all the usual
> suspects, and get a governmental agency (US or EU, it doesn't much matter to
> me) to coordinate recording and transcribing 50 hours of data for everyone
> to use (I know, it isn't enough  by spoken-language standards, but it's so
> much more than we've ever had). Then we have a fighting chance of pushing
> the state of the art in all these areas...
>
> -Dan.
>
> On 9/17/07, Sara Morrissey <sara.morrissey2 at mail.dcu.ie> wrote:
> >
> > Oh dear. Don't talk to me about corpora! I'm working in the arena of
> > Data-Driven Machine Translation and working with people who have millions
> > of sentences for their spoken language translation in comparison to my 600
> > for sign language work!! Finding parallel data within a closed domain is a
> > difficult task. Nevertheless progress is being made and results are
> > promising :)
> >
> > Thanks for your input :o)
> > Sara
> >
> >
> >
> > On 17/09/2007, Dan Parvaz <dparvaz at gmail.com > wrote:
> > > I'm sure the one thing standing between the Tunisian Deaf Community and
> > > achieving their potential is the lack of a signing avatar :-)  Still, it
> > > is potentially cool research with good dividends, particularly if it
> > > means the development of a real Tunisian SL dictionary (as opposed to the
> > > previous effort, which was a glossary meant to contribute to the
> > > perennial Pan-Arab SL movement), grammar, etc.
> > >
> > > A major chunk of the problem here rests with the lack of substantial
> > > corpora of any kind, let alone parallel corpora.
> > >
> > > -Dan.
> > >
> > >
> > >
> > >
> > >
> > > On 9/17/07, Sara Morrissey <sara.morrissey2 at mail.dcu.ie > wrote:
> > > >
> > > > All work in this area is a long way from being a translation service,
> > > > I can assure you of that following 3 years of PhD research on the topic
> > > > of Machine Translation of Sign Languages. Sadly most of the work that
> > > > I've come across in this area is similar to the work described in the
> > > > BBC article in that it is just a small project. I have seen very little
> > > > consistent work in this area with most of it being satellite projects
> > > > related to other work so it never gets very far. Also, sadly, many
> > > > groups that work in this area have little to no knowledge of the
> > > > languages they are dealing with and often little contact with Deaf
> > > > communities or colleagues and are more interested in the computing
> > > > aspects.  I am aware of the forthcoming FP7 project which does seem to
> > > > intend spending a few years of research in this area:
> > > > http://www.ideal-ist.net/Countries/TN/PS-TN-1590 Well, I
> > > > hope so at least, I've applied for a postdoc position with them!!
> > > >
> > > > I'd be interested in hearing anyone's opinion on both this project and
> > > > any other sign language machine translation projects they've come
> > > > across. I intend to continue working in this area so all input is
> > > > valuable :o)
> > > >
> > > > Namaste,
> > > > Sara
> > > >
> > > > ************************************
> > > > Sara Morrissey,
> > > > PhD Researcher,
> > > > National Centre for Language Technology,
> > > > School of Computing,
> > > > Dublin City University,
> > > > Dublin 9,
> > > > Ireland.
> > > > ***********************************
> > > >
> > > >
> > > >
> > > >
> > > > On 15/09/2007, Dan Parvaz <dparvaz at gmail.com > wrote:
> > > > > Sigh. Every time some student on their Amazing Journey Of
> > > > > Self-Discovery<tm> "reinvents" a piece of deaf-related technology
> > > > > (datagloves for reading fingerspelling, signing avatars, etc.), some
> > > > > ignorant journalist is ready to hail it as a breakthrough.
> > > > >
> > > > > This was put together in a few months by a student intern. As far as
> > > > > I can tell (those knowing BSL please look at the video and correct me
> > > > > if I'm wrong), this is yet another relatively straightforward
> > > > > marriage of speech recognition and 3D animation. There's no
> > > > > indication that space, classifiers, etc. which would be part of a
> > > > > natural SL are being used here. As it stands, it's less useful than
> > > > > commercially available speech-to-text systems (DragonDictate, Via
> > > > > Voice, etc.)
> > > > >
> > > > > Don't surplus your interpreters just yet :-)
> > > > >
> > > > > Cheers,
> > > > >
> > > > > -Dan
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On 9/15/07, GerardM < gerard.meijssen at gmail.com > wrote:
> > > > > >
> > > > > > Hoi,
> > > > > > > I read this article on the BBC website about a translation service
> > > > > > > created by IBM that uses an avatar to translate into British Sign
> > > > > > > Language (bfi). Such technology could in principle also produce
> > > > > > > SignWriting.
> > > > > > Thanks,
> > > > > >      Gerard
> > > > > >
> > > > > > http://news.bbc.co.uk/2/hi/technology/6993326.stm
> > > > > >
> > > > > > _______________________________________________
> > > > > > SLLING-L mailing list
> > > > > > SLLING-L at majordomo.valenciacc.edu
> > > > > >
> http://majordomo.valenciacc.edu/mailman/listinfo/slling-l
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Blessed are the flexible, for they shall not be bent out of shape.
> > > >
> > > >
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>
>
>


-- 
Dr. Lorraine Leeson
Director
Centre for Deaf Studies
School of Linguistics, Speech and Communication Sciences
University of Dublin, Trinity College
40 Lower Drumcondra Road
Drumcondra, Dublin 9

Tel: 01 830 11 66
GSM: 087 66 700 28



