[Corpora-List] Corpora Digest, Vol 70, Issue 21

Thu Apr 25 08:20:51 UTC 2013

Hi, 
Does anyone know of open-source software for detecting the language of origin (Arabic, French or English)?

Houda SAADANE

________________________________
From: "corpora-request at uib.no" <corpora-request at uib.no>
To: corpora at uib.no
Sent: Thursday, 18 April 2013 12:00
Subject: Corpora Digest, Vol 70, Issue 21

Today's Topics:

   1. Re: [Moses-support] help me to find and download IWSLT08 and
      NIST08 datasets (Philipp Koehn)
   2.  ESSLLI 2014: Call for Course and Workshop Proposals
      (Sophia Katrenko)
   3.  jobs: Linguistic Annotation Manager at Nuance (Nate Blaylock)
   4.  PhD position in learner corpus research at Louvain
      (Sylviane Granger)
   5.  Two 3-year PhD studentships in Machine Translation at the
      University of Sheffield (starting September 2013) (Lucia Specia)

----------------------------------------------------------------------

Message: 1
Date: Wed, 17 Apr 2013 14:47:49 +0100
From: Philipp Koehn <pkoehn at inf.ed.ac.uk>
Subject: Re: [Corpora-List] [Moses-support] help me to find and
    download IWSLT08 and NIST08 datasets
To: Saeed Farzi <saeedfarzi at gmail.com>
Cc: moses-support <moses-support at mit.edu>, "corpora at uib.no"
    <corpora at uib.no>

Hi,

the NIST evaluation sets are distributed through the LDC.

The IWSLT evaluation sets are available if you register to participate
in their evaluation campaign - I am not sure if they are available
otherwise, but you should get in touch with the organizers.

-phi

On Mon, Apr 15, 2013 at 7:39 AM, Saeed Farzi <saeedfarzi at gmail.com> wrote:
> Dear members,
>
> I am going to run and test our machine translation system on the well-known
> datasets (IWSLT08, NIST08). Unfortunately, I could not find them on the
> web. I wonder if anybody could help me find and download them.
>
> Thanks in advance
>
> --
>            S.Farzi, Ph.D. Student
>     Natural Language Processing Lab,
>   School of Electrical and Computer Eng.,
>                Tehran University
>              Tel: +9821-6111-9719
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support at mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

------------------------------

Message: 2
Date: Thu, 18 Apr 2013 01:11:51 +0200 (CEST)
From: "Sophia Katrenko" <sophia_katrenko at gmx.de>
Subject: [Corpora-List] ESSLLI 2014: Call for Course and Workshop
    Proposals
To: corpora at uib.no

A non-text attachment was scrubbed...
Name: not available
Type: text/html
Size: 7228 bytes
Desc: not available
URL: <http://www.uib.no/mailman/public/corpora/attachments/20130418/d7b300ec/attachment.txt>

------------------------------

Message: 3
Date: Wed, 17 Apr 2013 23:25:39 -0400
From: Nate Blaylock <blaylock.nate at gmail.com>
Subject: [Corpora-List] jobs: Linguistic Annotation Manager at Nuance
To: corpora at uib.no

The NLP Research division at Nuance Communications is looking for a manager
for the linguistic annotation group.  Location is flexible, although we
prefer somewhere in North America because of timezone issues.

NLP Research at Nuance is a large group, and we are involved in transitioning
NLP/dialog research to a number of cool projects used by millions of
people, including Samsung S-Voice, Dragon Mobile Assistant, DragonTV, and
others.

Please contact me (Nate Blaylock - nate.blaylock at nuance.com) if interested.

Details:

- Work with research groups to define workable and consistent annotation
specs.

- Define and continually improve the annotation process for high quality
and high volume, including automated ways to train new annotators,
automated quality checks, etc.

- Help design tools for annotation and quality assurance.

- Plan resources to meet customer and research needs.  Hire permanent and
temporary annotators for multiple languages and multiple projects.

*Qualifications:*

   - At least 3 years managing similar annotation.
   - Prior experience in semantic annotation.
   - Capable of working in close collaboration with researchers. Ability to
   propose and share ideas to define best annotation specs.
   - Very organized.
   - Able to make accurate plans and execute them.
   - Very strong work methodology.
   - Excellent communication skills.
   - Light computer and scripting skills to sort, filter and review data in
   order to monitor and review annotations.
   - Good knowledge of Microsoft Office software.

Preferred Skills:

- Knowledge of multiple languages

- Experience in research

Education: Master's degree in linguistics, computational linguistics or
computer science.

We offer a competitive compensation package, including stock options,
employee stock purchase plan, 401(k), full health and welfare benefits and
a casual yet technically challenging work environment.  Nuance is an Equal
Opportunity Employer.

------------------------------

Message: 4
Date: Thu, 18 Apr 2013 09:49:52 +0200
From: Sylviane Granger <sylviane.granger at uclouvain.be>
Subject: [Corpora-List] PhD position in learner corpus research at
    Louvain
To: corpora at uib.no

A non-text attachment was scrubbed...
Name: not available
Type: text/html
Size: 2370 bytes
Desc: not available
URL: <http://www.uib.no/mailman/public/corpora/attachments/20130418/e90b8343/attachment.txt>

------------------------------

Message: 5
Date: Thu, 18 Apr 2013 10:48:42 +0100
From: Lucia Specia <lspecia at gmail.com>
Subject: [Corpora-List] Two 3-year PhD studentships in Machine
    Translation at the University of Sheffield (starting September 2013)
To: corpora at uib.no, mt-list at eamt.org, moses-support at mit.edu,
    ml-news at googlegroups.com, ce-pln at grupos.ufrgs.br,
    forum-lp at lists.fct.unl.pt, Lucia Specia <l.specia at sheffield.ac.uk>
Cc: Gustavo Henrique Paetzold <ghp_91 at hotmail.com>

The Department of Computer Science invites applications for two PhD
studentships in Machine Translation connected to the MODIST project. MODIST
(Modelling Discourse in Statistical Translation) is an EPSRC project aimed
at modelling discourse level relationships across sentences in statistical
machine translation. The studentships will be dedicated to one of the
following projects:

* A framework for learning valid transitions across machine translated
sentences based on rich bilingual linguistic information.
* Decoding algorithms that represent expected discourse relationships as
document-wide constraints to guide the search for the best translation.

The ideal candidates will hold a first degree (MSc level is a plus) in
computer science, mathematics or statistics. Candidates must have excellent
programming skills. Experience in one or more of the following areas is
desirable: machine learning, natural language processing, statistical
machine translation, and discourse processing.

The studentship will cover the candidates' tuition fees, as well as a
stipend of £13,726/year.  Applications are open to all nationalities, but
due to restrictions from the funding agency, non-UK/EU candidates will be
required to cover any differences in tuition fees.

Candidates should submit an application (
https://www.shef.ac.uk/postgradapplication/) including a 500 word research
proposal on one of the above mentioned topics, and a 250 word explanation
of why their skills, experience and plans for further research or study
make them a particularly suitable candidate.

Application closing date: Open until filled, but we are looking to
interview candidates in early May.

For informal inquiries contact Dr. Lucia Specia: L.Specia at sheffield.ac.uk

---
www.dcs.shef.ac.uk/~lucia/ <http://www.dcs.shef.ac.uk/%7Elucia/>

----------------------------------------------------------------------
Send Corpora mailing list submissions to
    corpora at uib.no

To subscribe or unsubscribe via the World Wide Web, visit
    http://mailman.uib.no/listinfo/corpora
or, via email, send a message with subject or body 'help' to
    corpora-request at uib.no

You can reach the person managing the list at
    corpora-owner at uib.no

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Corpora digest..."

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora

End of Corpora Digest, Vol 70, Issue 21
***************************************