[Corpora-List] Corpora Digest, Vol 59, Issue 21

John MCKENNY john.mckenny at nottingham.edu.cn
Sat May 19 01:57:45 UTC 2012


Hi Jane
My sign-in name for Yahoo is [redacted]
Password: [redacted]

Let me know when you get in
John
________________________________________
From: corpora-bounces at uib.no [corpora-bounces at uib.no] On Behalf Of corpora-request at uib.no [corpora-request at uib.no]
Sent: 18 May 2012 18:00
To: corpora at uib.no
Subject: Corpora Digest, Vol 59, Issue 21

Today's Topics:

   1. Re: bilingual labeled corpora (Germán Sanchis Trilles)
   2. Biomedical Text mining Position available (job / PhD
      / Post-doc position announcement) (Martin Krallinger)
   3. Ph.D. scholarship (Christer.Johansson at lle.uib.no)
   4. Fully funded PhD studentships at the University of Brighton
      (R.M.Salkie at brighton.ac.uk)
   5. Language Resources for Public Security Applications Workshop
      - reminder (Language and Technology Conference)
   6. NAACL-HLT 2012 Last Call for Participation (Smaranda Muresan)
   7. Re: A doubt concerning posting in Corpora Service.
      (Juan Antonio Sabariego)


----------------------------------------------------------------------

Message: 1
Date: Thu, 17 May 2012 12:27:19 +0200 (CEST)
From: Germán Sanchis Trilles <gsanchis at dsic.upv.es>
Subject: Re: [Corpora-List] bilingual labeled corpora
To: Ralf Steinberger <ralf.steinberger at jrc.ec.europa.eu>
Cc: CORPORA at uib.no

Dear Ralf,

thank you very much for the information. I am looking into the corpora you
pointed out, and I think they will actually be very useful for my research
:)

Best regards,

Germán Sanchis-Trilles





On Wed, 16 May 2012, Ralf Steinberger wrote:

> Dear Germán,
>
> We just released (today - the email is on its way!) a multi-label classification tool which has been trained for 22 languages and which comes with manually annotated topic descriptors, drawn from the EuroVoc thesaurus. The multi-label annotation is at document level. There are between twenty and forty thousand documents per language.
>
> You can find it at http://langtech.jrc.ec.europa.eu/Eurovoc.html .
>
> Maybe this corpus is useful for you.
>
> Should you need individual aligned sentences, then the DGT-Translation Memory (DGT-TM) may be what you are looking for. While the sentences in DGT-TM are not individually annotated, they are accompanied by a document identifier so that, with a bit of effort, you can retrieve the EuroVoc descriptors for these documents. DGT-TM exists in the same 22 languages and is downloadable from http://langtech.jrc.ec.europa.eu/DGT-TM.html .
>
> Greetings,
>
> Ralf
>
>
> Ralf Steinberger (Ralf.Steinberger at jrc.ec.europa.eu)
> European Commission - Joint Research Centre (JRC)
> IPSC - GlobeSec - OPTIMA
> URL - Applications: http://emm.newsbrief.eu/overview.html
> URL - The science behind them: http://langtech.jrc.ec.europa.eu
> T.P. 267, Via E. Fermi 2749
> 21027 Ispra (VA), Italy
> Tel: +39 0332 78-6271
> Fax: +39 0332 78-5154
> Secretary: +39 0332 78-5648 or 9478
>
>
> -----Original Message-----
> From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Germán Sanchis Trilles
> Sent: 16 May 2012 12:56
> To: CORPORA at uib.no
> Subject: [Corpora-List] bilingual labeled corpora
>
> Dear list,
>
> for some SMT experiments I need bilingual corpora with different kinds of annotations, such as topic or dialog act labels (or other kinds of labels). Does anyone know of such corpora?
>
> Thanks in advance,
>
> best regards,
>
> Germán Sanchis-Trilles
>
>

------------------------------

Message: 2
Date: Thu, 17 May 2012 18:25:20 +0200
From: "Martin Krallinger" <mkrallinger at cnio.es>
Subject: [Corpora-List] Biomedical Text mining Position available (job
        / PhD / Post-doc position announcement)
To: <corpora at uib.no>

Title: Biomedical Text mining Position available (job / PhD / Post-doc
position announcement)

Several types of contracts can be offered in our research group,
including post-doctoral, PhD or post-graduate positions.
Salaries will depend on the type of position, expertise and academic
background. The working language is English.
Lab URL:
http://www.cnio.es/ing/grupos/plantillas/presentacion.asp?grupo=50004294
Publication record (Alfonso Valencia):
http://scholar.google.es/citations?user=4iB725QAAAAJ&hl=en


General description:
The candidate will work in a multidisciplinary team dealing with the
development and application of biomedical
text mining and natural language processing techniques. The overall aim
of this work is to develop and apply
text mining and natural language processing technologies to biomedical
literature, covering aspects related to
automatic text classification using machine learning methods, the
detection of entities of biological interest from
text and the extraction and ranking of biological relations from the
biomedical literature. A special focus will be
given to certain topics including: cancer-relevant gene detection and
relationship extraction.

Requirements:
(1) Applicants should have a solid background in computational
linguistics, Natural Language Processing, text mining or
related areas.
(2) Ability to develop algorithms and software needed by natural
language processing/text mining systems.
(3) Programming skills are required in at least one of the following
languages: Python, Perl, Java, C/C++, Ruby.
(4) Good English communication skills.
(5) Interest in the Biomedical field.

Expertise in the following areas is a plus when applying for the
position:
- Training in topics related to statistics and machine learning.
- Development of online web applications.
- Previous experience with biomedical texts.
- Ability to work in an interdisciplinary team.
- A good scientific publication record would be an advantage.
- Familiarity with NLP tasks such as named entity recognition,
information extraction and information retrieval.

Background on our research group:
The position is available in the group of Dr. Alfonso Valencia, Director
of the Structural Biology and Biocomputing Programme
at the Spanish National Cancer Research Centre. The research group
has contributed significantly to biomedical text mining
research over the past years, from initial work related to the analysis
of protein families, microarray data and protein
interactions to the development of online applications such as the iHOP
server, PLAN2L or the BioCreative Metaserver.
The group collaborates with experimental biomedicine labs to integrate
NLP and text mining data with the results of
bioinformatics data. It has been co-organizing community evaluation
efforts in the BioNLP area, i.e. the BioCreative challenges.

Contact info: Requests for additional information or formal applications
(including application letters, extensive CV and PhD/MA thesis
description) can be sent to Martin Krallinger: mkrallinger at cnio.es


--------------------------------------------------
Martin Krallinger
Structural Computational Biology Group
Structural Biology and BioComputing Programme
Spanish National Cancer Research Centre (CNIO)
--------------------------------------------------







------------------------------

Message: 3
Date: Mon, 14 May 2012 14:21:14 +0200
From: Christer.Johansson at lle.uib.no
Subject: [Corpora-List] Ph.D. scholarship
To: corpora at uib.no

PhD Research Fellowships in the Department of linguistic, literary,
and aesthetic studies

A PhD fellowship is available at the Department of linguistic,
literary, and aesthetic studies from 1 September 2012.

The fellowship is within the research project CATO (Contextual Aspects
of Text Organization) which is funded by the Norwegian Research
Council. CATO is about writing flow, and how writing flow can be
supported in future technical aids for weak writers. The applicant's
work will be within the specifications of this project. The research
is conducted in cooperation with professor Christer Johansson (UiB)
and associate professor Per Henning Uppstad (ReadingCenter at UiS).


http://www.jobbnorge.no/job.aspx?jobid=83366





------------------------------

Message: 4
Date: Tue, 15 May 2012 15:39:40 +0000
From: R.M.Salkie at brighton.ac.uk
Subject: [Corpora-List] Fully funded PhD studentships at the
        University of Brighton
To: "CORPORA at UIB.NO" <CORPORA at UIB.NO>

[Note deadline: 8 June 2012]

The University of Brighton's Doctoral College invites applications from around the world for one of up to 40 new PhD studentships available for entry during the 2012/2013 academic year.

Two of the studentships are in linguistics - see below.

Full details: http://www.brighton.ac.uk/researchstudy/2012studentships/
Advertisement in Times Higher: http://www.timeshighereducation.co.uk/jobs_jobdetails.asp?ac=93409

Each studentship is worth £55,650 and will support full-time study over a three-year period, including £14,300 per year towards living expenses.

Linguistics topics:

Modality in English and the semantics/pragmatics interface
http://www.brighton.ac.uk/researchstudy/2012studentships/arts-and-humanities/modality-in-english-and-the-semanticspragmatics-interface

The future of the languages of Europe (contrastive linguistics, English and German)
http://www.brighton.ac.uk/researchstudy/2012studentships/arts-and-humanities/the-future-of-the-languages-of-europe-contrastive-linguistics-english-and-german

These URLs need to be on a single line. You can also find these topics easily via the Full details page above.

For informal discussion about the linguistics studentships, email Professor Raphael Salkie (r.m.salkie at bton.ac.uk).

For general enquiries about the scheme, or to be put in touch with a supervisor, please contact the Brighton Doctoral College by email at: doctoralcollegedean at brighton.ac.uk



Professor Raphael Salkie,
School of Humanities,
University of Brighton
Falmer, Brighton, BN1 9PH
England.

Fax: (+44) 01273 641873
Email: r.m.salkie at brighton.ac.uk

Home page: http://arts.brighton.ac.uk/staff/raf-salkie






------------------------------

Message: 5
Date: Thu, 17 May 2012 00:47:49 +0200 (CEST)
From: ltc at amu.edu.pl (Language and Technology Conference)
Subject: [Corpora-List] Language Resources for Public Security
        Applications Workshop - reminder
To: corpora at hd.uib.no

Dear Sir or Madam

This is to invite you to participate in the Language Resources for Public Security Applications Workshop (LRPS 2012) at LREC 2012. The workshop will take place in Istanbul (Turkey) on May 27th, 2012. More information regarding the workshop can be found at http://www.lrps.amu.edu.pl. We also recommend visiting the LREC 2012 site at http://www.lrec-conf.org/lrec2012/.

Best regards,
LRPS Organizers



------------------------------

Message: 6
Date: Thu, 17 May 2012 18:29:06 -0400
From: Smaranda Muresan <smuresan at rci.rutgers.edu>
Subject: [Corpora-List] NAACL-HLT 2012 Last Call for Participation
To: liste_acl <acl at aclweb.org>, liste_isca <publ at isca-speech.org>,
        liste_corpora <corpora at uib.no>, liste_bionlp
        <bionlp at lists.ccs.neu.edu>, liste_elsnet <elsnet-list at elsnet.org>,
        liste_sigsem <sigsem at aclweb.org>, irlist at lists.shef.ac.uk

==============================================================================================
NAACL-HLT 2012 LAST CALL FOR PARTICIPATION

North American Chapter of the Association for Computational Linguistics - Human Language Technologies
(NAACL-HLT 2012)
June 3 - June 8, 2012, Montreal, Canada
http://naaclhlt2012.org

==============================================================================================

****REGISTRATION DEADLINES******

Early Registration Closed
Late Registration Closes May 23 at 11:59pm East Coast Time


Registration:
http://www.naaclhlt2012.org/registration/registration.php
Hotel Room Reservation:
http://www.naaclhlt2012.org/participants/accomodation.php


The 2012 Conference of the North American Chapter of the Association for Computational Linguistics:
Human Language Technologies (NAACL-HLT 2012) will be held June 3 - 8, 2012 at Le Centre Sheraton
Montréal 1201, boul. René-Lévesque ouest, Montréal, (Québec), Canada.

The conference will include three days of technical presentations (June 4-6),
one day of tutorials (June 3), and two days of workshops (June 7-8). In addition, this year the
conference will be co-located with the First Joint Conference on Lexical and Computational Semantics
(June 7-8).

[NAACL-HLT CONFERENCE PROGRAM]
http://www.naaclhlt2012.org/conference/conference.php


[INVITED SPEAKERS]

* Eduard Hovy, Director of the Human Language Technology Group, Information Sciences Institute of
 the University of Southern California
* James W. Pennebaker, Centennial Liberal Arts Professor and Chair of Psychology at the University
 of Texas at Austin


[BEST PAPER AWARDS]

Best Full Paper Award: Vine Pruning for Efficient Multi-Pass Dependency Parsing: Alexander Rush and Slav Petrov
Best Short Paper Award: Trait-Based Hypothesis Selection For Machine Translation: Jacob Devlin and Spyros Matsoukas
IBM Best Student Paper Award: Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure: Oscar Täckström, Ryan McDonald, Jakob Uszkoreit

[TUTORIALS]
June 3  (http://www.naaclhlt2012.org/conference/tutorials.php)

* 100 Things You Always Wanted to Know about Linguistics But Were Afraid to Ask (Emily M. Bender)
* Structured Sparsity in Natural Language Processing: Models, Algorithms and Applications (André F. T. Martins, Mário A. T. Figueiredo, and Noah A. Smith)
* Arabic Dialect Processing Tutorial (Mona Diab and Nizar Habash)
* Natural Language Processing in Watson (Alfio M. Gliozzo, Aditya Kalyanpur, James Fan)
* Variational Inference for Structured NLP Models (David Burkett, Dan Klein)
* Processing modality and negation (Roser Morante)
* On-Demand Distributional Semantic Distance and Paraphrasing (Yuval Marton)
* Predicting Structures in NLP: Constrained Conditional Models and Integer Linear Programming NLP (Dan Goldwasser, Vivek Srikumar, Dan Roth)

[WORKSHOPS]
NAACL-HLT 2012 features an expanded workshop program, with 16
workshops over two days (June 7-8). The workshops are:

*Cognitive Modeling and Computational Linguistics
*Future directions and needs in the Spoken Dialog Community
*Induction of Linguistic Structure
*Innovative Use of NLP for Building Educational Applications
*Language in Social Media
*Predicting and improving text readability
*Twelfth Meeting of the ACL-SIGMORPHON Computational Research in Phonetics, Phonology, and Morphology
*BioNLP
*Computational Linguistics for Literature
*Evaluation metrics and system comparison for automatic summarization
*Future of Language Modeling for HL
*Semantic Interpretation in an Actionable Context
*Syntactic Analysis of Non-Canonical Language
*Speech and Language Processing for Assistive Technologies
*Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction (AKBC-WEKEX)
*Statistical Machine Translation

For more information about the workshops visit: http://www.naaclhlt2012.org/conference/ws.php

[CONFERENCE VENUE]
NAACL-HLT 2012 will be held at Le Centre Sheraton Montréal, 1201, boul. René-Lévesque ouest,
Montréal, (Québec), Canada. The negotiated Conference discount rate expired on May 11, 2012.
Due to the Grand Prix immediately following our event, the hotel is almost sold out. However, a few Club Level
rooms are still available and we have negotiated a discounted rate of $280 per night based on availability.
Guests can access the site to book, modify, or cancel a reservation from May 11, 2012 to June 12, 2012.
Simply follow https://www.starwoodmeeting.com/StarGroupsWeb/res?id=1205117094&key=322B


We hope to see you in Montreal!


[ORGANIZING COMMITTEE]

General Conference Chair
Jennifer Chu-Carroll, IBM

Program Co-Chairs
Srinivas Bangalore, AT&T
Eric Fosler-Lussier, The Ohio State University
Ellen Riloff, University of Utah


Local Arrangements Chair:
Priscilla Rasmussen, ACL Business Office, acl-AT-aclweb.org
Advisory committee:
  Sabine Bergler, Concordia University
  Guy Lapalme, Université de Montréal

Workshops Co-Chairs:
Colin Cherry, National Research Council of Canada
Mona Diab, Columbia University

Tutorials Co-Chairs:
Jacob Eisenstein, CMU
Radu Florian, IBM T.J. Watson Research Center

Demos Co-Chairs:
Aria Haghighi, Prismatic
Yaser Al-Onaizan, IBM T.J. Watson Research Center

Student Workshop Co-chairs:
Rivka Levitan, Columbia University
Myle Ott, Cornell University
Faculty Advisors:
Roger Levy, UCSD
Ani Nenkova, University of Pennsylvania

Publications:
Nizar Habash, Columbia University
William Schuler, OSU

Publicity:
Smaranda Muresan, Rutgers University

Exhibits:
Joel Tetreault, Educational Testing Service

Webmaster:
Dirk Hovy, USC/ISI

Smaranda Muresan
Assistant Professor
Library and Information Science Department
School of Communication and Information
Rutgers University
4 Huntington St
New Brunswick, NJ, 08901
smuresan at rci.rutgers.edu





------------------------------

Message: 7
Date: Fri, 18 May 2012 00:30:22 +0200
From: Juan Antonio Sabariego <j.a.sabariego at gmail.com>
Subject: Re: [Corpora-List] A doubt concerning posting in Corpora
        Service.
To: corpora at uib.no

Dear members of CorporaList,

Some members of the Universitat Pompeu Fabra pragmatics group are working
on a project about evidential and epistemic markers in five different
languages, namely Spanish, Catalan, English, French and German. We would be
very grateful for any help in finding oral conversational corpora in those
five languages, already transcribed, for the purpose of our investigation.
We have found some, but they are either quite limited for our purposes or
must be paid for, and we wanted to look at the free versions first. We hope
you can give us a clear picture of the state of oral corpora in these five
languages.
Thank you in advance.

Best regards,


Juan Antonio Chica Sabariego
Universitat Pompeu Fabra

On Thu, May 17, 2012 at 10:57 PM, Knut Hofland <Knut.Hofland at uni.no> wrote:

> If you want to post to the list, use the address corpora at uib.no
>
> Best regards
> Knut Hofland
> Listadm
>
>
>
> On Tue, 15 May 2012, Juan Antonio Sabariego wrote:
>
>> Some members of the Universitat Pompeu Fabra pragmatics group are working
>> on a project about evidential and epistemic units in five different
>> languages, namely Spanish, Catalan, English, French and German. We would
>> be much pleased if you could offer some help about oral conversational
>> corpora, already transcribed, for the purpose of our investigation. I have
>> found some of them but they are quite limited for our investigation or
>> paying.  Thank you in advance.
>>
>

----------------------------------------------------------------------
Send Corpora mailing list submissions to
        corpora at uib.no

To subscribe or unsubscribe via the World Wide Web, visit
        http://mailman.uib.no/listinfo/corpora
or, via email, send a message with subject or body 'help' to
        corpora-request at uib.no

You can reach the person managing the list at
        corpora-owner at uib.no

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Corpora digest..."


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


End of Corpora Digest, Vol 59, Issue 21
***************************************
