<div dir="ltr"><div class="gmail_extra">Hi,</div><div class="gmail_extra"><br></div><div class="gmail_extra">Moses outputs phrase alignments by default; these are then removed in a subsequent step. For example,</div><div class="gmail_extra">
<br></div><div class="gmail_extra">by a government |19-21| authority |22-22| of their state |23-24|<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">means that the phrase made up of the source words at indexes 19, 20 and 21 was translated to the target phrase "by a government".</div>
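<div class="gmail_extra"><br></div><div class="gmail_extra">As a minimal sketch, assuming the output has exactly the shape shown above (each target phrase followed by its source span written as |start-end|), the segmentation can be read back with a few lines of Python. The function name parse_phrase_alignment is hypothetical and is not part of Moses.</div>

```python
import re

# One target phrase followed by its source span marker, e.g. "by a government |19-21|".
SEGMENT = re.compile(r"(.+?)\s*\|(\d+)-(\d+)\|")


def parse_phrase_alignment(line):
    """Return a list of (target_phrase, source_start, source_end) tuples.

    Assumes the decoder output format shown in the message above.
    """
    segments = []
    for match in SEGMENT.finditer(line):
        phrase, start, end = match.groups()
        segments.append((phrase.strip(), int(start), int(end)))
    return segments


example = "by a government |19-21| authority |22-22| of their state |23-24|"
for phrase, start, end in parse_phrase_alignment(example):
    print(f"source words {start}..{end}: {phrase!r}")
```

<div class="gmail_extra">Each tuple pairs a target phrase with the first and last source word index it covers, which gives both the segmentation and the phrase-level alignment in one pass.</div>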
<div class="gmail_extra"><br></div><div class="gmail_extra">If you want phrase-internal (word-to-word) alignments, you can add one of the following options to the decoder command to get alignment information:</div><div class="gmail_extra">
<br></div><div class="gmail_extra"><pre style="margin-top:0px;margin-bottom:0px;border:2px dotted rgb(150,150,150);background-color:rgb(248,248,248);padding:10px;color:rgb(0,0,0);font-size:13px;line-height:20px"><code style="color:rgb(115,0,0)">-alignment-output-file</code>
<code style="color:rgb(115,0,0)">-print-alignment-info-in-n-best</code>
</pre><div><code style="color:rgb(115,0,0)"><br></code></div><div>If the target string is known in advance, you can use forced decoding. It will give you the best phrasal alignment.<br></div><div><code style="color:rgb(115,0,0)"><br>
</code></div></div><div class="gmail_extra">Cheers,</div><div class="gmail_extra">Nadir</div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Feb 13, 2014 at 11:00 AM, <span dir="ltr"><<a href="mailto:corpora-request@uib.no" target="_blank">corpora-request@uib.no</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Today's Topics:<br>
<br>
1. phrase alignment (Saeed Farzi)<br>
2. First Announcement: The Fifth Swedish Language Technology<br>
Conference (SLTC-14) (Jörg Tiedemann)<br>
3. SEPLN 2013 - 1st Call for Papers (Horacio Saggion)<br>
4. Call for Demos: NLDB'2014, Montpellier - France (Mathieu Roche)<br>
<br>
<br>
----------------------------------------------------------------------<br>
<br>
Message: 1<br>
Date: Wed, 12 Feb 2014 17:55:15 +0330<br>
From: Saeed Farzi <<a href="mailto:saeedfarzi@gmail.com">saeedfarzi@gmail.com</a>><br>
Subject: [Corpora-List] phrase alignment<br>
To: "<a href="mailto:corpora@uib.no">corpora@uib.no</a>" <<a href="mailto:corpora@uib.no">corpora@uib.no</a>>, moses-support<br>
<<a href="mailto:moses-support@mit.edu">moses-support@mit.edu</a>><br>
<br>
Dear all,<br>
<br>
<br>
I have a question about finding the best phrase alignments.<br>
These alignments are used by Moses during the decoding phase.<br>
<br>
I have a pair of parallel sentences (a source and a target). I need the best<br>
phrase alignment between the source and target sentences. The best<br>
phrase alignment is the alignment that Moses uses to translate the source<br>
sentence into the target sentence.<br>
<br>
Let me use an example to explain what I want.<br>
Example:<br>
I have a sentence pair:<br>
Source : I go to the home<br>
Target : man be khaneh miravam (in farsi)<br>
<br>
I need the following alignment:<br>
<br>
The Best alignment : [I-->man] [to the --> beh] [home-->khaneh] [ go --><br>
miravam]<br>
The result includes two sorts of information:<br>
<br>
1- the best segments<br>
2- the best alignment<br>
<br>
We can use Moses to extract the alignments by feeding the training<br>
sentences to the Moses decoder as input. But there is a problem:<br>
the output of the decoder is not exactly the same as the target sentence.<br>
<br>
I know that GIZA++ is used for word alignment. I need a solution for<br>
phrase alignment.<br>
Thanks<br>
--<br>
S.Farzi, Ph.D. Student<br>
Natural Language Processing Lab,<br>
School of Electrical and Computer Eng.,<br>
Tehran University<br>
Tel: <a href="tel:%2B9821-6111-9719" value="+982161119719">+9821-6111-9719</a><br>
<br>
------------------------------<br>
<br>
Message: 2<br>
Date: Wed, 12 Feb 2014 14:50:43 +0000<br>
From: Jörg Tiedemann <<a href="mailto:Jorg.Tiedemann@lingfil.uu.se">Jorg.Tiedemann@lingfil.uu.se</a>><br>
Subject: [Corpora-List] First Announcement: The Fifth Swedish Language<br>
Technology Conference (SLTC-14)<br>
To: sltc2014 <<a href="mailto:sltc2014@lingfil.uu.se">sltc2014@lingfil.uu.se</a>><br>
Cc: "<a href="mailto:elsnet-list@elsnet.org">elsnet-list@elsnet.org</a>" <<a href="mailto:elsnet-list@elsnet.org">elsnet-list@elsnet.org</a>>,<br>
"<a href="mailto:acl@aclweb.org">acl@aclweb.org</a>" <<a href="mailto:acl@aclweb.org">acl@aclweb.org</a>>, "<a href="mailto:nordlingnet@uib.no">nordlingnet@uib.no</a>"<br>
<<a href="mailto:nordlingnet@uib.no">nordlingnet@uib.no</a>>, "<a href="mailto:alla@gslt.hum.gu.se">alla@gslt.hum.gu.se</a>" <<a href="mailto:alla@gslt.hum.gu.se">alla@gslt.hum.gu.se</a>>,<br>
"<a href="mailto:nodali@helsinki.fi">nodali@helsinki.fi</a>" <<a href="mailto:nodali@helsinki.fi">nodali@helsinki.fi</a>>, "<a href="mailto:corpora@uib.no">corpora@uib.no</a>"<br>
<<a href="mailto:corpora@uib.no">corpora@uib.no</a>><br>
<br>
The Fifth Swedish Language Technology Conference (SLTC-14)<br>
<a href="http://www2.lingfil.uu.se/SLTC2014/" target="_blank">http://www2.lingfil.uu.se/SLTC2014/</a><br>
<br>
<br>
Uppsala, Sweden<br>
November 13-14, 2014<br>
<br>
The Fifth Swedish Language Technology Conference (SLTC-14) will be held in Uppsala, November 13-14, 2014, organized by the Computational Linguistics Group at the Department of Linguistics and Philology at Uppsala University. Papers and workshops will be invited on all aspects of language technology, including natural language processing, speech technology, and relevant neighboring areas. A call for workshops and papers will be issued in early March.<br>
<br>
<br>
Important dates:<br>
<br>
Workshop Proposal Submission: May 31, 2014<br>
Workshop Notification of Acceptance: June 15, 2014<br>
Abstract Submission: September 1, 2014<br>
Notification of Acceptance: September 22, 2014<br>
Final Abstract Submission: October 13, 2014<br>
Registration (Early Bird): October 13, 2014<br>
Conference and Workshops: November 13-14, 2014<br>
<br>
<br>
URL:<br>
<a href="http://www2.lingfil.uu.se/SLTC2014/" target="_blank">http://www2.lingfil.uu.se/SLTC2014/</a><br>
<br>
<br>
Contact:<br>
Scientific issues: <a href="mailto:sltc2014@lingfil.uu.se">sltc2014@lingfil.uu.se</a><br>
Practical issues: <a href="mailto:sltc2014@akademikonferens.uu.se">sltc2014@akademikonferens.uu.se</a><br>
<br>
<br>
<br>
<br>
------------------------------<br>
<br>
Message: 3<br>
Date: Wed, 12 Feb 2014 16:28:18 +0100<br>
From: Horacio Saggion <<a href="mailto:horacio.saggion@upf.edu">horacio.saggion@upf.edu</a>><br>
Subject: [Corpora-List] SEPLN 2013 - 1st Call for Papers<br>
To: corpora <<a href="mailto:corpora@uib.no">corpora@uib.no</a>><br>
<br>
--------------------------------------------------------------------<br>
<br>
CALL FOR PAPERS:<br>
<br>
30th CONFERENCE OF THE SPANISH SOCIETY<br>
FOR NATURAL LANGUAGE PROCESSING (SEPLN 2014)<br>
September 17-19, 2014<br>
Universitat de Girona<br>
<a href="http://www.taln.upf.edu/pages/sepln2014/es/index.html" target="_blank">http://www.taln.upf.edu/pages/sepln2014/es/index.html</a><br>
<br>
--------------------------------------------------------------------<br>
<br>
<br>
INTRODUCTION<br>
-----------------------<br>
<br>
<br>
The 30th edition of the Annual Conference of the Spanish Society for<br>
Natural Language Processing (SEPLN) will take place in Universitat de<br>
Girona, Girona, Spain on 17-19 September 2014. We also expect to organize<br>
associated workshops.<br>
<br>
<br>
The huge amount of information available in digital format and in different<br>
languages calls for systems to enable us to access this vast library in an<br>
increasingly more structured way.<br>
<br>
In this same area, there is a renewed interest in improving information<br>
accessibility and information exploitation in multilingual environments.<br>
Many of the formal foundations for dealing appropriately with these<br>
necessities have been, and are still being established in the area of<br>
Natural Language Processing and its many branches: Information extraction<br>
and retrieval, Question answering systems, Machine translation, Automatic<br>
analysis of textual content, Text summarization, Text generation, and<br>
Speech recognition and synthesis.<br>
<br>
The aim of the conference is to provide a forum for discussion and<br>
communication where the latest research work and developments in the field<br>
of Natural Language Processing (NLP) can be presented by scientific and<br>
business communities. The conference also aims at exposing new<br>
possibilities of real applications and R&D projects in this field.<br>
<br>
Moreover, as in previous editions, there is the intention of identifying<br>
future guidelines or paths for basic research and foreseen software<br>
applications, in order to compare them against the market needs. Finally,<br>
the conference intends to be an appropriate forum in helping new<br>
professionals to become active members in this field.<br>
<br>
<br>
TOPICS<br>
-----------<br>
<br>
<br>
Researchers and companies are encouraged to send communications, project<br>
abstracts or demonstrations related to any language technology topic<br>
including but not limited to the following:<br>
<br>
* Linguistic, mathematical and psycholinguistic models of language.<br>
* Machine learning in NLP.<br>
* Computational lexicography and terminology.<br>
* Corpus linguistics.<br>
* Development of linguistic resources and tools.<br>
* Grammars and formalisms for morphological and syntactic analysis.<br>
* Semantics, pragmatics and discourse.<br>
* Lexical ambiguity resolution.<br>
* Monolingual and multilingual text generation.<br>
* Machine translation.<br>
* Speech synthesis and recognition.<br>
* Dialogue systems.<br>
* Audio indexing.<br>
* Monolingual and multilingual information extraction and retrieval.<br>
* Question answering systems.<br>
* Evaluation of NLP systems.<br>
* Automatic textual content analysis.<br>
* Sentiment analysis and opinion mining.<br>
* Plagiarism detection.<br>
* Negation and speculation processing.<br>
* Text mining in blogosphere and social networks.<br>
* Text summarization.<br>
* Image retrieval.<br>
* NLP in biomedical domain.<br>
* NLP-based generation of teaching resources.<br>
* NLP for languages with limited resources.<br>
* NLP industrial applications.<br>
<br>
<br>
CONTACT<br>
--------------<br>
<br>
All information related to the conference can be found on the web:<br>
<br>
<br>
<a href="http://www.taln.upf.edu/pages/sepln2014/en/index.html" target="_blank">http://www.taln.upf.edu/pages/sepln2014/en/index.html</a><br>
<br>
<br>
STRUCTURE OF THE CONFERENCE<br>
--------------------------------------------------<br>
<br>
<br>
The conference will last three days, and will consist of sessions devoted<br>
to presenting papers, posters, tutorials, ongoing research projects and<br>
prototype or product demonstrations connected with topics addressed in the<br>
conference. In addition, we expect to organize associated workshops.<br>
<br>
<br>
SUBMISSION OF CONTRIBUTIONS<br>
-------------------------------------------------<br>
<br>
<br>
Authors are encouraged to send theoretical or application-oriented<br>
proposals related to NLP. The proposals must include the following sections:<br>
<br>
* The title of the communication.<br>
* An abstract in English and Spanish (maximum 150 words) and a list<br>
of keywords.<br>
* The paper can be written in Spanish or English. Its overall maximum length<br>
will be 8 pages, including references.<br>
* The documents must not include headers or footers.<br>
* Papers should NOT include the names of the authors.<br>
<br>
The proposed papers will be reviewed by at least three reviewers, and can<br>
be accepted to be presented either as posters or as communications,<br>
depending on the needs of the program. However, no distinction will be made<br>
between communications and posters in the printed version of the SEPLN<br>
journal.<br>
<br>
<br>
<br>
<br>
<br>
*** IMPORTANT NOTE ON CAMERA READY ****<br>
<br>
The final version of the paper (camera ready) should be submitted together<br>
with a cover letter explaining how the suggestions of the reviewers were<br>
implemented in the final version.<br>
<br>
**********************************************<br>
<br>
<br>
<br>
Please send your proposals using the following link:<br>
<br>
<a href="http://www.sepln.org/myreview-sepln53/" target="_blank">http://www.sepln.org/myreview-sepln53/</a><br>
<br>
<br>
The format of the SEPLN journal must be followed:<br>
<br>
<a href="http://www.sepln.org/?page_id=1285&lang=en" target="_blank">http://www.sepln.org/?page_id=1285&lang=en</a><br>
<br>
In addition, all proposals will have to comply with the following<br>
requirements, depending on whether they are papers, demos or projects.<br>
<br>
<br>
PROJECTS AND DEMOS<br>
----------------------------------<br>
<br>
<br>
As in previous editions, the organizers encourage participants to give oral<br>
presentations of R&D projects and demos of systems or tools related to the<br>
NLP field. For oral presentations on R&D projects to be accepted, the<br>
following information must be included:<br>
<br>
* Project title.<br>
* Name, affiliation, address, e-mail and phone number of the project<br>
director.<br>
* Funding institutions.<br>
* Groups participating in the project.<br>
* Abstract (4 pages maximum, including references).<br>
<br>
<br>
For demonstrations to be accepted, the following information is mandatory:<br>
<br>
* Demo title.<br>
* Name, affiliation, e-mail and phone number of the authors.<br>
* Abstract (4 pages maximum, including references).<br>
* Time estimation for the whole presentation.<br>
<br>
**** SEE NOTE ON CAMERA READY ABOVE ****<br>
<br>
IMPORTANT DATES<br>
----------------------------<br>
<br>
<br>
Deadline for full papers, demos, and projects: 10th April 2014<br>
<br>
Notifications: 26th May 2014<br>
<br>
Camera Ready: 7th June 2014<br>
<br>
<br>
<br>
<br>
--<br>
Dr. Horacio Saggion<br>
TALN / DTIC<br>
Universitat Pompeu Fabra<br>
<a href="http://www.dtic.upf.edu/~hsaggion/" target="_blank">http://www.dtic.upf.edu/~hsaggion/</a><br>
<br>
------------------------------<br>
<br>
Message: 4<br>
Date: Wed, 12 Feb 2014 18:16:32 +0100<br>
From: Mathieu Roche <<a href="mailto:Mathieu.Roche@lirmm.fr">Mathieu.Roche@lirmm.fr</a>><br>
Subject: [Corpora-List] Call for Demos: NLDB'2014, Montpellier -<br>
France<br>
To: <<a href="mailto:corpora@uib.no">corpora@uib.no</a>>, <<a href="mailto:acl@aclweb.org">acl@aclweb.org</a>>, <<a href="mailto:ISWORLD@listserv.heanet.ie">ISWORLD@listserv.heanet.ie</a>>,<br>
<<a href="mailto:IRList@lists.shef.ac.uk">IRList@lists.shef.ac.uk</a>>, <<a href="mailto:bionlp@bionlp.org">bionlp@bionlp.org</a>>, <<a href="mailto:dbworld@cs.wisc.edu">dbworld@cs.wisc.edu</a>>,<br>
<<a href="mailto:ln@cines.fr">ln@cines.fr</a>>, <<a href="mailto:liste-egc@polytech.univ-nantes.fr">liste-egc@polytech.univ-nantes.fr</a>>, <<a href="mailto:bull-i3@irit.fr">bull-i3@irit.fr</a>><br>
<br>
*******************************************<br>
<br>
Call for Demos - NLDB'2014<br>
<br>
18-20 June 2014 - Montpellier, France<br>
<br>
<a href="http://www.nldb.org/" target="_blank">http://www.nldb.org/</a><br>
<br>
*******************************************<br>
<br>
The 19th International Conference on Application of Natural Language to<br>
Information Systems (NLDB'2014) invites submissions of demonstrations of<br>
state-of-the-art research or industrial prototypes related to all<br>
aspects of Natural Language in the Database and Information Systems<br>
field.<br>
Topics of interest include but are not limited to:<br>
- Applications of NLP in Information Systems<br>
- Social Media and Web Data<br>
- Big Data and Natural Language<br>
- Semantic Web and Open Linked Data<br>
- Question Answering (QA)<br>
- Natural language and Ubiquitous Computing<br>
- Natural Language in Conceptual Modeling<br>
- NLP Applications (Opinion Mining, Information Extraction, ...)<br>
<br>
Demo submissions will be handled online via the easychair conference<br>
management system:<br>
<a href="https://www.easychair.org/conferences/?conf=nldb2014demonstratio" target="_blank">https://www.easychair.org/conferences/?conf=nldb2014demonstratio</a><br>
<br>
Demonstration paper submissions should have 4 pages (LNCS format).<br>
Developers should outline the design of their system and provide<br>
details to allow the evaluation of its validity, quality, originality,<br>
and relevance to NLP in Information Systems.<br>
<br>
The accepted papers for demos will be included in the conference<br>
proceedings, to be published by Springer Verlag in the "Lecture Notes in<br>
Computer Science" (LNCS) Series. The demos will be presented in a<br>
special demonstration session. At least one of the demo submitters must<br>
register for the conference, and perform the demo on site.<br>
<br>
Important Dates:<br>
- Demo submission deadline (firm): March 13, 2014<br>
- Notification of acceptance: March 28, 2014<br>
- Camera-ready paper due: April 7, 2014<br>
<br>
<br>
<br>
<br>
<br>
End of Corpora Digest, Vol 80, Issue 14<br>
***************************************<br>
</blockquote></div><br></div></div>