[Corpora-List] [CORPORA-List] RE: Evaluating Sentence Aligners

Olivier Kraif olivier.kraif at tele2.fr
Thu Nov 22 09:20:06 UTC 2007


Dear Mike,
In my experience, stemming and lemmatizing may improve results for 
lexical-level alignment, but not for sentence alignment when it is 
based on surface clues (sentence length, identical strings, cognate 
search, etc.).
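
To make the surface clues concrete, here is a rough Python sketch. It is 
illustrative only and is not Alinea's actual code: the function names, the 
4-character prefix test for cognates, and the 5-character truncation used 
as a stand-in for stemming are all assumptions made for the example.

```python
def length_ratio(src, tgt):
    """Character-length ratio of the shorter to the longer sentence (0..1)."""
    a, b = len(src), len(tgt)
    return min(a, b) / max(a, b) if max(a, b) else 1.0


def identical_strings(src, tgt):
    """Tokens that appear unchanged on both sides (numbers, names, ...)."""
    return set(src.split()) & set(tgt.split())


def cognates(src, tgt, prefix_len=4):
    """Word pairs sharing their first prefix_len characters (crude test)."""
    pairs = set()
    for s in src.lower().split():
        for t in tgt.lower().split():
            if (len(s) >= prefix_len and len(t) >= prefix_len
                    and s[:prefix_len] == t[:prefix_len]):
                pairs.add((s, t))
    return pairs


def crude_stem(sentence, keep=5):
    """Hypothetical stand-in for a stemmer: truncate every word to keep chars."""
    return " ".join(word[:keep] for word in sentence.split())


src = "The parliament adopted the resolution in 2007"
tgt = "Le parlement a adopté la résolution en 2007"

raw_pairs = cognates(src, tgt)
stemmed_pairs = cognates(crude_stem(src), crude_stem(tgt))
shared = identical_strings(src, tgt)
ratio = length_ratio(src, tgt)
```

On this toy pair the prefix test finds the same number of cognate pairs 
before and after truncation, because it only looks at word beginnings, 
which stemming tends to leave intact.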

Olivier Kraif
-------------
Université Stendhal Grenoble 3
LIDILEM - http://w3.u-grenoble3.fr/lidilem/labo/
Personal Page - http://www.u-grenoble3.fr/kraif
> Olivier wrote:
>   
>> Alinea was evaluated on 'distant' language pairs...
>> These results were obtained without fine tuning of parameters, and show
>> that surface clues (sentence lengths and even identical chains) can be
>> useful, even for non-related languages that don't share the same
>> alphabet. Alinea allows improving these results by adding specific
>> linguistic data (bilingual lexicon, transliteration) or using a large
>> parallel corpus
>>     
>
> Are these results fairly insensitive to morphology?  I.e. does it matter
> whether you stem one or both sides?
>
>    Mike Maxwell
>    CASL/ U MD
>


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


