[Corpora-List] WORD ALIGNMENT: Does it exist?

Xu Jiajin ustcxujj at gmail.com
Wed Jun 1 14:28:15 UTC 2011


Hi all,



The other day, during an academic discussion, my colleague and I had a brief
debate on WORD ALIGNMENT while we were talking about bilingual text
alignment practices. When I was reviewing different levels of alignment, such
as sentence alignment and word alignment, I commented that WORD ALIGNMENT IS
A JOKE, as it is never really possible to align words. My comment was
immediately disputed by another professor. But I have not been convinced by
his counterargument so far.



In my mind, word alignment is not realistic, since it is impossible to find
a one-to-one correspondence between parallel texts at the word level. For
instance, it is quite likely that some words in a sentence are not translated
at all, being either left implicit or absorbed into other words,
constructions, idioms and so forth. I reckon this is also the case for
parallel texts in cognate languages.



But on second thought, the alignment of a selection of words, say, lexical
words or technical terms, across texts is not impossible. However, linguistic
alignment, as I see it, has to be exhaustive. In saying so, I actually
consider sentence alignment to be the canonical type of text alignment: each
sentence is aligned to one or more sentences in the target text, and vice
versa.



I am wondering whether there ARE word alignment implementations in practice.
I would appreciate any pointers to relevant literature or tools, as well as
clarification of the notion of alignment.



Perhaps I am simply ignorant, and word alignment has been a mature technology
for many years. Could anyone tell me what the main uses of word alignment
are? Bilingual lexicons? Any other applications?



Thanks in advance.



Cheers,



Jiajin XU

Ph.D., associate professor (discourse studies, corpus linguistics)

National Research Centre for Foreign Language Education

Beijing Foreign Studies University

Beijing 100089

China