[Corpora-List] "Language Immersion for Chrome", and a Better Idea

Tue May 15 12:03:20 UTC 2012

Dear Ziyuan Yao

I would like to applaud your idea! A Japanese teacher of English, Teruhiko Kadoyama, who was a distance-learning
MA student of mine at Birmingham University, used English subtitles from Hollywood movies very effectively to
help his students to learn a) onomatopoeic words (e.g. bark, quack, chirp, twitter) and b) a range of verbs of motion
for similar actions (e.g. walk, stroll, amble, dash, rush, scramble). He reported on his experiments in his
MA Dissertation in 1999.

However, I don't know whether he has conducted any further experiments, published his
Dissertation or other papers on this topic, or developed any software for the purpose.
It may be worth doing a Google search on his name to check. I had a quick look
and discovered that he is now a Professor at Hiroshima International University and President
of the Society for Teaching English Through Media:
http://www.stemedia.co.kr/menu_1_1.htm?searchkey=&searchvalue=&page=1&board_seq=79&mode=read

best wishes
Ramesh

Ramesh Krishnamurthy
Visiting Academic Fellow, School of Languages and Social Sciences, Aston University, Birmingham B4 7ET

Date: Tue, 15 May 2012 04:41:24 +0800
From: Ziyuan Yao <yaoziyuan at gmail.com>
Subject: [Corpora-List] "Language Immersion for Chrome", and a Better Idea
To: corpora at uib.no

Google's "Language Immersion for Chrome"

Recently a Chrome browser extension called "Language Immersion for Chrome" has been much publicized. Developed by "Use All Five Inc." on behalf of Google, the extension translates certain words and phrases on the Web page you're browsing to a foreign language via Google Translate, for the purpose of helping you learn that foreign language while browsing the Web.

I have been researching this kind of thing for years, and one of my main positions is that machine translation shouldn't be used in serious language learning because it is error-prone: it takes a learner a great effort to memorize a piece of erroneous knowledge, another great effort to "unlearn" this wrong knowledge, and yet another great effort to "relearn" the right knowledge.

But I do understand that online machine translation services like Google Translate and Bing Translator are so readily available that using them directly for the translation minimizes development costs. Upon seeing this news, I asked myself: "Can we use a kind of freely available, manually prepared data, instead of machine translation, to do this better?" And the answer is YES!

A Better Idea

Imagine we have a database of manually translated bilingual sentence pairs (such as the multilingual movie subtitle files found on subtitle websites), e.g.

        (German)  Er ist ein guter Schüler.

        (English) He is a good student.

Now suppose a German wants to learn English and happens to be browsing a German Web page that contains the German word "Schüler" (student), and the computer finds that this word also occurs in a bilingual sentence pair like the one above. The computer can then teach the English for this German word by inserting that sentence pair into the Web page, like an embedded advertisement. This way, the German will learn the English word "student", and better yet, learn it in a bilingual sentence pair! This means he will learn not only the word "student" by itself, but also its syntax, semantics and pragmatics, all implied by this example sentence. As to phonetics, the computer can use text-to-speech to read the English sentence aloud, or display some kind of pronunciation guide above or alongside the English sentence (see my recent project "Phonetically Intuitive English" for such a pronunciation aid:

https://sites.google.com/site/phoneticallyintuitiveenglish/).
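The lookup step described above could be sketched roughly as follows. This is only an illustration under assumptions of my own: the sample data in PAIRS and the function names (tokenize, build_index, find_examples) are hypothetical, the pairs are held in memory rather than in a real database, and words are matched by exact lowercase form with no handling of inflection.

```python
import re
from collections import defaultdict

# Hypothetical sample data; a real system would load manually translated
# pairs, for example from aligned subtitle files.
PAIRS = [
    ("Er ist ein guter Schüler.", "He is a good student."),
    ("Das Wetter ist heute schön.", "The weather is nice today."),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def build_index(pairs):
    """Map each source-language word to the positions of pairs containing it."""
    index = defaultdict(list)
    for i, (source, _target) in enumerate(pairs):
        for word in set(tokenize(source)):
            index[word].append(i)
    return index


def find_examples(page_text, pairs, index, stopwords=frozenset()):
    """Return {word: (source, target)} for page words found in the database.

    Only the first matching pair is used for each word; words listed in
    `stopwords` are skipped.
    """
    examples = {}
    for word in tokenize(page_text):
        if word in stopwords or word in examples:
            continue
        hits = index.get(word)
        if hits:
            examples[word] = pairs[hits[0]]
    return examples


if __name__ == "__main__":
    index = build_index(PAIRS)
    for word, (de, en) in find_examples("Jeder Schüler lernt gern.", PAIRS, index).items():
        print(word, "|", de, "|", en)
```

In a browser extension, the returned pairs would then be inserted next to the matching word on the page; that part is left out here.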

That's the basic idea. But of course we can further refine this idea.

For example, if there are multiple bilingual sentence pairs containing "Schüler", the computer can prefer a pair that contains words that appear near "Schüler" on the Web page (i.e. context words). This would be very useful if the word in question (Schüler) is ambiguous.
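One simple way to realize this preference is to count how many words each candidate sentence shares with the text surrounding the word on the page, and to pick the candidate with the largest overlap. The sketch below is a rough illustration only: the function name pick_pair, the window size and the sample sentences are my own assumptions, and a practical version would probably ignore very common function words.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def pick_pair(word, page_text, pairs, window=5):
    """Pick the pair whose source sentence best matches the page context.

    The context is the set of up to `window` words on each side of the
    first occurrence of `word` in `page_text`. Returns None if the word
    is not on the page or no pair contains it.
    """
    tokens = tokenize(page_text)
    word = word.lower()
    if word not in tokens:
        return None
    pos = tokens.index(word)  # first occurrence only
    context = set(tokens[max(0, pos - window):pos] + tokens[pos + 1:pos + 1 + window])
    candidates = [p for p in pairs if word in tokenize(p[0])]
    if not candidates:
        return None
    # Score each candidate by the number of context words it shares.
    return max(candidates, key=lambda p: len(context & set(tokenize(p[0]))))


# Hypothetical sample data: "Bank" can mean "bench" or "bank".
BANK_PAIRS = [
    ("Ich sitze auf der Bank im Park.", "I am sitting on the bench in the park."),
    ("Die Bank gibt mir einen Kredit.", "The bank is giving me a loan."),
]

if __name__ == "__main__":
    print(pick_pair("Bank", "Ich brauche einen Kredit von meiner Bank.", BANK_PAIRS))
    print(pick_pair("Bank", "Im Park steht eine alte Bank.", BANK_PAIRS))
```

With these sample sentences, a page about a loan selects the "bank" pair and a page about a park selects the "bench" pair.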

Besides bilingual sentence pairs, we may also explore multilingual data from Wiktionary and Wikipedia, although their usage may not be as straightforward as the model discussed above. I leave this as homework for the reader.

I also intend to develop a Chrome extension based on the idea discussed above :-)

Best Regards,

Ziyuan Yao

https://sites.google.com/site/yaoziyuan/
