[Corpora-List] Using MTurk for markup tasks (was Cost of part

radev at umich.edu radev at umich.edu
Tue Dec 26 21:51:42 UTC 2006


> 
> Alexandre Rafalovitch wrote:
> > An interesting approach would be to use Amazon Mechanical Turk for
> > these kinds of tasks.
> > ...
> > Has anybody else given a thought to this?
> 
> Don't know what languages you're interested in.  I have thought about 
> "wikifying" other sorts of projects (like finding and keeping track of 
> on-line computational resources, or building bilingual text collections) 

Have you looked at www.aclweb.org/aclwiki ?

Drago

> for "low density" languages.  I have never actually tried this, but it 
> may be instructive to look at the languages for which there are 
> substantial Wikipedia and Wiktionary resources.  Last time I looked, the 
> usual suspects (the major and some "minor" European languages, plus 
> Japanese) had at least 100k Wikipedia articles, while there was a 
> slightly wider variety of languages with at least 10k Wikipedia articles 
> (including Arabic (= MSA), Persian, Hebrew, Bahasa Indonesia, Korean, 
> Malay, Thai, Turkish and Chinese).  For comparison, the English 
> Wikipedia has 1.5 million articles.
> 
> My guess is that "wikification" (in which I include Amazon Mechanical Turk) 
> will work best for languages where there are a substantial 
> number of speakers with idle time, sufficient income to afford the 
> computer and network connection, and sufficient education for the 
> specific annotation task.
> -- 
> 	Mike Maxwell
> 	maxwell at umiacs.umd.edu
> 
> 
> 


-- 
Dragomir R. Radev                    Associate Professor
SI, CSE, Ling                     U. Michigan, Ann Arbor 
http://www.eecs.umich.edu/~radev         radev at umich.edu              


