[Corpora-List] Query about the (dual) language of web pages
Mike Maxwell
maxwell at umiacs.umd.edu
Tue Oct 9 19:45:56 UTC 2007
P Resnik wrote:
> That's correct, Mike -- unfortunately I didn't have the resources to
> create an ongoing Web bitext mining operation,
To what extent could such an effort run by itself, once set up? I.e., is
it a candidate for some cloud computing effort? It might be assisted by
volunteers logging in to verify the language ID and to confirm that the
docs are indeed translations of each other.
> ...There are some folks trying to get Web-scale computing off the
> ground for the language research community (e.g.
> http://wacky.sslmit.unibo.it/doku.php)
I'm pretty sure Marco Baroni of that project is on this mailing
list--Marco, any thoughts?
--
Mike Maxwell
maxwell at umiacs.umd.edu
"Theorists...have merely to lock themselves in a room
with a blackboard and coffee maker to conduct their business."
--Bruce A. Schumm, Deep Down Things
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora