[Corpora-List] Query about the (dual) language of web pages
Michael Maxwell
maxwell at umiacs.umd.edu
Tue Oct 9 16:16:35 UTC 2007
> Everyone is aware that some languages/cultures (e.g. Swedish,
> Finnish) tend to have alternative webpages in English, while others
> (e.g. Arabic) are much less likely to.
> Does anyone have any reliable figures as to the frequency of
> appearance of these parallel-corpora (in English) for different
> (source) languages? I am interested at the moment in:
> Japanese, Chinese, Korean, Spanish, Portuguese, French, German,
> Italian, Arabic
...and I would be interested in similar figures for all languages. (The
parallel text doesn't need to be in English in my case; it might be in,
e.g., Spanish or Russian.)
Another thought: is there any place that actually tracks these sorts of
pages? I know Phil Resnik was collecting some of this in the past
(http://umiacs.umd.edu/~resnik/strand/), but I don't believe he is
actively doing so now.
Mike Maxwell
CASL/ U MD
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora