[Corpora-List] quantities of publicly available parallel text?

Michael Maxwell maxwell at umiacs.umd.edu
Wed Feb 27 15:15:47 UTC 2008


Alexandre Rafalovitch wrote:
> I have more information available, if somebody takes an interest.

I would be very interested in having some place to go for information
of this sort, whether general (like my previous message on this thread) or
specific to a particular language.  The LDC cataloged the information on
"found" resources for a few LoDLs, but the page seems to have been taken
down.  I have some pointers at
http://www.netvouz.com/mcswell/folder/4234597228659620420/Languages, but I
have not attempted to keep it up-to-date, and in most cases I don't know
the languages, so some of the pointers are questionable.  A number of
other people have made similar catalogs, some of which are pointed to from
my page.

OLAC is of course another catalog, but it's really intended to catalog
resources available from formal archiving institutions, I believe.  While
it would be good if everyone put their resources in such places, it isn't
happening, and the archives might be overwhelmed if everyone started doing
this.

One place where general information on resources (both found and created)
could be tracked--maybe the best--is the ACL wiki:
http://aclweb.org/aclwiki/index.php?title=List_of_resources_by_language

   Mike Maxwell
   CASL/ U MD



_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


