[Corpora-List] ACL proceedings paper in the American National Corpus

Alexander Yeh asy at mitre.org
Fri Sep 27 19:36:07 UTC 2002


My apologies for any duplicate message. I am having mailer difficulties.

Nancy Ide wrote:

> The American National Corpus Consortium, with permission from the
> Association for Computational Linguistics, will include in the American
> National Corpus a selection of recent papers written by American authors
> and published in ACL proceedings and anthologies. Any authors who object
> to having their papers included in the American National Corpus should
> contact Nancy Ide (ide at cs.vassar.edu) to have their papers removed.
>
> Note that this applies to papers whose authors are native speakers of
> American English only.
>

Two questions: What is your definition of a native speaker? And how are you
going to determine who meets that definition?

This is not as trivial as it may sound. When I was in school, there were a
number of people who had spent their entire lives in the US (were born in
the US, etc.), but because their parents came from other countries and spoke
English as a second language, one speech recognition project did not
consider them "native" speakers.

Also, determining who meets whatever standard you have may be tricky. A
probably extreme example: I know of a person who was born in the US, has a
last name typical of country A, but is growing up in a household where
American English and a language from country B are spoken (the mother comes
from country B).

Thanks
-Alex Yeh


>
> =======================================================
>
> Nancy Ide
>
> Professor and Chair
> Department of Computer Science, Vassar College
> Poughkeepsie, NY 12604-0520 USA
> Tel: +1 845 437-5988 Fax: +1 845 437-7498
> ide at cs.vassar.edu
>
> Chercheur Associe
> Equipe Langue et Dialogue, LORIA/CNRS
> Campus Scientifique - BP 239
> 54506 Vandoeuvre-les-Nancy FRANCE
> Tel: +33 (0)3 83 59 20 47 Fax: +33 (0)3 83 41 30 79
> ide at loria.fr
>
> =======================================================


