[An-lang] AN corpora

Andy Pawley apawley at coombs.anu.edu.au
Mon May 31 02:03:40 UTC 2004


In response to Ross Clark's note, there is at least one electronic
corpus of Samoan with frequency analysis. This was compiled by
Galumalemana Alfred Hunkin for his 2001 MA thesis: A Corpus of
Contemporary Colloquial Samoan, in the School of Linguistics and
Applied Linguistics, Victoria University of Wellington.  The corpus
consists of about 300,000 words, made up of 300 samples of spoken and
written Samoan. Mr Hunkin <Alfred.Hunkin at vuw.ac.nz> teaches Samoan at
Victoria U. Wellington.

Andy Pawley

>Someone asked me whether there are word frequency statistics available for
>Samoan, such as exist for English and other big languages. I think probably
>not, and further it occurred to me that such statistics depend on a corpus
>of the language in question -- nowadays assumed to be computer-searchable.
>Corpus linguistics seems to be pretty trendy in English right now. But I
>wonder whether there are comparable bodies of text for any Austronesian
>languages? At one time the Maori Studies people here had at least the
>beginnings of one, and I believe the Maori Newspapers project aims
>eventually to have a searchable online corpus. Any other news?
>
>Ross Clark
>_______________________________________________
>An-lang mailing list
>An-lang at anu.edu.au
>http://mailman.anu.edu.au/mailman/listinfo/an-lang