[Corpora-List] Do we still need language corpora?
Angus B. Grieve-Smith
grvsmth at panix.com
Sat Feb 5 19:14:11 UTC 2011
On 2/5/2011 5:35 AM, Serge Sharoff wrote:
> Uni Oslo's noWaC is a case in point. Marco and his colleagues created a two-billion-word
> ukWaC and parsed it syntactically (using MaltParser); I added a
> BNC-like domain and genre annotation layer to it, so the web is at your
> fingertips.
... for existential, not distributional questions. If I understand
your description right, it's not a representative sample of anything, so
any percentages you find are not generalizable beyond the sample, or
perhaps beyond the Web.
--
-Angus B. Grieve-Smith
Saint John's University
grvsmth at panix.com
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora