[Corpora-List] free manually POS-tagged corpus

Thomas Proisl thomas.proisl at fau.de
Thu Dec 6 07:03:58 UTC 2012


Hi Alisa,

You might want to take a look at the Manually Annotated Sub-Corpus
(MASC) of the American National Corpus
(http://www.americannationalcorpus.org/MASC/Download.html). MASC I is
already available and – to quote from the website – consists of “80K
words of data with validated annotations for token, part of speech,
sentence boundary, noun chunks, verb chunks, named entities, and Penn
Treebank syntax; and full-text FrameNet annotation for seventeen texts.”

Best regards,
Thomas


-- 
FAU Erlangen-Nürnberg
Department Germanistik und Komparatistik
Professur für Korpuslinguistik
Bismarckstr. 6, 91054 Erlangen

Fon: +49 9131 85-25908; Fax: +49 9131 85-29251
http://www.linguistik.uni-erlangen.de/~tsproisl/