[Corpora-List] Difference in POS tag distribution in different genres
Phil Gooch
philgooch at gmail.com
Tue Dec 18 11:12:39 UTC 2012
I've done a bit of work on analysis of the distribution of pronouns in
clinical narratives (discharge summaries, progress notes and lab reports),
and how this can help with protagonist identification and coreference
resolution. I don't know if this is of interest to you, but I can point you
to a paper and the relevant chapter of my PhD thesis if you'd like to know
more.
Phil
On Mon, Dec 17, 2012 at 2:52 PM, Trevor Jenkins <
trevor.jenkins at suneidesis.com> wrote:
> On 17 Dec 2012, at 03:24, Adam Kilgarriff <adam at lexmasterclass.com> wrote:
>
> > > more proper nouns in newspaper text than in fiction
> >
> > certainly true. In general, the more formal/informational a text is,
> > the more nominal, with more nouns, adjs/determiners; the more
> > informal/interactional, the more verbs and pronouns. Fiction and newspaper
> > are noteworthy for past tenses and 3rd-person pronouns.
>
> Interesting, but does that hold for all other languages? For example,
> signed languages, and specifically British Sign Language. There are several
> signed language corpora now; do those support this assertion? And what about
> those written languages that, like BSL, do not have a tense system but use
> time markers instead?
>
> Regards, Trevor.
>
> <>< Re: deemed!
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>