Corpora: XML programmes and tagging

Gabriella Rundblad G.Rundblad at uea.ac.uk
Fri May 19 09:50:38 UTC 2000


Dear all,

Despite having used language corpora for some years, I've
never put together my own corpus. Until now.

I'm considering putting together a corpus of Middle English
using text that is already available electronically, and
tagging it to enable searches. I shall be attending the
Oxford summer seminars on digital resources etc. to learn
more, but would like to address some of the issues now and
perhaps run some tests to see whether my idea is feasible
at all.


1) As far as I understand, XML is now the recommended format
for tagging. For this I'll need user-friendly programme(s);
the question is which. I know there are freeware, shareware
and commercial products out there, though I haven't tried
any of them (yet) and don't know how user-friendly they
are. I know HTML and use Hotmetal Pro for it (great!), and
there is obviously an XML equivalent (XMetal). Could you
advise which programme(s) to use? Is XMetal good for a
never-before-tagger?

2) The tagging I would like to do (I'm reading up on TEI
etc.) is of phrases and clauses, not parts of speech. What
has been done on this before? Are there any existing lists
of tags?


Grateful for all the advice you can offer.


Gabriella Rundblad


University of East Anglia
School of Language, Linguistics and Translation Studies
Norwich NR4 7TJ
UK


