[Corpora-List] Is the TEI a waste of time?

Torzec Nicolas ATER LSI Nicolas.Torzec at enssat.fr
Tue Jul 1 14:51:02 UTC 2003


I agree with Sylvain Loiseau's remark that "It is surprising to see
how little software there is for TEI corpora".
Now that TEI-P4 (and TEI-Lite) provides a DTD for encoding TEI-conformant
corpora in XML, it is easier for people with a little background in
computer science to develop TEI-specific tools using existing (generic)
XML libraries.
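
As a rough illustration of that point, here is a minimal sketch of a
keyword-in-context concordancer built only on Python's generic xml.sax
module. It assumes that the text to search sits in <p> elements without
namespaces (as in TEI-P4); the names ParagraphCollector, concordance and
SAMPLE are hypothetical, chosen for this example, and are not part of any
existing TEI tool.

```python
import xml.sax


class ParagraphCollector(xml.sax.ContentHandler):
    """Collect the text of every <p> element from a TEI-like XML stream."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._depth = 0    # greater than 0 while we are inside a <p>
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "p":
            self._depth += 1

    def characters(self, content):
        # SAX may deliver text in several chunks, so buffer them.
        if self._depth:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "p":
            self._depth -= 1
            if self._depth == 0:
                # Join the chunks and normalize whitespace.
                text = " ".join("".join(self._buffer).split())
                self.paragraphs.append(text)
                self._buffer = []


def concordance(xml_text, keyword, width=2):
    """Return (left, keyword, right) tuples for each hit inside <p>."""
    handler = ParagraphCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    hits = []
    for para in handler.paragraphs:
        tokens = para.split()  # naive whitespace tokenization (assumption)
        for i, tok in enumerate(tokens):
            if tok.strip(".,;:!?").lower() == keyword.lower():
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((left, tok, right))
    return hits


# Hypothetical, heavily simplified TEI-P4-style sample (no DTD reference).
SAMPLE = (
    "<TEI.2><teiHeader><fileDesc><titleStmt><title>Sample</title>"
    "</titleStmt></fileDesc></teiHeader><text><body>"
    "<p>A corpus is not a bag of words.</p>"
    "<p>Every corpus needs tools.</p>"
    "</body></text></TEI.2>"
)

if __name__ == "__main__":
    for left, key, right in concordance(SAMPLE, "corpus"):
        print(f"{left:>15} [{key}] {right}")
```

Calling concordance(SAMPLE, "corpus") on this two-paragraph sample yields
one hit per paragraph, each as a (left context, keyword, right context)
tuple. A real tool would need proper tokenization and would have to deal
with markup nested inside <p>.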

Do we have to conclude:
- That no one uses the TEI standard nowadays (i.e. no one needs
TEI-specific XML tools to create, annotate, manage and exploit
TEI-conformant corpora)? :-(
- Or that everyone has developed their own TEI-XML-specific tools and
keeps them secret? ;-)

Personally, I am in the second group, but the tools that I have
developed are more "quick-and-dirty tools" (that's why I don't
publicize them) than "high-quality software"!

Is the TEI Software Page up to date? (Cf.
http://www.tei-c.org/Software/index.html)



Nicolas.

--
Nicolas TORZEC

ENSSAT / Université de Rennes 1
6, rue de Kerampont
22300 Lannion

Email: nicolas.torzec at enssat.fr
Tel: 02.96.46.27.30
Fax: 02.96.37.01.99
Web: http://www.enssat.fr
--

>
> Sylvain Loiseau wrote:
> >
> >
> > I agree with this idea. It is surprising to see how little software there
> > is for TEI corpora. The TEI is a waste of time only if the encoding is
> > under-exploited - which is a problem for the researcher, not for the TEI.
> > As G. Williams said, a minimal encoding with a hastily pasted header and
> > word-processor regex encoding of <p> takes only a few minutes. But there
> > is no public framework or set of tools for processing TEI corpora that
> > would make the encoding easy to exploit - such as a concordancer based on
> > a SAX stream, etc. Something like a set of classes for calling the parser,
> > SAX rewriting, etc., allowing one just to insert SAX handlers or XSLT
> > stylesheets in the pipeline, could be very useful. While XML always gains
> > ground when it normalizes both the standards and the software
> > methodologies, the TEI remains a pure standard.
> >
> > I think the TEI is obviously necessary for the view G. Williams defends - a
> > corpus is not a bag of words - and for interoperability, etc. But I agree
> > that the TEI is perhaps "out of date" on some points: there is nothing for
> > morphosyntactic or morphological encoding, text profiling, etc. The TEI
> > is perhaps still not sufficiently adapted to linguistic corpora. This
> > is quite obvious if we look at the projects listed on tei-c.org: they are
> > mainly philological uses of the TEI.
> >
> > Sylvain Loiseau


