Corpora: Welsh Corpus - DTD?

MIT2USA at aol.com
Thu Sep 7 23:05:49 UTC 2000


>Subject: Corpora: Welsh Corpus - DTD?
>Date: 9/6/00 3:42:49 W. Europe Daylight Time
>From: peter.littlechild at swift.com (Peter Littlechild)
>
>As part of an initiative to create a corpus of written Welsh
>I'm investigating means to create a machine-processable form
>of the Welsh Bible. This text is heavily enriched with
>cross-references, markers indicating dubious etymologies,
>missing words, and other things I haven't fathomed yet. It's
>the perfect justification for using SGML/XML.
>
>Does anyone know of previous work on marked-up versions of
>bibles? Are there special DTDs or some appropriate subset of
>the TEI DTD, for example?


The following people have worked on Biblical corpora for language engineering
purposes and would be best qualified to answer your question:


Mark Olsen, ARTFL Project, University of Chicago
mark at gide.uchicago.edu
http://estragon.uchicago.edu/Bibles/

Please note that the ARTFL project site indicated above carries the
following statement:
"The Humanities Text Initiative no longer distributes the SGML version of
the Luther Bible, on request of the copyright holder."

This implies that someone has already developed a DTD for at least one
Bible text. Hopefully Mark Olsen can be of assistance.


Philip Resnik
resnik at umiacs.umd.edu
http://www.umiacs.umd.edu/users/resnik/
http://benjamin.umd.edu/parallel/


Mari Olsen
molsen at microsoft.com
http://www.umiacs.umd.edu/~molsen


Mark Davies
The Polyglot Bible project
http://mdavies.for.ilstu.edu/personal/polyglot.htm


Marilyn Mason
marilinc at aol.com
http://hometown.aol.com/mit2haiti/Orthography.htm


Gary Simons
gary_simons at sil.org
http://www.sil.org/computing/noc/Vol13/134academic.htm


Jeff Allen
mit2ceo at aol.com


Also note that several reference articles on Bible corpora in language
engineering were cited in the following Corpora list messages:


Date: Tue, 2 Mar 1999 11:34:37 -0500 (EST)
From: Philip Resnik
To: rykov at iling.msk.su
CC: corpora at hd.uib.no
Subject: Re: Corpora: corpora history revisited

and

Date: Tue, 02 Mar 1999 16:57:35 +0100
To: corpora at hd.uib.no
From: Jeff ALLEN
Subject: Corpora: corpora history revisited - Bible corpus


The Corpora list messages above can be accessed at:
http://helmer.hit.uib.no/corpora/


Best regards,

MIT2 team

********************
Mason Integrated Technologies Ltd (MIT2)
P.O. Box 181015
Boston, MA  02118  USA
(617) 247-8885 (office & answering machine)
(617) 262-8923 (FAX)
MIT2USA at aol.com (e-mail)
Mason Integrated Technologies Ltd Home Page:
   http://hometown.aol.com/mit2usa/Index2.html
MIT2 Products Page:
   http://hometown.aol.com/mit2haiti/Products.htm
Meet the MIT2 Team:
   http://hometown.aol.com/mit2usa/CorpOff.htm
MIT2's Strategic Partners:
   http://hometown.aol.com/mit2usa/Partners.htm


