Tools for on-line etymological dictionaries [was Re: Ancestor-descendant distance]
Jon Patrick
jonpat at staff.cs.usyd.edu.au
Mon Sep 6 02:01:21 UTC 1999
[ moderator changed Subject: header ]
Date: Sun, 22 Aug 1999 20:39:56 -0400 (EDT)
From: Sean Crist <kurisuto at unagi.cis.upenn.edu>
On Fri, 13 Aug 1999 ECOLING at aol.com wrote:
> We now have an international standard computer Code, Unicode,
> which contains most of the characters needed for transliteration
> (Latin-standard-based letters) and for phonetic transcription (IPA).
> It would be useful to try to establish a standard for Comparative
> Data sets, into which all existing computer data sets can be translated,
> so that the massive sets of data can be made available for studies
> such as this.
I agree totally. We're on exactly the same wavelength here.
I've looked into this a little and have tried to educate myself about
SGML, which would be an obvious candidate for marking up the data sets. I
don't know if there are any specific standard sets of SGML tags for
marking up dictionaries; if there are, it would probably make sense to
start with such a tag set, and extend it with whatever additional tags we
need to represent cognations between languages, etc.
If anyone on this list has any experience using SGML for such a purpose,
please write to me, because I'll need to be tackling this problem before
much longer!
The Text Encoding Initiative (TEI) has created a standard set of markup tags
for paper dictionaries. We have used them to convert paper dictionaries into
databases. However, I believe they will need to be extended before they can
be used for etymological dictionaries or, alternatively, for databases for
historical linguistics. This needs collaborative work among the people
interested in the problem.
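To make the kind of extension concrete, here is a minimal sketch in Python, offered only as an illustration. The elements entry, form, orth, sense, def and etym follow TEI dictionary markup; the cognate element and its lang and form attributes are hypothetical additions made up for this example and are not part of the TEI tag set. The function name cognates is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Illustrative entry only. <cognate> and its attributes are hypothetical
# extensions invented for this sketch; they are not TEI elements.
ENTRY = """
<entry>
  <form><orth>father</orth></form>
  <sense><def>male parent</def></sense>
  <etym>
    <cognate lang="Latin" form="pater"/>
    <cognate lang="Gothic" form="fadar"/>
  </etym>
</entry>
"""


def cognates(entry_xml):
    """Return (headword, [(language, form), ...]) for a single entry."""
    root = ET.fromstring(entry_xml)
    headword = root.findtext("form/orth")
    pairs = [(c.get("lang"), c.get("form"))
             for c in root.iterfind("etym/cognate")]
    return headword, pairs


headword, pairs = cognates(ENTRY)
for lang, form in pairs:
    print(f"{headword}: {lang} {form}")
```

Once cognate links are explicit in the markup like this, pulling them out of a converted dictionary is a short query rather than a manual search, which is the main reason to agree on the extra tags collaboratively.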
Jon
______________________________________________________________
The meaning of your communication is the response you get