kurisuto at unagi.cis.upenn.edu
Mon Aug 23 00:39:56 UTC 1999
On Fri, 13 Aug 1999 ECOLING at aol.com wrote:
> We now have an international standard computer Code, Unicode,
> which contains most of the characters needed for transliteration
> (Latin-standard-based letters) and for phonetic transcription (IPA).
> It would be useful to try to establish a standard for Comparative
> Data sets, into which all existing computer data sets can be translated,
> so that the massive sets of data can be made available for studies
> such as this.
I agree totally. We're on exactly the same wavelength here.
I've looked into this a little and have tried to educate myself about
SGML, which would be an obvious candidate for marking up the data sets. I
don't know if there are any specific standard sets of SGML tags for
marking up dictionaries; if there are, it would probably make sense to
start with such a tag set, and extend it with whatever additional tags we
need to represent cognations between languages, etc.
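
A minimal sketch of what such markup might look like, under loud assumptions: the tag names (dataset, entry, form, gloss, cognate) and the id/lang/ref attributes are hypothetical and not taken from any published tag set, and the output is XML (a restricted form of SGML) only because Python's standard library can write it directly. Each dictionary entry carries a Unicode form and a gloss, and cognation is expressed as a reference to another entry's id.

```python
import xml.etree.ElementTree as ET

# All element and attribute names below are hypothetical, chosen only
# for illustration; they do not come from an existing tag set.

def make_entry(entry_id, lang, form, gloss, cognate_ids=()):
    """Build one dictionary entry with optional cognate cross-references."""
    entry = ET.Element("entry", {"id": entry_id, "lang": lang})
    ET.SubElement(entry, "form").text = form      # headword, stored as Unicode
    ET.SubElement(entry, "gloss").text = gloss    # English gloss
    for target in cognate_ids:
        # A cognate link points at the id of an entry in another language.
        ET.SubElement(entry, "cognate", {"ref": target})
    return entry

def build_dataset():
    """Assemble a tiny three-language data set for the word 'father'."""
    root = ET.Element("dataset")
    root.append(make_entry("la-pater", "la", "pater", "father",
                           ["grc-pater", "got-fadar"]))
    root.append(make_entry("grc-pater", "grc", "πατήρ", "father",
                           ["la-pater", "got-fadar"]))
    root.append(make_entry("got-fadar", "got", "fadar", "father",
                           ["la-pater", "grc-pater"]))
    return root

def dangling_refs(root):
    """Return cognate references that point at no entry in the data set."""
    ids = {e.get("id") for e in root.iter("entry")}
    return [c.get("ref") for c in root.iter("cognate")
            if c.get("ref") not in ids]

if __name__ == "__main__":
    dataset = build_dataset()
    print(ET.tostring(dataset, encoding="unicode"))
    print("dangling references:", dangling_refs(dataset))
```

Keeping the cognate links as references by id, rather than nesting the related entries, lets each language's dictionary stay a separate file while a check like dangling_refs confirms that the cross-references still resolve after the files are merged.
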
If anyone on this list has any experience using SGML for such a purpose,
please write to me, because I'll need to be tackling this problem before
--Sean Crist (kurisuto at unagi.cis.upenn.edu)
http://www.ling.upenn.edu/~kurisuto/