Ancestor-descendant distance

Sean Crist kurisuto at unagi.cis.upenn.edu
Mon Aug 23 00:39:56 UTC 1999


On Fri, 13 Aug 1999 ECOLING at aol.com wrote:

> We now have an international standard computer Code, Unicode,
> which contains most of the characters needed for transliteration
> (Latin-standard-based letters) and for phonetic transcription (IPA).
> It would be useful to try to establish a standard for Comparative
> Data sets, into which all existing computer data sets can be translated,
> so that the massive sets of data can be made available for studies
> such as this.

I agree totally.  We're on exactly the same wavelength here.

I've looked into this a little and have tried to educate myself about
SGML, which would be an obvious candidate for marking up the data sets.  I
don't know if there are any specific standard sets of SGML tags for
marking up dictionaries; if there are, it would probably make sense to
start with such a tag set, and extend it with whatever additional tags we
need to represent cognate relationships between languages, etc.
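
To make the idea concrete, here is a minimal sketch of what such markup
might look like. It is written as XML (a restricted form of SGML) and built
with Python's standard xml.etree.ElementTree module. The element and
attribute names (lexicon, entry, form, gloss, cognate, target) are made up
for illustration and are not taken from any existing dictionary tag set;
the entry ids are arbitrary too.

```python
import xml.etree.ElementTree as ET


def make_entry(entry_id, lang, form, gloss, cognate_ids=()):
    """Build one dictionary entry; all tag names here are hypothetical."""
    entry = ET.Element("entry", id=entry_id, lang=lang)
    ET.SubElement(entry, "form").text = form
    ET.SubElement(entry, "gloss").text = gloss
    # Each <cognate> points at the id of a related entry in another language.
    for target in cognate_ids:
        ET.SubElement(entry, "cognate", target=target)
    return entry


# Two illustrative entries that reference each other as cognates.
lexicon = ET.Element("lexicon")
lexicon.append(make_entry("la-pater", "la", "pater", "father", ["got-fadar"]))
lexicon.append(make_entry("got-fadar", "got", "fadar", "father", ["la-pater"]))

# Serialize the whole data set to a Unicode string.
markup = ET.tostring(lexicon, encoding="unicode")
print(markup)
```

Linking entries by id rather than nesting them keeps each language's
dictionary separate, so cognate links could be layered on top of an
existing dictionary tag set if one turns out to exist.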

If anyone on this list has any experience using SGML for such a purpose,
please write to me, because I'll need to tackle this problem before
long!

  \/ __ __    _\_     --Sean Crist  (kurisuto at unagi.cis.upenn.edu)
 ---  |  |    \ /     http://www.ling.upenn.edu/~kurisuto/
  _| ,| ,|   -----
  _| ,| ,|    [_]
   |  |  |    [_]


