Principled Comparative Method - a new tool

Jon Patrick jonpat at staff.cs.usyd.edu.au
Fri Sep 3 08:41:32 UTC 1999


[ moderator re-formatted ]

    Date:       Wed, 18 Aug 1999 11:56:48 EDT
    From:       X99Lynx at aol.com

    In a message dated 8/12/99 11:43:21 PM, jonpat at staff.cs.usyd.edu.au quoted:

Steve Long said
    The tools needed first should get the history right, so that the apparent
    relationships between languages are not merely artifacts.

I must respectfully disagree with you on this, Steve. The tools don't get
anything right or wrong, they just compute. If you are referring to the maxim
"garbage in, garbage out", though, I would agree entirely with you. The
linguists need to get the history right.

    jonpat at staff.cs.usyd.edu.au wrote further:

    <<Our data has to be the word set in the parent form (reconstructed words
    or real words) and then one word set for each daughter language and the
    set of phonological transformation rules between each parent and daughter
    for each word in their chronological sequence.>>

    I'm wondering if there isn't a possible flaw here in using <<the parent
    form (reconstructed words...)>>.

    Reconstructed words have already made assumptions about the relationship
    between the parent and the daughter languages.  In fact they are nothing
    but a presumed relationship between the daughter languages.

To amplify this point: the Chinese data was not a reconstruction; it used 3
documented languages. If you are using reconstructed languages, then the only
meaningful use of our tool is to identify which of two reconstructed
relative chronologies is more probable given the patterns in the data.

    <<If we have the cost of the messages for two parent-daughter pairs then
    the shorter cost represents the daughter that is closer to the parent. In
    the case of modern Cantonese and Beijing we got 35,243.58 bits and 36,790.93
    bits respectively, indicating Cantonese is closer to the common parent,
    Middle Chinese, than Beijing. >>

    Depending on how much reconstruction of the parent you used, could this not
    be an artifact of the reconstructions?

Yes, it is an artefact of the reconstruction, hence you need 2 reconstructions
for the numbers to be meaningful. The strength of the method is that it
incorporates not only the phonological changes but also the sequence of their
application over a statistically representative sample. That is, of course,
also a limitation on its application.
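
As a minimal sketch of the comparison described in the quoted passage, the
snippet below ranks daughters by message cost and reports the gap. The two
figures are the ones given in this thread; the function name and the
dictionary layout are illustrative assumptions, not part of the actual tool.

```python
# Message costs in bits for each parent-daughter pair, as quoted in this thread.
costs_bits = {"Cantonese": 35243.58, "Beijing": 36790.93}


def closer_daughter(costs):
    """Return the daughter with the shortest message and its lead in bits.

    Hypothetical helper: a shorter message is read as "closer to the parent".
    """
    ranked = sorted(costs.items(), key=lambda item: item[1])
    (best, best_cost), (_, next_cost) = ranked[0], ranked[1]
    return best, next_cost - best_cost


name, gap = closer_daughter(costs_bits)
print(f"{name} is closer to the parent by {gap:.2f} bits")
```

The gap only means something when both costs are computed against the same
parent word set, which is the point made above about needing two
reconstructions to compare.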

My request to linguists is to prepare and present your reconstructions not as
fragmentary elements of this word changing to that word, but rather as a
system represented by words changing in whatever ways you wish to assert, so
that we can get at the problem holistically, not piecemeal.

    In *PIE, certain aspects are considered the innovations of a particular
    daughter language because they do not appear in the other daughter
    languages, and are therefore factored out of the reconstruction.  If you
    only have two daughter languages - as you did above - how do you identify
    the innovation versus the original form in reconstruction?

If I understand "innovation" correctly, it has to be represented by a rule of
insertion from a null position. That's not a problem; it's just another rule
at a particular point in the Relative Chronology. The algorithm will process
it correctly.
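
To make the idea concrete, here is a small sketch of a relative chronology as
an ordered list of rules, where an insertion is simply a rule whose source is
null. The `Rule` class, the `derive` function, and the toy word are all
invented for illustration; they are not the tool's real data format and the
segments are not real linguistic data.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """One step in a word's relative chronology (hypothetical format)."""
    position: int        # index of the segment the rule applies at
    old: Optional[str]   # segment before the change; None means null (insertion)
    new: Optional[str]   # segment after the change; None means deletion


def apply_rule(segments: Sequence[str], rule: Rule) -> List[str]:
    out = list(segments)
    if rule.old is None:
        # Insertion from a null position: nothing is replaced.
        if rule.new is None:
            raise ValueError("a rule must change something")
        out.insert(rule.position, rule.new)
        return out
    if rule.position >= len(out) or out[rule.position] != rule.old:
        raise ValueError(f"{rule} does not match {out}")
    if rule.new is None:
        del out[rule.position]
    else:
        out[rule.position] = rule.new
    return out


def derive(parent: Sequence[str], chronology: Sequence[Rule]) -> List[str]:
    """Apply the rules in chronological order to get the daughter form."""
    form = list(parent)
    for rule in chronology:
        form = apply_rule(form, rule)
    return form


# Toy example: a substitution followed by an insertion at the end of the word.
parent = ["k", "a", "t"]
chronology = [Rule(0, "k", "h"), Rule(3, None, "e")]
daughter = derive(parent, chronology)
print(daughter)
```

Because the insertion is an ordinary entry in the list, its place in the
sequence is recorded just like that of any other change.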

    And if you decide in favor of one or the other in
    reconstruction, it will show up in any further use of that reconstruction.

And so one reconstruction has to pay the penalty of innovation and the other
not. That would favour placing the non-innovative language closer to its
parent than the innovator. That seems sensible all other things being equal,
or am I missing something?
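
A toy calculation shows why an extra rule carries a penalty. It assumes, purely
for illustration, that every rule is drawn uniformly from a fixed inventory, so
each one costs log2(inventory size) bits. This uniform code, the inventory
size, and the rule counts are assumptions of the sketch and not the cost
function used by the method.

```python
import math


def chronology_cost_bits(n_rules: int, inventory_size: int) -> float:
    """Bits needed to state n_rules, each picked uniformly from the inventory."""
    return n_rules * math.log2(inventory_size)


INVENTORY_SIZE = 64  # hypothetical number of distinct rules

# Two daughters that differ only in that one has an extra insertion rule.
conservative = chronology_cost_bits(10, INVENTORY_SIZE)
innovator = chronology_cost_bits(11, INVENTORY_SIZE)
penalty = innovator - conservative

print(f"conservative: {conservative} bits, innovator: {innovator} bits")
print(f"penalty for the extra rule: {penalty} bits")
```

All other things being equal, the daughter with the extra rule gets the longer
message and so is placed further from the parent.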

    In effect, you may to some degree be measuring how the relationship between
    the daughters has been perceived in the reconstructions that you use, as
    much as anything else.

Can we measure anything else? Is the reconstruction ever saying anything other
than "this is the relationship between mother and daughter"?

    I would think that the method you describe would be much more functional if
    it at least triangulated daughter languages.  And avoided using prior
    reconstructions - proving itself on its own, so to speak.

Yes, I appreciate the requirement you are placing here, but I don't think it is
something that can be done. The question "which language is closer to its
parent?" is well-formed and answerable by our method, regardless of the number
of languages involved (though the persuasive power of the answer may vary with
the quality of the dataset). However, the question "how close are these
languages?" is not well-formed (that is, not answerable from the data), in the
sense that any attempt to measure their similarities or differences that
ignores their changes from their point of commonality is not modelling the
processes they have gone through. I work through a mindset of "well-formed
questions", which conditions the way I look at data and attempt to analyse
it. My perspective doesn't always suit people, particularly the way it limits
the answerable questions, but I find it useful.

I think we have demonstrated the usefulness of our concepts and methods on
the Chinese data, but others may require more complex tests. We will co-operate
if it is at all humanly possible.
cheers

Jon
______________________________________________________________
The meaning of your communication is the response you get


