Daniel Ross
Tue Nov 14 00:47:22 UTC 2017

I do not see Peter's approach (or mine for 'family') as disjoint. Instead,
I see it as abstract and variably interpreted in each language. Thus
subjects, or words, or families, or whatever, must be understood like
patterns or behaviors, rather than basic natural kinds.

I would think that, for example, those who like the approach to
categorization of Cognitive Grammar would appreciate the idea that
categories like "word" are emergent in particular grammars but broadly
similar based on our human nature. The concept of "family" differs across
cultures, but most or all humans have some interest in (their version of)
that concept.

Similarly, many languages seem to organize their grammars with some level
between morpheme and phrase. In fact, that does not seem unexpected to me
at all: phrases are loosely connected, and morphemes are not divisible.
Having an emergent level where there is a tight connection seems like a
completely natural development to me for complex systems like languages. So
most or all languages have something like words, but actually DEFINING what
a "word" "is" does not come easily because the ways in which languages "DO
words" are variable.

I'm beginning to wonder if typological categories are best thought of as
actions that languages do, rather than structures or properties that
languages "HAVE".

(I have some more thoughts expanding on that (thinking out loud), but I will
attach those as a text document rather than making this email excessively
long.)

And having said that, I will now go back to applying my comparative
concepts to a language sample to see which ones have which features-- a
good starting point at least.


On Mon, Nov 13, 2017 at 1:27 PM, Martin Haspelmath

> I think we can distinguish two broad kinds of situations:
> (1) where a comparative concept is defined by a single criterion or a set
> of simultaneously necessary and sufficient criteria (e.g. "dative", defined
> by the criterion of flagging the recipient)
> (2) where a comparative concept is defined by multiple disjunctive criteria
> I have been arguing that we should adopt definitions of the former type,
> but Peter Arkadiev is now arguing that the latter should also be accepted.
> Well, maybe he is right, but I think that disjunctive definitions are
> acceptable only if the criteria are independently known to correlate
> tightly.
> For example, let's assume that we know that people with a high income very
> often drive expensive cars and strongly tend to have gardeners. Then we can
> define a sociological category "rich person" disjunctively: Someone who
> either (i) has a high income, or (ii) drives an expensive car, or (iii) has
> a gardener (or several of these simultaneously). Intuitively, this sounds
> reasonable.
> But we could also create disjunctively defined concepts based on criteria
> that don't correlate. For example, we could set up a sociological category
> "meggle person" defined as someone who either (i) owns a Huawei smartphone,
> or (ii) likes Mendelssohn music, or (iii) works as a taxi-driver on
> weekends (or several of these simultaneously). Intuitively, this sounds
> crazy – though the impression that such features correlate might arise
> through some accident (e.g. if there were a movie where two or three meggle
> people play a role).
> So I would say that before we can accept a disjunctive definition of a
> comparative concept, it must be shown that the different criteria correlate
> very significantly. I don't think this has been shown for stereotypical
> subject properties, or for stereotypical word properties – though it may
> well be that it's true (I have no clear intuitions).
> So basically what I'm arguing is that we shouldn't rely on stereotypical
> property clusters, but we should investigate whether the properties do
> indeed cluster. (Recall that typology originated in German Romanticism,
> which was closely linked to nationalism, and eventually to racism – and we
> all know that we should not trust racial stereotypes, though some of the
> supposed correlations may turn out to be correct after they are
> investigated.)
> Best,
> Martin
> On 13.11.17 21:34, Peter Arkadiev wrote:
> Dear Daniel, dear all,
> that was an excellent point, and the analogy to 'family' defined relative
> to particular culture is very lucid. This is precisely how I believe many
> comparative linguistic notions can (or should) be defined -- relative to
> language. Take the notorious notion of subject, which is defined by some as
> "the privileged syntactic argument (by whatever criteria there are in
> particular languages that make one of their arguments privileged)". I may
> be wrong, but this seems to be the definition of subject in Role and
> Reference Grammar. Of course, for those who believe that comparability
> requires identification, this is a bad comparative concept, since in
> principle it does not exclude the possibility that there are two languages
> whose subjects have nothing in common. But still this is a workable concept
> allowing typologists to ask reasonable questions, e.g.:
> 1) Are there languages where subjects in this sense cannot be singled out,
> and if yes, for what reasons? As far as I know, there are linguists who
> claim that the answer to this question is "yes", therefore the concept is
> not vacuous.
> 2) What are the grammatical properties that languages with subjects thus
> defined employ to render them privileged as opposed to other arguments?
> Well, much of the grammatical relations typology is just about this.
> 3) Do subjects thus defined cross-linguistically correlate with certain
> admittedly universally applicable comparative concepts such as "agent" or
> "topic" and is there a common "core" to subjects in all languages? Note
> that under the definition proposed, this becomes an empirical question with
> a potentially negative answer, rather than being built into the definition a
> priori.
> I think it is possible to define words, affixes, clitics etc. in such a
> way and get consistent and interesting results.
> Best regards,
> Peter
> --
> Peter Arkadiev, PhD
> Institute of Slavic Studies
> Russian Academy of Sciences
> Leninsky prospekt 32-A 119991 Moscow
> peterarkadiev at yandex.ru
> http://inslav.ru/people/arkadev-petr-mihaylovich-peter-arkadiev
> --
> Martin Haspelmath (haspelmath at shh.mpg.de)
> Max Planck Institute for the Science of Human History
> Kahlaische Strasse 10
> D-07745 Jena
> &
> Leipzig University
> IPF 141199
> Nikolaistrasse 6-10
> D-04109 Leipzig
Extended thoughts on languages DOING things rather than HAVING things, and how we can compare them:

One argument in favor of this approach is that a single language may "HAVE" multiple, even contradictory properties, such as would be the case for languages that are both "ergative languages" and "accusative languages" with split ergativity. For "words", in some languages we observe that verbs may be polysynthetic while nouns are hardly inflected at all, for example.

Consider that describing synthesis as something that languages DO is relatively easy: "putting morphemes tightly together below the phrasal level". Since the result is a continuum, there are no inherent boundaries. We cannot, for example, answer whether languages "HAVE polysynthesis" without an arbitrary comparative concept. To take a more complicated example, many or most languages express stress on certain syllables (whether lexical, grammatical, pragmatic, etc.), but current research suggests that the acoustic properties involved are disjoint to some degree, including pitch (F0), duration, loudness, vowel/phoneme quality, etc. Not all languages use all of those (and they might sometimes vary within a language). But it does not seem to me to be problematic to say that languages DO stress. I'd even be tempted to consider "stress" a "natural kind", although it isn't the sort we can easily define, but rather a type of behavior languages do (marking prominence).

In David's statistical approach, patterns emerge from the data in terms of how some morphemes seem to be bound and others free. Bimodal distributions tell us something about the nature of the pattern (it isn't just a continuum, but there is some sort of attraction to certain regions), and we could think of languages as extremely complex N-dimensional vector arrays in which there are multi-modal distributions like that-- in the case of "words" the peaks (local maxima) would be across many dimensions, even "disjoint" in Martin's terms, and sometimes very hard to identify. But still, those peaks can be meaningful. It is necessary that languages have peaks (or they'd just be noise), but I would say that human nature (via Cognitive Grammar, or even UG if you prefer that) would predict that broadly similar peaks would emerge in different languages, and that is indeed what we observe. "Subjects" as "main arguments" are useful in human communication, so we often have them. (Or "words", as explained above.) Other times, borderline cases may be due to there not being a bimodal (or multi-modal) distribution in a particular dimension, or just because we look at the (relatively few) cases in the middle.
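
[To make the idea of "peaks" in a single dimension a bit more concrete, here is a rough sketch in Python. Everything in it is my own assumption for illustration: the "boundness" scores are invented, not data from David's study, and the names histogram and find_peaks are hypothetical helpers, not anyone's actual method. It only shows that a bimodal set of scores yields two local maxima while a flat continuum yields none.]

```python
def histogram(scores, n_bins=5):
    """Count how many scores in [0, 1] fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for s in scores:
        # Clamp the top edge so that a score of exactly 1.0 lands in the last bin.
        idx = min(int(s * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


def find_peaks(counts):
    """Return indices of bins strictly higher than both neighbours.

    Positions outside the histogram are treated as empty (count 0).
    Flat plateaus are deliberately not reported as peaks.
    """
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < len(counts) - 1 else 0
        if c > left and c > right:
            peaks.append(i)
    return peaks


# Invented "boundness" scores for 20 morphemes (0 = fully free, 1 = fully bound).
# Most cluster near one end or the other; a few sit in the middle.
bimodal_scores = [0.05, 0.08, 0.11, 0.12, 0.14, 0.15, 0.17, 0.18, 0.22, 0.25,
                  0.45, 0.55,
                  0.78, 0.81, 0.83, 0.84, 0.86, 0.87, 0.89, 0.95]

# Invented scores spread evenly across the scale: a pure continuum.
flat_scores = [0.1, 0.3, 0.5, 0.7, 0.9] * 4

print(histogram(bimodal_scores))               # [8, 2, 2, 1, 7]
print(find_peaks(histogram(bimodal_scores)))   # [0, 4]
print(find_peaks(histogram(flat_scores)))      # []
```

[The two bins in the middle of the first example correspond to the "relatively few cases in the middle" mentioned above. A real analysis would need many dimensions at once and a proper statistical test for multimodality; this is only a one-dimensional toy.]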

I think the ultimate goal of typology then would be to understand the variation in those emergent peaks. We can compare them. We can also look for differences in distributions of peaks. (Do some languages have more than others? Do some seem to be in complementary distribution?) Comparative concepts as labels help us do that, but they do not "define" the peaks, because they are not "things" but rather the results of things that languages DO.

[Problematically, I don't know at this point how the N dimensions would be structured (are they layered?), nor whether they would be universal (like the theorized list of universal constraints in Optimality Theory). But even if we cannot identify all of the dimensions, we do seem to be able to identify peaks in them, which we often label with comparative concepts, so maybe we could work backwards from there.]

So the questions to ask are:
1. What do languages do?
2. In what ways do they do them?
3. How do those ways vary?

There is no room for a question of the form "Does language X have Y?" If we add comparative concepts as convenient labels for us as typologists, we can behave as if there are such "things" that languages "have", but unless it can be shown that natural kinds exist (as Martin so strongly argues against), the reality is that languages actually do DO things, but our labels are really for our own convenience and simplify reality.

I am, however, still hopeful that there really are "things going on" in languages and that we can do a good job of discovering what they are. Not just with arbitrary comparative concepts, but by showing that those comparative concepts correspond to the things languages DO. As we keep repeating that process we will get better and better models of language.

