Bernhard Wälchli has raised a number of important issues. Here are my thoughts about them.

Bernhard said that a statement like "Lithuanian has ideophones" does not provide certainty about Lithuanian. I think this depends on how we interpret the statement. If somebody says "Europe has palm trees", this is most likely taken as saying that at least some parts of Europe have some number of (some kinds of) palm trees. Similarly, the statement about Lithuanian having ideophones is most naturally interpreted as saying that some varieties or, ultimately, some texts of the language have some amount of (some kinds of) ideophones. Under this interpretation, the claim can be made with certainty.

Bernhard also notes that the statement "Lithuanian has ideophones" is entirely non-falsifiable: it is as true as "Lithuanian has no ideophones". I don't think this is the case under the ordinary interpretations of these statements. If the statement about Lithuanian having ideophones is, as suggested above, shorthand for saying that some varieties of Lithuanian have ideophones, then the statement is falsified if no variety of the language has them. The converse statement about the language not having ideophones is similarly subject to empirical test: it is false if at least one variety of the language does have ideophones.

Bernhard is right in emphasizing the fact that languages are not homogeneous: they consist of dialects, sociolects, idiolects, various stylistic registers and, most basically, different texts. However, I would not accuse typologists of neglecting language-internal variation. Confronted with the enormous diversity of linguistic behavior in the world, linguists are groping in the dark: they are trying to find the right domains and the right grammatical concepts in terms of which some patterns would emerge from this picture.

One of the domains that has been hypothesized is "language", even though it is a very abstract notion with no clear relation to the individual speaker's behavior. As Bernhard says, it is not self-evident that language is the right domain. Nonetheless, this concept has been instrumental in the formulation of some generalizations about linguistic diversity. If typologists work with this notion, this does not imply that they consider language-internal variants less important; it is a personal choice whether somebody wishes to test the validity of the notion "language" or is more interested in working with language-internal domains.

As Bernhard points out (see also his paper of 2009 referenced in his message), comparing entire languages implies data reduction: many details are glossed over. As long as it is recognized that this is so and it is acknowledged that research within variants of a single language is equally important, there is no problem.

Just as the choice of the right domain of inquiry - language or dialect or text - is an empirical issue, the very identification and the exact formulation of grammatical properties in terms of which to talk about language also amounts to hypothesis formation: it, too, is an empirical matter subject to testing. We do not need to assume at the outset that either discrete or gradable properties are the right kind. As Bernhard points out, some typological generalizations may successfully utilize discrete features while for others, gradable features work best (see again his 2009 paper).

Thus, ideophones may be defined, as a matter of hypothesis, with different content and as either discrete or gradable. The bottom line: let's try anything - any domain and any kind of property - and see what works in terms of facilitating the formulation of generalizations.

Dear Edith, dear Martin,

>Martin is right

No, Martin is wrong. Martin would be right under the premises
(i) that language-internal variation is always negligible and
(ii) that variable properties across languages are always best captured in terms of discrete and simple (binary) features.
However, these premises are not acceptable (even though typology has a strong bias toward investigating properties where these premises arguably do not do much harm; see, e.g., Wälchli 2009), and they are certainly mistaken for ideophones in Lithuanian. I agree with Martin that it is useful to start with clear definitions. Let us assume we have a suitable definition for ideophones. We will then (probably depending on how exactly we define ideophones) find in Lithuanian that certain texts abound with ideophones while there are many others where there is just nothing, nada, niente (and that this distribution is not at all random, but has interesting extra-linguistic correlates), and probably that different speakers of Lithuanian have different inventories of ideophones. Some may have none at all or just very few.

Martin rejects the idea of UG that features are a priori given and argues that pre-established categories do not exist. Fine! But why then retain the idea that typological features should be discrete (even though this may be convenient when using reference grammars as a data source)? It is strange that many typologists who have given up the premises of UG still exclusively or almost exclusively conceive of structural properties as discrete features inherent in languages. There are exceptions such as Koptjevskaja-Tamm (2013) recognizing alternatives: "Discrete classifications, or typologies, operate with a restricted number of types (typically 2 - 6, cf. the chapters in the WALS) and are opposed to continuous typologies, which involve quantitative characterizations of phenomena." In many instances, discrete classifications are nothing but tremendous data reduction that makes claims about properties in languages entirely non-falsifiable ("Lithuanian has ideophones" is as true as "Lithuanian has no ideophones" depending on what Lithuanian data you happen to look at and where your threshold is for recognizing the presence of certain properties as a feature, even if everybody agrees about the comparative concept).

It still puzzles me, and will probably never stop puzzling me, how matter-of-factly many typologists - occasionally the same people who favor terms such as "diversity linguistics" - neglect language-internal variability despite works such as Miller & Weinert (1998) and Kortmann (2004). Cross-linguistic diversity is just one kind of variability in language (the one that typologists happen to be most interested in). Languages are not homogeneous (the idea of homogeneity is probably a heritage from the Romantic roots of typology, when languages were considered to be organisms). When investigating a property, the null hypothesis for typology should be: language-internal variability is as relevant as cross-linguistic variability. Ideally, the typologist should then demonstrate that cross-linguistic variation actually matters more than language-internal variation and that this null hypothesis can be rejected. It is not self-evident for all structural properties that "language" is the most relevant or only variable, certainly not for ideophones. (The null hypothesis should likewise include that family-internal variability is as relevant as cross-family variability. The omnipresent idea that stratified sampling is good methodology testifies to the opposite assumption. If the property investigated happens to be diachronically stable, fine! But what is the point of stratified sampling if you happen to come across properties that are maximally unstable diachronically?)
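
To make the null hypothesis above a little more concrete, here is a minimal sketch in Python of how one might compare cross-linguistic with language-internal variation for a gradable property. Everything in it is an assumption made for illustration: the language labels, the per-text ideophone rates, and the function name `variance_components` are invented, and the calculation is only a descriptive partition of the sum of squares, not a significance test.

```python
from statistics import mean

# Hypothetical ideophone rates (tokens per 1,000 words), one value per text.
# These numbers are invented for illustration; they are not real data.
rates = {
    "language_A": [0.0, 0.0, 12.0, 15.0],  # some texts have none, some abound
    "language_B": [1.0, 2.0, 1.0, 2.0],    # uniformly low across texts
}

def variance_components(groups):
    """Split the total sum of squares into a between-language part
    and a within-language (between-text) part."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = mean(all_values)
    between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
    )
    within = sum(
        (v - mean(vals)) ** 2 for vals in groups.values() for v in vals
    )
    return between, within

between, within = variance_components(rates)
# Share of the total variation that is attributable to "language".
share_between = between / (between + within)
print(f"between-language share of variation: {share_between:.2f}")
```

With toy numbers of this shape, most of the variation lies between texts of the same language rather than between languages, which is the situation in which a binary "has ideophones" feature hides more than it shows. A real study would need an agreed comparative concept, comparable text samples, and a proper statistical model.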

An observation about a single language does not provide certainty about that language.

Miller, Jim & Weinert, Regina. 1998. Spontaneous Spoken Language. Oxford: Clarendon.
Koptjevskaja-Tamm, Maria. 2013. Typology, theories and methods. In Schierholz, Stefan J. & Wiegand, Herbert Ernst (eds.) Wörterbücher zur Sprach- und Kommunikationswissenschaft (WSK) Online: Theories and Methods, ed. by B. Kortmann.
Kortmann, Bernd (ed.). 2004. Dialectology Meets Typology (Dialect Grammar from a Cross-Linguistic Perspective). Berlin, New York: Mouton de Gruyter.
Wälchli, Bernhard. 2009. Data reduction typology and the bimodal distribution bias. Linguistic Typology 13.1: 77-94.

