Further biological/linguistic parallels?

Mon Mar 27 00:39:31 UTC 2006

Looks like it will be you and me for the moment, though others are very welcome to join in anytime.

For me the Darwinian aspect has to do with the local environment that any particular translated protein has to deal with. It must fulfill its functions. Things can go awry in various ways- it may not fit shapewise, or the important functional sites on the molecule may be incorrectly structured, positioned, or instantiated in terms of chemical group. It may not appear in the right places, at the right times, or in the right amounts. It may not be amenable to proper tagging for transport or disposal. Some changes are fatal to the organism, some only to the molecule's usefulness (which in an organism with large pathway redundancy might be a drag on the system, but won't necessarily kill it). And of course there are many changes which are relatively neutral, and the occasional ones that improve the overall function of the system.

Various experimentalists have pointed out that one can often substitute a protein (or the corresponding gene) from one species for the cognate one from another, and have it perform its primary functions. That is true- as far as it goes. It's like saying that the Datsun got us to the pet shop so it is equivalent to the Cadillac. It was recently announced (just in the last weeks) that there is vastly more juggling of functional linkages between individual proteins in the overall proteome than anyone had suspected, even between relatively closely related species. This has been a rather disconcerting surprise to people who imagined they had it all figured out.

A newly born protein does not find itself in either a physico-chemical or functional vacuum. It is faced with all that has come before. It will act, and be acted upon. If the environment has changed (let's suppose that a mutation in another molecule has rendered a former linkage dead), the protein may no longer be of use at all, or to the same extent, or in the same ways. Making useless parts is costly in terms of energy and material. Any further mutation at this point that stops such waste will be advantageous to the overall chemical economy. If new linkages appear, or old ones re-emerge, proteins may find themselves of use again, and mutations that help them fulfill the new functions will be advantageous. Mind you, I am speaking not of Darwinian fitness of whole organisms in the outside world, but of 'parts' within the internal cellular world.

Words/roots also find themselves in a preexisting economy- they stay around as long as they are useful. Sometimes they become marginalized, or expand their use. They may find their parts subsumed to new wholes, or be truncated, or built upon. In languages with a written tradition they can reappear after long disuse. They can be borrowed. And so on. All this is also true of genes, and the proteins they code for. Even in languages the overall phonological shapes (both segmentally and prosodically) of words tend to be fitted to the prevailing norms of the language, unless those norms themselves shift, which is inevitable in the long term. We see this as well for genes in chromosomal structure, and in proteins, and fatty acids, polysaccharides, etc. The overall percentage of the four DNA nucleotide bases varies from species to species, sometimes vastly. This seems to have partly to do with environmental adaptation (due to the 'melting temperature' of DNA), cell pressure differences, and so on.
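
To make the base-composition point concrete, here is a minimal sketch (not anything from the literature, just an illustration) of how one might tally the four bases in a sequence and get the G+C fraction. G-C pairs are held by three hydrogen bonds and A-T pairs by two, which is why G+C-rich DNA tends to 'melt' at higher temperatures. The function names and the two toy sequences are made up for the example; they are not real genomic data.

```python
from collections import Counter


def base_composition(seq):
    """Return the fraction of A, C, G and T in a DNA sequence.

    Symbols other than A, C, G, T (for example N) are ignored.
    """
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: counts[b] / total for b in "ACGT"}


def gc_fraction(seq):
    """Return the combined fraction of G and C in the sequence."""
    comp = base_composition(seq)
    return comp["G"] + comp["C"]


if __name__ == "__main__":
    # Hypothetical toy sequences, invented purely for illustration.
    at_rich = "ATTATAATATGCATATTAAT"
    gc_rich = "GCGGCCGCATGCGCCGGCGC"
    for name, seq in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
        print(name, base_composition(seq), round(gc_fraction(seq), 2))
```

Comparing the output for the two toy strings shows the kind of skew I mean; with real genomes one would of course read the sequences from a file rather than type them in.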

The only reason I can think of for imposing such uniformity on both types of system is ease of production and regulation. Similar logic goes into the evolution of the industrial assembly line, hierarchicalization, and such. Systemically different languages and different genomes/proteomes all have the same general objects, but the particulars of instantiation, the fine details, are where one gets the most variability. /p/ in my language may be systemically the same as /p/ in yours, but the exact phonetic value may not be, even if they are phonemically identical in terms of the usual distinctive features. We have 'accents'. Allegro speech in one language may be similar to slow speech in another. Some of these particulars may help communication in different physical (or social) environments.

Linguists have in the past decades been more and more focused on the general properties of languages, seeking out the imagined essence of Language. Extremely variable fine details tend to be ignored or trivialized (and large scaling effects may also fail to catch one's attention). Similar things can be seen in the development of physics of the classical, continuous kind. Later the rise of quantum physics on the one hand, and relativistic physics on the other, showed that systems may have very interesting and important things going on at the upper and lower bounds (not to mention linkages between these extremes that 'go around' the usual level of interest).

Emergence is not the only thing going on with proteins- as I've mentioned, the code does not necessarily contain all the information that goes into determining the nature of the end product. The pre-existing proteome, the lipid membranes, sugar polymers, plus all sorts of monomeric, molecular, and ionic factors act back upon the emerging object, helping to direct its immediate evolution, even its correct shape when there are choices. Genes get edited. The proteins themselves are often edited. Or tagged. Annotations. Such things are major reasons that cloning has been so difficult, and the health/viability of clones generally less than perfect. I just finished reading last night a review of a new theory of genetic imprinting which claims to explain autism as an overmasculinization of the brain. It seems that the maternal and paternal genes are neither equally nor randomly distributed in it. Paternal genes seem to be grossly overactive, relatively speaking, in the action-oriented parts of the brain, while maternal ones are relatively overactive in the socially-oriented ones (such as the frontal lobes)- this is the situation in normals. Pushing the system further than this leads to overmasculinization on the one hand (and autism, with enhanced technical capacity) or overfeminization on the other (and oversociability and reduced technical capacity- it almost sounds like a bad joke based on stereotypes!). Anyway, if true, the only mechanisms that could accomplish such a functional/anatomical split are editing and tagging of the DNA, RNA, or protein 'from above'.

Selection does not have to be mindless- molecule as stranger in a strange land. The system itself can, if complex enough, make decisions about whether to accept a new entry into the fold, depending on its overall state at that time. Sexual selection is an example.

In any case I am NOT hypothesizing any direct comparability between protein folding and linguistic phenomena. The one relates to real physical objects in an objectively real universe (but let's not get started on that...) and the other to 'virtual objects' real only within functioning brains. But what I said about dimensional shift should still hold in some fashion. Not homology but analogy. Nucleic acid polymer storage forms are mostly linearized (spiral often, on nucleosomes- you might want to think 'fractals' here)- but RNA transcripts have kinks, hairpins, local same-strand self-spirals, etc. which add dimensionality relative to the parent DNA. Translated protein has much more dimensional elaboration- and incorporation into a larger protein universe with myriad (though spatially limited) dimensionalities. There seems to be some sort of trade-off between low dimensionality but large continuity (DNA, or in physics the four dimensions of space-time, or in chemistry pure phases, and so on), versus high dimensionality and relative discontinuity, and mixture (as in the proteome, the hidden dimensions popularized in String Theory, and chemical dynamic interface/interphase, such as a shaken salad dressing, or the bands of Jupiter's atmosphere and their finer and finer gradations of eddying).

It may also be telling that the low-dimensional forms seem to be the most 'object' or 'patient'-like in that they are capable of being acted upon, but not acting of their own accord. Great for storing information, or for providing fodder to be plugged in elsewhere. They provide the continuity, stability. High-dimensional parts of the system, however, appear to be constantly in flux, always robbing Peter to pay Paul, jockeying for position in the hierarchy they've created. Perhaps it is the very 'spatial' (or other property) truncations that lead to such infighting. This is where one sees the making and breaking of inter-linkages between members, wheeling and dealing, negotiations and renegotiations. But one also sees complexity differences even within DNA- more with actual genes, a bit less for intervening sequences within the intron/exon system, and minimal for the long stretches of regulatory material (zillions of short repeats) between actual genes. It would be interesting to see how this maps to the higher dimensional parts of the system up in the hierarchy- inverse complexity?

My hypothesis (not a claim, since the evidence isn't all in, even for me) is that the communicative signals must contain the information actually being transferred (not terribly controversial, I hope)- but that it is perhaps possible that some of the information is iconically encoded. For ideophones this is very much the case- but then ideophones are NOT terribly intertwined with syntax and hierarchical discourse structure- their focus rather is with the immediate physicomechanical characterization of a material property or action of sorts usually uncontrollable by the experiencer or executor. At this level of coding initial labial stops associate with relaxation of pressure nearly universally in the world's languages (either by complete or partial material failure of the holding wall of the container- leading to either popped or bulbous shapes, etc.), initial apical stops with directed blunt impacts, and so on. Part of the motivation is the exapted function of the articulators in eating, drinking, tasting, breathing, and so on, but these have also evolved from the primate condition to be primarily communicative in function, and so the acoustic features of the articulations have also been systematized (as ears have evolved, also). Paget's work was to some extent crosslinguistic, though not vastly so; remember the period in which he was working, and the level of knowledge at the time. If you want to see a good thorough bibliography on sound symbolism, visit Margaret Magnus' web pages (www.conknet.com/~mmagnus, and various pages within). M.M. has a rather New Age take on the phenomena in question, while I'm a dyed-in-the-wool evolutionist. I've been working on this, but not publishing, for the past 25 years, and have looked very closely now at dozens of languages, less so at hundreds more.

As ideophones evolve towards lexical status they begin to lose their formulaic iconicity (imagic for the acoustic side, diagrammatic for the articulatory) mapping to 'real world, non-human' static or dynamic properties. Lexical features get picked up. But my question is whether they are picked up in random fashion, or is there something more going on? People have noticed before that different lexical classes in some languages (such as English) don't statistically have the same average makeup phonologically- a lot of this is due to the history of these classes diachronically, but this is exactly the same sort of system 'drift' I posit for the genome/proteome. Again- is this random, or are there more things happening out of sight, out of mind? Classical physics emerges from quantum democracy- we 'see' the former, but have to infer the latter. Brownian motion is just too fine-grained for visibility.

We see the rise of morphological marking often helping us differentiate form/function classes (though this can be lost, fossilized- just as one sees in the genome). Are the marks chosen for grammaticalization randomly? No. They tend to be from particular types of semantic fields (time, space, etc.), with more general senses than other forms. Kinda reminds me of the dimensional thing. We see loss of segmentalism and rise of prosody in the realm beyond the lexicon. Is this just an accidental fact- or is there something systemic going on here, going back to the way brains function? What about dimensionality in this area? Phase purity, versus mixture, and featural complexity? Synthesis versus analysis, and remixing, renegotiating links, functions?

Paralleling this is a change of focus from the content of the message to the message of the content, as one goes from lexical meaning to grammatical to pragmatic. I've hypothesized that the 'soup' of features present beyond morphological evolution is to some extent similar to that found in cells at certain points in their cycles- elaborated structure broken down leaving only the seeds (but always remembering that cells never completely break down- there is always enough infrastructure left to rebuild upon)- perhaps another example of this sort of thing is the insect pupa? The ultimate origin of ideophones has often been dismissively characterized as 'imitative'- but this says nothing about class status, semantic and morphosyntactic behavior, diagrammatic iconicity (following the existing phonological system architecture)- why should any of this be there, and exhibiting universals as well, if it is only 'imitation'? It is now becoming generally understood by historical linguists that ideophones do in fact often feed the lexicon, just as it is well known that the lexicon feeds the grammatical morphology. Does it end here? As grams wear away to 'just the smile' do they disappear? Segmentally yes. Does anything survive, a ghost perhaps within the historical cumulation of the whole system? Prosody?

We also don't have a good theory of the origins of nonideophonic interjections- whose crosslinguistic study lags way behind even the few comparative works on ideophones. I hypothesize that SOME of these interjections form the phonological cores around which many ideophones crystallize segmentally. There seems to be selection and attrition at each change of state. Bridging the gap between grams and interjections, if there is one, will be the hardest part of making this a cycling system.

Much of what I write here is just reiteration of things I've said before, with my usual overdense dose of rambling.

There do seem to be analogies to different types of evolution within cells- just as between cells, between organisms, populations, etc. There doesn't seem to be any reason to assume any break at the cell wall. Our bodies are complex organismal systems which have minds; cells are complex chemical systems- which may have some analogue of a mind. Ask an amoeba.

Bottom-up emergence may explain some properties of systems and their parts, but not all. There are also top-down effects. The really interesting stuff happens in between, at the level of greatest mixing, interaction, complexity, dimensionality. Of course if the highest levels are also of simple dimension, and there is 'back-door' interaction between bottom and top (as there seems to be in many cases in nature), then the snake bites its own tail. Einstein might approve, one would think. No loose ends, no privileged reference frames, perhaps?

And I haven't even touched today on the evolution OF language! My inbox has more replies- will answer them in turn as I can. Thanks!

Jess Tauber
phonosemantics at earthlink.net