More on language and genome

jess tauber phonosemantics at earthlink.net
Sat May 8 18:05:21 UTC 2010


I'll have to take the time to respond to Brian's concerns properly. Currently I'm in touch with several people involved in the research on viruses I was referring to and am being swamped with papers still to read on the topic- I've been out of the game too long, and am playing catch-up. But just a few quick points.

Viral genomes code for several types of objects- minimally their capsids (if they have any) and any proteins they need to enter and exit a cell and to hijack its metabolism to reproduce themselves. But there is often more: in many cases means of incorporating themselves into the DNA of the host, and other genes dragged along for the ride, some from previous hosts, and some new from scrambling of these genes. Viral genes proper affect processual control when incorporated, but these other genes can add new external capacities, as well as alter existing ones. There are viruses that are larger than some bacteria, and these contain huge numbers of genes that a mere parasite wouldn't require for survival. At the other end of the spectrum are viroids that don't have enough of their own genes to survive in a host alone, and hitch their wagons to other viruses to gain entry into cells, or hijack metabolic processes. So things are a bit more complex on the biological end than simple generalizations can account for, just as I overgeneralize a bit about typology (a not uncommon phenomenon), or ideophones.

Apparently viruses are a hotbed of rampant recombination, to the point that it is useless to talk about deep lineages. Variable RNA editing has viral ancestry, and eukaryotic immune systems get their own capabilities from viral recombinational mechanisms as well. If the folks I'm corresponding with are correct, most of the regulatory and innovational machinery is viral in origin.

I agree with your observation that the effects of a given virus on gene expression (in host cells) may be accidental, but then so may be the effects of ideophones and interjections (which I link in my model as part of pragmatic negotiation rather than as part of automatizing, streamlining, and backgrounding grammaticalization). 'Memes' may be too broad a category in this context.

Remember that ideophones are not always known or accepted by listeners- a number of field researchers have written about this. They are easier to find in relaxed communicative situations that perhaps are less friendly to grams- a testable hypothesis if anyone wants to take the time to look.

One of my current correspondents writes: '1998 it was assumed that 8% of the human genome are of viral origin,  2008 it was 45 %. I predict that in further 10 years it becomes increasingly clear that 98% of the human genome are products of viral genome editing. They not only insert and duplicate genetic content arrangements, viruses invent genes and insert them into cellular genomes.'

There is some kind of iconic linkage between the genetic code and protein structure (primary works of course, unless there is later modification after transcription or translation; secondary in proteins not needing chaperonin-driven refolding, etc., tertiary less certainly, and so on, effects dimming the further up the hierarchy the string climbs). But there may also be an effect in the opposite direction, from the proteome downwards, with similar reduction of effectiveness. Many proteins of the same function in different organisms are known to retain largely the same outward shape, with their main reaction centers in the same places, even though the sequence of amino acids can have drifted all over the place. All these sequences converge on the same or similar final product- possibly with the help of chaperonins or other interactions.

When chaperonins do their job, lower hierarchical level structure is disrupted in favor of new connectivities at the higher level, even further than it would be in spontaneous hierarchical folding. This is what I mean by arbitrarization, in that configurations favored by bottom-up processes (parallel to those found in ideophone construction and semantics involving lower dimensions starting from the linear sequencing) adapt to top-down pressures from the existing population of protein products. I should have mentioned that later evolution may change the initial amino acid sequence so that the protein more easily attains the 'desired' functional configuration. Not every protein needs a chaperonin. It would be interesting to know whether these latter forms have a different statistical spread of amino acid sequences, peptide folds, etc. from shorter, virally recombined genes of the nonregulatory type. That would be my prediction, in any case.

Ideophones aren't the only places where there is diagrammatic order- you can find it in serializable verbs in the Papuan Kalam-Kobon family, and we're all familiar with this effect in grammatical paradigms (though not all of them). If I'm right about how class reanalysis and language type shift work together, as the paradigm-like diagrammatic quality of large-scale ideophone systems dies, it passes through the lexicon for a while and eventually settles onto grammatical morphology, where it increases, though I won't call it iconicity. Rather iconicity shifts to symbolicity and then on to indexicality as each class gets the diagrammatical imperative. Then diagrammaticality gets off the morphological tit and affects syntactic structure proper, before the whole process begins again, in the morphosyntactic cycle.

I would like to know whether this cycle exists in living organisms as well- maybe helping to explain the rise and fall of genomic and organismal sizes, lifestyles, etc. There are relationships that have been discovered only in the past couple of years- again I'm trying to play catch-up here.

In eukaryotic organisms there is always regulatory junk- virally derived. In fact people are now claiming that the eukaryotic nucleus itself was originally a giant virus (they DO exist, you know, bigger than bacteria!). The other organelles were either other viruses, bacteria, or archaea.

It has been found that there is a strong correlation between relative amounts of this regulatory junk DNA and life history. Organisms (plants and animals) that have very large amounts relative to protein-coding DNA take their time to mature, often retaining infantile character (including the very valuable ability to regenerate whole body parts and organs, including the brain, to heal wounds without scarring, etc.) into adulthood (neoteny), and waiting until circumstances in the environment are most propitious. Their organs are simpler, and contain fewer, larger cells, and fewer types of cells. In other words, everything is continuously negotiated. More junk, more negotiation, more pragmatic orientation.

Animals and plants with the least relative amount of regulatory junk DNA vs. protein coding type have the opposite life history. They have more different interconnected (even overlapping) organ systems, with greater numbers of smaller cells, with more types per organ. Ability to heal is via quick scarring rather than slow regeneration. Regeneration is minimized. Maturation is accelerated (metamorphosis), with earlier stages often reduced. All this is done 'on the clock'- they don't depend on variable environmental cues.

It would be interesting to know whether the latter type has any specializing modifications in the remainder of the DNA that take the place of the excised junk. I'm guessing there are, and that they will resemble, from a systemic perspective, what we see in languages that increase their synthesis and fusion. In the past I had hypothesized that viruses matched this type, bacteria and archaea agglutinating languages, and eukaryotes analytical languages. I know now that this is too simple, since viruses vary radically in size and genome, as do bacteria, archaea, and eukaryotes. There is overlap. But broadly there is still some comparability. In giant viruses most of the genes are not regulatory, and the smaller you go, the higher the proportion of genes that are regulatory, until with the smallest viroids all that is left is regulatory. For eukaryotes the opposite seems to be true, in that the organisms with the largest cells have the largest proportion of regulatory DNA, and the smallest of housekeeping, protein-coding genes.

So any developmental hierarchical comparisons would have to be more than single dimensional- but then again multidimensional hierarchies in typology aren't unheard of, either.

More later.

Jess Tauber
phonosemantics at earthlink.net