Fwd: Articles on Uralic phylogenetics

Sun Nov 10 14:58:06 UTC 2013

Dear All,

For some reason, Outi Vesakoski's answer didn't make it to the list, possibly because of its length. (Normally, rejected messages are automatically bounced to me, but as I get a lot of spam this way every day, I may have deleted it together with all those urgent business proposals. If this ever happens to your message, please feel free to contact me!) So, here it comes.

Best
JL
--
Univ.Prof. Dr. Johanna Laakso
Universität Wien, Institut für Europäische und Vergleichende Sprach- und Literaturwissenschaft (EVSL)
Abteilung Finno-Ugristik
Campus AAKH Spitalgasse 2-4 Hof 7
A-1090 Wien
johanna.laakso at univie.ac.at • http://homepage.univie.ac.at/Johanna.Laakso/
Project ELDIA: http://www.eldia-project.org/ 

Begin forwarded message:

> From: Outi Vesakoski <outves at utu.fi>
> Subject: Re: Articles on Uralic phylogenetics
> Date: 10 November 2013 15:34:59 UTC+1:00
> To: Johanna Laakso <johanna.laakso at univie.ac.at>
> 
> ****
> 
> Dear readers of Ura-List,
>  
> and thanks to Florian for opening the discussion! Below, I’ve included Florian’s questions/comments, each followed by my answer. I use ** to mark more clearly the end of an answer and the beginning of the next comment/question.
>  
>  
> “As the authors of this project apparently want to generate discussion as they insisted on spreading their news on Ura-List, I take the opportunity to comment shortly on the Diachronica paper.—“
> 
> -          That was indeed the aim. The papers are available in Diachronica and Journal of Evolutionary Biology, but I thought that not everyone has access to both papers. We think it is important to share our findings, because research makes more sense when it also reaches the public. And as our research focuses on Uralic languages, what better place to share it than Ura-List? This way we can also get these valuable comments from the specialists of the field.
> 
> ****
> 
> “Let me say in advance, that historical-comparative linguistics is not my main field of interest and this apparently won’t change in the future. However, as I have undergone the typical historical-comparative training of the discipline and spent a decade in a department infamously known for “revolutionary and post-revolutionary approaches to Uralic linguistics” until the retirement of its propagator, I have a hard time understanding the implications…”
> 
> -          Actually I can tell you all the Great Plan, which is not a secret at all: the Diachronica paper established the study of “language evolution” for Uralic languages. The method and data are comparable to parallel studies on other language families, which means that it is now possible to compare patterns of “language evolution” against other language families and to collaborate internationally. Until now, international researchers have not had access to Uralic data. Further, Uralic literature has been difficult to use for researchers who do not speak Uralic languages. We wanted to give a very short overview of the study of Uralic languages, BUT we encourage writing a thorough, up-to-date review of Uralic studies – in English and in a peer-reviewed journal. That information is needed, and I know that it would gain a wide audience and lots of citations!
> 
> -          So the first step was to write this paper, which explained the data and the method and – importantly – showed that the results do not differ to a worrying extent from previous results, as you would expect from lexical data. The second step, after the validation of the method, is using the data to actually study the macroevolutionary patterns of language evolution (macroevolution: divergences and extinctions of lineages). This was initiated in Honkola et al. We will continue to compare linguistic macroevolution to the biological process (please feel free to co-operate or give suggestions or comments!). However, we will also continue to study the very basic questions raised by this kind of approach: what about network models, what about non-basic vocabulary, etc. One of these papers is in peer review right now. I do think that the implications can actually be large.
> 
> **
> 
> “My first open question is concerned with the nature of “phylogeny” propagated by these papers. Since when is language classification based exclusively on vocabulary and sound changes? Historically and theoretically, we are back in the 18th century again, perhaps with insights in sound changes deriving from the 20th century now reproduced by statistical and biological software…”
> 
> -          Luckily no-one was claiming in the paper that basic vocabulary would be the ultimate truth. In biology, people do phylogenetic research on the morphology of animals, Y-chromosome DNA, mitochondrial DNA, SNPs…. All these approaches may produce different phylogenies (and often do). Further, adding more species to the data may change the structure of the tree. This is one of the advantages of this method: comparing different data sets is generally very straightforward. With different data and results we can try to solve the puzzle of Uralic history. You can ask how and when the results are similar or dissimilar. At the moment we are also collecting typological data so that we can compare it with the lexical data. Surely any other data could be studied as well, and hopefully this will be done! By the way, there is one paper in which the language history was based on the structure of the languages. That paper has been criticized for not using vocabulary. Maybe the point is: one task at a time!
> 
> **
> 
> “And then, why Swadesh?”
> 
> -          We have to start somewhere. The Swadesh list is generally one of the most used basic vocabulary lists in this kind of research, which makes it easier to compare our work with similar work on other languages. The paper also actually goes through TEN different sets of basic vocabulary meanings, TWO of which are Swadesh lists.
> 
> ***
> 
> "Second, it is quite hilarious to read the following introductory statement: „Most Uralic research remains non-quantitative… (Diachronica p. 335). Some pages later however one reads that their data set contains a 100-item data set, a 200-item data set and a 500-word data set. Given that data for historical-comparative work is restricted, why is this new approach with 500 items any better and less „non-quantitative“? From the perspective of lexicography or corpus linguistics, 500 tokens is indeed “non-quantitative”.
> 
> This was a good point to notice, and we should have chosen different wording. We meant that the data was statistically (quantitatively) analyzed, and the classification was also produced quantitatively, in this case using Bayesian phylogenetics. We had altogether only 226 basic vocabulary meanings (not 500). We did comment on the size (quantity) of the vocabulary lists and ended up saying that the result is more or less the same with Swadesh 100, Leipzig-Jakarta (100 meanings) and Ura100 (our suggested basic vocabulary list for the Uralic languages) or all these combined (226 items). It has been claimed that basic vocabulary is necessarily restricted to a small number of meanings, since when we start to add items more prone to replacement we aren’t really talking about BASIC vocabulary any more. This is why we tested how the result changes when more or less stable subsets of the data are analysed. These tests are easy with statistical approaches, as you just need to separate out an appropriate subset of the data and run the analyses again.
> 
> ***
> 
> “Third, it is quite astonishing to see that output of researchers with a clear “revolutionary connotation” (Künnap & Taagepera 2004; Tambovtsev 2004) are even considered in such a paper. Apparently, the international reviewers have been unaware what happened in the discipline in the late 1990s and the first years of the new millennium and can’t tell solid scholarship from less solid. And by the way, so did the authors of this joint paper and their “linguistic” advisers for whom quite some space is reserved…”
> 
> -          Luckily we remembered to underline that this was not an exhaustive review of the literature. As stated above, a full, up-to-date review is indeed needed! As stated in the text, the linguistic advisers checked through the basic vocabulary lists and are not (like the other people mentioned in the acknowledgements) in any way responsible for any of the text. In the future, we would be happy to get co-operation either in the form of co-authorship or in reading through the ms before submission. If you would like to read the text but do not want your name to be mentioned anywhere near this unorthodox approach, you can also remain an “anonymous referee”….
>  
> -          BTW, we got financing for a project to put the basic vocabulary lists on the Internet alongside the correlate (cognate) data. The idea is to provide the data for others as well AND – this could be of interest to the readers – to allow the Uralists and Fenno-Ugrists to comment on them. This will surely improve the data, as happened with the Indo-European languages. The UraLex project is indeed aiming at an outcome similar to that of IE-Lex.
> 
> ****
> 
> "Summing up the Diachronica paper, one sees a “scientific” reproduction of a number of “scholarly assembled facts” equaling earlier scholarship which was accused of having been based on a “non-quantitative sample”. After all, it is nice to see that “scholarly work” can indeed compete with a biological software data set analysis and one may be tempted to say that “scholarly work is indeed rather scientific”. Of course, the Diachronica paper is an instance of that kind of „science“ generally appreciated as “hard science” as the paper tests predictions based on a sampled data set and shows different models based on different analysis. But clearly, this paper does not show anything amazingly new; it “scientifically” reproduces data which has been assembled scholarly and comes to solutions which are not too diverging. So, all we got is „quod erum demonstrandum” now supported by software desgined by humans?"
> 
> -          As stated above, we were happy not to see anything completely unexpected. It would have been very difficult to continue if the results had contrasted completely with earlier results… I hope that I have already managed to show that this approach offers new ways to tackle old and new questions.
> 
> ***
> 
> “Finally, let me come back to my opening statement – in order to make such research interesting for a community of “scholars” (that’s how we are called by “scientists”), another central component of historical-comparative linguistics needs to be integrated – historical grammar. After all, genetic classification needs both lexicon and grammar. But then, historical grammar is messy, there is more analogy, leveling etc which blurs the nice and clear cut lexicon and sound change picture. I wonder if this can be modeled and combined with the study one eagerly wanted to share with the community.”
> 
> -          As stated above, this is what we are doing now. The good thing about Bayesian phylogenetics is that it actually tells you in different ways whether you should trust the classification or not. If the data is messy, it will be seen in the overall shape of the tree, the branch lengths, and the posterior probabilities. And surely we will submit the ms for publication in international peer-reviewed journals - future funding depends on the number of publications. Also, it would not be science if you did not give your research to fellow scholars/scientists to read and comment on. Science is not about saying that THIS is the one and only truth. It is always said that typological data suggests a different kind of classification than basic vocabulary. I can hardly wait to see the result!
> 
> ***
> 
> “Such a paper might indeed hold some surprises and would produce something new for the 21st century. As long as “phylogeny” is limited to vocabulary and sound change, the picture is incomplete and partial, even if it can be tested “scientifically”. After all, the genetic unity of Uralic (and any other language family) is indeed more than vocabulary and sound change…”
> 
> -          This is indeed what the funder thought when giving us money for collecting the typological list. However, I want to point out that language is also more than just grammar, and I would hope to see vocabulary data alongside grammatical data.
>  
> In general, we do not wish to suppress any other research approach in historical linguistics; quite the opposite: we hope that this field of “language evolution” will raise the profile of research done on Uralic languages. This new approach will most likely appeal to a new audience, and through it that audience will hopefully also find the extensive work that is done in the entire field of Uralic and Fenno-Ugric studies.
> 
> Yours, Outi Vesakoski
> 
