From hubeyh at mail.montclair.edu Tue Feb 4 01:33:18 2003
From: hubeyh at mail.montclair.edu (H.M. Hubey)
Date: Mon, 3 Feb 2003 20:33:18 -0500
Subject: [language] "Comments on Clifton's review of Kessler in issue
Message-ID:

<><><><><><><><><><><><>--This is the Language List--<><><><><><><><><><><><><>

I would like to review Kessler's book for the LinguistList. Please reply.

-------- Original Message --------
Subject: Re: Kessler's book
Date: Tue, 31 Dec 2002 12:37:02 -0500 (EST)
From: Terry Langendoen
To: "H.M.Hubey"
References: <3E112ADE.10800 at mail.montclair.edu>

Dear Ms/Mr Hubey,

If you'd like to comment on either the book or the review, I suggest you send your comments as a regular email message to linguist at linguistlist.org. In the subject line include a reference to the original posting, e.g. "Comments on Clifton's review of Kessler in issue 13.491".

Terry

Terry Langendoen, Linguist List book review editor
http://linguistlist.org/issues/indices/Review2002r.html/
Co-Principal Investigator, EMELD Project http://emeld.douglass.arizona.edu/ & http://emeld.org/
Department of Linguistics, University of Arizona
PO Box 210028, Tucson AZ 85721-0028, USA
http://linguistics.arizona.edu/~langendoen/

On Tue, 31 Dec 2002, H.M.Hubey wrote:

Hello,

I just read Kessler's book and would like to write a review since this one does not do it justice.

LINGUIST List 13.491
Fri Feb 22 2002
Review: Kessler, The Significance of Word Lists
Editor for this issue: Terence Langendoen

------------------------------------------------------------------------
What follows is another discussion note contributed to our Book Discussion Forum. We expect these discussions to be informal and interactive; and the author of the book discussed is cordially invited to join in. If you are interested in leading a book discussion, look for books announced on LINGUIST as "available for discussion." (This means that the publisher has sent us a review copy.) Then contact Simin Karimi at simin at linguistlist.org or Terry Langendoen at terry at linguistlist.org. Subscribe to Blackwell's LL+ at http://www.linguistlistplus.com/ and donate 20% of your subscription to LINGUIST! You get 30% off on Blackwells books, and free shipping and postage!
------------------------------------------------------------------------

Directory
1. John & Debbie Clifton, Review of Kessler: The Significance of Word Lists

------------------------------------------------------------------------
Message 1: Review of Kessler: The Significance of Word Lists
Date: Fri, 22 Feb 2002 20:59:53 +0400
From: John & Debbie Clifton
Subject: Review of Kessler: The Significance of Word Lists

Kessler, Brett. 2001. The Significance of Word Lists. CSLI Publications, x+277pp, hardback ISBN 1-57586-299-9, paperback ISBN 1-57586-300-6, Dissertations in Linguistics. Announced at http://linguistlist.org/issues/12/12-790.html#1

John M Clifton, Summer Institute of Linguistics and University of North Dakota

DESCRIPTION OF THE BOOK
The two major issues addressed in this book can be characterized in terms of two senses of the word 'significance' as used in the title of the book. The first issue is how significant word lists are to determining language relatedness. The second issue is what is involved in showing that hypotheses made on the basis of word lists are statistically significant.

In chapter 1, 'Introduction', Kessler (K) addresses the two major positions on the first issue. On the one side are those like Greenberg and Ruhlen (1992) who feel that the analysis of word lists can be used to demonstrate the links between remotely related languages. On the other side are scores of more traditional historical linguists who claim that the similarities used to establish these putative links are due to chance. K proposes a third option: word lists can be used to establish linguistic relationships, but only when following a rigid methodology designed to ensure the results will be statistically significant.

Chapters 2, 'Statistical Methodology', and 3, 'Significance Testing', are the heart of the book. In these chapters K discusses statistical methodology in general, and then details the specific methodology proposed for the analysis of word lists. K then applies this test to Swadesh 100 word lists from eight languages: Latin, French, English, German, Albanian, Hawaiian, Navajo, and Turkish. With a few exceptions, the results of the procedure indicate that the first five are related, and the others are not. At the risk of over-simplifying a complex procedure, I will attempt to summarize the contents of the methodology. Feel free to skip the next paragraph if it is too dense.

The methodology involves constructing a table of correspondences of word-initial segments in semantically related words in two languages. This table can then be analyzed using the chi-square test for significance. From a statistical point of view, the problem is that the number of occurrences of specific correspondences is too low for the chi-square test to be meaningful. To remedy this, K proposes the use of a Monte Carlo technique. Applying this technique, one of the word lists is randomized, a new table is constructed, and the chi-square test is applied to the new table. This procedure is repeated 10,000 times. Now the value of the original table is compared with the values of these 10,000 tables generated by the Monte Carlo technique, and a valid level of significance can be attached to the original value.

As indicated above, the methodology as proposed does not always correctly identify which languages are related.
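The randomization procedure just described can be sketched in a few lines of Python. This is a toy illustration only: the mini word lists, the restriction to word-initial segments, and the reduced trial count are invented for the example and are not Kessler's data or code.

```python
import random
from collections import Counter

def chi_square(pairs):
    """Chi-square statistic for the contingency table of segment pairs."""
    table = Counter(pairs)
    n = len(pairs)
    row = Counter(a for a, _ in pairs)   # marginal counts, language A
    col = Counter(b for _, b in pairs)   # marginal counts, language B
    stat = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = table.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

def monte_carlo_p(list_a, list_b, trials=10_000, seed=0):
    """Shuffle one list repeatedly to estimate how often chance alone
    produces correspondences as strong as the observed ones."""
    rng = random.Random(seed)
    seg_a = [w[0] for w in list_a]       # word-initial segments
    seg_b = [w[0] for w in list_b]
    observed = chi_square(list(zip(seg_a, seg_b)))
    hits = 0
    for _ in range(trials):
        rng.shuffle(seg_b)               # destroy the semantic pairing
        if chi_square(list(zip(seg_a, seg_b))) >= observed:
            hits += 1
    return hits / trials                 # small value: unlikely by chance

# Invented mini word lists, aligned by gloss; the initial consonants
# correspond perfectly (p~b, t~d, k~g), so the estimated p-value is small.
lang_a = ["pata", "piko", "tamu", "tilo", "kasu", "kemi", "polu", "tari"]
lang_b = ["bada", "bigo", "damu", "dilo", "gasu", "gemi", "bolu", "dari"]
print(monte_carlo_p(lang_a, lang_b, trials=2000))
```

With real data one would use the full 10,000 trials and whole Swadesh lists. The chi-square statistic itself is standard; the point of the permutation step is that with such tiny cell counts its textbook distribution cannot be trusted, so the reference distribution is generated empirically instead.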
There are both false positives, in which a relationship is posited between apparently unrelated languages like Latin and Navajo, and false negatives, in which no relationship is posited between related languages like Albanian and German. K points out that false positives are unavoidable in statistics; the goal is to minimize them. False negatives, on the other hand, should be eliminated. In addition, it would be nice if the methodology could distinguish between closer relationships like those between English and German, and more distant relationships like those between English and Albanian. In chapters 4-10, K discusses various ways in which the methodology might be improved.

In chapter 4, 'Tests in Different Environments', K concludes that predictions are not improved by comparing features other than the word-initial consonant, for example, the first consonant of the second syllable, or the first vowel, or some combination of the above. Then in chapter 5, 'Size of the Word Lists', K shows that increasing the size of the word lists by using the Swadesh 200 word list instead of the Swadesh 100 does not improve the predictions.

Chapter 6, 'Precision and Lumping', deals with the implications of two types of historical changes. First, phonemes can split or merge so that, for example, /t/ in language A may correspond to /t/, /tj/, and /tw/ in language B. Second, semantic shifts occur which result in, for example, the lexical item for 'skin' in language A being related to the lexical item for 'bark' in language B. K rejects attempts to incorporate such factors into the procedures on the basis of practical considerations related to the methodological requirement that lexical items be chosen without reference to their similarity to forms in other languages.

Chapters 7-9 deal with what lexical items may need to be eliminated from the analysis. In chapter 7, 'Nonarbitrary Vocabulary', K discusses forms in which the phonetic form may be at least partially determined by sound symbolism including, but not limited to, onomatopoeia and nursery words. Then K discusses loan words in chapter 8, 'Historical Connection vs. Relatedness', and language-internally related forms in chapter 9, 'Language-Internal Cognates'. Language-internally related forms include such phenomena as one phonetic form for related meanings (for example, 'skin' and 'bark' or 'egg' and 'seed') and derivationally related forms. K argues that if the goal of the analysis is determining whether two languages are genetically related, the nonarbitrary aspects of such forms need to be eliminated.

Then, in chapter 10, 'Recurrence Metrics', K introduces some statistical methods that might be used in place of the chi-square test.

In the final chapter, 'Conclusions', K summarizes the actual procedures proposed in the book, and then offers observations on what such procedures have to offer the practice of historical linguistics.

The book concludes with an appendix that includes all eight word lists that are used to test the methodology presented in the book, references, and an index.

CRITICAL EVALUATION
It should be obvious by now that this book may be hard going for readers who have an aversion to mathematics in general or statistics in particular. At the same time, I feel K does a good job of presenting the material in a form that should be accessible to readers who do not have a strong background in statistics. The book is full of examples illustrating the various points. And the fact that the same eight word lists are used throughout the book makes it easier to follow the arguments related to variations in the procedures.

I feel K has demonstrated that it is possible to develop procedures that yield statistically significant results (that is, issue two from above). At the same time, I do not feel K demonstrates how the procedures will bring together the two sides regarding the issue of how significant a role word lists should play in determining language relatedness. The problem is that most of the discussion regarding this issue deals with languages whose relationship is very remote, while the methodology presented here only seems to be applicable to languages related at the level of Indo-European. K never shows how the methodology could be adapted to test more remote relationships.

In addition, I am not sure that K's requirement that the analysis must be based on a pre-determined procedure, on word lists that are chosen without reference to any of the other languages to be analyzed, will be acceptable to those interested in determining remote relationships.

This is not to say, however, that the methodology is without merit. In some areas like Papua New Guinea and Africa, relationships have not been firmly established even at the level of Indo-European. In addition, the chapters on lexical items that should be eliminated from the analysis (7-9) discuss issues that are important for anyone involved in the analysis of word lists. I have seen many analyses (my own included) that fail to take into consideration internal cognates.

A major thrust of the book is that 'more is not necessarily better'. K demonstrates the importance of choosing carefully the words to be analyzed. It is better to analyze a smaller set of words that have been screened in terms of origin than to analyze a large number of words that are of questionable status. In other words, K argues that attempts to bolster an analysis based on word lists of questionable status by simply adding more words actually work against the trustworthiness of the analysis. At the same time, this will make the procedure more difficult to apply in situations as in Papua New Guinea where it is difficult to gather the information necessary to compile trustworthy word lists. Technical dictionaries of the caliber used by K simply do not exist in many of the languages there.

K also makes it clear that the procedures proposed in this book are not a replacement for the more traditional tasks of establishing cognates. Instead, the procedures are meant to show which languages are good candidates for such a task.

In conclusion, while I am not sure how influential the book will be in the debate over the use of word lists for determining remote relationships, I feel the book has a lot to offer to those involved in more mundane analysis of word lists.

BIBLIOGRAPHY
Greenberg, Joseph H. and Merritt Ruhlen. 1992. Linguistic origins of Native Americans. Scientific American 267:94-99.

ABOUT THE REVIEWER
John M Clifton has been involved in sociolinguistic research involving, among other aspects, language relationships, in Papua New Guinea from 1982 to 1994. More recently, he has just finished coordinating the work of a team of researchers working in language use and attitudes among speakers of less-commonly-spoken languages in Azerbaijan.

-- M. Hubey
-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
The only difference between humans and machines is that humans can be created by unskilled labor. Arthur C. Clarke
/\/\/\/\//\/\/\/\/\/\/ http://www.csam.montclair.edu/~hubey
---<><><><><><><><><><><><>----Language----<><><><><><><><><><><><><>
Copyrights/"Fair Use": http://www.templetons.com/brad/copymyths.html
The "fair use" exemption to copyright law was created to allow things such as commentary, parody, news reporting, research and education about copyrighted works without the permission of the author.
That's important so that copyright law doesn't block your freedom to express your own works -- only the ability to express other people's. Intent, and damage to the commercial value of the work are important considerations.

You are currently subscribed to language as: language at listserv.linguistlist.org
To unsubscribe send a blank email to leave-language-4283Y at csam-lists.montclair.edu

From hubeyh at mail.montclair.edu Wed Feb 5 01:01:06 2003
From: hubeyh at mail.montclair.edu (H.M. Hubey)
Date: Tue, 4 Feb 2003 20:01:06 -0500
Subject: [language] [Fwd: The Neuroscience of Language: On Brain Circuits of Words and Serial Order]
Message-ID:

<><><><><><><><><><><><>--This is the Language List--<><><><><><><><><><><><><>

The Neuroscience of Language: On Brain Circuits of Words and Serial Order
Friedemann Pulvermuller
Hardcover: 275 pages
Publisher: Cambridge University Press (January 2003)
ISBN: 0521790263

AMAZON - US http://www.amazon.com/exec/obidos/ASIN/0521790263/darwinanddarwini
AMAZON - UK http://www.amazon.co.uk/exec/obidos/ASIN/0521790263/humannaturecom

How is language organized in the human brain? The Neuroscience of Language puts forth the first systematic model of language to bridge the gap between linguistics and neuroscience. Neuronal models of word and serial order processing are presented in the form of a computational, connectionist neural network. The linguistic emphasis is on words and elementary syntactic rules. Introductory chapters focus on neuronal structure and function, cognitive brain processes, the basics of classical aphasia research and modern neuroimaging of language, neural network approaches to language, and the basics of syntactic theories. The essence of the work is contained in chapters on neural algorithms and networks, basic syntax, serial order mechanisms, and neuronal grammar.
Throughout, excursuses illustrate the functioning of brain models of language, some of which are accessible as animations on the book's accompanying web site. It will appeal to graduate students and researchers in neuroscience, psychology, linguistics, and computational modeling.

Contents
Preface; 1. A guide to the book; 2. Neuronal structure and function; 3. From aphasia research to neuroimaging; 4. Words in the brain; Excursus E1: Explaining neuropsychological double dissociations; 5. Regulation, overlap, and web tails; 6. Neural algorithms and neural networks; 7. Basic syntax; 8. Serial order mechanisms I: Synfire chains; 9. Serial order mechanisms II: Sequence detectors; 10. Neuronal grammar; Excursus E2: Basic bits of neuronal grammar; Excursus E3: A web response to a sentence; 11. Neuronal grammar and algorithms; 12. Refining neuronal grammar; Excursus E4: Multiple reverberation for resolving lexical ambiguity; Excursus E5: Multiple reverberation and multiple center embeddings; 13. Neurophysiology of syntax; 14. Linguistics and the brain.

Download sample chapter: http://assets.cambridge.org/0521790263/sample/0521790263WS.pdf

-- M. Hubey
From hubeyh at mail.montclair.edu Tue Feb 11 16:03:10 2003
From: hubeyh at mail.montclair.edu (H.M. Hubey)
Date: Tue, 11 Feb 2003 11:03:10 -0500
Subject: [language] Kessler Review
Message-ID:

<><><><><><><><><><><><>--This is the Language List--<><><><><><><><><><><><><>

BOOK REVIEW by H.M. Hubey, Department of Computer Science, Montclair State University, New Jersey

Kessler, Brett. 2001. The Significance of Word Lists. CSLI Publications, x+277pp, hardback ISBN 1-57586-299-9, paperback ISBN 1-57586-300-6, Dissertations in Linguistics. Announced at http://linguistlist.org/issues/12/12-790.html#1

DESCRIPTION OF THE BOOK
The major issues addressed in the book are (i) the concept of distance: similarity, (ii) the comparative method, (iii) statistical tests, specifically the chi-square test, and (iv) data cleaning so that the chi-square test gives good results. A quick conclusion/summary is in order: (i) the book is excellent, (ii) contrary to expectation it is not about statistics but rather linguistics, and (iii) its significance lies in its use of the methods of probability theory in a comprehensive way instead of the simple and patched methods used previously.

In fact, the book's jacket explains the problem most succinctly: The most strident controversies in historical linguistics debate whether claims for historical connections between languages are erroneously based on chance similarities between word lists. But even though it is the province of statistical mathematics to judge whether evidence is significant or due to chance, neither side in these debates uses statistics, leaving readers little room to adjudicate competing claims objectively.
This book fills that gap by presenting a new statistical methodology that helps linguists decide whether short word lists have more recurrent sound correspondences than can be expected by chance. The author shows that many of the complicating rules of thumb linguists invoke to obviate chance resemblances, such as multilateral comparison or emphasizing grammar over vocabulary, actually decrease the power of quantitative tests. But while the statistical methodology itself is straightforward, the author also details the extensive linguistic work needed to produce word lists that do not yield nonsensical results.

The first problem Kessler tackles is the usual confusion between "resemblance" and "cognate", "proof" vs "statistics", and "distance" vs "similarity". It is not unusual even for "alleged" quantitative linguists to get these concepts backwards, and then use Freudian projection as defense.

First, the most important concept. Suppose we attempt to ascertain the abstract property which the compound word hotness:coldness represents. From what we know we can see that these measure the same thing, but the scales run in opposite directions. In this case, the property we measure has an unequivocal name, temperature. Suppose we attempt it with nearness:farness. It is easy to see that the property being measured is distance. However, this word is not abstract enough. We can also measure "time" with the same compound word. Or we may use long_ago:recent, or even distantness:recentness. In perceptual space (not physical space or temporal space) the common word in use is "similarity". Thus distance:similarity measures a concept called "distance". The reason for this seeming contradiction is the fact that natural languages often have words with two meanings. We might correct it via similarity:dissimilarity, which then measures "distance".
It is this dual usage of distance that often confuses people, especially in conjunction with the word "similarity" or "dissimilarity". In other words, we have distance1:similarity, and this concept we measure via distance2. Having said this, we can say that comparative methods are attempts to measure how far from chance the observed data are, nothing more, nothing less. This much is made crystal clear by Kessler, who obviously understands historical linguistics methodology better than some linguists despite being a psychologist. But surely not being a linguist should not be held against him. It is not unusual for new methods to be brought into a field by outsiders. It happens in physics, engineering, computer science, genetics, biology, economics. Why not linguistics?

Once this is clear, it becomes clearer why binary comparison can be put on the same footing as multi-way comparisons. After all, whatever the data represent, all we want to know is "what is the probability that this data occurred purely due to chance?" We obviously want this number to be as small as possible if we want to conclude that the data represent an event that is not due to chance. Kessler also makes it clear that if we obtain a very small probability that the data arose purely by chance, all we can conclude is exactly that. How it came about depends on other assumptions.

Kessler is very clear on the fact that the Swadesh list is nothing more than a formalization of concepts that historical linguists developed over centuries: a list that would be useful for making tests, a list that avoided technological borrowings, onomatopoeic words, and other "unreliable" words. Such lists were created by Swadesh. Any linguist who has anything against the lists is in disagreement not only with Swadesh but also with the basic postulates of historical linguistics.

In summary, what we want is a test or a number that tells us how far from chance distribution the data are.
There is such a test: the chi-square test. But there is a catch; the data must be independent. That means that if the words for 'finger' and 'foot' come from the same root, they are not independent, and the chi-square test will give incorrect results since it is based on the independence hypothesis. Kessler gives a small and short example of how it works. In truth he probably should have given a whole chapter or two on the mathematics of the chi-square test; however, he probably decided that it can be found in any statistics book. Instead he concentrates on selection procedures for the words. At the end of the book there is the Swadesh list for the languages which Ringe used for his early attempt at the use of statistics. Kessler shows that many of these words are borrowings from other languages. In any case, any test will give incorrect results if the inputs are incorrect. In computer science this is called GIGO: Garbage In, Garbage Out. There is never any substitute, not yet, for human intelligence.

However, there is now at least one way of comparing the closeness of languages to each other using some number. That is about the closest concept to "distance" that historical linguistics has ever reached. It would have been better if he had developed it further by normalizing it. For example, let d(x,y) be the "dissimilarity" between languages x and y, and let s(x,y) be the "similarity" between languages x and y. Then by normalizing these quantities to the interval [0,1] we can easily see that s(x,y) = 1 - d(x,y). Obviously, d(x,y) should be normalized so that d(x,x) = 0. It can be seen already that if z and w are two "most-distant" languages then d(z,w) = 1. That is something that still needs more work.

As already mentioned, most of the book is spent on the actual results of the comparisons amongst the languages. Kessler makes various changes, e.g. Swadesh 100 vs. Swadesh 200, using only the first phoneme vs. using more phonemes, etc.
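The normalization suggested above is easy to state in code. The scaling rule below (dividing a raw dissimilarity score by the score of a maximally distant pair) is my own illustrative choice, not a scheme from the book:

```python
def distance(raw, raw_max):
    """Normalize a raw dissimilarity score into the interval [0, 1]."""
    if raw_max <= 0:
        raise ValueError("raw_max must be positive")
    return min(max(raw / raw_max, 0.0), 1.0)

def similarity(raw, raw_max):
    """s(x,y) = 1 - d(x,y), once d has been normalized to [0, 1]."""
    return 1.0 - distance(raw, raw_max)

# d(x,x) = 0 for a language compared with itself,
# d(z,w) = 1 for a maximally distant pair:
print(distance(0.0, 8.0), distance(8.0, 8.0), similarity(2.0, 8.0))
# -> 0.0 1.0 0.75
```

The open question flagged above remains: defining raw_max so that two genuinely "most-distant" languages actually score d = 1 is the part that still needs work.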
He also discusses thoroughly the problems with using more than a single phoneme, or even a single phoneme. The problem is that we do not know which phoneme should be cognate with which phoneme. In other words, what if one language has lost the initial consonants? Then we would be attempting to match a consonant to a vowel, which is certain to produce bad results. In the parlance of data mining, this is called data cleaning, or pre-processing, and it is an important part of the analysis. Kessler discusses such problems thoroughly and clearly. The final result is that historical linguistics is on its way to becoming a rigorous science like those that preceded it. Kessler probably could have spent more time (and space) explaining the concept of hypothesis testing, false positives, false negatives, etc., even if only in an appendix.

Kessler discusses in other chapters how to go about making use of consonants other than the first one in the comparanda. The main problem is the one faced by researchers in speech recognition and genetics: the phonemes have to be "aligned". That is, it is possible that one of the languages could have lost the initial consonant, or could have gone through a metathesis, etc. Therefore some algorithms are needed to automatically obtain optimum alignment. These require the existence of phonetic/phonemic distance measures, but these are much easier than semantic distance (and do already exist in various forms, even if only implicitly). There is much more to the book that should be of interest to historical linguists.

In summary, the book is excellent but might require some work for those linguists who have math-anxiety or any kind of aversion to quantitative techniques. However, beginning statistics courses are now taught at universities at the general-education level, and there is no excuse for anyone not to have at least some grasp of the fundamentals of statistics and probability theory. Time never goes backwards.

-- M. Hubey

From hubeyh at mail.montclair.edu Sat Feb 15 01:11:44 2003
From: hubeyh at mail.montclair.edu (H.M. Hubey)
Date: Fri, 14 Feb 2003 20:11:44 -0500
Subject: [language] The Reconstruction Engine (Computational Linguistics 20.3)
Message-ID:

<><><><><><><><><><><><>--This is the Language List--<><><><><><><><><><><><><>

http://www.linguistics.berkeley.edu/~jblowe/REWWW/RE.html

-- M. Hubey

An HTML attachment was scrubbed...
URL:

From hubeyh at mail.montclair.edu Wed Feb 19 15:09:07 2003
From: hubeyh at mail.montclair.edu (H.M. Hubey)
Date: Wed, 19 Feb 2003 10:09:07 -0500
Subject: [language] [Fwd: [evol-psych] Infants may offer clues to language development]
Message-ID:

<><><><><><><><><><><><>--This is the Language List--<><><><><><><><><><><><><>

-------- Original Message --------
Subject: [evol-psych] Infants may offer clues to language development
Date: Wed, 19 Feb 2003 10:16:57 +0000
From: Ian Pitchford
Reply-To: Ian Pitchford
Organization: http://human-nature.com
To: evolutionary-psychology at yahoogroups.com

Public release date: 17-Feb-2003
Contact: Jenny Saffran jsaffran at facstaff.wisc.edu 608-262-9942
University of Wisconsin-Madison

Infants may offer clues to language development

[Caption: A TV monitor displays a video camera image of David Niergarth and his 9-month-old son Harper in a sound-proof room during their volunteer participation in an infant auditory test study led by psychology professor Jenny Saffran at the Waisman Center Infant Learning Lab. Photo by: Jeff Miller]

DENVER - You may not know it, but you took a course in linguistics as a baby. By listening to the talk around them, infants pick up sound patterns that help them understand the speech they hear, according to new research from the University of Wisconsin-Madison. But this research also shows that some patterns are easier to identify, suggesting that the development of human language may have been shaped by what infants could learn. These results were presented here today, Monday, Feb. 17, at the annual meeting of the American Association for the Advancement of Science.

In a series of forthcoming papers, psychologist Jenny Saffran, who directs the Infant Learning Laboratory at UW-Madison, suggests how infants quickly acquire language, specifically their ability to find word boundaries - where words begin and end - from a steady stream of speech. "We've known for a long time that babies acquire language rapidly," she says, "but what we haven't known is how they do it."

In all her studies, Saffran introduces her infant listeners to an artificial, or nonsense, language. Examples of words include "giku," "tuka" and "bugo."
By using these made-up words, which the tiny listeners have never heard before, Saffran can isolate particular elements found in natural languages such as English. For just a couple of minutes, the infants hear dozens of two-syllable words strung together in a stream of monotone speech, unbroken by any pauses (for example, gikutukabugo...). The words are presented in a particular order that reveals a sound pattern. If babies recognize the pattern, says Saffran, they will use it to quickly identify word boundaries in what they hear next. To test this, Saffran introduces her listeners to a new string of nonsense words in which only some of them fit the pattern heard earlier. Saffran records how long the infants listen to the parts that conform to the pattern and the parts that don't. A significant difference in times, she explains, means the infants did pick up the pattern. As her recent studies show, infants do learn sound patterns, which then help them learn words and, ultimately, grammar. Their ability to do this, however, depends on age. By exposing infants who are 6-and-a-half and 9 months old to a string of made-up words in a certain order, Saffran learned that the two age groups use different strategies to determine where words end and begin. While the younger listeners identified word boundaries by relying on the likelihood that certain sounds occur together, the older listeners paid attention to what speech sounds were emphasized, or stressed. Because 90 percent of two-syllable words in English follow the same stress pattern, says Saffran, infants can use the pattern to determine the word boundaries. "At different points in development, babies orient towards some cues and not others," says Saffran. Why? "More linguistic experience." 
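The boundary-finding strategy described above (relying on the likelihood that certain syllables occur together) can be sketched as a toy computation of transitional probabilities. This is an editorial illustration, not Saffran's experimental procedure: her studies measure infant listening times, and the boundary threshold used here is an arbitrary assumption.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """P(next | current) for each adjacent syllable pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): k / first_counts[a] for (a, b), k in pair_counts.items()}

def segment(syllables, threshold=0.75):
    """Insert a word boundary wherever the transitional probability
    between two adjacent syllables dips below the threshold."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words
```

Run on a stream built from the nonsense words "giku," "tuka" and "bugo," the transitional probability within a word stays at 1.0 while the probability across a word boundary drops, so the dips recover the original words.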
Before infants can recognize that stressed and unstressed syllables are reliable indicators of word boundaries, explains Saffran, they must first know a few words - lessons they learn earlier by learning which sounds are likely to occur together. Findings from this study will be published in an upcoming issue of the journal Developmental Psychology. Once infants go from syllables to words, they then can recognize simple grammars, according to Saffran's second study now in press at the journal Infancy. At age one year - just three months after babies begin using stress cues - infants can recognize patterns in word orderings. After listening to a continuous string of words in a particular order, the infants were able to identify permissible word orderings. Just as noted in the other study, Saffran says that only after prior learning can infants acquire additional language abilities: "Until they learn words, the grammar is invisible." While these two studies looked at babies' ability to acquire sound patterns common in natural languages, a recent third study by the Wisconsin psychologist investigated infants' ability to acquire patterns not often heard in everyday speech. The question Saffran wanted to answer, she says, was, "'Does language work in a way that best fits the brain?'" In other words: Are certain sound patterns more common than others because they make it easier for infants to learn language? This study is in press at Developmental Psychology. Unlike the other studies, which exposed infants to generalizations in language patterns, such as the grouping of sounds, this study tested an infant's ability to recognize something more specific - that syllables begin with some sounds, such as /p/, /d/ and /k/, but not others, such as /b/, /t/ and /g/. This pattern, says Saffran, is uncommon in phonological systems, which tend to place restrictions on types of sound segments, not individual ones. 
As Saffran found when she measured how long the infants listened to words that did and didn't conform to the rare pattern, there was no significant difference in the listening times. This finding, she says, suggests that babies had difficulty acquiring the pattern. The infants' difficulty in identifying the unusual sound pattern in this third study, she says, is likely to be the result of removing information helpful to young listeners as they acquire language. "There are certain types of patterns that they're better at picking up," adds Saffran. "Perhaps human languages have these patterns to make language more learnable." Asking questions about what an infant can't learn, she says, can be just as interesting and informative as asking ones about what they can learn. In addition to providing knowledge about language deficits in some children, the answers could offer clues to how human language first developed and how it has evolved. ### NOTE TO PHOTO EDITORS/MULTIMEDIA EDITORS: To download high-resolution photos and a sample sound file to accompany this story, please visit: http://www.news.wisc.edu/newsphotos/saffran.html - Emily Carlson 608-262-9772, emilycarlson at facstaff.wisc.edu http://www.eurekalert.org/pub_releases/2003-02/uow-imo021103.php News in Brain and Behavioural Sciences - Issue 86 - 8th February, 2003 http://human-nature.com/nibbs/issue86.html
From hubeyh at mail.montclair.edu Thu Feb 20 12:41:15 2003 From: hubeyh at mail.montclair.edu (H.M. Hubey) Date: Thu, 20 Feb 2003 07:41:15 -0500 Subject: [language] [Fwd: [evol-psych] The gene that maketh man?] Message-ID: <><><><><><><><><><><><>--This is the Language List--<><><><><><><><><><><><><> -------- Original Message -------- Subject: [evol-psych] The gene that maketh man? Date: Thu, 20 Feb 2003 11:51:29 +0000 From: Ian Pitchford Reply-To: Ian Pitchford Organization: http://human-nature.com To: evolutionary-psychology at yahoogroups.com BBC NEWS ONLINE Tuesday, 18 February, 2003, 00:31 GMT The gene that maketh man? The gene is found only in human-like primates US scientists have identified a gene which they say could explain why humans are unique. It seems to have arisen between 21 and 33 million years ago, when primates were becoming more human-like. The gene emerged about the time the path that led to humans, chimps, orangutans and gorillas was splitting off from that of old and new world monkeys. The gene could have duplicated itself, creating many new ones specific to humans, according to researchers at Harvard University in Massachusetts. Genetic clues Science has long sought to explain why we are different from our closest animal cousins - the primates. Full text http://news.bbc.co.uk/1/hi/sci/tech/2772241.stm News in Brain and Behavioural Sciences - Issue 86 - 8th February, 2003 http://human-nature.com/nibbs/issue86.html
From hubeyh at mail.montclair.edu Tue Feb 4 01:33:18 2003 From: hubeyh at mail.montclair.edu (H.M. Hubey) Date: Mon, 3 Feb 2003 20:33:18 -0500 Subject: [language] "Comments on Clifton's review of Kessler in issue Message-ID: <><><><><><><><><><><><>--This is the Language List--<><><><><><><><><><><><><> I would like to review Kessler's book for the LinguistList. Please reply. -------- Original Message -------- Subject: Re: Kessler's book Date: Tue, 31 Dec 2002 12:37:02 -0500 (EST) From: Terry Langendoen To: "H.M.Hubey" References: <3E112ADE.10800 at mail.montclair.edu> Dear Ms/Mr Hubey, If you'd like to comment on either the book or the review, I suggest you send your comments as a regular email message to linguist at linguistlist.org. In the subject line include a reference to the original posting, e.g. "Comments on Clifton's review of Kessler in issue 13.491".
Terry Terry Langendoen, Linguist List book review editor http://linguistlist.org/issues/indices/Review2002r.html/ Co-Principal Investigator, EMELD Project http://emeld.douglass.arizona.edu/ & http://emeld.org/ Department of Linguistics, University of Arizona PO Box 210028, Tucson AZ 85721-0028, USA http://linguistics.arizona.edu/~langendoen/ On Tue, 31 Dec 2002, H.M.Hubey wrote: > > Hello, > > > I just read Kessler's book and would like to write a > review since this one does not do it justice. > > > > > > > > LINGUIST List 13.491 > > > Fri Feb 22 2002 > > > Review: Kessler, The Significance of Word Lists > > Editor for this issue: Terence Langendoen > > > ------------------------------------------------------------------------ > What follows is another discussion note contributed to our Book > Discussion Forum. We expect these discussions to be informal and > interactive; and the author of the book discussed is cordially invited > to join in. If you are interested in leading a book discussion, look for > books announced on LINGUIST as "available for discussion." (This means > that the publisher has sent us a review copy.) Then contact Simin Karimi > at simin at linguistlist.org or Terry > Langendoen at terry at linguistlist.org . > Subscribe to Blackwell's LL+ at http://www.linguistlistplus.com/ and > donate 20% of your subscription to LINGUIST! You get 30% off on > Blackwells books, and free shipping and postage! > ------------------------------------------------------------------------ > > > Directory > > 1. John & Debbie Clifton, Review of Kessler: The Significance of Word > Lists > > ------------------------------------------------------------------------ > > > Message 1: Review of Kessler: The Significance of Word Lists > > Date: Fri, 22 Feb 2002 20:59:53 +0400 > From: John & Debbie Clifton > > Subject: Review of Kessler: The Significance of Word Lists > > Kessler, Brett. 2001. The Significance of Word Lists.
CSLI Publications, > x+277pp, hardback ISBN 1-57586-299-9, paperback ISBN 1-57586-300-6, > Dissertations in Linguistics. > Announced at http://linguistlist.org/issues/12/12-790.html#1 > > John M Clifton, Summer Institute of Linguistics and University of North > Dakota > > DESCRIPTION OF THE BOOK > The two major issues addressed in this book can be characterized in terms of > two senses of the word 'significance' as used in the title of the book. The > first issue is how significant word lists are to determining language > relatedness. The second issue is what is involved in showing that hypotheses > made on the basis of word lists are statistically significant. > > In chapter 1, 'Introduction', Kessler (K) addresses the two major positions > on the first issue. On the one side are those like Greenberg and Ruhlen > (1992) who feel that the analysis of word lists can be used to demonstrate > the links between remotely related languages. On the other side are scores > of more traditional historical linguists who claim that the similarities > used to establish these putative links are due to chance. K proposes a third > option: word lists can be used to establish linguistic relationships, but > only when following a rigid methodology designed to ensure the results will > be statistically significant. > > Chapters 2, 'Statistical Methodology', and 3, 'Significance Testing', are > the heart of the book. In these chapters K discusses statistical methodology > in general, and then details the specific methodology proposed for the > analysis of word lists. K then applies this test to Swadesh 100 word lists > from eight languages: Latin, French, English, German, Albanian, Hawaiian, > Navajo, and Turkish. With a few exceptions, the results of the procedure > indicate that the first five are related, and the others are not. At the > risk of over-simplifying a complex procedure, I will attempt to summarize > the contents of the methodology.
Feel free to skip the next paragraph if it > is too obtuse. > > The methodology involves constructing a table of correspondences of > word-initial segments in semantically related words in two languages. This > table can then be analyzed using the chi-square test for significance. From > a statistical point of view, the problem is that the number of occurrences > of specific correspondences is too low for the chi-square test to be > meaningful. To remedy this, K proposes the use of a Monte Carlo technique. > Applying this technique, one of the word lists is randomized, a new table is > constructed, and the chi-square test is applied to the new table. This > procedure is repeated 10,000 times. Now the value of the original table is > compared with the values of these 10,000 tables generated by the Monte Carlo > technique, and a valid level of significance can be attached to the original > value. > > As indicated above, the methodology as proposed does not always correctly > identify which languages are related. There are both false positives in > which a relationship is posited between apparently unrelated languages like > Latin and Navajo, and false negatives in which no relationship is posited > between related languages like Albanian and German. K points out that false > positives are unavoidable in statistics; the goal is to minimize them. False > negatives, on the other hand, should be eliminated. In addition, it would be > nice if the methodology could distinguish between closer relationships like > those between English and German, and more distant relationships like those > between English and Albanian. In chapters 4-10, K discusses various ways in > which the methodology might be improved. 
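At the risk of compounding the over-simplification, the randomization test described in the previous paragraph can be sketched in a few lines of Python. This is an editorial toy, not K's implementation: it reduces each word to its initial segment, uses a flat Pearson chi-square over the correspondence table, and the function names are my own invention.

```python
import random
from collections import Counter

def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a
    Counter of (segment_a, segment_b) correspondence pairs."""
    n = sum(table.values())
    row, col = Counter(), Counter()
    for (a, b), k in table.items():
        row[a] += k
        col[b] += k
    stat = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = table.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

def monte_carlo_p(list_a, list_b, trials=10_000, seed=0):
    """Estimate how often a random re-pairing of the two word lists yields
    a chi-square value at least as large as the observed one."""
    initials_a = [w[0] for w in list_a]
    initials_b = [w[0] for w in list_b]
    observed = chi_square(Counter(zip(initials_a, initials_b)))
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        shuffled = initials_b[:]
        rng.shuffle(shuffled)  # randomize one list, rebuild the table
        if chi_square(Counter(zip(initials_a, shuffled))) >= observed:
            hits += 1
    return hits / trials
```

The observed chi-square is compared against the distribution obtained by shuffling one list; the returned fraction plays the role of the significance level attached to the original table.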
> > In chapter 4, 'Tests in Different Environments', K concludes that > predictions are not improved by comparing features other than the > word-initial consonant, for example, the first consonant of the second > syllable, or the first vowel, or some combination of the above. Then in > chapter 5, 'Size of the Word Lists', K shows that increasing the size of the > word lists by using the Swadesh 200 word list instead of the Swadesh 100, > does not improve the predictions. > > Chapter 6, 'Precision and Lumping', deals with the implications of two types > of historical changes. First, phonemes can split or merge so that, for > example, /t/ in language A may correspond to /t/, /tj/, and /tw/ in language > B. Second, semantic shifts occur which result in, for example, the lexical > item for 'skin' in language A being related to the lexical item for 'bark' > in language B. K rejects attempts to incorporate such factors into the > procedures on the basis of practical considerations related to the > methodological requirement that lexical items be chosen without reference to > their similarity to forms in other languages. > > Chapters 7-9 deal with what lexical items may need to be eliminated from the > analysis. In chapter 7, 'Nonarbitrary Vocabulary', K discusses forms in > which the phonetic form may be at least partially determined by sound > symbolism including, but not limited to, onomatopoeia and nursery words. > Then K discusses loan words in chapter 8, 'Historical Connection vs. > Relatedness', and language-internally related forms in chapter 9, > 'Language-Internal Cognates'. Language-internally related forms include such > phenomena as one phonetic form for related meanings (for example, 'skin' and > 'bark' or 'egg' and 'seed') and derivationally related forms. K argues that > if the goal of the analysis is determining whether two languages are > genetically related, the nonarbitrary aspects of such forms need to be > eliminated.
> > Then, in chapter 10, 'Recurrence Metrics', K introduces some statistical > methods that might be used in place of the chi-square test. > > In the final chapter, 'Conclusions', K summarizes the actual procedures > proposed in the book, and then offers observations on what such procedures > have to offer the practice of historical linguistics. > > The book concludes with an appendix that includes all eight word lists that > are used to test the methodology presented in the book, references, and an > index. > > CRITICAL EVALUATION > It should be obvious by now that this book may be hard going for readers who > have an aversion to mathematics in general or statistics in particular. At > the same time, I feel K does a good job of presenting the material in a form > that should be accessible to readers who do not have a strong background in > statistics. The book is full of examples illustrating the various points. > And the fact that the same eight word lists are used throughout the book > makes it easier to follow the arguments related to variations in the > procedures. > > I feel K has demonstrated that it is possible to develop procedures that > yield statistically significant results (that is, issue two from above). At > the same time, I do not feel K demonstrates how the procedures will bring > together the two sides regarding the issue of how significant a role word > lists should play in determining language relatedness. The problem is that > most of the discussion regarding this issue deals with languages whose > relationship is very remote, while the methodology presented here only seems > to be applicable to languages related at the level of Indo-European. K never > shows how the methodology could be adapted to test more remote > relationships. 
> > In addition, I am not sure that K's requirement that the analysis must be > based on a pre-determined procedure, on word lists that are chosen without > reference to any of the other languages to be analyzed, will be acceptable > to those interested in determining remote relationships. > > This is not to say, however, that the methodology is without merit. In some > areas like Papua New Guinea and Africa, relationships have not been firmly > established even at the level of Indo-European. In addition, the chapters on > lexical items that should be eliminated from the analysis (7-9) discuss > issues that are important for anyone involved in the analysis of word lists. > I have seen many analyses (my own included) that fail to take into > consideration internal cognates. > > A major thrust of the book is that 'more is not necessarily better'. K > demonstrates the importance of choosing carefully the words to be analyzed. > It is better to analyze a smaller set of words that have been screened in > terms of origin than to analyze a large number of words that are of > questionable status. In other words, K argues that attempts to bolster an > analysis based on word lists of questionable status by simply adding more > words actually work against the trustworthiness of the analysis. At the > same time, this will make the procedure more difficult to apply in > situations such as Papua New Guinea, where it is difficult to gather the > information necessary to compile trustworthy word lists. Technical > dictionaries of the caliber used by K simply do not exist in many of the > languages there. > > K also makes it clear that the procedures proposed in this book are not a > replacement for the more traditional tasks of establishing cognates. > Instead, the procedures are meant to show which languages are good > candidates for such a task.
> > In conclusion, while I am not sure how influential the book will be in the > debate over the use of word lists for determining remote relationships, I > feel the book has a lot to offer to those involved in more mundane analysis > of word lists. > > BIBLIOGRAPHY > Greenberg, Joseph H. and Merritt Ruhlen. 1992. Linguistic origins of > Native Americans. Scientific American 267:94-99. > > ABOUT THE REVIEWER > John M Clifton has been involved in sociolinguistic research involving, > among other aspects, language relationships, in Papua New Guinea from 1982 > to 1994. More recently, he has just finished coordinating the work of a team > of researchers working in language use and attitudes among speakers of > less-commonly-spoken languages in Azerbaijan. > > -- M. Hubey -o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o The only difference between humans and machines is that humans can be created by unskilled labor. Arthur C. Clarke /\/\/\/\//\/\/\/\/\/\/ http://www.csam.montclair.edu/~hubey From hubeyh at mail.montclair.edu Wed Feb 5 01:01:06 2003 From: hubeyh at mail.montclair.edu (H.M.
Hubey) Date: Tue, 4 Feb 2003 20:01:06 -0500 Subject: [language] [Fwd: The Neuroscience of Language: On Brain Circuits of Words and Serial Order] Message-ID: <><><><><><><><><><><><>--This is the Language List--<><><><><><><><><><><><><> The Neuroscience of Language: On Brain Circuits of Words and Serial Order Friedemann Pulvermuller Hardcover: 275 pages Publisher: Cambridge University Press (January 2003) ISBN: 0521790263 AMAZON - US http://www.amazon.com/exec/obidos/ASIN/0521790263/darwinanddarwini AMAZON - UK http://www.amazon.co.uk/exec/obidos/ASIN/0521790263/humannaturecom How is language organized in the human brain? The Neuroscience of Language puts forth the first systematic model of language to bridge the gap between linguistics and neuroscience. Neuronal models of word and serial order processing are presented in the form of a computational, connectionist neural network. The linguistic emphasis is on words and elementary syntactic rules. Introductory chapters focus on neuronal structure and function, cognitive brain processes, the basics of classical aphasia research and modern neuroimaging of language, neural network approaches to language, and the basics of syntactic theories. The essence of the work is contained in chapters on neural algorithms and networks, basic syntax, serial order mechanisms, and neuronal grammar. Throughout, excursuses illustrate the functioning of brain models of language, some of which are accessible as animations on the book's accompanying web site. It will appeal to graduate students and researchers in neuroscience, psychology, linguistics, and computational modeling. Contents Preface; 1. A guide to the book; 2. Neuronal structure and function; 3. From aphasia research to neuroimaging; 4. Words in the brain; Excursus E1: Explaining neuropsychological double dissociations; 5. Regulation, overlap, and web tails; 6. Neural algorithms and neural networks; 7. Basic syntax; 8.
Serial order mechanisms I: Synfire chains; 9. Serial order mechanisms II: Sequence detectors; 10. Neuronal grammar; Excursus E2: Basic bits of neuronal grammar; Excursus E3: A web response to a sentence; 11. Neuronal grammar and algorithms; 12. Refining neuronal grammar; Excursus E4: Multiple reverberation for resolving lexical ambiguity; Excursus E5: Multiple reverberation and multiple center embeddings; 13. Neurophysiology of syntax; 14. Linguistics and the brain. Download sample chapter http://assets.cambridge.org/0521790263/sample/0521790263WS.pdf From hubeyh at mail.montclair.edu Tue Feb 11 16:03:10 2003 From: hubeyh at mail.montclair.edu (H.M. Hubey) Date: Tue, 11 Feb 2003 11:03:10 -0500 Subject: [language] Kessler Review Message-ID: <><><><><><><><><><><><>--This is the Language List--<><><><><><><><><><><><><> BOOK REVIEW by H.M. Hubey, Department of Computer Science, Montclair State University, New Jersey DESCRIPTION OF THE BOOK Kessler, Brett. 2001. The Significance of Word Lists.
CSLI Publications, x+277pp, hardback ISBN 1-57586-299-9, paperback ISBN 1-57586-300-6, Dissertations in Linguistics. Announced at http://linguistlist.org/issues/12/12-790.html#1 The major issues addressed in the book are (i) the concept of distance:similarity, (ii) the comparative method, (iii) statistical tests, specifically the chi-square test, and (iv) data-cleaning so that the chi-square test gives good results. A quick conclusion/summary is in order: (i) the book is excellent, (ii) contrary to expectation it is not about statistics but rather linguistics, and (iii) its significance lies in its use of the methods of probability theory in a comprehensive way instead of the simple and patched methods used previously. In fact, the book's jacket explains the problem most succinctly: The most strident controversies in historical linguistics debate whether claims for historical connections between languages are erroneously based on chance similarities between word lists. But even though it is the province of statistical mathematics to judge whether evidence is significant or due to chance, neither side in these debates uses statistics, leaving readers little room to adjudicate competing claims objectively. This book fills that gap by presenting a new statistical methodology that helps linguists decide whether short word lists have more recurrent sound correspondences than can be expected by chance. The author shows that many of the complicating rules of thumb linguists invoke to obviate chance resemblances, such as multilateral comparison or emphasizing grammar over vocabulary, actually decrease the power of quantitative tests. But while the statistical methodology itself is straightforward, the author also details the extensive linguistic work needed to produce word lists that do not yield nonsensical results. The first problem Kessler tackles is the usual confusion between "resemblance" and "cognate", "proof" vs "statistics", and "distance" vs "similarity".
It is not unusual even for "alleged" quantitative linguists to get these concepts backwards, and then use Freudian projection as defense. First, the most important concept. Suppose we attempt to ascertain the abstract property which the compound word hotness:coldness represents. From what we know, we can see that these measure the same thing, but the scales run in opposite directions. In this case, the property we measure has an unequivocal name, temperature. Suppose we attempt it with nearness:farness. It is easy to see that the property being measured is distance. However, this word is not abstract enough. We can also measure "time" with the same compound word, or we may use long_ago:recent, or even distantness:recentness. In perceptual space (not physical space or temporal space) the common word in use is "similarity". Thus distance:similarity measures a concept called "distance". The reason for this seeming contradiction is the fact that natural languages often have words with two meanings. We might correct it via similarity:dissimilarity, which then measures "distance". It is this dual usage of distance that often confuses people, especially in conjunction with the word "similarity" or "dissimilarity". In other words, we have distance1:similarity, and this concept we measure via distance2. Having said this, we can say that comparative methods are attempts to measure how far from chance the observed data is, nothing more, nothing less. This much is made crystal clear by Kessler, who obviously understands historical linguistics methodology better than some linguists despite being a psychologist. But surely not being a linguist should not be held against him. It is not unusual for new methods to be brought into a field by outsiders. It happens in physics, engineering, computer science, genetics, biology, and economics. Why not linguistics? Once this is clear, it becomes clearer why binary comparison can be put on the same footing as multi-way comparisons.
After all, whatever the data represents, all we want to know is "what is the probability that this data occurred purely due to chance?" We obviously want this number to be as small as possible if we want to conclude that the data represent an event that is not due to chance. Kessler also makes it clear that if we obtain a very small probability that the data arose by chance, all we can conclude is exactly that. How it came about depends on other assumptions. Kessler is very clear on the fact that the Swadesh list is nothing more than a formalization of concepts that historical linguists developed over centuries. A list useful for making tests would avoid technological borrowings, onomatopoeic words, and other "unreliable" words, and such lists were created by Swadesh. Any linguist who has anything against the lists is in disagreement not only with Swadesh but also with the basic postulates of historical linguistics. In summary, what we want is a test or a number that tells us how far from a chance distribution the data are. There is such a test: the chi-square test. But there is a catch; the data must be independent. That means that if the words for finger and foot come from the same root, they are not independent, and the chi-square test will give incorrect results, since it is based on the independence hypothesis. Kessler gives a small and short example of how it works. In truth he probably should have given a chapter or two on the mathematics of the chi-square test; however, he probably decided that it can be found in any statistics book. Instead he concentrates on selection procedures for the words. At the end of the book there is the Swadesh list for the languages which Ringe used for his early attempt at the use of statistics. Kessler shows that many of these words are borrowings from other languages. In any case, any test will give incorrect results if the inputs are incorrect.
In computer science this is called GIGO: Garbage In, Garbage Out. There is as yet no substitute for human intelligence. However, there is now at least one way of comparing the closeness of languages to each other using a number, which is about the closest concept to "distance" that historical linguistics has ever reached. It would have been better if he had developed it further by normalizing it. For example, let d(x,y) be the "dissimilarity" between languages x and y, and let s(x,y) be the "similarity" between languages x and y. Then, by normalizing these quantities to the interval [0,1], we can easily see that s(x,y) = 1 - d(x,y). Obviously, d(x,y) should be normalized so that d(x,x) = 0. It can be seen already that if z and w are two "most-distant" languages, then d(z,w) = 1. That is something that still needs more work.

As already mentioned, most of the book is spent on the actual results of the comparisons among the languages. Kessler makes various changes, e.g. Swadesh-100 vs. Swadesh-200, using only the first phoneme vs. using more phonemes, etc. He also discusses thoroughly the problems with using more than a single phoneme, or even a single phoneme. The problem is that we do not know which phoneme should be cognate with which phoneme: what if one language has lost its initial consonants? Then we would be attempting to match a consonant to a vowel, which is certain to produce bad results. In the parlance of data mining this is called data cleaning, or preprocessing, and it is an important part of analysis. Kessler discusses such problems thoroughly and clearly. The final result is that historical linguistics is on its way to becoming a rigorous science like those that preceded it. Kessler probably could have spent more time (and space) explaining the concepts of hypothesis testing, false positives, false negatives, etc., even if only in an appendix.
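The normalization suggested above can be sketched in a few lines. Everything here is a toy assumption: the dissimilarity measure (fraction of glosses whose first phonemes differ) is a crude stand-in, and the word lists are invented, not real Swadesh entries.

```python
# Minimal sketch of the normalization d(x,y) in [0,1] with s = 1 - d,
# using an invented first-phoneme-mismatch fraction as "dissimilarity".

def dissimilarity(list_x, list_y):
    """Fraction of aligned glosses whose first phonemes differ."""
    mismatches = sum(1 for a, b in zip(list_x, list_y) if a[0] != b[0])
    return mismatches / len(list_x)

x = ["hand", "foot", "sun", "water"]
y = ["hond", "fot", "zon", "mizu"]   # toy "cognate-ish" forms

d = dissimilarity(x, y)
s = 1 - d
print(d, s)                          # d + s = 1 by construction
assert dissimilarity(x, x) == 0.0    # identity requirement d(x,x) = 0
```

By construction d(x,x) = 0 and s(x,y) = 1 - d(x,y) hold exactly; making d(z,w) = 1 for the two "most-distant" languages is the part that, as noted above, still needs work.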
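The matching problem just described, where a lost initial consonant throws off a phoneme-by-phoneme comparison, is exactly what sequence-alignment algorithms handle. Below is a rough dynamic-programming sketch with unit gap and mismatch costs standing in for a real phonetic distance; the example forms are invented.

```python
# Rough sketch of phoneme alignment via dynamic programming
# (Needleman-Wunsch-style edit cost with unit penalties). A real
# system would substitute a graded phonetic distance for the 0/1 costs.

def align_cost(a, b, gap=1, mismatch=1):
    """Minimum edit cost between two phoneme sequences."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i * gap
    for j in range(1, n + 1):
        dp[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = dp[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            dp[i][j] = min(sub, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[m][n]

# A language that lost its initial consonant still aligns cheaply:
print(align_cost(list("atar"), list("katar")))   # -> 1 (one gap)
```

With an alignment like this in hand, the comparison matches consonants to consonants and vowels to vowels instead of blindly pairing the n-th phoneme of each word.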
Kessler discusses in other chapters how to make use of consonants other than the first one in the comparanda. The main problem is the one faced by researchers in speech recognition and genetics: the phonemes have to be "aligned". That is, it is possible that one of the languages has lost the initial consonant, or has gone through a metathesis, etc. Therefore some algorithms are needed to automatically obtain an optimum alignment. These require the existence of a phonetic/phonemic distance, but such distances are much easier to construct than semantic distance (and they do already exist in various forms, even if only implicitly). There is much more in the book that should be of interest to historical linguists.

In summary, the book is excellent but might require some work from those linguists who have math anxiety or any kind of aversion to quantitative techniques. However, beginning statistics courses are now taught at universities at the general-education level, and there is no excuse for anyone not to have at least some grasp of the fundamentals of statistics and probability theory. Time never goes backwards.

-- M. Hubey
-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
The only difference between humans and machines is that humans can be created by unskilled labor. Arthur C. Clarke
/\/\/\/\//\/\/\/\/\/\/ http://www.csam.montclair.edu/~hubey

---<><><><><><><><><><><><>----Language----<><><><><><><><><><><><><>
Copyrights/"Fair Use": http://www.templetons.com/brad/copymyths.html The "fair use" exemption to copyright law was created to allow things such as commentary, parody, news reporting, research and education about copyrighted works without the permission of the author. That's important so that copyright law doesn't block your freedom to express your own works -- only the ability to express other people's. Intent, and damage to the commercial value of the work are important considerations.
You are currently subscribed to language as: language at listserv.linguistlist.org To unsubscribe send a blank email to leave-language-4283Y at csam-lists.montclair.edu

From hubeyh at mail.montclair.edu Sat Feb 15 01:11:44 2003
From: hubeyh at mail.montclair.edu (H.M. Hubey)
Date: Fri, 14 Feb 2003 20:11:44 -0500
Subject: [language] The Reconstruction Engine (Computational Linguistics 20.3)

http://www.linguistics.berkeley.edu/~jblowe/REWWW/RE.html

-- M. Hubey

From hubeyh at mail.montclair.edu Wed Feb 19 15:09:07 2003
From: hubeyh at mail.montclair.edu (H.M.
Hubey)
Date: Wed, 19 Feb 2003 10:09:07 -0500
Subject: [language] [Fwd: [evol-psych] Infants may offer clues to language development]

-------- Original Message --------
Subject: [evol-psych] Infants may offer clues to language development
Date: Wed, 19 Feb 2003 10:16:57 +0000
From: Ian Pitchford
Reply-To: Ian Pitchford
Organization: http://human-nature.com
To: evolutionary-psychology at yahoogroups.com

Public release date: 17-Feb-2003
Contact: Jenny Saffran, jsaffran at facstaff.wisc.edu, 608-262-9942, University of Wisconsin-Madison

Infants may offer clues to language development

[Photo caption: A TV monitor displays a video camera image of David Niergarth and his 9-month-old son Harper in a sound-proof room during their volunteer participation in an infant auditory test study led by psychology professor Jenny Saffran at the Waisman Center Infant Learning Lab. Photo by: Jeff Miller]

DENVER - You may not know it, but you took a course in linguistics as a baby. By listening to the talk around them, infants pick up sound patterns that help them understand the speech they hear, according to new research from the University of Wisconsin-Madison. But this research also shows that some patterns are easier to identify than others, suggesting that the development of human language may have been shaped by what infants could learn. These results were presented here today, Monday, Feb. 17, at the annual meeting of the American Association for the Advancement of Science.
In a series of forthcoming papers, psychologist Jenny Saffran, who directs the Infant Learning Laboratory at UW-Madison, suggests how infants quickly acquire language, specifically their ability to find word boundaries - where words begin and end - from a steady stream of speech. "We've known for a long time that babies acquire language rapidly," she says, "but what we haven't known is how they do it."

In all her studies, Saffran introduces her infant listeners to an artificial, or nonsense, language. Examples of words include "giku," "tuka" and "bugo." By using these made-up words, which the tiny listeners have never heard before, Saffran can isolate particular elements found in natural languages such as English. For just a couple of minutes, the infants hear dozens of two-syllable words strung together in a stream of monotone speech, unbroken by any pauses (for example, gikutukabugo...). The words are presented in a particular order that reveals a sound pattern. If babies recognize the pattern, says Saffran, they will use it to quickly identify word boundaries in what they hear next.

To test this, Saffran introduces her listeners to a new string of nonsense words in which only some of them fit the pattern heard earlier. Saffran records how long the infants listen to the parts that conform to the pattern and the parts that don't. A significant difference in times, she explains, means the infants did pick up the pattern.

As her recent studies show, infants do learn sound patterns, which then help them learn words and, ultimately, grammar. Their ability to do this, however, depends on age. By exposing infants who are 6-and-a-half and 9 months old to a string of made-up words in a certain order, Saffran learned that the two age groups use different strategies to determine where words end and begin.
While the younger listeners identified word boundaries by relying on the likelihood that certain sounds occur together, the older listeners paid attention to which speech sounds were emphasized, or stressed. Because 90 percent of two-syllable words in English follow the same stress pattern, says Saffran, infants can use the pattern to determine word boundaries.

"At different points in development, babies orient towards some cues and not others," says Saffran. Why? "More linguistic experience." Before infants can recognize that stressed and unstressed syllables are reliable indicators of word boundaries, explains Saffran, they must first know a few words - lessons they learn earlier by learning which sounds are likely to occur together. Findings from this study will be published in an upcoming issue of the journal Developmental Psychology.

Once infants go from syllables to words, they can then recognize simple grammars, according to Saffran's second study, now in press at the journal Infancy. At age one year - just three months after babies begin using stress cues - infants can recognize patterns in word orderings. After listening to a continuous string of words in a particular order, the infants were able to identify permissible word orderings. Just as in the other study, Saffran says that only after prior learning can infants acquire additional language abilities: "Until they learn words, the grammar is invisible."

While these two studies looked at babies' ability to acquire sound patterns common in natural languages, a recent third study by the Wisconsin psychologist investigated infants' ability to acquire patterns not often heard in everyday speech. The question Saffran wanted to answer, she says, was, "Does language work in a way that best fits the brain?" In other words: are certain sound patterns more common than others because they make it easier for infants to learn language? This study is in press at Developmental Psychology.
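The younger infants' cue - the likelihood that certain sounds occur together - can be sketched as transitional probabilities between syllables, which stay high inside a word and drop at word boundaries. The construction below is only a guess at the general setup (the article's nonsense words in randomized order), not Saffran's actual stimuli.

```python
# Transitional probability P(next syllable | current syllable): high
# inside a word, lower across a word boundary. Words are the article's
# nonsense examples; the randomized stream is an invented illustration.
import random
from collections import Counter

random.seed(0)
words = ["giku", "tuka", "bugo"]
tokens = [random.choice(words) for _ in range(300)]
# Split each two-syllable word and concatenate into one flat stream.
stream = [w[i:i + 2] for w in tokens for i in (0, 2)]

pairs = Counter(zip(stream, stream[1:]))
firsts = Counter(stream[:-1])

def trans_prob(s1, s2):
    return pairs[(s1, s2)] / firsts[s1]

print(trans_prob("gi", "ku"))   # word-internal: always 1.0
print(trans_prob("ku", "tu"))   # across a boundary: roughly 1/3
```

Segmenting at the dips in this statistic recovers the word boundaries, which is the computation the 6-and-a-half-month-olds appear to perform.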
Unlike the other studies, which exposed infants to generalizations in language patterns, such as the grouping of sounds, this study tested an infant's ability to recognize something more specific - that syllables begin with some sounds, such as /p/, /d/ and /k/, but not others, such as /b/, /t/ and /g/. This pattern, says Saffran, is uncommon in phonological systems, which tend to place restrictions on types of sound segments, not individual ones.

As Saffran found when she measured how long the infants listened to words that did and didn't conform to the rare pattern, there was no significant difference in the listening times. This finding, she says, suggests that the babies had difficulty acquiring the pattern. Their difficulty in identifying the unusual sound pattern in this third study, she says, is likely the result of removing information helpful to young listeners as they acquire language. "There are certain types of patterns that they're better at picking up," adds Saffran. "Perhaps human languages have these patterns to make language more learnable."

Asking questions about what an infant can't learn, she says, can be just as interesting and informative as asking what one can learn. In addition to providing knowledge about language deficits in some children, the answers could offer clues to how human language first developed and how it has evolved.

- Emily Carlson, 608-262-9772, emilycarlson at facstaff.wisc.edu
http://www.eurekalert.org/pub_releases/2003-02/uow-imo021103.php

News in Brain and Behavioural Sciences - Issue 86 - 8th February, 2003
http://human-nature.com/nibbs/issue86.html

Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.
From hubeyh at mail.montclair.edu Thu Feb 20 12:41:15 2003
From: hubeyh at mail.montclair.edu (H.M. Hubey)
Date: Thu, 20 Feb 2003 07:41:15 -0500
Subject: [language] [Fwd: [evol-psych] The gene that maketh man?]

-------- Original Message --------
Subject: [evol-psych] The gene that maketh man?
Date: Thu, 20 Feb 2003 11:51:29 +0000
From: Ian Pitchford
Reply-To: Ian Pitchford
Organization: http://human-nature.com
To: evolutionary-psychology at yahoogroups.com

BBC NEWS ONLINE, Tuesday, 18 February, 2003, 00:31 GMT

The gene that maketh man?

The gene is found only in human-like primates. US scientists have identified a gene which they say could explain why humans are unique. It seems to have arisen between 21 and 33 million years ago, when primates were becoming more human-like. The gene emerged about the time the path that led to humans, chimps, orangutans and gorillas was splitting off from that of Old and New World monkeys. The gene could have duplicated itself, creating many new genes specific to humans, according to researchers at Harvard University in Massachusetts.
Genetic clues

Science has long sought to explain why we are different from our closest animal cousins - the primates.

Full text: http://news.bbc.co.uk/1/hi/sci/tech/2772241.stm