<!doctype html public "-//W3C//DTD W3 HTML//EN">
<html><head><style type="text/css"><!--
blockquote, dl, ul, ol, li { padding-top: 0 ; padding-bottom: 0 }
--></style><title>Re: [Lexicog] Potential words</title></head><body>
<div>Richard,</div>
<div>A
version of this method is used extensively by the PDLMA (the Project
for the Documentation of the Languages of Meso-America/El Proyecto
para la Documentación de las Lenguas de Mesoamérica). Based on
known phonotactics of related languages and preliminary work
establishing sound correspondences in the dialect under investigation,
a list of all possible roots is created (usually about 4000-5000
forms) and the fieldworkers slog their way through. The hit rate is
pretty low -- maybe 20-30% -- and some existing roots are rejected in
this process because they only occur in combinations and aren't
readily recognizable in isolation, but, in general, this tool achieves
great success, as shown by the fact that it regularly ferrets out
heretofore unknown roots in language families that are fairly well
studied.</div>
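<div><br></div>
<div>For anyone who wants to try this, here is a minimal sketch of how such a candidate list might be generated. It is not the PDLMA's actual tool: the phoneme inventory, the single CVC template, and the banned onset/coda pairs are all hypothetical placeholders, and it assumes each phoneme is written with one letter.</div>
<pre>
```python
from itertools import product

# Hypothetical inventory for illustration only; not taken from any real language.
ONSETS = ["p", "t", "k", "m", "n", "s"]
VOWELS = ["a", "i", "u"]
CODAS = ["p", "t", "k", "m", "n", "s"]

# Hypothetical phonotactic restriction: (onset, coda) pairs that never co-occur.
BANNED_ONSET_CODA = {("m", "m"), ("n", "n")}


def candidate_roots(onsets, vowels, codas, banned=frozenset()):
    """Return every CVC string the inventory allows, in a stable order."""
    roots = []
    for onset, vowel, coda in product(onsets, vowels, codas):
        if (onset, coda) in banned:
            continue  # skip shapes the assumed phonotactics rule out
        roots.append(onset + vowel + coda)
    return roots


roots = candidate_roots(ONSETS, VOWELS, CODAS, BANNED_ONSET_CODA)
print(len(roots), roots[:5])
```
</pre>
<div>In practice the inventory and restrictions would come from the sound correspondences already worked out, and the resulting list would be printed for speakers to go through.</div>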
<div><br></div>
<div>Rich Rhodes</div>
<div><br></div>
<div><br></div>
<div><br></div>
<div><br></div>
<blockquote type="cite" cite><font size="-1">Hi Richard,</font><br>
<br>
<font size="-1">I used this method with a Mon-Khmer language many
years ago just to collect words, but I also checked it against my CVC
distribution chart and found certain combinations that were avoided
for unknown reasons--maybe they didn't sound good or carried bad
connotations--and certain areas for special use. The one that I
remember is that children's given names shared an area on the
distribution chart that was not used for anything else. This was
particularly interesting because there is a sanction against using the
same name as anyone else known to be living or dead. I guess that a
population explosion could force them into using other areas for
making names.</font><br>
<br>
<font size="-1">Dick</font><br>
<br>
</blockquote>
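<div>The check against a CVC distribution chart that Dick describes could be sketched roughly as follows. The attested words are invented for illustration, the chart here crosses onset with coda and ignores the vowel (his actual chart layout is not stated), and one letter per phoneme is assumed.</div>
<pre>
```python
from collections import Counter

# Hypothetical attested CVC roots; invented for illustration.
ATTESTED = ["pat", "pak", "tap", "tak", "kat", "kap", "pit", "tik"]


def cvc_chart(words):
    """Count attested roots per (onset, coda) cell, ignoring the vowel."""
    return Counter((w[0], w[-1]) for w in words if len(w) == 3)


def empty_cells(chart, onsets, codas):
    """List the (onset, coda) cells that have no attested root."""
    return [(o, c) for o in onsets for c in codas if chart[(o, c)] == 0]


chart = cvc_chart(ATTESTED)
gaps = empty_cells(chart, "ptk", "ptk")
print(gaps)
```
</pre>
<div>Empty cells are only a starting point: they may be accidental gaps, avoided combinations, or areas reserved for a special use such as names.</div>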
<blockquote type="cite" cite>Hi Richard,<br>
<br>
I did this kind of thing in the early days of our language project in
PNG. I had worked out the consonant and vowel phonemes for Amele and
that word roots could be one, two or three syllables. I then had
someone at Ukarumpa High School write a program (this was in 1978) to
generate possible word roots in Amele based on the phoneme inventory
and the syllable patterns. The lists for the one and two syllable
roots weren't too long but the list for the three syllable roots was
enormous! I then distributed these lists to various Amele people for
them to try and identify actual word roots. They thought this was
great fun. But the hard work was then confirming that the roots
indicated were actual words and what their meaning and usage were.<br>
<br>
Another SIL member I know (who I met again just recently) used the
same method for their language project. He said when he gave the lists
out to people one of the men asked, "Do you want all the dirty
words too?"<br>
<br>
But as I recall, this method of "generating" words in a
language was somewhat frowned upon by the linguistic establishment in
SIL-PNG in those days. But I found I got a lot of words that I might
not have got hold of otherwise - such as taboo words.<br>
<br>
I believe there is software available in SIL now that can do this kind
of thing for you. You don't have to ask a high schooler to do it for
you. Oh, and the people you are working with need to be literate in
their own language.<br>
<br>
John Roberts<br>
<br>
<br>
<br>
Richard Gravina wrote:<br>
<font face="Arial" size="-1">I'm interested in knowing more about the
method of data collection based on 'potential' words. This is where
you create lists of artificial words by randomly combining letters,
and then go through the lists with native speakers to see if the words
actually exist in the language.</font><br>
</blockquote>
<blockquote type="cite" cite> <br>
<font face="Arial" size="-1">Does anyone have any experience of using
this? Do you know of any resources or software that would
help?</font><br>
<br>
<font face="Arial" size="-1">Richard Gravina</font><br>
</blockquote>
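<div>As a rough illustration of why the three-syllable list John Roberts mentions grows so quickly, the sketch below counts candidate roots by syllable count. The figures of 10 consonants, 5 vowels, and the syllable shapes V and CV are assumptions chosen for the example, not Amele's actual inventory.</div>
<pre>
```python
def syllable_count(n_consonants, n_vowels, shapes):
    """Number of distinct syllables for the given shapes, e.g. 'V' or 'CV'."""
    total = 0
    for shape in shapes:
        size = 1
        for slot in shape:
            size *= n_consonants if slot == "C" else n_vowels
        total += size
    return total


def root_counts(n_consonants, n_vowels, shapes, max_syllables=3):
    """Candidate roots of 1..max_syllables syllables, with no restrictions."""
    per_syllable = syllable_count(n_consonants, n_vowels, shapes)
    return [per_syllable ** n for n in range(1, max_syllables + 1)]


# Assumed inventory: 10 consonants, 5 vowels, syllable shapes V and CV.
counts = root_counts(10, 5, ["V", "CV"])
print(counts)  # [55, 3025, 166375]
```
</pre>
<div>Each extra syllable multiplies the list by the number of possible syllables, so any phonotactic restriction that can be applied before printing is worth applying.</div>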
<div><br></div>
<div><br></div>
<x-sigsep><pre>--
</pre></x-sigsep>
<div
>******************************************************************<br
>
Richard A. Rhodes<br>
Department of Linguistics<br>
University of California<br>
Berkeley, CA 94720-2650<br>
Voice (510) 643-7325<br>
FAX (510) 643-5688<br>
<br>
***************************************************************<span
></span>***</div>
</body>
</html>