22.4036, Qs: Index of synthesis data

linguist at LINGUISTLIST.ORG linguist at LINGUISTLIST.ORG
Sat Oct 15 17:05:31 UTC 2011


LINGUIST List: Vol-22-4036. Sat Oct 15 2011. ISSN: 1069 - 4875.

Subject: 22.4036, Qs: Index of synthesis data

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews: Veronika Drake, U of Wisconsin-Madison
         Monica Macaulay, U of Wisconsin-Madison
         Rajiv Rao, U of Wisconsin-Madison
         Joseph Salmons, U of Wisconsin-Madison
         Anja Wanner, U of Wisconsin-Madison
         <reviews at linguistlist.org>

Homepage: http://linguistlist.org

The LINGUIST List is funded by Eastern Michigan University,
and donations from subscribers and publishers.

Editor for this issue: Zac Smith <zac at linguistlist.org>
================================================================  

We'd like to remind readers that the responses to queries are usually
best posted to the individual asking the question. That individual is
then strongly encouraged to post a summary to the list. This policy was
instituted to help control the huge volume of mail on LINGUIST; so we
would appreciate your cooperating with it whenever it seems appropriate.

In addition to posting a summary, we'd like to remind people that it
is usually a good idea to personally thank those individuals who have
taken the trouble to respond to the query.

To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.cfm.

===========================Directory==============================  

1)
Date: 13-Oct-2011
From: Hugo Cesar de Castro Carneiro [hcesarcastro at gmail.com]
Subject: Index of synthesis data


-------------------------Message 1 ---------------------------------- 
Date: Sat, 15 Oct 2011 13:05:18
From: Hugo Cesar de Castro Carneiro [hcesarcastro at gmail.com]
Subject: Index of synthesis data

My M.Sc. thesis is titled "The function of the index of synthesis of the
languages in part-of-speech tagging with weightless artificial neural
networks".

The thesis is motivated by the contrast between English "like" and
Portuguese "gostam" ("they like"). The part of speech of "like" is
ambiguous: it can be a preposition, a conjunction, a verb, or some other
part of speech, so a reader needs an adjacent word such as "they" to
recognize it as a verb in that context. "Gostam", on the other hand, is
always a verb, because the "-am" suffix marks it as one.
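
The contrast can be sketched with a toy lexicon that maps each word form to
its possible part-of-speech tags. This is only an illustration: the tag names
and the two entries are assumptions taken from the examples in this message,
not output of the system described in the thesis.

```python
# Toy lexicon: word form -> set of possible part-of-speech tags.
# Tag names are hypothetical; the entries mirror only the two examples
# in the message, and the set for "like" is not meant to be exhaustive.
LEXICON = {
    "like": {"PREP", "CONJ", "VERB"},
    "gostam": {"VERB"},
}


def is_ambiguous(word: str) -> bool:
    """Return True if the word form has more than one possible tag."""
    return len(LEXICON[word.lower()]) > 1


for form in ("like", "gostam"):
    print(form, "ambiguous" if is_ambiguous(form) else "unambiguous")
```

A tagger has to look at the context of "like" to choose among its tags, while
the form "gostam" settles the question by itself.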

I am therefore testing a system I have developed on five languages:
Mandarin Chinese, English, Portuguese, German and Turkish (ordered from the
most isolating to the most synthetic). Once I have the information I need
from these five, I will test the system on four others: Thai (more
synthetic than Mandarin Chinese and more isolating than English), Japanese
(more synthetic than English and more isolating than Portuguese), Italian
(more synthetic than Portuguese and more isolating than German) and Russian
(more synthetic than German and more isolating than Turkish).

But I have one problem: the indices of synthesis of these languages are
only my own estimates, and even their order may be wrong (is Portuguese or
German the more synthetic?).

Can anyone help me find an index of synthesis for each of these languages?
Alternatively, where can I get a text in each of these languages in which
every word is segmented into its morphemes?
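
For reference, the index of synthesis is usually computed as the number of
morphemes divided by the number of words in a sample, so a segmented text is
enough to obtain it. The sketch below assumes a hypothetical input format in
which words are separated by whitespace and morphemes within a word by
hyphens; the function name and the sample sentence are illustrative only.

```python
def index_of_synthesis(segmented_text: str, morpheme_sep: str = "-") -> float:
    """Return morphemes per word for a morpheme-segmented text.

    Assumes words are whitespace-separated and morphemes inside a word
    are joined by `morpheme_sep` (an assumed convention, e.g. "gost-am").
    """
    words = segmented_text.split()
    if not words:
        raise ValueError("the text contains no words")
    # Count non-empty pieces so a stray separator does not add a morpheme.
    morpheme_count = sum(
        len([m for m in word.split(morpheme_sep) if m]) for word in words
    )
    return morpheme_count / len(words)


# Illustrative sample: 4 words, 6 morphemes.
print(index_of_synthesis("they like-d the dog-s"))
```

Values close to 1 indicate a more isolating sample and larger values a more
synthetic one, so computing the figure on comparable texts would give an
ordering of the languages that does not depend on estimates.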

I am finishing my master's studies this year, but I need to submit a paper
to a journal before I receive my M.Sc. degree in Computer Science.

Linguistic Field(s): Morphology
                     Syntax




