[Corpora-List] Which Statistical Test is Suitable

fatima zuhra fateeshah at yahoo.com
Thu Jul 14 03:40:09 UTC 2011


Dear Muhammad Shakir Aziz,
Could you please provide an example or two of words that have two spellings? I have worked with Pashto text and have observed that a single Pashto word may be spelled in several (more than two) ways.
One of my projects involved extracting individual words from a written Pashto corpus. The system I used produced variants of the same word that look identical at first glance. For example, the grapheme "kaaf" may be written slightly longer than it appears in the Urdu spelling of "Shakir" in your name, which yields a distinct variant of the spelling. Are you considering variations of this kind, or some others?
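
The kind of look-alike variation described above can be sketched in Python. This is only an illustration under assumptions: the mapping below (ARABIC LETTER KAF U+0643 and ARABIC LETTER SWASH KAF U+06AA folded onto ARABIC LETTER KEHEH U+06A9) and the helper names `normalize_kaaf` and `group_variants` are hypothetical and are not the system used for the Pashto corpus.

```python
from collections import defaultdict

# Assumed, illustrative mapping: fold kaaf code points that look alike
# onto a single form (KEHEH, U+06A9). A real corpus may need other rules.
KAAF_VARIANTS = str.maketrans({
    "\u0643": "\u06A9",  # ARABIC LETTER KAF       -> ARABIC LETTER KEHEH
    "\u06AA": "\u06A9",  # ARABIC LETTER SWASH KAF -> ARABIC LETTER KEHEH
})

def normalize_kaaf(word):
    """Return the word with all kaaf variants replaced by one form."""
    return word.translate(KAAF_VARIANTS)

def group_variants(words):
    """Map each normalized word to the set of surface spellings seen."""
    groups = defaultdict(set)
    for word in words:
        groups[normalize_kaaf(word)].add(word)
    return dict(groups)

# "Shakir" typed with KEHEH (U+06A9) and with Arabic KAF (U+0643).
shakir_keheh = "\u0634\u0627\u06A9\u0631"
shakir_kaf = "\u0634\u0627\u0643\u0631"

groups = group_variants([shakir_keheh, shakir_kaf, shakir_keheh])
for normalized, spellings in groups.items():
    print(normalized, "has", len(spellings), "surface spelling(s)")
```

Grouping by a normalized form keeps the original spellings available, so variants that differ only in the kaaf code point can be counted separately from the other kinds of variation.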

Regards.
Fatima Tuz Zuhra
Ph.D. Scholar and Lecturer,
Department of Computer Science,
University of Peshawar, Pakistan.
--- On Sun, 7/10/11, True Friend <true.friend2004 at gmail.com> wrote:

From: True Friend <true.friend2004 at gmail.com>
Subject: [Corpora-List] Which Statistical Test is Suitable
To: "corpora" <corpora at uib.no>, corpora at lists.uib.no
Date: Sunday, July 10, 2011, 8:23 PM

Dear Members
I am working on a research paper on spelling variation. In my language, Urdu, some words have two spellings. For example, the data might look like this:

 
 
  Word    Spelling 1    Spelling 2
  X       24            40
  Y       600           200
  Z       300           1000
 
What I want to show is that alternate spellings do exist for this group of words and that they are not just spelling errors. Can I use a correlation formula to show that the two spellings are related?
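
To make the question concrete, the example counts can be put into a short Python sketch. It is not part of the original message and does not settle which test is suitable; it only computes each word's share of spelling 2 and the Pearson coefficient between the two columns. The names `counts`, `spelling2_share` and `pearson` are hypothetical, and the numbers are the illustrative ones from the table.

```python
from math import sqrt

# Counts copied from the example table: word -> (spelling 1, spelling 2).
counts = {"X": (24, 40), "Y": (600, 200), "Z": (300, 1000)}

def spelling2_share(s1, s2):
    """Fraction of a word's occurrences that use spelling 2."""
    return s2 / (s1 + s2)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

shares = {word: spelling2_share(*pair) for word, pair in counts.items()}
r = pearson([pair[0] for pair in counts.values()],
            [pair[1] for pair in counts.values()])

for word, share in shares.items():
    print(f"{word}: {share:.1%} of occurrences use spelling 2")
print(f"Pearson r between the two columns: {r:.3f}")
```

With only three illustrative rows the coefficient says very little; the sketch only shows what the computation would involve on the real word list.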
Waiting for your suggestions.

Regards
-- 
Muhammad Shakir Aziz محمد شاکر عزیز

Masters in Applied Linguistics
Translator, Course Developer, Linguist for Urdu, Punjabi and English

Urdu:- http://awaz-e-dost.blogspot.com/

English:- http://linguisticslearner.blogspot.com/

Facebook:- http://www.facebook.com/truefriend2004

Skype:- true_friend2004



_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora