[Corpora-List] help with n-grams

Marc FRYD marc.fryd at univ-poitiers.fr
Sun Oct 26 08:19:26 UTC 2008


Hi all,
I wonder if anyone could help a linguist with moderate programming 
abilities with the following task.
I am currently working on a corpus of isolated words in which each 
word is aligned grapheme-to-phoneme.
I would like to run an N-gram analysis on both levels of data (the 
graphemic and the phonemic) with a view to extracting contextual trends in 
realisation (i.e. this grapheme is realised as that phoneme at a rate of 
x when preceded/followed by such and such graphemes). The 
db currently holds c. 3,000 words, but it will keep growing.
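
To make the request concrete, here is a rough sketch in Python of the kind of count I have in mind. Everything in it is an assumption for illustration only: the sample words, the data layout (each word as a list of aligned (grapheme, phoneme) pairs), the "#" word-boundary marker and the function names context_counts and rates are all hypothetical, not my actual db format.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: each word is a list of aligned
# (grapheme, phoneme) pairs. The real db layout may differ.
ALIGNED = [
    [("c", "k"), ("a", "æ"), ("t", "t")],
    [("c", "s"), ("i", "ɪ"), ("t", "t"), ("y", "i")],
    [("c", "k"), ("o", "ɒ"), ("t", "t")],
    [("c", "s"), ("e", "ɛ"), ("ll", "l")],
]


def context_counts(words, left=1, right=1):
    """Count phoneme realisations per (left context, grapheme, right context).

    left/right give the number of neighbouring graphemes to include;
    "#" pads the word boundaries (an assumed convention).
    """
    counts = defaultdict(Counter)
    for word in words:
        graphemes = ["#"] * left + [g for g, _ in word] + ["#"] * right
        for i, (g, p) in enumerate(word):
            j = i + left  # position of g in the padded list
            lctx = tuple(graphemes[j - left:j])
            rctx = tuple(graphemes[j + 1:j + 1 + right])
            counts[(lctx, g, rctx)][p] += 1
    return counts


def rates(counts):
    """Turn raw counts into relative frequencies per context."""
    result = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        result[key] = {p: n / total for p, n in counter.items()}
    return result


if __name__ == "__main__":
    # Realisation of each grapheme given the single following grapheme.
    for (lctx, g, rctx), dist in sorted(rates(context_counts(ALIGNED, 0, 1)).items()):
        print(lctx, g, rctx, dist)
```

Widening left and right would give longer N-gram contexts, and the same counting could presumably be run on the phonemic level by swapping the roles of the two elements in each pair.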
Cheers,
Marc



-- 
Dr. Marc FRYD
Senior Lecturer in English Linguistics

Faculté des Lettres et des Langues
Université de Poitiers
95 avenue du Recteur Pineau
86022, Poitiers, France

Office: 05 49 45 48 11
Cell: 06 76 28 18 50




_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


