24.3117, Qs: Issues in creating a speech corpus
linguist at linguistlist.org
Wed Jul 31 16:16:45 UTC 2013
LINGUIST List: Vol-24-3117. Wed Jul 31 2013. ISSN: 1069 - 4875.
Subject: 24.3117, Qs: Issues in creating a speech corpus
Moderator: Damir Cavar, Eastern Michigan U <damir at linguistlist.org>
Reviews: Veronika Drake, U of Wisconsin Madison
Monica Macaulay, U of Wisconsin Madison
Rajiv Rao, U of Wisconsin Madison
Joseph Salmons, U of Wisconsin Madison
Mateja Schuck, U of Wisconsin Madison
Anja Wanner, U of Wisconsin Madison
<reviews at linguistlist.org>
Homepage: http://linguistlist.org
Editor for this issue: Alex Isotalo <alx at linguistlist.org>
================================================================
Date: Wed, 31 Jul 2013 12:16:30
From: Pankaj Dwivedi [pankaj.linguistics at gmail.com]
Subject: Issues in creating a speech corpus
Hello all,
I am working on a lesser-known dialect of Hindi. I have around 15 hours of speech data for it, recorded with a professional recorder (an Olympus LS-100). The data mainly consist of free discourse on a variety of topics: stories, daily routines, recipes, personal experiences, common words in isolation, etc. I have also created text files/TextGrids for the audio files using Praat. I am wondering whether I can create a small speech corpus out of this material. If so, how, and what should my next step be? I would also like to build a TTS system for the dialect. Is that possible? Please explain it to me step by step.
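To make the question more concrete: one first step I can imagine is an index that pairs each recording with its TextGrid and notes its duration. The following is only a minimal sketch under assumptions, not an established procedure. It assumes the recordings are uncompressed PCM WAV files, that each TextGrid shares its recording's base name, and that everything sits under one folder. The function names (build_manifest, write_manifest) and the CSV column names are made up for illustration.

```python
import csv
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def build_manifest(corpus_dir):
    """Pair each .wav file with a same-named .TextGrid file, if one exists.

    Assumption: 'story01.wav' is annotated in 'story01.TextGrid' in the
    same folder. Recordings without a TextGrid get an empty entry so the
    missing annotations are easy to spot.
    """
    rows = []
    for wav_path in sorted(Path(corpus_dir).rglob("*.wav")):
        grid_path = wav_path.with_suffix(".TextGrid")
        rows.append({
            "audio": wav_path.name,
            "textgrid": grid_path.name if grid_path.exists() else "",
            "duration_s": round(wav_duration_seconds(wav_path), 3),
        })
    return rows


def write_manifest(rows, out_path):
    """Write the manifest rows to a CSV file (hypothetical column names)."""
    fieldnames = ["audio", "textgrid", "duration_s"]
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Calling build_manifest on the recordings folder and passing the result to write_manifest would give a single table of what is recorded, what is annotated, and how much audio there is in total. Whether such a table is a sensible starting point for a corpus or for TTS work is part of what I am asking.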
Your help will be duly acknowledged in any resulting research publications, including co-authorship where appropriate.
Thank you!
Linguistic Field(s): Computational Linguistics
Text/Corpus Linguistics