<html><body><div style="color:#000; background-color:#fff; font-family:times new roman, new york, times, serif;font-size:12pt"><div>Hi,<br></div><div><br></div><div>To answer the request in topic 2, "Looking for Igbo, Hausa, and Yoruba Corpora" (Fink, Clayton R.):</div><div><br></div><div>There is a Yoruba lexical corpus available from the LDC at the following link:<br><span></span></div><div><br><span></span></div><div><span>http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2008L03</span></div><div><br></div><div><br></div><div>You can also check the LDC catalog for more lexical resources for other African languages.</div><div><br></div><div>Hope this helps.</div><div><br></div><div>Wajdi Zaghouani</div><div>PhD Candidate<br></div><div>University of Quebec at Montreal,</div><div>Linguistics Department<br></div><div><br></div><div><br></div><div><br></div> <div style="font-family: times new roman, new york, times, serif;
font-size: 12pt;"> <div style="font-family: times new roman, new york, times, serif; font-size: 12pt;"> <div dir="ltr"> <font face="Arial" size="2"> <hr size="1"> <b><span style="font-weight:bold;">From:</span></b> "corpora-request@uib.no" &lt;corpora-request@uib.no&gt;<br> <b><span style="font-weight: bold;">To:</span></b> corpora@uib.no <br> <b><span style="font-weight: bold;">Sent:</span></b> Sunday, February 26, 2012 6:00:01 AM<br> <b><span style="font-weight: bold;">Subject:</span></b> Corpora Digest, Vol 56, Issue 34<br> </font> </div> <br>Today's Topics:<br><br> 1. Job opening: Post-Doc in English Corpus and/or Computational<br> Linguistics, TU Darmstadt, Germany (Stefan Evert)<br> 2. Looking for Igbo, Hausa, and Yoruba Corpora (Fink, Clayton R.)<br> 3. Re: Looking for Igbo, Hausa, and Yoruba Corpora (Jimmy
O'Regan)<br><br><br>----------------------------------------------------------------------<br><br>Message: 1<br>Date: Sat, 25 Feb 2012 15:53:19 +0100<br>From: Stefan Evert <<a ymailto="mailto:stefanML@collocations.de" href="mailto:stefanML@collocations.de">stefanML@collocations.de</a>><br>Subject: [Corpora-List] Job opening: Post-Doc in English Corpus and/or<br> Computational Linguistics, TU Darmstadt, Germany<br>To: Corpora Mailing List <<a ymailto="mailto:corpora@uib.no" href="mailto:corpora@uib.no">corpora@uib.no</a>><br><br>The English Computational Corpus Linguistics group at Technische Universität<br>Darmstadt is seeking to hire a post-doctoral research assistant. The person<br>we're looking for has a background in English linguistics, experience with<br>corpus-based approaches and/or natural language processing, and is interested<br>in carrying out quantitative corpus studies with state-of-the-art methods
and<br>tools.<br><br>We offer a tight-knit and cooperative research group, highly motivated students<br>and a vibrant work environment. The main research interests of our group are:<br> - methodological foundations of corpus linguistics<br> - collocations<br> - distributional lexical semantics<br> - register studies and linguistic variation<br> - digital humanities <br><br>Further information and details on the application procedure can be found in<br>the full job announcement below. For informal enquiries, please contact<br>Prof. Dr. Stefan Evert <<a ymailto="mailto:evert@linglit.tu-darmstadt.de" href="mailto:evert@linglit.tu-darmstadt.de">evert@linglit.tu-darmstadt.de</a>>.<br><br>The deadline for applications is Friday, 9 March 2012.<br><br>------------------------------------------------------------------------------<br><br>English: <a href="http://www.intern.tu-darmstadt.de/dez_vii/stellen/stellen_details_62912.en.jsp"
target="_blank">http://www.intern.tu-darmstadt.de/dez_vii/stellen/stellen_details_62912.en.jsp</a><br>German: <a href="http://www.intern.tu-darmstadt.de/dez_vii/stellen/stellen_details_62912.de.jsp" target="_blank">http://www.intern.tu-darmstadt.de/dez_vii/stellen/stellen_details_62912.de.jsp</a><br><br>------------------------------------------------------------------------------<br><br>The Institute of Linguistics and Literary Studies at the Faculty 02 Social and<br>Historical Sciences at Technische Universität Darmstadt invites applications<br>for a vacant position of a<br><br> Research Assistant (Post-Doc) in English Linguistics<br> (Code No. 74)<br><br>The position is initially for three years with a potential extension subject<br>to performance and funding.<br><br>The prospective postholder should have research interests in two or more of<br>the following areas:<br><br> - linguistic models of the English language<br> -
collocations, register studies, etc.<br> - corpus and computational linguistics<br> - statistical approaches in linguistics<br><br>The prospective postholder is expected to contribute to research and teaching<br>in English linguistics. Courses taught must contribute to the teaching<br>portfolio in English linguistics at undergraduate and postgraduate level<br>(Joint Bachelor of Arts Anglistik, Master of Education Englisch, Master of<br>Arts Linguistic and Literary Computing). All courses are taught in<br>English. The postholder is furthermore expected to collaborate closely with<br>the team in English linguistics, assisting in research projects and the<br>writing of research proposals as well as taking over an amount of the<br>administrative duties such as monitoring student progress and general academic<br>management.<br><br>Candidates are expected to pursue independent research towards a further<br>qualification at post-doctoral level (Habilitation or
equivalent such as<br>second book) as part of the fulfillment of their professional duties.<br><br>Candidates wishing to apply should fit the following profile:<br><br> - completed course of studies in English linguistics or teacher-training<br> degree in English<br> - PhD in English linguistics, corpus linguistics or computational linguistics<br> - experience in corpus and/or computational linguistic approaches in<br> linguistics<br> - excellent command of the English language (native or native-like written<br> and spoken English)<br> - some teaching experience in English linguistics<br><br>The Technische Universität Darmstadt intends to increase the number of female<br>faculty members and encourages female candidates to apply. In case of equal<br>qualifications applicants with a degree of disability of at least 50 or equal<br>will be given preference. Wages and salaries are according to the collective<br>agreements on salary
scales, which apply to the Technische Universität<br>Darmstadt (TV-TU Darmstadt). Part-time employment is generally possible.<br><br>Informal inquiries may be addressed to: <br><br> Prof. Dr. Stefan Evert, Technische Universität Darmstadt<br> Institut für Sprach- und Literaturwissenschaft, Hochschulstr. 1, 64289 Darmstadt<br> E-Mail: <a ymailto="mailto:evert@linglit.tu-darmstadt.de" href="mailto:evert@linglit.tu-darmstadt.de">evert@linglit.tu-darmstadt.de</a><br><br>Applications should quote the post's Identification Number and include a CV, a<br>list of publications, copies of relevant diplomas, and a record of teaching<br>and research activities. They should be sent to:<br><br> The Dean of the Faculty of History and Social Science<br> Prof. Dr. Michèle Knodt<br> Residenzschloss<br> 64293 Darmstadt<br> Germany<br><br>Applicants are asked to additionally send an electronic copy of their<br>application
to the following e-mail address: <a ymailto="mailto:sprachli@linglit.tu-darmstadt.de" href="mailto:sprachli@linglit.tu-darmstadt.de">sprachli@linglit.tu-darmstadt.de</a><br><br>Please note that applications will not be returned after the completion of the<br>recruitment process; applicants are therefore discouraged from submitting<br>originals of certificates as well as applications in folders.<br><br>Application deadline: 9 March 2012<br><br>------------------------------------------------------------------------------<br><br><br><br><br>------------------------------<br><br>Message: 2<br>Date: Sat, 25 Feb 2012 14:31:37 -0500<br>From: "Fink, Clayton R." <<a ymailto="mailto:finkcr1@jhuapl.edu" href="mailto:finkcr1@jhuapl.edu">finkcr1@jhuapl.edu</a>><br>Subject: [Corpora-List] Looking for Igbo, Hausa, and Yoruba Corpora<br>To: "<a ymailto="mailto:corpora@hd.uib.no" href="mailto:corpora@hd.uib.no">corpora@hd.uib.no</a>" <<a
ymailto="mailto:corpora@hd.uib.no" href="mailto:corpora@hd.uib.no">corpora@hd.uib.no</a>><br><br>There's a BBC Hausa service and a Yoruba-language Wikipedia, so there <br>are some possibilities for those languages. Igbo seems to be a real <br>problem, though, in terms of finding text corpora.<br><br>I'm interested, mostly, in training up language id models that I can use <br>on names. I have some small corpora of first names and surnames scraped <br>off of the Web, but it might be interesting to have some larger corpora <br>to work from.<br><br>Thanks,<br><br>Clay<br><br>-- <br>Clay Fink<br>Senior Software Engineer<br>The Johns Hopkins University Applied Physics Laboratory<br><br>240-228-4220<br><br><br><br><br>------------------------------<br><br>Message: 3<br>Date: Sat, 25 Feb 2012 20:23:06 +0000<br>From: "Jimmy O'Regan" <<a ymailto="mailto:joregan@gmail.com" href="mailto:joregan@gmail.com">joregan@gmail.com</a>><br>Subject: Re:
[Corpora-List] Looking for Igbo, Hausa, and Yoruba<br> Corpora<br>To: "Fink, Clayton R." <<a ymailto="mailto:finkcr1@jhuapl.edu" href="mailto:finkcr1@jhuapl.edu">finkcr1@jhuapl.edu</a>><br>Cc: "<a ymailto="mailto:corpora@hd.uib.no" href="mailto:corpora@hd.uib.no">corpora@hd.uib.no</a>" <<a ymailto="mailto:corpora@hd.uib.no" href="mailto:corpora@hd.uib.no">corpora@hd.uib.no</a>><br><br>On 25 February 2012 19:31, Fink, Clayton R. <<a ymailto="mailto:finkcr1@jhuapl.edu" href="mailto:finkcr1@jhuapl.edu">finkcr1@jhuapl.edu</a>> wrote:<br>> There's a BBC Hausa service and a Yoruba-language Wikipedia, so there are<br>> some possibilities for those languages. Igbo seems to be a real problem,<br>> though, in terms of finding text corpora.<br>><br><br>There's an Igbo Wikipedia: <a href="http://ig.wikipedia.org/wiki/Ih%C3%BC_Mbu" target="_blank">http://ig.wikipedia.org/wiki/Ih%C3%BC_Mbu</a><br><br>> I'm
interested, mostly, in training up language id models that I can use on<br>> names. I have some small corpora of first names and surnames scraped off of<br>> the Web, but it might be interesting to have some larger corpora to work<br>> from.<br><br>Kevin Scannell's language id model set<br>(<a href="http://nltk.googlecode.com/svn/trunk/nltk_data/packages/corpora/langid.zip" target="_blank">http://nltk.googlecode.com/svn/trunk/nltk_data/packages/corpora/langid.zip</a>)<br>includes a trigram model for Igbo.<br><br><br>-- <br>&lt;Sefam&gt; Are any of the mentors around?<br>&lt;jimregan&gt; yes, they're the ones trolling you<br><br><br><br>----------------------------------------------------------------------<br>Send Corpora mailing list submissions to<br> <a ymailto="mailto:corpora@uib.no" href="mailto:corpora@uib.no">corpora@uib.no</a><br><br>To subscribe or unsubscribe via the World Wide Web, visit<br> <a
href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>or, via email, send a message with subject or body 'help' to<br> <a ymailto="mailto:corpora-request@uib.no" href="mailto:corpora-request@uib.no">corpora-request@uib.no</a><br><br>You can reach the person managing the list at<br> <a ymailto="mailto:corpora-owner@uib.no" href="mailto:corpora-owner@uib.no">corpora-owner@uib.no</a><br><br>When replying, please edit your Subject line so it is more specific<br>than "Re: Contents of Corpora digest..."<br><br><br>_______________________________________________<br>Corpora mailing list<br><a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a><br><a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br><br><br>End of Corpora Digest, Vol 56, Issue
34<br>***************************************<br><br><br> </div> </div> </div></body></html>