<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p><font face="Calibri">Dear colleagues,</font></p>
<p><font face="Calibri">We are glad to inform you that the
previously announced INEL corpora of the Selkup and Kamas languages
are now accessible through a new search interface. <br>
The starting pages for both corpora are as follows:<br>
</font>Selkup:
<a class="moz-txt-link-freetext" href="https://inel.corpora.uni-hamburg.de/SelkupCorpus/search">https://inel.corpora.uni-hamburg.de/SelkupCorpus/search</a><br>
Kamas: <a class="moz-txt-link-freetext" href="https://inel.corpora.uni-hamburg.de/KamasCorpus/search">https://inel.corpora.uni-hamburg.de/KamasCorpus/search</a><br>
<br>
<font face="Calibri">The search is based on the Tsakonian Corpus
Platform (Tsakorpus, <a class="moz-txt-link-freetext" href="https://bitbucket.org/tsakorpus/">https://bitbucket.org/tsakorpus/</a>). It
allows for searching based on transcription ("Word" field, also
"Lemma" for the default root shape), translation, and/or
grammatical glosses or categories, including negative queries
and multi-word queries (with or without distance constraints).
Change the "View" from "standard" to "glossed" in the options to
get the glossed output.<br>
Grammatical tags are unordered; they are generated from glosses
by rules. It is usually more efficient to search for grammatical
tags than for specific glosses, unless you're sure of the exact
gloss(es) and their relative order in the word.<br>
Pop-up windows in both "Grammar" and "Gloss" fields let you pick
and choose specific grammatical tags/glosses from the
corpus-specific list.</font><br>
</p>
<div class="moz-forward-container"><font face="Calibri">We hope you
will enjoy using this search interface. <br>
While help pages will be gradually updated, please send your
comments and suggestions to: <a
class="moz-txt-link-abbreviated"
href="mailto:inel@uni-hamburg.de">inel@uni-hamburg.de</a>.</font></div>
<div class="moz-forward-container"><font face="Calibri"><br>
</font></div>
<div class="moz-forward-container"><font face="Calibri"> </font><font
face="Calibri">Best regards,<br>
Alexandre Arkhipov</font></div>
<div class="moz-forward-container"><font face="Calibri"></font><br>
<br>
-------- Forwarded Message --------
<table class="moz-email-headers-table" cellspacing="0"
cellpadding="0" border="0">
<tbody>
<tr>
<th valign="BASELINE" nowrap="nowrap" align="RIGHT">Subject: </th>
<td>[Lingtyp] FYI: INEL Selkup and Kamas corpora</td>
</tr>
<tr>
<th valign="BASELINE" nowrap="nowrap" align="RIGHT">Date: </th>
<td>Fri, 25 Jan 2019 18:31:15 +0100</td>
</tr>
<tr>
<th valign="BASELINE" nowrap="nowrap" align="RIGHT">From: </th>
<td>Alexandre Arkhipov <a class="moz-txt-link-rfc2396E" href="mailto:sarkipo@yandex.ru">&lt;sarkipo@yandex.ru&gt;</a></td>
</tr>
<tr>
<th valign="BASELINE" nowrap="nowrap" align="RIGHT">To: </th>
<td><a class="moz-txt-link-abbreviated" href="mailto:lingtyp@listserv.linguistlist.org">lingtyp@listserv.linguistlist.org</a></td>
</tr>
</tbody>
</table>
<br>
<br>
<p><font face="Calibri">Dear colleagues,</font></p>
<p><font face="Calibri">The first versions of two digital corpora
developed as part of the INEL project (<a
class="moz-txt-link-freetext"
href="https://inel.corpora.uni-hamburg.de/?page_id=920"
moz-do-not-send="true">https://inel.corpora.uni-hamburg.de/?page_id=920</a>),
Selkup and Kamas, have been published online.<br>
<br>
Texts are provided with interlinear glossing (with lexical
glosses in English and Russian) and translations into English,
Russian and German. Some texts also have (partial) annotations
for syntactic functions, semantic roles and information
status, lexical borrowings and code-switching.<br>
<br>
The corpora are published in open access under the Creative
Commons Attribution-NonCommercial-ShareAlike 4.0 International
Public License (CC BY-NC-SA 4.0). See below for details on
using the corpora.<br>
<br>
The corpora are primarily intended for typologically aware
corpus-based grammatical research but may also be of interest
to linguists of other branches as well as to specialists in
folklore, anthropology and history.<br>
<br>
<br>
1. INEL Selkup Corpus (v0.1)<br>
<a class="moz-txt-link-freetext"
href="http://hdl.handle.net/11022/0000-0007-CAE5-3"
moz-do-not-send="true">http://hdl.handle.net/11022/0000-0007-CAE5-3</a><br>
<br>
Selkup is an endangered Samoyedic language (Uralic family),
which used to be spoken in many small settlements dispersed
over a large territory in Western Siberia.<br>
The INEL Selkup corpus is composed of texts from the archive
of Angelina Ivanovna Kuzmina (1924–2002), who gathered a large
amount of material on Selkup in almost all regions where the
Selkup people lived in 1962–1977. Most texts in the corpus
originate from the handwritten part of the archive that she
transferred to Hamburg in 2001; the others come from her sound
recordings digitized in 2001, which have been transcribed and
translated within the INEL project.<br>
The present version of the corpus comprises 78 texts (18 673
words), mostly representing Northern varieties of Selkup.<br>
<br>
<br>
2. INEL Kamas Corpus (v0.1)<br>
<a class="moz-txt-link-freetext"
href="http://hdl.handle.net/11022/0000-0007-CAE6-2"
moz-do-not-send="true">http://hdl.handle.net/11022/0000-0007-CAE6-2</a><br>
<br>
Kamas belongs to the Samoyedic branch of the Uralic language
family. The language became extinct by the late 20th century,
with the death of its last known speaker, Klavdiya Plotnikova
(1895–1989). All the surviving Kamas texts document Forest
Kamas varieties spoken in the settlement of Abalakovo, in the
present Krasnoyarsk Krai in Southern Siberia.<br>
The INEL Kamas corpus is the first publicly available digital
resource with annotated Kamas texts. The INEL Kamas corpus
consists of two parts: folklore texts collected by Kai Donner
in 1912–1914, and transcribed audio recordings of Klavdiya
Plotnikova made between 1964 and 1970 in Abalakovo, Tartu and
Tallinn. Most of these recordings were transcribed within the
INEL project (including re-transcribing some tapes, fragments
of which were published by Ago Künnap in 1976–1992).<br>
The present version of the corpus comprises 137 texts (48 293
words); this includes 16 texts collected by Kai Donner and 121
texts from the recordings of Klavdiya Plotnikova (ca. 10.5
hours).<br>
<br>
<br>
Working with the corpora<br>
<br>
The data in the corpora (annotated texts as well as
corresponding metadata) are represented in XML formats of the
freely distributed EXMARaLDA suite (<a
class="moz-txt-link-freetext"
href="http://exmaralda.org/en/" moz-do-not-send="true">http://exmaralda.org/en/</a>).<br>
<br>
User guides (in English) are available here:<br>
<a class="moz-txt-link-freetext"
href="https://corpora.uni-hamburg.de/hzsk/en/islandora/object/file:selkup-0.1_INEL_Selkup_Corpus_0.1_User_Documentation/datastream/PDF/INEL_Selkup_Corpus.pdf"
moz-do-not-send="true">https://corpora.uni-hamburg.de/hzsk/en/islandora/object/file:selkup-0.1_INEL_Selkup_Corpus_0.1_User_Documentation/datastream/PDF/INEL_Selkup_Corpus.pdf</a><br>
<a class="moz-txt-link-freetext"
href="https://corpora.uni-hamburg.de/hzsk/en/islandora/object/file:kamas-0.1_INEL_Kamas_Corpus_0.1_User_Documentation/datastream/PDF/INEL_Kamas_Corpus.pdf"
moz-do-not-send="true">https://corpora.uni-hamburg.de/hzsk/en/islandora/object/file:kamas-0.1_INEL_Kamas_Corpus_0.1_User_Documentation/datastream/PDF/INEL_Kamas_Corpus.pdf</a><br>
<br>
For browsing (and playback) of individual texts, use the
"Sessions" tab on the main corpus page. Each text can be
viewed in one of three online formats (e.g. Visualizations:
Score) and downloaded in EXB (an EXMARaLDA format). The
sources of texts, i.e. scanned pages (PDF) or sound files
(WAV, MP3), can also be viewed/downloaded.<br>
<br>
For searching across the whole corpus, the complete archive of
the corpus files can be downloaded and searched with the EXAKT
program of the EXMARaLDA suite.<br>
Furthermore, in the next few weeks, an online search interface
will be opened for both corpora, based on the Tsakonian Corpus
Platform (Tsakorpus, <a class="moz-txt-link-freetext"
href="https://bitbucket.org/tsakorpus/"
moz-do-not-send="true">https://bitbucket.org/tsakorpus/</a>).
A test search across a fragment of the Selkup corpus is
currently available at <a class="moz-txt-link-freetext"
href="https://inel.corpora.uni-hamburg.de/SelkupCorpus/search"
moz-do-not-send="true">https://inel.corpora.uni-hamburg.de/SelkupCorpus/search</a>.<br>
<br>
Please send your comments and suggestions to: <a
class="moz-txt-link-abbreviated"
href="mailto:inel@uni-hamburg.de" moz-do-not-send="true">inel@uni-hamburg.de</a>.<br>
<br>
Best regards,<br>
Alexandre Arkhipov</font><br>
</p>
</div>
</body>
</html>