<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Dear Björn,</p>
<p>I have never systematically studied the quality of the output of
different annotators, so please consider me incompetent in this
respect. However, any such study obviously presupposes
a definition of what a good/correct annotation is. Such a
definition would be possible under certain conditions:</p>
<ol>
<li>The utterance to be annotated has one linguistic
(phonological, grammatical, semantic) structure. This implies
that its meaning is known and there is no (licit) variation of
annotations reflecting an ambiguity in the data.</li>
<li>There is a complete linguistic description of the language.
Among other things, it comprises lists of all linguistic units,
the regularities in their distribution and the set of
constructions that they form.</li>
<li>On the basis of this description, annotation guidelines are
formulated which provide a procedure by which the identity of a
unit found in an utterance is to be determined.</li>
<li>The annotation grid stipulates a representation for every
linguistic unit to be annotated.</li>
</ol>
<p>If all of this (unless I am forgetting something) could be made formally
explicit, then even an algorithm could produce a correct
annotation. It cannot be made fully explicit because of semantic
and pragmatic factors which cannot be systematized. Now if we
ignore these for a moment, then a given annotation is either
correct or incorrect, and the comparison of the output of annotators
boils down to an examination of whether their annotations are
correct. Given this, it would seem to be of secondary importance
whether an annotator is a native speaker or a linguist or something
else; the only question is to what extent he or she follows the
guidelines.</p>
<p>The moral of my argument is: the burden rests principally on
the person who formulates the guidelines. The
annotator can do no better than the guidelines allow.</p>
<p>--------------------------------------------------</p>
<div class="moz-cite-prefix">On 03.01.2026 at 12:54, Wiemer,
Bjoern via Lingtyp wrote:<br>
</div>
<blockquote type="cite"
cite="mid:41f16f708cbc43f48c87e00fb0e7da5c@uni-mainz.de">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US">Dear All,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US">since this
seems to be the first post on this list this year, I wish
everybody a successful, more peaceful and decent year than
the previous one.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US">I want to
raise an issue which goes back to a discussion from October
2023 on this list (see the thread below, in reverse
chronological order). I’m interested to know whether anybody
has a satisfying answer to the question of how to deal with
semantic annotation, or the annotation of more complex (and
less obvious) relations, in particular with the annotation
of interclausal relations, both in terms of syntax and in
semantic terms. Problems arise already with the
coordination-subordination gradient, which ultimately is an
outcome of a complex bundle of semantic criteria (like
independence of illocutionary force, perspective from which
referential expressions like tense or person deixis are
interpreted; see also the factors that were analyzed
meticulously, e.g., by Verstraete 2007). Other questions
concern the coding of clause-initial “particles”: are they
just particles, operators of “analytical mood”, or
complementizers? (Notably, these things do not exclude one
another, but they heavily depend on one’s theory, in
particular one’s stance toward complementation and mood.)
Another case in point is the annotation of the functions and
properties of constructions in TAME-domains, especially if
the annotation grid is more fine-grained than mainstream
categorizing.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US">
The problems which I have encountered (in pilot studies) are
very similar to those discussed in October 2023 for
seemingly even “simpler”, or more coarse-grained
annotations. And they become considerably worse when we turn to data
from diachronic corpora: even if being an informed native
speaker is usually an asset, with diachronic data this asset
is often useless, and native knowledge may even be a
hindrance since it leads analysts to project their own habits
and norms of contemporary usage onto earlier stages of the
“same” language. (Similar points apply to closely related
languages.) I entirely agree that annotators have to be
trained, and annotation grids tested, first of all
because you have to exclude the (very likely) possibility
that raters disagree just because some of the criteria are
not clear to at least one of them (with the consequence that
you cannot know whether disagreement or a low kappa
results from misunderstandings rather than reflecting
properties of your object of study). I also agree that each
criterion of a grid has to be sufficiently defined, and that the
annotation grid as such (or even its “history”) has to be
documented in order to secure objective criteria for
replicability and comparability (for cross-linguistic
research, but also for diachronic studies based on a series
of “synchronic cuts” of the given language).<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US">Against this
background, I’d like to formulate the following questions:<o:p></o:p></span></p>
<ol style="margin-top:0cm" start="1" type="1">
<li class="MsoListParagraph"
style="color:black;margin-left:0cm;mso-list:l0 level1 lfo3">
<span lang="EN-US" style="mso-fareast-language:EN-US">Which
arguments are there that (informed) native speakers are
better annotators than linguistically well-trained
students/linguists who are not native speakers of the
respective language(s), but can be considered experts?<o:p></o:p></span></li>
<li class="MsoListParagraph"
style="color:black;margin-left:0cm;mso-list:l0 level1 lfo3">
<span lang="EN-US" style="mso-fareast-language:EN-US">Conversely,
which arguments are there that non-native speaker experts
might be even better suited as annotators (for this or
that kind of issue)?<o:p></o:p></span></li>
<li class="MsoListParagraph"
style="color:black;margin-left:0cm;mso-list:l0 level1 lfo3">
<span lang="EN-US" style="mso-fareast-language:EN-US">Have
assumptions about pluses and minuses of both kinds of
annotators been tested in practice? That is, do we have
empirical evidence for any such assumptions (or do we just
rely on some sort of common sense, or on the personal
experience of those who have done more complicated
annotation work)?<o:p></o:p></span></li>
<li class="MsoListParagraph"
style="color:black;margin-left:0cm;mso-list:l0 level1 lfo3">
<span lang="EN-US" style="mso-fareast-language:EN-US">How
can pluses and minuses of both kinds of annotators be
counterbalanced in a way that does not consume too much time (and
money)?<o:p></o:p></span></li>
<li class="MsoListParagraph"
style="color:black;margin-left:0cm;mso-list:l0 level1 lfo3">
<span lang="EN-US" style="mso-fareast-language:EN-US">What
can we do with data from diachronic corpora if we have to
admit that (informed) native speakers are of no use, and
non-native experts are not acknowledged, either? Are we
simply doomed to refrain from any reliable and valid
in-depth research based on annotations (and statistics)
for diachronically earlier stages and for diachronic
change?<o:p></o:p></span></li>
<li class="MsoListParagraph"
style="color:black;margin-left:0cm;mso-list:l0 level1 lfo3">
<span lang="EN-US" style="mso-fareast-language:EN-US">In
connection with this, has any cross-linguistic research
that is interested in diachrony tried to implement
insights from fields such as historical semantics and
pragmatics into annotations? In typology, linguistic
change has become increasingly prominent during the
last 10-15 years (not only from a macro-perspective). I
thus wonder whether typologists have tried to “borrow”
methodology from fields that have possibly been better at
interpreting diachronic data, and even at quantifying them (to
some extent).<o:p></o:p></span></li>
</ol>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US">I don’t want
to be too pessimistic, but if we have no good answers as to
who should be doing annotations – informed native speakers
or non-native experts (or only those who are both native and
experts)? – and how we might be able to test the validity of
annotation grids (for comparisons across time and/or
languages), there won’t be convincing arguments for how to deal
with diachronic data (or data from lesser-studied languages
for which there might be no native speakers available) in
empirical studies that are meant to disclose more fine-grained
distinctions and changes, also in order to quantify them. In
particular, reviewers of project applications may always ask
for a convincing methodology, and if no such research is
funded we’ll remain ignorant of many of the causes and
circumstances of language change.
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US">I’d
appreciate advice, in particular if it provides answers to
any of the questions under 1-6 above.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US">Best,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US">Björn
(Wiemer).<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"
style="color:black;mso-fareast-language:EN-US"><o:p> </o:p></span>--
<br>
<br>
</p>
</div>
</blockquote>
<div class="moz-signature">
<p style="font-size:90%">Prof. em. Dr. Christian Lehmann<br>
Rudolfstr. 4<br>
99092 Erfurt<br>
<span style="font-variant:small-caps">Deutschland</span></p>
<table style="font-size:80%">
<tbody>
<tr>
<td>Tel.:</td>
<td>+49/361/2113417</td>
</tr>
<tr>
<td>E-mail:</td>
<td><a class="moz-txt-link-abbreviated" href="mailto:christianw_lehmann@arcor.de">christianw_lehmann@arcor.de</a></td>
</tr>
<tr>
<td>Web:</td>
<td><a class="moz-txt-link-freetext" href="https://www.christianlehmann.eu">https://www.christianlehmann.eu</a></td>
</tr>
</tbody>
</table>
</div>
</body>
</html>