[Lingtyp] spectrograms in linguistic description and for language comparison

Sun Dec 11 03:20:18 UTC 2022

I agree with everything here, with one addendum: it's a strawman even if
you do ignore more formal judgment experiments.  The examples are invented,
but each data point is a *pairing* of an example and a judgment. Since the
judgments aren't invented (except in cases of misconduct), it's wrong to
say that the data are.

Neil

On Sat, Dec 10, 2022, 10:05 PM Adam Singerman <adamsingerman at gmail.com>
wrote:

> I think Randy is wrong (sorry if this comes across as blunt) and so I
> am writing, on a Saturday night no less, to voice a different view.
>
> Working inductively from a corpus is great, but no corpus is ever
> going to be large enough to fully represent a given language's
> grammatical possibilities. If we limit ourselves to working
> inductively from corpora then many basic questions about the languages
> we research will go unanswered. From a corpus of natural data we
> simply cannot know whether a given pattern is missing because the
> corpus is finite (i.e., it's just a statistical accident that the
> pattern isn't attested) or whether there's a genuine reason why the
> pattern is not showing up (i.e., its non-attestation is principled).
>
> When I am writing up my research on Tuparí I always prioritize
> non-elicited data (texts, in-person conversation, WhatsApp chats). But
> interpreting and analyzing the non-elicited data requires making
> reference to acceptability judgments. The prefix (e)tareman- is a
> negative polarity item, and it always co-occurs with (and inside the
> scope of) a negator morpheme. But the only way I can make this point
> is by showing that speakers invariably reject tokens of (e)tareman-
> without a licensing negator. Those rejected examples are by definition
> not going to be present in any corpus of naturalistic speech, but they
> tell me something crucial about what the structure of Tuparí does and
> does not allow. If I limit myself to inductively working from a
> corpus, fundamental facts about the prefix (e)tareman- and about
> negation in Tuparí more broadly will be missed.
>
> A lot of recent scholarship has made major strides towards improving
> the methodology of collecting and interpreting acceptability
> judgments. The formal semanticists who work on understudied languages
> (here I am thinking of Judith Tonhauser, Lisa Matthewson, Ryan
> Bochnak, Amy Rose Deal, Scott AnderBois) are extremely careful about
> teasing apart utterances that are rejected because of some
> morphosyntactic ill-formedness (i.e., ungrammaticality) versus ones
> that are rejected because of semantic or pragmatic oddity. The
> important point is that such teasing apart can be done, and the
> descriptions and analyses that result from this work are richer than
> what would result from a methodology that uses corpus examination or
> elicitation only.
>
> One more example from Tuparí: this language has an obligatory
> witnessed/non-witnessed evidential distinction, but the deictic
> orientation of the distinction (to the speaker or to the addressee) is
> determined via clause type. There is a nuanced set of interactions
> between the evidential morphology and the clause-typing morphology,
> and it would have been impossible for me to figure out the basics of
> those interactions without relying primarily on conversational data
> and discourse context. But I still needed to get some acceptability
> judgments to ensure that the picture I'd arrived at wasn't overly
> biased by the limitations of my corpus. Finding speakers who were
> willing to work with me on those judgments wasn't always easy; a fair
> amount of metalinguistic awareness was needed. But it was worth it!
> The generalizations that I was able to publish were much more solid
> than if I had worked exclusively from corpus data. And the methodology
> I learned from the Tonhauser/Matthewson/etc. crowd was fundamental to
> this work.
>
> The call to work inductively from corpora would have the practical
> effect of making certain topics totally inaccessible for research
> (control vs. raising structures, pied-piping, islands, gaps in
> inflectional paradigms, etc.) even though large-scale acceptability
> tasks have shown that these phenomena are "real," i.e., they're not
> just in the minds of linguists who are using introspection. Randy's
> point that "no other science allows the scientist to make up his or
> her own data, and so this is something linguists should give up" is a
> straw man argument now that many experimentalist syntacticians use
> large-scale acceptability judgments on platforms like Mechanical Turk
> to get at speakers' judgments. I think we do a disservice to our
> students and to junior scholars if we tell them that the only real
> stuff to be studied will be in the corpora that we assemble. Even the
> best corpora are finite, whereas L1 speakers' knowledge of their
> language is infinitely productive.
>
> — Adam