[Corpora-List] Corpus Sanitation - no
Tadeusz Piotrowski
tadpiotr at plusnet.pl
Mon Dec 2 11:23:15 UTC 2002
If this is a sort of vote, I say "no". No -- to cutting out parts of
text from a corpus that is meant to be a sample of a certain type of
discourse. A qualified yes, on the other hand, to pedagogic uses of a
corpus, in which perhaps some pruning can be desirable.
I do understand the doubts of those who would not like to do any harm to
their interviewees, or authors, etc., but: is not the significance of a
corpus somewhat exaggerated? Or even the knowledge that there is such a
thing as a corpus? A person has to know that there is a corpus, that
he/she can use it, etc. I am a lexicographer, and that reminds me of the attitude
towards dictionaries: let's throw out the offensive words because
children may read them and learn them. Are there children that do not
know those words? Or let's add some information that is correct
(the American Heritage Dictionary has entries on the wives of ALL American
presidents. Seems a waste of space really).
Yours
profesor Tadeusz Piotrowski
Instytut Filologii Angielskiej
Uniwersytet Opolski
Kopernika 2
Opole, Poland
> -----Original Message-----
> From: owner-corpora at lists.uib.no
> [mailto:owner-corpora at lists.uib.no] On Behalf Of Scott Sadowsky
> Sent: Monday, December 02, 2002 8:19 AM
> To: CORPORA at hd.uib.no
> Subject: Re: [Corpora-List] Corpus Sanitation - no
>
>
> On 12/1/2002 23:09, Christoph Neumann wrote the following:
>
> >I hope that we are never going to be politically, sexually,
> >religiously "correct", but only scientifically correct and adequate.
>
> I certainly agree in principle. In practice, however, these issues can
> become so convoluted and complicated that they defy any easy solution.
> Let me describe a situation that a colleague and I are currently facing.
>
> About six months ago, my colleague recorded an interview with a woman
> of a certain profession who mentioned in the interview that she had
> done work for a certain church. At one point, speaking of an extremely
> well-known and powerful member of the clergy, she said (the local
> equivalent of) "X is queer... in both senses of the word". This was
> clearly not meant as an insult, but as a statement of fact. She then
> qualified her statement during the next minute or so.
>
> A couple months later, it turns out that said clergyman stands accused
> of sexually abusing an impressive number of boys.
>
> Our dilemma is, of course, what to do with this recording.
>
> We have no intention of doing anything that could jeopardize the
> interviewee's anonymity. And in fact, not even her first name appears
> in the interview (which happens often enough, with interviewers trying
> to establish good rapport). So at first glance, there's no problem in
> this regard.
>
> It turns out, though, that she is one of maybe 3 or 4 practitioners of
> her profession in the whole country, and so identifying her would be
> child's play. Furthermore, censoring all the mentions of her
> profession is not an option, as something like half the interview is
> related to what she does for a living.
>
> On the other hand, we *really* don't want to throw this interview out,
> as the subject belongs to the single most elusive demographic group in
> the country, one which practically no one --linguists, sociologists,
> marketing folk, what have you-- ever obtains access to.
>
> Unfortunately, even if we somehow resolve the above issues, our dilemma
> does not end there. Libel, slander and defamation suits are a favorite
> pastime of the powerful in Chile, and the local archbishop has been
> threatening to bring such suits against anyone who denounces these
> types of crimes. Such suits are criminal actions here, which means
> that you get to wait for your trial in jail. And to complicate
> matters, in truly Kafkaesque fashion, the fact that a given statement
> is true is not an admissible defense in these matters.
>
> In short, lord knows what we'd be exposing ourselves to by including
> this interview in a publicly-available corpus. It's looking more and
> more like our only option is to sit on the recording and transcript,
> using them only internally.
>
> I'd certainly be interested in any thoughts anyone may have on this
> matter.
>
> Cheers,
> Scott
>
> _____________________________________________________________
> Scott Sadowsky
> Centro de Estudios Cognitivos, Universidad de Chile
> sadowsky at spanishtranslator.org . ssadowsk at icaro.dic.uchile.cl
> _____________________________________________________________
>
>
>