Corpora: a particular type of sloppiness
James L. Fidelholtz
jfidel at siu.buap.mx
Fri Apr 13 17:43:55 UTC 2001
On Wed, 11 Apr 2001, Alexandr Rosen wrote:
>
>> From: "Tadeusz Piotrowski" <tadpiotr at plusnet.pl>
>[snip]
>> I wonder what do the people do with other diacritic-rich languages? German?
>> French? Czech? Is it the same as in Polish?
Marco Antonio Esteves da Rocha <marcor at cce.ufsc.br> answered:
<< [snip]
Curious idea. The absence of diacritics in Portuguese is what disturbs
me, not their inclusion. It is difficult to be sure whether people on
the other end of the message have the equipment and configuration to
actually see those diacritics on screen in their e-mail editor. In fact,
what appears on different screens around the world when you produce
diacritics in your own equipment is quite unpredictable and may be
unreadable for the recipient. So people writing messages in Portuguese
often choose not to use them for safety.
<<
But it makes me feel very uncomfortable. It is not at all the feeling of
using pidgin Portuguese but of writing in a different language,
[snip]>>
and Alexandr Rosen <rosen at chomsky.ruk.cuni.cz> comments:
>I have always thought that the absence of diacritics in most Czech e-mails is
>due to the writer's awareness of the danger of character codes becoming
>garbage on the way, rather than due to the writer being lazy. In fact,
>[snip]
>I believe it is very unfortunate that we still don't have a reliable way of
>using a Latin-based (or any other) writing system on the Internet, sloppily or
>not.
>
Well, I live and often write in a Spanish-speaking country, and
the answer to Tadeusz's question is that it varies tremendously. I
would say the plurality, if not the majority, of people write without
accents for the reasons adumbrated by Marco Antonio and Alexandr. Of
the large number of messages I receive in Spanish from people who *do*
use accents, a *very* large proportion are botched up with
different kinds of codes. This includes even messages from myself
(via another server, of course), despite the fact that my server is set
up to read the encoding which includes Spanish. God knows what the
reason is (I guess I ought to, but then we all have our little areas of
inexplicable ignorance). ;)
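One common way such botching can arise is a mismatch between the 8-bit character set used to send a message and the one used to display it. The following is only a minimal sketch of that mechanism in Python; the choice of ISO-8859-1 on the sending side and ISO-8859-7 (Greek) or ISO-8859-5 (Cyrillic) on the receiving side is an assumption made for illustration, not a diagnosis of any particular server.

```python
# Assumption: the sender's mail program emits ISO-8859-1 (Latin-1) bytes,
# and somewhere along the way they are decoded under another 8-bit charset.
text = "Benemérita Universidad Autónoma"
raw = text.encode("iso-8859-1")

# Same bytes, three different readings.
round_trip = raw.decode("iso-8859-1")   # matching charset: text survives
as_greek = raw.decode("iso-8859-7")     # accented letters become Greek letters
as_cyrillic = raw.decode("iso-8859-5")  # accented letters become Cyrillic letters

for label, shown in [("latin-1", round_trip),
                     ("greek", as_greek),
                     ("cyrillic", as_cyrillic)]:
    print(f"{label:9s} {shown}")
```

The unaccented letters come through unchanged in every reading, because all of these character sets agree on the 7-bit ASCII range; only the accented letters are affected.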
I am not a native speaker of Spanish, and have argued in
published articles for the general elimination of accents and diacritics
from Spanish (and would be brash enough to make the same argument for
almost *any* language with diacritics, including Polish, Portuguese and
Czech). My reasons are low functional load for the diacritics in
general (messages I receive in Spanish without diacritics are close to
100% legible, and very close indeed to the legibility of msgs with
diacritics; I'd bet the same is true for Czech, and I know it is for
Polish-- the ó [if that got butchered up, it's an 'o' with an acute
accent over it], for example, is almost 100% predictable), also the
general dropping of diacritics in handwriting, etc. However, because of
the existence of the Royal Spanish Academy of the language and the
general inertia of tradition, this suggestion has a close to zero
probability of being accepted. So, being a hard-headed SOB, I demand
(especially of myself) the proper use of accents in *all* written
Spanish communications. Given the problems alluded to above for Polish
and Czech, however, which apply equally to Spanish, one wishes
to spare those who receive one's communications the systematic
butcheries which accented letters are prone to undergo. So I always
write the letter followed by the accent, which for Spanish is just one
extra keystroke (again, I'm too lazy to train myself in how to adapt the
keyboard). Although I am at least the equivalent of an educated native
Spanish speaker in my use of accents (a major problem for writers of
Spanish, by the way, because of some not-quite-predictable uses of
accents, and a few problematical or arbitrary cases), even I have some
lapses in my accentuation. The point here is that, except in a
hypothetical language in which accents really carried a functional load,
leaving them off will do very limited damage to the communication. I
say 'hypothetical' because people *do* leave off accents at the drop of
a hat, and this would in those cases impede communication,
theoretically, and so such an orthography would tend to be rapidly
modified. I strongly suspect that even the Portuguese example cited is
exaggerated, and would have limited effect on actual communication.
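The convention of writing the letter followed by the accent can be sketched mechanically. What follows is a hypothetical illustration, not a standard: the function name and the particular ASCII stand-ins chosen for each mark (apostrophe for the acute, tilde for the tilde, double quote for the dieresis) are assumptions made here.

```python
import unicodedata

# Hypothetical mapping from Unicode combining marks to ASCII stand-ins.
ASCII_MARKS = {
    "\u0301": "'",   # combining acute accent
    "\u0303": "~",   # combining tilde
    "\u0308": '"',   # combining dieresis
}

def accent_after_letter(text):
    """Rewrite accented letters as base letter + ASCII mark."""
    # NFD splits each accented letter into base letter + combining mark,
    # so the mark already follows the letter; it only needs replacing.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ASCII_MARKS.get(ch, ch) for ch in decomposed)

print(accent_after_letter("Benemérita Universidad Autónoma de Puebla"))
print(accent_after_letter("año"), accent_after_letter("pingüino"))
```

Characters outside the mapping, such as the inverted question and exclamation marks, pass through unchanged, so the result is not guaranteed to be pure ASCII.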
Jim
--
James L. Fidelholtz e-mail: jfidel at siu.buap.mx
Posgrado en Ciencias del Lenguaje tel.: +(52-2)229-5500 x5705
Instituto de Ciencias Sociales y Humanidades fax: +(01-2) 229-5681
Benemérita Universidad Autónoma de Puebla, MÉXICO