[Lingtyp] Should we include original scripts for examples in typological publications?

Sebastian Nordhoff sebastian.nordhoff at glottotopia.de
Mon Nov 17 10:40:41 UTC 2025


Dear all,
a couple of remarks from a publisher's perspective. Language Science 
Press has published books with Latin, Cyrillic, Arabic, Chinese, 
Japanese, Korean, Hebrew, Syriac, Georgian and probably some more 
scripts. I personally like scripts, and I am happy and proud that we can 
produce books which have them.

Language Science Press published "Arabic and contact-induced change", a 
massive work of 700 pages, but it does not use Arabic script. I do not 
know the precise rationale, but I would suspect that the Arabic script 
itself is not really helpful when discussing, e.g., vowel changes.

We have had people submit textbooks, which proudly featured Arabic 
script ت​ع​ل​ق, but تعلق (connected) would have been required. I second 
Peter Arkadiev's idea that haphazardly copy-pasting foreign scripts and 
getting it wrong is worse than not including them in the first place.
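
A note on how such errors might be caught before typesetting: one plausible 
cause of disconnected Arabic letters is invisible characters (for example 
U+200B ZERO WIDTH SPACE) that come along when copy-pasting; another is the 
use of isolated presentation forms instead of the ordinary letters. The 
sketch below is only an illustration under those assumptions, not a 
description of any publisher's actual workflow; the function name 
find_suspect_characters is made up for this example. Note that some format 
characters, such as U+200C ZERO WIDTH NON-JOINER, are legitimate in certain 
orthographies (e.g. Persian), so a hit is a prompt for a human check rather 
than proof of an error.

```python
import unicodedata


def find_suspect_characters(text):
    """Return (index, code point, name) for characters that may break joining.

    Hypothetical helper: flags Unicode format characters (category "Cf",
    which includes zero-width spaces and joiners) and code points in the
    Arabic Presentation Forms blocks.
    """
    hits = []
    for i, ch in enumerate(text):
        cp = ord(ch)
        is_format = unicodedata.category(ch) == "Cf"
        is_presentation_form = 0xFB50 <= cp <= 0xFDFF or 0xFE70 <= cp <= 0xFEFF
        if is_format or is_presentation_form:
            hits.append((i, f"U+{cp:04X}", unicodedata.name(ch, "<unnamed>")))
    return hits


# Assumed reconstruction of the two spellings: the same four letters,
# once with zero-width spaces between them and once without.
broken = "\u062A\u200B\u0639\u200B\u0644\u200B\u0642"
clean = "\u062A\u0639\u0644\u0642"

for index, code_point, name in find_suspect_characters(broken):
    print(index, code_point, name)
```

Running the function on the clean string returns an empty list, while the 
broken string yields one hit for each invisible character.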

We are currently preparing a translation of "Tone in Yongning Na" into 
Chinese, and the rendering of phonetic information in a logographic 
script is a whole science of its own.

If the point of an article is, say, topicalization in the languages of 
the world, the permutation of ABCD to become DABC can be understood 
based on the rendering of the words in Latin script. Of course, it would 
be kind to the speakers of Thai, Hindi, Chinese and Arabic to include 
all those scripts for the examples in their language, but it would not 
really make the argument any clearer, and it might actually confuse them 
in the other three cases, where a script they do not know appears next 
to the Latin script.

From my own research, I would like to raise the issue of diglossia. In 
a diglossic situation, Latin script will allow the representation of the 
spoken variety, which is often what typologists are interested in. Many 
people can tell stories of speakers stating "ah, if you want to write it 
down, it has to be different!" Using Latin script circumvents this 
problem. If you try to retrofit the local script afterwards, I foresee 
all kinds of problems.

Finally, orthographic details are often irrelevant for the point being 
made. Is the correct orthographic rendering of /buumi/ 'land' in Sinhala 
භූමි or ඛූමි? Both are pronounced identically, and speakers are often 
unsure which orthographic rendering to choose. Given that research time 
is limited, it is questionable whether the attention should go to 
orthographic detail which is irrelevant to the point being made.

What I say above concerns typological papers. Things are of course 
different for philological papers.

Best wishes
Sebastian


On 11/14/25 18:25, Konstantin Henke via Lingtyp wrote:
> Dear Lingtyp members,
> 
> I hope this is not an old topic with a consensus I'm not aware of. If it 
> is, please forgive me for re-opening it.
> 
> In the overwhelming majority of example sentences/forms in typological 
> publications I do not see another line providing the original script 
> where one exists for the surveyed language (Thai, Chinese, Korean, 
> Japanese, certain Slavic languages, etc.). It might be a domain-specific 
> thing (I've mostly been working with spatial semantics) but researchers 
> in other domains may have been wondering about the same thing.
> 
> I understand that adding another written representation to the Latin 
> transliteration does not serve the endeavor of typology, which is based 
> on segments that are ideally naturally produced (i.e. spoken) and that 
> especially non-phonemic/phonetic scripts do not add any value for the 
> greater part of a broader audience of researchers and other readers. 
> Instead, adding these scripts eats up space and may even be perceived as 
> an unnecessary show-off with something that looks pretty or exotic.
> 
> Having studied in Taiwan, where Mandarin speakers even in the academic 
> realm are often not familiar with Pinyin, the de-facto standard Latin 
> transliteration of their language, I frequently witnessed them struggle 
> to read examples presented in their very own language if Chinese 
> characters are missing. China, on the other hand, is arguably a rather 
> rare case where the academically used transliteration (Pinyin with tone 
> diacritics) does happen to be almost the same as the most common input 
> method on electronic devices (Pinyin without tone diacritics). I'm not 
> sure if my observation in Taiwan generalizes well, but I wouldn't be 
> surprised if fellow researchers from Thailand, Korea, Japan, Russia etc. 
> struggled to read their language in Latin transliteration. I'm actually 
> quite surprised to see a discipline concerned with freeing itself from 
> Eurocentric bias care so little about its accessibility to non-European 
> contributors and readers.
> 
> That said, I may be overlooking something in addition to the few 
> counter-points mentioned above. I do empathize with the argument that a 
> push for naturalistic data might imply the wish to rid oneself of the 
> burden of written representation (but then we might as well just provide 
> all examples of spoken data in IPA, which I have seen a few researchers 
> do even for familiar IE languages). I would also understand the space 
> question if it weren't for the fact that everyone just reads PDFs now 
> anyways. Layout/font-related issues should hardly pose a problem in the 
> age of Unicode, either. Am I missing something, or are we really just 
> being lazy?
> 
> I'd appreciate any input!
> 
> Best,
> Konstantin
> 
> PS: I'm obviously talking about cases where the original script adds 
> readability for native speakers. Whether or not to add less commonly 
> used scripts like Javanese to raise awareness or for similar reasons, is 
> probably a different topic.
> 
> _______________________________________________
> Lingtyp mailing list
> Lingtyp at listserv.linguistlist.org
> https://listserv.linguistlist.org/cgi-bin/mailman/listinfo/lingtyp


