35.494, Review: Germanic Phylogeny: Hartmann (2023)

The LINGUIST List linguist at listserv.linguistlist.org
Mon Feb 12 16:05:07 UTC 2024


LINGUIST List: Vol-35-494. Mon Feb 12 2024. ISSN: 1069 - 4875.

Moderators: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Everett Green, Daniel Swanson, Maria Lucero Guillen Puon, Zackary Leech, Lynzie Coburn, Natasha Singh, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Justin Fuller <justin at linguistlist.org>
================================================================


Date: 13-Feb-2024
From: Bev Thurber [bev at pagophilia.com]
Subject: Computational Linguistics, Historical Linguistics: Hartmann (2023)


Book announced at https://linguistlist.org/issues/34.2142

AUTHOR: Frederik Hartmann
TITLE: Germanic Phylogeny
SERIES TITLE: Oxford Studies in Diachronic and Historical Linguistics
PUBLISHER: Oxford University Press
YEAR: 2023

REVIEWER: Bev Thurber

SUMMARY
The Germanic languages all derive from a common ancestor, but how
exactly did that happen? What happened on the way from Proto-Germanic
to its descendants? Hartmann tries to answer these difficult questions
using two different computational models. Detailed descriptions of the
models form the core of the book. This core is flanked by introductory
and concluding sections, each consisting of two chapters.
The first section comprises the introduction and Chapter 2, “Data”.
The introduction summarizes previous work and lays out the plan: to
model the Germanic languages using a tree and waves. Chapter 2 lays
out the underpinnings of the models in a mere five pages. The dataset
was derived from Agee’s glottometric study (2018) of Gothic, Old
English, Old Norse, and Old High German, with additional data for Old
Frisian and Old Saxon from Bremmer (2009) and Rauch (1992),
respectively. Hartmann also added Burgundian and Vandalic despite
their poor attestation. The dataset is represented as a matrix in
which each column represents a language and each row an innovation.
Each cell holds a zero (the language lacks the innovation), a one (the
language has the innovation), or a question mark (unknown). Since
little is known about
Burgundian and Vandalic, their columns include numerous question
marks. The complete dataset for 479 innovations is given in the
appendix (221–241).
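The matrix layout described above can be illustrated with a minimal
sketch. The cell values below are invented for illustration and are
not taken from Hartmann's appendix; the helper names (column,
missing_share, shared_innovations) are likewise hypothetical.

```python
# Toy innovation matrix. All values are invented for illustration;
# they are NOT data from Hartmann (2023) or Agee (2018).
LANGUAGES = ["Gothic", "Old English", "Old Norse", "Burgundian"]

# One row per innovation, one column per language:
# "1" = innovation present, "0" = absent, "?" = unknown.
MATRIX = [
    ["0", "1", "1", "?"],
    ["1", "0", "0", "?"],
    ["0", "1", "0", "0"],
    ["1", "0", "0", "1"],
]


def column(matrix, languages, name):
    """Return the innovation vector (one column) for a language."""
    idx = languages.index(name)
    return [row[idx] for row in matrix]


def missing_share(vector):
    """Fraction of innovations whose status is unknown ("?")."""
    return vector.count("?") / len(vector)


def shared_innovations(matrix, languages, a, b):
    """Count innovations attested as present ("1") in both languages."""
    va = column(matrix, languages, a)
    vb = column(matrix, languages, b)
    return sum(1 for x, y in zip(va, vb) if x == "1" and y == "1")


if __name__ == "__main__":
    for lang in LANGUAGES:
        vec = column(MATRIX, LANGUAGES, lang)
        print(lang, "".join(vec), f"unknown: {missing_share(vec):.0%}")
```

Keeping "?" as its own symbol, rather than coding it as 0, means that
a poorly attested language is not mistaken for one that lacks the
innovations in question.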
Chapters 3 and 4 form the core of the book. They describe,
respectively, the tree- and wave-based models.
Chapter 3, “Tree-based phylogenetics”, begins with a summary of the
different phylogenetic methods that have been used in the past,
including detailed discussions of various distance-based and Bayesian
models, before delving into the details of the Bayesian model used in
the present study. Six different versions of the model were tested to
evaluate different sets of assumptions. Two parts of the algorithm
varied across the versions:
1. The tip dating mechanism used to estimate when different languages
branched from the tree was either bounded or inferred.
2. The substitution model, which controls how quickly zeros change to
ones (or ones to zeros) in the vectors for the various languages, had
three variations: a Jukes-Cantor model for a constant rate; an
innovation-only model expressed as a high rate for going from 0 to 1,
making it impossible to go from 1 to 0; and a variable-rate model that
attempts to infer an appropriate rate from the data.
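To make the difference between these substitution models concrete,
here is a minimal sketch using a generic two-state continuous-time
Markov chain. This is an assumption made for illustration, not
necessarily the parameterization used in the book; the function names
and the rate values are hypothetical. An equal-rates model sets both
rates to the same value, an innovation-only model sets the 1-to-0 rate
to zero, and a variable-rate model would treat both rates as free
parameters to be inferred.

```python
import math


def transition_probs(rate_01, rate_10, t):
    """Transition matrix after time t for a two-state Markov chain.

    rate_01 is the rate of gaining an innovation (0 -> 1) and rate_10
    the rate of losing it (1 -> 0). Entry [i][j] of the result is the
    probability of ending in state j when starting in state i.
    """
    total = rate_01 + rate_10
    if total == 0:
        return [[1.0, 0.0], [0.0, 1.0]]  # nothing ever changes
    decay = 1.0 - math.exp(-total * t)
    p01 = rate_01 / total * decay
    p10 = rate_10 / total * decay
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def equal_rates(rate, t):
    """Jukes-Cantor-style variant: gains and losses are equally likely."""
    return transition_probs(rate, rate, t)


def innovation_only(rate, t):
    """Innovations can be gained but never lost (1 -> 0 is impossible)."""
    return transition_probs(rate, 0.0, t)


if __name__ == "__main__":
    print("equal rates:    ", equal_rates(0.5, 2.0))
    print("innovation only:", innovation_only(0.5, 2.0))
```

Under the innovation-only variant the bottom-left entry is always
zero, which is what rules out reversals along a branch.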
Each of the six model variations produces an optimal tree. Hartmann
compared these trees to find the best model. The winner was the model
with bounded tip dating and a variable substitution rate. Despite
winning the competition, this model showed “very limited support” for
subgroupings of the Germanic languages other than West Germanic and
Old High German/Old Saxon (74). In particular, this model did not
identify the subgroup containing Old English and Old Frisian; only the
innovation-only models did. Because of these limitations, Hartmann
concludes that “the degree of horizontal transmission, areal changes,
and linguistic contact have yielded a diversification process for
these languages that is incompatible with rigid tree-like structures,
or at least not captured by them” (78). The next model attempts to
capture this process.
Chapter 4, “A wave model implementation”, is the longest and most
complex of the book. Its purpose is to model the Germanic languages as
a dialect continuum with innovations transmitted across space as well
as time. The chapter begins with a general description of agent-based
models and how they have been applied to language differentiation
problems. This section highlights a key difference between the
Bayesian tree model of Chapter 3 and the agent-based model: the latter
is generative, meaning that it runs simulations to show how the
individual languages could have developed over time.
In Hartmann’s agent-based model, each agent represents a speech
community with three attributes: a place on the simulated terrain, an
innovation vector, and a set of parameter values. The simulation
begins with a small number of agents with all-zero linguistic
innovation vectors representing Proto-Germanic. As time passes, these
agents move around and spawn new agents, gradually filling the space
with speech communities. These speech communities are defined by
shared sets of innovations. As they move, the agents transfer
linguistic innovations to other agents (represented by changing a
value in the innovation vector).
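The general shape of such a simulation can be sketched in a few lines.
This is a toy under stated assumptions and not Hartmann's
implementation: the grid terrain, the random-walk movement, the
spontaneous origin of innovations, and every probability and name
below (Agent, step, simulate, spread_prob, and so on) are hypothetical
choices made only to illustrate the three agent attributes and the
move, spawn, and transfer steps described above.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    x: int              # column on the simulated terrain
    y: int              # row on the simulated terrain
    innovations: list   # 0/1 vector; all zeros stands for Proto-Germanic
    spread_prob: float  # hypothetical parameter: chance of passing on a change


def step(agents, rng, size, innovate_prob=0.05, spawn_prob=0.1, max_agents=50):
    """Advance the toy simulation by one time step (modifies agents in place)."""
    newborn = []
    for agent in agents:
        # Move: random walk of at most one cell, clamped to the grid.
        agent.x = min(size - 1, max(0, agent.x + rng.choice((-1, 0, 1))))
        agent.y = min(size - 1, max(0, agent.y + rng.choice((-1, 0, 1))))
        # Assumed mechanism: an innovation occasionally arises spontaneously.
        if rng.random() < innovate_prob:
            agent.innovations[rng.randrange(len(agent.innovations))] = 1
        # Spawn: a new community inherits its parent's place and innovations.
        if len(agents) + len(newborn) < max_agents and rng.random() < spawn_prob:
            newborn.append(
                Agent(agent.x, agent.y, list(agent.innovations), agent.spread_prob)
            )
    agents.extend(newborn)
    # Contact: communities in adjacent cells may pass innovations on.
    for donor in agents:
        for receiver in agents:
            if donor is receiver:
                continue
            if abs(donor.x - receiver.x) > 1 or abs(donor.y - receiver.y) > 1:
                continue
            for i, present in enumerate(donor.innovations):
                if (present and not receiver.innovations[i]
                        and rng.random() < donor.spread_prob):
                    receiver.innovations[i] = 1


def simulate(steps=100, size=10, n_innov=8, n_start=3, seed=0):
    """Run the toy model and return the final list of agents."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    agents = [
        Agent(rng.randrange(size), rng.randrange(size), [0] * n_innov, 0.2)
        for _ in range(n_start)
    ]
    for _ in range(steps):
        step(agents, rng, size)
    return agents


if __name__ == "__main__":
    final = simulate()
    varieties = {tuple(a.innovations) for a in final}
    print(len(final), "agents,", len(varieties), "distinct innovation vectors")
```

Grouping the final agents by identical innovation vectors, as the last
lines do, gives a crude analogue of speech communities defined by
shared sets of innovations.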
A good portion of Chapter 4 is dedicated to analyzing the results of
200,000 simulation runs. Analyzing individual runs allowed Hartmann to
estimate how long it took for individual languages to reach
recognizable forms, i.e., for most of the important innovations to
occur. The model was able to provide a relative chronology for some
innovations as well as dates for some specific events. The model’s
“inferred origin time”—when Proto-Germanic broke up—was around 500 BC
(140), and Proto-Germanic *ai became ē everywhere except Burgundian
around 480 BC (148).
Principal component analyses enabled Hartmann to visualize how the
Germanic languages separated over time. He saw the eastern languages
departing first, followed by “internal diversifications in Northwest
Germanic” (169). Then, Old Saxon and Old High German separated from
Northwest Germanic while Old Norse, Old English, and Old Frisian
continued to overlap. Two results that Hartmann highlighted are that
Old Norse didn’t spread across water as well as the rest of Northwest
Germanic and that innovations had less trouble being transmitted on
the eastern side of the terrain. In contrast, innovations spread from
“epicenters” in the west, with the main ones located in the Old
English and Old High German areas (171).
The final part of the book consists of two chapters that focus on the
bigger picture.
Chapter 5, “Genealogical implications and Germanic phylogeny”,
connects the model’s results with what is known of the history of the
region occupied by Germanic speakers. The main contribution of this
chapter is Hartmann’s “attempt to construct a stemma” (208–210). He
proposes a modified tree structure with Proto-Germanic giving rise to
what he calls Core-Germanic and an East Germanic dialect continuum
comprising Gothic, Vandalic, and Burgundian. Under Core-Germanic
Hartmann places a dialect continuum of North Germanic, which is the
parent of Old Norse, and West Germanic, which gives rise to a dialect
continuum comprising the other languages (208). To better express
these dialect continua, Hartmann provides circle plots that represent
different stages of the languages’ development. The final plot (figure
5.4) shows a Continental West Germanic continuum containing Old
Frisian, Old Saxon, and Old High German with Old English and Old Norse
on the outside. Old Norse is placed directly across the border from
Old Frisian, while Old English sits between Old Frisian and Old Saxon.
Chapter 6, “Computational tree and wave models—final remarks”, puts
the models and their results in a broader context. The results of the
two models are compared with those of Agee (2018). In general, the two
studies seem to match up reasonably well, but certain details differ.
For example, all three models identified Northwest Germanic as a
subgroup, but the strength of this subgroup varied substantially in
the different studies. Hartmann suggests that this is due to the
different weight placed on innovations (216). Putting more weight on
them results in stronger and clearer subgroups. The chapter concludes
with a short section titled “Of hammers and nails” about finding the
right applications for each model. Hartmann found the wave model “a
novel method to complement the robust tree models” (217). He cautions
that “the results of each method will mostly fit within the framework
of the method itself” (217).
The book ends with two long tables (the innovation dataset and a
complete list of the estimated times at which innovations occurred),
the references, and brief indices of languages and subjects.
EVALUATION
Hartmann has tried to simplify the extremely complex process of
language evolution to a solvable math problem. Just how hard this is
becomes apparent quite early on, with the discussion of the dataset.
That little information is available for Vandalic and Burgundian made
the problem more complex: those languages were assigned question marks
for many innovations, which seemed worrying despite Hartmann’s
assurance that both of his models “have mechanisms to address the
issue of missing data and mitigate the effect of fewer datapoints for
these languages” (14).
The study raises important questions for understanding how the
Germanic languages separated. Ingvaeonic provides an example of the
type of question raised. Although Agee (2018: 49) identified
Ingvaeonic as a subgroup within Germanic, only Hartmann’s
innovation-only phylogenetic models concurred. This raises the
question of how strongly innovations should be weighted in comparison
with other factors when grouping languages. Additionally, Agee (2018:
49) identified an Anglo-Frisian subgroup, while Hartmann’s study
supports the idea that Old English and Old Frisian are similar because
they were part of a dialect continuum, not necessarily because they
evolved together. Hartmann suggests that both Ingvaeonic and
Anglo-Frisian are probably “an artefact of the linguistic
discretization of languages in a diversifying dialect continuum”
(220).
The stated goal of the book is to apply computational methods to the
development of the Germanic languages using previous research as a
guide. This is an accurate description of the book’s content, but
Hartmann has achieved more. This investigation opens up new lines of
study, such as the application of agent-based models, while remaining
grounded in traditional Germanic linguistics.
Overall, the book is fascinating in terms of both linguistic content
and approach. The linguistic conclusions are close enough to what is
already known to suggest that these computational models are worth
further study. The detailed descriptions of the algorithms, especially
the agent-based one, left my fingers itching to write some code and
try them out for myself. I look forward to future applications of
these methods.
REFERENCES
Agee, Joshua. 2018. A glottometric subgrouping of the early Germanic
languages. MA thesis. San Jose State University.
Bremmer, Rolf H. 2009. An introduction to Old Frisian: History,
grammar, reader, glossary. Amsterdam: John Benjamins.
Rauch, Irmengard. 1992. The Old Saxon Language: Grammar, epic
narrative, linguistic interference, vol. 1. New York: Peter Lang.

ABOUT THE REVIEWER

Bev Thurber is an independent scholar whose interests include
historical linguistics and the history of ice skating.


