31.2931, Featured Linguist: Joakim Nivre



LINGUIST List: Vol-31-2931. Mon Sep 28 2020. ISSN: 1069 - 4875.

Subject: 31.2931, Featured Linguist: Joakim Nivre

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Lauren Perkins, Nils Hjortnaes, Yiwen Zhang, Joshua Sims
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================


Date: Mon, 28 Sep 2020 22:25:37
From: LINGUIST List [linguist at linguistlist.org]
Subject: Featured Linguist: Joakim Nivre

 
For this week's featured linguist we bring you a great piece from Professor
Joakim Nivre!

--

I am delighted to support the fund drive for the LINGUIST List in the year of
its 30th anniversary. Like so many of my colleagues, I have relied on the
services of the LINGUIST List throughout the years, and this gives me a
wonderful opportunity to share some glimpses from my career as a computational
linguist as well as some reflections on the development of the field during
these three decades.

When the LINGUIST List was started in 1990, I was a PhD student in general
linguistics at the University of Gothenburg, trying to complete a thesis on
situation semantics (a framework of formal semantics that has since faded into
oblivion) and mostly ignorant of the computational side of linguistics that
later became the focus of my career. The 1990s was the decade when
computational linguistics was transformed by the so-called statistical
revolution: a methodological shift from carefully hand-crafted, rule-based
systems, which delivered deep linguistic analyses but often lacked coverage
and robustness, to statistical models trained on corpus data, which traded
depth for breadth.

The statistical turn in computational linguistics is also what got me into the
field, more or less by accident. After graduating in 1992, I was hired as a
lecturer in the linguistics department in Gothenburg, where around 1995 there
was a pressing need for a course on statistical methods in computational
linguistics but no one qualified to teach it. Young and
foolish, and eager to learn something new, I decided to accept the challenge
and started developing a course, using as my main sources Eugene Charniak’s
beautiful little book Statistical Language Learning and a compendium on
statistics for linguists by Brigitte Krenn and Christer Samuelsson with the
words “Don’t Panic!” in big boldface letters on the cover. As it turned out,
the University of Gothenburg was not the only institution that needed someone
to teach statistical methods in computational linguistics at the time, and I
ended up almost making a career as an itinerant lecturer in statistical NLP in
Scandinavia and Northern Europe.

Eventually, I also managed to apply my newly acquired expertise to research,
notably in a series of papers on statistical part-of-speech tagging.
Fascinated by the power of inductive inference that allowed us to build
practical systems for linguistic analysis from what was essentially just
frequency counts from corpora, I found that statistical NLP was more fun than
formal semantics and slowly but surely started converting from theoretical to
computational linguistics.
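
To make the idea concrete, here is a minimal sketch in Python, purely for
illustration and not the actual taggers from those papers, of the simplest
frequency-based method: assign each word its most frequent tag in a tagged
corpus. The toy corpus is invented for the example.

    from collections import Counter, defaultdict

    # Toy tagged corpus; a real tagger would be trained on a large
    # annotated corpus and would also model tag context (e.g. an HMM).
    corpus = [
        ("the", "DET"), ("dog", "NOUN"), ("barks", "VERB"),
        ("the", "DET"), ("bark", "NOUN"), ("was", "VERB"), ("loud", "ADJ"),
        ("dogs", "NOUN"), ("bark", "VERB"),
        ("the", "DET"), ("bark", "NOUN"), ("peels", "VERB"),
    ]

    # Count how often each word form occurs with each tag.
    counts = defaultdict(Counter)
    for word, tag in corpus:
        counts[word][tag] += 1

    def most_frequent_tag(word):
        """Return the most frequent tag for a known word, NOUN otherwise."""
        if word in counts:
            return counts[word].most_common(1)[0][0]
        return "NOUN"  # crude fallback for unseen words

    print([(w, most_frequent_tag(w)) for w in ["the", "bark", "sofa"]])
    # -> [('the', 'DET'), ('bark', 'NOUN'), ('sofa', 'NOUN')]

The step from this baseline to the taggers of the 1990s is essentially to add
tag context, for example with a hidden Markov model.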

The following decade brought great changes for me both personally and
professionally. After switching gears and getting serious about computational
linguistics, I realized I needed to strengthen my computational competence and
decided to do a second PhD in computer science. In the process, I also moved
from the University of Gothenburg to Växjö University, a small, young
university in the south of Sweden, with more limited resources for research
but a lot of enthusiasm and pioneering spirit to make up for it. Looking for a
topic for my second PhD thesis, I stumbled on dependency parsing, which at the
time was a niche area with very little impact in mainstream computational
linguistics. As an illustration of this, when giving my first conference
presentation on dependency parsing in 2003, I had to devote almost half the
talk to explaining what dependency parsing was in the first place and
motivating why such a thing could be worth studying at all.

By another case of fortunate timing, however, I happened to be one of the
first researchers to approach dependency parsing using the new kind of
statistical methods, and together with colleagues like Yuji Matsumoto, Ryan
McDonald, Sabine Buchholz and Kenji Sagae, building on foundational work by
Jason Eisner and Mike Collins, among others, I was fortunate to become one of
the leaders in a new and fruitful line of research that has turned dependency
parsing into the dominant approach to syntactic analysis in NLP, especially
for languages other than English. A milestone year in this development was
2006, when Sabine Buchholz led the organization of the first CoNLL shared task
on multilingual dependency parsing and Sandra Kübler and I gave the first
dependency parsing tutorial at the joint ACL-COLING conference in Sydney.

The rapidly increasing popularity of dependency parsing was in my view due to
three main factors. First, dependency representations provide a more direct
representation of predicate-argument structure than other syntactic
representations, which makes them practically useful when building natural
language understanding applications. Second, thanks to their constrained
nature, these representations can be processed very efficiently, which
facilitates large-scale deployment. And finally, thanks to efforts like the
CoNLL shared tasks, multilingual data sets were made available, which together
with off-the-shelf systems like MSTParser (by Ryan McDonald) and MaltParser
(by my own group) facilitated parser development for many languages. Towards
the end of the decade, we also saw dependency parsing being integrated on a
large scale into real applications such as information extraction and machine
translation.
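
To illustrate the first of these factors, a dependency tree can be stored as
little more than a head index and a relation label per word, and
predicate-argument structure can then be read off directly. The Python
fragment below is an invented example, with relation labels in the style of
Universal Dependencies.

    # Dependency parse of "Sabine gave Joakim a book": one (form, head,
    # relation) triple per word, with head 0 denoting an artificial root.
    tree = [
        ("Sabine", 2, "nsubj"),  # subject of word 2 ("gave")
        ("gave",   0, "root"),
        ("Joakim", 2, "iobj"),   # indirect object of "gave"
        ("a",      5, "det"),    # determiner of word 5 ("book")
        ("book",   2, "obj"),    # direct object of "gave"
    ]

    def core_arguments(tree, pred_index):
        """Collect dependents of a predicate via core argument relations."""
        core = {"nsubj", "obj", "iobj"}
        return [(rel, form) for form, head, rel in tree
                if head == pred_index and rel in core]

    print(core_arguments(tree, 2))
    # -> [('nsubj', 'Sabine'), ('iobj', 'Joakim'), ('obj', 'book')]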

The third decade of my co-existence with the LINGUIST List started with the
biggest computational linguistics event in Sweden so far, the ACL conference
in Uppsala in 2010. Together with my colleagues at Uppsala University, where I
had moved to take up a professorship in computational linguistics, I was very
happy to receive computational linguists from all corners of the world during
a very hot week in July. The conference was considered huge at the time, with
almost 1000 participants, but would be considered small by today's standards
(with over 3000 participants in Florence last year), so I am really glad that
we took the opportunity while it was still possible to fit ACL into a small
university town like Uppsala.

My own research during the last decade has to a large extent been concerned
with trying to understand how we can build models that are better equipped to
deal with the structural variation found in the world's languages. In the case
of parsing, for example, it is easy to see that models developed for English,
a language characterized by relatively rigid word order constraints and
limited morphological inflection, often do not work as well when applied to
languages that exhibit different typological properties. However, it is much
harder to see what needs to be done to rectify the situation. A major obstacle
to progress in this area has been the lack of cross-linguistically consistent
morphosyntactic annotation of corpora, making it very hard to clearly
distinguish differences in language structure from more or less accidental
differences in annotation standards. This is why many of my colleagues and I
have devoted considerable time and effort to the initiative known as Universal
Dependencies (UD), whose goal is simply to create cross-linguistically
consistent morphosyntactic annotation for as many languages as possible.
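
For readers who have never seen a UD treebank, the data is distributed in the
plain-text CoNLL-U format, one line of ten tab-separated fields per word (ID,
FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The minimal reader
below is an illustrative sketch, not part of the official UD tooling.

    # A two-word CoNLL-U sentence (tabs written as \t) and a tiny reader.
    sample = (
        "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
        "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    )

    def read_tokens(conllu_text):
        """Yield (form, upos, head, deprel) for ordinary token lines."""
        for line in conllu_text.splitlines():
            if not line or line.startswith("#"):
                continue  # skip blank lines and sentence-level comments
            cols = line.split("\t")
            if "-" in cols[0] or "." in cols[0]:
                continue  # skip multiword-token ranges and empty nodes
            yield cols[1], cols[3], int(cols[6]), cols[7]

    for form, upos, head, deprel in read_tokens(sample):
        print(form, upos, head, deprel)
    # Dogs NOUN 2 nsubj
    # bark VERB 0 root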

Given that UD is an open community effort without dedicated funding, it has
been remarkably successful and has grown in only six years from ten treebanks
and a dozen researchers to 163 treebanks for 92 languages with contributions
from 370 researchers around the world. I am truly amazed and grateful for the
wonderful response from the community, and UD resources are now used not only
for NLP research but increasingly also in other areas of linguistics, notably
for empirical studies of word order typology. All members of the UD community
deserve recognition for their efforts, but I especially want to thank
Marie-Catherine de Marneffe, Chris Manning and Ryan McDonald, for being
instrumental in getting
the project off the ground, and Filip Ginter, Sampo Pyysalo and (above all)
Dan Zeman, for doing all the heavy lifting as our documentation and release
team.

But is there really a need for something like UD in computational linguistics
today? You may think that, after a few cases of good timing earlier in my
career, the decision to start the UD initiative in 2014 was, with hindsight, a
case of extremely bad timing. The field of computational linguistics, and
especially its more practical NLP side, has in recent years undergone a second
major transformation known as the deep learning revolution. This has meant a
switch from discrete symbolic representations to dense continuous
representations, learnt by deep neural networks trained holistically for
end-to-end tasks, in which the role of traditional linguistic representations
has been reduced to a minimum. In fact, it is very much an open question
whether
traditional linguistic processing tasks like part-of-speech tagging and
dependency parsing have any role to play in the NLP systems of the future.

Looking back at the three decades of the LINGUIST List, there is no question
that computational linguistics has gradually diverged from other branches of
linguistics both theoretically and methodologically. The statistical
revolution of the 1990s meant a shift from knowledge-driven to data-driven
methods, but theoretical models from linguistics such as formal grammars were
still often used as the backbone of the systems. The shift from generative to
discriminative statistical models during the next decade further emphasized
the role of features learned from data, and so-called grammarless parsers
became the norm, especially for dependency parsing, reducing the role of
traditional linguistics to corpus annotation and (sometimes) clever feature
engineering. During the last decade, the advent of deep learning has to a
large extent eliminated the need for feature engineering, in favor of
representation learning, and the emphasis on end-to-end learning has further
reduced the role of linguistic annotation.

Should we therefore conclude that there is no linguistics left in
computational linguistics? I think not. Paradoxically, as the importance of
explicit linguistic notions in NLP has decreased, the desire to know whether
NLP systems nevertheless learn linguistic notions seems to have increased.
There is a whole new subfield of computational linguistics often referred to
as interpretability studies, which is about understanding the inner workings
of the complex deep neural networks now used for language processing. And a
substantial part of this field is concerned with figuring out whether, say, a
deep neural language model like ELMo or BERT (to mention just two of the most
popular models on the market) implicitly learns linguistic notions like
part-of-speech categories, word senses or even syntactic dependency structure.
And resources like UD have turned out to be of key importance when trying to
probe these black-box models in search of linguistic structure (a small sketch
of such a probing experiment follows below). This opens up exciting new
possibilities for research, which can ultimately be expected to influence
other branches of linguistics as well. Exactly where this research will
take us is impossible to say today, but I for one am eagerly looking forward
to following the development over the coming decades in the company of the
LINGUIST List. If you are too, please consider contributing to the fund drive.
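
For the technically inclined, here is the promised sketch of a probing
experiment: freeze a pretrained BERT model, extract its token
representations, and train a simple linear classifier to predict UD
part-of-speech tags from them. The three toy sentences are placeholders;
real studies train and evaluate the probe on full UD treebanks with careful
controls.

    # Requires: torch, transformers, scikit-learn. Schematic sketch only.
    import torch
    from transformers import AutoModel, AutoTokenizer
    from sklearn.linear_model import LogisticRegression

    # Placeholder probing data: words paired with UD part-of-speech tags.
    sentences = [
        (["the", "dog", "barks"], ["DET", "NOUN", "VERB"]),
        (["a", "cat", "sleeps"], ["DET", "NOUN", "VERB"]),
        (["the", "bird", "sings"], ["DET", "NOUN", "VERB"]),
    ]

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")
    model.eval()  # the language model stays frozen; only the probe is trained

    features, labels = [], []
    for words, tags in sentences:
        enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]
        word_ids = enc.word_ids()
        for i, w in enumerate(word_ids):
            # keep the first subword of each word; skip [CLS] and [SEP]
            if w is not None and (i == 0 or word_ids[i - 1] != w):
                features.append(hidden[i].numpy())
                labels.append(tags[w])

    # The probe: a linear classifier over frozen representations. High
    # accuracy (on held-out data) suggests the tags are linearly
    # recoverable from the model's representations.
    probe = LogisticRegression(max_iter=1000).fit(features, labels)
    print("training accuracy:", probe.score(features, labels))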

--

Thanks for reading and if you want to donate to the LINGUIST List, you can do
so here: https://funddrive.linguistlist.org/donate/
All the best,
--the LL Team






------------------------------------------------------------------------------

***************************    LINGUIST List Support    ***************************
 The 2020 Fund Drive is under way! Please visit https://funddrive.linguistlist.org
  to find out how to donate and check how your university, country or discipline
     ranks in the fund drive challenges. Or go directly to the donation site:
                   https://crowdfunding.iu.edu/the-linguist-list

                        Let's make this a short fund drive!
                Please feel free to share the link to our campaign:
                    https://funddrive.linguistlist.org/donate/
 


----------------------------------------------------------
LINGUIST List: Vol-31-2931	
----------------------------------------------------------





