31.1535, FYI: Data-driven Language and Dialect Mapping

The LINGUIST List linguist at listserv.linguistlist.org
Wed May 6 17:37:11 UTC 2020


LINGUIST List: Vol-31-1535. Wed May 06 2020. ISSN: 1069 - 4875.

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Lauren Perkins, Nils Hjortnaes, Yiwen Zhang, Joshua Sims
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Sarah Robinson <srobinson at linguistlist.org>
================================================================


Date: Wed, 06 May 2020 13:37:02
From: Jonathan Dunn [jonathan.dunn at canterbury.ac.nz]
Subject: Data-driven Language and Dialect Mapping

 
A fully data-driven language mapping project is now available online at
https://www.earthlings.io. The project maps the global distribution of 464
languages and includes dialect information for seven of them (English,
French, Spanish, Portuguese, German, Russian, and Arabic).

These language maps are open-source (https://github.com/jonathandunn). The
underlying data comes from the Corpus of Global Language Use, which comprises
approximately 430 billion words from the web and social media. The core
linguistic modelling uses language identification software (idNet) and
software for extracting syntactic features (c2xg). The project also makes
available population-balanced gigaword corpora for 50 languages.
 



Linguistic Field(s): Computational Linguistics
                     Sociolinguistics
                     Text/Corpus Linguistics





 


