29.4869, Support: English; Computational Linguistics,Text/Corpus Linguistics: PhD, University of Birmingham & Alan Turing Institute

The LINGUIST List linguist at listserv.linguistlist.org
Thu Dec 6 19:53:40 UTC 2018


LINGUIST List: Vol-29-4869. Thu Dec 06 2018. ISSN: 1069 - 4875.

Subject: 29.4869, Support: English; Computational Linguistics,Text/Corpus Linguistics: PhD, University of Birmingham & Alan Turing Institute

Moderator: linguist at linguistlist.org (Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Helen Aristar-Dry, Robert Coté)
Homepage: https://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Yiwen Zhang <yiwen at linguistlist.org>
================================================================


Date: Thu, 06 Dec 2018 14:53:17
From: Jack Grieve [j.grieve at bham.ac.uk]
Subject: English; Computational Linguistics,Text/Corpus Linguistics: PhD, University of Birmingham & Alan Turing Institute, United Kingdom

Institution/Organization: University of Birmingham & Alan Turing Institute
Department:  
Web Address: http://www.birmingham.ac.uk 

Level: PhD 

Duties: Research
 
Specialty Areas: Computational Linguistics; Text/Corpus Linguistics 
 
Required Language(s): English (eng)

Description:

Web Archives and Cities: Mining the Web to Learn our Cities
Alan Turing Institute Doctoral Scholarship
Supervisors: Emmanouil Tranos (University of Birmingham & Alan Turing
Institute) & Jack Grieve (University of Birmingham)

This project will use an innovative data source: billions of web pages under
the .uk domain archived during the period 1996-2013. It will exploit the
unstructured textual data contained in these web pages in order to understand
the changes that cities in the UK have undergone. An essential element of this
process will be the geolocation of these data. Specifically, this project will
address the following key research questions:
- How are the dynamics of the UK urban system reflected in online content?
- Can we detect or even predict the dynamics of the inner structures of cities
in the UK by mining online content?
- Can we understand urban functions and create urban typologies by using
online content? How does such a ‘digital’ understanding of cities compare with
our long-standing understanding based on traditional data sources?

This project will use data from sources including, but not limited to, the
Internet Archive, the most complete archive of web pages (Holzmann et al.,
2016; Ainsworth et al., 2011). It will employ the JISC UK Web Domain Dataset,
a subset of the Internet Archive curated by the British Library. These data
contain billions of web addresses of web pages within the .uk domain that were
archived by the Internet Archive during the period 1996-2013, together with
their archiving timestamps. The British Library has also generated a subset of
this dataset, called Geoindex, which contains circa 2.5 billion web addresses
of archived web pages that include at least one UK postcode.

These unstructured textual data will be interrogated using corpus analytics in
order to derive meanings, themes and classifications. The student will have
the opportunity to approach the above questions from specific thematic
viewpoints, including, but not limited to, land values, tourism and local
governance. Topic modelling and similar methods will be applied first to small
samples of the corpora and then scaled up. These methods will be coupled with
statistical modelling and spatial analysis in order to understand the
spatiality of these processes.
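
As a rough illustration of the geolocation step mentioned above, the sketch
below pulls UK postcodes out of page text and counts postcode areas per
archive year. It is not the project's or the British Library's actual
pipeline. The postcode pattern is a simplified assumption that matches the
common shape without validating real codes, and the record layout, sample
texts and function names are hypothetical.

```python
import re
from collections import Counter

# Simplified UK postcode pattern (an assumption for illustration): it accepts
# the common outward + inward shape but does not validate against real codes.
POSTCODE_RE = re.compile(r"\b([A-Z]{1,2})(\d[A-Z\d]?)\s*(\d[A-Z]{2})\b")


def extract_postcodes(text):
    """Return normalised postcodes (e.g. 'B15 2TT') found in the text."""
    found = []
    for area, district, inward in POSTCODE_RE.findall(text.upper()):
        found.append(f"{area}{district} {inward}")
    return found


def count_areas_by_year(records):
    """Count pages per (year, postcode area).

    `records` is a hypothetical iterable of (timestamp, page_text) pairs,
    where the timestamp is a 14-digit string such as '20050612093000'.
    """
    counts = Counter()
    for timestamp, text in records:
        year = int(timestamp[:4])
        # Count each postcode area at most once per page.
        areas = {
            re.match(r"[A-Z]+", postcode).group()
            for postcode in extract_postcodes(text)
        }
        for area in areas:
            counts[(year, area)] += 1
    return counts


# Made-up sample records, for illustration only.
sample = [
    ("20050612093000", "Visit us at Edgbaston, Birmingham b15 2tt."),
    ("20050701120000", "Offices in Birmingham B1 1AA and London NW1 2DB."),
    ("20100103080000", "No address on this page."),
]
area_counts = count_areas_by_year(sample)
```

Counting each area once per page keeps a single page that repeats its address
from dominating the totals. A real analysis would presumably map postcodes to
coordinates or administrative units before any spatial modelling.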

The successful applicant will have:
- A relevant social science background in either geography/planning/urban
studies or linguistics; alternatively, a computer science background and a
willingness to engage with the above disciplines.
- A strong computational background, including experience in R or Python.
- Good statistical knowledge.
- Preferably, experience in Natural Language Processing and Machine Learning.

Funding Notes:
To support students, the Turing offers a generous tax-free stipend of £20,500
per annum, a travel allowance and conference fund, and tuition fees for a
period of 3.5 years. The studentship is open to UK/EU students only.
The Turing doctoral studentship scheme combines the strengths and expertise of
world-class universities with the Turing’s unique position as the UK’s
national institute for data science and artificial intelligence, to offer an
exceptional PhD programme.
Turing doctoral students spend approximately half of their time based at the
Institute headquarters at the British Library in London. They will apply and
register for their doctorate at the University of Birmingham, where they will
spend the remainder of their time.

For more information please see
https://www.birmingham.ac.uk/postgraduate/pgr/alan-turing-phds.aspx
 

Application Deadline: 22-Jan-2019 

Web Address for Applications: https://www.birmingham.ac.uk/postgraduate/pgr/alan-turing-phds.aspx 

Contact Information: 
	Jack Grieve 
	j.grieve at bham.ac.uk  



