18.1412, Diss: Computational Ling/Lang Acquisition: Pearl: 'Necessary Bias i...'

Wed May 9 15:34:51 UTC 2007

LINGUIST List: Vol-18-1412. Wed May 09 2007. ISSN: 1068 - 4875.

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
Reviews: Laura Welcher, Rosetta Project  
       <reviews at linguistlist.org> 

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Hunter Lockwood <hunter at linguistlist.org>

Date: 09-May-2007
From: Lisa Pearl < llsp at umd.edu >
Subject: Necessary Bias in Natural Language Learning


-------------------------Message 1 ---------------------------------- 
Date: Wed, 09 May 2007 11:32:09
From: Lisa Pearl < llsp at umd.edu >
Subject: Necessary Bias in Natural Language Learning 

Institution: University of Maryland 
Program: Department of Linguistics 
Dissertation Status: Completed 
Degree Date: 2007 

Author: Lisa Pearl

Dissertation Title: Necessary Bias in Natural Language Learning 

Dissertation URL:  http://www.ling.umd.edu/llsp/papers/PearlThesis.pdf

Linguistic Field(s): Computational Linguistics
                     Language Acquisition

Dissertation Director(s):
William James Idsardi
Jeffrey L. Lidz
Amy Weinberg
Charles Yang

Dissertation Abstract:

This dissertation investigates the mechanism of language acquisition given
the boundary conditions provided by linguistic representation and the time
course of acquisition. Exploration of the mechanism is vital once we
consider the complexity of the system to be learned and the non-transparent
relationship between the observable data and the underlying system. It is
not enough to restrict the potential systems the learner could acquire,
which can be done by defining a finite set of parameters the learner must
set. Even supposing that the system is defined by n binary parameters, we
must still explain how the learner converges on the correct system(s) out
of the possible 2^n systems, using data that is often highly ambiguous and
exception-filled. The main discovery from the case studies presented here
is that learners can in fact succeed, provided they are biased to use only a
subset of the available input that is perceived as a cleaner representation
of the underlying system.

The case studies are embedded in a framework that conceptualizes language
learning as three separable components, assuming that learning is the
process of selecting the best-fit option given the available data. These
components are (1) a defined hypothesis space, (2) a definition of the data
used for learning (data intake), and (3) an algorithm that updates the
learner's belief in the available hypotheses, based on data intake. One
benefit of this framework is that components can be investigated
individually. Moreover, defining the learning components in this somewhat
abstract manner allows us to apply the framework to a range of language
learning problems and linguistic domains. In addition, we can combine
discrete linguistic representations with probabilistic methods and so
account for the gradualness and variation in learning that human children
display.

The tool of exploration for these case studies is computational modeling,
which proves very useful in addressing the feasibility, sufficiency, and
necessity of data intake filtering, since these questions would be very
difficult to address with traditional experimental techniques. In addition,
the results of computational modeling can generate predictions that can
then be tested experimentally. 
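
The three-component framework described in the abstract (hypothesis space,
data intake, update algorithm) can be illustrated with a minimal sketch. This
is not the dissertation's model: the function names, the toy data format, the
"unambiguous" flag, the noise value, and the choice of a simple Bayesian
update are all assumptions made for illustration.

```python
from itertools import product


def hypothesis_space(n):
    """Component 1: all 2**n settings of n binary parameters."""
    return list(product((0, 1), repeat=n))


def data_intake(data):
    """Component 2: keep only data the learner treats as unambiguous.

    The 'unambiguous' flag is a hypothetical stand-in for whatever
    filtering bias the learner actually applies.
    """
    return [d for d in data if d["unambiguous"]]


def update(beliefs, datum, noise=0.1):
    """Component 3: Bayesian update of belief in each hypothesis.

    A hypothesis that agrees with the datum on the signaled parameter
    gets likelihood 1 - noise; one that disagrees gets likelihood noise.
    """
    i, value = datum["param"], datum["value"]
    scored = {
        h: p * ((1 - noise) if h[i] == value else noise)
        for h, p in beliefs.items()
    }
    total = sum(scored.values())
    return {h: p / total for h, p in scored.items()}


def learn(n, data):
    """Start from a uniform prior and update on the filtered intake."""
    space = hypothesis_space(n)
    beliefs = {h: 1 / len(space) for h in space}
    for datum in data_intake(data):
        beliefs = update(beliefs, datum)
    return beliefs


# Toy input: each datum signals a value for one parameter (assumed format).
toy_data = [
    {"param": 0, "value": 1, "unambiguous": True},
    {"param": 1, "value": 0, "unambiguous": True},
    {"param": 0, "value": 0, "unambiguous": False},  # dropped by the filter
    {"param": 0, "value": 1, "unambiguous": True},
]

beliefs = learn(2, toy_data)
best = max(beliefs, key=beliefs.get)
print(best, round(beliefs[best], 3))
```

Because each component is a separate function, one can swap in a different
filter or update rule and compare outcomes, which mirrors the abstract's
point that the components can be investigated individually.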
