[Corpora-List] ANC, FROWN, Fuzzy Logic
Mark P. Line
mark at polymathix.com
Sat Jul 22 16:17:21 UTC 2006
I guess I want to chime in here. I don't think language is fundamentally a
corpus, but neither do I think that language is a logical system of rules
and classes over that corpus.
I think language is the process which creates the corpus and which can be
characterized by scientists using a logical system of rules and classes
over that corpus. Scientists can also choose means other than rules and
classes to characterize the corpus, and they can even choose to
investigate the process that is language more directly (experimentally,
for instance).
How is the process learned?
Does the process work identically in all humans? If not, how can we
characterize the similarities and differences?
Has the process always been the same phylogenetically? If not, how can we
characterize its origin and evolution?
How does the corpus relate to our predictive models of the process?
How is the process implemented in wetware?
Etc.
Obviously, none of this is news. But since this thread had started talking
about what language fundamentally *is*, I thought it was important to
point out this third option rather than letting stand what I thought was
an unfortunate dichotomy.
-- Mark
P.S. My answer to the original question about fuzzy logic would be that if
you're characterizing the process using formal means (such as rules), then
there are all sorts of good reasons to make the formalism fuzzy. Although
"fuzzy logic" is the term we see in the newspapers when they're talking
about robotic Toyota factories, the fundamental innovation is actually
fuzzy set theory from which all the rest of it can be derived. So all it
really takes to get a fuzzy formalism for linguistic description is to use
fuzzy sets instead of Boolean sets. You do that by introducing an
uncertainty term into the set membership operator which represents degree
of membership -- NOT probability of membership. (Probability and degree
are two very different claims that are frequently confused by amateur
fuzzologists: 'XYZ is probably a noun in this context, but it might be a
verb' vs. 'XYZ is mostly a noun in this context, but it is also partly a
verb'.)
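
To make the degree-vs-probability distinction concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption: the `FuzzySet` class, the word categories, and the numeric degrees are all hypothetical, and min/max are just the standard Zadeh operators for intersection and union, one choice among several.

```python
from dataclasses import dataclass, field


@dataclass
class FuzzySet:
    """Hypothetical fuzzy set: maps each element to a degree in [0, 1]."""
    degrees: dict = field(default_factory=dict)

    def degree(self, x):
        # Elements not listed have degree 0.0 (not a member at all).
        return self.degrees.get(x, 0.0)

    def union(self, other):
        # Zadeh union: take the larger of the two degrees.
        keys = self.degrees.keys() | other.degrees.keys()
        return FuzzySet({k: max(self.degree(k), other.degree(k)) for k in keys})

    def intersection(self, other):
        # Zadeh intersection: take the smaller of the two degrees.
        keys = self.degrees.keys() | other.degrees.keys()
        return FuzzySet({k: min(self.degree(k), other.degree(k)) for k in keys})


# Invented degrees of membership, for illustration only.
noun = FuzzySet({"XYZ": 0.8, "table": 1.0})
verb = FuzzySet({"XYZ": 0.3, "run": 1.0})

# Degree reading: XYZ is mostly a noun AND partly a verb at the same time.
# The two degrees are independent claims and need not sum to 1.
noun_and_verb = noun.intersection(verb)
noun_or_verb = noun.union(verb)

# Probability reading, by contrast: XYZ is entirely one category or the
# other, we just don't know which, so the values must sum to 1.
p_category = {"noun": 0.8, "verb": 0.2}
```

With Boolean sets every degree would be exactly 0 or 1, and `min`/`max` reduce to ordinary intersection and union, which is the sense in which the fuzzy formalism generalizes the Boolean one.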
Mark P. Line
Polymathix
San Antonio, TX
Chris Brew wrote:
> Hi Linda and Rob,
>
> Rob's point about language being fundamentally a corpus is
> interesting. I think there are several things at issue here.
>
> - how should we, as scientists, proceed in trying to
> derive objective and generalizable knowledge about
> language from corpora? As Rob says, the debate
> includes serious discussion of whether the object
> of study is to be the corpus itself or some idealized
> extension of the same to include, for example, all the
> words that an educated speaker knows.
>
>
> - even if we reach consensus on how to do science over
> corpora (not that likely, but there is some measure
> of agreement), we might disagree on the extent to which it
> is a nice account of how language users and language
> learners act to treat them as "little scientists" who
> process input in much the same way as we do in our experiments.
>
> - once we have decided what to try and explain, what kind of
> models should we use? See for example
>
> Sarah Bunin Benor and Roger Levy. 2006. The Chicken or the Egg? A
> Probabilistic Analysis of English Binomials. Language 82(2):233-278
>
> which tries out various options for explaining the behaviour
> of expressions like "thunder and lightning".
>
>
> Chris
>
> On Sat, Jul 22, 2006 at 01:26:21PM +0800, Rob Freeman wrote:
>>
>> I think Chaitin's article provides a better idea of the underlying
>> problem with language (for which the solution is to understand
>> language to be fundamentally a corpus and not a logical system of
>> rules and classes over that corpus).