[Corpora-List] ANC, FROWN, Fuzzy Logic
Daoud Clarke
d.clarke at sussex.ac.uk
Mon Jul 24 14:05:15 UTC 2006
Hi,
I'm a DPhil student looking at some related stuff.
On 24 Jul 2006, at 02:24, John F. Sowa wrote:
> Linda, Rob, Chris, and Mark,
>
> I agree with Rob on following point:
>
> RF> As far as I know fuzzy logic is just a way of keeping
> > track of uncertain qualities, it does not explain the
> > underlying uncertainty.
As far as I understand it, fuzzy logic isn't about uncertainty in
qualities; it is about degrees of qualities, or vagueness. Consider the
set of tall people, for example. At what height do we say that someone
belongs to this set? Fuzzy set theory proposes that membership in some
sets should come in degrees, so if someone is neither tall nor short,
but somewhere in between, we assign that person a degree of membership
in the set of tall people, say 0.5. Note that there is no uncertainty
about the person's height: we know exactly how tall they are; we are
just not sure whether to call them tall or not.
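To make this concrete, here is a rough Python sketch of a graded
membership function for 'tall' (the thresholds are arbitrary choices of
mine, purely for illustration):

```python
def tall_membership(height_cm):
    """Degree of membership in the fuzzy set of tall people.

    A piecewise-linear sketch: 0 below 160 cm, 1 above 190 cm,
    linearly interpolated in between. The thresholds are arbitrary.
    """
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30

# Someone of 175 cm is neither tall nor short:
print(tall_membership(175))  # 0.5
```

There is no probability anywhere in this: the input height is known
exactly; only the applicability of the word 'tall' is graded.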
At first sight, the issue of representing colours would seem to be
perfect for fuzzy logic: since we can decompose colours (for example)
into the primaries red, green and blue, we can represent each possible
colour as partially belonging to the fuzzy sets of red, green and blue
colours. Then we can define ways to calculate, using fuzzy set
operations, to what degree something that is turquoise can be called
blue, for example. There are numerous problems with this, however. The
most obvious is that the behaviour of the representations changes if you
look at colours in a different way. For example, you could equally
classify colours in terms of their degree of membership in the fuzzy
sets of cyan, magenta, yellow and black 'colours'; in this case, fuzzy
intersection would move colours closer to white, whereas in the red,
green and blue decomposition, fuzzy intersection makes colours darker.
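A few lines of Python illustrate the point, using the standard
min-based fuzzy intersection (the colour choices are mine, just for
illustration):

```python
def fuzzy_intersect(a, b):
    """Standard fuzzy intersection: componentwise minimum of memberships."""
    return tuple(min(x, y) for x, y in zip(a, b))

def rgb_to_cmy(rgb):
    """Convert between RGB and CMY; the transform is its own inverse."""
    return tuple(1.0 - v for v in rgb)

red = (1.0, 0.0, 0.0)
yellow = (1.0, 1.0, 0.0)

# Intersecting in RGB pushes components down, towards black:
in_rgb = fuzzy_intersect(red, yellow)  # (1.0, 0.0, 0.0) -- red

# Intersecting the same colours in CMY, viewed back in RGB,
# pushes ink coverage down, towards white:
in_cmy = rgb_to_cmy(fuzzy_intersect(rgb_to_cmy(red), rgb_to_cmy(yellow)))
# (1.0, 1.0, 0.0) -- yellow: a different answer for the same operation
```

The same pair of colours gives different intersections depending on the
decomposition chosen, which is exactly the problem.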
It may be that you are interested in representing uncertainty. The
standard system for reasoning with uncertainty is Bayesian inference.
The idea is that the mathematics of probability is perfectly suited for
reasoning about uncertainty. For example, not everyone has the same
idea of what turquoise should look like, therefore when someone uses
the term 'turquoise' we are not sure exactly what colour she is
referring to. We could ask people to specify their idea of turquoise in
terms of its red, green and blue components, and then use their opinions
to estimate a probability distribution for the term 'turquoise' over
all the possible colours. (This would be a continuous function over the
three-dimensional vector space in the cube between the points (0,0,0)
and (1,1,1), with a dimension corresponding to each of red, green and
blue, such that integrating the function over this space would give 1.)
Repeating this for all the colour terms in the English language, we
could then estimate, for example, the probability that someone who used
the term 'blue' meant the same colour that another person would refer
to as 'turquoise'.
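A crude, discretised version of this estimate might look something like
the following (the 'survey data' is invented toy data, and a real
colour space would be binned far more finely):

```python
from collections import Counter

def distribution(samples):
    """Empirical probability distribution over (coarsely binned) RGB triples."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {colour: n / total for colour, n in counts.items()}

# Invented toy data: binned (r, g, b) triples speakers gave for each term.
blue_samples = [(0, 0, 2), (0, 0, 2), (0, 1, 2), (0, 1, 2), (0, 1, 2), (0, 2, 2)]
turquoise_samples = [(0, 2, 2), (0, 2, 2), (0, 1, 2), (0, 2, 1)]

p_blue = distribution(blue_samples)
p_turq = distribution(turquoise_samples)

# Probability that a colour intended by someone saying 'blue' is one
# that another speaker would have offered for 'turquoise':
overlap = sum(p_blue.get(c, 0.0) * p_turq.get(c, 0.0) for c in p_blue)
```

Here the uncertainty is genuinely probabilistic: it comes from
variation in what different speakers mean by a term, not from any
graded membership.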
Unfortunately I have no idea how this relates to vantage theory.
>
> I also agree that Greg Chaitin makes many good points, but
> the connection between those points and this discussion is
> not clear.
>
> RF> the solution is to understand language to be fundamentally
> > a corpus and not a logical system of rules and classes over
> > that corpus.
>
> The first half of that sentence doesn't say much, since Chomsky
> also claimed that language is a corpus, but one that is generated
> by rules. Saying that the corpus is not generated by rules might
> be a reasonable claim, but then it is necessary to answer Chris's
> questions:
>
> CB> how should we, as scientists, proceed in trying to derive
> > objective and generalizable knowledge about language from
> > corpora?
> >
> > once we have decided what to try and explain, what kind of
> > models we should use?
>
I think what the reference to Greg Chaitin's work was getting at is
perhaps the following. In practice we are always
faced with a finite corpus, whereas the theoretical corpora generated
by rules are infinite. We can view our finite corpus as a sample from
some hypothetical infinite corpus. The question is, what theory gives
us the best estimate of this infinite corpus, given the finite sample?
Using our finite corpus we can form theories about the infinite corpus,
which may or may not incorporate our linguistic knowledge of the
language in question. From an information theoretic perspective, the
best theory would be the one that enabled us to express the finite
corpus using the least amount of information -- the one that best
compressed the information in the corpus.
Of course, theories themselves can become large and unwieldy, so we may prefer the
minimum description length principle: the best theory for a sequence of
data is the one that minimises the size of the theory plus the size of
the data described using the theory.
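As a toy illustration of this trade-off (the numbers and the 8-bit
charge for the model parameter are my own invention; a real coding
scheme would state the model more carefully):

```python
import math

def data_bits(seq, model):
    """Ideal code length: bits needed to encode seq under a probability model."""
    return sum(-math.log2(model[s]) for s in seq)

seq = "aaaabaaaab" * 10  # 100 symbols, 80% 'a'

# Theory 1: uniform over {a, b}; the model costs essentially nothing to state.
uniform = {"a": 0.5, "b": 0.5}
mdl_uniform = 0 + data_bits(seq, uniform)  # 100.0 bits

# Theory 2: a fitted model; charge (say) 8 bits to state its parameter.
p_a = seq.count("a") / len(seq)
fitted = {"a": p_a, "b": 1.0 - p_a}
mdl_fitted = 8 + data_bits(seq, fitted)  # about 80 bits: the better theory
```

The fitted theory wins because the saving in data bits outweighs the
cost of stating the theory, which is the minimum description length
principle in miniature.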
Some of this has been put into practice by Bill Teahan, who applies
text compression techniques to NLP applications. It would be extremely
interesting, however, to see whether linguistic theories can help
provide better text compression. To my knowledge this has not been
investigated.
Daoud Clarke