[Corpora-List] ANC, FROWN, Fuzzy Logic
Daoud Clarke
d.clarke at sussex.ac.uk
Mon Jul 24 14:05:15 UTC 2006
Hi,
I'm a DPhil student looking at some related stuff.
On 24 Jul 2006, at 02:24, John F. Sowa wrote:
> Linda, Rob, Chris, and Mark,
>
> I agree with Rob on following point:
>
> RF> As far as I know fuzzy logic is just a way of keeping
> > track of uncertain qualities, it does not explain the
> > underlying uncertainty.
As far as I understand it, fuzzy logic isn't about uncertainty in
qualities; it is about degrees of qualities, or vagueness. Consider the
set of tall people, for example. At what height do we say that someone
belongs to this set? Fuzzy set theory proposes that membership in some
sets should come in degrees, so if someone is neither tall nor short,
but somewhere in between, we assign that person a degree of membership
in the set of tall people, say 0.5. Note that there is no uncertainty
about the person's height: we know exactly how tall they are; we are
just not sure whether to call them tall or not.
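To make this concrete, here is a rough Python sketch of a graded
membership function for 'tall' (the thresholds are arbitrary choices of
mine, purely for illustration):

```python
def tall_membership(height_cm):
    """Degree of membership in the fuzzy set of tall people.

    A piecewise-linear sketch: 0 below 160 cm, 1 above 190 cm,
    linearly interpolated in between. The thresholds are arbitrary.
    """
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30

# Someone of 175 cm is neither tall nor short:
print(tall_membership(175))  # 0.5
```

There is no probability anywhere in this: the input height is known
exactly; only the applicability of the word 'tall' is graded.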
At first sight, the issue of representing colours would seem to be
perfect for fuzzy logic: since we can decompose colours (for example)
into the primaries red, green and blue, we can represent each possible
colour as partially belonging to the fuzzy sets of red, green and blue
colours. Then we can define ways to calculate, using fuzzy set
operations, to what degree something that is turquoise can be called
blue, for example. There are numerous problems with this, however. The
most obvious is that the behaviour of the representations changes if you
look at colours in a different way. For example, you could equally
classify colours in terms of their degree of membership in the fuzzy
sets of cyan, magenta, yellow and black 'colours'; in this case, fuzzy
intersection would move colours closer to white, whereas in the red,
green and blue decomposition, fuzzy intersection makes colours darker.
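A few lines of Python illustrate the point, using the standard
min-based fuzzy intersection (the colour choices are mine, just for
illustration):

```python
def fuzzy_intersect(a, b):
    """Standard fuzzy intersection: componentwise minimum of memberships."""
    return tuple(min(x, y) for x, y in zip(a, b))

def rgb_to_cmy(rgb):
    """Convert between RGB and CMY; the transform is its own inverse."""
    return tuple(1.0 - v for v in rgb)

red = (1.0, 0.0, 0.0)
yellow = (1.0, 1.0, 0.0)

# Intersecting in RGB pushes components down, towards black:
in_rgb = fuzzy_intersect(red, yellow)  # (1.0, 0.0, 0.0) -- red

# Intersecting the same colours in CMY, viewed back in RGB,
# pushes ink coverage down, towards white:
in_cmy = rgb_to_cmy(fuzzy_intersect(rgb_to_cmy(red), rgb_to_cmy(yellow)))
# (1.0, 1.0, 0.0) -- yellow: a different answer for the same operation
```

The same pair of colours gives different intersections depending on the
decomposition chosen, which is exactly the problem.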
It may be that you are interested in representing uncertainty. The
standard system for reasoning with uncertainty is Bayesian inference.
The idea is that the mathematics of probability is perfectly suited for
reasoning about uncertainty. For example, not everyone has the same
idea of what turquoise should look like, therefore when someone uses
the term 'turquoise' we are not sure exactly what colour she is
referring to. We could ask people to specify their idea of turquoise in
terms of its red, green and blue components, and then use their opinions
to estimate a probability distribution for the term 'turquoise' over
all the possible colours. (This would be a continuous function over the
three-dimensional vector space in the cube between the points (0,0,0)
and (1,1,1), with a dimension corresponding to each of red, green and
blue, such that integrating the function over this space would give 1.)
Repeating this for all the colour terms in the English language, we
could then estimate, for example, the probability that someone who used
the term 'blue' meant the same colour that another person would refer
to as 'turquoise'.
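A crude, discretised version of this estimate might look something like
the following (the 'survey data' is invented toy data, and a real
colour space would be binned far more finely):

```python
from collections import Counter

def distribution(samples):
    """Empirical probability distribution over (coarsely binned) RGB triples."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {colour: n / total for colour, n in counts.items()}

# Invented toy data: binned (r, g, b) triples speakers gave for each term.
blue_samples = [(0, 0, 2), (0, 0, 2), (0, 1, 2), (0, 1, 2), (0, 1, 2), (0, 2, 2)]
turquoise_samples = [(0, 2, 2), (0, 2, 2), (0, 1, 2), (0, 2, 1)]

p_blue = distribution(blue_samples)
p_turq = distribution(turquoise_samples)

# Probability that a colour intended by someone saying 'blue' is one
# that another speaker would have offered for 'turquoise':
overlap = sum(p_blue.get(c, 0.0) * p_turq.get(c, 0.0) for c in p_blue)
```

Here the uncertainty is genuinely probabilistic: it comes from
variation in what different speakers mean by a term, not from any
graded membership.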
Unfortunately I have no idea how this relates to vantage theory.
>
> I also agree that Greg Chaitin makes many good points, but
> the connection between those points and this discussion is
> not clear.
>
> RF> the solution is to understand language to be fundamentally
> > a corpus and not a logical system of rules and classes over
> > that corpus.
>
> The first half of that sentence doesn't say much, since Chomsky
> also claimed that language is a corpus, but one that is generated
> by rules. Saying that the corpus is not generated by rules might
> be a reasonable claim, but then it is necessary to answer Chris's
> questions:
>
> CB> how should we, as scientists, proceed in trying to derive
> > objective and generalizable knowledge about language from
> > corpora?
> >
> > once we have decided what to try and explain, what kind of
> > models we should use?
>
I think what the reference to Greg Chaitin's work was getting at is
perhaps the following. In practice we are always
faced with a finite corpus, whereas the theoretical corpora generated
by rules are infinite. We can view our finite corpus as a sample from
some hypothetical infinite corpus. The question is, what theory gives
us the best estimate of this infinite corpus, given the finite sample?
Using our finite corpus we can form theories about the infinite corpus,
which may or may not incorporate our linguistic knowledge of the
language in question. From an information theoretic perspective, the
best theory would be the one that enabled us to express the finite
corpus using the least amount of information -- the one that best
compressed the information in the corpus.
Of course, theories themselves can become large and unwieldy, so we may prefer the
minimum description length principle: the best theory for a sequence of
data is the one that minimises the size of the theory plus the size of
the data described using the theory.
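As a toy illustration of this trade-off (the numbers and the 8-bit
charge for the model parameter are my own invention; a real coding
scheme would state the model more carefully):

```python
import math

def data_bits(seq, model):
    """Ideal code length: bits needed to encode seq under a probability model."""
    return sum(-math.log2(model[s]) for s in seq)

seq = "aaaabaaaab" * 10  # 100 symbols, 80% 'a'

# Theory 1: uniform over {a, b}; the model costs essentially nothing to state.
uniform = {"a": 0.5, "b": 0.5}
mdl_uniform = 0 + data_bits(seq, uniform)  # 100.0 bits

# Theory 2: a fitted model; charge (say) 8 bits to state its parameter.
p_a = seq.count("a") / len(seq)
fitted = {"a": p_a, "b": 1.0 - p_a}
mdl_fitted = 8 + data_bits(seq, fitted)  # about 80 bits: the better theory
```

The fitted theory wins because the saving in data bits outweighs the
cost of stating the theory, which is the minimum description length
principle in miniature.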
Some of this has been put into practice by Bill Teahan, who applies
text compression techniques to NLP applications. It would be extremely
interesting, however, to see whether linguistic theories can help
provide better text compression. To my knowledge this has not been
investigated.
Daoud Clarke