Anomalous and unreliable database behavior

Joel S. Berson Berson at ATT.NET
Sat Sep 11 17:39:16 UTC 2010

At 9/11/2010 12:30 PM, George Thompson wrote:
>So, by my calculation, there were 3 potential matches in each of the
>6 appearances of this notice: a total of 18; EAN only found 3 of
>them.  Similar experiences give me the notion that these databases
>are likely to find only about 1/4 of what they should be finding.
>Frequently, for instance, I will read a story in one of these
>databases and want to find related stories -- earlier or later
>stories, or other appearances of someone named in the story.  I
>search for the name, and whatever may come up does not include the
>story I had started from.

I find similar false negatives in EAN, although I have no data for
the percentage.  I too have found something reading a microfilm that
does not turn up in an EAN search.  Or items are missed with one set
of search terms but turn up with another set, even though they do
contain the terms tried in the first search.

The false negative phenomenon is of course the most unfortunate of
the various database problems.  It perhaps explains why EAN is so
generous with its search hits -- trying to minimize false negatives
-- but that has its own disadvantage: too many false positives.
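
To put a number on George's example, here is a rough sketch in Python
of the usual recall and precision arithmetic.  The counts of 18
potential matches and 3 found are his; the function names and the
figure of 30 returned hits are my own assumptions for illustration,
not anything EAN reports.

```python
from fractions import Fraction


def recall(found_relevant, total_relevant):
    """Share of the items that should have been found that were found."""
    return Fraction(found_relevant, total_relevant)


def precision(found_relevant, total_returned):
    """Share of the returned hits that were actually relevant."""
    return Fraction(found_relevant, total_returned)


# George's count: 6 appearances of the notice, 3 potential matches in each.
total_relevant = 6 * 3
found = 3

r = recall(found, total_relevant)
false_negatives = total_relevant - found

# Hypothetical: suppose the search had returned 30 hits in all.
hypothetical_hits = 30
p = precision(found, hypothetical_hits)

print("recall:", r)
print("false negatives:", false_negatives)
print("precision (hypothetical hit count):", p)
```

On this one example the recall works out to 1/6, somewhat worse than
the rough 1/4 George estimates from his experience overall.  A more
generous search would raise the recall but lower the precision.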

We can't win.


The American Dialect Society -