Databases: question about searching multiple full-text databases simultaneously

Garson O'Toole adsgarsonotoole at GMAIL.COM
Wed Jan 18 17:07:06 UTC 2012


On January 5th Bill Mullins posted a message about his ongoing efforts
to compile and share a comprehensive list of full-text databases.
Bill's work keeping track of these proliferating databases is
marvelous. The organizations and individuals creating these
repositories (which are often free to access) deserve high praise.

The value of these archives would be enormously enhanced if they could
be searched simultaneously through a single interface. Are there any
projects with the goal of joint searchability for these small
databases? The unproductive duplication of effort across separate
projects slows progress. The goals of a joint searchability project
would include:

1) Algorithmic support for a flexible and expressive query language
with wildcards.
2) High-quality optical character recognition and segmentation of text fields.
3) Standardized scanning methods and strategies to assure quality.
4) High-quality open-source and/or free software shared between
multiple organizations.

Using some of these databases is an exercise in frustration. Yet even
the most difficult-to-use databases reflect a substantial and
praiseworthy effort to share information. Joint searchability would
unlock practical access to important resources.

Garson

On Thu, Jan 5, 2012 at 5:48 PM, Mullins, Bill AMRDEC
<Bill.Mullins at us.army.mil> wrote:
> ---------------------- Information from the mail header -----------------------
> Sender:       American Dialect Society <ADS-L at LISTSERV.UGA.EDU>
> Poster:       "Mullins, Bill AMRDEC" <Bill.Mullins at US.ARMY.MIL>
> Subject:      full text databases (UNCLASSIFIED)
> -------------------------------------------------------------------------------
>
> Classification: UNCLASSIFIED
> Caveats: NONE
>
> Starting several years ago, Mark Mandel has hosted a web page listing
> full-text databases that I put together (thanks, Mark!).
>
> I finally got around to re-compiling it, and am using Google Sites to
> host it.
>
> This group may find it useful.  I'd appreciate any feedback offered,
> comments, additions, suggestions, etc.  It still needs a little
> tweaking, but is at the 90% level of completion, I'd guess.
>
> It is mostly the old list, but with additions.  I've found many more
> student newspapers, for example.
>
> https://sites.google.com/site/fulltextdatabases/
>
>
> Let me know . . .
>
> Bill
>
> ------------------------------------------------------------
> The American Dialect Society - http://www.americandialect.org
