[Corpora-List] CLAIRLIB Release

mtjoseph at umich.edu
Wed Oct 18 21:51:05 UTC 2006


                     Clairlib, The Clair Library

                           is now available

              http://tangra.si.umich.edu/clair/clairlib


INTRODUCTION

The University of Michigan's CLAIR (Computational Linguistics And 
Information Retrieval) group (http://tangra.si.umich.edu/clair) is 
happy to present the second release of Clairlib, the Clair Library.

The Clair library is written in Perl and is intended to simplify a 
number of generic tasks in Natural Language Processing (NLP), 
Information Retrieval (IR), and Lexical Network Analysis. Its 
architecture also allows for external software to be plugged in with 
very little effort.

Clairlib features a tiered architecture with a core shared by all 
applications and subject-specific libraries (currently in political 
science and bioinformatics).

FUNCTIONALITY

Native: Tokenization, Summarization, LexRank, Biased LexRank, Document 
Clustering, Document Indexing, PageRank, Biased PageRank, Web Graph 
Analysis, Bioinformatics Text Analysis, Political Science Text 
Analysis, Network Building, Power Law Distribution Analysis, Network 
Analysis and Computation (Watts-Strogatz Clustering Coefficient, 
Cosines, Random Walks), Tf, Idf

Imported: Stemming, Sentence Segmentation, Web Page Download, Web 
Crawling, XML Parsing, XML Tree Building, XML Writing

FUNDING

This work has been supported in part by grants R01 LM008106 
"Representing and Acquiring Knowledge of Genome Regulation" and U54 
DA021519 "National Center for Integrative Bioinformatics," both from 
the National Institutes of Health, as well as grants IDM 0329043 
"Probabilistic and link-based Methods for Exploiting Very Large Textual 
Repositories" and DHB 0527513 "The Dynamics of Political Representation 
and Political Rhetoric," both from the National Science Foundation.

ABOUT

The Clair Library is developed by the CLAIR group at the University of 
Michigan. It encompasses the functionality of MEAD and perltree, two 
of CLAIR's earlier releases.

Project design: Dragomir R. Radev

Main implementers: Anthony Fader, Mark Hodges, and Dragomir R. Radev

Additional code by: Timothy Allison, Michael Dagitses, Aaron Elkiss, 
Gunes Erkan, Scott Gifford, Mark Joseph, Samuela Pollack, and Adam 
Winkel


