[Corpora-List] Finding representative terms
Chris Jordan
cjordan at cs.dal.ca
Mon Dec 26 17:09:29 UTC 2005
Unfortunately, there is no such thing as an ideal term discrimination
function. However, I would recommend trying something like relative
entropy (Kullback-Leibler divergence), which is what I used in my thesis
work on automatically manufacturing queries. Cai et al. also used relative
entropy and other divergence functions for query expansion.
@inproceedings{Cai_query_expansion,
author = {D. Cai and C. J. van Rijsbergen and J. M. Jose},
title = {Automatic query expansion based on divergence},
booktitle = {CIKM '01: Proceedings of the Tenth International Conference on Information and Knowledge Management},
year = {2001},
isbn = {1-58113-436-3},
pages = {419--426},
location = {Atlanta, Georgia, USA},
doi = {http://doi.acm.org/10.1145/502585.502656},
publisher = {ACM Press},
}
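
To make the relative entropy suggestion concrete, here is a minimal
sketch in Python. It is not the method from my thesis or from Cai et al.;
it only illustrates the general idea of scoring each term by its
contribution to the divergence between the document's term distribution
and a background distribution. The function names, the whitespace
tokenization, and the add-one smoothing are all assumptions made for
illustration. Note that it needs some background text to compare against,
even when you only care about one document.

```python
import math
from collections import Counter


def relative_entropy_scores(doc_tokens, background_tokens):
    """Score each document term by its contribution to KL(doc || background).

    Hypothetical helper for illustration: p is the term's relative
    frequency in the document, q its add-one-smoothed relative frequency
    in the background text, and the score is p * log(p / q).
    """
    doc_counts = Counter(doc_tokens)
    bg_counts = Counter(background_tokens)
    doc_total = sum(doc_counts.values())

    # Assumed smoothing: add one to every background count over the joint
    # vocabulary, so a term unseen in the background never gets q = 0.
    vocab = set(doc_counts) | set(bg_counts)
    bg_total = sum(bg_counts.values()) + len(vocab)

    scores = {}
    for term, count in doc_counts.items():
        p = count / doc_total
        q = (bg_counts[term] + 1) / bg_total
        scores[term] = p * math.log(p / q)
    return scores


def top_terms(doc_tokens, background_tokens, k=5):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    scores = relative_entropy_scores(doc_tokens, background_tokens)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    doc = "the query expansion the query divergence".split()
    background = "the the the the a of the cat sat on the mat".split()
    print(top_terms(doc, background, k=3))
```

Terms that are frequent in the document but rare in the background get
high scores, while terms that are about equally common in both (such as
"the" in the toy data) score close to zero.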
Delip Rao wrote:
>Hi,
>
>Is there any work that tries to find the most
>important/representative words from a document? I have
>tried using IDF but results were very poor. Also IDF
>does not make sense if we have a single document and
>want to get the most important term(s) out of it.
>
>Thanks!
>Delip