Ranking of Clustering using Information Theory and Mutual Information

Jung Hyun Lee

Abstract

Current information retrieval systems that use the inverted index file technique reflect only whether the keywords of the query occur in a document. Such systems do not utilize context information and therefore cannot overcome the resulting limit on accuracy. Users of an information retrieval system want high precision, with the documents they are looking for ranked at the top of the list, rather than a high recall ratio. In this paper, we propose a document clustering method that uses entropy values extracted from the user query and the user profile. The user profile used here is a list of recently used words, ranked by their previous occurrence ratio. For the entropy calculation, we coupled the probability vector information with the mutual information; the calculation also draws on Bayesian learning and the inverse document frequency. In our experiment, KT set95 was used as the corpus for mutual information extraction. The experimental results showed that this system achieves a 13% increase in precision without degrading the recall ratio of the previous system.
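The abstract names three quantities (entropy, mutual information, and inverse document frequency) without giving their formulas. As a rough illustration only, the sketch below uses the standard textbook definitions on a tiny hypothetical corpus; the function names, the toy documents, and the choice of document-level co-occurrence are all assumptions, not the paper's actual method or data.

```python
import math

def idf(term, docs):
    """Inverse document frequency, log(N / df); 0.0 if the term never occurs."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df) if df else 0.0

def pmi(x, y, docs):
    """Pointwise mutual information (bits) from document-level co-occurrence."""
    n = len(docs)
    px = sum(1 for d in docs if x in d) / n
    py = sum(1 for d in docs if y in d) / n
    pxy = sum(1 for d in docs if x in d and y in d) / n
    # Words that never co-occur get -inf by convention in this sketch.
    return math.log2(pxy / (px * py)) if pxy > 0 else float("-inf")

def entropy(probs):
    """Shannon entropy (bits) of a probability vector; zero entries are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical toy corpus: each document is represented as a set of words.
docs = [
    {"information", "retrieval", "index"},
    {"information", "retrieval", "query"},
    {"cluster", "entropy"},
    {"query", "cluster"},
]

print(pmi("information", "retrieval", docs))  # words that always co-occur
print(idf("entropy", docs))                   # a rare word gets a high IDF
print(entropy([0.5, 0.5]))                    # uniform two-way distribution
```

In a ranking setting, scores like these could be combined per document, for example by weighting query and profile words by IDF and by their mutual information with other words in the document; how the paper combines them is not stated in the abstract.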

Title: Ranking of Clustering using Information Theory and Mutual Information

Author: Jung Hyun Lee

Conference: Proceeding of the ITC-CSCC'99