Relevance maximization for high-recall retrieval problem: finding all needles in a haystack

Song, Justin JongSu; Lee, Wookey

doi:10.1007/s11227-016-1956-8

Citations (Web of Science): 8

Citations (Scopus): 7

Abstract

The high-recall retrieval problem, which aims to find the full set of relevant documents in a huge result set using effective mining techniques, is particularly relevant to patent information retrieval, legal document retrieval, medical document retrieval, market information retrieval, and literature review. Existing high-recall retrieval methods, however, remain far from satisfactory at retrieving all relevant documents, because they must not only meet high recall and precision thresholds but also minimize the number of reviewed documents. To address this gap, we generalize the problem to a novel high-recall retrieval model, which can be represented as finding all needles in a giant haystack. To compute candidate groups consisting of k relevant documents efficiently, we propose dynamic diverse retrieval algorithms specialized for patent searching, with which effective dynamic interactive retrieval can be achieved. On various types of datasets, the dynamic ranking method shows considerable improvements in time and cost over conventional static ranking approaches.

Keywords

High-recall retrieval problem; Patent retrieval; Diversity retrieval; INDEPENDENCE; PARAMETERS; DOMINATION

Title: Relevance maximization for high-recall retrieval problem: finding all needles in a haystack

Authors: Song, Justin JongSu; Lee, Wookey

DOI: 10.1007/s11227-016-1956-8

Publication date: 2020-10

Type: Article

Journal: Journal of Supercomputing

Volume: 76

Issue: 10

Pages: 7734-7757