Technology Assisted Reviews: Finding the Last Few Relevant Documents by Asking Yes/No Questions to Reviewers
The goal of a technology-assisted review is to achieve high recall with low
human effort. Continuous active learning algorithms have demonstrated good
performance in locating the majority of relevant documents in a collection,
but their performance plateaus once 80%-90% of them have been
found. Finding the last few relevant documents typically requires exhaustively
reviewing the collection. In this paper, we propose a novel method to identify
these last few, but significant, documents efficiently. Our method makes the
hypothesis that entities carry vital information in documents, and that
reviewers can answer questions about the presence or absence of an entity in
the missing relevant documents. Based on this, we devise a sequential Bayesian
search method that selects the optimal sequence of questions to ask. The
experimental results show that our proposed method can greatly improve
performance while requiring less reviewing effort.
Comment: This paper is accepted by SIGIR 201
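The sequential Bayesian search idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual question-selection criterion, prior, or answer-noise model, so everything below is an assumption made for illustration: the function names (`pick_entity`, `update`, `search`), the simplification to a single missing document, the "split the probability mass in half" selection rule, and the symmetric `noise` parameter are all hypothetical.

```python
from typing import Callable, Dict, Optional, Set


def pick_entity(belief: Dict[str, float],
                doc_entities: Dict[str, Set[str]],
                asked: Set[str]) -> Optional[str]:
    """Pick the unasked entity whose 'yes' probability is closest to 0.5.

    Assumption: with reliable answers, an even split of the belief mass
    is the most informative yes/no question.
    """
    candidates = set().union(*doc_entities.values()) - asked
    if not candidates:
        return None

    def yes_mass(entity: str) -> float:
        return sum(p for d, p in belief.items() if entity in doc_entities[d])

    # Sort first so that ties are broken deterministically.
    return min(sorted(candidates), key=lambda e: abs(yes_mass(e) - 0.5))


def update(belief: Dict[str, float],
           doc_entities: Dict[str, Set[str]],
           entity: str,
           answer: bool,
           noise: float = 0.0) -> Dict[str, float]:
    """Bayes update of the belief after a yes/no answer about one entity.

    `noise` is an assumed probability that the reviewer answers wrongly.
    """
    posterior = {}
    for doc, prior in belief.items():
        consistent = (entity in doc_entities[doc]) == answer
        posterior[doc] = prior * ((1.0 - noise) if consistent else noise)
    total = sum(posterior.values())
    if total == 0.0:
        # The answer contradicts every candidate; keep the old belief.
        return dict(belief)
    return {doc: p / total for doc, p in posterior.items()}


def search(belief: Dict[str, float],
           doc_entities: Dict[str, Set[str]],
           answer_fn: Callable[[str], bool],
           max_questions: int = 10) -> Dict[str, float]:
    """Ask up to `max_questions` entity questions, updating the belief each time."""
    asked: Set[str] = set()
    for _ in range(max_questions):
        entity = pick_entity(belief, doc_entities, asked)
        if entity is None:
            break
        asked.add(entity)
        belief = update(belief, doc_entities, entity, answer_fn(entity))
    return belief
```

In use, `belief` would start as a prior over the unreviewed documents (for instance, uniform or derived from a ranker's scores), and `answer_fn` would stand in for the human reviewer answering whether an entity appears in the missing document.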