Technology Assisted Reviews: Finding the Last Few Relevant Documents by Asking Yes/No Questions to Reviewers
The goal of a technology-assisted review is to achieve high recall with low
human effort. Continuous active learning algorithms have demonstrated good
performance in locating the majority of relevant documents in a collection;
however, their performance plateaus once 80%-90% of them have been found.
Finding the last few relevant documents typically requires exhaustively
reviewing the collection. In this paper, we propose a novel method to identify
these last few, but significant, documents efficiently. Our method rests on the
hypothesis that entities carry vital information in documents, and that
reviewers can answer questions about the presence or absence of an entity in
the missing relevant documents. Based on this, we devise a sequential Bayesian
search method that selects the optimal sequence of questions to ask. The
experimental results show that our proposed method can greatly improve
performance while requiring less reviewing effort.
Comment: This paper is accepted by SIGIR 201
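The question-selection idea above can be sketched as a "twenty questions" style search: maintain a belief over which document is the missing relevant one, and at each step ask about the entity whose yes/no answer yields the highest expected information gain. This is a hedged toy illustration of the general principle; the function names, the entity data, and the single-missing-document simplification are assumptions, not the paper's actual method.

```python
import math

def entropy(belief):
    """Shannon entropy of a belief distribution over documents."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def best_question(belief, entities_of):
    """Pick the entity whose presence/absence question maximises
    expected entropy reduction over the current belief."""
    prior_h = entropy(belief)
    best, best_gain = None, -1.0
    all_entities = {e for doc in belief for e in entities_of[doc]}
    for e in all_entities:
        # Probability the reviewer answers "yes" under the current belief.
        p_yes = sum(p for d, p in belief.items() if e in entities_of[d])
        if p_yes in (0.0, 1.0):
            continue  # answer already determined; zero information
        yes = {d: p / p_yes for d, p in belief.items() if e in entities_of[d]}
        no = {d: p / (1 - p_yes) for d, p in belief.items() if e not in entities_of[d]}
        gain = prior_h - (p_yes * entropy(yes) + (1 - p_yes) * entropy(no))
        if gain > best_gain:
            best, best_gain = e, gain
    return best

# Toy collection: four unreviewed candidates under a uniform prior.
entities_of = {
    "d1": {"NHS"},
    "d2": {"NHS"},
    "d3": {"NHS", "London"},
    "d4": {"London"},
}
belief = {d: 0.25 for d in entities_of}
q = best_question(belief, entities_of)
print(q)  # "London": a 50/50 split, worth a full bit of information
```

Asking about "London" halves the candidate set whatever the answer is, whereas "NHS" gives an uneven 3/1 split, so the greedy information-gain criterion prefers the former.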
Active Learning Strategies for Technology Assisted Sensitivity Review
Government documents must be reviewed to identify and protect any sensitive information, such as personal information, before the documents can be released to the public. However, in the era of digital government documents, such as e-mail, traditional sensitivity review procedures are no longer practical, for example due to the volume of documents to be reviewed. Therefore, there is a need for new technology-assisted review protocols that integrate automatic sensitivity classification into the sensitivity review process. Moreover, to effectively assist sensitivity review, such assistive technologies must incorporate reviewer feedback to enable sensitivity classifiers to quickly learn and adapt to the sensitivities within a collection, when the types of sensitivity are not known a priori. In this work, we present a thorough evaluation of active learning strategies for sensitivity review. In addition, we present an active learning strategy that integrates reviewer feedback, from sensitive text annotations, to identify features of sensitivity that enable us to learn an effective sensitivity classifier (0.7 Balanced Accuracy) using significantly less reviewer effort, according to the sign test (p < 0.01). This approach results in a 51% reduction in the number of documents that must be reviewed to achieve the same level of classification accuracy, compared to when the approach is deployed without annotation features.
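The core of the active learning loop evaluated above can be illustrated with uncertainty sampling: at each round, the document the current classifier is least certain about is sent to the reviewer, whose judgement is then folded back into training. The sketch below is a minimal assumed illustration; the scoring function and file names are placeholders, not the paper's sensitivity classifier.

```python
# Uncertainty sampling: pick the document whose predicted probability
# of being sensitive is closest to 0.5, i.e. where reviewer feedback
# is most informative for the classifier.

def select_for_review(unlabelled, predict_proba):
    """Return the unlabelled document with maximum model uncertainty."""
    return min(unlabelled, key=lambda doc: abs(predict_proba(doc) - 0.5))

# Toy scores standing in for a trained classifier's P(sensitive) output.
scores = {"memo.txt": 0.93, "email_1.txt": 0.48, "report.txt": 0.10}
pick = select_for_review(scores, scores.get)
print(pick)  # email_1.txt: closest to the 0.5 decision boundary
```

In a full loop, the reviewer's label for the selected document (plus any sensitive-text annotations, in the paper's setting) would be added to the training pool before the classifier is retrained and the next document is selected.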
Total Recall, Language Processing, and Software Engineering
A broad class of software engineering problems can be generalized as the
"total recall problem". This short paper claims that identifying and exploring
total recall language processing problems in software engineering is an
important task with wide applicability.
To make that case, we show that applying and adapting state-of-the-art
active learning and text mining, as solutions to the total recall problem, can
help solve two important software engineering tasks: (a) supporting large
literature reviews and (b) identifying software security vulnerabilities.
Furthermore, we conjecture that (c) test case prioritization and (d) static
warning identification can also be categorized as the total recall problem.
The widespread applicability of "total recall" to software engineering
suggests that there exists some underlying framework that encompasses not just
natural language processing, but a wide range of important software engineering
tasks.
Comment: 4 pages, 2 figures. Submitted to NL4SE@ESEC/FSE 201
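The total-recall workflow referred to above is typically realised as a continuous active learning loop: rank the unreviewed pool with the current relevance model, review the top-ranked item, and fold the judgement back in until the budget is exhausted. The sketch below is a hedged toy version under simplifying assumptions; the bag-of-words overlap scorer and the simulated reviewer are illustrative stand-ins, not any of the cited systems.

```python
def score(doc_words, relevant_vocab):
    """Toy relevance score: word overlap with relevant docs found so far."""
    return len(doc_words & relevant_vocab)

def cal_loop(pool, is_relevant, budget):
    """Continuous active learning: review up to `budget` documents,
    always picking the currently top-scored one."""
    relevant_vocab, found = set(), []
    unreviewed = dict(pool)
    for _ in range(min(budget, len(pool))):
        # Rank the pool by the current model; ties broken by name order.
        doc = max(sorted(unreviewed),
                  key=lambda d: score(unreviewed[d], relevant_vocab))
        words = unreviewed.pop(doc)
        if is_relevant(doc):          # simulated reviewer judgement
            found.append(doc)
            relevant_vocab |= words   # "retrain" on the new label
    return found

# Toy pool of three papers, two of which are actually relevant.
pool = {
    "p1": {"fuzzing", "security", "vulnerability"},
    "p2": {"testing", "coverage"},
    "p3": {"security", "exploit"},
}
found = cal_loop(pool, lambda d: d in {"p1", "p3"}, budget=3)
print(found)  # ["p1", "p3"]: the model surfaces p3 before p2
```

After the first relevant document is found, the overlap model ranks the vocabulary-sharing "p3" above the unrelated "p2", which is the mechanism that lets such loops find most relevant documents with a fraction of the reviewing effort.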