Technology Assisted Reviews: Finding the Last Few Relevant Documents by Asking Yes/No Questions to Reviewers
The goal of a technology-assisted review is to achieve high recall with low
human effort. Continuous active learning algorithms have demonstrated good
performance in locating the majority of relevant documents in a collection;
however, their performance plateaus once 80%-90% of them have been found.
Finding the last few relevant documents typically requires exhaustively
reviewing the collection. In this paper, we propose a novel method to identify
these last few, but significant, documents efficiently. Our method makes the
hypothesis that entities carry vital information in documents, and that
reviewers can answer questions about the presence or absence of an entity in
the missing relevant documents. Based on this, we devise a sequential Bayesian
search method that selects the optimal sequence of questions to ask. The
experimental results show that our proposed method can greatly improve
performance while requiring less reviewing effort.

Comment: This paper is accepted by SIGIR 201
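As a rough illustration of the idea behind asking yes/no questions about entities, the sketch below greedily picks the entity question whose answer is expected to most reduce uncertainty over which unreviewed documents are relevant. This is a hypothetical toy, not the paper's actual sequential Bayesian method; the document sets, priors, and the simple entropy-based scoring are all assumptions made for illustration.

```python
import math

def entropy(p):
    """Binary entropy of a probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def expected_entropy_after(question_entity, docs, p_rel):
    """Expected remaining entropy over documents after asking about one entity.

    docs: dict doc_id -> set of entities appearing in the document
    p_rel: dict doc_id -> prior probability that the document is relevant
    """
    # Approximate the chance of a "yes" answer as the relevance-weighted
    # share of candidate documents that contain the entity.
    total = sum(p_rel.values()) or 1.0
    p_yes = sum(p for d, p in p_rel.items() if question_entity in docs[d]) / total
    p_no = 1.0 - p_yes
    # After a "yes", only documents containing the entity stay candidates
    # (and vice versa after a "no"); sum the entropy of the surviving side.
    h_yes = sum(entropy(p) for d, p in p_rel.items() if question_entity in docs[d])
    h_no = sum(entropy(p) for d, p in p_rel.items() if question_entity not in docs[d])
    return p_yes * h_yes + p_no * h_no

def best_question(entities, docs, p_rel):
    """Pick the entity whose answer minimises expected remaining entropy."""
    return min(entities, key=lambda e: expected_entropy_after(e, docs, p_rel))

# Toy data: three unreviewed documents with entity sets and relevance priors.
docs = {"d1": {"acme", "merger"}, "d2": {"acme"}, "d3": {"audit"}}
p_rel = {"d1": 0.6, "d2": 0.5, "d3": 0.4}
q = best_question({"acme", "merger", "audit"}, docs, p_rel)  # -> "merger"
```

Here "merger" wins because it splits the candidate pool most informatively: a yes/no answer about it separates the highest-prior document from the rest.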
Inconsistent Responsiveness Determination in Document Review: Difference of Opinion or Human Error?
This Article analyzes the inconsistency between different document review efforts on the same document collection to determine whether that inconsistency is due primarily to ambiguity in applying the definition of responsiveness to particular documents, or due primarily to human error. By examining documents from the TREC 2009 Legal Track, the Authors show that inconsistent assessments regarding the same documents are due in large part to human error. Therefore, the quality of a review effort is not simply a matter of opinion; it is possible to show objectively that some reviews, and some review methods, are better than others.
Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review
E-discovery processes that use automated tools to prioritize and select documents for review are typically regarded as potential cost-savers, but inferior alternatives, to exhaustive manual review, in which a cadre of reviewers assesses every document for responsiveness to a production request, and for privilege. This Article offers evidence that such technology-assisted processes, while indeed more efficient, can also yield results superior to those of exhaustive manual review, as measured by recall and precision, as well as F1, a summary measure combining both recall and precision. The evidence derives from an analysis of data collected from the TREC 2009 Legal Track Interactive Task, and shows that, at TREC 2009, technology-assisted review processes enabled two participating teams to achieve results superior to those that could have been achieved through a manual review of the entire document collection by the official TREC assessors.
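The recall, precision, and F1 measures used to compare review efforts are standard and easy to state concretely. The helper below computes them from raw counts; the example counts are invented for illustration and are not drawn from the TREC 2009 data.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 (their harmonic mean) from counts.

    tp: responsive documents correctly produced
    fp: non-responsive documents produced in error
    fn: responsive documents missed by the review
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: a review produces 80 truly responsive documents, 20 false
# positives, and misses 20 responsive documents.
p, r, f = precision_recall_f1(tp=80, fp=20, fn=20)  # -> (0.8, 0.8, 0.8)
```

Because F1 is a harmonic mean, a review cannot score well on it by inflating recall at the expense of precision or vice versa, which is why it is used as the summary measure here.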
Total Recall, Language Processing, and Software Engineering
A broad class of software engineering problems can be generalized as the
"total recall problem". This short paper claims that identifying and exploring
total recall language processing problems in software engineering is an
important task with wide applicability.
To make that case, we show that applying and adapting state-of-the-art
active learning and text mining solutions to the total recall problem can
help solve two important software engineering tasks: (a) supporting large
literature reviews and (b) identifying software security vulnerabilities.
Furthermore, we conjecture that (c) test case prioritization and (d) static
warning identification can also be categorized as instances of the total recall problem.
The widespread applicability of "total recall" to software engineering
suggests that there exists some underlying framework that encompasses not just
natural language processing, but a wide range of important software engineering
tasks.

Comment: 4 pages, 2 figures. Submitted to NL4SE@ESEC/FSE 201
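The core of the active-learning approach to total recall problems can be sketched as a simple relevance-feedback loop: score unreviewed items against what has already been judged relevant, ask the reviewer about the top-scored item, and fold the judgment back into the model. The word-overlap scorer, the toy corpus, and the oracle set below are assumptions for illustration only, standing in for the real learned classifiers these systems use.

```python
from collections import Counter

def score(doc_words, profile):
    """Overlap between a document's words and the relevant-word profile."""
    return sum(profile[w] for w in doc_words)

def cal_loop(corpus, seed_relevant, oracle, budget):
    """Minimal continuous-active-learning-style loop.

    corpus: dict doc_id -> list of words
    seed_relevant: initially known relevant doc ids
    oracle: set of truly relevant doc ids (stands in for the human reviewer)
    budget: number of review decisions we can afford
    """
    profile = Counter()
    for d in seed_relevant:
        profile.update(corpus[d])
    reviewed = set(seed_relevant)
    found = set(seed_relevant) & oracle
    for _ in range(budget):
        candidates = [d for d in corpus if d not in reviewed]
        if not candidates:
            break
        best = max(candidates, key=lambda d: score(corpus[d], profile))
        reviewed.add(best)
        if best in oracle:                 # reviewer judges it relevant
            found.add(best)
            profile.update(corpus[best])   # relevance feedback: update model
    return found, reviewed

corpus = {
    "d1": ["merger", "acme"],
    "d2": ["acme", "deal"],
    "d3": ["weather"],
    "d4": ["merger", "deal"],
}
found, reviewed = cal_loop(corpus, ["d1"], oracle={"d1", "d2", "d4"}, budget=2)
# -> all three relevant documents found after reviewing only two candidates
```

The same loop shape applies whether the "documents" are papers in a literature review or source files screened for security vulnerabilities; only the scorer and the reviewer change.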