Detection is the central problem in real-word spelling correction
Real-word spelling correction differs from non-word spelling correction in
its aims and its challenges. Here we show that the central problem in real-word
spelling correction is detection. Methods from non-word spelling correction,
which focus instead on selection among candidate corrections, do not address
detection adequately, because detection is either assumed in advance or heavily
constrained. As we demonstrate in this paper, merely discriminating between the
intended word and a random close variation of it within the context of a
sentence is a task that can be performed with high accuracy using
straightforward models. Trigram models are sufficient in almost all cases. The
difficulty comes when every word in the sentence is a potential error, with a
large set of possible candidate corrections. Despite their strengths, trigram
models cannot reliably find true errors without introducing many more, at least
not when used in the obvious sequential way without added structure. The
detection task exposes weaknesses not visible in the selection task.
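The discrimination task described above (choosing between the intended word and a close variant in sentence context) can be sketched with a toy trigram model. This is an illustrative assumption-laden sketch, not the paper's implementation: the raw-count scoring stands in for a properly smoothed log-probability, and all names and the toy corpus are invented for demonstration.

```python
from collections import defaultdict

def train_trigrams(corpus):
    """Count trigrams over tokenized sentences, padded with boundary markers."""
    counts = defaultdict(int)
    for sentence in corpus:
        tokens = ["<s>", "<s>"] + sentence + ["</s>"]
        for i in range(len(tokens) - 2):
            counts[tuple(tokens[i:i + 3])] += 1
    return counts

def sentence_score(counts, sentence):
    """Sum of trigram counts: a crude stand-in for a smoothed log-probability."""
    tokens = ["<s>", "<s>"] + sentence + ["</s>"]
    return sum(counts[tuple(tokens[i:i + 3])] for i in range(len(tokens) - 2))

def prefer_intended(counts, sentence, intended, variant, position):
    """True if the model scores the intended word above a close variant of it."""
    with_intended = sentence[:position] + [intended] + sentence[position + 1:]
    with_variant = sentence[:position] + [variant] + sentence[position + 1:]
    return sentence_score(counts, with_intended) > sentence_score(counts, with_variant)
```

Even this crude model easily prefers "the cat sat" over "the cot sat" given matching training data, which mirrors the abstract's point: pairwise discrimination is easy, while open-ended detection (every word a potential error) is where trigram models break down.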
Rethinking a Reinvigorated Right To Assemble
Revived after a decades-long slumber, the First Amendment's Assembly Clause has garnered robust attention of late. Endeavoring to reinvigorate this forgotten clause, legal scholars have outlined a normative vision of the assembly right that would better safeguard the freedom of association. This Note argues that such an approach, whatever its merits or its deficiencies, overlooks the Clause's central aim. The assembly right is in fact best understood as an assembly right, not as a right about associations. This Note advances that proposition by closely analyzing the text and the history of the Assembly Clause, a project that has not yet been systematically undertaken. The evidence unearthed from this inquiry demonstrates that the Assembly Clause seeks, as its first-order concern, to protect in-person, flesh-and-blood gatherings. Such protection is thus ultimately of great import in rethinking both the freedoms afforded and the constraints imposed on dissent within our constitutional framework.
Cognitive apprenticeship: teaching the craft of reading, writing, and mathematics
Includes bibliographical references (p. 25-27). This research was supported by the National Institute of Education under Contract No. US-NIE-C-400-81-0030 and the Office of Naval Research under Contract No. N00014-85-C-002
Judicial Attention as a Scarce Resource: A Preliminary Defense of How Judges Allocate Time Across Cases in the Federal Courts of Appeals
Federal appellate judges no longer have the time to hear argument and draft opinions in all of their cases. The average annual filing per active judgeship now stands at 330 filed cases per year, more than four times what it was sixty years ago. In response, judges have adopted case management strategies that effectively involve spending significantly less time on certain classes of cases than on others. Various scholars have decried this state of affairs, suggesting that the courts have created a "bifurcated" system of justice with "separate and unequal tracks." These reformers propose altering the relevant constraints of the courts, primarily by increasing the number of judges or decreasing the judiciary's caseload. These approaches, however, have not gained political traction thus far and seem unlikely to in the foreseeable future.
This Article takes a realist approach and argues that we should recognize judicial attention for what it is, a scarce resource, and assess whether there is evidence that the courts are allocating that resource improperly. Loosely borrowing the framework of resource allocation from the political science and economics literatures, this Article considers how to apply the concepts of inputs and outputs to the work of the federal appellate courts, suggesting judicial attention as the input and a combination of error correction and law development as the output. It then makes the preliminary case that the courts' case management techniques in fact largely comport with an output-maximization approach, while still limiting inequality of outputs across cases. This Article concludes that the courts' overall strategy nevertheless presents opportunities for enhancement. It suggests several improvements, focusing on the review structure of cases that receive the least amount of judicial attention, to help ensure that all federal cases receive an appropriate form of appellate review.
DCU@TRECMed 2012: Using ad-hoc baselines for domain-specific retrieval
This paper describes the first participation of DCU in the TREC Medical Records Track (TRECMed). We performed some initial experiments on the 2011 TRECMed data based on the BM25 retrieval model. Surprisingly, we found that the standard BM25 model with default parameters performs comparably to the best automatic runs submitted to TRECMed 2011 and would have resulted in rank four out of 29 participating groups. We expected that some form of domain adaptation would increase performance. However, results on the 2011 data proved otherwise: concept-based query expansion decreased performance, and filtering and reranking by term proximity also decreased performance slightly. We submitted four runs based on the BM25 retrieval model to TRECMed 2012 using standard BM25, standard query expansion, result filtering, and concept-based query expansion. Official results for 2012 confirm that domain-specific knowledge does not increase performance compared to the BM25 baseline as applied by us.
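The baseline described above can be sketched as a standalone Okapi BM25 scorer. The values k1=1.2 and b=0.75 are the commonly cited defaults; the function name and toy inputs are illustrative assumptions, not the DCU system's actual code.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, doc_freqs, num_docs, avg_doc_len,
               k1=1.2, b=0.75):
    """Okapi BM25 score of one document for a query.

    doc_freqs maps a term to the number of documents containing it;
    k1 and b are the commonly used default parameters.
    """
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = doc_freqs.get(term, 0)
        if df == 0 or tf[term] == 0:
            continue  # term absent from collection or document: no contribution
        idf = math.log((num_docs - df + 0.5) / (df + 0.5) + 1)
        numerator = tf[term] * (k1 + 1)
        denominator = tf[term] + k1 * (1 - b + b * len(doc_terms) / avg_doc_len)
        score += idf * numerator / denominator
    return score
```

Ranking every record in the collection by this score and returning the top hits is the entire "ad-hoc baseline" strategy the abstract refers to; the paper's finding is that layering domain-specific expansion and filtering on top of such a scorer did not help.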
Collaborative Development and Evaluation of Text-processing Workflows in a UIMA-supported Web-based Workbench
Challenges in creating comprehensive text-processing workflows include a lack of interoperability between individual components coming from different providers and/or a requirement imposed on the end users to know programming techniques to compose such workflows. In this paper we demonstrate Argo, a web-based system that addresses these issues in several ways. It supports the widely adopted Unstructured Information Management Architecture (UIMA), which handles the problem of interoperability; it provides a web browser-based interface for developing workflows by drawing diagrams composed of a selection of available processing components; and it provides novel user-interactive analytics such as the annotation editor, which constitutes a bridge between automatic processing and manual correction. These features extend the target audience of Argo to users with a limited or no technical background. Here, we focus specifically on the construction of advanced workflows, involving multiple branching and merging points, to facilitate various comparative evaluations. Together with the use of user-collaboration capabilities supported in Argo, we demonstrate several use cases including visual inspections, comparisons of multiple processing segments or complete solutions against a reference standard, inter-annotator agreement, and shared task mass evaluations. Ultimately, Argo emerges as a one-stop workbench for defining, processing, editing and evaluating text processing tasks.
An Analysis of Source-Side Grammatical Errors in NMT
The quality of Neural Machine Translation (NMT) has been shown to
significantly degrade when confronted with source-side noise. We present the
first large-scale study of state-of-the-art English-to-German NMT on real
grammatical noise, by evaluating on several Grammar Correction corpora. We
present methods for evaluating NMT robustness without true references, and we
use them for extensive analysis of the effects that different grammatical
errors have on the NMT output. We also introduce a technique for visualizing
the divergence distribution caused by a source-side error, which allows for
additional insights. Comment: Accepted and to be presented at BlackboxNLP 201