Legal Judgement Prediction for UK Courts
Legal Judgement Prediction (LJP) is the task of automatically predicting the outcome of a court case given only the case document. During the last five years researchers have successfully attempted this task for the supreme courts of three jurisdictions: the European Union, France, and China. Motivation includes many real-world applications, such as a prediction system that can be used at the judgement drafting stage and the identification of the most important words and phrases within a judgement. The aim of our research was to build, for the first time, an LJP model for UK court cases. This required the creation of a labelled data set of UK court judgements and the subsequent application of machine learning models. We evaluated different feature representations and different algorithms. Our best performing model achieved 69.05% accuracy and an F1 score of 69.02. We demonstrate that LJP is a promising area of further research for UK courts by achieving high model performance and the ability to easily extract useful features.
A General Approach for Predicting the Behavior of the Supreme Court of the United States
Building on developments in machine learning and prior work in the science of judicial prediction, we construct a model designed to predict the behavior of the Supreme Court of the United States in a generalized, out-of-sample context. To do so, we develop a time-evolving random forest classifier which leverages some unique feature engineering to predict more than 240,000 justice votes and 28,000 case outcomes over nearly two centuries (1816-2015). Using only data available prior to decision, our model outperforms null (baseline) models at both the justice and case level under both parametric and non-parametric tests. Over nearly two centuries, we achieve 70.2% accuracy at the case outcome level and 71.9% at the justice vote level. More recently, over the past century, we outperform an in-sample optimized null model by nearly 5%. Our performance is consistent with, and improves on, the general level of prediction demonstrated by prior work; however, our model is distinctive because it can be applied out-of-sample to the entire past and future of the Court, not a single term. Our results represent an important advance for the science of quantitative legal prediction and portend a range of other potential applications.
Comment: version 2.02; 18 pages, 5 figures. This paper is related to but distinct from arXiv:1407.6333, and the results herein supersede arXiv:1407.6333. Source code available at https://github.com/mjbommar/scotus-predict-v
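The "time evolving" protocol described above can be sketched as a walk-forward evaluation: for each term, train only on cases decided strictly before it and predict that term's outcomes out-of-sample. The random forest below is generic scikit-learn, and the features, labels, and years are synthetic stand-ins for the paper's engineered case and justice variables.

```python
# Walk-forward (growing-window) out-of-sample evaluation, in the spirit of
# the time-evolving classifier described above. All data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
years = np.repeat(np.arange(2000, 2010), 30)   # 30 hypothetical cases per term
X = rng.normal(size=(len(years), 5))           # placeholder engineered features
y = (X[:, 0] + rng.normal(scale=0.5, size=len(years)) > 0).astype(int)

preds, truth = [], []
for term in np.unique(years)[1:]:              # need at least one prior term
    past, now = years < term, years == term
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    clf.fit(X[past], y[past])                  # only data available "prior to decision"
    preds.extend(clf.predict(X[now]))
    truth.extend(y[now])

accuracy = np.mean(np.array(preds) == np.array(truth))
```

The key property, as in the abstract, is that every prediction is made using only information available before the decision, so the accuracy figure is genuinely out-of-sample for every term.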
Slave to the Algorithm? Why a 'Right to an Explanation' Is Probably Not the Remedy You Are Looking For
Algorithms, particularly machine learning (ML) algorithms, are increasingly important to individuals' lives, but have caused a range of concerns revolving mainly around unfairness, discrimination and opacity. Transparency in the form of a "right to an explanation" has emerged as a compellingly attractive remedy since it intuitively promises to open the algorithmic "black box" to promote challenge, redress, and hopefully heightened accountability. Amidst the general furore over algorithmic bias we describe, any remedy in a storm has looked attractive. However, we argue that a right to an explanation in the EU General Data Protection Regulation (GDPR) is unlikely to present a complete remedy to algorithmic harms, particularly in some of the core "algorithmic war stories" that have shaped recent attitudes in this domain. Firstly, the law is restrictive, unclear, or even paradoxical concerning when any explanation-related right can be triggered. Secondly, even navigating this, the legal conception of explanations as "meaningful information about the logic of processing" may not be provided by the kind of ML "explanations" computer scientists have developed, partially in response. ML explanations are restricted both by the type of explanation sought, the dimensionality of the domain and the type of user seeking an explanation. However, "subject-centric explanations" (SCEs) focussing on particular regions of a model around a query show promise for interactive exploration, as do explanation systems based on learning a model from outside rather than taking it apart (pedagogical versus decompositional explanations) in dodging developers' worries of intellectual property or trade secrets disclosure. Based on our analysis, we fear that the search for a "right to an explanation" in the GDPR may be at best distracting, and at worst nurture a new kind of "transparency fallacy." But all is not lost. We argue that other parts of the GDPR related (i) to the right to erasure ("right to be forgotten") and the right to data portability; and (ii) to privacy by design, Data Protection Impact Assessments and certification and privacy seals, may have the seeds we can use to make algorithms more responsible, explicable, and human-centered.
Modelling source- and target-language syntactic information as conditional context in interactive neural machine translation
In interactive machine translation (MT), human translators correct errors in automatic translations in collaboration with the MT systems, which is seen as an effective way to improve productivity in translation. In this study, we model the source-language syntactic constituency parse and target-language syntactic descriptions in the form of supertags as conditional context for interactive prediction in neural MT (NMT). We found that the supertags significantly improve productivity gain in translation in interactive-predictive NMT (INMT), while syntactic parsing was found to be somewhat effective in reducing human effort in translation. Furthermore, when we model this source- and target-language syntactic information together as the conditional context, both types complement each other and our fully syntax-informed INMT model shows a statistically significant reduction in human effort for a French-to-English translation task in a reference-simulated setting, achieving 4.30 points absolute (corresponding to 9.18% relative) improvement in terms of word prediction accuracy (WPA) and 4.84 points absolute (corresponding to 9.01% relative) reduction in terms of word stroke ratio (WSR) over the baseline.
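The two evaluation measures, WPA and WSR, can be made concrete in a reference-simulated setting: the system proposes the next target word given the accepted prefix, and a wrong proposal costs the user one corrective word stroke. The bigram-table predictor below is a toy stand-in for an INMT system, not the paper's model.

```python
# Reference-simulated scoring: WPA is the fraction of words the system
# predicts correctly, WSR the fraction needing a corrective keystroke,
# so at the word level WPA + WSR = 1.
def simulate(predict, reference):
    correct = 0
    for i, word in enumerate(reference):
        if predict(reference[:i]) == word:
            correct += 1            # proposal accepted as-is by the translator
    n = len(reference)
    wpa = correct / n               # word prediction accuracy
    wsr = (n - correct) / n         # word stroke ratio
    return wpa, wsr

# Toy predictor: a hand-written bigram table standing in for an NMT decoder.
bigrams = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}
predict = lambda prefix: bigrams.get(prefix[-1], "") if prefix else "the"
ref = "the cat sat on the mat".split()
wpa, wsr = simulate(predict, ref)
```

Under this simulation, a stronger conditional context (such as the supertags above) raises WPA and lowers WSR in lockstep, which is why the abstract reports the two movements together.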
The Hidden Inconsistencies Introduced by Predictive Algorithms in Judicial Decision Making
Algorithms, from simple automation to machine learning, have been introduced
into judicial contexts to ostensibly increase the consistency and efficiency of
legal decision making. In this paper, we describe four types of inconsistencies
introduced by risk prediction algorithms. These inconsistencies threaten to
violate the principle of treating similar cases similarly and often arise from
the need to operationalize legal concepts and human behavior into specific
measures that enable the building and evaluation of predictive algorithms.
These inconsistencies, however, are likely to be hidden from their end-users:
judges, parole officers, lawyers, and other decision-makers. We describe the
inconsistencies, their sources, and propose various possible indicators and
solutions. We also consider the issue of inconsistencies due to the use of
algorithms in light of current trends towards more autonomous algorithms and
less human-understandable behavioral big data. We conclude by discussing judges
and lawyers' duties of technological ("algorithmic") competence and call for
greater alignment between the evaluation of predictive algorithms and
corresponding judicial goals
TermEval: an automatic metric for evaluating terminology translation in MT
Terminology translation plays a crucial role in domain-specific machine translation (MT). Preservation of domain knowledge from source to target is arguably the most pressing concern for customers in the translation industry, especially in critical domains such as medical, transportation, military, legal and aerospace. However, evaluation of terminology translation, despite its huge importance in the translation industry, has been a less examined area of MT research. Term translation quality in MT is usually measured by domain experts, either in academia or industry. To the best of our knowledge, as of yet there is no publicly available solution to automatically evaluate terminology translation in MT. In particular, manual intervention is often needed to evaluate terminology translation in MT, which, by nature, is a time-consuming and highly expensive task. This is impractical in an industrial setting, where customised MT systems often need to be updated for many reasons (e.g. availability of new training data or leading MT techniques). Hence, there is a genuine need for a faster and less expensive solution to this problem, which could help end-users instantly identify term translation problems in MT.
In this study, we propose an automatic evaluation metric, TermEval, for evaluating terminology translation in MT. To the best of our knowledge, there is no gold-standard dataset available for measuring terminology translation quality in MT. In the absence of a gold-standard evaluation test set, we semi-automatically create one from an English--Hindi judicial-domain parallel corpus.
We train state-of-the-art phrase-based SMT (PB-SMT) and neural MT (NMT) models in two translation directions, English-to-Hindi and Hindi-to-English, and use TermEval to evaluate their performance on terminology translation over the created gold-standard test set. To measure the correlation between TermEval scores and human judgements, the translation of each source term (in the gold-standard test set) is validated by human evaluators. The high correlation between TermEval and human judgements demonstrates the effectiveness of the proposed terminology translation evaluation metric. We also carry out a comprehensive manual evaluation of terminology translation and present our observations.
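The abstract does not give TermEval's exact formula, but the kind of computation an automatic terminology metric performs can be sketched as the fraction of gold term pairs whose target-side term surfaces in the MT hypothesis. The romanized term pairs and the sentence below are invented for illustration.

```python
# Hedged sketch of an automatic terminology-translation score: the fraction
# of gold (source term, target term) pairs whose target-side term appears
# in the MT hypothesis. Not TermEval's actual formula.
def term_score(term_pairs, hypothesis):
    hyp = hypothesis.lower()
    hits = sum(1 for _src, tgt in term_pairs if tgt.lower() in hyp)
    return hits / len(term_pairs)

# Invented romanized Hindi-English judicial term pairs and MT output.
pairs = [("abhiyukt", "accused"), ("nyayalay", "court"), ("zamaanat", "bail")]
hyp = "the accused was granted bail by the judge"
score = term_score(pairs, hyp)   # two of the three gold terms are present
```

A metric of this shape needs no human in the loop, which is exactly the property the abstract argues for in an industrial setting where systems are retrained frequently.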