Legal Judgement Prediction for UK Courts
Legal Judgement Prediction (LJP) is the task of automatically predicting the outcome of a court case given only the case document. Over the last five years researchers have successfully attempted this task for the supreme courts of three jurisdictions: the European Union, France, and China. Motivating applications include a prediction system that can be used at the judgement-drafting stage, and the identification of the most important words and phrases within a judgement. The aim of our research was to build, for the first time, an LJP model for UK court cases. This required the creation of a labelled data set of UK court judgements and the subsequent application of machine learning models. We evaluated different feature representations and different algorithms. Our best performing model achieved 69.05% accuracy and a 69.02 F1 score. We demonstrate that LJP is a promising area of further research for UK courts, given the high model performance achieved and the ability to easily extract useful features.
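The abstract does not name the feature representations or algorithms evaluated, so the sketch below is only a generic illustration of outcome prediction framed as binary text classification: a hand-rolled multinomial Naive Bayes over bag-of-words features. All data, labels, and function names are invented for illustration.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Train a multinomial Naive Bayes text classifier.

    docs: list of token lists; labels: parallel list of class labels.
    Returns (class log-priors, per-class token log-likelihoods, vocabulary).
    """
    vocab = {tok for doc in docs for tok in doc}
    classes = set(labels)
    priors, likelihoods = {}, {}
    for c in classes:
        class_docs = [d for d, l in zip(docs, labels) if l == c]
        priors[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(tok for d in class_docs for tok in d)
        total = sum(counts.values())
        # Laplace smoothing so unseen tokens get non-zero probability.
        likelihoods[c] = {t: math.log((counts[t] + 1) / (total + len(vocab)))
                          for t in vocab}
    return priors, likelihoods, vocab

def predict_nb(model, doc):
    """Pick the class with the highest posterior log-probability."""
    priors, likelihoods, vocab = model
    scores = {c: priors[c] + sum(likelihoods[c][t] for t in doc if t in vocab)
              for c in priors}
    return max(scores, key=scores.get)

# Toy judgements: label 1 = claim allowed, 0 = claim dismissed.
train = [("appeal allowed damages awarded".split(), 1),
         ("appeal allowed costs awarded".split(), 1),
         ("claim dismissed no merit".split(), 0),
         ("appeal dismissed costs refused".split(), 0)]
model = train_nb([d for d, _ in train], [l for _, l in train])
print(predict_nb(model, "damages awarded".split()))  # → 1
```

A side benefit of linear or Naive Bayes models, echoed in the abstract, is that per-token weights make the most predictive words easy to read off.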
Distributed Robust Learning
We propose a framework for distributed robust statistical learning on big
contaminated data. The Distributed Robust Learning (DRL) framework can
reduce the computational time of traditional robust learning methods by
several orders of magnitude. We analyze the robustness property of DRL,
showing that DRL not only preserves the robustness of the base robust
learning method, but also tolerates contamination of a constant fraction of
the results from computing nodes (node failures). More precisely, even in
the presence of the most adversarial outlier distribution over computing
nodes, DRL still achieves a breakdown point that is at least a constant
fraction of the breakdown point of the corresponding centralized algorithm.
This is in stark contrast with a naive division-and-averaging
implementation, whose breakdown point may shrink in proportion to the
number of computing nodes used. We then specialize the DRL framework for
two concrete cases: distributed robust principal component analysis and
distributed robust regression. We demonstrate the efficiency and the
robustness advantages of DRL through comprehensive simulations and by
predicting image tags on a large-scale image set.

Comment: 18 pages, 2 figures
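The abstract does not give DRL's exact aggregation rule, so the sketch below only illustrates the general divide-then-robustly-aggregate principle with invented data: each node computes a local estimate on its shard, and taking a median across node estimates tolerates adversarial results from a minority of nodes, whereas naive averaging would be destroyed by a single corrupted node.

```python
import random
import statistics

def node_estimates(data_shards):
    """Each computing node returns a local estimate (here, a simple mean)."""
    return [sum(shard) / len(shard) for shard in data_shards]

def robust_aggregate(estimates):
    """Median across node estimates: unaffected as long as fewer than half
    of the node results are corrupted (adversarial outliers or failures)."""
    return statistics.median(estimates)

random.seed(0)
# Nine nodes, each holding 200 samples drawn around a true mean of 5.0.
shards = [[random.gauss(5.0, 1.0) for _ in range(200)] for _ in range(9)]
est = node_estimates(shards)
est[0] = 1e9    # two nodes return adversarial garbage...
est[1] = -1e9
print(robust_aggregate(est))  # ...yet the aggregate stays close to 5.0
```

Averaging the same nine estimates would return a value on the order of 1e8, which is the breakdown-by-a-factor-of-the-node-count failure mode the abstract contrasts DRL against.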
Personality in Computational Advertising: A Benchmark
In the last decade, new ways of shopping online have increased the
possibility of buying products and services more easily and faster than
ever. In this new context, personality is a key determinant in the decision
making of the consumer when shopping. A person's buying choices are
influenced by psychological factors like impulsiveness; indeed, some
consumers may be more susceptible to making impulse purchases than others.
Since affective metadata are more closely related to the user's experience
than generic parameters, accurate predictions reveal important aspects of
users' attitudes and social life, including the attitudes of others and
social identity. This work proposes innovative research that uses a
personality perspective to determine the unique associations between
consumers' buying tendencies and advert recommendations. In fact, the lack
of a publicly available benchmark for computational advertising does not
allow either the exploration of this intriguing research direction or the
evaluation of recent algorithms. We present the ADS Dataset, a publicly
available benchmark consisting of 300 real advertisements (i.e., Rich Media
Ads, Image Ads, Text Ads) rated by 120 unacquainted individuals, enriched
with users' Big-Five personality factors and 1,200 personal user pictures.
Fighting Authorship Linkability with Crowdsourcing
Massive amounts of contributed content -- including traditional literature,
blogs, music, videos, reviews and tweets -- are available on the Internet
today, with authors numbering in many millions. Textual information, such as
product or service reviews, is an important and increasingly popular type of
content that is being used as a foundation of many trendy community-based
reviewing sites, such as TripAdvisor and Yelp. Some recent results have shown
that, due partly to their specialized/topical nature, sets of reviews authored
by the same person are readily linkable based on simple stylometric features.
In practice, this means that individuals who author more than a few reviews
under different accounts (whether within one site or across multiple sites) can
be linked, which represents a significant loss of privacy.
In this paper, we start by showing that the problem is actually worse than
previously believed. We then explore ways to mitigate authorship linkability in
community-based reviewing. We first attempt to harness the global power of
crowdsourcing by engaging random strangers into the process of re-writing
reviews. As our empirical results (obtained from Amazon Mechanical Turk)
clearly demonstrate, crowdsourcing yields impressively sensible reviews that
reflect sufficiently different stylometric characteristics such that prior
stylometric linkability techniques become largely ineffective. We also consider
using machine translation to automatically re-write reviews. Contrary to what
was previously believed, our results show that translation decreases authorship
linkability as the number of intermediate languages grows. Finally, we explore
the combination of crowdsourcing and machine translation and report on the
results.
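As a rough illustration of the kind of stylometric linking this paper defends against (not the authors' actual attack or feature set), the sketch below links a review to its most stylistically similar candidate using cosine similarity over character trigram counts, one of the simple stylometric features the abstract alludes to. The example reviews and the nearest-match rule are invented for illustration.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Character n-gram counts: a simple stylometric feature vector."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b))

def link(query, candidates):
    """Return the index of the candidate review most similar in style."""
    q = char_ngrams(query)
    sims = [cosine(q, char_ngrams(c)) for c in candidates]
    return max(range(len(sims)), key=sims.__getitem__)

reviews = ["The service was soooo good, loved it!!!",
           "The establishment provides adequate service."]
print(link("Food was soooo tasty, loved the place!!!", reviews))  # → 0
```

Crowdsourced or translated re-writes work precisely by perturbing these low-level character and word statistics until such similarity scores no longer single out the true author.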
A Measurement of Rb using a Double Tagging Method
The fraction of Z to bbbar events in hadronic Z decays has been measured by
the OPAL experiment using the data collected at LEP between 1992 and 1995. The
Z to bbbar decays were tagged using displaced secondary vertices, and high
momentum electrons and muons. Systematic uncertainties were reduced by
measuring the b-tagging efficiency using a double tagging technique. Efficiency
correlations between opposite hemispheres of an event are small, and are well
understood through comparisons between real and simulated data samples. A value
of Rb = 0.2178 +- 0.0011 +- 0.0013 was obtained, where the first error is
statistical and the second systematic. The uncertainty on Rc, the fraction of Z
to ccbar events in hadronic Z decays, is not included in the errors. The
dependence on Rc is Delta(Rb)/Rb = -0.056*Delta(Rc)/Rc where Delta(Rc) is the
deviation of Rc from the value 0.172 predicted by the Standard Model. The
result for Rb agrees with the value of 0.2155 +- 0.0003 predicted by the
Standard Model.

Comment: 42 pages, LaTeX, 14 eps figures included, submitted to European
Physical Journal
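Neglecting non-b backgrounds and hemisphere efficiency correlations (both of which the analysis above accounts for carefully), the core algebra of the double-tagging method can be sketched as follows: if f_single is the fraction of tagged hemispheres and f_double the fraction of events with both hemispheres tagged, then f_single = eps_b * Rb and f_double = eps_b**2 * Rb, so both the b-tagging efficiency eps_b and Rb can be extracted from data alone, without relying on a simulated efficiency. The function name and toy numbers below are invented for illustration.

```python
def double_tag_solve(f_single, f_double):
    """Solve for b-tag efficiency and Rb from single- and double-tag rates,
    neglecting non-b backgrounds and hemisphere correlations:

        f_single = eps_b * Rb        (tagged-hemisphere rate)
        f_double = eps_b**2 * Rb     (double-tagged-event rate)

    so eps_b = f_double / f_single and Rb = f_single**2 / f_double.
    """
    eps_b = f_double / f_single
    rb = f_single ** 2 / f_double
    return eps_b, rb

# Toy inputs generated from eps_b = 0.25 and Rb = 0.2178:
eps, rb = double_tag_solve(0.25 * 0.2178, 0.25 ** 2 * 0.2178)
print(eps, rb)  # recovers eps_b ≈ 0.25 and Rb ≈ 0.2178
```

In the real measurement the small hemisphere correlations enter as a correction factor on the double-tag rate, which is why understanding them through data/simulation comparisons matters for the systematic error.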
Evaluating two methods for Treebank grammar compaction
Treebanks, such as the Penn Treebank, provide a basis for the automatic creation of broad-coverage grammars. In the simplest case, rules can simply be 'read off' the parse annotations of the corpus, producing either a simple or a probabilistic context-free grammar. Such grammars, however, can be very large, presenting problems for the subsequent computational costs of parsing under the grammar.
In this paper, we explore ways by which a treebank grammar can be reduced in size, or 'compacted', which involve the use of two kinds of technique: (i) thresholding of rules by their number of occurrences; and (ii) a method of rule-parsing, which has both probabilistic and non-probabilistic variants. Our results show that by a combined use of these two techniques, a probabilistic context-free grammar can be reduced in size by 62% without any loss in parsing performance, and by 71% to give a gain in recall, but some loss in precision.
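The first compaction technique, frequency thresholding, is simple to sketch: read CFG rules off each parse tree, count them across the treebank, and keep only rules seen at least a threshold number of times. This is a minimal illustration under an invented tree encoding, not the paper's implementation; the helper names and toy trees are assumptions.

```python
from collections import Counter

def rules_from_tree(tree):
    """Extract CFG rules (parent -> children labels) from a nested tree,
    encoded as (label, [children]) with plain strings as leaf words."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rules = [(label, rhs)]
    for c in children:
        if not isinstance(c, str):
            rules.extend(rules_from_tree(c))
    return rules

def compact(treebank, threshold):
    """Treebank grammar compaction by frequency thresholding: keep only
    rules observed at least `threshold` times across the treebank."""
    counts = Counter(r for t in treebank for r in rules_from_tree(t))
    return {r: n for r, n in counts.items() if n >= threshold}

t1 = ("S", [("NP", ["dogs"]), ("VP", ["bark"])])
t2 = ("S", [("NP", ["cats"]), ("VP", ["meow"])])
print(compact([t1, t2], 2))  # → {('S', ('NP', 'VP')): 2}
```

The singleton lexical rules are exactly the kind of low-frequency rules that thresholding removes, which is why the technique shrinks a treebank grammar so sharply while leaving the frequent structural rules intact.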