
179 research outputs found

Challenges and solutions for Latin named entity recognition

Author: Ajaka Petra
Brown Christopher
de Marneffe Marie-Catherine
Elsner Micha
Erdmann Alex
Janse Mark
Joseph Brian D.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

Although spanning thousands of years and genres as diverse as liturgy, historiography, lyric and other forms of prose and poetry, the body of Latin texts is still relatively sparse compared to English. Data sparsity in Latin presents a number of challenges for traditional Named Entity Recognition techniques. Solving such challenges and enabling reliable Named Entity Recognition in Latin texts can facilitate many downstream applications, from machine translation to digital historiography, enabling Classicists, historians, and archaeologists, for instance, to track the relationships of historical persons, places, and groups on a large scale. This paper presents the first annotated corpus for evaluating Named Entity Recognition in Latin, as well as a fully supervised model that achieves over 90% F-score on a held-out test set, significantly outperforming a competitive baseline. We also present a novel active learning strategy that predicts how many and which sentences need to be annotated for named entities in order to attain a specified degree of accuracy when recognizing named entities automatically in a given text. This maximizes the productivity of annotators while simultaneously controlling quality.

Ghent University Academic Bibliography

Classifier-Based Pattern Selection Approach for Relation Instance Extraction

Author: C Cortes
D Das
JT Kim
MC Marneffe De
O Etzioni
S Brin
S Patwardhan
S Riedel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

University of Liverpool Repository

Crossref

Deep Memory Networks for Attitude Identification

Author: Chang C.-C.
Collobert R.
De Marneffe M.-C.
Faulkner A.
Gimpel K.
Glorot X.
Hasan K. S.
Hermann K. M.
Irsoy O.
Jiang L.
Kingma D.
Kobayashi N.
Le Q. V.
Li F.
Mikolov T.
Mohammad S. M.
Popescu A.
Socher R.
Sukhbaatar S.
Vo D.-T.
Walker M. A.
Wang S.
Zhang M.
Zirn C.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 16/01/2017

We consider the task of identifying attitudes towards a given set of entities from text. Conventionally, this task is decomposed into two separate subtasks: target detection, which identifies whether each entity is mentioned in the text, either explicitly or implicitly, and polarity classification, which classifies the exact sentiment towards an identified entity (the target) as positive, negative, or neutral. Instead, we show that attitude identification can be solved with an end-to-end machine learning architecture, in which the two subtasks are interleaved by a deep memory network. In this way, signals produced in target detection provide clues for polarity classification, and conversely, the predicted polarity provides feedback to the identification of targets. Moreover, the treatments for the set of targets also influence each other: the learned representations may share the same semantics for some targets but vary for others. The proposed deep memory network, AttNet, outperforms methods that do not consider the interactions between the subtasks or those among the targets, including conventional machine learning methods and state-of-the-art deep learning models. Comment: Accepted to WSDM'1

arXiv.org e-Print Archive

Crossref

Towards Computing Inferences from English News Headlines

Author: A Kronrod
A Pilkington
D Dor
E Iarovici
G Yule
H Paul Grice
I Dagan
M-C de Marneffe
SC Levinson
V Fromkin
V Pekar
Publication venue
Publication date: 18/10/2019

Newspapers are a popular form of written discourse, read by many people, thanks to the novelty of the information provided by the news content in them. A headline is the most widely read part of any newspaper due to its appearance in a bigger font and sometimes in colour print. In this paper, we suggest and implement a method for computing inferences from English news headlines, excluding the information from the context in which the headlines appear. This method attempts to generate the possible assumptions a reader forms upon reading a fresh headline. The generated inferences could be useful for assessing the impact of the news headline on readers, including children. The understandability of the current state of social affairs depends greatly on the assimilation of the headlines. As the inferences that are independent of the context depend mainly on the syntax of the headline, dependency trees of headlines are used in this approach to find the syntactic structure of the headlines and to compute inferences from them. Comment: PACLING 2019 Long paper, 15 page

arXiv.org e-Print Archive

Crossref

A Formal Framework for Modelling Coercion Resistance and Receipt Freeness

Author: D. Chaum
D. Unruh
H.L. Jonker
O. Marneffe de
R.W. Gardner
S. Delaune
T. Moran
T. Okamoto
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Coercion resistance and receipt freeness are critical properties for any voting system. However, many different definitions of these properties have been proposed, some formal and some informal; and there has been little attempt to tie these definitions together or identify relations between them. We give here a general framework for specifying different coercion resistance and receipt freeness properties using the process algebra CSP. The framework is general enough to accommodate a wide range of definitions, and strong enough to cover both randomization attacks and forced abstention attacks. We provide models of some simple voting systems, and show how the framework can be used to analyze these models under different definitions of coercion resistance and receipt freeness. Our formalisation highlights the variation between the definitions, and the importance of understanding the relations between them.

CiteSeerX

Crossref

University of Surrey

Automated Anonymity Verification of the ThreeBallot Voting System

Author: A. Fujioka
A. Juels
A.W. Roscoe
C.A.R. Hoare
D. Chaum
J. Cichoń
K. Henry
O. Marneffe de
P.Y.A. Ryan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

In recent years, a large number of secure voting protocols have been proposed in the literature. Often these protocols contain flaws, but because they are complex protocols, rigorous formal analysis has proven hard to come by. Rivest’s ThreeBallot voting system is important because it aims to provide security (voter anonymity and voter verifiability) without requiring cryptography. In this paper, we construct a CSP model of ThreeBallot, and use it to produce the first automated formal analysis of its anonymity property. Along the way, we discover that one of the crucial assumptions under which ThreeBallot (and many other voting systems) operates, the Short Ballot Assumption, is highly ambiguous in the literature. We give various plausible precise interpretations, and discover that in each case, the interpretation either is unrealistically strong, or else fails to ensure anonymity. Therefore, we give a version of the Short Ballot Assumption for ThreeBallot that is realistic but still provides a guarantee of anonymity.

CiteSeerX

Crossref

University of Surrey

Surrey Research Insight

University of Turku in the BioNLP'11 Shared Task

Author: A Jimeno Yepes
D McClosky
de Marneffe
E Buyko
E Charniak
Filip Ginter
H Kilicoglu
I Tsochantaridis
J Björne
J Heimonen
J Jourde
Jari Björne
JD Kim
JP Euzéby
M Miwa
MC de Marneffe
MF Porter
N Nguyen
P Stenetorp
R Bossy
S Pyysalo
S Riedel
S Van Landeghem
T Ohta
Tapio Salakoski
Y Kim
Z Ratkovic
Publication venue: BioMed Central
Publication date: 01/01/2012

Crossref

Springer - Publisher Connector

PubMed Central

Semantically linking molecular entities in literature through entity relationships

Author: A Airola
A Reverter
Bernard De Baets
C Burgess
D Jurgens
D McClosky
DLT Rohde
E Charniak
EW Sayers
H Kilicoglu
I Tsochantaridis
J Björne
Jari Björne
JD Kim
M Buckland
M de Marneffe
M Krallinger
M Miwa
M Sahlgren
MF Porter
R Leaman
S Pyysalo
S van Dongen
S Van Landeghem
Sofie Van Landeghem
T Ohta
Tapio Salakoski
The UniProt Consortium
Thomas Abeel
TK Landauer
VN Vapnik
Yves Van de Peer
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: Text mining tools have gained popularity for processing the vast number of research articles available in the biomedical literature. It is crucial that such tools extract information with a sufficient level of detail to be applicable in real-life scenarios. Studies of mining non-causal molecular relations contribute to this goal by formally identifying the relations between genes, promoters, complexes and various other molecular entities found in text. More importantly, these studies help to enhance integration of text mining results with database facts. Results: We describe, compare and evaluate two frameworks developed for the prediction of non-causal or 'entity' relations (REL) between gene symbols and domain terms. For the corresponding REL challenge of the BioNLP Shared Task of 2011, these systems ranked first (57.7% F-score) and second (41.6% F-score). In this paper, we investigate the performance discrepancy of 16 percentage points by benchmarking on a related and more extensive dataset, analysing the contribution of both the term detection and relation extraction modules. We further construct a hybrid system combining the two frameworks and experiment with intersection and union combinations, achieving high-precision and high-recall results, respectively. Finally, we highlight extremely high-performance results (F-score > 90%) obtained for the specific subclass of embedded entity relations that are essential for integrating text mining predictions with database facts. Conclusions: The results from this study will enable us in the near future to annotate semantic relations between molecular entities in the entire scientific literature available through PubMed. The recent release of the EVEX dataset, containing biomolecular event predictions for millions of PubMed articles, is an interesting and exciting opportunity to overlay these entity relations with event predictions on a literature-wide scale.

Crossref

TU Delft Repository

Springer - Publisher Connector

Ghent University Academic Bibliography

PubMed Central

Archivsystem Ask23

Learning perceptually grounded word meanings from unaligned parallel data

Author: A. Vogel
C. A. Thompson
C. Matuszek
D. L. Chen
H. Poon
J. Clarke
J. Dzifcak
Joshua Joseph
K. Hsiao
L. S. Zettlemoyer
M. MacMahon
M. Marneffe de
M. Skubic
N. Mavridis
Nicholas Roy
P. Liang
P. Rybski
Pratiksha Thaker
R. S. Jackendoff
S. Chernova
S. Piantadosi
S. R. K. Branavan
S. Tellex
Stefanie Tellex
T. Kollar
T. Kwiatkowski
Y. Wong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/05/2012

In order for robots to effectively understand natural language commands, they must be able to acquire meaning representations that can be mapped to perceptual features in the external world. Previous approaches to learning these grounded meaning representations require detailed annotations at training time. In this paper, we present an approach to grounded language acquisition which is capable of jointly learning a policy for following natural language commands such as “Pick up the tire pallet,” as well as a mapping between specific phrases in the language and aspects of the external world; for example the mapping between the words “the tire pallet” and a specific object in the environment. Our approach assumes a parametric form for the policy that the robot uses to choose actions in response to a natural language command that factors based on the structure of the language. We use a gradient method to optimize model parameters. Our evaluation demonstrates the effectiveness of the model on a corpus of commands given to a robotic forklift by untrained users. U.S. Army Research Laboratory (Collaborative Technology Alliance Program, Cooperative Agreement W911NF-10-2-0016); United States. Office of Naval Research (MURIs N00014-07-1-0749); United States. Army Research Office (MURI N00014-11-1-0688); United States. Defense Advanced Research Projects Agency (DARPA BOLT program under contract HR0011-11-2-0008

DSpace@MIT

Crossref