978 research outputs found

Pattern Mining for Named Entity Recognition

Author: B Bouchou
D Nadeau
DD McDonald
F Pedregosa
N Friburger
O Etzioni
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Many evaluation campaigns have shown that knowledge-based and data-driven approaches remain equally competitive for Named Entity Recognition (NER). Our research team has developed CasEN, a symbolic system based on finite-state transducers, which achieved promising results during the Ester2 French-speaking evaluation campaign. Despite these encouraging results, manually extending the coverage of such a hand-crafted system is a difficult task. In this paper, we present a novel approach based on pattern mining, both for NER and to supplement our system's knowledge base. The system, mXS, exhaustively searches for hierarchical sequential patterns that aim at detecting Named Entity boundaries. We assess their efficiency by using such patterns in a standalone mode and in combination with our existing system.

Crossref

HAL Université de Tours

A HMM POS Tagger for Micro-blogging Type Texts

Author: A. Ritter
K. Gimpel
L. Barbosa
L. Derczynski
O. Etzioni
R. Cooper
T. Finin
Publication venue: Springer Verlag
Publication date: 01/01/2014

The high volume of communication via micro-blogging type messages has created an increased demand for text processing tools customised for the unstructured text genre. The available text processing tools developed on structured texts have been shown to deteriorate significantly when used on unstructured, micro-blogging type texts. In this paper, we present the results of testing an HMM-based POS (Part-Of-Speech) tagging model customised for unstructured texts. We also evaluated the tagger against published CRF-based state-of-the-art POS tagging models customised for Tweet messages using three publicly available Tweet corpora. Finally, we ran cross-validation tests with both taggers by training them on one Tweet corpus and testing them on another.

Crossref

AUT Scholarly Commons

Growing a list

Author: Benjamin Letham
Cynthia Rudin
DF Hsu
Katherine A. Heller
M Lalmas
MMS Beg
O Bousquet
O Etzioni
R Gupta
S Soderland
Publication venue: Massachusetts Institute of Technology, Operations Research Center
Publication date: 21/08/2012

It is easy to find expert knowledge on the Internet on almost any topic, but obtaining a complete overview of a given topic is not always easy: Information can be scattered across many sources and must be aggregated to be useful. We introduce a method for intelligently growing a list of relevant items, starting from a small seed of examples. Our algorithm takes advantage of the wisdom of the crowd, in the sense that there are many experts who post lists of things on the Internet. We use a collection of simple machine learning components to find these experts and aggregate their lists to produce a single complete and meaningful list. We use experiments with gold standards and open-ended experiments without gold standards to show that our method significantly outperforms the state of the art. Our method uses the clustering algorithm Bayesian Sets even when its underlying independence assumption is violated, and we provide a theoretical generalization bound to motivate its use.

CiteSeerX

DSpace@MIT

Crossref

Classifier-Based Pattern Selection Approach for Relation Instance Extraction

Author: C Cortes
D Das
JT Kim
MC Marneffe De
O Etzioni
S Brin
S Patwardhan
S Riedel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

University of Liverpool Repository

Crossref

Winding of planar gaussian processes

Author: Baruch Horovitz
Benichou O
Comtet A
Fisher M
Grosberg A
Pierre Le Doussal
Rajabpour M A
Rudnick J
Samokhin K
Shapere A
Yoav Etzioni
Publication venue: 'IOP Publishing'
Publication date: 03/04/2009

We consider a smooth, rotationally invariant, centered gaussian process in the plane, with arbitrary correlation matrix $C_{tt'}$. We study the winding angle $\phi_t$ around its center. We obtain a closed formula for the variance of the winding angle as a function of the matrix $C_{tt'}$. For most stationary processes $C_{tt'} = C(t-t')$ the winding angle exhibits diffusion at large time with diffusion coefficient $D = \int_0^\infty ds \, C'(s)^2/(C(0)^2 - C(s)^2)$. Correlations of $\exp(i n \phi_t)$ with integer $n$, the distribution of the angular velocity $\dot\phi_t$, and the variance of the algebraic area are also obtained. For smooth processes with stationary increments (random walks) the variance of the winding angle grows as $\frac{1}{2} (\ln t)^2$, with proper generalizations to the various classes of fractional Brownian motion. These results are tested numerically. Non-integer $n$ is studied numerically. Comment: 12 pages, 6 figures
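
As an illustration of the diffusion-coefficient formula quoted in this abstract, the following minimal sketch evaluates D numerically for one example correlation function. The choice C(s) = exp(-s^2/2), the function names, the integration cutoff, and the midpoint rule are all assumptions made for this example and are not taken from the paper.

```python
import math


def diffusion_coefficient(C, dC, upper=10.0, steps=20000):
    """Midpoint-rule estimate of D = int_0^upper C'(s)^2 / (C(0)^2 - C(s)^2) ds.

    The midpoint rule never samples s = 0, where the integrand is a 0/0 limit.
    `upper` truncates the infinite range (an assumption: the integrand must
    have decayed to a negligible value by then).
    """
    h = upper / steps
    c0_sq = C(0.0) ** 2
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += dC(s) ** 2 / (c0_sq - C(s) ** 2)
    return total * h


# Hypothetical example correlation: C(s) = exp(-s^2 / 2) and its derivative.
def C(s):
    return math.exp(-0.5 * s * s)


def dC(s):
    return -s * math.exp(-0.5 * s * s)


D = diffusion_coefficient(C, dC)
print(f"D = {D:.4f}")
```

For this particular C the integrand simplifies to s^2/(e^{s^2} - 1), whose integral is (sqrt(pi)/4) ζ(3/2), about 1.158, which gives a quick sanity check on the numerical estimate.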

arXiv.org e-Print Archive

Crossref

Creativity and Autonomy in Swarm Intelligence Systems

Author: A Clark
A Dorin
A Etzioni
A Rothenberg
B Holldobler
CH Janson
CW Reynolds
DR Myatt
E Bonabeau
F Heppner
G Deleuze
G Greenfield
G. Borgia
JF Kennedy
John Mark Bishop
K Sims
K. Sims
M Boden
M Moglich
M. Boden
Mohammad Majid al-Rifaie
O Bown
P Galanter
P McCorduck
P Restany
R Sternberg
S Levy
S O’Sullivan
Suzanne Caines
T Nagel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/03/2012

This work introduces two swarm intelligence algorithms, one mimicking the foraging behaviour of one species of ants (Leptothorax acervorum) (a 'Stochastic Diffusion Search', SDS) and the other mimicking the behaviour of birds flocking (a 'Particle Swarm Optimiser', PSO), and outlines a novel integration strategy exploiting the local search properties of the PSO with global SDS behaviour. The resulting hybrid algorithm is used to sketch novel drawings of an input image, exploiting an artistic tension between the local behaviour of the 'birds flocking', as they seek to follow the input sketch, and the global behaviour of the 'ants foraging', as they seek to encourage the flock to explore novel regions of the canvas. The paper concludes by exploring the putative 'creativity' of this hybrid swarm system in the philosophical light of the 'rhizome' and Deleuze's well-known 'Orchid and Wasp' metaphor.

Goldsmiths Research Online

Crossref

Greenwich Academic Literature Archive

Next big challenges in core AI technology

Author: DeCario N.
Dengel A.
Etzioni O.
Hoos H.H.
Li F.F.
Traverso P.
Tsujii J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

Algorithms and the Foundations of Software technolog

Leiden University Scholary Publications

Computational fact checking from knowledge networks

Author: A Gupta
A Kata
AJ Flanagin
Alain Barrat
Alessandro Flammini
AP Masucci
B Markines
B Nyhan
C Castillo
CM Bishop
D Liben-Nowell
F Niu
Filippo Menczer
Giovanni Luca Ciampaglia
J Giles
J Ratkiewicz
JA Capitán
Johan Bollen
KT Poole
L Aiello
L Breiman
LA Adamic
LF Cranor
Luis M. Rocha
M Conover
M Mendoza
M Nickel
M Steyvers
N Lao
O Etzioni
Prashant Shiralkar
R Priedhorsky
S Auer
S Cohen
S DeDeo
S Lewandowsky
S Luper
SY Rieh
T Berners-Lee
T Fawcett
T Kamada
T Simas
TN Jagatic
X Dong
Y Wu
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 14/01/2015

Traditional fact checking by expert journalists cannot keep up with the enormous volume of information that is now generated online. Computational fact checking may significantly enhance our ability to evaluate the veracity of dubious information. Here we show that the complexities of human fact checking can be approximated quite well by finding the shortest path between concept nodes under properly defined semantic proximity metrics on knowledge graphs. Framed as a network problem this approach is feasible with efficient computational techniques. We evaluate this approach by examining tens of thousands of claims related to history, entertainment, geography, and biographical information using a public knowledge graph extracted from Wikipedia. Statements independently known to be true consistently receive higher support via our method than do false ones. These findings represent a significant step toward scalable computational fact-checking methods that may one day mitigate the spread of harmful misinformation
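
The shortest-path idea described in this abstract can be sketched on a toy graph. The abstract does not define its proximity metric, so the cost used here (summing the logarithm of the degree of each intermediate node, so that paths through generic hub concepts count as weaker support) is an assumption for illustration, as are the function name and the toy graph.

```python
import heapq
import math


def proximity(graph, source, target):
    """Return 1 / (1 + cost) of the cheapest path from source to target.

    Assumed metric: cost is the sum of log(degree) over intermediate nodes.
    Returns 0.0 when the target is unreachable.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return 1.0 / (1.0 + cost)
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour in graph[node]:
            # Entering an intermediate node costs log(degree); the target is free.
            step = 0.0 if neighbour == target else math.log(len(graph[neighbour]))
            new_cost = cost + step
            if new_cost < best.get(neighbour, math.inf):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return 0.0


# Hypothetical undirected knowledge graph as adjacency sets.
graph = {
    "A": {"B", "hub"},
    "B": {"A", "C"},
    "C": {"B", "hub"},
    "hub": {"A", "C", "D", "E"},
    "D": {"hub"},
    "E": {"hub"},
    "Z": set(),
}

print(proximity(graph, "A", "B"))  # directly linked concepts
print(proximity(graph, "A", "C"))  # linked through a specific node
print(proximity(graph, "A", "D"))  # linked only through a generic hub
```

Under this assumed metric a claim linking two directly connected concepts scores highest, and a claim supported only by a path through a high-degree hub scores lower than one supported by a path through a more specific node.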

arXiv.org e-Print Archive

Access to Research and Communications Annals

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Pharmacoeconomic analysis of adjuvant oral capecitabine vs intravenous 5-FU/LV in Dukes' C colon cancer: the X-ACT trial

Author: A Beham
A Figer
A Malzyner
A Messori
BE Hillner
C Twelves
DG Haller
DJ Sargent
E Díaz-Rubio
E Van Cutsem
F Coxon
G Fountzilas
G Liu
HT Arkenau
I Bustová
J Cassidy
J J McKendrick
J-Y Douillard
J. Sastre
JE Siegel
K K Patel
K Lesniewski-Kmak
L P Garrison
M Miwa
MC Weinstein
MJ O'Connell
MM Borner
N Wolmark
O Bertetto
P Dufour
P G Johnston
R Etzioni
R Porschen
S Jelic
S Ramsey
SD Ramsey
T S Maughan
W Cowell
W Scheithauer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Oral capecitabine (Xeloda®) is an effective drug with favourable safety in adjuvant and metastatic colorectal cancer. Oxaliplatin-based therapy is becoming standard for Dukes' C colon cancer in patients suitable for combination therapy, but is not yet approved by the UK National Institute for Health and Clinical Excellence (NICE) in the adjuvant setting. Adjuvant capecitabine is at least as effective as 5-fluorouracil/leucovorin (5-FU/LV), with significant superiority in relapse-free survival and a trend towards improved disease-free and overall survival. We assessed the cost-effectiveness of adjuvant capecitabine from payer (UK National Health Service (NHS)) and societal perspectives. We used clinical trial data and published sources to estimate incremental direct and societal costs and gains in quality-adjusted life months (QALMs). Acquisition costs were higher for capecitabine than 5-FU/LV, but higher 5-FU/LV administration costs resulted in 57% lower chemotherapy costs for capecitabine. Capecitabine vs 5-FU/LV-associated adverse events required fewer medications and hospitalisations (cost savings £3653). Societal costs, including patient travel/time costs, were reduced by >75% with capecitabine vs 5-FU/LV (cost savings £1318), with lifetime gain in QALMs of 9 months. Medical resource utilisation is significantly decreased with capecitabine vs 5-FU/LV, with cost savings to the NHS and society. Capecitabine is also projected to increase life expectancy vs 5-FU/LV. Cost savings and better outcomes make capecitabine a preferred adjuvant therapy for Dukes' C colon cancer. This pharmacoeconomic analysis strongly supports replacing 5-FU/LV with capecitabine in the adjuvant treatment of colon cancer in the UK.

Crossref

PubMed Central

Oxford University Research Archive

Enlighten

Understanding Democracy and Development Traps Using a Data-Driven Approach

Author: Azariadis C
Berthelemy JC
Conway D
Cox GW
Etzioni O
Mullainathan S
Nelson RR
Weingast BR
Welzel C
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 01/03/2015

Methods from machine learning and data science are becoming increasingly important in the social sciences, providing powerful new ways of identifying statistical relationships in large data sets. However, these relationships do not necessarily offer an understanding of the processes underlying the data. To address this problem, we have developed a method for fitting nonlinear dynamical systems models to data related to social change. Here, we use this method to investigate how countries become trapped at low levels of socioeconomic development. We identify two types of traps. The first is a democracy trap, where countries with low levels of economic growth and/or citizen education fail to develop democracy. The second trap is in terms of cultural values, where countries with low levels of democracy and/or life expectancy fail to develop emancipative values. We show that many key developing countries, including India and Egypt, lie near the border of these development traps, and we investigate the time taken for these nations to transition toward higher democracy and socioeconomic well-being

Crossref

PubMed Central

White Rose Research Online