10 research outputs found

Pattern Mining for Named Entity Recognition

Author: B Bouchou
D Nadeau
DD McDonald
F Pedregosa
N Friburger
O Etzioni
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Many evaluation campaigns have shown that knowledge-based and data-driven approaches remain equally competitive for Named Entity Recognition (NER). Our research team has developed CasEN, a symbolic system based on finite-state transducers, which achieved promising results during the Ester2 French-speaking evaluation campaign. Despite these encouraging results, manually extending the coverage of such a hand-crafted system is a difficult task. In this paper, we present a novel approach based on pattern mining, both for NER and to supplement our system's knowledge base. The system, mXS, exhaustively searches for hierarchical sequential patterns that aim at detecting Named Entity boundaries. We assess their efficiency by using such patterns in a standalone mode and in combination with our existing system.

Crossref

HAL Université de Tours

Similarité entre textes basées sur les noms propres

Author: Friburger N
Publication venue: 'African Journals Online (AJOL)'
Publication date: 02/03/2004

Summary: Proper names make up about 10% of the text of a newspaper article. Their quantity and informational quality are already exploited in information extraction systems (MUC conferences). We have built a tool based on a linguistic description expressed as finite-state transducers. The extracted proper names are then used for information retrieval: the goal is to present newspaper texts to users as a hierarchy and to describe the topics the texts cover. In this article we present an automatic text similarity measure alongside a similarity based on words alone. Keywords: Similarity / Hierarchical classification / Proper names.

Similarities between texts based on proper names. Abstract: Proper names represent about 10% of newspaper articles in English or French. Their quantity and informational quality are already used in various Information Extraction systems. Proper names have been widely studied in the MUC conferences, which were designed to promote research in Information Extraction. We have created our own named entity extraction tool based on a linguistic description with automata. The extracted names are used in information retrieval and in a topic description of the clusters. We verify the value of using proper names in a similarity measure to improve clustering. This measure merges a similarity based on all the words with a similarity based on the proper names. Key words: Similarity / Hierarchic clustering / Proper names. Revue d'Information Scientifique & Technique Vol.12(2) 2002: 61-7
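The abstract above says the measure merges a similarity over all words with a similarity over proper names, but it does not give the formula. The following is only a rough sketch of one way such a merge could look. Cosine similarity over term counts, the weighted sum, the `alpha` parameter, and the function names are all assumptions made for illustration, and the proper names are passed in as ready-made lists rather than extracted by transducers.

```python
import math
from collections import Counter
from typing import Iterable


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_similarity(
    words1: Iterable[str],
    words2: Iterable[str],
    names1: Iterable[str],
    names2: Iterable[str],
    alpha: float = 0.5,
) -> float:
    """Hypothetical merge: alpha weights the all-words similarity,
    (1 - alpha) weights the proper-name similarity."""
    word_sim = cosine(Counter(words1), Counter(words2))
    name_sim = cosine(Counter(names1), Counter(names2))
    return alpha * word_sim + (1.0 - alpha) * name_sim
```

With `alpha=1.0` the sketch reduces to a words-only similarity, which gives a baseline for checking whether the proper-name term changes a clustering.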

AJOL - African Journals Online

ANERsys: An Arabic Named Entity Recognition System Based on Maximum Entropy

Author: N. Friburger
R. Rosenfeld
S. Abuleil
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

Crossref

Building a Dictionary of Anthroponyms

Author: D. McDonald
D. Yarowsky
N. Friburger
O. Piton
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Crossref

Using Information Extraction to Build a Directory of Conference Announcements

Author: A. McCallum
D. Freitag
J.R. Hobbs
N. Friburger
S. Soderland
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004

Crossref

Finite-State Transducer Cascade to Extract Proper Names in Texts

Author: E. Roche
J. R. Hobbs
N. Friburger
S. Abney
S. Coates-Stephens
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Descriptional Complexity of Iterated Uniform Finite-State Transducers

Author: A Malcher
A Salomaa
C Citrini
GH Mealy
H Bordihn
J Hartmanis
MP Bianchi
N Friburger
ND Jones
Y Gao
Z Bednárová
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

We introduce the deterministic computational model of an iterated uniform finite-state transducer (IUFST). An IUFST performs the same length-preserving transduction on several left-to-right sweeps. The first sweep takes place on the input string, while any other sweep processes the output of the previous one. The IUFST accepts or rejects upon halting in an accepting or rejecting state along its sweeps. First, we focus on constant sweep bounded IUFSTs. We study their descriptional power vs. deterministic finite automata, and the state cost of implementing language operations. Then, we focus on non-constant sweep bounded IUFSTs, showing a nonregular language hierarchy depending on sweep complexity. The hardness of some classical decision problems on constant sweep bounded IUFSTs is also investigated.
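The sweep mechanism described in the abstract above can be illustrated with a small simulator. This is a minimal sketch, not the construction from the paper. The endmarker `$`, the `run_iufst` helper, the state names, and the example machine for the nonregular language { a^n b^n : n >= 0 } are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# (state, input symbol) -> (next state, output symbol): exactly one output
# symbol per input symbol, so the transduction is length-preserving.
Delta = Dict[Tuple[str, str], Tuple[str, str]]


def run_iufst(
    delta: Delta,
    start: str,
    accepting: Set[str],
    rejecting: Set[str],
    word: str,
    max_sweeps: int = 1000,
    end: str = "$",  # hypothetical right endmarker
) -> Tuple[bool, int]:
    """Return (accepted, number of the sweep in which the machine halted)."""
    tape = list(word) + [end]
    for sweep in range(1, max_sweeps + 1):
        state = start  # every sweep restarts in the same initial state
        for i, symbol in enumerate(tape):
            if (state, symbol) not in delta:
                return False, sweep  # assumption: undefined move rejects
            # The output overwrites the cell, so the next sweep reads it.
            state, tape[i] = delta[(state, symbol)]
            if state in accepting:
                return True, sweep
            if state in rejecting:
                return False, sweep
    raise RuntimeError("no halting state reached within max_sweeps")


# Example machine: each sweep crosses out one 'a' and one 'b' with 'X'.
DELTA: Delta = {
    ("q0", "X"): ("q0", "X"),   # skip symbols crossed out earlier
    ("q0", "a"): ("qa", "X"),   # cross out the first remaining 'a'
    ("q0", "b"): ("rej", "b"),  # a 'b' is left over without an 'a'
    ("q0", "$"): ("acc", "$"),  # everything is crossed out
    ("qa", "a"): ("qa", "a"),
    ("qa", "X"): ("qa", "X"),
    ("qa", "b"): ("qb", "X"),   # cross out the matching 'b'
    ("qa", "$"): ("rej", "$"),  # an 'a' is left over without a 'b'
    ("qb", "b"): ("qb", "b"),
    ("qb", "a"): ("rej", "a"),  # an 'a' after a 'b' breaks the shape
    ("qb", "X"): ("rej", "X"),
    ("qb", "$"): ("qb", "$"),   # sweep ends without halting; go again
}
```

Under these assumptions, `run_iufst(DELTA, "q0", {"acc"}, {"rej"}, "aabb")` returns `(True, 3)`. The example machine needs n + 1 sweeps on a^n b^n, so it falls in the non-constant sweep regime rather than the constant sweep bounded one.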

Crossref

AIR Universita degli studi di Milano

HAL Descartes

Hal-Diderot

Concentration of plasminogen and antiplasmin in plasma and serum

Author: Alkjaersig N.
Cederholm-Williams S.A.
Collen D.
Friburger P.
Hedner U.
Ogston D.
S. Cederholm-Williams
Teger-Nilsson A.C.
Thorsen S.
Vermylen C.
Wiman B.
Publication venue: 'BMJ'

Crossref

Iterated Uniform Finite-State Transducers: Descriptional Complexity of Nondeterminism and Two-Way Motion

Author: A Bertoni
A Ginzburg
A Malcher
C Citrini
C Mereghetti
G Hardy
GH Mealy
H Bordihn
J Hartmanis
M Kutrib
MO Rabin
MP Bianchi
N Friburger
Z Bednárová
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

An iterated uniform finite-state transducer executes the same length-preserving transduction in iterative sweeps. The first sweep occurs on the input string, while any subsequent sweep works on the output of the previous one. We consider devices with one-way motion and two-way motion, i.e., sweeps are either from left to right only, or alternate from left to right and from right to left. In addition, devices may work deterministically or nondeterministically. Here, we restrict our study to devices performing a constant number of sweeps, which are known to characterize exactly the regular languages. We determine the descriptional costs of removing two-way motion, nondeterminism, and sweeps, and, in particular, the costs for the conversion to deterministic or nondeterministic finite automata. Finally, the special case of unary languages is investigated, and a language family is presented that is immune to the resources of nondeterminism and two-way motion, in the sense that neither resource can reduce the number of states or the number of sweeps.

Crossref

AIR Universita degli studi di Milano