7 research outputs found

A case study on grammatical-based representation for regular expression evolution

Author: A.E. Eiben
B.D. Dunay
D.F. Barrero
E.M. Gold
G. Zipf
J.E.F. Friedl
K. Thompson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-12433-4_45

Proceedings of the 8th International Conference on Practical Applications of Agents and Multiagent Systems

Regular expressions, or simply regex, have been widely used for decades as a powerful pattern-matching and text-extraction tool. Although they provide a powerful and flexible notation for defining and retrieving patterns from text, the syntax and grammatical rules of regex notations are not easy to use, or even to understand. Any regex can be represented as a deterministic or non-deterministic finite automaton, so it is possible to design a representation to automatically build a regex, and an optimization algorithm able to find the best regex in terms of complexity. This paper introduces both a graph-based representation for regex and a particular heuristic-based evolutionary computing algorithm based on grammatical features of this language, applied to a particular data extraction problem.

This work has been partially supported by the Spanish Ministry of Science and Innovation under the projects Castilla-La Mancha project PEII09-0266-6640, COMPUBIODIVE (TIN2007-65989), and by HADA (TIN2007-64718).

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

Variable length-based genetic representation to automatically evolve wrappers

Author: B. Hutt
C.L. Ramsey
D. Camacho
D. Chu
D. Goldberg
D.F. Barrero
D.S. Burke
E.M. Gold
J.E.F. Friedl
J.G. Brookshear
J.H. Holland
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-12433-4_44

Proceedings of the 8th International Conference on Practical Applications of Agents and Multiagent Systems

The Web has been the star service on the Internet; however, the outsized volume of information available and its decentralized nature have created an intrinsic difficulty in locating, extracting and composing information. An automatic approach is required to handle this huge amount of data. In this paper we present a machine learning algorithm based on Genetic Algorithms which generates a set of complex wrappers able to extract information from the Web. The paper presents the experimental evaluation of these wrappers over a set of basic data sets.

This work has been partially supported by the Spanish Ministry of Science and Innovation under the projects Castilla-La Mancha project PEII09-0266-6640, COMPUBIODIVE (TIN2007-65989), and by V-LeaF (TIN2008-02729-E/TIN).

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

A Polynomial Time Match Test for Large Classes of Extended Regular Expressions

Author: A. Aho
A. Ehrenfeucht
C. Câmpeanu
D. Angluin
J.E.F. Friedl
O. Ibarra
R.K. Guy
T. Jiang
T. Shinohara
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

CiteSeerX

Crossref

Loughborough University Institutional Repository

Inside the Class of REGEX Languages

Author: B. Carle
C. Câmpeanu
G. Della Penna
H. Bordihn
J. Albert
J.E.F. Friedl
K.S. Larsen
M. Holzer
T. Shinohara
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

We study different possibilities of combining the concept of homomorphic replacement with regular expressions in order to investigate the class of languages given by extended regular expressions with backreferences (REGEX). It is shown in which regard existing and natural ways to do this fail to reach the expressive power of REGEX. Furthermore, the complexity of the membership problem for REGEX with a bounded number of backreferences is considered.
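
As a small illustration of why backreferences extend expressive power beyond ordinary regular expressions, the sketch below uses Python's `re` module to recognise strings of the form ww (some non-empty string written twice), a language that is known not to be regular. This is not taken from the paper; the helper name `is_square` is hypothetical and chosen only for this example.

```python
import re

# \1 must repeat exactly the text captured by group 1, so the whole
# pattern accepts only strings of the form ww with w non-empty.
SQUARE = re.compile(r"(.+)\1")


def is_square(s: str) -> bool:
    """Return True if s consists of some non-empty string written twice."""
    return SQUARE.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["abab", "aa", "aba", "abcabc", ""]:
        print(repr(candidate), is_square(candidate))
```

Python's matcher handles such patterns by backtracking, which can be slow on adversarial inputs; this is one practical reason the complexity of the membership problem is of interest.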

Crossref

Loughborough University Institutional Repository

Cuts in Regular Expressions

Author: A.J. Mayer
A.K. Chandra
H. Gruber
H. Petersen
J.E.F. Friedl
J.M. Robson
M. Fürer
O. Kupferman
V.M. Glushkov
W. Gelade
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

Characterising REGEX Languages by Regular Languages Equipped with Factor-Referencing

Author: B. Carle
C. Câmpeanu
D.D. Freydenberger
G. Della Penna
H. Bordihn
J. Albert
J. Dassow
J.E.F. Friedl
M.L. Schmid
R.W. Floyd
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Crossref

Text Mining in Genomics and Proteomics

Author: B. Vogelstein
B.R. Zeeberg
C. Blaschke
C. Friedman
C. von Mering
D. Chaussabel
D. Proux
D.R. Masys
E. Phizicky
E.M. Marcotte
E.S. Lander
F. Al-Shahrour
F. Liu
F. Mitelman
G. Sherlock
H. Mi
H. Shatkay
H. Yu
I. Donaldson
I.H. Witten
J.A. White
J.D. Kim
J.E.F. Friedl
J.L. DeRisi
J.M. Stuart
J.W. Cooper
K. Franzen
L. Hirschman
L. Tanabe
L.J. Jensen
M. Ashburner
M. Krauthammer
M.J. Schuemie
P. Glenisson
P.K. Shah
R. Hoffmann
R. Kuffner
R.R. Hausser
S. Heim
S. Mika
S. Raychaudhuri
T. Ono
T.H. Rabbitts
T.K. Jenssen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

Crossref