14 research outputs found

Product graph-based higher order contextual similarities for inexact subgraph matching

Author: Bunke H
Dutta A
Lladós J
Pal U
Publication venue: 'Elsevier BV'
Publication date: 16/10/2019

This is the author accepted manuscript. The final version is available from Elsevier via the DOI in this record. Many algorithms formulate graph matching as the optimization of an objective function built from pairwise measurements between the nodes and edges of the two graphs to be matched. Pairwise measurements usually consider local attributes but disregard the contextual information contained in the graph structures. We address this issue by proposing contextual similarities between pairs of nodes. This is done by considering the tensor product graph (TPG) of the two graphs to be matched, where each node is an ordered pair of nodes of the operand graphs. Contextual similarities between a pair of nodes are computed by accumulating weighted walks (normalized pairwise similarities) terminating at the corresponding paired node in the TPG. Once the contextual similarities are obtained, we formulate subgraph matching as a node and edge selection problem in the TPG. We use the contextual similarities to construct an objective function and optimize it with a linear programming approach. Since the random walk formulation through the TPG takes higher order information into account, it is not surprising that we obtain more reliable similarities and better discrimination among the nodes and edges. Experimental results on synthetic as well as real benchmarks illustrate that higher order contextual similarities increase discriminating power and allow one to find approximate solutions to the subgraph matching problem. European Union Horizon 202

Open Research Exeter

Pyramidal Stochastic Graphlet Embedding for Document Pattern Classification

Author: Dutta A
Fornes A
Llados J
Riba P
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/10/2019

This is the author accepted manuscript. The final version is available from IEEE via the DOI in this record. Document pattern classification methods using graphs have received a lot of attention because of their robust representation paradigm and rich theoretical background. However, the way documents are preserved and the process of delineating them with graphs introduce noise in the rendition of the underlying data, which creates instability in the graph representation. To deal with such unreliability in representation, in this paper we propose Pyramidal Stochastic Graphlet Embedding (PSGE). Given a graph representing a document pattern, our method first computes a graph pyramid by successively reducing the base graph. Once the graph pyramid is computed, we apply Stochastic Graphlet Embedding (SGE) to each level of the pyramid and combine the embedded representations to obtain a global delineation of the original graph. Considering a pyramid of graphs rather than just a base graph extends the representational power of the graph embedding, which reduces the instability caused by noise and distortion. When combined with a support vector machine, our proposed PSGE outperforms state-of-the-art results in the recognition of handwritten words as well as graphical symbols. European Union Horizon 2020; Ministerio de Educación, Cultura y Deporte, Spain; Ramon y Cajal Fellowship; CERCA Program/Generalitat de Cataluny

Open Research Exeter

Entity Local Structure Graph Matching for Mislabeling Correction

Author: Belaïd Abdel
Joseph Aurélie
Kooli Nihel
Poulain D'Andecy Vincent
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/04/2016

This paper proposes an entity local structure comparison approach based on inexact subgraph matching. The comparison results are used for mislabeling correction in the local structure. The latter represents a set of entity attribute labels which are physically close in a document image. It is modeled by an attributed graph whose nodes describe the content and presentation features of the labels and whose arcs describe the geometrical features. A local structure graph is matched with a structure model which represents a set of local structure model graphs. The structure model is initially built from a set of well-chosen local structures using a graph clustering algorithm and is then incrementally updated. The subgraph matching adopts a specific cost function that integrates the feature dissimilarities. The matched model graph is used to extract the missed labels, prune the extraneous ones and correct the erroneous label fields in the local structure. The evaluation of the structure comparison approach on 525 local structures extracted from 200 business documents achieves about 90% recall and 95% precision. The mislabeling correction rates in these local structures vary between 73% and 100%.

INRIA a CCSD electronic archive server

Shared Memory Parallel Subgraph Enumeration

Author: Kimmig Raphael
Meyerhenke Henning
Strash Darren
Publication venue
Publication date: 25/05/2017

The subgraph enumeration problem asks us to find all subgraphs of a target graph that are isomorphic to a given pattern graph. Determining whether even one such isomorphic subgraph exists is NP-complete, and therefore finding all such subgraphs (if they exist) is a time-consuming task. Subgraph enumeration has applications in many fields, including biochemistry and social networks, and interestingly the fastest algorithms for solving the problem for biochemical inputs are sequential. Since they depend on depth-first tree traversal, an efficient parallelization is far from trivial. Nevertheless, since important applications produce data sets with increasing difficulty, parallelism seems beneficial. We thus present here a shared-memory parallelization of the state-of-the-art subgraph enumeration algorithms RI and RI-DS (a variant of RI for dense graphs) by Bonnici et al. [BMC Bioinformatics, 2013]. Our strategy uses work stealing and our implementation demonstrates a significant speedup on real-world biochemical data, despite a highly irregular data access pattern. We also improve RI-DS by pruning the search space better; this further improves the empirical running times compared to the already highly tuned RI-DS. Comment: 18 pages, 12 figures. To appear at the 7th IEEE Workshop on Parallel / Distributed Computing and Optimization (PDCO 2017

arXiv.org e-Print Archive

Crossref

Localisation automatique de champs de saisie sur des images de formulaires couleur par isomorphisme de sous-graphe [Automatic localization of input fields in color form images by subgraph isomorphism]

Author: Adam Sébastien
Hammami Maroua
Héroux Pierre
Poulain D'
Publication venue: HAL CCSD
Publication date: 09/03/2016

This paper presents an approach for spotting textual fields in colored forms. These fields are located using their neighboring context, which is modeled with a structural representation. First, informative zones are extracted. Second, forms are represented by graphs in which nodes represent rectangles of uniform color while edges represent neighborhood relations. Finally, the context of the queried region of interest (ROI) is modeled as a graph. Subgraph isomorphism is applied in order to locate this ROI in the structural representation of the whole target document. Experimental results on a dataset of 130 document images show that the approach is effective and that the requested information is found even if its position changes.

Inexact graph matching for entity recognition in OCRed documents

Author: Belaid Abdel
Kooli Nihel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/12/2016

This paper proposes an entity recognition system for document images recognized by OCR. The system is based on a graph matching technique and is guided by a database describing the entities in its records. The input of the system is a document which is labeled by the entity attributes. A first grouping of those labels based on a score function leads to a selected set of candidate entities. The entity labels which are locally close are modeled by a structure graph. This graph is matched with model graphs learned for this purpose. The graph matching technique relies on a specific cost function that integrates the feature dissimilarities. The matching results are exploited to correct the mislabeling errors and then validate the entity recognition task. The system evaluation on three datasets covering different kinds of entities shows recall between 88.3% and 95% and precision between 94.3% and 95.7%.

INRIA a CCSD electronic archive server

An integer linear program for substitution-tolerant subgraph isomorphism and its use for symbol spotting in technical drawings

Author: Adam Sébastien
Héroux Pierre
Le Bodic Pierre
Lecourtier Yves
Publication venue: 'Elsevier BV'
Publication date: 16/06/2012

This paper tackles the problem of substitution-tolerant subgraph isomorphism, which is a specific class of error-tolerant isomorphism. This problem aims at finding a subgraph isomorphism of a pattern graph S in a target graph G. This isomorphism only considers label substitutions and forbids vertex and edge insertion in G. This kind of subgraph isomorphism is often needed in pattern recognition problems when graphs are attributed with real values and no exact matching can be found between attributes due to noise. Our proposal to solve the problem of substitution-tolerant subgraph isomorphism relies on its formulation as an Integer Linear Program (ILP). Using a general ILP solver, the approach is able to find, if one exists, a mapping of a pattern graph into a target graph such that the topology of the searched graph is kept and the editing operations between the labels have a minimal cost. This technique is evaluated on both a set of synthetic graphs and a problem of symbol detection in technical drawings. In the second case, document and symbol images are represented by vector-attributed Region Adjacency Graphs built from a segmentation process. The results obtained demonstrate the relevance of considering subgraph isomorphism as an optimization process.

HAL - Normandie Université

HAL-CentraleSupelec

HAL-Rennes 1

27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany

Author: ESA <27. 2019, München>
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/09/2019

Digitale Bibliothek Thüringen