Structural Data Recognition with Graph Model Boosting
This paper presents a novel method for structural data recognition using a
large number of graph models. In general, prevalent methods for structural data
recognition have two shortcomings: 1) Only a single model is used to capture
structural variation. 2) Naive recognition methods are used, such as the
nearest neighbor method. In this paper, we propose strengthening the
recognition performance of these models as well as their ability to capture
structural variation. The proposed method constructs a large number of graph
models and trains decision trees using the models. This paper makes two main
contributions. The first is a novel graph model that can quickly perform
calculations, which allows us to construct several models in a feasible amount
of time. The second contribution is a novel approach to structural data
recognition: graph model boosting. Comprehensive structural variations can be
captured with a large number of graph models constructed in a boosting
framework, and a sophisticated classifier can be formed by aggregating the
decision trees. Consequently, we can carry out structural data recognition with
powerful recognition capability in the face of comprehensive structural
variation. The experiments show that the proposed method achieves impressive
results and outperforms existing methods on datasets from the IAM graph database
repository. Comment: 8 pages
Structural matching by discrete relaxation
This paper describes a Bayesian framework for performing relational graph matching by discrete relaxation. Our basic aim is to draw on this framework to provide a comparative evaluation of a number of contrasting approaches to relational matching. Broadly speaking, there are two main aspects to this study. Firstly, we focus on the issue of how relational inexactness may be quantified. We illustrate that several popular relational distance measures can be recovered as specific limiting cases of the Bayesian consistency measure. The second aspect of our comparison concerns the way in which structural inexactness is controlled. We investigate three different realizations of the matching process which draw on contrasting control models. The main conclusion of our study is that the active process of graph-editing outperforms the alternatives in terms of its ability to effectively control a large population of contaminating clutter.
Toward a multilevel representation of protein molecules: comparative approaches to the aggregation/folding propensity problem
This paper builds upon the fundamental work of Niwa et al. [34], which
provides the unique possibility to analyze the relative aggregation/folding
propensity of the elements of the entire Escherichia coli (E. coli) proteome in
a cell-free standardized microenvironment. The hardness of the problem comes
from the superposition between the driving forces of intra- and inter-molecule
interactions, and it is mirrored by evidence of a shift from folding to
aggregation phenotypes by single-point mutations [10]. Here we apply several
state-of-the-art classification methods coming from the field of structural
pattern recognition, with the aim to compare different representations of the
same proteins gathered from the Niwa et al. database; such representations
include sequences and labeled (contact) graphs enriched with chemico-physical
attributes. Through this comparison, we are also able to identify some interesting
general properties of proteins. Notably, (i) we suggest a threshold around 250
residues discriminating "easily foldable" from "hardly foldable" molecules
consistent with other independent experiments, and (ii) we highlight the
relevance of contact graph spectra for folding behavior discrimination and
characterization of the E. coli solubility data. The soundness of the
experimental results presented in this paper is proved by the statistically
relevant relationships discovered among the chemico-physical description of
proteins and the developed cost matrix of substitution used in the various
discrimination systems. Comment: 17 pages, 3 figures, 46 references
Graph edit distance from spectral seriation
This paper is concerned with computing graph edit distance. One of the criticisms that can be leveled at existing methods for computing graph edit distance is that they lack some of the formality and rigor of the computation of string edit distance. Hence, our aim is to convert graphs to string sequences so that string matching techniques can be used. To do this, we use a graph spectral seriation method to convert the adjacency matrix into a string or sequence order. We show how the serial ordering can be established using the leading eigenvector of the graph adjacency matrix. We pose the problem of graph-matching as a maximum a posteriori probability (MAP) alignment of the seriation sequences for pairs of graphs. This treatment leads to an expression in which the edit cost is the negative logarithm of the a posteriori sequence alignment probability. We compute the edit distance by finding the sequence of string edit operations which minimizes the cost of the path traversing the edit lattice. The edit costs are determined by the components of the leading eigenvectors of the adjacency matrix and by the edge densities of the graphs being matched. We demonstrate the utility of the edit distance on a number of graph clustering problems.
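The pipeline in this abstract (leading eigenvector, serial ordering, string edit distance) can be illustrated with a minimal sketch. It is not the authors' method: the ordering here is a plain sort by eigenvector component rather than their seriation procedure, the string symbols are node degrees, and the edit costs are unit costs instead of the MAP-derived costs the abstract describes. All function names are hypothetical.

```python
import numpy as np


def serial_order(adj):
    """Order nodes by decreasing leading-eigenvector component.

    Assumption: a plain sort stands in for the paper's seriation step.
    """
    vals, vecs = np.linalg.eigh(adj)           # adjacency matrix assumed symmetric
    lead = np.abs(vecs[:, np.argmax(vals)])    # eigenvector sign is arbitrary
    # Rounding plus a stable sort keeps numerically tied nodes in index order.
    return np.argsort(-np.round(lead, 8), kind="stable")


def degree_string(adj):
    """Turn a graph into a sequence of node degrees read in serial order."""
    order = serial_order(adj)
    return [int(d) for d in adj.sum(axis=1)[order]]


def string_edit_distance(a, b):
    """Unit-cost Levenshtein distance computed over the edit lattice."""
    rows, cols = len(a) + 1, len(b) + 1
    cost = np.zeros((rows, cols), dtype=int)
    cost[:, 0] = np.arange(rows)               # delete everything from a
    cost[0, :] = np.arange(cols)               # insert everything from b
    for i in range(1, rows):
        for j in range(1, cols):
            substitute = cost[i - 1, j - 1] + (a[i - 1] != b[j - 1])
            cost[i, j] = min(substitute, cost[i - 1, j] + 1, cost[i, j - 1] + 1)
    return int(cost[-1, -1])


def seriation_edit_distance(adj_a, adj_b):
    """Compare two graphs through their serialized degree strings."""
    return string_edit_distance(degree_string(adj_a), degree_string(adj_b))


path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
print(seriation_edit_distance(path, triangle))
```

For the three-node path and the triangle, the degree strings are `[2, 1, 1]` and `[2, 2, 2]`, so the sketch reports two substitutions.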
Offline Signature Verification by Combining Graph Edit Distance and Triplet Networks
Biometric authentication by means of handwritten signatures is a challenging
pattern recognition task, which aims to infer a writer model from only a
handful of genuine signatures. In order to make it more difficult for a forger
to attack the verification system, a promising strategy is to combine different
writer models. In this work, we propose to complement a recent structural
approach to offline signature verification based on graph edit distance with a
statistical approach based on metric learning with deep neural networks. On the
MCYT and GPDS benchmark datasets, we demonstrate that combining the structural
and statistical models leads to significant improvements in performance,
profiting from their complementary properties.
Convex Graph Invariant Relaxations For Graph Edit Distance
The edit distance between two graphs is a widely used measure of similarity
that evaluates the smallest number of vertex and edge deletions/insertions
required to transform one graph into another. It is NP-hard to compute in
general, and a large number of heuristics have been proposed for approximating
this quantity. With few exceptions, these methods generally provide upper
bounds on the edit distance between two graphs. In this paper, we propose a new
family of computationally tractable convex relaxations for obtaining lower
bounds on graph edit distance. These relaxations can be tailored to the
structural properties of the particular graphs via convex graph invariants.
Specific examples that we highlight in this paper include constraints on the
graph spectrum as well as (tractable approximations of) the stability number
and the maximum-cut values of graphs. We prove under suitable conditions that
our relaxations are tight (i.e., exactly compute the graph edit distance) when
one of the graphs consists of few eigenvalues. We also validate the utility of
our framework on synthetic problems as well as real applications involving
molecular structure comparison problems in chemistry. Comment: 27 pages, 7 figures
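To make the quantity being bounded concrete, here is a brute-force sketch of the edit distance for a restricted case: two unlabeled graphs on the same number of vertices, counting only edge insertions and deletions. It enumerates every vertex relabeling, so it is exponential and only usable on tiny graphs. It does not implement the convex relaxations proposed in the paper, and the function names are hypothetical.

```python
from itertools import permutations


def edge_set(edges):
    """Normalize an undirected edge list into a set of frozensets."""
    return {frozenset(e) for e in edges}


def edge_edit_distance(n, edges_a, edges_b):
    """Smallest number of edge insertions/deletions over all relabelings.

    Assumption: both graphs have exactly n vertices labeled 0..n-1.
    """
    a, b = edge_set(edges_a), edge_set(edges_b)
    best = None
    for perm in permutations(range(n)):
        # Relabel graph A and count edges present in exactly one graph.
        mapped = {frozenset(perm[v] for v in e) for e in a}
        cost = len(mapped ^ b)
        if best is None or cost < best:
            best = cost
    return best


path = [(0, 1), (1, 2)]
triangle = [(0, 1), (1, 2), (0, 2)]
print(edge_edit_distance(3, path, triangle))
```

Any computationally tractable lower bound of the kind the abstract describes must never exceed the value such an exhaustive search returns, which is why a tiny exact solver is a convenient sanity check.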
Designing labeled graph classifiers by exploiting the Rényi entropy of the dissimilarity representation
Representing patterns as labeled graphs is becoming increasingly common in
the broad field of computational intelligence. Accordingly, a wide repertoire
of pattern recognition tools, such as classifiers and knowledge discovery
procedures, are nowadays available and tested for various datasets of labeled
graphs. However, the design of effective learning procedures operating in the
space of labeled graphs is still a challenging problem, especially from the
computational complexity viewpoint. In this paper, we present a major
improvement of a general-purpose classifier for graphs, which is conceived on
an interplay between dissimilarity representation, clustering,
information-theoretic techniques, and evolutionary optimization algorithms. The
improvement focuses on a specific key subroutine devised to compress the input
data. We prove different theorems which are fundamental to the setting of the
parameters controlling such a compression operation. We demonstrate the
effectiveness of the resulting classifier by benchmarking the developed
variants on well-known datasets of labeled graphs, considering as distinct
performance indicators the classification accuracy, computing time, and
parsimony in terms of structural complexity of the synthesized classification
models. The results show state-of-the-art standards in terms of test set
accuracy and a considerable speed-up in computing time. Comment: Revised version
Structural graph matching using the EM algorithm and singular value decomposition
This paper describes an efficient algorithm for inexact graph matching. The method is purely structural, that is, it uses only the edge or connectivity structure of the graph and does not draw on node or edge attributes. We make two contributions: 1) commencing from a probability distribution for matching errors, we show how the problem of graph matching can be posed as maximum-likelihood estimation using the apparatus of the EM algorithm; and 2) we cast the recovery of correspondence matches between the graph nodes in a matrix framework. This allows one to efficiently recover correspondence matches using the singular value decomposition. We experiment with the method on both real-world and synthetic data. Here, we demonstrate that the method offers comparable performance to more computationally demanding methods.
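How a singular value decomposition can turn a matrix of soft match scores into hard correspondences is sketched below. The abstract does not give the paper's exact update, so this swaps in a different, well-known technique in the spirit of the Scott and Longuet-Higgins pairing step: replace the singular values with ones and keep the entries that dominate both their row and their column. The `affinity` input and the function name are assumptions, and the EM part of the method is not shown.

```python
import numpy as np


def svd_correspondences(affinity):
    """Extract one-to-one matches from a node-affinity matrix.

    Assumption: affinity[i, j] scores how well node i of graph A
    matches node j of graph B; larger means better.
    """
    u, _, vt = np.linalg.svd(affinity, full_matrices=False)
    pairing = u @ vt                      # singular values replaced by ones
    matches = []
    for i in range(pairing.shape[0]):
        j = int(np.argmax(pairing[i]))
        # Keep (i, j) only if it is also the best entry in column j.
        if int(np.argmax(pairing[:, j])) == i:
            matches.append((i, j))
    return matches


# A clean affinity matrix that swaps the first two nodes.
swap = np.array([[0.0, 1.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]])
print(svd_correspondences(swap))
```

Dropping the singular values yields the orthogonal matrix closest to the affinity matrix, which pushes the result toward a permutation-like assignment without an explicit combinatorial search.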