A comparison between two representatives of a set of graphs: median vs barycenter graph
Paper presented at the Joint IAPR International Workshop on Structural, Syntactic and Statistical Pattern Recognition (SSPR&SPR), held in Izmir (Turkey) from 18 to 20 August 2010. In this paper we consider two existing methods for generating a representative of a given set of graphs that satisfy the following two conditions: they are applicable to graphs with any kind of labels on nodes and edges, and they can handle relatively large amounts of data. Namely, the approximate algorithms to compute the median graph via graph embedding and a new method to compute the barycenter graph. Our contribution is to give a new algorithm for the barycenter computation and to compare it to the median graph. To compare these two representatives, we take into account algorithmic considerations and experimental results on the quality of the representative and its robustness, on several datasets. This work was supported by the projects 'CONSOLIDER-INGENIO 2010 Multimodal interaction in pattern recognition and computer vision' (V-00069) and 'Robotica ubicua para entornos urbanos' (J-01225). Peer Reviewed
A comparison between two representatives of a set of graphs: median vs barycenter graph
In this paper we consider two existing methods for generating a representative of a given set of graphs that satisfy the following two conditions: they are applicable to graphs with any kind of labels on nodes and edges, and they can handle relatively large amounts of data. Namely, the approximate algorithms to compute the median graph via graph embedding and a new method to compute the barycenter graph. Our contribution is to give a new algorithm for the barycenter computation and to compare it to the median graph. To compare these two representatives, we take into account algorithmic considerations and experimental results on the quality of the representative and its robustness, on several datasets. Preprint
A generic framework for median graph computation based on a recursive embedding approach
The median graph has been shown to be a good choice for obtaining a representative of a set of graphs. However, its computation is a complex problem. Recently, graph embedding into vector spaces has been proposed to obtain approximations of the median graph. The problem with such an approach is how to go from a point in the vector space back to a graph in the graph space. The main contribution of this paper is the generalization of this previous method, proposing a generic recursive procedure that makes it possible to recover the graph corresponding to a point in the vector space, introducing only the amount of approximation inherent to the use of graph matching algorithms. In order to evaluate the proposed method, we compare it with the set median and with the other state-of-the-art embedding-based methods for median graph computation. The experiments are carried out using four different databases (one semi-artificial and three containing real-world data). Results show that with the proposed approach we can obtain better medians, in terms of the sum of distances to the training graphs, than with the previously existing methods. © 2011 Elsevier Inc. All rights reserved. This work has been supported by the Spanish research programmes Consolider Ingenio 2010 CSD2007-00018, TIN2006-15694-C02-02 and TIN2008-04998 and the fellowship RYC-2009-05031. Peer Reviewed
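The set median baseline and the sum-of-distances criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes the pairwise graph distances (for example, approximate graph edit distances) have already been computed and stored in a square matrix; the function names and the toy matrix are hypothetical and do not come from the paper.

```python
from typing import Sequence


def sum_of_distances(candidate: int, dist: Sequence[Sequence[float]]) -> float:
    """Sum of distances from graph `candidate` to every graph in the set."""
    return sum(dist[candidate][j] for j in range(len(dist)))


def set_median(dist: Sequence[Sequence[float]]) -> int:
    """Index of the set median: the member of the set with the smallest
    sum of distances to all members (ties go to the lowest index)."""
    if not dist:
        raise ValueError("distance matrix is empty")
    return min(range(len(dist)), key=lambda i: sum_of_distances(i, dist))


# Toy, made-up distance matrix for four graphs (symmetric, zero diagonal).
D = [
    [0, 2, 3, 4],
    [2, 0, 1, 2],
    [3, 1, 0, 3],
    [4, 2, 3, 0],
]
best = set_median(D)
print(best, sum_of_distances(best, D))
```

The set median is restricted to graphs already in the set, whereas a median graph obtained through embedding may be a new graph; the same sum-of-distances score can be used to compare the two.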
Sequential Deliberation for Social Choice
In large scale collective decision making, social choice is a normative study
of how one ought to design a protocol for reaching consensus. However, in
instances where the underlying decision space is too large or complex for
ordinal voting, standard voting methods of social choice may be impractical.
How then can we design a mechanism - preferably decentralized, simple,
scalable, and not requiring any special knowledge of the decision space - to
reach consensus? We propose sequential deliberation as a natural solution to
this problem. In this iterative method, successive pairs of agents bargain over
the decision space using the previous decision as a disagreement alternative.
We describe the general method and analyze the quality of its outcome when the
space of preferences defines a median graph. We show that sequential
deliberation finds a 1.208-approximation to the optimal social cost on such
graphs, coming very close to this value with only a small constant number of
agents sampled from the population. We also show lower bounds on simpler
classes of mechanisms to justify our design choices. We further show that
sequential deliberation is ex-post Pareto efficient and has truthful reporting
as an equilibrium of the induced extensive form game. We finally show that for
general metric spaces, the second moment of the distribution of social cost
of the outcomes produced by sequential deliberation is also bounded.
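The iterative protocol described in this abstract can be sketched on the simplest median graph, a line. The sketch rests on a modeling assumption that the abstract does not state: each bargaining round is taken to end at the median of the two agents' ideal points and the current disagreement alternative. All names below are hypothetical.

```python
import random
import statistics


def deliberate(ideal_points, rounds, start, rng):
    """Run `rounds` of pairwise bargaining on a line.

    Assumption (not from the abstract): each round's outcome is the median
    of the two sampled agents' ideal points and the current outcome, which
    serves as the disagreement alternative.
    """
    outcome = start
    for _ in range(rounds):
        u, v = rng.sample(ideal_points, 2)  # two distinct agents
        outcome = statistics.median([u, v, outcome])
    return outcome


def social_cost(ideal_points, outcome):
    """Sum of distances from every agent's ideal point to the outcome."""
    return sum(abs(p - outcome) for p in ideal_points)


agents = [1, 4, 9, 12]
final = deliberate(agents, rounds=50, start=4, rng=random.Random(0))
print(final, social_cost(agents, final))
```

Because every round takes a median of three points, the outcome never leaves the interval spanned by the agents' ideal points and the starting alternative.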
Tree Edit Distance Learning via Adaptive Symbol Embeddings
Metric learning has the aim to improve classification accuracy by learning a
distance measure which brings data points from the same class closer together
and pushes data points from different classes further apart. Recent research
has demonstrated that metric learning approaches can also be applied to trees,
such as molecular structures, abstract syntax trees of computer programs, or
syntax trees of natural language, by learning the cost function of an edit
distance, i.e. the costs of replacing, deleting, or inserting nodes in a tree.
However, learning such costs directly may yield an edit distance which violates
metric axioms, is challenging to interpret, and may not generalize well. In
this contribution, we propose a novel metric learning approach for trees which
we call embedding edit distance learning (BEDL) and which learns an edit
distance indirectly by embedding the tree nodes as vectors, such that the
Euclidean distance between those vectors supports class discrimination. We
learn such embeddings by reducing the distance to prototypical trees from the
same class and increasing the distance to prototypical trees from different
classes. In our experiments, we show that BEDL improves upon the
state-of-the-art in metric learning for trees on six benchmark data sets,
ranging from computer science over biomedical data to a natural-language
processing data set containing over 300,000 nodes. Comment: Paper at the International Conference on Machine Learning (2018),
2018-07-10 to 2018-07-15 in Stockholm, Sweden
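The idea of deriving edit costs from node embeddings can be sketched as follows. This is a simplified stand-in, not the method of the paper: it uses a sequence edit distance instead of a tree edit distance, fixes the embeddings by hand instead of learning them, and assumes that deleting or inserting a symbol costs its Euclidean distance to the origin. All names are hypothetical.

```python
import math


def make_costs(embedding):
    """Build edit costs from symbol embeddings (a dict: symbol -> vector)."""
    dim = len(next(iter(embedding.values())))
    origin = (0.0,) * dim

    def replace(x, y):
        # Replacement cost: Euclidean distance between the two embeddings.
        return math.dist(embedding[x], embedding[y])

    def gap(x):
        # Assumption: deletion/insertion cost is the distance to the origin.
        return math.dist(embedding[x], origin)

    return replace, gap


def edit_distance(xs, ys, replace, gap):
    """Standard dynamic program for sequence edit distance with custom costs."""
    m, n = len(xs), len(ys)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + gap(xs[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + gap(ys[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + replace(xs[i - 1], ys[j - 1]),
                d[i - 1][j] + gap(xs[i - 1]),   # delete xs[i-1]
                d[i][j - 1] + gap(ys[j - 1]),   # insert ys[j-1]
            )
    return d[m][n]


# Hand-picked toy embeddings; a learning method would adapt these vectors.
emb = {"a": (1.0, 0.0), "b": (0.0, 1.0), "c": (3.0, 4.0)}
replace, gap = make_costs(emb)
print(edit_distance("a", "b", replace, gap))
```

Since every cost here is a Euclidean distance (with the origin standing in for the gap symbol), the costs are symmetric and satisfy the triangle inequality, which is one way to avoid the violations of metric axioms that the abstract warns about.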
Empirical geodesic graphs and CAT(k) metrics for data analysis
A methodology is developed for data analysis based on empirically constructed
geodesic metric spaces. For a probability distribution, the length along a path
between two points can be defined as the amount of probability mass accumulated
along the path. The geodesic, then, is the shortest such path and defines a
geodesic metric. Such metrics are transformed in a number of ways to produce
parametrised families of geodesic metric spaces, empirical versions of which
allow computation of intrinsic means and associated measures of dispersion.
These reveal properties of the data, based on geometry, such as those that are
difficult to see from the raw Euclidean distances. Examples of application
include clustering and classification. For certain parameter ranges, the spaces
become CAT(0) spaces and the intrinsic means are unique. In one case, a minimal
spanning tree of a graph based on the data becomes CAT(0). In another, a
so-called "metric cone" construction allows extension to CAT(k) spaces. It is
shown how to empirically tune the parameters of the metrics, making it possible
to apply them to a number of real cases. Comment: Statistics and Computing, 201
How Many Dissimilarity/Kernel Self Organizing Map Variants Do We Need?
In numerous applicative contexts, data are too rich and too complex to be
represented by numerical vectors. A general approach to extend machine learning
and data mining techniques to such data is to rely on a dissimilarity or on a
kernel that measures how different or similar two objects are. This approach
has been used to define several variants of the Self Organizing Map (SOM). This
paper reviews those variants using a common set of notations in order to
outline differences and similarities between them. It discusses the advantages
and drawbacks of the variants, as well as the actual relevance of the
dissimilarity/kernel SOM for practical applications.