One-class classifiers based on entropic spanning graphs
One-class classifiers offer valuable tools to assess the presence of outliers
in data. In this paper, we propose a design methodology for one-class
classifiers based on entropic spanning graphs. Our approach takes into account
the possibility to process also non-numeric data by means of an embedding
procedure. The spanning graph is learned on the embedded input data and the
resulting partition of vertices defines the classifier. The final partition is
derived by exploiting a criterion based on mutual information minimization.
Here, we compute the mutual information by using a convenient formulation
provided in terms of the α-Jensen difference. Once training is
completed, in order to associate a confidence level with the classifier
decision, a graph-based fuzzy model is constructed. The fuzzification process
is based only on topological information of the vertices of the entropic
spanning graph. As such, the proposed one-class classifier is suitable also for
data characterized by complex geometric structures. We provide experiments on
well-known benchmarks containing both feature vectors and labeled graphs. In
addition, we apply the method to the protein solubility recognition problem by
considering several representations for the input samples. Experimental results
demonstrate the effectiveness and versatility of the proposed method with
respect to other state-of-the-art approaches.
Comment: Extended and revised version of the paper "One-Class Classification
Through Mutual Information Minimization" presented at the 2016 IEEE IJCNN,
Vancouver, Canada.
On classes of non-Gaussian asymptotic minimizers in entropic uncertainty principles
In this paper we revisit the Bialynicki-Birula & Mycielski uncertainty
principle and its cases of equality. This Shannon entropic version of the
well-known Heisenberg uncertainty principle can be used when dealing with
variables that admit no variance. In this paper, we extend this uncertainty
principle to Renyi entropies. We recall that in both Shannon and Renyi cases,
and for a given dimension n, the only case of equality occurs for Gaussian
random vectors. We show that as n grows, however, the bound is also
asymptotically attained in the cases of n-dimensional Student-t and Student-r
distributions. A complete analytical study is performed in a special case of a
Student-t distribution. We also show numerically that this effect exists for
the particular case of an n-dimensional Cauchy variable, whatever the Renyi
entropy considered, extending the results of Abe and illustrating the
analytical asymptotic study of the Student-t case. In the Student-r case, we
show numerically that the same behavior occurs for uniformly distributed
vectors. These particular cases and other ones investigated in this paper are
interesting since they show that this asymptotic behavior cannot be considered
as a "Gaussianization" of the vector when the dimension increases.
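For orientation, the Shannon form of the uncertainty principle discussed in this abstract can be written as follows. This is a sketch under assumed conventions that the abstract itself does not state: units with \hbar = 1, a unitary Fourier transform on R^n, natural logarithms, and the symbols \psi, \rho and \gamma, which are introduced here only for illustration.

```latex
% Assumed notation: \rho(x) = |\psi(x)|^2 is the position density and
% \gamma(p) = |\hat{\psi}(p)|^2 is the momentum density, with \hbar = 1.
\[
  H(\rho) + H(\gamma) \;\ge\; n\,(1 + \ln \pi),
  \qquad
  H(\rho) = -\int_{\mathbb{R}^n} \rho(x)\,\ln \rho(x)\,\mathrm{d}x .
\]
% For fixed n, equality holds in the Gaussian case, as the abstract recalls.
```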
Fast depth-based subgraph kernels for unattributed graphs
In this paper, we investigate two fast subgraph kernels based on a depth-based representation of graph structure. Both methods gauge depth information through a family of K-layer expansion subgraphs rooted at a vertex [1]. The first method commences by computing a centroid-based complexity trace for each graph, using a depth-based representation rooted at the centroid vertex that has minimum shortest path length variance to the remaining vertices [2]. This subgraph kernel is computed by measuring the Jensen-Shannon divergence between centroid-based complexity entropy traces. The second method, on the other hand, computes a depth-based representation around each vertex in turn. The corresponding subgraph kernel is computed using isomorphism tests to compare the depth-based representation rooted at each vertex in turn. For graphs with n vertices, the time complexities for the two new kernels are O(n^2) and O(n^3), in contrast to O(n^6) for the classic Gärtner graph kernel [3]. Key to achieving this efficiency is that we compute the required Shannon entropy of the random walk for our kernels with O(n^2) operations. This computational strategy enables our subgraph kernels to easily scale up to graphs of reasonably large sizes and thus overcome the size limits arising in state-of-the-art graph kernels. Experiments on standard bioinformatics and computer vision graph datasets demonstrate the effectiveness and efficiency of our new subgraph kernels.
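The two information-theoretic ingredients named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' kernel: it assumes an undirected graph given as a dense 0/1 adjacency matrix and takes the steady-state random walk probability of a vertex to be its degree divided by the sum of all degrees. The function names are hypothetical.

```python
import math


def steady_state_entropy(adjacency):
    """Shannon entropy of the steady-state random walk on an undirected graph.

    Assumption: the steady-state probability of vertex v is deg(v) / sum of degrees.
    Reading the degrees from a dense n x n matrix costs O(n^2) operations.
    """
    degrees = [sum(row) for row in adjacency]
    total = sum(degrees)
    entropy = 0.0
    for d in degrees:
        if d > 0:  # isolated vertices contribute nothing
            p = d / total
            entropy -= p * math.log(p)
    return entropy


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (natural log)."""
    def shannon(dist):
        return -sum(x * math.log(x) for x in dist if x > 0)

    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon(mixture) - (shannon(p) + shannon(q)) / 2


# Complete graph on four vertices: every vertex has degree 3.
k4 = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
print(steady_state_entropy(k4))
print(jensen_shannon_divergence([1.0, 0.0], [0.0, 1.0]))
```

On a regular graph the steady-state distribution is uniform, so the entropy equals the logarithm of the number of vertices; two distributions with disjoint support reach the maximum divergence of log 2.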
The mutual information between graphs
The estimation of mutual information between graphs has been an elusive problem until the formulation of graph matching in terms of manifold alignment. Then, graphs are mapped to multi-dimensional sets of points through structure-preserving embeddings. Point-wise alignment algorithms can be exploited in this context to re-cast graph matching in terms of point matching. Methods based on bypass entropy estimation must be deployed to render the estimation of mutual information computationally tractable. In this paper the novel contribution is to show how manifold alignment can be combined with copula-based entropy estimators to efficiently estimate the mutual information between graphs. We compare the empirical copula with an Archimedean copula (the independent one) in terms of retrieval/recall after graph comparison. Our experiments show that mutual information built with either choice significantly improves on state-of-the-art divergences.
Funding. F. Escolano, M.A. Lozano: Project TIN2012-32839 (Spanish Gov.). M. Curado: BES-2013-064482 (Spanish Gov.). E. R. Hancock: Royal Society Wolfson Research Merit Award.
Graph similarity through entropic manifold alignment
In this paper we decouple the problem of measuring graph similarity into two sequential steps. The first step is the linearization of the quadratic assignment problem (QAP) in a low-dimensional space, given by the embedding trick. The second step is the evaluation of an information-theoretic distributional measure, which relies on deformable manifold alignment. The proposed measure is a normalized conditional entropy, which induces a positive definite kernel when symmetrized. We use bypass entropy estimation methods to compute an approximation of the normalized conditional entropy. Our approach, which is purely topological (i.e., it does not rely on node or edge attributes although it can potentially accommodate them as additional sources of information), is competitive with state-of-the-art graph matching algorithms as sources of correspondence-based graph similarity, but its complexity is linear instead of cubic (although the complexity of the similarity measure is quadratic). We also determine that the best embedding strategy for graph similarity is provided by commute time embedding, and we conjecture that this is related to its invertibility property, since the inverse of the embeddings obtained using our method can be used as a generative sampler of graph structure.
The work of the first and third authors was supported by the projects TIN2012-32839 and TIN2015-69077-P of the Spanish Government. The work of the second author was supported by a Royal Society Wolfson Research Merit Award.
On some entropy functionals derived from Rényi information divergence
We consider the maximum entropy problems associated with Rényi entropy,
subject to two kinds of constraints on expected values. The constraints
considered are a constraint on the standard expectation, and a constraint on
the generalized expectation as encountered in nonextensive statistics. The
optimum maximum entropy probability distributions, which can exhibit a
power-law behaviour, are derived and characterized. The Rényi entropy of the
optimum distributions can be viewed as a function of the constraint. This
defines two families of entropy functionals in the space of possible expected
values. General properties of these functionals, including nonnegativity,
minimum, and convexity, are documented. Their relationships as well as numerical
aspects are also discussed. Finally, we work out some specific cases for the
reference measure and recover in a limit case some well-known entropies.
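As a small illustration of how a limit case can recover a familiar entropy, the sketch below evaluates the Rényi entropy of a discrete distribution and checks numerically that it approaches the Shannon entropy as the order tends to 1. It is not taken from the paper: the order parameter is called alpha here as an assumption, the reference measure is taken to be the counting measure, and the function names are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha (alpha > 0) in nats.

    Uses H_alpha(p) = log(sum_i p_i ** alpha) / (1 - alpha) for alpha != 1
    and falls back to the Shannon entropy at alpha = 1, which is its limit.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if abs(alpha - 1.0) < 1e-12:
        return shannon_entropy(probs)
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


dist = [0.5, 0.25, 0.25]
for order in (0.5, 0.999, 1.0, 1.001, 2.0):
    print(order, renyi_entropy(dist, order))
```

The printed values decrease as the order grows, and those for orders close to 1 are nearly identical to the Shannon value.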
Convergence Rates for Empirical Estimation of Binary Classification Bounds
Bounding the best achievable error probability for binary classification problems is relevant to many applications including machine learning, signal processing, and information theory. Many bounds on the Bayes binary classification error rate depend on information divergences between the pair of class distributions. Recently, the Henze–Penrose (HP) divergence has been proposed for bounding classification error probability. We consider the problem of empirically estimating the HP-divergence from random samples. We derive a bound on the convergence rate for the Friedman–Rafsky (FR) estimator of the HP-divergence, which is related to a multivariate runs statistic for testing between two distributions. The FR estimator is derived from a multicolored Euclidean minimal spanning tree (MST) that spans the merged samples. We obtain a concentration inequality for the Friedman–Rafsky estimator of the Henze–Penrose divergence. We validate our results experimentally and illustrate their application to real datasets.
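The MST construction described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it builds the Euclidean MST of the merged samples with a simple O(N^2) version of Prim's algorithm, counts the edges that join points from different samples, and plugs that count R into the commonly cited form 1 - R(m + n)/(2mn), clipped at zero. The function names and the clipping are choices made here.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) index pairs."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_parent = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the closest vertex that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges


def fr_hp_divergence(sample_x, sample_y):
    """Return (estimate, cross_count) for the MST-based HP-divergence estimate."""
    m, n = len(sample_x), len(sample_y)
    merged = list(sample_x) + list(sample_y)
    labels = [0] * m + [1] * n
    # Count MST edges whose endpoints come from different samples.
    cross = sum(1 for i, j in mst_edges(merged) if labels[i] != labels[j])
    estimate = 1.0 - cross * (m + n) / (2.0 * m * n)
    return max(0.0, estimate), cross


# Two well-separated unit squares: exactly one MST edge bridges the samples.
xs = [(0, 0), (0, 1), (1, 0), (1, 1)]
ys = [(10, 10), (10, 11), (11, 10), (11, 11)]
print(fr_hp_divergence(xs, ys))
```

Well-separated samples produce few cross edges and an estimate near 1, while heavily interleaved samples produce many cross edges and an estimate near 0.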