63 research outputs found
Linear-time algorithms for the subpath kernel
The subpath kernel is a useful positive definite kernel that takes arbitrary rooted trees as input, whether they are ordered or unordered. We first show through intensive experiments that the subpath kernel can exhibit excellent classification performance in combination with SVM. Secondly, we develop a theory of irreducible trees and, using it as a rigorous mathematical basis, reconstruct a bottom-up linear-time algorithm for the subtree kernel, which corrects an algorithm well known in the literature. Thirdly, we present a novel top-down algorithm, with which we can realize a linear-time parallel-computing algorithm to compute the subpath kernel.
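To make the object of this abstract concrete, here is a minimal Python sketch of a naive subpath kernel. It assumes the common unweighted definition (the kernel counts pairs of identical label sequences along downward paths), which the abstract itself does not spell out. The `(label, children)` tree encoding and the function names are illustrative choices, and the enumeration is quadratic in the worst case, so this is not the linear-time algorithm the paper describes.

```python
from collections import Counter

def subpath_counts(tree):
    """Count the label sequences of all downward paths in a rooted tree.

    A tree is a (label, children) pair, where children is a list of trees.
    This encoding is an assumption made for the sketch.
    """
    counts = Counter()

    def visit(node, ancestors):
        label, children = node
        path = ancestors + [label]
        # Every suffix of the root-to-node path is a subpath ending at this node.
        for start in range(len(path)):
            counts[tuple(path[start:])] += 1
        for child in children:
            visit(child, path)

    visit(tree, [])
    return counts

def subpath_kernel(t1, t2):
    """Naive unweighted subpath kernel: sum of products of subpath counts."""
    c1, c2 = subpath_counts(t1), subpath_counts(t2)
    return sum(n * c2[p] for p, n in c1.items())

t1 = ("a", [("b", []), ("b", [])])
t2 = ("a", [("b", [])])
print(subpath_kernel(t1, t2))
```

Because only vertical label sequences are counted, the value does not depend on the order of children, which is why the same function applies to ordered and unordered trees.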
A subpath kernel for learning hierarchical image representations
Tree kernels have demonstrated their ability to deal with hierarchical data, as the intrinsic tree structure often plays a discriminative role. While such kernels have been successfully applied to various domains such as natural language processing and bioinformatics, they mostly concentrate on ordered trees whose nodes are described by symbolic data. Meanwhile, hierarchical representations have gained increasing interest to describe image content. This is particularly true in remote sensing, where such representations allow for revealing different objects of interest at various scales through a tree structure. However, the induced trees are unordered and the nodes are equipped with numerical features. In this paper, we propose a new structured kernel for hierarchical image representations which is built on the concept of subpath kernel. Experimental results on both artificial and remote sensing datasets show that the proposed kernel manages to deal with the hierarchical nature of the data, leading to better classification rates.
The Weight Function in the Subtree Kernel is Decisive
Tree data are ubiquitous because they model a large variety of situations,
e.g., the architecture of plants, the secondary structure of RNA, or the
hierarchy of XML files. Nevertheless, the analysis of these non-Euclidean data
is difficult per se. In this paper, we focus on the subtree kernel that is a
convolution kernel for tree data introduced by Vishwanathan and Smola in the
early 2000's. More precisely, we investigate the influence of the weight
function from a theoretical perspective and in real data applications. We
establish on a two-class stochastic model that the performance of the subtree
kernel is improved when the weight of leaves vanishes, which motivates the
definition of a new weight function, learned from the data and not fixed by the
user as usually done. To this end, we define a unified framework for computing
the subtree kernel from ordered or unordered trees, that is particularly
suitable for tuning parameters. We show through eight real data classification
problems the great efficiency of our approach, in particular for small
datasets, which also states the high importance of the weight function.
Finally, a visualization tool of the significant features is derived. Comment: 36 pages
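The role of the weight function can be illustrated with a small Python sketch. It assumes the usual form of the subtree kernel, a weighted sum over complete subtrees of the product of their occurrence counts in the two trees; the abstract does not state the formula, so this form, the `(label, children)` encoding, and the two example weight functions are assumptions made for illustration, not the authors' framework.

```python
from collections import Counter

def subtree_counts(tree):
    """Count complete subtrees (a node with all its descendants) by canonical key.

    A tree is a (label, children) pair; labels are assumed to be comparable.
    """
    counts = Counter()

    def canon(node):
        label, children = node
        # Sorting the child keys makes the key independent of child order,
        # so the same code handles unordered trees.
        key = (label, tuple(sorted(canon(c) for c in children)))
        counts[key] += 1
        return key

    canon(tree)
    return counts

def height(key):
    """Height of the subtree encoded by a canonical key (a leaf has height 0)."""
    _, children = key
    return 0 if not children else 1 + max(height(c) for c in children)

def subtree_kernel(t1, t2, weight):
    """Weighted sum over shared complete subtrees of the product of their counts."""
    c1, c2 = subtree_counts(t1), subtree_counts(t2)
    return sum(weight(k) * n * c2[k] for k, n in c1.items())

# Two hypothetical weight functions: uniform, and one where leaves get weight zero.
uniform = lambda key: 1.0
no_leaves = lambda key: 0.0 if height(key) == 0 else 1.0

t1 = ("a", [("b", [("c", [])]), ("c", [])])
t2 = ("a", [("c", []), ("b", [("c", [])])])
print(subtree_kernel(t1, t2, uniform), subtree_kernel(t1, t2, no_leaves))
```

With the uniform weight, the two matching leaves dominate the value; setting the leaf weight to zero keeps only the contribution of larger shared structures, which is the kind of effect the abstract attributes to the weight function.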
The Weight Function in the Subtree Kernel is Decisive
Tree data are ubiquitous because they model a large variety of situations,
e.g., the architecture of plants, the secondary structure of RNA, or the
hierarchy of XML files. Nevertheless, the analysis of these non-Euclidean data
is difficult per se. In this paper, we focus on the subtree kernel that is a
convolution kernel for tree data introduced by Vishwanathan and Smola in the
early 2000's. More precisely, we investigate the influence of the weight
function from a theoretical perspective and in real data applications. We
establish on a two-class stochastic model that the performance of the subtree
kernel is improved when the weight of leaves vanishes, which motivates the
definition of a new weight function, learned from the data and not fixed by the
user as usually done. To this end, we define a unified framework for computing
the subtree kernel from ordered or unordered trees, that is particularly
suitable for tuning parameters. We show through two real data classification
problems the great efficiency of our approach, in particular with respect to
the ones considered in the literature, which also states the high importance of
the weight function. Finally, a visualization tool of the significant features
is derived. Comment: 28 pages
Tree Echo State Networks
In this paper we present the Tree Echo State Network (TreeESN) model, generalizing the paradigm of Reservoir Computing to tree structured data. TreeESNs exploit an untrained generalized recursive reservoir, exhibiting extreme efficiency for learning in structured domains. In addition, we highlight throughout the paper other characteristics of the approach: First, we discuss the Markovian characterization of reservoir dynamics, extended to the case of tree domains, that is implied by the contractive setting of the TreeESN state transition function. Second, we study two types of state mapping functions to map the tree structured state of TreeESN into a fixed-size feature representation for classification or regression tasks. The critical role of the relation between the choice of the state mapping function and the Markovian characterization of the task is analyzed and experimentally investigated on both artificial and real-world tasks. Finally, experimental results on benchmark and real-world tasks show that the TreeESN approach, in spite of its efficiency, can achieve comparable results with state-of-the-art, although more complex, neural and kernel based models for tree structured data.
Hitting and Harvesting Pumpkins
The "c-pumpkin" is the graph with two vertices linked by c>0 parallel edges.
A c-pumpkin-model in a graph G is a pair A,B of disjoint subsets of vertices of
G, each inducing a connected subgraph of G, such that there are at least c
edges in G between A and B. We focus on covering and packing c-pumpkin-models
in a given graph: On the one hand, we provide an FPT algorithm running in time
2^O(k) n^O(1) deciding, for any fixed c>0, whether all c-pumpkin-models can be
covered by at most k vertices. This generalizes known single-exponential FPT
algorithms for Vertex Cover and Feedback Vertex Set, which correspond to the
cases c=1,2 respectively. On the other hand, we present an O(log
n)-approximation algorithm for both the problems of covering all
c-pumpkin-models with a smallest number of vertices, and packing a maximum
number of vertex-disjoint c-pumpkin-models. Comment: v2: several minor changes
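The definition of a c-pumpkin-model given in this abstract can be checked directly. The Python sketch below follows that definition: A and B are disjoint, each induces a connected subgraph, and at least c edges run between them. The edge-list encoding (with parallel edges given as repeated pairs) and the function names are assumptions for illustration; this only verifies a given pair A, B and is unrelated to the FPT or approximation algorithms of the paper.

```python
from collections import deque

def is_connected(vertices, edges):
    """True if `vertices` is non-empty and induces a connected subgraph."""
    vertices = set(vertices)
    if not vertices:
        return False
    adj = {v: set() for v in vertices}
    for u, v in edges:
        # Keep only edges with both endpoints inside the vertex set.
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u] - seen:
            seen.add(w)
            queue.append(w)
    return seen == vertices

def is_c_pumpkin_model(edges, a, b, c):
    """Check whether the pair (a, b) is a c-pumpkin-model in the given graph."""
    a, b = set(a), set(b)
    if a & b:
        return False
    if not (is_connected(a, edges) and is_connected(b, edges)):
        return False
    # Count the edges with one endpoint in a and the other in b.
    crossing = sum(1 for u, v in edges
                   if (u in a and v in b) or (u in b and v in a))
    return crossing >= c

triangle = [(1, 2), (2, 3), (1, 3)]
print(is_c_pumpkin_model(triangle, {1}, {2, 3}, 2))
```

The triangle example reflects the remark about c=2: any cycle contains a 2-pumpkin-model, so covering all of them with vertices amounts to Feedback Vertex Set.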
Spotting Trees with Few Leaves
We show two results related to the Hamiltonicity and k-Path algorithms in
undirected graphs by Björklund [FOCS'10], and Björklund et al. [arXiv'10].
First, we demonstrate that the technique used can be generalized to finding
some k-vertex tree with l leaves in an n-vertex undirected graph in
[...] time. It can be applied as a subroutine to solve the
k-Internal Spanning Tree (k-IST) problem in [...]
time using polynomial space, improving upon previous algorithms for this
problem. In particular, for the first time we break the natural barrier of
[...]. Second, we show that the iterated random bipartition employed by
the algorithm can be improved whenever the host graph admits a vertex coloring
with few colors; it can be an ordinary proper vertex coloring, a fractional
vertex coloring, or a vector coloring. In effect, we show improved bounds for
k-Path and Hamiltonicity in any graph of maximum degree [...]
or with vector chromatic number at most 8.
Distributed Smoothed Tree Kernel
In this paper we explore the possibility of merging the world of Compositional Distributional Semantic Models (CDSM) with Tree Kernels (TK). In particular, we introduce a specific tree kernel (smoothed tree kernel, or STK) and then show that it is possible to approximate this kernel with the dot product of two vectors obtained compositionally from the sentences, thereby creating a new CDSM.