63 research outputs found

    Linear-time algorithms for the subpath kernel

    The subpath kernel is a useful positive definite kernel, which takes arbitrary rooted trees as input, no matter whether they are ordered or unordered. We first show that the subpath kernel can exhibit excellent classification performance in combination with SVM through an intensive experiment. Secondly, we develop a theory of irreducible trees, and then, using it as a rigid mathematical basis, reconstruct a bottom-up linear-time algorithm for the subtree kernel, which is a correction of an algorithm well-known in the literature. Thirdly, we show a novel top-down algorithm, with which we can realize a linear-time parallel-computing algorithm to compute the subpath kernel.

    A subpath kernel for learning hierarchical image representations

    Tree kernels have demonstrated their ability to deal with hierarchical data, as the intrinsic tree structure often plays a discriminative role. While such kernels have been successfully applied to various domains such as natural language processing and bioinformatics, they mostly concentrate on ordered trees whose nodes are described by symbolic data. Meanwhile, hierarchical representations have gained increasing interest to describe image content. This is particularly true in remote sensing, where such representations allow for revealing different objects of interest at various scales through a tree structure. However, the induced trees are unordered and the nodes are equipped with numerical features. In this paper, we propose a new structured kernel for hierarchical image representations which is built on the concept of subpath kernel. Experimental results on both artificial and remote sensing datasets show that the proposed kernel manages to deal with the hierarchical nature of the data, leading to better classification rates.

    The Weight Function in the Subtree Kernel is Decisive

    Tree data are ubiquitous because they model a large variety of situations, e.g., the architecture of plants, the secondary structure of RNA, or the hierarchy of XML files. Nevertheless, the analysis of these non-Euclidean data is difficult per se. In this paper, we focus on the subtree kernel, a convolution kernel for tree data introduced by Vishwanathan and Smola in the early 2000s. More precisely, we investigate the influence of the weight function from a theoretical perspective and in real data applications. We establish on a two-class stochastic model that the performance of the subtree kernel is improved when the weight of leaves vanishes, which motivates the definition of a new weight function, learned from the data and not fixed by the user as usually done. To this end, we define a unified framework for computing the subtree kernel from ordered or unordered trees, which is particularly suitable for tuning parameters. We show through eight real data classification problems the great efficiency of our approach, in particular for small datasets, which also confirms the high importance of the weight function. Finally, a visualization tool of the significant features is derived. Comment: 36 pages

    The Weight Function in the Subtree Kernel is Decisive

    Tree data are ubiquitous because they model a large variety of situations, e.g., the architecture of plants, the secondary structure of RNA, or the hierarchy of XML files. Nevertheless, the analysis of these non-Euclidean data is difficult per se. In this paper, we focus on the subtree kernel, a convolution kernel for tree data introduced by Vishwanathan and Smola in the early 2000s. More precisely, we investigate the influence of the weight function from a theoretical perspective and in real data applications. We establish on a two-class stochastic model that the performance of the subtree kernel is improved when the weight of leaves vanishes, which motivates the definition of a new weight function, learned from the data and not fixed by the user as usually done. To this end, we define a unified framework for computing the subtree kernel from ordered or unordered trees, which is particularly suitable for tuning parameters. We show through two real data classification problems the great efficiency of our approach, in particular with respect to the ones considered in the literature, which also confirms the high importance of the weight function. Finally, a visualization tool of the significant features is derived. Comment: 28 pages

    Tree Echo State Networks

    In this paper we present the Tree Echo State Network (TreeESN) model, generalizing the paradigm of Reservoir Computing to tree structured data. TreeESNs exploit an untrained generalized recursive reservoir, exhibiting extreme efficiency for learning in structured domains. In addition, we highlight throughout the paper other characteristics of the approach: First, we discuss the Markovian characterization of reservoir dynamics, extended to the case of tree domains, that is implied by the contractive setting of the TreeESN state transition function. Second, we study two types of state mapping functions to map the tree structured state of TreeESN into a fixed-size feature representation for classification or regression tasks. The critical role of the relation between the choice of the state mapping function and the Markovian characterization of the task is analyzed and experimentally investigated on both artificial and real-world tasks. Finally, experimental results on benchmark and real-world tasks show that the TreeESN approach, in spite of its efficiency, can achieve results comparable with state-of-the-art, although more complex, neural and kernel based models for tree structured data.

    Hitting and Harvesting Pumpkins

    The "c-pumpkin" is the graph with two vertices linked by c>0 parallel edges. A c-pumpkin-model in a graph G is a pair A,B of disjoint subsets of vertices of G, each inducing a connected subgraph of G, such that there are at least c edges in G between A and B. We focus on covering and packing c-pumpkin-models in a given graph: On the one hand, we provide an FPT algorithm running in time $2^{O(k)} n^{O(1)}$ deciding, for any fixed c>0, whether all c-pumpkin-models can be covered by at most k vertices. This generalizes known single-exponential FPT algorithms for Vertex Cover and Feedback Vertex Set, which correspond to the cases c=1,2 respectively. On the other hand, we present an O(log n)-approximation algorithm for both the problems of covering all c-pumpkin-models with a smallest number of vertices, and packing a maximum number of vertex-disjoint c-pumpkin-models. Comment: v2: several minor changes
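    The definition of a c-pumpkin-model in the abstract above can be checked directly. The following is a minimal sketch, not the paper's FPT algorithm: it only verifies whether a given pair A,B satisfies the stated conditions. The function names (`is_connected`, `is_pumpkin_model`) and the edge-list representation are assumptions made for illustration.

```python
from collections import deque


def is_connected(adj, nodes):
    """Return True if `nodes` is non-empty and induces a connected subgraph."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        # Only follow edges that stay inside the candidate vertex set.
        for v in adj.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == nodes


def is_pumpkin_model(edges, a, b, c):
    """Check whether (a, b) is a c-pumpkin-model of the graph given by `edges`.

    `edges` is a list of (u, v) pairs (hypothetical representation); repeated
    pairs are counted as parallel edges.
    """
    a, b = set(a), set(b)
    if a & b:  # A and B must be disjoint
        return False
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not (is_connected(adj, a) and is_connected(adj, b)):
        return False
    # Count edges with one endpoint in A and the other in B.
    crossing = sum(
        1 for u, v in edges if (u in a and v in b) or (u in b and v in a)
    )
    return crossing >= c


triangle = [(1, 2), (2, 3), (1, 3)]
print(is_pumpkin_model(triangle, {1}, {2, 3}, 2))
```

    On a triangle, the pair A={1}, B={2,3} has two edges between A and B, which matches the abstract's remark that the case c=2 corresponds to Feedback Vertex Set (every cycle yields a 2-pumpkin-model), while any single edge yields a 1-pumpkin-model as in Vertex Cover.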

    Spotting Trees with Few Leaves

    We show two results related to the Hamiltonicity and $k$-Path algorithms in undirected graphs by Björklund [FOCS'10], and Björklund et al. [arXiv'10]. First, we demonstrate that the technique used can be generalized to finding some $k$-vertex tree with $l$ leaves in an $n$-vertex undirected graph in $O^*(1.657^k 2^{l/2})$ time. It can be applied as a subroutine to solve the $k$-Internal Spanning Tree ($k$-IST) problem in $O^*(\min(3.455^k, 1.946^n))$ time using polynomial space, improving upon previous algorithms for this problem. In particular, for the first time we break the natural barrier of $O^*(2^n)$. Second, we show that the iterated random bipartition employed by the algorithm can be improved whenever the host graph admits a vertex coloring with few colors; it can be an ordinary proper vertex coloring, a fractional vertex coloring, or a vector coloring. In effect, we show improved bounds for $k$-Path and Hamiltonicity in any graph of maximum degree $\Delta=4,\ldots,12$ or with vector chromatic number at most 8.

    Distributed Smoothed Tree Kernel

    In this paper we explore the possibility of merging the world of Compositional Distributional Semantic Models (CDSM) with Tree Kernels (TK). In particular, we will introduce a specific tree kernel (smoothed tree kernel, or STK) and then show that it is possible to approximate such a kernel with the dot product of two vectors obtained compositionally from the sentences, creating in such a way a new CDSM.