6,120 research outputs found

    Compact Random Feature Maps

    Full text link
    Kernel approximation using randomized feature maps has recently gained a lot of interest. In this work, we identify that previous approaches for polynomial kernel approximation create maps that are rank deficient, and therefore do not utilize the capacity of the projected feature space effectively. To address this challenge, we propose compact random feature maps (CRAFTMaps) to approximate polynomial kernels more concisely and accurately. We prove the error bounds of CRAFTMaps demonstrating their superior kernel reconstruction performance compared to the previous approximation schemes. We show how structured random matrices can be used to efficiently generate CRAFTMaps, and present a single-pass algorithm using CRAFTMaps to learn non-linear multi-class classifiers. We present experiments on multiple standard data-sets with performance competitive with state-of-the-art results.Comment: 9 page

    On Recursive Edit Distance Kernels with Application to Time Series Classification

    Get PDF
    This paper proposes some extensions to the work on kernels dedicated to string or time series global alignment based on the aggregation of scores obtained by local alignments. The extensions we propose allow to construct, from classical recursive definition of elastic distances, recursive edit distance (or time-warp) kernels that are positive definite if some sufficient conditions are satisfied. The sufficient conditions we end-up with are original and weaker than those proposed in earlier works, although a recursive regularizing term is required to get the proof of the positive definiteness as a direct consequence of the Haussler's convolution theorem. The classification experiment we conducted on three classical time warp distances (two of which being metrics), using Support Vector Machine classifier, leads to conclude that, when the pairwise distance matrix obtained from the training data is \textit{far} from definiteness, the positive definite recursive elastic kernels outperform in general the distance substituting kernels for the classical elastic distances we have tested.Comment: 14 page

    A tree-based kernel for graphs with continuous attributes

    Full text link
    The availability of graph data with node attributes that can be either discrete or real-valued is constantly increasing. While existing kernel methods are effective techniques for dealing with graphs having discrete node labels, their adaptation to non-discrete or continuous node attributes has been limited, mainly for computational issues. Recently, a few kernels especially tailored for this domain, and that trade predictive performance for computational efficiency, have been proposed. In this paper, we propose a graph kernel for complex and continuous nodes' attributes, whose features are tree structures extracted from specific graph visits. The kernel manages to keep the same complexity of state-of-the-art kernels while implicitly using a larger feature space. We further present an approximated variant of the kernel which reduces its complexity significantly. Experimental results obtained on six real-world datasets show that the kernel is the best performing one on most of them. Moreover, in most cases the approximated version reaches comparable performances to current state-of-the-art kernels in terms of classification accuracy while greatly shortening the running times.Comment: This work has been submitted to the IEEE Transactions on Neural Networks and Learning Systems for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessibl

    Toward a multilevel representation of protein molecules: comparative approaches to the aggregation/folding propensity problem

    Full text link
    This paper builds upon the fundamental work of Niwa et al. [34], which provides the unique possibility to analyze the relative aggregation/folding propensity of the elements of the entire Escherichia coli (E. coli) proteome in a cell-free standardized microenvironment. The hardness of the problem comes from the superposition between the driving forces of intra- and inter-molecule interactions and it is mirrored by the evidences of shift from folding to aggregation phenotypes by single-point mutations [10]. Here we apply several state-of-the-art classification methods coming from the field of structural pattern recognition, with the aim to compare different representations of the same proteins gathered from the Niwa et al. data base; such representations include sequences and labeled (contact) graphs enriched with chemico-physical attributes. By this comparison, we are able to identify also some interesting general properties of proteins. Notably, (i) we suggest a threshold around 250 residues discriminating "easily foldable" from "hardly foldable" molecules consistent with other independent experiments, and (ii) we highlight the relevance of contact graph spectra for folding behavior discrimination and characterization of the E. coli solubility data. The soundness of the experimental results presented in this paper is proved by the statistically relevant relationships discovered among the chemico-physical description of proteins and the developed cost matrix of substitution used in the various discrimination systems.Comment: 17 pages, 3 figures, 46 reference
    • …
    corecore