
    A novel Boolean kernels family for categorical data

    Kernel-based classifiers, such as SVMs, are considered state-of-the-art algorithms and are widely used for many classification tasks. However, such methods are hardly interpretable, and for this reason they are often treated as black-box models. In this paper, we propose a new family of Boolean kernels for categorical data in which features correspond to propositional formulas applied to the input variables. The idea is to create human-readable features that ease the extraction of interpretation rules directly from the embedding space. Experiments on artificial and benchmark datasets show the effectiveness of the proposed family of kernels with respect to established ones, such as the RBF kernel, in terms of classification accuracy.
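    A minimal sketch of the general idea, not necessarily the exact kernel family proposed in the paper: a degree-d conjunctive Boolean kernel over one-hot encoded categorical data, where the dot product counts the categorical values on which two examples agree and the binomial coefficient counts the d-literal conjunctions satisfied by both.

```python
# Sketch only: conjunctive Boolean kernel over one-hot encoded categorical data.
# K_d(x, y) = C(<x, y>, d), where <x, y> is the number of agreeing categorical values.
from math import comb
import numpy as np

def one_hot(X, cardinalities):
    """One-hot encode an integer-coded categorical matrix of shape (n_samples, n_features)."""
    n = X.shape[0]
    Z = np.zeros((n, sum(cardinalities)), dtype=int)
    offset = 0
    for j, card in enumerate(cardinalities):
        Z[np.arange(n), offset + X[:, j]] = 1
        offset += card
    return Z

def conjunctive_kernel(Z1, Z2, d=2):
    """Number of d-literal conjunctions satisfied by both examples."""
    agreements = Z1 @ Z2.T
    return np.vectorize(lambda s: comb(int(s), d))(agreements)

# Usage with a precomputed-kernel SVM (assuming scikit-learn is available):
# from sklearn.svm import SVC
# K = conjunctive_kernel(Z_train, Z_train, d=2)
# clf = SVC(kernel="precomputed").fit(K, y_train)
```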

    Efficiency versus Convergence of Boolean Kernels for On-Line Learning Algorithms

    The paper studies machine learning problems where each example is described using a set of Boolean features and where hypotheses are represented by linear threshold elements. One method of increasing the expressiveness of learned hypotheses in this context is to expand the feature set to include conjunctions of basic features. This can be done explicitly or, where possible, by using a kernel function. Focusing on the well-known Perceptron and Winnow algorithms, the paper demonstrates a tradeoff between the computational efficiency with which the algorithm can be run over the expanded feature space and the generalization ability of the corresponding learning algorithm. We first describe several kernel functions which capture either limited forms of conjunctions or all conjunctions. We show that these kernels can be used to efficiently run the Perceptron algorithm over a feature space of exponentially many conjunctions; however, we also show that using such kernels, the Perceptron algorithm can provably make an exponential number of mistakes even when learning simple functions. We then consider the question of whether kernel functions can analogously be used to run the multiplicative-update Winnow algorithm over an expanded feature space of exponentially many conjunctions. Known upper bounds imply that the Winnow algorithm can learn Disjunctive Normal Form (DNF) formulae with a polynomial mistake bound in this setting. However, we prove that it is computationally hard to simulate Winnow's behavior for learning DNF over such a feature set. This implies that the kernel functions which correspond to running Winnow for this problem are not efficiently computable, and that there is no general construction that can run Winnow with kernels.
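    A sketch of the standard all-conjunctions kernel construction studied in this line of work (an illustration, not the authors' code): for Boolean vectors, the number of conjunctions over literals x_i / NOT x_i that are true on both x and y equals 2^same(x, y), where same(x, y) counts the coordinates on which the two vectors agree. Plugging this kernel into the kernel Perceptron implicitly runs the Perceptron over the exponentially large feature space of all conjunctions.

```python
import numpy as np

def all_conjunctions_kernel(x, y):
    """Number of conjunctions (over literals and their negations) true on both x and y."""
    return 2.0 ** np.sum(x == y)

def kernel_perceptron(X, y, epochs=5, kernel=all_conjunctions_kernel):
    """Standard kernel Perceptron: keep the examples on which mistakes were made.
    Labels are assumed to be in {-1, +1}."""
    support, labels = [], []
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(li * kernel(si, xi) for si, li in zip(support, labels))
            if yi * score <= 0:          # mistake: add the example to the expansion
                support.append(xi)
                labels.append(yi)
    return support, labels

def predict(support, labels, x, kernel=all_conjunctions_kernel):
    score = sum(li * kernel(si, x) for si, li in zip(support, labels))
    return 1 if score > 0 else -1
```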

    Support Vector Methods for Higher-Level Event Extraction in Point Data

    Phenomena occur both in space and time. Correspondingly, the ability to model spatiotemporal behavior translates into the ability to model phenomena as they occur in reality. Given the complexity inherent in integrating spatial and temporal dimensions, however, the establishment of computational methods for spatiotemporal analysis has proven relatively elusive. Nonetheless, one method, the spatiotemporal helix, has emerged from the field of video processing. Designed to efficiently summarize and query the deformation and movement of spatiotemporal events, the spatiotemporal helix has been demonstrated as capable of describing and differentiating the evolution of hurricanes from sequences of images. Being derived from image data, the representations of events for which the spatiotemporal helix was originally created appear in areal form (e.g., a hurricane covering several square miles is represented by groups of pixels). Many sources of spatiotemporal data, however, are not in areal form and instead appear as points. Examples of spatiotemporal point data include those from an epidemiologist recording the time and location of cases of disease, and environmental observations collected by a geosensor at the point of its location. As points, these data cannot be directly incorporated into the spatiotemporal helix for analysis. Moreover, because the analytic potential of raw clouds of point data is limited, phenomena represented by point data are often described in terms of events. Defined as change units localized in space and time, the concept of events allows for analysis at multiple levels. For instance, lower-level events refer to occurrences of interest described by single data streams at point locations (e.g., an individual case of a certain disease or a significant change in chemical concentration in the environment), while higher-level events describe occurrences of interest derived from aggregations of lower-level events and are frequently described in areal form (e.g., a disease cluster or a pollution cloud). Considering that these higher-level events appear in areal form, they could potentially be incorporated into the spatiotemporal helix. Since deformation is an important element of spatiotemporal analysis, however, the crux of a point-data-based process for spatiotemporal analysis is the accurate translation of lower-level event points into representations of higher-level areal events. A limitation of current techniques for the derivation of higher-level events is that they impose a priori bias regarding the shape of higher-level events (e.g., elliptical, convex, linear), which can limit the description of the deformation of higher-level events over time. The objective of this research is to propose two newly developed kernel methods, support vector clustering (SVC) and support vector machines (SVMs), as means for translating lower-level event points into higher-level event areas that follow the distribution of lower-level points. SVC is suggested for the derivation of higher-level events arising in point process data, while SVMs are explored for their potential with scalar field data (i.e., spatially continuous real-valued data). Developed in the field of machine learning to solve complex non-linear problems, both of these methods are capable of producing highly non-linear representations of higher-level events that may be more suitable than existing methods for spatiotemporal analysis of deformation.
To introduce these methods, this thesis is organized so that a context for these methods is first established through a description of existing techniques. This discussion leads to a technical explanation of the mechanics of SVC and SVMs and to the implementation of each of the kernel methods on simulated datasets. Results from these simulations inform discussion regarding the application potential of SVC and SVMs.
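    An illustrative sketch only, not the thesis implementation: support vector clustering begins from a one-class SVM / SVDD boundary, so scikit-learn's OneClassSVM (assumed available here) with an RBF kernel can show how a kernel method turns a cloud of lower-level event points into a higher-level event area that follows the point distribution rather than a predefined (e.g., elliptical or convex) shape.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Hypothetical lower-level events: two lobes of point locations (x, y),
# e.g., individual disease cases.
points = np.vstack([rng.normal([0.0, 0.0], 0.3, (100, 2)),
                    rng.normal([1.5, 0.5], 0.4, (100, 2))])

# nu bounds the fraction of points left outside the area; gamma sets boundary detail.
model = OneClassSVM(kernel="rbf", nu=0.05, gamma=2.0).fit(points)

# Rasterize the learned region: grid cells with non-negative decision values form
# the higher-level event area, which could then feed a spatiotemporal representation.
xx, yy = np.meshgrid(np.linspace(-1.5, 3.0, 200), np.linspace(-1.5, 2.0, 200))
grid = np.c_[xx.ravel(), yy.ravel()]
inside = model.decision_function(grid).reshape(xx.shape) >= 0
print("fraction of grid covered by the event area:", inside.mean())
```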

    Sparse Learning over Infinite Subgraph Features

    We present a supervised learning algorithm for graph data (a set of graphs) that handles arbitrary twice-differentiable loss functions and sparse linear models over all possible subgraph features. To date, it has been shown that several types of sparse learning over all possible subgraph features, such as AdaBoost, LPBoost, LARS/LASSO, and sparse PLS regression, can be performed. Particular emphasis is placed on simultaneous learning of relevant features from an infinite set of candidates. We first generalize techniques used in all these preceding studies to derive a unifying bounding technique for arbitrary separable functions. We then carefully use this bound to make block coordinate gradient descent feasible over infinite subgraph features, resulting in a fast-converging algorithm that can solve a wider class of sparse learning problems over graph data. We also empirically study the differences from existing approaches in convergence behavior, selected subgraph features, and search-space sizes. We further discuss several previously unnoticed issues in sparse learning over all possible subgraph features.
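    A toy stand-in for the working-set (column-generation) pattern behind such methods, under a large simplifying assumption: the infinite subgraph-feature search is replaced by an explicit finite feature matrix, and the "mining" step simply picks the column with the largest absolute gradient. Selected columns are then fit by coordinate descent with L1 soft-thresholding, while all other coefficients stay exactly zero. This is only a schematic of the optimization pattern, not the paper's algorithm or its bounding technique.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def working_set_lasso(X, y, lam=0.1, outer=20, inner=50):
    """Lasso via a working set: repeatedly add the most violating feature, then
    run coordinate descent restricted to the active set."""
    n, p = X.shape
    w = np.zeros(p)
    active = set()
    for _ in range(outer):
        grad = X.T @ (X @ w - y) / n          # gradient of (1/2n) * ||Xw - y||^2
        j = int(np.argmax(np.abs(grad)))      # stand-in for the subgraph search
        if np.abs(grad[j]) <= lam:
            break                             # no feature violates optimality
        active.add(j)
        for _ in range(inner):                # coordinate descent on the active set
            for k in active:
                r = y - X @ w + X[:, k] * w[k]            # partial residual
                rho = X[:, k] @ r / n
                w[k] = soft_threshold(rho, lam) / (X[:, k] @ X[:, k] / n)
    return w
```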

    Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications

    We present Chameleon, a novel hybrid (mixed-protocol) framework for secure function evaluation (SFE) which enables two parties to jointly compute a function without disclosing their private inputs. Chameleon combines the best aspects of generic SFE protocols with those of protocols based on additive secret sharing. In particular, the framework performs linear operations in the ring $\mathbb{Z}_{2^l}$ using additively secret-shared values and nonlinear operations using Yao's Garbled Circuits or the Goldreich-Micali-Wigderson protocol. Chameleon departs from the common assumption of additive or linear secret-sharing models where three or more parties need to communicate in the online phase: the framework allows two parties with private inputs to communicate in the online phase under the assumption of a third node generating correlated randomness in an offline phase. Almost all of the heavy cryptographic operations are precomputed in the offline phase, which substantially reduces the communication overhead. Chameleon is both scalable and significantly more efficient than the ABY framework (NDSS'15) on which it is based. Our framework supports signed fixed-point numbers. In particular, Chameleon's vector dot product of signed fixed-point numbers improves the efficiency of mining and classification of encrypted data for algorithms based upon heavy matrix multiplications. Our evaluation of Chameleon on a 5-layer convolutional deep neural network shows 133x and 4.2x faster executions than Microsoft CryptoNets (ICML'16) and MiniONN (CCS'17), respectively.
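    A toy, deliberately insecure sketch (not the Chameleon implementation) of the arithmetic layer described above: additive secret sharing in Z_{2^l}, linear operations performed locally on shares, and one multiplication using a Beaver triple of the kind a third party would deal as correlated randomness in the offline phase. Signed fixed-point values are encoded as ring elements scaled by 2^F; all names and parameters here are illustrative.

```python
import secrets

L, F = 64, 16
MOD = 1 << L

def encode(x):                       # signed fixed-point -> ring element
    return int(round(x * (1 << F))) % MOD

def to_signed(v):                    # interpret a ring element as a signed integer
    return v - MOD if v >= MOD // 2 else v

def share(v):                        # split a ring element into two additive shares
    r = secrets.randbelow(MOD)
    return (r, (v - r) % MOD)

def reconstruct(s):
    return (s[0] + s[1]) % MOD

def add_shares(a, b):                # linear operations need no communication
    return ((a[0] + b[0]) % MOD, (a[1] + b[1]) % MOD)

def beaver_mul(x, y):
    # Offline: the helper deals shares of a random triple (a, b, c) with c = a * b.
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    c = (a * b) % MOD
    a_sh, b_sh, c_sh = share(a), share(b), share(c)
    # Online: the parties open e = x - a and f = y - b (masked, so they leak nothing).
    e = (x[0] - a_sh[0] + x[1] - a_sh[1]) % MOD
    f = (y[0] - b_sh[0] + y[1] - b_sh[1]) % MOD
    z0 = (c_sh[0] + e * b_sh[0] + f * a_sh[0] + e * f) % MOD
    z1 = (c_sh[1] + e * b_sh[1] + f * a_sh[1]) % MOD
    return (z0, z1)

x, y = share(encode(3.5)), share(encode(-1.25))
z = beaver_mul(x, y)
# The product carries 2F fractional bits, so rescale after reconstruction.
print(to_signed(reconstruct(z)) / (1 << (2 * F)))   # ~ -4.375
```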

    Identification of functionally related enzymes by learning-to-rank methods

    Enzyme sequences and structures are routinely used in the biological sciences as queries to search for functionally related enzymes in online databases. To this end, one usually departs from some notion of similarity, comparing two enzymes by looking for correspondences in their sequences, structures or surfaces. For a given query, the search operation results in a ranking of the enzymes in the database, from very similar to dissimilar enzymes, while information about the biological function of annotated database enzymes is ignored. In this work we show that rankings of that kind can be substantially improved by applying kernel-based learning algorithms. This approach enables the detection of statistical dependencies between similarities of the active cleft and the biological function of annotated enzymes. This is in contrast to search-based approaches, which do not take annotated training data into account. Similarity measures based on the active cleft are known to outperform sequence-based or structure-based measures under certain conditions. We consider the Enzyme Commission (EC) classification hierarchy for obtaining annotated enzymes during the training phase. The results of a set of sizeable experiments indicate a consistent and significant improvement for similarity measures that exploit information about small cavities in the surface of enzymes.
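    A schematic sketch of the pairwise (RankSVM-style) formulation commonly used for learning-to-rank, here with a linear model over hypothetical similarity features rather than the paper's exact kernel machinery: candidates that share more of the query's EC annotation should score higher, so the model is trained on feature differences of candidate pairs with different relevance. The data, feature names, and relevance coding below are illustrative assumptions.

```python
import numpy as np
from itertools import combinations
from sklearn.svm import LinearSVC

def pairwise_transform(X, rel):
    """Build difference vectors x_i - x_j labeled by the sign of rel_i - rel_j."""
    Xp, yp = [], []
    for i, j in combinations(range(len(X)), 2):
        if rel[i] == rel[j]:
            continue
        Xp.append(X[i] - X[j])
        yp.append(1 if rel[i] > rel[j] else -1)
    return np.array(Xp), np.array(yp)

# Hypothetical data: each row holds similarity features of a candidate enzyme to
# the query (e.g., sequence, structure, active-cleft similarities); rel encodes
# how deep in the EC hierarchy the candidate agrees with the query (0-3).
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
rel = rng.integers(0, 4, size=40)

Xp, yp = pairwise_transform(X, rel)
ranker = LinearSVC(C=1.0).fit(Xp, yp)
scores = X @ ranker.coef_.ravel()        # rank candidates by the learned score
print("top-ranked candidates:", np.argsort(-scores)[:5])
```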