A novel Boolean kernels family for categorical data
Kernel-based classifiers, such as SVMs, are considered state-of-the-art algorithms and are widely used for many classification tasks. However, such methods are hard to interpret, and for this reason they are often treated as black-box models. In this paper, we propose a new family of Boolean kernels for categorical data in which features correspond to propositional formulas over the input variables. The idea is to create human-readable features that ease the extraction of interpretation rules directly from the embedding space. Experiments on artificial and benchmark datasets show the effectiveness of the proposed family of kernels with respect to established ones, such as the RBF kernel, in terms of classification accuracy.
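To make the idea concrete, here is a minimal sketch of a conjunctive Boolean kernel over one-hot encoded categorical data; it illustrates the flavor of the family described above, not the paper's exact definitions. For binary vectors x and y, the dot product counts shared active features, and C(⟨x, y⟩, c) counts the c-literal conjunctions satisfied by both, which is itself a valid inner product in the space of all c-ary conjunctions.

```python
# A minimal sketch (an assumption-laden illustration, not the paper's exact
# kernel family): over 0/1 one-hot vectors, <x, y> counts the attribute
# values the two examples share, and C(<x, y>, c) counts the c-literal
# conjunctions true in both examples.
import numpy as np
from scipy.special import comb

def conjunctive_kernel(X, Y, c=2):
    """Gram matrix of a c-ary conjunctive Boolean kernel.

    X, Y: 0/1 matrices of shape (n_samples, n_onehot_features).
    """
    M = X @ Y.T        # number of shared active features per pair
    return comb(M, c)  # number of shared c-ary conjunctions

# Usage with an SVM (hypothetical data; any callable kernel works):
# from sklearn.svm import SVC
# clf = SVC(kernel=lambda A, B: conjunctive_kernel(A, B, c=2)).fit(X_tr, y_tr)
```

Because every implicit feature is a readable conjunction of attribute values, a model in this space can be inspected to extract rule-like interpretations, which is the motivation stated in the abstract.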
Efficiency versus Convergence of Boolean Kernels for On-Line Learning Algorithms
The paper studies machine learning problems where each example is described
using a set of Boolean features and where hypotheses are represented by linear
threshold elements. One method of increasing the expressiveness of learned
hypotheses in this context is to expand the feature set to include conjunctions
of basic features. This can be done explicitly or where possible by using a
kernel function. Focusing on the well known Perceptron and Winnow algorithms,
the paper demonstrates a tradeoff between the computational efficiency with
which the algorithm can be run over the expanded feature space and the
generalization ability of the corresponding learning algorithm. We first
describe several kernel functions which capture either limited forms of
conjunctions or all conjunctions. We show that these kernels can be used to
efficiently run the Perceptron algorithm over a feature space of exponentially
many conjunctions; however, we also show that using such kernels, the Perceptron
algorithm can provably make an exponential number of mistakes even when
learning simple functions. We then consider the question of whether kernel
functions can analogously be used to run the multiplicative-update Winnow
algorithm over an expanded feature space of exponentially many conjunctions.
Known upper bounds imply that the Winnow algorithm can learn Disjunctive Normal
Form (DNF) formulae with a polynomial mistake bound in this setting. However,
we prove that it is computationally hard to simulate Winnow's behavior for
learning DNF over such a feature set. This implies that the kernel functions
which correspond to running Winnow for this problem are not efficiently
computable, and that there is no general construction that can run Winnow with
kernels.
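For concreteness, the sketch below runs a dual (kernelized) Perceptron with the closed-form kernel commonly used in this literature for all conjunctions over Boolean inputs: K(x, y) = 2^{same(x, y)}, where same(x, y) counts the coordinates on which x and y agree (each agreeing coordinate may contribute its literal or be omitted, so the count includes the empty conjunction). This is a schematic illustration, not the paper's experimental code.

```python
# Dual Perceptron over the implicit space of all conjunctions, evaluated
# via a closed-form kernel. Labels are assumed to be in {-1, +1}.
import numpy as np

def all_conjunctions_kernel(x, y):
    same = np.sum(x == y)  # coordinates where Boolean x and y agree
    return 2.0 ** same     # conjunctions satisfied by both examples

def kernel_perceptron(X, y, epochs=5, kernel=all_conjunctions_kernel):
    """alpha[i] accumulates the mistakes made on example i."""
    n = len(X)
    alpha = np.zeros(n)
    for _ in range(epochs):
        for t in range(n):
            score = sum(alpha[i] * y[i] * kernel(X[i], X[t]) for i in range(n))
            if y[t] * score <= 0:  # mistake: add this example to the dual sum
                alpha[t] += 1
    return alpha
```

Note that the kernel only makes each prediction efficient; as the abstract emphasizes, the number of mistakes over this exponentially large feature space can itself be exponential.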
Support Vector Methods for Higher-Level Event Extraction in Point Data
Phenomena occur both in space and time. Correspondingly, the ability to model spatiotemporal behavior translates into the ability to model phenomena as they occur in reality. Given the complexity inherent in integrating spatial and temporal dimensions, however, the establishment of computational methods for spatiotemporal analysis has proven relatively elusive. Nonetheless, one method, the spatiotemporal helix, has emerged from the field of video processing. Designed to efficiently summarize and query the deformation and movement of spatiotemporal events, the spatiotemporal helix has been demonstrated as capable of describing and differentiating the evolution of hurricanes from sequences of images. Being derived from image data, the representations of events for which the spatiotemporal helix was originally created appear in areal form (e.g., a hurricane covering several square miles is represented by groups of pixels).

Many sources of spatiotemporal data, however, are not in areal form and instead appear as points. Examples of spatiotemporal point data include records by an epidemiologist of the time and location of cases of a disease, and environmental observations collected by a geosensor at the point of its location. As points, these data cannot be directly incorporated into the spatiotemporal helix for analysis. Since the analytic potential of clouds of point data is limited, phenomena represented by point data are often described in terms of events. Defined as change units localized in space and time, the concept of events allows for analysis at multiple levels. For instance, lower-level events refer to occurrences of interest described by single data streams at point locations (e.g., an individual case of a certain disease or a significant change in chemical concentration in the environment), while higher-level events describe occurrences of interest derived from aggregations of lower-level events and are frequently described in areal form (e.g., a disease cluster or a pollution cloud). Considering that these higher-level events appear in areal form, they could potentially be incorporated into the spatiotemporal helix. With deformation being an important element of spatiotemporal analysis, however, at the crux of a process for spatiotemporal analysis based on point data is the accurate translation of lower-level event points into representations of higher-level areal events. A limitation of current techniques for the derivation of higher-level events is that they introduce a priori bias regarding the shape of higher-level events (e.g., elliptical, convex, linear), which can limit the description of the deformation of higher-level events over time.

The objective of this research is to propose two newly developed kernel methods, support vector clustering (SVC) and support vector machines (SVMs), as means for translating lower-level event points into higher-level event areas that follow the distribution of lower-level points. SVC is suggested for the derivation of higher-level events arising in point process data, while SVMs are explored for their potential with scalar field data (i.e., spatially continuous real-valued data). Developed in the field of machine learning to solve complex non-linear problems, both of these methods are capable of producing highly non-linear representations of higher-level events that may be more suitable than existing methods for spatiotemporal analysis of deformation.
To introduce these methods, this thesis is organized so that a context for these methods is first established through a description of existing techniques. This discussion leads to a technical explanation of the mechanics of SVC and SVMs and to the implementation of each of the kernel methods on simulated datasets. Results from these simulations inform discussion regarding the application potential of SVC and SVMs.
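As a rough illustration of the boundary-finding stage of support vector clustering (its first step is a support vector domain description), the sketch below uses scikit-learn's OneClassSVM with an RBF kernel to enclose a synthetic cloud of lower-level event points in a possibly non-convex region. The data, nu, and gamma values are illustrative assumptions, not the thesis's experimental setup.

```python
# A one-class SVM traces a (possibly non-convex) region around event
# points, avoiding the a priori shape bias (elliptical, convex, linear)
# noted above. All coordinates here are synthetic stand-ins.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Two loose clusters of lower-level event points.
points = rng.normal(loc=[[0, 0]] * 60 + [[3, 1]] * 40, scale=0.5)

ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma=0.8).fit(points)

# Evaluate the learned decision function on a grid; the zero level set
# delineates the higher-level event area implied by the points.
xx, yy = np.meshgrid(np.linspace(-2, 5, 200), np.linspace(-2, 3, 200))
grid = np.c_[xx.ravel(), yy.ravel()]
inside = ocsvm.decision_function(grid).reshape(xx.shape) >= 0
```

The boundary tightness (gamma) and the tolerated fraction of outlying points (nu) together control how closely the derived area follows the point distribution over time.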
Sparse Learning over Infinite Subgraph Features
We present a supervised-learning algorithm from graph data (a set of graphs)
for arbitrary twice-differentiable loss functions and sparse linear models over
all possible subgraph features. To date, it has been shown that under all
possible subgraph features, several types of sparse learning, such as AdaBoost,
LPBoost, LARS/LASSO, and sparse PLS regression, can be performed. Particular
emphasis is placed on simultaneous learning of relevant features from an
infinite set of candidates. We first generalize techniques used in all these
preceding studies to derive a unifying bounding technique for arbitrary
separable functions. We then carefully use this bounding to make block
coordinate gradient descent feasible over infinite subgraph features, resulting
in a fast-converging algorithm that can solve a wider class of sparse learning
problems over graph data. We also empirically study the differences from
existing approaches in convergence behavior, selected subgraph features, and
search-space sizes. We further discuss several unnoticed issues in sparse
learning over all possible subgraph features.
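The search step that such bounding enables can be sketched schematically: among the (infinitely many) subgraph features, find the one whose gradient coordinate is largest in magnitude, pruning any pattern extension whose bound rules it out. This is not the paper's algorithm; the helpers children, gradient, and bound below are hypothetical placeholders for a gSpan-style pattern enumerator and a separable-function bound.

```python
# Schematic branch-and-bound feature search (placeholder helpers):
#   children(g) -- extensions of subgraph pattern g in the enumeration tree
#   gradient(g) -- gradient coordinate of the loss w.r.t. feature g
#   bound(g)    -- anti-monotone upper bound on |gradient| over g's subtree
def best_feature(root, gradient, bound, children):
    best, best_val = None, 0.0
    stack = [root]
    while stack:
        g = stack.pop()
        val = abs(gradient(g))
        if val > best_val:
            best, best_val = g, val
        for h in children(g):        # extend the subgraph pattern
            if bound(h) > best_val:  # descend only if some descendant
                stack.append(h)      # could still beat the incumbent
    return best
```

One step of block coordinate gradient descent then updates the weight of the returned feature and re-searches, so only finitely many subgraphs ever receive nonzero weight despite the infinite candidate set.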
Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications
We present Chameleon, a novel hybrid (mixed-protocol) framework for secure
function evaluation (SFE) which enables two parties to jointly compute a
function without disclosing their private inputs. Chameleon combines the best
aspects of generic SFE protocols with the ones that are based upon additive
secret sharing. In particular, the framework performs linear operations in the
ring using additively secret shared values and nonlinear
operations using Yao's Garbled Circuits or the Goldreich-Micali-Wigderson
protocol. Chameleon departs from the common assumption of additive or linear
secret sharing models where three or more parties need to communicate in the
online phase: the framework allows two parties with private inputs to
communicate in the online phase under the assumption of a third node generating
correlated randomness in an offline phase. Almost all of the heavy
cryptographic operations are precomputed in an offline phase which
substantially reduces the communication overhead. Chameleon is both scalable
and significantly more efficient than the ABY framework (NDSS'15) it is based
on. Our framework supports signed fixed-point numbers. In particular,
Chameleon's vector dot product of signed fixed-point numbers improves the
efficiency of mining and classification of encrypted data for algorithms based
upon heavy matrix multiplications. Our evaluation of Chameleon on a 5-layer
convolutional deep neural network shows 133x and 4.2x faster executions than
Microsoft CryptoNets (ICML'16) and MiniONN (CCS'17), respectively.
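To illustrate the arithmetic layer described above, here is a minimal two-party sketch of additive secret sharing in the ring Z_{2^64}, with a dealer-generated multiplication (Beaver) triple standing in for the offline correlated randomness. It shows the primitive only, not Chameleon's protocol conversions, fixed-point encoding, or networking.

```python
# Two-party additive sharing mod 2^L with a dealer-supplied Beaver triple.
import secrets

L = 64
MOD = 1 << L

def share(x):
    """Split x into two additive shares mod 2^L."""
    r = secrets.randbelow(MOD)
    return r, (x - r) % MOD

def reconstruct(s0, s1):
    return (s0 + s1) % MOD

# Offline phase: the third node samples a triple a*b = c and shares it.
a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
c = (a * b) % MOD
a0, a1 = share(a); b0, b1 = share(b); c0, c1 = share(c)

# Online phase: the two parties multiply secrets x and y without
# revealing them, by opening only the masked values e = x-a, f = y-b.
x0, x1 = share(7); y0, y1 = share(5)
e = reconstruct((x0 - a0) % MOD, (x1 - a1) % MOD)  # public
f = reconstruct((y0 - b0) % MOD, (y1 - b1) % MOD)  # public
# Shares of x*y; the public e*f term is added by one party only.
z0 = (e * f + e * b0 + f * a0 + c0) % MOD
z1 = (e * b1 + f * a1 + c1) % MOD
assert reconstruct(z0, z1) == 35  # 7 * 5
```

Since e and f are uniformly masked, the online messages leak nothing about x and y; a dot product is just a sum of such share-wise products, which is why precomputing the triples offline removes most of the online cost.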
Identification of functionally related enzymes by learning-to-rank methods
Enzyme sequences and structures are routinely used in the biological sciences
as queries to search for functionally related enzymes in online databases. To
this end, one usually departs from some notion of similarity, comparing two
enzymes by looking for correspondences in their sequences, structures or
surfaces. For a given query, the search operation results in a ranking of the
enzymes in the database, from very similar to dissimilar enzymes, while
information about the biological function of annotated database enzymes is
ignored.
In this work we show that rankings of that kind can be substantially improved
by applying kernel-based learning algorithms. This approach enables the
detection of statistical dependencies between similarities of the active cleft
and the biological function of annotated enzymes. This is in contrast to
search-based approaches, which do not take annotated training data into
account. Similarity measures based on the active cleft are known to outperform
sequence-based or structure-based measures under certain conditions. We
consider the Enzyme Commission (EC) classification hierarchy for obtaining
annotated enzymes during the training phase. The results of a set of sizeable
experiments indicate a consistent and significant improvement for a set of
similarity measures that exploit information about small cavities in the
surface of enzymes.
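A minimal sketch of the pairwise reduction commonly used for kernel- or SVM-based learning-to-rank (RankSVM-style) is given below. The feature construction from active-cleft similarities and the EC-derived relevance grades are abstracted into placeholder inputs, so this illustrates the ranking machinery rather than the paper's exact method.

```python
# RankSVM-style pairwise reduction: every pair of database entries with
# different relevance to the query becomes one binary example on the
# difference of their feature vectors.
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(X, relevance):
    """Build difference vectors x_i - x_j labeled by sign(rel_i - rel_j)."""
    diffs, labels = [], []
    n = len(X)
    for i in range(n):
        for j in range(i + 1, n):
            if relevance[i] == relevance[j]:
                continue  # ties carry no ordering information
            diffs.append(X[i] - X[j])
            labels.append(1 if relevance[i] > relevance[j] else -1)
    return np.array(diffs), np.array(labels)

# X: similarity-derived features of database enzymes for one query
#    (hypothetical; e.g., active-cleft similarity scores);
# relevance: graded labels, e.g., depth of the shared EC prefix.
# Xp, yp = pairwise_transform(X, relevance)
# ranker = LinearSVC().fit(Xp, yp)
# New entries are then ordered by ranker.decision_function(X).
```

Training on pairs is what lets the annotated (EC-labeled) examples reshape the ranking, in contrast to the purely search-based similarity rankings the abstract compares against.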