Learning Structural Kernels for Natural Language Processing
Structural kernels are a flexible learning
paradigm that has been widely used in Natural
Language Processing. However, the problem
of model selection in kernel-based methods
is usually overlooked. Previous approaches
mostly rely on setting default values for kernel
hyperparameters or using grid search,
which is slow and coarse-grained. In contrast,
Bayesian methods allow efficient model
selection by maximizing the evidence on the
training data through gradient-based methods.
In this paper we show how to perform this
in the context of structural kernels by using
Gaussian Processes. Experimental results on
tree kernels show that this procedure results
in better prediction performance compared to
hyperparameter optimization via grid search.
The framework proposed in this paper can be
adapted to other structures besides trees, e.g.,
strings and graphs, thereby extending the utility
of kernel-based methods.
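For reference, the evidence that the abstract refers to can be written out for the standard Gaussian Process regression case. This is a sketch under assumptions not stated in the abstract: Gaussian observation noise, a kernel matrix $K_\theta$ (noise term included) over the $n$ training structures, and hyperparameters $\theta_j$ on which the kernel depends differentiably. The paper's exact formulation may differ.

```latex
\begin{align}
\log p(\mathbf{y} \mid X, \theta)
  &= -\tfrac{1}{2}\, \mathbf{y}^{\top} K_\theta^{-1} \mathbf{y}
     - \tfrac{1}{2} \log \lvert K_\theta \rvert
     - \tfrac{n}{2} \log 2\pi, \\
\frac{\partial}{\partial \theta_j} \log p(\mathbf{y} \mid X, \theta)
  &= \tfrac{1}{2}\, \mathbf{y}^{\top} K_\theta^{-1}
     \frac{\partial K_\theta}{\partial \theta_j}
     K_\theta^{-1} \mathbf{y}
     - \tfrac{1}{2} \operatorname{tr}\!\left( K_\theta^{-1}
     \frac{\partial K_\theta}{\partial \theta_j} \right).
\end{align}
```

Gradient-based model selection then amounts to ascending the second expression, which requires the derivative of the structural kernel with respect to each of its hyperparameters.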
Extending local features with contextual information in graph kernels
Graph kernels are usually defined in terms of simpler kernels over local
substructures of the original graphs. Different kernels consider different
types of substructures. However, in some cases they have similar predictive
performances, probably because the substructures can be interpreted as
approximations of the subgraphs they induce. In this paper, we propose to
associate to each feature a piece of information about the context in which the
feature appears in the graph. A substructure appearing in two different graphs
will match only if it appears with the same context in both graphs. We propose
a kernel based on this idea that considers trees as substructures, and where
the contexts are features too. The kernel is inspired by the framework in
[6], although it is not part of that framework. We give an efficient algorithm for computing
the kernel and show promising results on real-world graph classification
datasets.
Comment: To appear in ICONIP 201
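The matching rule described in this abstract (a substructure counts only when its context also agrees) can be illustrated with a toy count kernel. This is a minimal sketch, not the authors' algorithm: the feature and context labels are hypothetical strings, and each graph is assumed to be already decomposed into a list of (feature, context) pairs.

```python
from collections import Counter

def count_kernel(items_a, items_b):
    """Sum over shared items of the product of their occurrence counts."""
    ca, cb = Counter(items_a), Counter(items_b)
    return sum(ca[k] * cb[k] for k in ca.keys() & cb.keys())

def plain_kernel(g1, g2):
    # Ignores context: two substructures match whenever the features agree.
    return count_kernel([f for f, _ in g1], [f for f, _ in g2])

def contextual_kernel(g1, g2):
    # A match requires the same feature AND the same context in both graphs.
    return count_kernel(g1, g2)

# Hypothetical decompositions of two graphs into (feature, context) pairs.
g1 = [("C-O", "ring"), ("C-O", "chain"), ("C-N", "ring")]
g2 = [("C-O", "ring"), ("C-N", "chain")]

print(plain_kernel(g1, g2))       # 3
print(contextual_kernel(g1, g2))  # 1
```

The contextual variant is never larger than the plain one, since every contextual match is also a plain match.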
Graph kernels between point clouds
Point clouds are sets of points in two or three dimensions. Most kernel
methods for learning on sets of points have not yet dealt with the specific
geometrical invariances and practical constraints associated with point clouds
in computer vision and graphics. In this paper, we present extensions of graph
kernels for point clouds, which allow kernel methods to be applied to such objects
as shapes, line drawings, or any three-dimensional point clouds. In order to
design rich and numerically efficient kernels with as few free parameters as
possible, we use kernels between covariance matrices and their factorizations
on graphical models. We derive polynomial time dynamic programming recursions
and present applications to recognition of handwritten digits and Chinese
characters from few training examples.
Kernelized Hashcode Representations for Relation Extraction
Kernel methods have produced state-of-the-art results for a number of NLP
tasks such as relation extraction, but suffer from poor scalability due to the
high cost of computing kernel similarities between natural language structures.
A recently proposed technique, kernelized locality-sensitive hashing (KLSH),
can significantly reduce the computational cost, but is only applicable to
classifiers operating on kNN graphs. Here we propose to use random subspaces of
KLSH codes for efficiently constructing an explicit representation of NLP
structures suitable for general classification methods. Further, we propose an
approach for optimizing the KLSH model for classification problems by
maximizing an approximation of mutual information between the KLSH codes
(feature vectors) and the class labels. We evaluate the proposed approach on
biomedical relation extraction datasets, and observe significant and robust
improvements in accuracy w.r.t. state-of-the-art classifiers, along with
drastic (orders-of-magnitude) speedup compared to conventional kernel methods.
Comment: To appear in the proceedings of conference, AAAI-1
Fast Supervised Hashing with Decision Trees for High-Dimensional Data
Supervised hashing aims to map the original features to compact binary codes
that are able to preserve label based similarity in the Hamming space.
Non-linear hash functions have demonstrated an advantage over linear ones due
to their powerful generalization capability. In the literature, kernel
functions are typically used to achieve non-linearity in hashing, which achieve
encouraging retrieval performance at the price of slow evaluation and training
time. Here we propose to use boosted decision trees for achieving non-linearity
in hashing, which are fast to train and evaluate, hence more suitable for
hashing with high dimensional data. In our approach, we first propose
sub-modular formulations for the hashing binary code inference problem and an
efficient GraphCut based block search method for solving large-scale inference.
Then we learn hash functions by training boosted decision trees to fit the
binary codes. Experiments demonstrate that our proposed method significantly
outperforms most state-of-the-art methods in retrieval precision and training
time. Especially for high-dimensional data, our method is orders of magnitude
faster than many methods in terms of training time.
Comment: Appearing in Proc. IEEE Conf. Computer Vision and Pattern
Recognition, 2014, Ohio, US
Maximum Inner-Product Search using Tree Data-structures
The problem of efficiently finding the best match for a query in a
given set with respect to the Euclidean distance or the cosine similarity has
been extensively studied in the literature. However, a closely related problem of
efficiently finding the best match with respect to the inner product has never
been explored in the general setting to the best of our knowledge. In this
paper we consider this general problem and contrast it with the existing
best-match algorithms. First, we propose a general branch-and-bound algorithm
using a tree data structure. Subsequently, we present a dual-tree algorithm for
the case where there are multiple queries. Finally we present a new data
structure for increasing the efficiency of the dual-tree algorithm. These
branch-and-bound algorithms involve novel bounds suited for the purpose of
best-matching with inner products. We evaluate our proposed algorithms on a
variety of data sets from various applications, and exhibit up to five orders
of magnitude improvement in query time over the naive search technique.
Comment: Under submission in KDD 201
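The single-query branch-and-bound idea can be sketched with a simple ball tree. This is an illustrative sketch under assumptions, not the paper's implementation: the tree construction, the `Ball` class, and the helper names are hypothetical. The pruning bound used here follows from the Cauchy-Schwarz inequality: for any point p in a ball with center c and radius r, q·p ≤ q·c + ‖q‖ r.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class Ball:
    """Hypothetical ball-tree node: a center, a radius, and two children."""
    def __init__(self, points, leaf_size=8):
        self.points = points
        dim = len(points[0])
        self.center = [sum(p[d] for p in points) / len(points) for d in range(dim)]
        self.radius = max(math.dist(p, self.center) for p in points)
        self.left = self.right = None
        if len(points) > leaf_size:
            # Split at the median along the axis of largest spread.
            spreads = [max(p[d] for p in points) - min(p[d] for p in points)
                       for d in range(dim)]
            axis = spreads.index(max(spreads))
            ordered = sorted(points, key=lambda p: p[axis])
            mid = len(ordered) // 2
            self.left = Ball(ordered[:mid], leaf_size)
            self.right = Ball(ordered[mid:], leaf_size)

def upper_bound(q, ball):
    # q.p = q.c + q.(p - c) <= q.c + ||q|| * radius for every p in the ball.
    return dot(q, ball.center) + math.sqrt(dot(q, q)) * ball.radius

def mips(q, node, best=(-math.inf, None)):
    """Return (score, point) maximizing the inner product with q."""
    if upper_bound(q, node) <= best[0]:
        return best  # no point in this ball can beat the current best
    if node.left is None:
        for p in node.points:  # leaf: scan the points directly
            s = dot(q, p)
            if s > best[0]:
                best = (s, p)
        return best
    # Visit the more promising child first to tighten the bound early.
    for child in sorted((node.left, node.right),
                        key=lambda b: upper_bound(q, b), reverse=True):
        best = mips(q, child, best)
    return best

random.seed(0)
data = [tuple(random.gauss(0, 1) for _ in range(4)) for _ in range(300)]
tree = Ball(data)
query = (0.5, -1.0, 2.0, 0.1)
score, match = mips(query, tree)
```

The result should agree with a linear scan, `max(data, key=lambda p: dot(query, p))`. Any speedup depends on how many balls the bound manages to prune.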