A Convex Feature Learning Formulation for Latent Task Structure Discovery
This paper considers the multi-task learning problem in the setting where some relevant features may be shared across a few related tasks. Most existing methods assume that the extent to which the given tasks are related, or share a common feature space, is known a priori. In real-world applications, however, it is desirable to automatically discover the groups of related tasks
that share a feature space. In this paper we aim at searching the exponentially
large space of all possible groups of tasks that may share a feature space. The
main contribution is a convex formulation that employs a graph-based regularizer and simultaneously discovers a few groups of related tasks with close-by task parameters, as well as the feature space shared within each
group. The regularizer encodes an important structure among the groups of tasks that leads to an efficient algorithm for solving it: if there is no feature space under which a group of tasks has close-by task parameters, then no such feature space exists for any of its supersets. An efficient active-set algorithm that exploits this property to search the exponentially large space is presented. The algorithm is guaranteed to solve the proposed formulation (within some precision) in time polynomial in the
number of groups of related tasks discovered. Empirical results on benchmark
datasets show that the proposed formulation achieves good generalization and
outperforms state-of-the-art multi-task learning algorithms in some cases.
Comment: ICML201
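The pruning property described in this abstract has the same flavour as Apriori-style candidate generation: once a group fails the shared-feature-space test, none of its supersets needs to be considered. Below is a minimal sketch of that search pattern, assuming a hypothetical predicate has_shared_subspace(group); it illustrates the monotone pruning idea only, not the paper's actual active-set algorithm or its convex solver.

```python
def prune_search(tasks, has_shared_subspace, max_size=None):
    """Breadth-first search over groups of tasks that prunes supersets of any
    group failing the shared-feature-space test. `has_shared_subspace` is a
    hypothetical predicate standing in for the paper's criterion; this sketch
    only illustrates the monotone pruning property."""
    max_size = max_size or len(tasks)
    accepted = []                                  # groups that passed the test
    frontier = [frozenset([t]) for t in tasks]     # size-1 candidates
    while frontier:
        survivors = [g for g in frontier if has_shared_subspace(g)]
        accepted.extend(survivors)
        # Only supersets of surviving groups can still pass, so grow only those.
        frontier = list({g | {t} for g in survivors for t in tasks
                         if t not in g and len(g) < max_size})
    return accepted
```

With, say, four tasks and a predicate that checks whether a shared low-dimensional subspace fits a group's task parameters within some tolerance, only supersets of groups that have passed so far are ever tested.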
Kernel principal component analysis (KPCA) for the de-noising of communication signals
This paper is concerned with the problem of de-noising non-linear signals. Principal Component Analysis (PCA) cannot be applied directly to non-linear signals; however, it is known that, using kernel functions, a non-linear signal can be transformed into a linear one in a higher-dimensional space. In that feature space, a linear algorithm can be applied to a non-linear problem. It is proposed that, using the principal components extracted from this feature space, the signal can be de-noised in its input space.
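As a rough illustration of the general recipe (not the paper's specific method for communication signals), the sketch below uses scikit-learn's KernelPCA with an RBF kernel and its approximate pre-image map on synthetic noisy segments; the kernel choice, number of components and the synthetic data are all assumptions.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)

# Hypothetical stand-in for noisy signal segments: each row is one windowed
# observation of the same underlying waveform plus Gaussian noise.
t = np.linspace(0.0, 1.0, 128)
clean = np.sin(2 * np.pi * 5 * t)
X_noisy = clean + 0.3 * rng.standard_normal((200, t.size))

# Map to the kernel-induced feature space, keep the leading principal
# components there, then map back to the input space (approximate pre-image).
kpca = KernelPCA(n_components=8, kernel="rbf",
                 fit_inverse_transform=True, alpha=1e-3)
X_denoised = kpca.inverse_transform(kpca.fit_transform(X_noisy))
```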
Space-efficient Feature Maps for String Alignment Kernels
String kernels are attractive tools for analyzing string data.
Among them, alignment kernels are known for their high prediction accuracies in
string classifications when tested in combination with SVM in various
applications. However, alignment kernels have a crucial drawback: they scale poorly due to their quadratic computational complexity in the number of input strings, which limits large-scale applications in practice. We address this problem by presenting the first approximation of string alignment kernels,
which we call space-efficient feature maps for edit distance with moves
(SFMEDM), by leveraging a metric embedding named edit sensitive parsing (ESP)
and feature maps (FMs) of random Fourier features (RFFs) for large-scale string
analyses. The original FMs for RFFs consume an amount of memory proportional to the product of the dimension d of the input vectors and the dimension D of the output vectors, which prohibits their use in large-scale applications. We present novel
space-efficient feature maps (SFMs) of RFFs that reduce the space requirement from the O(dD) of the original FMs to O(d), with a theoretical guarantee in the form of concentration bounds. We experimentally test the ability of SFMEDM to learn SVMs
for large-scale string classifications with various massive string data, and we
demonstrate the superior performance of SFMEDM with respect to prediction accuracy, scalability and computational efficiency.
Comment: Full version of the ICDM'19 paper
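To make the O(dD) versus O(d) memory trade-off concrete, here is a minimal sketch of a conventional random Fourier feature map for the RBF kernel, together with a column-at-a-time variant that regenerates the random projections from a seed instead of storing them. The second function only illustrates the space-reduction idea in spirit; it is not the paper's SFM construction, and the RBF kernel is an assumption (the paper targets edit distance with moves via ESP).

```python
import numpy as np

def rff_map(X, D=256, gamma=1.0, seed=0):
    """Conventional random Fourier features approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2). Materialises the full d x D
    projection matrix, i.e. the O(dD) memory baseline."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))  # O(dD) memory
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def rff_map_low_memory(X, D=256, gamma=1.0, seed=0):
    """Same feature map, but regenerates one column of the projection matrix
    per output dimension from the seed, so only O(d) random numbers are held
    at a time. A rough illustration of the space-reduction idea only; the
    paper's SFMs use a different construction with concentration guarantees."""
    n, d = X.shape
    Z = np.empty((n, D))
    for j in range(D):
        rng = np.random.default_rng((seed, j))  # re-derivable per column
        w = rng.normal(scale=np.sqrt(2.0 * gamma), size=d)
        b = rng.uniform(0.0, 2.0 * np.pi)
        Z[:, j] = np.cos(X @ w + b)
    return np.sqrt(2.0 / D) * Z
```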
Analysis of Vocal Disorders in a Feature Space
This paper provides a way to classify vocal disorders for clinical
applications. This goal is achieved by means of geometric signal separation in
a feature space. Typical quantities from chaos theory (such as entropy, correlation dimension and first Lyapunov exponent) and some conventional ones (such as autocorrelation and spectral factor) are analysed and evaluated in order
to provide entries for the feature vectors. A way of quantifying the amount of
disorder is proposed by means of a healthy index that measures the distance of a voice sample from the centres of mass of the healthy and sick clusters in the feature space. A successful application of the geometrical signal separation is reported, concerning the distinction between normal and disordered phonation.
Comment: 12 pages, 3 figures, accepted for publication in Medical Engineering & Physics
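The healthy index described in this abstract can be read as a relative distance to the two cluster centres of mass in the feature space. The sketch below assumes the chaos-theoretic and conventional features have already been extracted into fixed-length vectors, and its normalisation (a distance ratio in [0, 1]) is an assumption rather than the paper's exact definition.

```python
import numpy as np

def healthy_index(x, healthy_features, sick_features):
    """Score a voice sample's feature vector x by its relative distance to the
    centres of mass of the healthy and pathological clusters. Returns a value
    in [0, 1]: 1 when x sits on the healthy centroid, 0 when it sits on the
    sick one. Hypothetical normalisation, for illustration only."""
    c_healthy = np.mean(healthy_features, axis=0)
    c_sick = np.mean(sick_features, axis=0)
    d_h = np.linalg.norm(x - c_healthy)
    d_s = np.linalg.norm(x - c_sick)
    return d_s / (d_h + d_s)
```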
