Localized Lasso for High-Dimensional Regression
We introduce the localized Lasso, which is suited for learning models that
are both interpretable and highly predictive in problems with high
dimensionality and small sample size. More specifically, we consider a
function defined by local sparse models, one at each data point. We introduce
sample-wise network regularization to borrow strength across the models, and
sample-wise exclusive group sparsity (a.k.a., $\ell_{1,2}$ norm) to introduce
diversity into the choice of feature sets in the local models. The local models
are interpretable in terms of similarity of their sparsity patterns. The cost
function is convex, and thus has a globally optimal solution. Moreover, we
propose a simple yet efficient iterative least-squares based optimization
procedure for the localized Lasso, which does not need a tuning parameter, and
is guaranteed to converge to a globally optimal solution. The solution is
empirically shown to outperform alternatives for both simulated and genomic
personalized medicine data.
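As a rough sketch of the kind of objective described above (the symbols and weighting here are illustrative, not necessarily the paper's exact formulation), each sample i carries its own sparse weight vector w_i, tied to its neighbours by a network penalty and kept feature-sparse by the exclusive group term:

    \min_{w_1,\dots,w_n} \; \sum_{i=1}^{n} \bigl(y_i - w_i^\top x_i\bigr)^2
        + \lambda_1 \sum_{i,j} R_{ij}\,\lVert w_i - w_j \rVert_2
        + \lambda_2 \sum_{i=1}^{n} \lVert w_i \rVert_1^2,

where R_{ij} encodes the sample network and the squared \ell_1 term is the exclusive group (\ell_{1,2}) penalty; every term is convex, which is why the overall cost admits a globally optimal solution.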
Distributed Regression in Sensor Networks: Training Distributively with Alternating Projections
Wireless sensor networks (WSNs) have attracted considerable attention in
recent years and motivate a host of new challenges for distributed signal
processing. The problem of distributed or decentralized estimation has often
been considered in the context of parametric models. However, the success of
parametric methods is limited by the appropriateness of the strong statistical
assumptions made by the models. In this paper, a more flexible nonparametric
model for distributed regression is considered that is applicable in a variety
of WSN applications including field estimation. Here, starting with the
standard regularized kernel least-squares estimator, a message-passing
algorithm for distributed estimation in WSNs is derived. The algorithm can be
viewed as an instantiation of the successive orthogonal projection (SOP)
algorithm. Various practical aspects of the algorithm are discussed and several
numerical simulations validate the potential of the approach.
Comment: To appear in the Proceedings of the SPIE Conference on Advanced Signal Processing Algorithms, Architectures and Implementations XV, San Diego, CA, July 31 - August 4, 200
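For context, the centralized starting point mentioned above, the standard regularized kernel least-squares (kernel ridge) estimator, is textbook material rather than anything specific to this paper:

    \hat{f} = \arg\min_{f \in \mathcal{H}} \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2 + \lambda \lVert f \rVert_{\mathcal{H}}^2,
    \qquad
    \hat{f}(x) = \sum_{i=1}^{n} \alpha_i\, k(x, x_i), \quad \alpha = (K + \lambda I)^{-1} y,

where K is the kernel matrix over the sensors' measurements. The message-passing algorithm in the paper distributes the computation of this fit across the network through successive orthogonal projections.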
Geometric deep learning: going beyond Euclidean data
Many scientific fields study data with an underlying structure that is a
non-Euclidean space. Some examples include social networks in computational
social sciences, sensor networks in communications, functional networks in
brain imaging, regulatory networks in genetics, and meshed surfaces in computer
graphics. In many applications, such geometric data are large and complex (in
the case of social networks, on the scale of billions), and are natural targets
for machine learning techniques. In particular, we would like to use deep
neural networks, which have recently proven to be powerful tools for a broad
range of problems from computer vision, natural language processing, and audio
analysis. However, these tools have been most successful on data with an
underlying Euclidean or grid-like structure, and in cases where the invariances
of these structures are built into networks used to model them. Geometric deep
learning is an umbrella term for emerging techniques attempting to generalize
(structured) deep neural models to non-Euclidean domains such as graphs and
manifolds. The purpose of this paper is to overview different examples of
geometric deep learning problems and present available solutions, key
difficulties, applications, and future research directions in this nascent
field.
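As one concrete, widely used example of such a generalization (not drawn from this survey; only a minimal sketch of how a "convolution" can be defined on a graph rather than a grid), a single symmetric-normalized graph-convolution layer can be written as:

    import numpy as np

    def gcn_layer(A, H, W):
        # One graph-convolution layer: H' = relu(D^{-1/2} (A + I) D^{-1/2} H W).
        # A: (n, n) adjacency matrix, H: (n, d_in) node features, W: (d_in, d_out) weights.
        A_hat = A + np.eye(A.shape[0])            # add self-loops
        deg = A_hat.sum(axis=1)                   # node degrees
        D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))  # D^{-1/2}
        A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
        return np.maximum(A_norm @ H @ W, 0.0)    # propagate features, apply ReLU

    # Tiny usage example: a 3-node path graph with random features.
    A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
    H = np.random.randn(3, 4)
    W = np.random.randn(4, 2)
    embeddings = gcn_layer(A, H, W)               # (3, 2) node embeddings

The point of the sketch is only that the neighbourhood structure of the graph takes the place of the regular grid that a standard convolution relies on.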
Multiclass latent locally linear support vector machines
Kernelized Support Vector Machines (SVMs) have gained the status of off-the-shelf classifiers, able to deliver state-of-the-art performance on almost any problem. Still, their practical use is constrained by their computational and memory complexity, which grows super-linearly with the number of training samples. A promising and growing alternative, which retains the low training and testing complexity of linear classifiers and the flexibility of non-linear ones, is the family of methods that learn non-linear classifiers through local combinations of linear ones. In this paper we propose a new multiclass local classifier, based on a latent SVM formulation. The proposed classifier makes use of a set of linear models that are linearly combined using sample- and class-specific weights. Thanks to the latent formulation, the combination coefficients are modeled as latent variables. We allow soft combinations and provide a closed-form solution for their estimation, resulting in an efficient prediction rule. This formulation allows the sample-specific weights and the linear classifiers to be learned jointly, in a principled way, within a single optimization problem solved with a CCCP procedure. Extensive experiments on ten standard UCI machine learning datasets, one large binary dataset, three character and digit recognition databases, and a visual place categorization dataset show the power of the proposed approach.
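A minimal sketch of the decision rule this abstract describes (the exact parameterization is the paper's; the form below is only meant to convey the structure): with M local linear models per class c, the score of class c on a sample x is

    f_c(x) = \sum_{m=1}^{M} \beta_{c,m}(x)\,\bigl(w_{c,m}^\top x + b_{c,m}\bigr),
    \qquad
    \hat{y}(x) = \arg\max_{c} f_c(x),

where the combination coefficients \beta_{c,m}(x) are the sample- and class-specific latent variables estimated in closed form during training.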
Doctor of Philosophy dissertation
The contributions in the area of kernelized learning techniques have expanded beyond a few basic kernel functions to general kernel functions that can be learned along with the rest of a statistical learning model. This dissertation explores various directions in kernel learning, a setting in which we not only learn a model but also glean information about the geometry of the data, by learning a positive definite (p.d.) kernel. Throughout, we exploit several properties of kernels that relate to their geometry, a facet that is often overlooked. We revisit the mathematical background required to understand kernel learning in context, such as reproducing kernel Hilbert spaces (RKHSs), the reproducing property, and the representer theorem. We then cover kernelized learning with support vector machines (SVMs), multiple kernel learning (MKL), and localized kernel learning (LKL). We move on to Bochner's theorem, a tool vital to one of the kernel learning areas we explore. The main portion of the thesis is divided into two parts: (1) kernel learning with SVMs, a.k.a. MKL, and (2) learning based on Bochner's theorem. In the first part, we present efficient, accurate, and scalable algorithms based on the SVM, one that exploits multiplicative weight updates (MWU) and another that exploits local geometry. In the second part, we use Bochner's theorem to incorporate a kernel into a neural network and find that kernel learning in this fashion, continuous kernel learning (CKL), is superior even to MKL.
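As a sketch of the Bochner-based direction described above (a generic random-Fourier-feature layer with trainable spectral frequencies; the module name and layer sizes are assumptions made for illustration, not the dissertation's exact CKL algorithm):

    import math
    import torch
    import torch.nn as nn

    class BochnerFeatures(nn.Module):
        # Feature map z(x) = sqrt(2/D) * cos(x @ Omega + b).
        # By Bochner's theorem a shift-invariant p.d. kernel is the Fourier transform
        # of a nonnegative spectral measure, so for Omega sampled from that measure
        # z(x) . z(y) approximates k(x - y). Treating Omega as a trainable parameter
        # lets the network learn the spectral measure, i.e. the kernel, jointly with
        # the rest of the model.
        def __init__(self, in_dim, num_features):
            super().__init__()
            self.Omega = nn.Parameter(torch.randn(in_dim, num_features))     # spectral frequencies
            self.b = nn.Parameter(2.0 * math.pi * torch.rand(num_features))  # random phases

        def forward(self, x):
            D = self.Omega.shape[1]
            return math.sqrt(2.0 / D) * torch.cos(x @ self.Omega + self.b)

    # A kernel-learning model is then just these features followed by a linear layer:
    model = nn.Sequential(BochnerFeatures(in_dim=10, num_features=256), nn.Linear(256, 1))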