Sparse GCA and Thresholded Gradient Descent
Generalized correlation analysis (GCA) is concerned with uncovering linear
relationships across multiple datasets. It generalizes canonical correlation
analysis (CCA), which is designed for two datasets. We study sparse GCA when
there are potentially multiple generalized correlation tuples in the data and
the loading matrix has a small number of nonzero rows. This setting includes
sparse CCA and sparse
PCA of correlation matrices as special cases. We first formulate sparse GCA as
generalized eigenvalue problems at both population and sample levels via a
careful choice of normalization constraints. Based on a Lagrangian form of the
sample optimization problem, we propose a thresholded gradient descent
algorithm for estimating GCA loading vectors and matrices in high dimensions.
We derive tight estimation error bounds for estimators generated by the
algorithm with proper initialization. We also demonstrate the effectiveness of
the algorithm on a number of synthetic datasets.
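As an illustration of the thresholded gradient descent idea, here is a minimal
sketch in Python/NumPy of one plausible iteration for the row-sparse
generalized eigenvalue problem (maximize tr(U^T A U) subject to U^T B U = I):
a gradient step, a row-wise hard threshold, and a re-normalization. This is
not the paper's exact algorithm, which works with a Lagrangian form of the
sample problem and requires a careful initialization; the function names, step
size, and sparsity level k are illustrative.

```python
import numpy as np

def row_hard_threshold(U, k):
    # Keep the k rows of U with the largest l2 norms; zero out the rest.
    norms = np.linalg.norm(U, axis=1)
    keep = np.argsort(norms)[-k:]
    V = np.zeros_like(U)
    V[keep] = U[keep]
    return V

def b_orthonormalize(U, B):
    # Rescale U so that U^T B U = I (assumes U^T B U is positive definite).
    w, V = np.linalg.eigh(U.T @ B @ U)
    return U @ (V / np.sqrt(w)) @ V.T

def sparse_gca_tgd(A, B, k, r, steps=200, lr=0.1, seed=0):
    # Thresholded gradient iteration for the row-sparse generalized
    # eigenvalue problem: maximize tr(U^T A U) s.t. U^T B U = I_r,
    # with at most k nonzero rows in U (k >= r assumed).
    rng = np.random.default_rng(seed)
    U = b_orthonormalize(rng.standard_normal((A.shape[0], r)), B)
    for _ in range(steps):
        U = U + lr * (A @ U)           # gradient ascent on tr(U^T A U)
        U = row_hard_threshold(U, k)   # enforce row sparsity
        U = b_orthonormalize(U, B)     # restore the normalization constraint
    return U
```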
Distributed stochastic optimization via matrix exponential learning
In this paper, we investigate a distributed learning scheme for a broad class
of stochastic optimization problems and games that arise in signal processing
and wireless communications. The proposed algorithm relies on the method of
matrix exponential learning (MXL) and only requires locally computable gradient
observations that are possibly imperfect and/or obsolete. To analyze it, we
introduce the notion of a stable Nash equilibrium and show that the algorithm
converges globally to such equilibria, or locally when an equilibrium is only
locally stable. We also derive an explicit linear
bound for the algorithm's convergence speed, which remains valid under
measurement errors and uncertainty of arbitrarily high variance. To validate
our theoretical analysis, we test the algorithm in realistic
multi-carrier/multiple-antenna wireless scenarios where several users seek to
maximize their energy efficiency. Our results show that learning allows users
to attain a net increase of 100% to 500% in energy efficiency, even under
very high uncertainty.
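To make the update concrete, here is a minimal sketch of a matrix exponential
learning step for the simplest feasible set {X >= 0, tr(X) = 1}: a dual
variable Y aggregates (possibly noisy) gradient feedback, and the exponential
map returns a feasible point. The paper's wireless setting uses
trace-constrained covariance matrices instead, which changes only the
normalization; the toy utility, noise level, and step-size schedule below are
illustrative assumptions.

```python
import numpy as np
from scipy.linalg import expm

def mxl_step(Y, grad, step):
    # One MXL update: aggregate (possibly imperfect) gradient feedback in Y,
    # then map back to the feasible set {X >= 0, tr(X) = 1} via the matrix
    # exponential. Other trace constraints change only the normalization.
    Y = Y + step * grad
    E = expm(Y)
    return Y, E / np.trace(E)

# Toy run: maximize the concave utility u(X) = tr(A X); the iterates
# concentrate on the top eigenvector of A despite noisy gradients.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A = (A + A.T) / 2                                  # symmetric payoff matrix
Y = np.zeros_like(A)
for t in range(1, 201):
    noise = rng.standard_normal(A.shape)
    noisy_grad = A + 0.1 * (noise + noise.T) / 2   # imperfect observation
    Y, X = mxl_step(Y, noisy_grad, step=1.0 / t)   # vanishing step size
```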
Toward Certified Robustness of Distance Metric Learning
Metric learning aims to learn a distance metric such that semantically
similar instances are pulled together while dissimilar instances are pushed
away. Many existing methods maximize, or at least constrain, a distance margin
in the feature space that separates similar and dissimilar pairs of instances,
in order to guarantee generalization ability. In this paper, we
advocate imposing an adversarial margin in the input space so as to improve the
generalization and robustness of metric learning algorithms. We first show
that the adversarial margin, defined as the distance between training
instances and their closest adversarial examples in the input space, takes
into account both the distance margin in the feature space and the correlation
between the metric and triplet constraints. Next, to enhance robustness to
instance perturbation, we propose to enlarge the adversarial margin by
minimizing a novel loss function termed the perturbation loss.
The proposed loss can be viewed as a data-dependent regularizer and is easily
plugged into any existing metric learning method. Finally, we show that the
enlarged margin benefits generalization ability, which we establish using the
theoretical technique of algorithmic robustness. Experimental results on 16
datasets demonstrate the superiority of the proposed method over existing
state-of-the-art methods in both discrimination accuracy and robustness
against possible noise.
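The abstract does not give the closed form of the perturbation loss, so the
sketch below pairs a standard Mahalanobis triplet hinge with a stand-in
regularizer that captures the same intuition: since d_M(x, x + d) <=
||M||_2 ||d||^2, shrinking the spectral norm of the metric M forces larger
input perturbations to produce the same change in distance, i.e., a larger
adversarial margin. The weight lam and the regularizer choice are assumptions,
not the paper's loss.

```python
import numpy as np

def mahalanobis_sq(M, x, y):
    # Squared Mahalanobis distance d_M(x, y) = (x - y)^T M (x - y), rowwise.
    d = x - y
    return np.einsum('ij,jk,ik->i', d, M, d)

def triplet_hinge(M, anchors, positives, negatives, margin=1.0):
    # Standard feature-space margin constraint on triplets:
    # d_M(a, p) + margin <= d_M(a, n).
    return np.maximum(0.0, mahalanobis_sq(M, anchors, positives)
                      + margin - mahalanobis_sq(M, anchors, negatives))

def regularized_objective(M, anchors, positives, negatives,
                          margin=1.0, lam=0.1):
    # Triplet loss plus a stand-in robustness regularizer (an assumption,
    # not the paper's perturbation loss): penalizing the spectral norm of M
    # means a fixed feature-space margin needs a larger input perturbation
    # to be violated, i.e., a larger adversarial margin.
    hinge = triplet_hinge(M, anchors, positives, negatives, margin).mean()
    return hinge + lam * np.linalg.norm(M, 2)
```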
On Stein's Identity and Near-Optimal Estimation in High-dimensional Index Models
We consider estimating the parametric components of semi-parametric multiple
index models in a high-dimensional and non-Gaussian setting. Such models form a
rich class of non-linear models with applications to signal processing, machine
learning and statistics. Our estimators leverage score-function-based first-
and second-order Stein's identities and do not require the covariates to
satisfy the Gaussian or elliptical symmetry assumptions common in the
literature.
Moreover, to handle score functions and responses that are heavy-tailed, our
estimators are constructed by carefully thresholding their empirical
counterparts. We show that our estimators achieve near-optimal statistical
rates of convergence in several settings. We supplement our theoretical
results with simulation experiments that confirm the theory.
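As a concrete illustration, here is a minimal sketch of a first-order Stein
estimator for a single-index model y = f(beta^T x) + noise: by Stein's
identity, E[y S(x)] is proportional to beta when S is the score of the
covariate density, and heavy tails and high dimensionality are handled by
truncating the empirical quantities and hard-thresholding the result. The
truncation level tau and sparsity s are illustrative tuning parameters, and
the paper's second-order identity for multiple-index models is omitted.

```python
import numpy as np

def stein_index_estimate(X, y, score, tau=5.0, s=10):
    # First-order Stein estimator for y = f(beta^T x) + noise:
    # E[y S(x)] is proportional to beta when S is the score of the
    # covariate density. Responses and scores are truncated at tau to
    # handle heavy tails; the result is hard-thresholded to its s
    # largest coordinates for the high-dimensional regime.
    S = score(X)                            # (n, p) score evaluations
    yt = np.clip(y, -tau, tau)              # truncated responses
    St = np.clip(S, -tau, tau)              # truncated scores
    b = (yt[:, None] * St).mean(axis=0)     # empirical E[y S(x)]
    out = np.zeros_like(b)
    keep = np.argsort(np.abs(b))[-s:]       # keep the s largest entries
    out[keep] = b[keep]
    return out / np.linalg.norm(out)        # beta identified up to scale

# For standard Gaussian covariates the score is S(x) = x:
# beta_hat = stein_index_estimate(X, y, score=lambda X: X)
```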