Algorithms and Theory for Robust PCA and Phase Retrieval
In this dissertation, we investigate two problems, both of which require the recovery of unknowns from measurements that are potentially corrupted by outliers. The first part focuses on the problem of \emph{robust principal component analysis} (PCA), which aims to recover an unknown low-rank matrix from a corrupted and partially-observed matrix.
The robust PCA problem, which is nonconvex in its original form, has been addressed in the literature via a convex relaxation known as \emph{principal component pursuit} (PCP).
However, previous works assume that the sparse errors are spread uniformly over the entire matrix, and characterize the condition under which PCP guarantees exact recovery. We generalize these results by allowing non-uniform error corruptions over the low-rank matrix, and we characterize conditions on the error corruption probability of each individual entry, based on the local coherence of the low-rank matrix, under which correct recovery is guaranteed by PCP. Our results also yield new insights into the graph clustering problem beyond the existing literature.
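For readers unfamiliar with PCP, the standard formulation from the robust PCA literature (notation assumed here, not quoted from the thesis) is the convex program
\[
\min_{L,\,S}\; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad \mathcal{P}_{\Omega}(L + S) = \mathcal{P}_{\Omega}(M),
\]
where $\|L\|_{*}$ is the nuclear norm (sum of singular values) promoting low rank, $\|S\|_{1}$ is the entrywise $\ell_1$ norm promoting sparsity of the error matrix, $\lambda > 0$ is a trade-off parameter, and $\mathcal{P}_{\Omega}$ keeps only the observed entries $\Omega$ of the corrupted matrix $M$.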
The second part of the thesis studies the phase retrieval problem, which requires recovering an unknown vector from only magnitude measurements. Unlike in the first part, we solve this problem directly by optimizing nonconvex objectives. Since the nonconvex objective is typically constructed so that the true vector is its global optimizer, the difficulty is to design algorithms that find the global optimizer efficiently and provably.
To solve this problem, we propose a gradient-like algorithm named reshaped Wirtinger flow (RWF). For random Gaussian measurements, we show that RWF enjoys linear convergence to a global optimizer as long as the number of measurements is on the order of the dimension of the unknown vector. This achieves the best possible sample complexity as well as state-of-the-art computational efficiency.
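As a rough illustration of the RWF recipe described above (a spectral-style initialization followed by gradient steps on the reshaped quadratic loss), here is a minimal real-valued numpy sketch. The initialization, step size, and iteration count below are simplified stand-ins, not the exact choices analyzed in the thesis.

```python
import numpy as np

def rwf(y, A, iters=500, step=0.5):
    """Minimal real-valued sketch of reshaped Wirtinger flow (RWF).

    Illustrative only: the thesis analyzes a specific initialization and
    step size; here we use a simple weighted-eigenvector initialization
    and a fixed step for demonstration.
    y : magnitudes |a_i^T x|; A : (m, n) measurement matrix.
    """
    m, n = A.shape
    # Initialization: leading eigenvector of a y-weighted sample matrix,
    # scaled by a norm estimate sqrt(mean(y^2)) ~ ||x||.
    Y = (A * y[:, None]).T @ A / m
    _, V = np.linalg.eigh(Y)            # eigenvalues in ascending order
    z = V[:, -1] * np.sqrt(np.mean(y**2))
    # Gradient iterations on the reshaped quadratic loss
    #   (1/2m) * sum_i (|a_i^T z| - y_i)^2.
    for _ in range(iters):
        p = A @ z
        grad = A.T @ ((np.abs(p) - y) * np.sign(p)) / m
        z = z - step * grad
    return z

rng = np.random.default_rng(0)
n, m = 20, 200                          # dimension and sample size
x = rng.standard_normal(n)              # unknown signal
A = rng.standard_normal((m, n))         # Gaussian measurements
y = np.abs(A @ x)                       # magnitude-only observations
z = rwf(y, A)
# Recovery is only possible up to a global sign; check distance to +/- x.
err = min(np.linalg.norm(z - x), np.linalg.norm(z + x)) / np.linalg.norm(x)
```

With roughly ten measurements per unknown, the iterates contract geometrically toward a global optimizer, matching the linear-convergence behavior described above.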
Moreover, we study the phase retrieval problem when the measurements are corrupted by adversarial outliers, which models situations with missing data or sensor failures. To resist possible observation outliers in an oblivious manner, we propose a novel median truncation approach that modifies the nonconvex approach in both the initialization and the gradient descent steps. We apply the median truncation approach to the Poisson loss and the reshaped quadratic loss respectively, obtaining two algorithms, \emph{median-TWF} and \emph{median-RWF}. We show that both algorithms recover the signal from a near-optimal number of independent Gaussian measurements, even when a constant fraction of the measurements is corrupted. We further show that both algorithms are stable when the measurements are corrupted by both sparse arbitrary outliers and dense bounded noise. We establish the performance guarantees via the development of non-trivial concentration properties of median-related quantities, which can be of independent interest.
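The median truncation idea can be sketched as follows: compute the median of the per-measurement residuals and drop measurements whose residual exceeds a multiple of it, so that a constant fraction of outliers cannot dominate the gradient. The step below is a simplified stand-in for the analyzed algorithms (the threshold rule, constants, and the omitted initialization are illustrative, and only the reshaped quadratic loss is shown).

```python
import numpy as np

def median_rwf_step(z, y, A, step=0.5, alpha=3.0):
    """One gradient step of a median-truncated reshaped-loss update (sketch).

    Hypothetical simplification of the median-RWF idea: keep only the
    measurements whose residual is within alpha times the median residual,
    so a constant fraction of outliers cannot bias the gradient.
    """
    m = A.shape[0]
    p = A @ z
    r = np.abs(np.abs(p) - y)           # per-measurement residuals
    keep = r <= alpha * np.median(r)    # median-based truncation
    grad = A[keep].T @ ((np.abs(p) - y) * np.sign(p))[keep] / m
    return z - step * grad

rng = np.random.default_rng(1)
n, m = 20, 400
x = rng.standard_normal(n)
A = rng.standard_normal((m, n))
y = np.abs(A @ x)
y[:40] = rng.uniform(0.0, 10.0, 40)     # corrupt 10% of the measurements
# Start near x for demonstration (the robust initialization is omitted).
z = x + 0.1 * rng.standard_normal(n)
for _ in range(300):
    z = median_rwf_step(z, y, A)
err = min(np.linalg.norm(z - x), np.linalg.norm(z + x)) / np.linalg.norm(x)
```

Because the median of the residuals is insensitive to a constant fraction of arbitrarily large outliers, the truncation rule removes the corrupted terms once the iterate is close to the signal, and the clean measurements drive the error to zero.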
The Capacity Region of the Source-Type Model for Secret Key and Private Key Generation
The problem of simultaneously generating a secret key (SK) and private key
(PK) pair among three terminals via public discussion is investigated, in which
each terminal observes a component of correlated sources. All three terminals
are required to generate a common secret key concealed from an eavesdropper
that has access to public discussion, while two designated terminals are
required to generate an extra private key concealed from both the eavesdropper
and the remaining terminal. An outer bound on the SK-PK capacity region was
established in [1], and was shown to be achievable for one case. In this paper,
achievable schemes are designed to achieve the outer bound for the remaining
two cases, and hence the SK-PK capacity region is established in general. The
main technique lies in the novel design of a random binning-joint decoding
scheme that achieves the existing outer bound.
Comment: 20 pages, 4 figures
Convergence Theory of Learning Over-parameterized ResNet: A Full Characterization
ResNet structure has achieved great empirical success since its debut. Recent work established the convergence of learning over-parameterized ResNet with a scaling factor $\tau = 1/L$ on the residual branch, where $L$ is the network depth. However, it is not clear how learning ResNet behaves for other values of $\tau$. In this paper, we fully characterize the convergence theory of gradient descent for learning over-parameterized ResNet with different values of $\tau$. Specifically, hiding logarithmic factors and constant coefficients, we show that for $\tau \le 1/\sqrt{L}$, gradient descent is guaranteed to converge to the global minima, and in particular, when $\tau \le 1/L$, the convergence is independent of the network depth. Conversely, we show that for $\tau$ larger than $1/\sqrt{L}$ by any polynomial factor, the forward output grows at least polynomially with the depth in expectation, and learning then fails because of gradient explosion for large $L$. This means the bound $\tau \le 1/\sqrt{L}$ is sharp for learning ResNet with arbitrary depth.
To the best of our knowledge, this is the first work that studies learning ResNet with the full range of $\tau$.
Comment: 31 pages
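The role of the scaling factor can be seen in a toy simulation: with residual blocks $h \leftarrow h + \tau\,\mathrm{relu}(Wh)$ and random He-initialized weights, an unscaled residual branch makes the forward norm explode with depth, while $\tau = 1/L$ keeps it bounded. The architecture and hyperparameters below are illustrative, not those of the paper.

```python
import numpy as np

def forward_norm(L, tau, width=256, seed=0):
    """Output norm of a toy ResNet: h <- h + tau * relu(W h) per block.

    Fresh Gaussian weights with variance 2/width (He-style) per block;
    purely illustrative of how tau controls forward-norm growth.
    """
    rng = np.random.default_rng(seed)
    h = rng.standard_normal(width) / np.sqrt(width)   # unit-scale input
    for _ in range(L):
        W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
        h = h + tau * np.maximum(W @ h, 0.0)          # scaled residual branch
    return np.linalg.norm(h)

L = 100
unscaled = forward_norm(L, tau=1.0)      # norm roughly doubles per block
scaled = forward_norm(L, tau=1.0 / L)    # norm stays O(1) in depth
```

An exploding forward norm feeds directly into exploding gradients, which is the failure mode the sharpness result above formalizes.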
DiffKendall: A Novel Approach for Few-Shot Learning with Differentiable Kendall's Rank Correlation
Few-shot learning aims to adapt models trained on the base dataset to novel
tasks where the categories are not seen by the model before. This often leads
to a relatively uniform distribution of feature values across channels on novel
classes, posing challenges in determining channel importance for novel tasks.
Standard few-shot learning methods employ geometric similarity metrics such as
cosine similarity and negative Euclidean distance to gauge the semantic
relatedness between two features. However, features with high geometric
similarities may carry distinct semantics, especially in the context of
few-shot learning. In this paper, we demonstrate that the importance ranking of
feature channels is a more reliable indicator for few-shot learning than
geometric similarity metrics. We observe that replacing the geometric
similarity metric with Kendall's rank correlation only during inference is able
to improve the performance of few-shot learning across a wide range of datasets
with different domains. Furthermore, we propose a carefully designed
differentiable loss for meta-training to address the non-differentiability
issue of Kendall's rank correlation. Extensive experiments demonstrate that the
proposed rank-correlation-based approach substantially enhances few-shot
learning performance.
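Concretely, Kendall's rank correlation between two feature vectors compares the ordering of every pair of channels; the sign function that does the comparison is what makes it non-differentiable, and smoothing it (here with tanh, as a generic sketch rather than the paper's exact loss) yields a differentiable surrogate usable in meta-training.

```python
import numpy as np

def kendall_tau(a, b):
    """Kendall's rank correlation between feature vectors a and b:
    fraction of concordant minus discordant channel pairs
    (no tie handling, for simplicity)."""
    da = np.sign(a[:, None] - a[None, :])   # pairwise order signs in a
    db = np.sign(b[:, None] - b[None, :])   # pairwise order signs in b
    n = len(a)
    return np.sum(da * db) / (n * (n - 1))

def soft_kendall_tau(a, b, temperature=0.1):
    """Differentiable surrogate: replace sign with tanh(./temperature).

    A generic smoothing sketch, not necessarily the exact DiffKendall
    loss; smaller temperatures approach the hard rank correlation."""
    da = np.tanh((a[:, None] - a[None, :]) / temperature)
    db = np.tanh((b[:, None] - b[None, :]) / temperature)
    n = len(a)
    return np.sum(da * db) / (n * (n - 1))

# Two toy feature vectors whose values differ but whose channel
# ordering mostly agrees (5 of 6 pairs concordant).
a = np.array([0.1, 0.5, 0.3, 0.9])
b = np.array([0.2, 0.6, 0.1, 0.8])
tau = kendall_tau(a, b)            # (5 - 1) / 6 = 2/3
soft = soft_kendall_tau(a, b)      # close to tau, but differentiable
```

Because the surrogate depends on the values only through smooth pairwise comparisons, it can be maximized by gradient descent during meta-training while still rewarding agreement in channel ranking rather than raw geometric closeness.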