
    Algorithms and Theory for Robust PCA and Phase Retrieval

    In this dissertation, we investigate two problems, both of which require recovering unknowns from measurements that are potentially corrupted by outliers. The first part focuses on the problem of \emph{robust principal component analysis} (PCA), which aims to recover an unknown low-rank matrix from a corrupted and partially observed matrix. The robust PCA problem, originally nonconvex itself, has been solved in the literature via a convex relaxation approach known as \emph{principal component pursuit} (PCP). However, previous works assume that the sparse errors are spread uniformly over the entire matrix and characterize the condition under which PCP guarantees exact recovery. We generalize these results by allowing non-uniform error corruption over the low-rank matrix: based on the local coherence of the low-rank matrix, we characterize conditions on the error corruption probability of each individual entry under which correct recovery is guaranteed by PCP. Our results also yield new insights into the graph clustering problem beyond the existing literature. The second part of the thesis studies the phase retrieval problem, which requires recovering an unknown vector from only magnitude measurements. Unlike in the first part, we solve this problem directly by optimizing nonconvex objectives. Since the nonconvex objective is typically constructed so that the true vector is its global optimizer, the difficulty is to design algorithms that find the global optimizer efficiently and provably. To this end, we propose a gradient-like algorithm named reshaped Wirtinger flow (RWF). For random Gaussian measurements, we show that RWF enjoys linear convergence to a global optimizer as long as the number of measurements is on the order of the dimension of the unknown vector. This achieves the best possible sample complexity as well as state-of-the-art computational efficiency. Moreover, we study the phase retrieval problem when the measurements are corrupted by adversarial outliers, which models situations with missing data or sensor failures. To resist possible observation outliers in an oblivious manner, we propose a novel median truncation approach that modifies the nonconvex approach in both the initialization and the gradient descent steps. We apply the median truncation approach to the Poisson loss and the reshaped quadratic loss, respectively, and obtain two algorithms, \emph{median-TWF} and \emph{median-RWF}. We show that both algorithms recover the signal from a near-optimal number of independent Gaussian measurements, even when a constant fraction of the measurements is corrupted. We further show that both algorithms are stable when measurements are corrupted by both sparse arbitrary outliers and dense bounded noise. We establish our performance guarantees by developing non-trivial concentration bounds for median-related quantities, which may be of independent interest.
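    The reshaped quadratic loss and its median-truncated variant can be summarized in a few lines of code. Below is a minimal NumPy sketch of RWF-style gradient descent for real-valued Gaussian measurements; the spectral-style initialization, step size, and truncation constant are illustrative simplifications, not the dissertation's exact choices.

```python
import numpy as np

def reshaped_wirtinger_flow(y, A, iters=500, step=0.5, truncate=False, c=3.0):
    """Gradient descent on the reshaped quadratic loss
        f(z) = (1/(2m)) * sum_i (|a_i^T z| - y_i)^2,
    for measurements y_i = |a_i^T x|. With truncate=True, each step
    keeps only samples whose residual is within c times the sample
    median, in the spirit of the median-truncation approach."""
    m, n = A.shape
    # Spectral-style initialization: leading eigenvector of
    # (1/m) * sum_i y_i^2 a_i a_i^T, scaled by an estimate of ||x||.
    # (A simplified stand-in for the exact initialization procedure.)
    Y = (A.T * y**2) @ A / m
    _, V = np.linalg.eigh(Y)
    z = V[:, -1] * np.sqrt(np.mean(y**2))
    for _ in range(iters):
        p = A @ z
        resid = np.abs(p) - y                    # |a_i^T z| - y_i
        keep = np.ones(m, dtype=bool)
        if truncate:
            keep = np.abs(resid) <= c * np.median(np.abs(resid))
        grad = A[keep].T @ (resid[keep] * np.sign(p[keep])) / m
        z = z - step * grad
    return z

# Usage: recovery is possible only up to a global sign.
rng = np.random.default_rng(0)
n, m = 100, 600
x = rng.standard_normal(n)
A = rng.standard_normal((m, n))
y = np.abs(A @ x)
z = reshaped_wirtinger_flow(y, A)
print(min(np.linalg.norm(z - x), np.linalg.norm(z + x)) / np.linalg.norm(x))
```

    Setting truncate=True keeps, at each iteration, only the samples whose residuals lie within a constant multiple of the sample median; this is the mechanism that lets the median-truncated variants ignore a constant fraction of outliers without knowing which measurements are corrupted.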

    The Capacity Region of the Source-Type Model for Secret Key and Private Key Generation

    The problem of simultaneously generating a secret key (SK) and private key (PK) pair among three terminals via public discussion is investigated, in which each terminal observes a component of correlated sources. All three terminals are required to generate a common secret key concealed from an eavesdropper that has access to the public discussion, while two designated terminals are required to generate an extra private key concealed from both the eavesdropper and the remaining terminal. An outer bound on the SK-PK capacity region was established in [1], and was shown to be achievable for one case. In this paper, achievable schemes are designed to attain the outer bound for the remaining two cases, and hence the SK-PK capacity region is established in general. The main technique lies in the novel design of a random-binning and joint-decoding scheme that achieves the existing outer bound.
    Comment: 20 pages, 4 figures
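    As a rough illustration of the random-binning primitive underlying such schemes (a generic two-terminal toy, not the paper's three-terminal SK-PK construction, which also enforces secrecy constraints), the following NumPy sketch has one terminal announce only a random linear hash (bin index) of its source block, while the other recovers the block by searching the announced bin against its own correlated observation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16          # source block length
k = 10          # bits of bin index revealed publicly
p = 0.05        # X-Y disagreement probability (source correlation)

x = rng.integers(0, 2, n)
flips = (rng.random(n) < p).astype(int)
y = x ^ flips                          # Y observes X through a BSC(p)

H = rng.integers(0, 2, (k, n))         # random linear hash (binning)
bin_index = H @ x % 2                  # the only public message

# Decoding at Y: among all sequences in the announced bin, pick the
# one closest in Hamming distance to y (brute force; a toy stand-in
# for joint-typicality decoding). In a key-agreement scheme, the key
# would then be distilled from the part of x not revealed publicly.
best, best_d = None, n + 1
for v in range(2 ** n):
    cand = np.array([(v >> i) & 1 for i in range(n)])
    if np.array_equal(H @ cand % 2, bin_index):
        d = int(np.sum(cand ^ y))
        if d < best_d:
            best, best_d = cand, d

print("decoded correctly:", np.array_equal(best, x))
```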

    Convergence Theory of Learning Over-parameterized ResNet: A Full Characterization

    The ResNet architecture has achieved great empirical success since its debut. Recent work established the convergence of learning an over-parameterized ResNet with a scaling factor $\tau = 1/L$ on the residual branch, where $L$ is the network depth. However, it is not clear how learning ResNet behaves for other values of $\tau$. In this paper, we fully characterize the convergence theory of gradient descent for learning an over-parameterized ResNet with different values of $\tau$. Specifically, hiding logarithmic factors and constant coefficients, we show that for $\tau \le 1/\sqrt{L}$ gradient descent is guaranteed to converge to a global minimum, and when $\tau \le 1/L$ the convergence is independent of the network depth. Conversely, we show that for $\tau > L^{-\frac{1}{2}+c}$ the forward output grows at least at rate $L^c$ in expectation, and learning then fails due to gradient explosion for large $L$. This means the bound $\tau \le 1/\sqrt{L}$ is sharp for learning ResNet at arbitrary depth. To the best of our knowledge, this is the first work to study learning ResNet over the full range of $\tau$.
    Comment: 31 pages
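    The object of study is easy to state in code: each residual block scales its branch by $\tau$. Here is a short PyTorch sketch with $\tau = L^{-\alpha}$; the two-layer MLP branch is a hypothetical stand-in for $F$, since the abstract's analysis concerns the scaling, not the exact branch architecture.

```python
import torch
import torch.nn as nn

class ScaledResidualBlock(nn.Module):
    """One residual block with a scaling factor tau on the residual
    branch: x_{l+1} = x_l + tau * F(x_l)."""
    def __init__(self, width: int, tau: float):
        super().__init__()
        self.tau = tau
        # Illustrative two-layer MLP branch standing in for F.
        self.branch = nn.Sequential(
            nn.Linear(width, width),
            nn.ReLU(),
            nn.Linear(width, width),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.tau * self.branch(x)

class ScaledResNet(nn.Module):
    def __init__(self, width: int = 128, depth: int = 32, alpha: float = 0.5):
        super().__init__()
        tau = depth ** (-alpha)   # alpha = 0.5 gives tau = 1/sqrt(L)
        self.blocks = nn.Sequential(
            *[ScaledResidualBlock(width, tau) for _ in range(depth)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.blocks(x)

# Forward norms stay controlled for alpha >= 0.5 as depth grows.
net = ScaledResNet(width=64, depth=64, alpha=0.5)
x = torch.randn(8, 64)
print(net(x).norm(dim=-1).mean())
```

    With alpha = 0.5 (i.e. $\tau = 1/\sqrt{L}$), the residual contributions stay bounded as depth grows, matching the abstract's sharp threshold; smaller alpha falls in the regime where the forward output can blow up with depth.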

    DiffKendall: A Novel Approach for Few-Shot Learning with Differentiable Kendall's Rank Correlation

    Few-shot learning aims to adapt models trained on a base dataset to novel tasks whose categories have not been seen by the model before. This novelty often leads to a relatively uniform distribution of feature values across channels on novel classes, posing challenges in determining channel importance for the novel task. Standard few-shot learning methods employ geometric similarity metrics, such as cosine similarity and negative Euclidean distance, to gauge the semantic relatedness between two features. However, features with high geometric similarity may carry distinct semantics, especially in the context of few-shot learning. In this paper, we demonstrate that the importance ranking of feature channels is a more reliable indicator for few-shot learning than geometric similarity metrics. We observe that replacing the geometric similarity metric with Kendall's rank correlation only during inference improves the performance of few-shot learning across a wide range of datasets with different domains. Furthermore, we propose a carefully designed differentiable loss for meta-training to address the non-differentiability of Kendall's rank correlation. Extensive experiments demonstrate that the proposed rank-correlation-based approach substantially enhances few-shot learning performance.
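    To make the two ingredients concrete, here is a short PyTorch sketch of Kendall's rank correlation over feature channels, together with one natural smooth surrogate obtained by replacing the sign function with a scaled tanh. The tanh relaxation and the sharpness constant k are illustrative assumptions, not necessarily the paper's exact surrogate.

```python
import torch

def kendall_tau(f: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
    """Hard Kendall rank correlation along the last dimension:
    fraction of concordant minus discordant channel pairs.
    Non-differentiable because of the sign function."""
    d = f.shape[-1]
    df = f.unsqueeze(-1) - f.unsqueeze(-2)      # pairwise channel diffs
    dg = g.unsqueeze(-1) - g.unsqueeze(-2)
    concord = torch.sign(df) * torch.sign(dg)   # +1 concordant, -1 discordant
    iu = torch.triu_indices(d, d, offset=1)     # each unordered pair once
    return concord[..., iu[0], iu[1]].mean(dim=-1)

def soft_kendall_tau(f: torch.Tensor, g: torch.Tensor, k: float = 10.0) -> torch.Tensor:
    """Differentiable surrogate: sign(x) is replaced by tanh(k * x) so
    the correlation can serve as a meta-training loss."""
    d = f.shape[-1]
    df = f.unsqueeze(-1) - f.unsqueeze(-2)
    dg = g.unsqueeze(-1) - g.unsqueeze(-2)
    concord = torch.tanh(k * df) * torch.tanh(k * dg)
    iu = torch.triu_indices(d, d, offset=1)
    return concord[..., iu[0], iu[1]].mean(dim=-1)

# A query is scored against class prototypes by rank correlation
# instead of cosine similarity, and gradients flow through the
# smooth surrogate during meta-training.
proto = torch.randn(5, 64)              # 5 classes, 64 channels
query = torch.randn(64, requires_grad=True)
scores = soft_kendall_tau(query.expand(5, -1), proto)
scores.sum().backward()
```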