103 research outputs found

    Learning Models For Corrupted Multi-Dimensional Data: Fundamental Limits And Algorithms

    Developing machine learning models for corrupted multi-dimensional datasets, such as datasets with unreliable labels and noisy multi-dimensional signals with or without missing information, has become a central necessity. We are not always fortunate enough to get noise-free datasets for developing classification and representation models. Though a number of techniques are available to deal with noisy datasets, these methods do not exploit the multi-dimensional structure of the signals, which could be used to improve the model's overall classification and representation performance. In this thesis, we develop a Kronecker-structured (K-S) subspace model that exploits the multi-dimensional structure of the signal. First, we study the classification performance of K-S subspace models in two asymptotic regimes: as the signal dimensions go to infinity and as the noise power tends to zero. We characterize the misclassification probability in terms of diversity order and derive an exact expression for the diversity order. We further derive a tighter bound on the misclassification probability in terms of the pairwise geometry of the subspaces. The proposed scheme is optimal in all signal dimension regimes except the one in which the signal dimension is less than twice the subspace dimension; such a regime is rarely encountered in practice. We empirically show that the classification performance of K-S subspace models agrees with the diversity order analysis. We also develop an algorithm, Kronecker-Structured Learning of Discriminative Dictionaries (K-SLD2), for fast and compact K-S subspace learning, yielding better classification and representation of multi-dimensional signals. We show that the K-SLD2 algorithm balances compact signal representation and good classification performance on synthetic and real-world datasets. Next, we develop a scheme to detect whether a given multi-dimensional signal with missing information lies in a given K-S subspace. We find that, under some mild incoherence conditions, we must observe O(r1 log m1) rows and O(r2 log m2) columns, where the signal is of size m1 x m2 and the factor subspaces have dimensions r1 and r2, in order to detect the K-S subspace. To account for unreliable labels in datasets, we present Nonlinear, Noise-aware, Quasiclustering (NNAQC), a method for learning deep convolutional networks from datasets corrupted by unknown label noise. We append a nonlinear noise model to a standard convolutional network and learn it in tandem with the parameters of the network. Further, we train the network using a loss function that encourages the clustering of training images. We argue that the nonlinear noise model, while not rigorous as a probabilistic model, results in a more effective denoising operator during backpropagation. We evaluate NNAQC under artificially injected label noise on the MNIST, CIFAR-10, CIFAR-100, and ImageNet datasets, and on the large-scale Clothing1M dataset, which has inherent label noise. We show that, on all these datasets, NNAQC provides significantly improved classification performance over the state of the art and is robust to both the amount of label noise and the number of training samples.
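
    As an illustration only (not code from the thesis), the classification rule a K-S subspace model implies can be sketched in a few lines of NumPy: each class k keeps factor bases A_k and B_k with orthonormal columns, vec(Y) is projected onto span(B_k ⊗ A_k) through the identity (B ⊗ A) vec(X) = vec(A X Bᵀ), and the class leaving the smallest residual wins. All names below are hypothetical.

        import numpy as np

        def ks_residual(Y, A, B):
            # Residual of Y after projecting onto span(B kron A); assumes A and B
            # have orthonormal columns. The identity (B kron A) vec(X) = vec(A X B^T)
            # means the Kronecker product is never formed explicitly.
            P = A @ (A.T @ Y) @ (B @ B.T)
            return np.linalg.norm(Y - P, "fro")

        def ks_classify(Y, bases):
            # Matched-subspace rule: pick the class whose K-S subspace
            # leaves the smallest residual energy.
            return min(bases, key=lambda k: ks_residual(Y, *bases[k]))

        # Toy check: a signal drawn from class 0's subspace plus small noise.
        rng = np.random.default_rng(0)
        ortho = lambda m, r: np.linalg.qr(rng.standard_normal((m, r)))[0]
        bases = {k: (ortho(8, 2), ortho(8, 2)) for k in (0, 1)}
        A0, B0 = bases[0]
        Y = A0 @ rng.standard_normal((2, 2)) @ B0.T + 0.01 * rng.standard_normal((8, 8))
        assert ks_classify(Y, bases) == 0

    Working with the matrix form keeps the cost at O(m1 m2 (r1 + r2)) per class instead of the O(m1 m2 r1 r2) a naive vectorized projection would need.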

    Low-rank Tensor Estimation via Riemannian Gauss-Newton: Statistical Optimality and Second-Order Convergence

    In this paper, we consider the estimation of a low Tucker rank tensor from a number of noisy linear measurements. The general problem covers many specific examples arising in applications, including tensor regression, tensor completion, and tensor PCA/SVD. We consider an efficient Riemannian Gauss-Newton (RGN) method for low Tucker rank tensor estimation. In contrast to the generic (super)linear convergence guarantees for RGN in the literature, we prove the first local quadratic convergence guarantee of RGN for low-rank tensor estimation in the noisy setting under some regularity conditions, and we provide the corresponding estimation error upper bounds. A deterministic estimation error lower bound, which matches the upper bound, demonstrates the statistical optimality of RGN. The merit of RGN is illustrated through two machine learning applications: tensor regression and tensor SVD. Finally, we provide simulation results to corroborate our theoretical findings.
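
    The RGN iteration itself involves retractions and Gauss-Newton steps on the manifold of low Tucker rank tensors, which is beyond a short sketch; as background, the following NumPy sketch shows the truncated higher-order SVD (HOSVD), the standard low Tucker rank projection and a natural warm start for iterative refinement of this kind. This is illustrative, not the paper's algorithm, and the function names are hypothetical.

        import numpy as np

        def unfold(T, mode):
            # Mode-n matricization: mode-n fibers become the columns.
            return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

        def mode_mult(T, M, mode):
            # Multiply tensor T by matrix M along the given mode.
            return np.moveaxis(np.tensordot(M, np.moveaxis(T, mode, 0), axes=1), 0, mode)

        def hosvd(T, ranks):
            # Truncated HOSVD: keep the top-r left singular vectors of each
            # unfolding, then contract T against them to form the core tensor.
            factors = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
                       for m, r in enumerate(ranks)]
            core = T
            for m, U in enumerate(factors):
                core = mode_mult(core, U.T, m)
            return core, factors

        def tucker_to_tensor(core, factors):
            # Reassemble the low Tucker rank approximation from its factors.
            T = core
            for m, U in enumerate(factors):
                T = mode_mult(T, U, m)
            return T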

    New √n-consistent, numerically stable higher-order influence function estimators

    Higher-Order Influence Functions (HOIFs) provide a unified theory for constructing rate-optimal estimators for a large class of low-dimensional (smooth) statistical functionals/parameters (and sometimes even infinite-dimensional functions) that arise in substantive fields including epidemiology, economics, and the social sciences. Since their introduction by Robins et al. (2008), HOIFs have been viewed mostly as a theoretical benchmark rather than a useful tool for statistical practice. Works aiming to flip the script are scant, but a few recent papers (Liu et al., 2017, 2021b) make partial progress. In this paper, we make a fresh attempt at achieving this goal by constructing new, numerically stable HOIF estimators (sHOIF estimators for short, with "s" standing for "stable") with provable statistical, numerical, and computational guarantees. This new class of sHOIF estimators (up to the 2nd order) was foreshadowed in synthetic experiments conducted by Liu et al. (2020a).
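
    HOIFs build higher-order U-statistic corrections on top of the classical first-order influence function estimator. Purely as background (this is not the sHOIF construction), here is a minimal sketch of the first-order, AIPW-style estimator of a mean under missingness at random; the nuisance estimates pi_hat and b_hat are assumed to be supplied by the user, and all names are hypothetical.

        import numpy as np

        def aipw_mean(y, a, pi_hat, b_hat):
            # First-order influence-function (AIPW) estimator of E[Y] when Y is
            # observed only where a == 1, under missingness at random.
            #   pi_hat: estimated propensity P(A = 1 | X)
            #   b_hat:  estimated outcome regression E[Y | X, A = 1]
            # Higher-order influence functions add U-statistic correction terms
            # on top of this first-order term to further reduce nuisance bias.
            y = np.where(a == 1, y, 0.0)  # unobserved outcomes never enter
            return np.mean(a * y / pi_hat - (a / pi_hat - 1.0) * b_hat)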
    • …