1 research outputs found

    Unsupervised Representative Selection and Signal Unmixing

    Get PDF
    This thesis presents unsupervised machine learning algorithms to tackle two related problems: selecting representatives in a dataset and identifying constituent components in mixture data. In both problems, we aim to reveal a few key hidden features that sufficiently explain the data. The main intuition behind our algorithms is that, in an appropriately constructed dictionary, a sparse representation of the data corresponds to selecting these unknown features. Our goal is to efficiently seek such sparse representations under suitable conditions. In the representative selection problem, our objective is to pick a few representative data points that capture distinguished characteristics of a dataset. This corresponds to identifying the vertices of the polytope generated by the data. To do so, we start by modeling each data point as a convex combination of the polytope vertices. Then, in the dictionary formed by the dataset itself, we look for sparse representations of the data which subsequently imply the vertices. To seek such sparse representations, we proposed a greedy pursuit algorithm and a non-convex entropy minimization algorithm. We theoretically justify our proposed algorithms and demonstrate their vertex recovery performance on both synthetic and real data. In the unmixing problem, we assume that each data point is a mixture of a few unknown components, and we wish to decompose data into these underlying constituents. We consider a highly under-sampled regime in which the number of measurements is far less than the data dimension. Furthermore, we solve an even more challenging unmixing problem in which the under-sampled mixture are indirectly observed via a nonlinear operator such as Sigmoid and Relu. To find the unknown constituents, we form a dictionaries with atoms resembling the constituents and seek the sparse representations corresponding to them. We proposed a fast and robust greedy algorithm, called UnmixMP, to find such sparse representations. We prove its robust unmixing performance and support our theoretical analysis by various experiments on both synthetic and real image data. Our algorithms are fast and robust, and supported by rigorous theoretical analysis. Our experimental results shows that the proposed are significantly more robust than state-of-the-art representative selection and unmixing algorithms in the aforementioned settings
    corecore