
    PCA using graph total variation

    Mining useful clusters from high-dimensional data has received significant attention from the signal processing and machine learning communities in recent years. Linear and non-linear dimensionality reduction have played an important role in overcoming the curse of dimensionality. However, such methods are often accompanied by problems such as high computational complexity (usually associated with nuclear norm minimization), non-convexity (for matrix factorization methods) or susceptibility to gross corruptions in the data. In this paper we propose a convex, robust, scalable and efficient Principal Component Analysis (PCA) based method to approximate the low-rank representation of high-dimensional datasets via a two-way graph regularization scheme. Compared to exact recovery methods, our method is approximate, in that it enforces a piecewise-constant assumption on the samples using a graph total variation and a piecewise-smoothness assumption on the features using a graph Tikhonov regularization. Furthermore, it retrieves the low-rank representation in a time that is linear in the number of data samples. Clustering experiments on 3 benchmark datasets with different types of corruptions show that our proposed model outperforms 7 state-of-the-art dimensionality reduction models.
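    The abstract describes the two regularizers only informally; the following is a hedged reconstruction in assumed notation, not necessarily the paper's exact formulation. With X the (features x samples) data matrix, U the sought low-rank approximation with columns u_i, w_ij the edge weights of an assumed sample graph G_s, L_f the Laplacian of an assumed feature graph, and gamma_1, gamma_2 trade-off parameters, a convex two-way regularized objective of this kind would read

        \min_U \; \|X - U\|_1 \;+\; \gamma_1 \sum_{(i,j) \in E(G_s)} w_{ij} \|u_i - u_j\|_1 \;+\; \gamma_2 \, \mathrm{tr}(U^\top L_f U).

    The \ell_1 fidelity term accounts for the robustness to gross corruptions, the graph total variation (middle term) enforces the piecewise-constant assumption on samples connected in G_s, and the graph Tikhonov term enforces the piecewise-smoothness assumption on the features.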

    From data to structures: graph learning under smoothness assumptions and applications in data science

    Weighted undirected graphs are a simple, yet powerful way to encode structure in data. A first question we need to address regarding such graphs is how to use them effectively to enhance machine learning problems. A second, more important question is how to obtain such a graph directly from the data. These two questions set the direction of this thesis.

    We first explore the use of graphs in matrix completion, with direct implications for recommendation systems. In such systems, side-information graphs arise naturally by comparing user behavior and interactions, and by product profiling. Given incomplete object ratings by users, and known graphs for user and object similarities, we propose a new matrix completion model that exploits the graphs to achieve better recommendations. Our model is convex and is solved efficiently with the Alternating Direction Method of Multipliers (ADMM). It achieves better recommendations than standard convex matrix completion, especially when very few ratings are known a priori, while being robust to the graph quality.

    Turning to the second question, we are interested in learning a graph from data. Data is usually assumed to be smooth on a graph, an assumption associated with graph sparsity and connectivity. Using this assumption, we provide a general graph learning framework that is convex and easy to optimize. We classify graph learning models according to their prior structure information as edge-local, node-local or global. Edge-local models are the cheapest computationally, learning each edge independently. Exponential decay graphs and ε-neighborhood graphs, the most prevalent weighting schemes in the literature, are such instances of our framework. We provide a new model that interpolates between these two and achieves better performance in many settings. Node-local models take into account the neighborhood of each edge. While they require iterative algorithms, they are stronger models that achieve better results. We provide a new node-local model for general graph learning under smoothness assumptions that redefines the state of the art. We also discuss the previous state of the art, two global models within our framework. We provide efficient primal-dual algorithms for solving our model and the most prevalent of the previous ones. Our model performs best in many settings.

    Large-scale applications require graphs that are easy to compute, whose cost scales gracefully with the number of nodes, and that need no parameter tuning. We provide the first large-scale graph learning solution, based on approximate nearest neighbors, with a cost of O(m log m) for m nodes. Our solution automatically finds the needed parameters given a desired sparsity level.

    We finally explore learning graphs that change in time. In many applications graphs are not static, and data comes from different versions of them. In such cases no single graph correctly summarizes connectivity, and we need to learn several graphs, one for each selected time window. Assuming that graphs change slowly, we provide two new models, based on a known global model and our own node-local one. Our convex time-varying-graph models can use any convex differentiable loss function to penalize change over time, have a cost linear in the number of time windows, and are easy to parallelize. The obtained graphs fit the data better and provide a good compromise between detail in time and computational cost.
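    The matrix completion objective is not spelled out in the abstract; a hedged sketch in assumed notation follows. With O the partially observed rating matrix, A_Omega the mask selecting the known entries, L_r and L_c the Laplacians of the user and object similarity graphs, and gamma_n, gamma_r, gamma_c assumed regularization weights, a convex graph-regularized completion model of this kind would be

        \min_M \; \|A_\Omega \circ (M - O)\|_F^2 \;+\; \gamma_n \|M\|_* \;+\; \gamma_r \, \mathrm{tr}(M^\top L_r M) \;+\; \gamma_c \, \mathrm{tr}(M L_c M^\top).

    An ADMM solver would typically split the terms so that each iteration alternates a singular value thresholding step for the nuclear norm with cheap quadratic updates for the data fidelity and the two graph terms.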
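    The edge-local weighting schemes named in the abstract are simple enough to sketch directly. The snippet below is a minimal, self-contained illustration of exponential decay and ε-neighborhood graphs; the function names, parameters and toy data are ours, not code from the thesis.

        # Minimal sketch of the two prevalent edge-local weighting schemes
        # (assumed implementations, for illustration only).
        import numpy as np

        def exponential_decay_graph(X, sigma=1.0):
            # W_ij = exp(-||x_i - x_j||^2 / sigma^2); X holds one sample per row.
            Z = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
            W = np.exp(-Z / sigma ** 2)
            np.fill_diagonal(W, 0.0)  # no self-loops
            return W

        def epsilon_neighborhood_graph(X, eps=1.0):
            # Binary graph: connect i and j iff ||x_i - x_j|| <= eps.
            D = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
            W = (D <= eps).astype(float)
            np.fill_diagonal(W, 0.0)
            return W

        # Toy usage: 50 three-dimensional samples.
        X = np.random.default_rng(0).normal(size=(50, 3))
        W = exponential_decay_graph(X, sigma=0.8)

    Both schemes decide each weight W_ij from the single distance ||x_i - x_j|| alone, which is what makes them edge-local and computationally cheap.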
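    For the node-local graph learning model, the following is a hedged sketch under the smoothness assumption, in assumed notation rather than the thesis's exact formulation. With Z_ij = ||x_i - x_j||^2 the pairwise squared distances between the data at nodes i and j, W the sought adjacency matrix (nonnegative, symmetric, zero diagonal), and alpha, beta assumed parameters, an objective of this family reads

        \min_{W \ge 0,\; W = W^\top,\; \mathrm{diag}(W) = 0} \; \|W \circ Z\|_{1,1} \;-\; \alpha \, \mathbf{1}^\top \log(W \mathbf{1}) \;+\; \frac{\beta}{2} \|W\|_F^2.

    The first term is the usual graph smoothness measure rewritten edge-wise, so large weights are only placed where signals are similar; the logarithmic barrier on the degree vector W1 couples all edges incident to a node (hence node-local) and keeps every node connected; the Frobenius term controls the spread of the edge weights. Being convex, such a problem is amenable to the primal-dual algorithms the abstract mentions.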
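    Finally, a hedged sketch of the time-varying extension, again in assumed notation. With W_1, ..., W_T the graphs of the T time windows, f(W_t; Z_t) a per-window graph learning objective as above, phi a convex differentiable penalty on change, and eta an assumed trade-off parameter:

        \min_{W_1, \dots, W_T} \; \sum_{t=1}^{T} f(W_t; Z_t) \;+\; \eta \sum_{t=2}^{T} \varphi(W_t - W_{t-1}).

    The coupling penalty only links consecutive windows, which is what keeps the cost linear in the number of time windows and makes the models easy to parallelize.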