
    Fast Graph Laplacian regularized kernel learning via semidefinite-quadratic-linear programming.

    Wu, Xiaoming. Thesis (M.Phil.)--Chinese University of Hong Kong, 2011. Includes bibliographical references (p. 30-34). Abstracts in English and Chinese.
    Contents:
    Abstract --- p.i
    Acknowledgement --- p.iv
    Chapter 1 Introduction --- p.1
    Chapter 2 Preliminaries --- p.4
    2.1 Kernel Learning Theory --- p.4
    2.1.1 Positive Semidefinite Kernel --- p.4
    2.1.2 The Reproducing Kernel Map --- p.6
    2.1.3 Kernel Tricks --- p.7
    2.2 Spectral Graph Theory --- p.8
    2.2.1 Graph Laplacian --- p.8
    2.2.2 Eigenvectors of Graph Laplacian --- p.9
    2.3 Convex Optimization --- p.10
    2.3.1 From Linear to Conic Programming --- p.11
    2.3.2 Second-Order Cone Programming --- p.12
    2.3.3 Semidefinite Programming --- p.12
    Chapter 3 Fast Graph Laplacian Regularized Kernel Learning --- p.14
    3.1 The Problems --- p.14
    3.1.1 MVU --- p.16
    3.1.2 PCP --- p.17
    3.1.3 Low-Rank Approximation: from SDP to QSDP --- p.18
    3.2 Previous Approach: from QSDP to SDP --- p.20
    3.3 Our Formulation: from QSDP to SQLP --- p.21
    3.4 Experimental Results --- p.23
    3.4.1 The Results --- p.25
    Chapter 4 Conclusion --- p.28
    Bibliography --- p.3
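    The thesis builds on the graph Laplacian and its eigenvectors (Sections 2.2.1-2.2.2). A minimal sketch of that standard preliminary, forming a normalized graph Laplacian from a similarity matrix and taking its smallest eigenvectors (this illustrates the background construction only, not the thesis's SQLP formulation), could look like:

```python
import numpy as np

def laplacian_embedding(W, k):
    """Normalized graph Laplacian L = I - D^{-1/2} W D^{-1/2} and its
    k eigenvectors of smallest eigenvalue (standard spectral construction)."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(W.shape[0]) - (d_inv_sqrt[:, None] * W) * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return eigvals[:k], eigvecs[:, :k]

# toy symmetric similarity matrix between three points
W = np.array([[0.0, 1.0, 0.2],
              [1.0, 0.0, 0.3],
              [0.2, 0.3, 0.0]])
vals, vecs = laplacian_embedding(W, k=2)
```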

    Matrix completion and extrapolation via kernel regression

    Matrix completion and extrapolation (MCEX) are dealt with here over reproducing kernel Hilbert spaces (RKHSs) in order to account for prior information present in the available data. Aiming at a fast, low-complexity solver, the task is formulated as kernel ridge regression. The resultant MCEX algorithm also affords an online implementation, and the adopted class of kernel functions encompasses several existing approaches to MC with prior information. Numerical tests on synthetic and real datasets show that the novel approach runs faster than widespread methods such as alternating least squares (ALS) or stochastic gradient descent (SGD), and that the recovery error is reduced, especially when dealing with noisy data.
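    As a rough illustration of the kernel ridge regression viewpoint described above (a minimal sketch under assumed row-side features and an assumed RBF kernel; it is not the paper's MCEX algorithm, and the column-by-column fitting and regularization weight are illustrative choices):

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian kernel between rows of X (assumed side information)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_ridge_complete(M, mask, X_rows, lam=1e-2, gamma=1.0):
    """Fill missing entries of M column by column with kernel ridge
    regression on row-side features; mask[i, j] is True where M[i, j]
    is observed."""
    K = rbf_kernel(X_rows, gamma)
    M_hat = M.copy()
    for j in range(M.shape[1]):
        obs = mask[:, j]
        if obs.all() or not obs.any():
            continue
        K_oo = K[np.ix_(obs, obs)]
        alpha = np.linalg.solve(K_oo + lam * np.eye(obs.sum()), M[obs, j])
        M_hat[~obs, j] = K[np.ix_(~obs, obs)] @ alpha
    return M_hat
```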

    Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis

    Database theory and database practice are typically the domain of computer scientists who adopt what may be termed an algorithmic perspective on their data. This perspective is very different from the more statistical perspective adopted by statisticians, scientific computing researchers, machine learners, and others who work on what may be broadly termed statistical data analysis. In this article, I will address fundamental aspects of this algorithmic-statistical disconnect, with an eye to bridging the gap between these two very different approaches. A concept that lies at the heart of this disconnect is that of statistical regularization, a notion that has to do with how robust the output of an algorithm is to the noise properties of the input data. Although it is nearly completely absent from computer science, which historically has taken the input data as given and modeled algorithms discretely, regularization in one form or another is central to nearly every application domain that applies algorithms to noisy data. By using several case studies, I will illustrate, both theoretically and empirically, the nonobvious fact that approximate computation, in and of itself, can implicitly lead to statistical regularization. This and other recent work suggests that, by exploiting in a more principled way the statistical properties implicit in worst-case algorithms, one can in many cases satisfy the bicriteria of having algorithms that are scalable to very large-scale databases and that also have good inferential or predictive properties. Comment: To appear in the Proceedings of the 2012 ACM Symposium on Principles of Database Systems (PODS 2012).
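    One concrete instance of the "approximate computation can act as implicit regularization" point is early stopping: running only a few gradient-descent iterations on a least-squares problem keeps the iterate small, much like an explicit ridge penalty would (an illustrative sketch with arbitrary synthetic data, step size, and iteration count, not an example taken from the article):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 50
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.5 * rng.normal(size=n)       # noisy observations

# Exact (unregularized) least-squares solution
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]

# "Approximate computation": stop gradient descent after a few steps.
# The early-stopped iterate has a smaller norm, i.e. it behaves as if
# an implicit ridge penalty had been applied.
x = np.zeros(d)
step = 1.0 / np.linalg.norm(A, 2) ** 2
for _ in range(20):
    x -= step * A.T @ (A @ x - b)

print(np.linalg.norm(x), np.linalg.norm(x_exact))
```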

    Learning with Multiple Similarities

    The notion of similarity between data points is central to many classification and clustering algorithms. We often encounter situations in which there is more than one pairwise similarity graph over the objects, arising from different measures of similarity, from a single similarity measure defined on multiple data representations, or from a combination of these. Such examples can be found in various applications in computer vision, natural language processing, and computational biology. Combining information from these multiple sources is often beneficial in learning meaningful concepts from data. This dissertation proposes novel methods to effectively fuse information from multiple similarity graphs, targeted towards two fundamental tasks in machine learning: classification and clustering. In particular, I propose two models for learning a spectral embedding from multiple similarity graphs using ideas from co-training and co-regularization. Further, I propose a novel approach to the problem of multiple kernel learning (MKL), converting it to the more familiar problem of binary classification in a transformed space. The proposed MKL approach learns a "good" linear combination of base kernels by optimizing a quality criterion that is justified both empirically and theoretically. The ideas of the proposed MKL method are also extended to learning nonlinear combinations of kernels, in particular polynomial kernel combinations and more general nonlinear combinations using random forests.
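    A toy illustration of combining several base kernels before classification (a hedged sketch using simple centered kernel-target alignment weights rather than the dissertation's binary-classification MKL formulation; K_rbf, K_poly, and y_train in the usage comment are hypothetical names):

```python
import numpy as np
from sklearn.svm import SVC

def alignment(K, y):
    """Centered kernel-target alignment between kernel matrix K and labels y in {-1, +1}."""
    n = len(y)
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    Y = np.outer(y, y)
    return (Kc * Y).sum() / (np.linalg.norm(Kc) * np.linalg.norm(Y) + 1e-12)

def combine_kernels(kernels, y):
    """Weight each base kernel by its (clipped) alignment with the labels
    and return the convex combination along with the weights."""
    w = np.array([max(alignment(K, y), 0.0) for K in kernels])
    w = w / (w.sum() + 1e-12)
    return sum(wi * K for wi, K in zip(w, kernels)), w

# usage sketch: K_combined, weights = combine_kernels([K_rbf, K_poly], y_train)
#               clf = SVC(kernel="precomputed").fit(K_combined, y_train)
```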

    Conic Optimization Theory: Convexification Techniques and Numerical Algorithms

    Optimization is at the core of control theory and appears in several areas of this field, such as optimal control, distributed control, system identification, robust control, state estimation, model predictive control and dynamic programming. The recent advances in various topics of modern optimization have also been revamping the area of machine learning. Motivated by the crucial role of optimization theory in the design, analysis, control and operation of real-world systems, this tutorial paper offers a detailed overview of some major advances in this area, namely conic optimization and its emerging applications. First, we discuss the importance of conic optimization in different areas. Then, we explain seminal results on the design of hierarchies of convex relaxations for a wide range of nonconvex problems. Finally, we study different numerical algorithms for large-scale conic optimization problems. Comment: 18 pages.
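    To make the "hierarchies of convex relaxations" idea concrete, here is a minimal sketch of the classic semidefinite relaxation of max-cut in CVXPY (a textbook illustrative example, assuming CVXPY with its bundled SDP solver; the tutorial itself covers far more general constructions):

```python
import cvxpy as cp
import numpy as np

# Weighted adjacency matrix of a small toy graph
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
n = W.shape[0]

# SDP relaxation of max-cut: replace the rank-one matrix x x^T
# (with x_i in {-1, +1}) by a PSD matrix X with unit diagonal.
X = cp.Variable((n, n), PSD=True)
objective = cp.Maximize(cp.sum(cp.multiply(W, 1 - X)) / 4)
constraints = [cp.diag(X) == 1]
prob = cp.Problem(objective, constraints)
prob.solve()
print("SDP upper bound on the max-cut value:", prob.value)
```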