19 research outputs found

    Constructive Approximation and Learning by Greedy Algorithms

    This thesis develops several kernel-based greedy algorithms for different machine learning problems and analyzes their theoretical and empirical properties. Greedy approaches have been extensively used in the past for tackling problems in combinatorial optimization where finding even a feasible solution can be a computationally hard problem (i.e., not solvable in polynomial time). A key feature of greedy algorithms is that a solution is constructed recursively from the smallest constituent parts. In each step of the constructive process a component is added to the partial solution from the previous step and, thus, the size of the optimization problem is reduced. The selected components are given by optimization problems that are simpler and easier to solve than the original problem. As such schemes are typically fast at constructing a solution they can be very effective on complex optimization problems where finding an optimal/good solution has a high computational cost. Moreover, greedy solutions are rather intuitive and the schemes themselves are simple to design and easy to implement. There is a large class of problems for which greedy schemes generate an optimal solution or a good approximation of the optimum. In the first part of the thesis, we develop two deterministic greedy algorithms for optimization problems in which a solution is given by a set of functions mapping an instance space to the space of reals. The first of the two approaches facilitates data understanding through interactive visualization by providing means for experts to incorporate their domain knowledge into otherwise static kernel principal component analysis. This is achieved by greedily constructing embedding directions that maximize the variance at data points (unexplained by the previously constructed embedding directions) while adhering to specified domain knowledge constraints. 
The second deterministic greedy approach is a supervised feature construction method capable of addressing the problem of kernel choice. The goal of the approach is to construct a feature representation for which a set of linear hypotheses is of sufficient capacity: large enough to contain a satisfactory solution to the considered problem and small enough to allow good generalization from a small number of training examples. The approach mimics functional gradient descent and constructs features by fitting squared error residuals. We show that the constructive process is consistent and provide conditions under which it converges to the optimal solution. In the second part of the thesis, we investigate two problems for which deterministic greedy schemes can fail to find an optimal solution or a good approximation of the optimum. This happens as a result of making a sequence of choices that take into account only the immediate reward without considering the consequences for future decisions. To address this shortcoming of deterministic greedy schemes, we propose two efficient randomized greedy algorithms which are guaranteed to find effective solutions to the corresponding problems. In the first of the two approaches, we provide a means to scale kernel methods to problems with millions of instances. An approach frequently used in practice for this type of problem is the Nyström method for low-rank approximation of kernel matrices. A crucial step in this method is the choice of landmarks, which determine the quality of the approximation. We tackle this problem with a randomized greedy algorithm based on the K-means++ cluster seeding scheme and provide a theoretical and empirical study of its effectiveness. In the second problem for which a deterministic strategy can fail to find a good solution, the goal is to find a set of objects from a structured space that are likely to exhibit an unknown target property. 
This discrete optimization problem is of significant interest to cyclic discovery processes such as de novo drug design. We propose to address it with an adaptive Metropolis–Hastings approach that samples candidates from the posterior distribution of structures conditioned on them having the target property. The proposed constructive scheme defines a consistent random process and our empirical evaluation demonstrates its effectiveness across several different application domains.
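
The abstract above mentions choosing Nyström landmarks with a randomized greedy scheme based on K-means++ seeding. As a rough illustration only, the following sketch shows the standard K-means++ (D²) seeding step on points in plain Euclidean space. It is not the thesis's algorithm: the function names `sq_dist` and `kmeanspp_landmarks`, the `seed` parameter, and the choice of squared Euclidean distance in input space (rather than, say, a kernel-induced distance) are all assumptions made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_landmarks(points, k, seed=0):
    """Return k landmark indices chosen by K-means++ (D^2) seeding.

    Hypothetical helper for illustration; distances are Euclidean in
    input space, which is an assumption of this sketch.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    rng = random.Random(seed)
    # First landmark: chosen uniformly at random.
    chosen = [rng.randrange(len(points))]
    # d2[i] = squared distance from point i to its nearest landmark so far.
    d2 = [sq_dist(p, points[chosen[0]]) for p in points]
    while len(chosen) < k:
        total = sum(d2)
        if total == 0.0:
            # Every point coincides with a landmark; take any unchosen index.
            nxt = next(i for i in range(len(points)) if i not in chosen)
        else:
            # Sample index i with probability d2[i] / total.
            r = rng.random() * total
            acc = 0.0
            nxt = None
            for i, w in enumerate(d2):
                acc += w
                if r < acc:
                    nxt = i
                    break
            if nxt is None:
                # Guard against floating-point edge cases.
                nxt = max(range(len(points)), key=d2.__getitem__)
        chosen.append(nxt)
        # Update nearest-landmark distances with the new landmark.
        d2 = [min(d, sq_dist(p, points[nxt])) for d, p in zip(d2, points)]
    return chosen
```

Each step is greedy in that a landmark, once picked, is never revisited, and randomized in that points far from the current landmarks are more likely, but not certain, to be picked next. In a Nyström approximation the returned indices would select the columns of the kernel matrix used as landmarks.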

    Quantum algorithms for machine learning and optimization

    The theories of optimization and machine learning answer foundational questions in computer science and lead to new algorithms for practical applications. While these topics have been extensively studied in the context of classical computing, their quantum counterparts are far from well understood. In this thesis, we explore algorithms that bridge the gap between the fields of quantum computing and machine learning. First, we consider general optimization problems with only function evaluations. For two core problems, namely general convex optimization and volume estimation of convex bodies, we give quantum algorithms as well as quantum lower bounds that establish that the quantum speedups for both problems are polynomial compared to their classical counterparts. We then consider machine learning and optimization problems with input data stored explicitly as matrices. We first look at semidefinite programs and provide quantum algorithms with polynomial speedup compared to the classical state of the art. We then move to machine learning and give optimal quantum algorithms for linear and kernel-based classification. To complement our quantum algorithms, we also introduce a framework for quantum-inspired classical algorithms, showing that for low-rank matrix arithmetic there can only be a polynomial quantum speedup. Finally, we study statistical problems on quantum computers, with a focus on testing properties of probability distributions. We show that for testing various properties, including L1-distance, L2-distance, and Shannon and Rényi entropies, there are polynomial quantum speedups compared to their classical counterparts. We also extend these results to testing properties of quantum states.

    An exact approach for aggregated formulations
