10 research outputs found

    Polyhedral separation via difference of convex (DC) programming

    We consider polyhedral separation of sets as a possible tool in supervised classification. In particular, we focus on the optimization model introduced by Astorino and Gaudioso (J Optim Theory Appl 112(2):265–293, 2002) and adopt its reformulation in difference of convex (DC) form. We tackle the problem by adapting the algorithm for DC programming known as DCA, and we present results from implementing DCA on a number of benchmark classification datasets.
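
    As an illustration (not taken from the paper), DCA for minimizing f = g − h, with g and h convex, alternates two steps: take a subgradient of h at the current iterate, then solve the convex subproblem obtained by linearizing h there. A minimal Python sketch on a toy scalar DC function follows; the paper's polyhedral separation model requires a far richer convex subproblem at each step.

        import numpy as np

        def dca(grad_h, argmin_g_linear, x0, max_iter=100, tol=1e-8):
            """Generic DCA loop for f = g - h with g, h convex.
            grad_h(x)          -- a (sub)gradient of h at x
            argmin_g_linear(y) -- solves the convex subproblem argmin_x g(x) - <y, x>
            """
            x = np.asarray(x0, dtype=float)
            for _ in range(max_iter):
                y = grad_h(x)               # linearize the concave part -h at x
                x_new = argmin_g_linear(y)  # solve the convexified subproblem
                if np.linalg.norm(x_new - x) < tol:
                    break
                x = x_new
            return x_new

        # Toy DC program: f(x) = x**2 - |x|, i.e. g(x) = x**2, h(x) = |x|.
        # argmin_x x**2 - y*x = y/2; a subgradient of |x| is sign(x).
        print(dca(grad_h=np.sign, argmin_g_linear=lambda y: y / 2.0, x0=0.3))  # -> 0.5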

    Combining Prior Knowledge and Data: Beyond the Bayesian Framework

    For many tasks such as text categorization and control of robotic systems, state-of-the-art learning systems can produce results comparable in accuracy to those of human subjects. However, the amount of training data needed for such systems can be prohibitively large for many practical problems. A text categorization system, for example, may need to see many text postings manually tagged with their subjects before it learns to predict the subject of the next posting with high accuracy. A reinforcement learning (RL) system learning how to drive a car needs a lot of experimentation with the actual car before acquiring the optimal policy. An optimizing compiler targeting a certain platform has to construct, compile, and execute many versions of the same code with different optimization parameters to determine which optimizations work best. Such extensive sampling can be time-consuming, expensive (in terms of both the human expertise needed to label data and the wear and tear on the robotic equipment used for exploration in the case of RL), and sometimes dangerous (e.g., an RL agent driving the car off a cliff to see if it survives the crash). The goal of this work is to reduce the amount of training data an agent needs in order to learn how to perform a task successfully. This is done by providing the system with prior knowledge about its domain. The knowledge is used to bias the agent towards useful solutions and limit the amount of training needed. We explore this task in three contexts: classification (determining the subject of a newsgroup posting), control (learning to perform tasks such as driving a car up the mountain in simulation), and optimization (optimizing the performance of linear algebra operations on different hardware platforms). For the text categorization problem, we introduce a novel algorithm which efficiently integrates prior knowledge into large margin classification. We show that prior knowledge simplifies the problem by reducing the size of the hypothesis space. We also provide formal convergence guarantees for our algorithm. For reinforcement learning, we introduce a novel framework for defining planning problems in terms of qualitative statements about the world (e.g., "the faster the car is going, the more likely it is to reach the top of the mountain"). We present an algorithm based on policy iteration for solving such qualitative problems and prove its convergence. We also present an alternative framework which allows the user to specify prior knowledge quantitatively in the form of a Markov Decision Process (MDP). This prior is used to focus exploration on those regions of the world in which the optimal policy is most sensitive to perturbations in transition probabilities and rewards. Finally, in the compiler optimization problem, the prior is based on an analytic model which determines good optimization parameters for a given platform. This model defines a Bayesian prior which, combined with empirical samples (obtained by measuring the performance of optimized code segments), determines the maximum-a-posteriori estimate of the optimization parameters.
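
    The compiler-tuning contribution reduces, at its core, to a standard maximum-a-posteriori combination of a model-based prior with noisy measurements. Below is a minimal conjugate-Gaussian sketch with hypothetical numbers; the dissertation's actual prior comes from an analytic performance model, not from these toy values.

        import numpy as np

        def gaussian_map(mu0, sigma0, samples, sigma_noise):
            """MAP estimate of a scalar parameter under a N(mu0, sigma0^2) prior
            and i.i.d. Gaussian measurements with known noise std sigma_noise;
            in this conjugate case the MAP estimate equals the posterior mean."""
            n = len(samples)
            precision = 1.0 / sigma0**2 + n / sigma_noise**2   # posterior precision
            return (mu0 / sigma0**2 + np.sum(samples) / sigma_noise**2) / precision

        # Hypothetical: the analytic model suggests a parameter value of 64;
        # three noisy empirical measurements pull the estimate toward ~59.7.
        print(gaussian_map(mu0=64.0, sigma0=8.0, samples=[58.0, 61.0, 59.0], sigma_noise=4.0))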

    A submodular optimization framework for never-ending learning: semi-supervised, online, and active learning.

    The revolution in information technology and the explosion in the use of computing devices in people's everyday activities have forever changed the perspective of the data mining and machine learning fields. The enormous amount of easily accessible, information-rich data is pushing the data analysis community towards a paradigm shift. In the new paradigm, data comes in the form of a stream of billions of records received every day. The dynamic nature of the data and its sheer size make it impossible to use the traditional notion of offline learning, where the whole dataset is accessible at any point in time. Moreover, no amount of human resources is enough to get expert feedback on all the data. In this work we have developed a unified optimization-based learning framework that addresses many of these challenges. Specifically, we developed a Never-Ending Learning framework which combines incremental/online, semi-supervised, and active learning under a unified optimization framework. The established framework is based on the class of submodular optimization methods. At the core of this work we provide a novel formulation of Semi-Supervised Support Vector Machines (S3VM) in terms of submodular set functions. The new formulation overcomes the non-convexity issues of the S3VM and provides a state-of-the-art solution that is orders of magnitude faster than the cutting-edge algorithms in the literature. Next, we provide a stream summarization technique via exemplar selection. This technique makes it possible to keep a fixed-size exemplar representation of a data stream that can be used by any label-propagation-based semi-supervised learning technique. The compact data stream representation allows a wide range of algorithms to be extended to the incremental/online learning scenario. Under the same optimization framework, we provide an active learning algorithm that constitutes the feedback loop between the learning machine and an oracle. Finally, the developed Never-Ending Learning framework is essentially transductive in nature. Therefore, our last contribution is an inductive incremental learning technique for incremental training of SVMs using the properties of local kernels. Throughout this work we demonstrate the importance and wide applicability of the proposed methodologies.
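
    A standard submodular surrogate for the exemplar-selection step described above is the facility-location objective, maximized greedily with the classic (1 − 1/e) approximation guarantee. The sketch below is a generic illustration of that pattern, not the thesis's S3VM formulation.

        import numpy as np

        def greedy_exemplars(X, k):
            """Greedily maximize the monotone submodular facility-location
            objective F(S) = sum_i max(0, max_{j in S} sim(i, j)) to pick
            k exemplar rows of X; greedy gives a (1 - 1/e) guarantee."""
            sim = X @ X.T                     # dot-product similarities
            best = np.zeros(sim.shape[0])     # current coverage of each point
            chosen = []
            for _ in range(k):
                # marginal gain of adding each candidate exemplar j
                gains = np.maximum(sim, best[:, None]).sum(axis=0) - best.sum()
                gains[chosen] = -np.inf       # never reselect an exemplar
                j = int(np.argmax(gains))
                chosen.append(j)
                best = np.maximum(best, sim[:, j])
            return chosen

        rng = np.random.default_rng(0)
        print(greedy_exemplars(rng.normal(size=(200, 5)), k=10))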

    Joint Optimization of Fidelity and Commensurability for Manifold Alignment and Graph Matching

    In this thesis, we investigate how to perform inference in settings in which the data consist of different modalities or views. To make effective use of the available information, we need data fusion that considers all views of these multiview data settings, as well as dimensionality reduction to address the problems associated with high dimensionality, or “the curse of dimensionality.” We are interested in the type of information available in the multiview data that is essential for the inference task, and we seek to determine the principles to be used throughout the dimensionality reduction and data fusion steps to provide acceptable task performance. Our research focuses on exploring how these questions and their solutions are relevant to particular data problems of interest.
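
    One common concrete instantiation of the fidelity/commensurability trade-off, assumed here purely for illustration, is an omnibus embedding: stack the two views' dissimilarity matrices into one matrix whose off-diagonal block links matched objects, then embed everything jointly with multidimensional scaling. The simplified sketch below uses uniform weights; the thesis studies how the fidelity and commensurability terms should be weighted.

        import numpy as np
        from sklearn.manifold import MDS

        def jofc_embed(D1, D2, dim=2):
            """Simplified omnibus embedding for two matched views.
            Fidelity: the diagonal blocks preserve within-view dissimilarities.
            Commensurability: zeros on the diagonal of the off-diagonal block
            pull matched pairs together; the remaining cross-view entries are
            imputed as the average of the two views."""
            n = D1.shape[0]
            off = (D1 + D2) / 2.0
            np.fill_diagonal(off, 0.0)
            omni = np.block([[D1, off], [off, D2]])
            Z = MDS(n_components=dim, dissimilarity="precomputed",
                    random_state=0).fit_transform(omni)
            return Z[:n], Z[n:]               # embeddings of view 1 and view 2

        rng = np.random.default_rng(1)
        P = rng.normal(size=(30, 3))
        D1 = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
        D2 = np.clip(D1 + 0.1 * rng.normal(size=D1.shape), 0.0, None)
        D2 = (D2 + D2.T) / 2.0
        np.fill_diagonal(D2, 0.0)
        Z1, Z2 = jofc_embed(D1, D2)           # matched rows of Z1, Z2 land nearby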

    Manifold learning for emulations of computer models

    Computer simulations are widely used in scientific research and engineering. Though they can provide accurate results, their computational expense is normally high, which hinders their application to problems where repeated evaluations are required, e.g., design optimization and uncertainty quantification. For partial differential equation (PDE) models, the outputs of interest are often spatial fields, leading to high-dimensional output spaces. Although emulators can be used to find faithful and computationally inexpensive approximations of computer models, there are few methods for handling high-dimensional output spaces. For Gaussian process (GP) emulation, approximations of the correlation structure and/or dimensionality reduction are necessary. Linear dimensionality reduction will fail when the output space is not well approximated by a linear subspace of the ambient space in which it lies. Manifold learning can overcome the limitations of linear methods if an accurate inverse map is available. In this thesis, manifold learning is applied to construct GP emulators for very high-dimensional output spaces arising from parameterized PDE model simulations. Artificial neural network (ANN) and support vector machine (SVM) emulators using manifold learning are also studied. A general framework for approximating the inverse map and a new, efficient method for diffusion maps were developed. The manifold-learning-based emulators are then used to extend reduced-order models (ROMs) based on proper orthogonal decomposition to dynamic, parameterized PDEs. A similar approach is used to extend the discrete empirical interpolation method (DEIM) to ROMs for nonlinear, parameterized dynamic PDEs.
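
    A schematic of the emulation pipeline described above: compress the high-dimensional output fields, fit one GP per reduced coordinate, and map predictions back to the field space. In this simplified sketch, PCA stands in for the reduction step because its inverse map is trivial; the thesis instead uses manifold learning (e.g. diffusion maps) together with an approximate inverse map. The toy data are hypothetical.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        def fit_reduced_gp_emulator(params, fields, n_components=5):
            """Reduced-output GP emulator: compress the output fields, then
            fit one GP per reduced coordinate and invert the reduction at
            prediction time."""
            reducer = PCA(n_components=n_components).fit(fields)
            Z = reducer.transform(fields)                  # (n_runs, n_components)
            gps = [GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
                       .fit(params, Z[:, j]) for j in range(n_components)]
            def predict(new_params):
                z = np.column_stack([gp.predict(new_params) for gp in gps])
                return reducer.inverse_transform(z)        # back to the field space
            return predict

        # Hypothetical toy data: 40 runs of a 1000-point spatial field
        rng = np.random.default_rng(0)
        params = rng.uniform(size=(40, 2))
        x = np.linspace(0, 1, 1000)
        fields = np.sin(2 * np.pi * (params[:, :1] + params[:, 1:] * x))  # (40, 1000)
        emulate = fit_reduced_gp_emulator(params, fields)
        print(emulate(rng.uniform(size=(3, 2))).shape)     # (3, 1000)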

    Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference
