129 research outputs found

    Scalable Loss-calibrated Bayesian Decision Theory and Preference Learning

    Bayesian decision theory provides a framework for optimal action selection under uncertainty given a utility function over actions and world states and a distribution over world states. The application of Bayesian decision theory in practice is often limited by two problems: (1) in application domains such as recommendation, the true utility function of a user is a priori unknown and must be learned from user interactions; and (2) computing expected utilities under complex state distributions and (potentially uncertain) utility functions is often computationally expensive and requires tractable approximations. In this thesis, we aim to address both of these problems. For (1), we take a Bayesian non-parametric approach to utility function modeling and learning. In our first contribution, we exploit community structure prevalent in collective user preferences using a Dirichlet Process mixture of Gaussian Processes (GPs). In our second contribution, we take the underlying GP preference model of the first contribution and show how to jointly address both (1) and (2) by sparsifying the GP model in order to preserve optimal decisions while ensuring tractable expected utility computations. In our third and final contribution, we directly address (2) in a Monte Carlo framework by deriving an optimal loss-calibrated importance sampling distribution and show how it can be extended to uncertain utility representations developed in the previous contributions. Our empirical evaluations in various applications including multiple preference learning problems using synthetic and real user data and robotics decision-making scenarios derived from actual occupancy grid maps demonstrate the effectiveness of the theoretical foundations laid in this thesis and pave the way for future advances that address important practical problems at the intersection of Bayesian decision theory and scalable machine learning
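
The Monte Carlo view of expected-utility maximization described in this abstract can be illustrated with a minimal sketch. It uses plain self-normalized importance sampling, not the loss-calibrated proposal derived in the thesis. The Gaussian state distribution, the wider Gaussian proposal, the quadratic utility, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def log_normal_pdf(x, mean, std):
    """Log density of a univariate normal distribution."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))

def expected_utilities_is(actions, utility, n_samples=10_000, seed=0):
    """Estimate E_p[u(a, s)] for each action by self-normalized importance sampling.

    Assumed setup (hypothetical): target p = N(0, 1), proposal q = N(0, 2^2).
    """
    rng = np.random.default_rng(seed)
    s = rng.normal(0.0, 2.0, size=n_samples)            # draw states from q
    log_w = log_normal_pdf(s, 0.0, 1.0) - log_normal_pdf(s, 0.0, 2.0)
    w = np.exp(log_w)
    w /= w.sum()                                         # self-normalize the weights
    return {a: float(np.sum(w * utility(a, s))) for a in actions}

def quadratic_utility(a, s):
    """Illustrative utility: penalize squared distance between action and state."""
    return -(a - s) ** 2

eu = expected_utilities_is([-1.0, 0.0, 1.0], quadratic_utility)
best_action = max(eu, key=eu.get)   # Bayes-optimal action under the estimates
```

For this toy problem the exact expected utility of action a is -(a^2 + 1), so the estimates can be checked against a closed form. A loss-calibrated proposal would instead concentrate samples where they change the ranking of actions.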

    The Manifold of Neural Responses Informs Physiological Circuits in the Visual System

    The rapid development of multi-electrode and imaging techniques is leading to a data explosion in neuroscience, opening the possibility of truly understanding the organization and functionality of our visual systems. Furthermore, the need for more natural visual stimuli greatly increases the complexity of the data. Together, these create a challenge for machine learning. Our goal in this thesis is to develop one such technique. The central pillar of our contribution is designing a manifold of neurons, and providing an algorithmic approach to inferring it. This manifold is functional, in the sense that nearby neurons on the manifold respond similarly (in time) to similar aspects of the stimulus ensemble. By organizing the neurons, our manifold differs from the standard manifolds used in visual neuroscience, which instead organize the stimuli. Our contributions to the machine learning component of the thesis are twofold. First, we develop a tensor representation of the data, adopting a multilinear view of potential circuitry. Tensor factorization then provides an intermediate representation between the neural data and the manifold. We found that the rank of the neural factor matrix can be used to select an appropriate number of tensor factors. Second, to apply manifold learning techniques, a similarity kernel on the data must be defined. Like many others, we employ a Gaussian kernel, but refine it based on a proposed graph sparsification technique, which makes the resulting manifolds less sensitive to the choice of bandwidth parameter. We apply this method to neuroscience data recorded from retina and primary visual cortex in the mouse. For the algorithm to work, however, the underlying circuitry must be exercised to as full an extent as possible. To this end, we develop an ensemble of flow stimuli, which simulate what the mouse would 'see' running through a field.
Applying the algorithm to the retina reveals that neurons form clusters corresponding to known retinal ganglion cell types. In the cortex, a continuous manifold is found, indicating that, from a functional circuit point of view, there may be a continuum of cortical function types. Interestingly, both manifolds share similar global coordinates, which hint at what the key ingredients to vision might be. Lastly, we turn to perhaps the most widely used model for the cortex: deep convolutional networks. Their feedforward architecture leads to manifolds that are even more clustered than the retina, and not at all like that of the cortex. This suggests, perhaps, that they may not suffice as general models for Artificial Intelligence

    HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation

    Recently, crowdsourcing has emerged as an effective paradigm for human-powered large-scale problem solving in various domains. However, a task requester usually has a limited budget, so it is desirable to have a policy that allocates the budget wisely to achieve better quality. In this paper, we study the principle of information maximization for active sampling strategies in the framework of HodgeRank, an approach based on Hodge Decomposition of pairwise ranking data with multiple workers. The principle exhibits two scenarios of active sampling: Fisher information maximization that leads to unsupervised sampling based on a sequential maximization of graph algebraic connectivity without considering labels; and Bayesian information maximization that selects samples with the largest information gain from prior to posterior, which gives a supervised sampling involving the labels collected. Experiments show that the proposed methods boost the sampling efficiency as compared to traditional sampling schemes and are thus valuable to practical crowdsourcing experiments. Comment: Accepted by AAAI201
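
The two ingredients named in this abstract, a least-squares ranking on the comparison graph and unsupervised sampling by algebraic connectivity, can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: every comparison has unit weight, workers are pooled, and the function names and the exhaustive greedy search over candidate pairs are hypothetical choices.

```python
import itertools
import numpy as np

def graph_laplacian(n_items, pairs):
    """Unweighted Laplacian of the comparison graph (repeated pairs add weight)."""
    L = np.zeros((n_items, n_items))
    for i, j in pairs:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return L

def hodgerank_scores(n_items, comparisons):
    """Minimize sum over (i, j, y) of (s_j - s_i - y)^2, where y > 0 means j beats i.

    The normal equations are L s = div; the pseudoinverse returns the
    zero-mean solution when the comparison graph is connected.
    """
    L = graph_laplacian(n_items, [(i, j) for i, j, _ in comparisons])
    div = np.zeros(n_items)
    for i, j, y in comparisons:
        div[j] += y
        div[i] -= y
    return np.linalg.pinv(L) @ div

def algebraic_connectivity(L):
    """Second-smallest eigenvalue of the Laplacian (the Fiedler value)."""
    return float(np.linalg.eigvalsh(L)[1])

def next_pair_unsupervised(n_items, sampled_pairs):
    """Greedy label-free choice: the pair whose addition maximizes connectivity."""
    best_pair, best_value = None, -np.inf
    for pair in itertools.combinations(range(n_items), 2):
        L = graph_laplacian(n_items, list(sampled_pairs) + [pair])
        value = algebraic_connectivity(L)
        if value > best_value:
            best_pair, best_value = pair, value
    return best_pair

scores = hodgerank_scores(3, [(0, 1, 1.0), (1, 2, 1.0)])
pair = next_pair_unsupervised(4, [(0, 1), (1, 2), (2, 3)])
```

On a four-item chain of comparisons, the greedy step closes the chain into a cycle by pairing the two end items, which is the addition that raises the Fiedler value the most.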

    Belief-space Planning for Active Visual SLAM in Underwater Environments.

    Autonomous mobile robots operating in a priori unknown environments must be able to integrate path planning with simultaneous localization and mapping (SLAM) in order to perform tasks like exploration, search and rescue, inspection, reconnaissance, target-tracking, and others. This level of autonomy is especially difficult in underwater environments, where GPS is unavailable, communication is limited, and environment features may be sparsely distributed. In these situations, the path taken by the robot can drastically affect the performance of SLAM, so the robot must plan and act intelligently and efficiently to ensure successful task completion. This document proposes novel research in belief-space planning for active visual SLAM in underwater environments. Our motivating application is ship hull inspection with an autonomous underwater robot. We design a Gaussian belief-space planning formulation that accounts for the randomness of the loop-closure measurements in visual SLAM and serves as the mathematical foundation for the research in this thesis. Combining this planning formulation with sampling-based techniques, we efficiently search for loop-closure actions throughout the environment and present a two-step approach for selecting revisit actions that results in an opportunistic active SLAM framework. The proposed active SLAM method is tested in hybrid simulations and real-world field trials of an underwater robot performing inspections of a physical modeling basin and a U.S. Coast Guard cutter. To reduce computational load, we present research into efficient planning by compressing the representation and examining the structure of the underlying SLAM system. We propose the use of graph sparsification methods online to reduce complexity by planning with an approximate distribution that represents the original, full pose graph.
We also propose the use of the Bayes tree data structure, first introduced for fast inference in SLAM, to perform efficient incremental updates when evaluating candidate plans that are similar. As a final contribution, we design risk-averse objective functions that account for the randomness within our planning formulation. We show that this aversion to uncertainty in the posterior belief leads to desirable and intuitive behavior within active SLAM. PhD, Mechanical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/133303/1/schaves_1.pd

    SOBER: Highly Parallel Bayesian Optimization and Bayesian Quadrature over Discrete and Mixed Spaces

    Batch Bayesian optimisation and Bayesian quadrature have been shown to be sample-efficient methods of performing optimisation and quadrature where expensive-to-evaluate objective functions can be queried in parallel. However, current methods do not scale to large batch sizes -- a frequent desideratum in practice (e.g. drug discovery or simulation-based inference). We present a novel algorithm, SOBER, which permits scalable and diversified batch global optimisation and quadrature with arbitrary acquisition functions and kernels over discrete and mixed spaces. The key to our approach is to reformulate batch selection for global optimisation as a quadrature problem, which relaxes acquisition function maximisation (non-convex) to kernel recombination (convex). Bridging global optimisation and quadrature can efficiently solve both tasks by balancing the merits of exploitative Bayesian optimisation and explorative Bayesian quadrature. We show that SOBER outperforms 11 competitive baselines on 12 synthetic and diverse real-world tasks. Comment: 34 pages, 12 figure

    Learning to soar: exploration strategies in reinforcement learning for resource-constrained missions

    An unpowered aerial glider learning to soar in a wind field presents a new manifestation of the exploration-exploitation trade-off. This thesis proposes a directed, adaptive and nonmyopic exploration strategy in a temporal difference reinforcement learning framework for tackling the resource-constrained exploration-exploitation task of this autonomous soaring problem. The complete learning algorithm is developed in a SARSA(λ) framework, which uses a Gaussian process with a squared exponential covariance function to approximate the value function. The three key contributions of this thesis form the proposed exploration-exploitation strategy. Firstly, a new information measure is derived from the change in the variance volume surrounding the Gaussian process estimate. This measure of information gain is used to define the exploration reward of an observation. Secondly, a nonmyopic information value is presented that captures both the immediate exploration reward due to taking an action as well as future exploration opportunities that result. Finally, this information value is combined with the state-action value of SARSA(λ) through a dynamic weighting factor to produce an exploration-exploitation management scheme for resource-constrained learning systems. The proposed learning strategy encourages either exploratory or exploitative behaviour depending on the requirements of the learning task and the available resources. The performance of the learning algorithms presented in this thesis is compared against other SARSA(λ) methods. Results show that actively directing exploration to regions of the state-action space with high uncertainty improves the rate of learning, while dynamic management of the exploration-exploitation behaviour according to the available resources produces prudent learning behaviour in resource-constrained systems.
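
The idea of rewarding an observation by how much it shrinks Gaussian process uncertainty can be sketched as below. This is a simplified stand-in, not the thesis's variance-volume measure: it sums the posterior variance over a fixed query grid and takes the difference before and after a candidate observation. The hyperparameters, the noise level, the grid, and the function names are all assumptions for illustration.

```python
import numpy as np

def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared exponential covariance between row-vector inputs A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale ** 2)

def posterior_variance(X_obs, X_query, noise_var=1e-2):
    """GP posterior variance at X_query; it depends only on input locations."""
    K = sq_exp_kernel(X_obs, X_obs) + noise_var * np.eye(len(X_obs))
    Ks = sq_exp_kernel(X_query, X_obs)
    prior = np.diag(sq_exp_kernel(X_query, X_query))
    # diag(Ks K^{-1} Ks^T) computed without forming the full matrix product
    return prior - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))

def exploration_reward(X_obs, x_new, X_query):
    """Total variance reduction over the grid from adding one observation."""
    before = posterior_variance(X_obs, X_query).sum()
    after = posterior_variance(np.vstack([X_obs, x_new]), X_query).sum()
    return float(before - after)

X_obs = np.array([[0.0]])                       # one existing observation
grid = np.linspace(-3.0, 3.0, 13).reshape(-1, 1)
near = exploration_reward(X_obs, np.array([[0.1]]), grid)
far = exploration_reward(X_obs, np.array([[2.5]]), grid)
```

Under these assumptions, an observation far from existing data earns a larger reward than one placed next to a point already visited, which is the behaviour a directed exploration strategy relies on.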