
    Model Estimation Within Planning and Learning

    Risk and reward are fundamental concepts in the cooperative control of unmanned systems. In this research, we focus on developing a constructive relationship between cooperative planning and learning algorithms to mitigate the learning risk, while boosting the asymptotic performance of the system (planner and learner) and guaranteeing the safety of agent behavior. Our framework is an instance of the intelligent cooperative control architecture (iCCA), in which the learner incrementally improves on the output of a baseline planner through interaction and constrained exploration. We extend previous work by extracting the embedded parameterized transition model from within the cooperative planner and making it adaptable and accessible to all iCCA modules. We empirically demonstrate the advantage of using an adaptive model over a static model and pure learning approaches in an example GridWorld problem and a UAV mission planning scenario with 200 million possibilities. Finally, we discuss two extensions to our approach to handle cases where the true model cannot be captured exactly through the presumed functional form. Funding: United States Air Force Office of Scientific Research (FA9550-09-1-0522); Natural Sciences and Engineering Research Council of Canada.
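
To make the idea of an adaptable parameterized transition model concrete, here is a minimal sketch. It assumes the model has a single parameter, the probability that a GridWorld action "slips" to an unintended cell, and that this parameter is re-estimated from observed transitions. The class name `AdaptiveSlipModel`, the pseudo-count prior, and the single-parameter form are illustrative assumptions, not the paper's implementation.

```python
class AdaptiveSlipModel:
    """Hypothetical one-parameter transition model: P(action slips)."""

    def __init__(self, prior_slips=1.0, prior_total=2.0):
        # Pseudo-counts encode the initial (static) guess: 1/2 = 0.5 here.
        self.slips = prior_slips
        self.total = prior_total

    def update(self, intended_state, observed_state):
        # Count one more transition; it is a slip if the agent did not
        # land where the planner's nominal model said it would.
        self.total += 1.0
        if intended_state != observed_state:
            self.slips += 1.0

    @property
    def slip_probability(self):
        # Posterior-mean style estimate shared by planner and learner.
        return self.slips / self.total


model = AdaptiveSlipModel()
initial_estimate = model.slip_probability

# Eight transitions go as intended, two slip to a neighbouring cell.
for _ in range(8):
    model.update((0, 1), (0, 1))
for _ in range(2):
    model.update((0, 1), (1, 0))

updated_estimate = model.slip_probability
```

A static model would keep using `initial_estimate`; the adaptive one moves toward the observed slip rate as more transitions arrive.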

    One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration

    In online reinforcement learning (online RL), balancing exploration and exploitation is crucial for finding an optimal policy in a sample-efficient way. To achieve this, existing sample-efficient online RL algorithms typically consist of three components: estimation, planning, and exploration. However, to cope with general function approximators, most of them involve impractical algorithmic components to incentivize exploration, such as optimization within data-dependent level-sets or complicated sampling procedures. To address this challenge, we propose an easy-to-implement RL framework called Maximize to Explore (MEX), which only needs to optimize a single unconstrained objective that integrates the estimation and planning components while balancing exploration and exploitation automatically. Theoretically, we prove that MEX achieves sublinear regret with general function approximation for Markov decision processes (MDP) and is further extendable to two-player zero-sum Markov games (MG). Meanwhile, we adapt deep RL baselines to design practical versions of MEX, in both model-free and model-based manners, which can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards. Compared with existing sample-efficient online RL algorithms with general function approximation, MEX achieves similar sample efficiency while enjoying a lower computational cost and is more compatible with modern deep RL methods.
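
The single-objective idea can be illustrated with a toy sketch. It assumes a finite set of candidate hypotheses, each summarized by two numbers: the optimal value it predicts at the initial state (the planning term) and its loss on the data collected so far (the estimation term). The selected hypothesis maximizes value minus a weighted loss. The names `Hypothesis`, `mex_select`, and the weight `eta` are illustrative assumptions; the abstract does not give the exact form of the objective.

```python
from collections import namedtuple

# Hypothetical summary of one candidate model or value function.
Hypothesis = namedtuple("Hypothesis", ["name", "predicted_value", "data_loss"])


def mex_select(hypotheses, eta):
    """Pick the hypothesis maximizing predicted_value - eta * data_loss.

    A large predicted value rewards optimism (exploration); the loss term
    penalizes hypotheses that disagree with observed data (estimation).
    """
    return max(hypotheses, key=lambda h: h.predicted_value - eta * h.data_loss)


candidates = [
    Hypothesis("optimistic", predicted_value=10.0, data_loss=4.0),
    Hypothesis("fitted", predicted_value=6.0, data_loss=0.5),
    Hypothesis("pessimistic", predicted_value=3.0, data_loss=0.4),
]

# Small eta: optimism dominates. Large eta: data fit dominates.
early_choice = mex_select(candidates, eta=0.5).name
late_choice = mex_select(candidates, eta=5.0).name
```

Because the selection is a plain unconstrained maximization, no level-set constraint or special sampling step is needed in this toy setting.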

    FactorJoin: A New Cardinality Estimation Framework for Join Queries

    Cardinality estimation is one of the most fundamental and challenging problems in query optimization. Neither classical nor learning-based methods yield satisfactory performance when estimating the cardinality of join queries. They either rely on simplified assumptions, leading to ineffective cardinality estimates, or build large models to understand the data distributions, leading to long planning times and a lack of generalizability across queries. In this paper, we propose a new framework, FactorJoin, for estimating join queries. FactorJoin combines the idea behind the classical join-histogram method, to efficiently handle joins, with learning-based methods, to accurately capture attribute correlation. Specifically, FactorJoin scans every table in a database and builds single-table conditional distributions during an offline preparation phase. When a join query arrives, FactorJoin translates it into a factor graph model over the learned distributions to effectively and efficiently estimate its cardinality. Unlike existing learning-based methods, FactorJoin does not need to de-normalize joins upfront or require executed query workloads to train the model. Since it relies only on single-table statistics, FactorJoin has a small space overhead and is extremely easy to train and maintain. In our evaluation, FactorJoin can produce more effective estimates than the previous state-of-the-art learning-based methods, with 40x less estimation latency, 100x smaller model size, and 100x faster training speed at comparable or better accuracy. In addition, FactorJoin can estimate 10,000 sub-plan queries within one second to optimize the query plan, which is very close to the traditional cardinality estimators in commercial DBMS. Comment: Paper accepted by SIGMOD 202
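
The classical join-histogram idea mentioned above can be sketched as follows. This is the textbook bin-wise estimate under a uniformity assumption inside each bin, not FactorJoin's factor-graph inference. The function names and the fixed-width binning are illustrative assumptions.

```python
from collections import Counter


def build_histogram(keys, bin_width):
    """Per-bin row counts and distinct-value counts for one table's join key."""
    counts = Counter()
    distinct = {}
    for key in keys:
        b = key // bin_width
        counts[b] += 1
        distinct.setdefault(b, set()).add(key)
    return counts, {b: len(values) for b, values in distinct.items()}


def estimate_join_size(hist_a, hist_b):
    """Sum over shared bins of rows_a * rows_b / max(distinct_a, distinct_b)."""
    counts_a, distinct_a = hist_a
    counts_b, distinct_b = hist_b
    total = 0.0
    for b in counts_a.keys() & counts_b.keys():
        total += counts_a[b] * counts_b[b] / max(distinct_a[b], distinct_b[b])
    return total


def exact_join_size(keys_a, keys_b):
    """Ground truth for an equi-join: sum of per-key count products."""
    count_a, count_b = Counter(keys_a), Counter(keys_b)
    return sum(count_a[k] * count_b[k] for k in count_a.keys() & count_b.keys())


keys_a = [1, 1, 2, 3, 10, 11]
keys_b = [1, 2, 2, 3, 3, 12]

coarse = estimate_join_size(build_histogram(keys_a, 10), build_histogram(keys_b, 10))
fine = estimate_join_size(build_histogram(keys_a, 1), build_histogram(keys_b, 1))
truth = exact_join_size(keys_a, keys_b)
```

With one value per bin the estimate matches the exact join size; wider bins trade accuracy for smaller statistics, and only single-table histograms are ever built.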

    Bayesian Active Learning for Personalization and Uncertainty Quantification in Cardiac Electrophysiological Model

    Cardiovascular disease is the leading cause of death worldwide. In recent years, high-fidelity personalized models of the heart have shown an increasing capability to supplement clinical cardiology for improved patient-specific diagnosis, prediction, and treatment planning. In addition, they have shown promise to improve scientific understanding of a variety of disease mechanisms. However, model personalization, which requires estimating patient-specific tissue properties that appear as parameters of a physiological model, is challenging. This is because tissue properties, in general, cannot be directly measured and need to be estimated from measurements that are indirectly related to them through a physiological model. Moreover, these unknown tissue properties are heterogeneous and vary spatially throughout the heart volume, which makes this a high-dimensional (HD) estimation problem with indirect and limited measurement data. The challenge in model personalization therefore reduces to solving an ill-posed inverse problem where the unknown parameters are HD and the forward model is a complex, non-linear, and computationally expensive physiological model. In this dissertation, we address the above challenge with the following contributions. First, to address the concern of a complex forward model, we propose surrogate modeling of the complex target function containing the forward model (an objective function in deterministic estimation or a posterior probability density function in probabilistic estimation) by actively selecting a set of training samples and applying a Bayesian update of the prior over the target function. The efficient and accurate surrogate of the expensive target function obtained in this manner is then used to accelerate either deterministic or probabilistic parameter estimation.
Next, within the framework of Bayesian active learning, we enable active surrogate learning over an HD parameter space with two novel approaches: 1) a multi-scale optimization that can adaptively allocate higher resolution to heterogeneous tissue regions and lower resolution to homogeneous tissue regions; and 2) a generative model from low-dimensional (LD) latent code to HD tissue properties. Both approaches are independently developed and tested within a parameter optimization framework. Furthermore, we devise a novel method that uses the surrogate pdf learned on an estimated LD parameter space to improve the proposal distribution of Metropolis-Hastings for accelerated sampling of the exact posterior pdf. We evaluate the presented methods on estimating local tissue excitability of a cardiac electrophysiological model in both synthetic data experiments and real data experiments. Results demonstrate that the presented methods improve the accuracy and efficiency of patient-specific model parameter estimation in comparison to the existing approaches used for model personalization.
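
The role of a surrogate as a Metropolis-Hastings proposal can be illustrated with a minimal independence sampler. This sketch assumes a one-dimensional parameter, a fixed Gaussian standing in for the learned surrogate pdf, and a cheap toy target in place of the expensive exact posterior. The function names and all numeric settings are illustrative assumptions, not the dissertation's method.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def independence_mh(log_target, proposal_mean, proposal_std, n_samples, rng):
    """Metropolis-Hastings with proposals drawn from a fixed surrogate Gaussian.

    Acceptance uses the ratio of importance weights target/proposal, which is
    the Metropolis-Hastings ratio for a state-independent proposal.
    """
    x = proposal_mean
    log_w = log_target(x) - log_normal_pdf(x, proposal_mean, proposal_std)
    samples, accepted = [], 0
    for _ in range(n_samples):
        candidate = rng.gauss(proposal_mean, proposal_std)
        log_w_candidate = log_target(candidate) - log_normal_pdf(
            candidate, proposal_mean, proposal_std
        )
        log_ratio = log_w_candidate - log_w
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x, log_w = candidate, log_w_candidate
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


def toy_log_posterior(x):
    # Unnormalized log density of N(2, 1), standing in for the exact posterior.
    return -0.5 * (x - 2.0) ** 2


rng = random.Random(0)
# Surrogate proposal: close to the target but slightly offset and wider.
draws, acceptance_rate = independence_mh(toy_log_posterior, 1.8, 1.5, 20000, rng)
sample_mean = sum(draws) / len(draws)
```

The closer the surrogate is to the exact posterior, the higher the acceptance rate and the fewer expensive target evaluations are wasted on rejected candidates.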

    The Effects of Learning Contexts on the Development of Reflective Thinking in University Education: Design and Validation of a Questionnaire

    Reflective thinking is a key skill for constructing meaning at university. Its development requires appropriate learning contexts that can function as spaces for reflection, along with approaches, conditions and methods that can boost students' training in thinking, framed within their process of knowledge construction and their development of competencies and professional skills. This paper reports the development and testing of a questionnaire on the value of the learning contexts designed to foster reflective thinking. To ensure validity, the constructs measured were derived from the extensive literature on conditions for planning learning and teaching activities for reflective thinking based on narrative-based methods. The instrument was validated with a sample of students (n = 375) from five universities. The results obtained from the estimation of a 10-factor model offer appropriate goodness of fit and parsimony with acceptable and consistent indices of reliability. The results contribute to the evidence supporting the reliability and validity of the questionnaire and confirm the value of the model's components for devising higher education teaching activities to promote a reflective thinking process.

    Interval estimation of construction cost at completion using least squares support vector machine

    Completing a project within the planned budget is the bottom line for construction companies. To achieve this goal, periodic cost estimation is vitally important not only in the planning phase, but also in the execution phase. Due to high uncertainty in the operational environment, point estimation of project cost is often not sufficient to assist the decision-making process. This study utilizes Least Squares Support Vector Machine (LS-SVM), machine learning based interval estimation (MLIE), and Differential Evolution (DE) to establish a novel model for predicting construction project cost. LS-SVM is a supervised learning technique used for regression analysis. MLIE is employed for inference of prediction intervals. Moreover, our model deploys DE in the cross-validation process to search for the optimal values of tuning parameters. The newly developed model, named EAC-LSPIM, yields results consisting of a point estimate coupled with lower and upper prediction limits, at a certain level of confidence, to convey uncertainty. Simulation and performance comparison demonstrate that the new model is capable of delivering accurate and reliable forecasting results.

    Active model learning and diverse action sampling for task and motion planning

    The objective of this work is to augment the basic abilities of a robot by learning to use new sensorimotor primitives to enable the solution of complex long-horizon problems. Solving long-horizon problems in complex domains requires flexible generative planning that can combine primitive abilities in novel combinations to solve problems as they arise in the world. In order to plan to combine primitive actions, we must have models of the preconditions and effects of those actions: under what circumstances will executing this primitive achieve some particular effect in the world? We use, and develop novel improvements on, state-of-the-art methods for active learning and sampling. We use Gaussian process methods for learning the conditions of operator effectiveness from small numbers of expensive training examples collected by experimentation on a robot. We develop adaptive sampling methods for generating diverse elements of continuous sets (such as robot configurations and object poses) during planning for solving a new task, so that planning is as efficient as possible. We demonstrate these methods in an integrated system, combining newly learned models with an efficient continuous-space robot task and motion planner to learn to solve long-horizon problems more efficiently than was previously possible. Comment: Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain. https://www.youtube.com/playlist?list=PLoWhBFPMfSzDbc8CYelsbHZa1d3uz-W_