
    Model-Based Reinforcement Learning with Continuous States and Actions

    Finding an optimal policy in a reinforcement learning (RL) framework with continuous state and action spaces is challenging. Approximate solutions are often inevitable. GPDP is an approximate dynamic programming algorithm based on Gaussian process (GP) models for the value functions. In this paper, we extend GPDP to the case of unknown transition dynamics. After building a GP model for the transition dynamics, we apply GPDP to this model and determine a continuous-valued policy in the entire state space. We apply the resulting controller to the underpowered pendulum swing-up. Moreover, we compare our results on this RL task to a nearly optimal discrete DP solution in a fully known environment.

    Approximate Dynamic Programming with Gaussian Processes

    In general, it is difficult to determine an optimal closed-loop policy in nonlinear control problems with continuous-valued state and control domains. Hence, approximations are often inevitable. The standard method of discretizing states and controls suffers from the curse of dimensionality and strongly depends on the chosen temporal sampling rate. In this paper, we introduce Gaussian process dynamic programming (GPDP) and determine an approximate globally optimal closed-loop policy. In GPDP, value functions in the Bellman recursion of the dynamic programming algorithm are modeled using Gaussian processes. GPDP returns an optimal state feedback for a finite set of states. Based on these outcomes, we learn a possibly discontinuous closed-loop policy on the entire state space by switching between two independently trained Gaussian processes. A binary classifier selects one Gaussian process to predict the optimal control signal. We show that GPDP is able to yield an almost optimal solution to an LQ problem using few sample points. Moreover, we successfully apply GPDP to the underpowered pendulum swing-up, a complex nonlinear control problem.

    Multi-view Regularized Gaussian Processes

    Gaussian processes (GPs) have been proven to be powerful tools in various areas of machine learning. However, there are very few applications of GPs in the scenario of multi-view learning. In this paper, we present a new GP model for multi-view learning. Unlike existing methods, it combines multiple views by regularizing the marginal likelihood with the consistency among the posterior distributions of latent functions from different views. Moreover, we give a general point selection scheme for multi-view learning and improve the proposed model by this criterion. Experimental results on multiple real-world data sets have verified the effectiveness of the proposed model and demonstrated the performance improvement gained by employing this novel point selection scheme.

    Gaussian process model based predictive control

    Gaussian process models provide a probabilistic non-parametric modelling approach for black-box identification of non-linear dynamic systems. Gaussian processes can highlight areas of the input space where prediction quality is poor, due to the lack of data or its complexity, by indicating a higher variance around the predicted mean. Gaussian process models contain noticeably fewer coefficients to be optimized. This paper illustrates a possible application of Gaussian process models within model-based predictive control. The extra information provided by the Gaussian process model is used in predictive control, where optimization of the control signal takes the variance information into account. The predictive control principle is demonstrated on the control of a pH process benchmark.
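
The variance behaviour described in this abstract can be illustrated with a minimal sketch, assuming a zero-mean GP with a squared-exponential kernel and a small noise term. The helper names (`rbf`, `solve`, `gp_predict`), the kernel choice, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel with unit signal variance (assumed choice)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(xs, ys, x_star, noise=1e-6, length=1.0):
    """Posterior mean and variance of a zero-mean GP at a scalar input x_star."""
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf(a, x_star, length) for a in xs]
    alpha = solve(K, ys)                      # K^{-1} y
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)                      # K^{-1} k_star
    var = rbf(x_star, x_star, length) - sum(k * w for k, w in zip(k_star, v))
    return mean, var

# Toy data (hypothetical): three observations of an unknown function.
xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]
mean_near, var_near = gp_predict(xs, ys, 1.0)   # at a training input
mean_far, var_far = gp_predict(xs, ys, 10.0)    # far from all data
```

Here `var_near` is close to zero while `var_far` approaches the prior variance, which is the kind of signal a variance-aware predictive controller could penalize or constrain.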

    A note on local well-posedness of generalized KdV type equations with dissipative perturbations

    In this note we report local well-posedness results for the Cauchy problems associated to generalized KdV type equations with dissipative perturbation for given data in the low regularity $L^2$-based Sobolev spaces. The method of proof is based on the contraction mapping principle employed in some appropriate time weighted spaces. Comment: 14 pages.

    Gaussian Process priors with uncertain inputs: Application to multiple-step ahead time series forecasting

    We consider the problem of multi-step ahead prediction in time series analysis using the non-parametric Gaussian process model. k-step ahead forecasting of a discrete-time non-linear dynamic system can be performed by doing repeated one-step ahead predictions. For a state-space model of the form $y_t = f(y_{t-1}, \ldots, y_{t-L})$, the prediction of $y$ at time $t + k$ is based on the point estimates of the previous outputs. In this paper, we show how, using an analytical Gaussian approximation, we can formally incorporate the uncertainty about intermediate regressor values, thus updating the uncertainty on the current prediction.
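
The repeated one-step-ahead scheme mentioned in this abstract can be sketched as follows. This is only the naive point-estimate recursion, which ignores the uncertainty of intermediate predictions; it is not the analytical Gaussian approximation the paper proposes. The function name `iterate_forecast` and the toy one-step models are hypothetical.

```python
def iterate_forecast(f, history, lags, steps):
    """Iterate a one-step model f to forecast `steps` values ahead.

    f       -- callable taking the last `lags` outputs (oldest first)
               and returning the next output as a point estimate
    history -- observed outputs, oldest first
    """
    window = list(history[-lags:])
    preds = []
    for _ in range(steps):
        y_next = f(window)
        preds.append(y_next)
        # Feed the point estimate back in as if it were an observation.
        window = window[1:] + [y_next]
    return preds

# Toy one-step models (assumptions for illustration only).
halving = iterate_forecast(lambda w: 0.5 * w[-1], [8.0], lags=1, steps=3)
summing = iterate_forecast(lambda w: w[0] + w[1], [1, 1], lags=2, steps=3)
```

Because each prediction is fed back as though it were exact, errors can compound over the horizon, which is the motivation for propagating the uncertainty of the intermediate regressors instead.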

    Incidence of components of metabolic syndrome in the metabolically healthy obese over 9 years follow-up: the Atherosclerosis Risk In Communities study.

    Background: Some obese adults are not afflicted by the metabolic abnormalities often associated with obesity (the 'metabolically healthy obese' (MHO)); however, they may be at increased risk of developing cardiometabolic abnormalities in the future. Little is known about the relative incidence of individual components of metabolic syndrome (MetSyn). Methods: We used data from a multicenter, community-based cohort aged 45-64 years at recruitment (the Atherosclerosis Risk In Communities study) to examine the first appearance of any MetSyn component, excluding waist circumference. Body mass index (BMI, kg/m^2) and cardiometabolic data were collected at four triennial visits. Our analysis included 3969 adults who were not underweight and free of the components of MetSyn at the initial visit. Participants were classified as metabolically healthy normal weight (MHNW), overweight (MHOW) and MHO at each visit. Adjusted hazard ratios (HR) and 95% confidence intervals were estimated with proportional hazards regression models. Results: The relative rate of developing each risk factor was higher among MHO than MHNW, with the strongest association noted for elevated fasting glucose (MHO vs MHNW, HR: 2.33 (1.77, 3.06)). MHO was also positively associated with elevated triglycerides (HR: 1.63 (1.27, 2.09)), low high-density lipoprotein-cholesterol (HR: 1.68 (1.32, 2.13)) and elevated blood pressure (HR: 1.54 (1.26, 1.88)). A similar, but less pronounced pattern was noted among the MHOW vs MHNW. Conclusions: We conclude that even among apparently healthy individuals, obesity and overweight are related to more rapid development of at least one cardiometabolic risk factor, and that elevations in blood glucose develop most rapidly.

    Fast Hands-free Writing by Gaze Direction

    We describe a method for text entry based on inverse arithmetic coding that relies on gaze direction and which is faster and more accurate than using an on-screen keyboard. These benefits are derived from two innovations: the writing task is matched to the capabilities of the eye, and a language model is used to make predictable words and phrases easier to write. Comment: 3 pages. Final version.

    Doing the Möbius Strip: The politics of the Bailey Review

    In media and policy discourses on sexualisation, there has been an apparent split. Some have constructed young women as innocent children, incapable of meaningful sexual and commercial choices; others have treated young women as neo-liberal adults, agentic and savvy choice-makers. We analyse how the Bailey Review on the Sexualisation and Commercialisation of Childhood (published by the UK Department of Education) attempts to manage the tensions associated with making both arguments at once. We theorise the split as ‘doing the möbius strip’, as both sides agree on the assumption that commercial and sexual choice is either present or absent for young women. In this way, they reframe the contradictions and inequalities that shape young women’s behaviours as a problem of propriety and decency.