6 research outputs found

    Safety, Risk Awareness and Exploration in Reinforcement Learning

    No full text
    Replicating the human ability to solve complex planning problems based on minimal prior knowledge has been extensively studied in the field of reinforcement learning. Algorithms for discrete or approximate models are supported by theoretical guarantees, but the necessary assumptions are often constraining. We aim to extend these results in the direction of practical applicability to more realistic settings. Our contributions are restricted to three specific aspects of practical problems that we believe to be important when applying reinforcement learning techniques: risk awareness, safe exploration and data-efficient exploration. Risk awareness is important in planning situations where restarts are not available and performance depends on one-off returns rather than average returns. The expected return is no longer an appropriate objective because the law of large numbers does not apply. In Chapter 2 we propose a new optimization objective for risk-aware planning and show that it has desirable theoretical properties, relating it to previously proposed risk-aware objectives: minmax, exponential utility, percentile and mean minus variance. In environments with uncertain dynamics, exploration is often necessary to improve performance. Existing reinforcement learning algorithms provide theoretical exploration guarantees, but they tend to rely on the assumption that any state is eventually reachable from any other state by following a suitable policy. For most physical systems this assumption is impractical, as the systems would break before any reasonable exploration has taken place. In Chapter 3 we address the need for a safe exploration method. In Chapter 4 we address the specific challenges presented by extending model-based reinforcement learning methods from discrete to continuous dynamical systems. System representations based on explicitly enumerated states are no longer applicable. To address this challenge we use a Dirichlet process mixture of linear models to represent dynamics. The proposed model strikes a good balance between compact representation and flexibility. To address the challenge of an efficient exploration-exploitation trade-off we apply the principle of Optimism in the Face of Uncertainty, which underlies numerous other provably efficient algorithms in simpler settings. Our algorithm reduces the exploration problem to a sequence of classical optimal control problems. Synthetic experiments illustrate the effectiveness of our methods.

    Denoising archival films using a learned Bayesian model

    No full text
    We develop a Bayesian model of digitized archival films and use this for denoising, or more specifically de-graining, individual frames. In contrast to previous approaches, our model uses a learned spatial prior and a unique likelihood term that models the physics that generates the image grain. The spatial prior is represented by a high-order Markov random field based on the recently proposed Field-of-Experts framework. We propose a new model of the image grain in archival films based on an inhomogeneous beta distribution in which the variance is a function of image luminance. We train this noise model for a particular film and perform de-graining using a diffusion method. Quantitative results show improved signal-to-noise ratio relative to the standard ad hoc Gaussian noise model. Index Terms: image restoration, optical film, noise.

    Mechanochemical activation of copper concentrate and the effect on oxidation of metal sulphides

    No full text
    This work presents the effect of mechanochemical activation in an attrition mill, in a water medium and for different time intervals, on the particle size distribution and microstructure of copper concentrate, as well as on the oxidation of the metal sulphides after treatment in an autoclave. Results show that the mean particle size decreased almost tenfold after 30 minutes of milling and the specific surface increased from 0.1 to 4.3 m²/g. Regarding the microstructural changes, it was found that during the mechanochemical activation the average crystallite size of chalcopyrite decreased, following an exponential trend towards a limiting value of approximately 20 nm, assuming spherical or equiaxed crystallites. The enhanced structural disorder of chalcopyrite is also highlighted by the linear increase of lattice strain with the milling time. Finally, results from the leaching experiments demonstrated that the mechanical treatment improved the oxidation of sulphides by lowering the reaction temperature and increasing the reaction rates. The above data suggest that the mechanochemical activation of copper concentrate is an efficient method to enhance the hydrometallurgical oxidation of copper concentrate and chalcopyrite in particular.