
    Policy Optimization with Model-based Explorations

    Model-free reinforcement learning methods such as the Proximal Policy Optimization algorithm (PPO) have been successfully applied to complex decision-making problems such as Atari games. However, these methods suffer from high variance and high sample complexity. Model-based reinforcement learning methods, which learn the transition dynamics, are more sample efficient, but they often suffer from bias in the transition estimate. How to make use of both model-based and model-free learning is a central problem in reinforcement learning. In this paper, we present a new technique to address the trade-off between exploration and exploitation, which treats the difference between the model-free and model-based estimates as a measure of exploration value. We apply this technique to the PPO algorithm and arrive at a new policy optimization method, named Policy Optimization with Model-based Explorations (POME). POME uses two components to predict the actions' target values: a model-free one estimated by Monte-Carlo sampling, and a model-based one that learns a transition model and predicts the value of the next state. POME adds the error between these two target estimates as an additional exploration value for each state-action pair, i.e., it encourages the algorithm to explore states with larger target errors, which are hard to estimate. We compare POME with PPO on Atari 2600 games and show that POME outperforms PPO on 33 of 49 games. Comment: Accepted at AAAI-1
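The exploration value described in the abstract above can be illustrated with a minimal sketch. It assumes the model-free target is the discounted Monte-Carlo return, the model-based target is the one-step quantity `reward + gamma * V(predicted next state)`, and the bonus is the absolute difference between the two. The function names, the use of the absolute value, and the way next-state values are passed in are all assumptions for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Model-free targets: Monte-Carlo discounted return from each step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Accumulate from the last step backwards: G_t = r_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def exploration_bonuses(
    rewards: Sequence[float],
    predicted_next_values: Sequence[float],
    gamma: float,
) -> List[float]:
    """Hypothetical exploration bonus: |model-free target - model-based target|.

    predicted_next_values[t] stands in for the value of the next state as
    predicted through a learned transition model (assumed to be given here).
    """
    model_free = discounted_returns(rewards, gamma)
    model_based = [r + gamma * v for r, v in zip(rewards, predicted_next_values)]
    return [abs(mf - mb) for mf, mb in zip(model_free, model_based)]


if __name__ == "__main__":
    rewards = [1.0, 0.0, 2.0]
    next_values = [0.0, 0.0, 0.0]  # a model that predicts zero value everywhere
    print(exploration_bonuses(rewards, next_values, gamma=0.5))
```

Under these assumptions, state-action pairs where the two targets disagree most receive the largest bonus. How such a bonus would enter the PPO objective is left open here, since the abstract does not say.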

    Tin Nanoparticles Encapsulated in Hollow TiO2 Spheres as High Performance Anode Materials for Li-Ion Batteries

    Tin, a high-capacity anode material for lithium-ion batteries, has poor cycling performance because of its large volume expansion upon lithiation. Based on a literature review of the applications of lithium-ion batteries and of current research progress on tin-based anode materials, we developed a method to synthesize hollow TiO2 spheres with tin nanoparticles anchored on the inner surface of the TiO2 shell. This tin/TiO2 composite alleviates the volume change of tin-based anode materials during charge-discharge processes. SnCl2·2H2O (tin(II) chloride dihydrate) and titanium(IV) isopropoxide (TIPT) were used as the Sn source and the Ti source, respectively, while CaCO3 was used as a template to fabricate the hollow TiO2 shell. A variety of materials characterization methods (XRD, SEM, XPS, Raman, BET, etc.) and electrochemical measurements such as galvanostatic charge-discharge and cyclic voltammetry (CV) testing were employed to systematically study the effects of various synthesis parameters on the structure and battery performance of the as-prepared materials. We also discuss the key factors influencing the cycling performance of the composite electrode material and the related mechanism.

    Numerical and experimental analysis on microbubble generation and multiphase mixing in novel microfluidic devices

    In this study, a novel microfluidic K-junction and a conventional cross-junction were investigated numerically and experimentally for microbubble generation and multi-fluid mixing. In the K-junction, liquid solutions were injected via three liquid inlet channels, along with inert nitrogen gas supplied via the gas inlet channel, to periodically generate microbubbles in a controlled manner at the outlet channel. Numerical simulations based on the finite volume method and the volume of fluid (VOF) technique, together with experiments, were conducted for both the K-junction and the cross-junction. The effect of parameters such as contact angle, surface tension, viscosity, gas pressure and gas-liquid flow ratio on the microbubble size distribution was investigated. The microbubble generation process observed through high-speed camera imaging agreed well with the numerical simulations for both junctions, as did the influence of viscosity and gas-liquid flow ratio. The results indicate that solution viscosity, gas-to-liquid flow ratio, gas inlet pressure, and their combination have a significant influence on the microbubble diameter, which was found to be in the range of 70-240 µm when using micro capillaries of 100 µm inner diameter. Multi-fluid mixing was studied in simulations and experiments using two different polymer solutions for the cross-junction and three for the K-junction. The mixing process obtained from simulations agrees well with the experimental results, and chaotic mixing was found in the mixing area of the K-junction, with higher mixing efficiency than in the cross-junction. Fluorescence images of microbubbles generated from polymer solutions containing dyes show the devices' potential for encapsulating fluorescent dyes and polymers on the bubble shell; the approach could be adopted as a method to encapsulate active pharmaceutical ingredients for potential applications in drug delivery.