659 research outputs found

    Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder

    In this paper, we present a hierarchical path planning framework called SG-RL (subgoal graphs-reinforcement learning) to plan rational paths for agents maneuvering in continuous and uncertain environments. By "rational", we mean (1) efficient path planning that eliminates first-move lags; and (2) collision-free, smooth paths that satisfy the agents' kinematic constraints. SG-RL works in a two-level manner. At the first level, SG-RL uses a geometric path-planning method, i.e., Simple Subgoal Graphs (SSG), to efficiently find optimal abstract paths, also called subgoal sequences. At the second level, SG-RL uses an RL method, i.e., Least-Squares Policy Iteration (LSPI), to learn near-optimal motion-planning policies which can generate kinematically feasible and collision-free trajectories between adjacent subgoals. The first advantage of the proposed method is that SSG overcomes the sparse-reward and local-minimum-trap limitations of RL agents; thus, LSPI can be used to generate paths in complex environments. The second advantage is that, when the environment changes slightly (e.g., when unexpected obstacles appear), SG-RL does not need to reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI can deal with such uncertainties by exploiting its generalization ability. Simulation experiments in representative scenarios demonstrate that, compared with existing methods, SG-RL works well on large-scale maps, with relatively low action-switching frequencies and shorter path lengths, and that it can deal with small changes in environments. We further demonstrate that the design of reward functions and the types of training environments are important factors for learning feasible policies. Comment: 20 pages

    Stable polymer glasses

    This thesis presents investigations, from several perspectives, of stable polymer glasses prepared through physical vapour deposition. This is the first time that polymers have been used in simple vapour deposition and made into stable glass. The ability of our lab to create polymer glasses with exceptional stability and extremely long lifetimes is demonstrated through the preparation and characterization of ultrastable PS and PMMA glasses. Attempts at preparing stable polymer glass with higher molecular weight are reported, using two different methods: depositing from higher molecular weight sources, and crosslinking as-deposited glasses with ultraviolet radiation. The surface properties of stable polymer glasses, including their surface morphology and surface relaxation, are studied. As expected, bulk dynamics are slower in stable glasses; the surface evolution of both the as-deposited films and the rejuvenated films is enhanced compared to the bulk, and the two are not easily distinguishable from each other. Investigations on stable polymer glasses confined to thin films are reported. The results support the existence of a surface mobile layer, and it is found that glass stability decreases with decreasing film thickness, as determined by different measures of stability. By studying stable polymer glasses from different perspectives in this thesis, we hope to provide valuable insights into many fundamental questions about the surface dynamics in thin films, the limit of packing in amorphous materials, and the nature of the complex and fascinating phenomenon of the glass transition.

    Crystallization Studies of Highly Monodisperse Oligomeric Poly(Ethylene Oxide)

    Poly(ethylene oxide) (PEO) is one of the most intensively studied polymers in terms of crystallization, because of its linear structure. In this thesis the chapters are organized in a self-contained fashion, with a general introduction and a brief conclusion. We introduce the purification and characterization of highly monodisperse PEO oligomers, and the analysis of their melting and crystallization behaviours. Through evaporative purification, we have been able to purify low molecular weight PEO and achieve a polydispersity index six times better than that of the neat commercial sample, as measured by mass spectrometry. Melting temperatures are obtained using differential scanning calorimetry. Based on the Gibbs-Thomson relation, we claim that during crystallization, some purified PEO samples can form crystal lamellae not only with extended chains, but also with once-folded chains, which is normally not expected for polymers with such short chain lengths. The fact that we are able to control the melting temperature through annealing treatment of the crystal validates our model of folded and extended chains.
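For readers unfamiliar with the relation the abstract invokes, the textbook form of the Gibbs-Thomson equation for a thin crystal lamella is sketched below. The symbols are assumptions chosen for illustration, not notation taken from the thesis: $T_m^0$ is the melting temperature of an infinitely thick crystal, $\sigma_e$ the basal (fold or end) surface free energy, $\Delta h_f$ the heat of fusion per unit volume, and $\ell$ the lamellar thickness.

```latex
T_m(\ell) = T_m^0 \left( 1 - \frac{2\,\sigma_e}{\Delta h_f \, \ell} \right)
```

Under this form, a lamella built from once-folded chains has roughly half the thickness of one built from extended chains, so it would be expected to melt at a lower temperature. This is presumably why measured melting temperatures can be used to distinguish the two chain conformations.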

    ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers

    We propose a memory-efficient finetuning algorithm for large language models (LLMs) that supports finetuning LLMs with 65B parameters in 3-bit or 4-bit precision on as little as one 48GB GPU. Our method, modular low-rank adaptation (ModuLoRA), integrates any user-specified weight quantizer with finetuning via low-rank adapters (LoRAs). Our approach relies on a simple quantization-agnostic backward pass that adaptively materializes low-precision LLM weights from a custom black-box quantization module. This approach enables finetuning 3-bit LLMs for the first time; leveraging state-of-the-art 3-bit OPTQ quantization often outperforms finetuning that relies on less sophisticated 4-bit and 8-bit methods. In our experiments, ModuLoRA attains competitive performance on text classification, natural language inference, and instruction-following tasks using significantly less memory than existing approaches, and we also surpass the state-of-the-art ROUGE score on a popular summarization task. We release ModuLoRA together with a series of low-precision models, including the first family of 3-bit instruction-following Alpaca LLMs, as part of LLMTOOLS, a user-friendly library for quantizing, running, and finetuning LLMs on consumer GPUs.
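The general idea of pairing a frozen, quantized weight matrix with a trainable low-rank adapter can be illustrated with a minimal pure-Python sketch. This is not the ModuLoRA implementation or the LLMTOOLS API: `quantize_uniform` is a hypothetical toy quantizer standing in for whatever black-box quantizer a user supplies, `lora_forward` is a hypothetical name, and only the forward pass is shown.

```python
from typing import Callable, List

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]


def quantize_uniform(w: Matrix, bits: int) -> Callable[[], Matrix]:
    """Toy uniform quantizer (an assumption for illustration only).

    Stores integer codes and returns a closure that rebuilds an
    approximate float matrix on demand, mimicking a black-box module.
    """
    flat = [v for row in w for v in row]
    lo, hi = min(flat), max(flat)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [[round((v - lo) / scale) for v in row] for row in w]

    def dequantize() -> Matrix:
        return [[lo + c * scale for c in row] for row in codes]

    return dequantize


def lora_forward(x: Matrix, dequantize: Callable[[], Matrix],
                 a: Matrix, b: Matrix) -> Matrix:
    """Compute y = x W_hat^T + x A^T B^T.

    Shapes: x is (n, d_in), W_hat is (d_out, d_in), A is (r, d_in),
    B is (d_out, r). W_hat is materialized only for this call; in a
    finetuning setup only A and B would receive gradient updates.
    """
    w_hat = dequantize()
    base = matmul(x, transpose(w_hat))
    update = matmul(matmul(x, transpose(a)), transpose(b))
    return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(base, update)]
```

Because the adapter only ever calls `dequantize()`, any quantizer exposing that behaviour could be swapped in. That is the sense in which a scheme like this can be quantizer-agnostic.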

    Observational Constraints on the Response of High‐Latitude Northern Forests to Warming

    Since the 1960s, carbon cycling in the high‐latitude northern forest (HLNF) has experienced dramatic changes: most of the forest is greening and net carbon uptake from the atmosphere has increased. During the same time period, the CO₂ seasonal cycle amplitude (SCA) has increased by ~50% or more. Disentangling the complex processes that drive these changes has been challenging. In this study, we substitute spatial sensitivity to temperature for time to quantify the impact of temperature increase on gross primary production (GPP), total ecosystem respiration (TER), and the fraction of photosynthetically active radiation (fPAR), and the resulting contribution of these changes to amplifying the CO₂ SCA over the HLNF since the 1960s. We use the spatial heterogeneity of GPP inferred from solar‐induced chlorophyll fluorescence in combination with net ecosystem exchange (NEE) inferred from column CO₂ observations made between 2015 and 2017 from NASA's Orbiting Carbon Observatory‐2. We find that three quarters of the spatial variations in GPP can be explained by the spatial variation in the growing season mean temperature (GSMT). The long‐term hindcast captures both the magnitude and spatial variability of the trends in observed fPAR. We estimate that between 1960 and 2010, the increase in GSMT enhanced both GPP and the SCA of NEE by ~20%. The calculated enhancement of NEE due to the increase in GSMT contributes 56–72% of the trend in the CO₂ SCA at high latitudes, much larger than simulations by most biogeochemical models.
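The space-for-time substitution described above can be illustrated with a minimal sketch, assuming a simple linear relation between a flux and temperature. This is not the study's method or data: the function names are hypothetical, and any numbers passed in would be placeholders rather than observations. The idea is to fit the sensitivity of GPP to GSMT across locations, then apply that slope to a temperature change over time.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Fraction of the variance in ys explained by the linear fit on xs."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def projected_change(spatial_slope: float, warming: float) -> float:
    """Space-for-time step: reuse the spatial sensitivity as a temporal one."""
    return spatial_slope * warming
```

Here `xs` would hold the GSMT of each grid cell and `ys` the corresponding GPP. `r_squared` corresponds to the "fraction of spatial variation explained" quoted in the abstract, and `projected_change` gives the implied change in GPP for a given warming, under the strong assumption that the spatial relation also holds through time.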