Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder
In this paper, we present a hierarchical path planning framework called SG-RL
(subgoal graphs-reinforcement learning), to plan rational paths for agents
maneuvering in continuous and uncertain environments. By "rational", we mean
(1) efficient path planning that eliminates first-move lags and (2) collision-free,
smooth paths that satisfy the agents' kinematic constraints. SG-RL works in a
two-level manner. At the first level, SG-RL uses a geometric path-planning
method, i.e., Simple Subgoal Graphs (SSG), to efficiently find optimal abstract
paths, also called subgoal sequences. At the second level, SG-RL uses an RL
method, i.e., Least-Squares Policy Iteration (LSPI), to learn near-optimal
motion-planning policies which can generate kinematically feasible and
collision-free trajectories between adjacent subgoals. The first advantage of
the proposed method is that SSG mitigates the sparse-reward and local-minima
problems that RL agents face; thus, LSPI can be used to generate paths in
complex environments. The second advantage is that, when the environment
changes slightly (e.g., when unexpected obstacles appear), SG-RL does not need to
reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI
can deal with uncertainties by exploiting its generalization ability to handle
changes in environments. Simulation experiments in representative scenarios
demonstrate that, compared with existing methods, SG-RL can work well on
large-scale maps with relatively low action-switching frequencies and shorter
path lengths, and SG-RL can deal with small changes in environments. We further
demonstrate that the design of reward functions and the types of training
environments are important factors for learning feasible policies. Comment: 20 page
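The two-level structure described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the hand-built graph and coordinates, Dijkstra's algorithm standing in for SSG, and the hypothetical `greedy_policy` standing in for a learned LSPI policy (it does no learning and ignores obstacles and kinematic constraints).

```python
import heapq
import math


def plan_subgoals(graph, coords, start, goal):
    """High level: shortest subgoal sequence via Dijkstra (a stand-in for SSG)."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for nxt in graph[node]:
            nd = d + math.dist(coords[node], coords[nxt])
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def greedy_policy(pos, target, step=0.5):
    """Low level placeholder: step straight toward the target (NOT a learned policy)."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    d = math.hypot(dx, dy)
    if d <= step:
        return target
    return (pos[0] + step * dx / d, pos[1] + step * dy / d)


def follow(subgoals, coords, policy):
    """Run the low-level policy between each pair of adjacent subgoals."""
    pos = coords[subgoals[0]]
    trajectory = [pos]
    for name in subgoals[1:]:
        target = coords[name]
        while pos != target:
            pos = policy(pos, target)
            trajectory.append(pos)
    return trajectory


# Hypothetical four-node subgoal graph.
coords = {"S": (0.0, 0.0), "A": (2.0, 0.0), "B": (0.0, 4.0), "G": (2.0, 3.0)}
graph = {"S": ["A", "B"], "A": ["S", "G"], "B": ["S", "G"], "G": ["A", "B"]}

subgoals = plan_subgoals(graph, coords, "S", "G")
trajectory = follow(subgoals, coords, greedy_policy)
```

The split mirrors the design argument in the abstract: the high level only has to be recomputed when the graph itself changes, while the low-level policy is the component expected to absorb small local changes.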
Stable polymer glasses
This thesis investigates, from several perspectives, stable polymer glasses prepared through physical vapour deposition. This is the first time that polymers have been used in simple vapour deposition and made into stable glass. The ability of our lab to create polymer glasses with exceptional stability and extremely long lifetimes is demonstrated through the preparation and characterization of ultrastable PS and PMMA glasses. Attempts at preparing stable polymer glass with higher molecular weight are reported, using two different methods: higher molecular weight sources, and crosslinking of as-deposited glasses with ultraviolet radiation. The surface properties of stable polymer glasses, including their surface morphology and surface relaxation, are studied. As expected, bulk dynamics are slower in stable glasses; the surface evolution of both the as-deposited and the rejuvenated films is enhanced relative to the bulk, and the two are not easily distinguishable from each other. Investigations on stable polymer glasses confined to thin films are reported. The results support the existence of a mobile surface layer, and it is found that glass stability decreases with decreasing film thickness, as determined by different measures of stability. By studying stable polymer glasses from different perspectives in this thesis, we hope to provide valuable insights into many fundamental questions about the surface dynamics in thin films, the limit of packing in amorphous materials, and the nature of a complex and fascinating phenomenon: the glass transition.
Crystallization Studies of Highly Monodisperse Oligomeric Poly(Ethylene Oxide)
Poly(ethylene oxide) (PEO) is one of the most intensively studied polymers in terms of crystallization, because of its linear structure. In this thesis the chapters are organized in a self-contained fashion, with a general introduction and a brief conclusion. We introduce the purification and characterization of highly monodisperse PEO oligomers, and the analysis of their melting and crystallization behaviours. Through evaporative purification, we have been able to purify low molecular weight PEO and achieve a polydispersity index six times better than that of the neat commercial sample, as measured by mass spectrometry. Melting temperatures are obtained using differential scanning calorimetry. Based on the Gibbs-Thomson relation, we claim that during crystallization, some purified PEO samples can form crystal lamellae not only with extended chains, but also with once-folded chains, which is normally not expected for polymers with such short chain lengths. The fact that we are able to control the melting temperature through annealing treatment of the crystal validates our model of folded and extended chains.
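For readers unfamiliar with the argument, the Gibbs-Thomson relation in its common textbook form for thin lamellar crystals is given below. The abstract does not state which form the thesis uses, so the symbols are assumptions: lamellar thickness \ell, equilibrium melting temperature T_m^0 of an infinitely thick crystal, fold-surface free energy \sigma_e, and heat of fusion per unit volume \Delta h_f. The second expression additionally assumes the idealization that a once-folded lamella is half as thick as an extended-chain lamella of thickness \ell_ext.

```latex
T_m(\ell) = T_m^{0}\left(1 - \frac{2\sigma_e}{\Delta h_f\,\ell}\right),
\qquad
T_m(\ell_{\mathrm{ext}}) - T_m(\ell_{\mathrm{ext}}/2)
  = \frac{2\sigma_e\,T_m^{0}}{\Delta h_f\,\ell_{\mathrm{ext}}}
```

Under these assumptions a thinner, once-folded lamella melts at a lower temperature than an extended-chain one, which is why a measured melting temperature can be used to distinguish the two crystal forms.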
ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers
We propose a memory-efficient finetuning algorithm for large language models
(LLMs) that supports finetuning LLMs with 65B parameters in 3-bit or 4-bit
precision on as little as one 48GB GPU. Our method, modular low-rank adaptation
(ModuLoRA), integrates any user-specified weight quantizer with finetuning via
low-rank adapters (LoRAs). Our approach relies on a simple
quantization-agnostic backward pass that adaptively materializes low-precision
LLM weights from a custom black-box quantization module. This approach enables
finetuning 3-bit LLMs for the first time; leveraging state-of-the-art 3-bit
OPTQ quantization often outperforms finetuning that relies on less
sophisticated 4-bit and 8-bit methods. In our experiments, ModuLoRA attains
competitive performance on text classification, natural language inference, and
instruction following tasks using significantly less memory than existing
approaches, and we also surpass the state-of-the-art ROUGE score on a popular
summarization task. We release ModuLoRA together with a series of low-precision
models--including the first family of 3-bit instruction following Alpaca
LLMs--as part of LLMTOOLS, a user-friendly library for quantizing, running, and
finetuning LLMs on consumer GPUs
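The core idea of the abstract, a frozen low-precision weight that is materialized on demand from a black-box quantizer and combined with a trainable low-rank update, can be sketched roughly as follows. This is not the LLMTOOLS API: `UniformQuantizer` and `QuantizedLoRALinear` are hypothetical names, the quantizer is plain uniform rounding rather than OPTQ, and only the forward pass is shown (in a real setup an autograd framework would send gradients to the adapter matrices only).

```python
from dataclasses import dataclass


def matvec(matrix, vector):
    """Plain-Python matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


@dataclass
class UniformQuantizer:
    """Hypothetical black-box quantizer: uniform rounding to 2**bits levels."""
    bits: int = 3

    def quantize(self, w):
        flat = [x for row in w for x in row]
        lo, hi = min(flat), max(flat)
        levels = 2 ** self.bits - 1
        scale = (hi - lo) / levels if hi > lo else 1.0
        codes = [[round((x - lo) / scale) for x in row] for row in w]
        return codes, lo, scale

    def dequantize(self, packed):
        codes, lo, scale = packed
        return [[lo + c * scale for c in row] for row in codes]


class QuantizedLoRALinear:
    """Frozen quantized weight plus a trainable low-rank update B @ A."""

    def __init__(self, w, quantizer, rank):
        self.quantizer = quantizer
        self.packed = quantizer.quantize(w)  # only the packed codes are stored
        out_dim, in_dim = len(w), len(w[0])
        self.a = [[0.01] * in_dim for _ in range(rank)]  # rank x in_dim
        self.b = [[0.0] * rank for _ in range(out_dim)]  # zero init: adapter starts as a no-op

    def forward(self, x):
        # The full-precision weight exists only for the duration of this call.
        w_hat = self.quantizer.dequantize(self.packed)
        base = matvec(w_hat, x)
        update = matvec(self.b, matvec(self.a, x))
        return [p + q for p, q in zip(base, update)]


# Toy weights chosen to sit exactly on the 3-bit grid.
layer = QuantizedLoRALinear([[0.0, 1.0], [0.5, -0.75]], UniformQuantizer(bits=3), rank=1)
y_before = layer.forward([1.0, 2.0])
layer.b[0][0] = 1.0  # pretend one training step changed the adapter
y_after = layer.forward([1.0, 2.0])
```

Because the layer only calls `quantize` and `dequantize`, any quantizer exposing those two operations could be swapped in, which is the modularity the abstract emphasizes.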
Observational Constraints on the Response of High-Latitude Northern Forests to Warming
Since the 1960s, carbon cycling in the high-latitude northern forest (HLNF) has experienced dramatic changes: most of the forest is greening and net carbon uptake from the atmosphere has increased. During the same time period, the CO₂ seasonal cycle amplitude (SCA) has increased by ~50% or more. Disentangling the complex processes that drive these changes has been challenging. In this study, we substitute spatial sensitivity to temperature for time to quantify the impact of temperature increase on gross primary production (GPP), total ecosystem respiration (TER), the fraction of Photosynthetically Active Radiation (fPAR), and the resulting contribution of these changes to amplifying the CO₂ SCA over the HLNF since the 1960s. We use the spatial heterogeneity of GPP inferred from solar-induced chlorophyll fluorescence in combination with net ecosystem exchange (NEE) inferred from column CO₂ observations made between 2015 and 2017 by NASA's Orbiting Carbon Observatory-2. We find that three quarters of the spatial variation in GPP can be explained by the spatial variation in the growing season mean temperature (GSMT). The long-term hindcast captures both the magnitude and spatial variability of the trends in observed fPAR. We estimate that between 1960 and 2010, the increase in GSMT enhanced both GPP and the SCA of NEE by ~20%. The calculated enhancement of NEE due to the increase in GSMT contributes 56-72% of the trend in the CO₂ SCA at high latitudes, much larger than simulations by most biogeochemical models.