153 research outputs found
Diversify & Conquer: Outcome-directed Curriculum RL via Out-of-Distribution Disagreement
Reinforcement learning (RL) often faces the challenge of uninformed search problems, in which the agent must explore without access to domain knowledge such as the characteristics of the environment or external rewards. To tackle this challenge, this work proposes a new approach for curriculum RL called Diversify for Disagreement & Conquer (D2C). Unlike previous curriculum learning methods, D2C requires only a few examples of desired outcomes and works in any environment, regardless of its geometry or the distribution of the desired outcome examples. The proposed method diversifies goal-conditional classifiers to identify similarities between visited and desired outcome states, and ensures that the classifiers disagree on out-of-distribution states, which makes it possible to quantify the unexplored region and to design an arbitrary goal-conditioned intrinsic reward signal in a simple and intuitive way. The method then employs bipartite matching to define a curriculum learning objective that produces a sequence of well-adjusted intermediate goals, enabling the agent to automatically explore and conquer the unexplored region. We present experimental results demonstrating that D2C outperforms prior curriculum RL methods both quantitatively and qualitatively, even with arbitrarily distributed desired outcome examples.
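As a rough illustration of the disagreement signal described in this abstract, the sketch below shows how an ensemble of goal classifiers could turn prediction variance into an intrinsic reward. The `GoalClassifier` architecture, the ensemble size, and the variance-based reward are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of a disagreement-based intrinsic reward (assumed design).
import torch
import torch.nn as nn

class GoalClassifier(nn.Module):
    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),  # P(state resembles a desired outcome)
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)

def disagreement_reward(classifiers, states: torch.Tensor) -> torch.Tensor:
    """Intrinsic reward: variance of ensemble predictions.

    Classifiers agree on visited and desired-outcome states (low reward)
    and are trained to disagree on out-of-distribution states, so high
    variance marks the unexplored region the curriculum should push toward.
    """
    with torch.no_grad():
        preds = torch.stack([clf(states) for clf in classifiers], dim=0)
    return preds.var(dim=0).squeeze(-1)

# Usage: an ensemble of 5 classifiers over a 10-D state space.
ensemble = [GoalClassifier(state_dim=10) for _ in range(5)]
states = torch.randn(32, 10)                   # batch of candidate states
r_int = disagreement_reward(ensemble, states)  # shape: (32,)
```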
Growth of A Massive Black Hole Via Tidal Disruption Accretion
Stars that are tidally disrupted by a massive black hole (MBH) may contribute significantly to the growth of the MBH, especially in dense nuclear star clusters (NSCs). Yet this tidal disruption accretion (TDA) of stars onto the MBH has largely been overlooked compared to the gas accretion (GA) channel in most numerical experiments to date. In this work, we implement a black hole growth channel via TDA in the high-resolution adaptive mesh refinement code Enzo to investigate its influence on a MBH seed's early evolution. We find that a MBH seed grows rapidly within 200 Myr in some of the tested simulations. Compared to a MBH seed that grows only via GA, TDA can enhance the MBH's growth rate by more than an order of magnitude. However, as predicted, TDA mainly helps the early growth of the MBH, while the later evolution is generally dominated by GA. We also observe that star formation near the MBH is suppressed when TDA is most active, sometimes with a visible cavity in the gas (of size a few pc) created in the vicinity of the MBH. This is because the MBH may grow expeditiously through both GA and TDA, and the massive MBH can consume its neighboring gas faster than it is replenished by gas inflows. Our study demonstrates the need to consider different channels of black hole accretion, which may provide clues to the existence of supermassive black holes at high redshifts. Comment: 17 pages and 10 figures
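To make the GA-versus-TDA interplay concrete, here is a hedged back-of-the-envelope integration of a seed's mass under the two growth channels. The rate laws (a Bondi-like GA term scaling as M^2 and a weakly mass-dependent TDA term) and all normalizations are assumptions chosen only to reproduce the qualitative behavior the abstract reports (TDA dominating early, GA taking over later), not the Enzo implementation.

```python
# Toy two-channel MBH growth model (all coefficients are assumed, illustrative).
def mdot_ga(m):
    """Gas accretion: Bondi-like, proportional to M^2 (assumed normalization)."""
    return 1e-12 * m**2           # Msun/yr

def mdot_tda(m):
    """Tidal disruption accretion: weak mass dependence (assumed)."""
    return 1e-5 * m**0.1          # Msun/yr

m, t, dt = 1e3, 0.0, 1e4          # seed mass [Msun], time [yr], Euler step [yr]
while t < 200e6:                  # evolve for 200 Myr, as in the paper
    m += (mdot_ga(m) + mdot_tda(m)) * dt
    t += dt

# Early on TDA dominates; by the end GA has overtaken it.
print(f"final mass: {m:.3e} Msun; "
      f"GA rate {mdot_ga(m):.2e} vs TDA rate {mdot_tda(m):.2e} Msun/yr")
```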
Locating carbon neutral mobility hubs using artificial intelligence techniques
This research proposes a novel, three-tier AI-based scheme for the allocation of carbon-neutral mobility hubs. First, it identifies optimal sites using a genetic algorithm that optimizes travel times, achieving a high fitness value of 77,000,000. Second, it performs an ensemble-based suitability analysis of the pinpointed locations using factors such as land-use mix, population and employment densities, and proximity to parking, biking, and transit. Each factor is weighted by its contribution to carbon emissions and incorporated into a suitability analysis model, generating scores that guide the final selection of the most suitable mobility hub sites. The final step employs a traffic assignment model to evaluate the environmental and economic impacts of these sites, including measuring reductions in vehicle kilometers traveled and calculating other cost savings. Addressing Sustainable Development Goals 11 and 9, this study leverages advanced techniques to enhance transportation planning policies. The ensemble model demonstrated strong predictive accuracy, achieving an R-squared of 95% in training and 53% in testing. The identified hub sites reduced daily vehicle travel by 771,074 km, leading to annual savings of 225.5 million USD. By integrating carbon-focused analyses and post-assessment evaluations, this approach offers a comprehensive framework for sustainable mobility hub planning.
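The emissions-weighted suitability scoring in the second tier can be illustrated with a minimal sketch. The factor names, weights, and normalization below are assumptions for illustration; the genetic-algorithm siting step and the ensemble model itself are not reproduced.

```python
# Hedged sketch of emissions-weighted suitability scoring for candidate sites.
import numpy as np

# Candidate hub sites x factors: land-use mix, population density, employment
# density, parking / biking / transit proximity (already normalized to [0, 1]).
factors = np.array([
    [0.8, 0.6, 0.7, 0.5, 0.9, 0.8],   # site A
    [0.4, 0.9, 0.5, 0.7, 0.3, 0.6],   # site B
    [0.6, 0.5, 0.8, 0.9, 0.7, 0.4],   # site C
])

# Hypothetical weights: each factor's share of the carbon-emissions contribution.
weights = np.array([0.25, 0.20, 0.15, 0.10, 0.10, 0.20])
assert np.isclose(weights.sum(), 1.0)

suitability = factors @ weights        # one score per candidate site
ranking = np.argsort(suitability)[::-1]
for site in ranking:
    print(f"site {'ABC'[site]}: suitability = {suitability[site]:.3f}")
```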
Development of Resilience Index in Transport Systems
This paper demonstrates the quantification of a resilience index (RI) for transport systems. Transport infrastructure can be managed using the concepts of resilience. Vugrin, Warren, Ehlen, & Camphouse (2010) emphasized enhancing the resilience of infrastructure before disasters and establishing efficient measures for system recovery in an emergency. The concept of resilience has a significant influence on transport planning and operations for disaster preparation. Lee, Kim, & Lee (2013) investigated the concepts of resilience and examined case studies using asset-management techniques, arguing that resilience concepts should be introduced into transport infrastructure planning and operations. Therefore, this paper presents an RI based on Vugrin et al. (2010) and Lee et al. (2013).
The first part of this paper focuses on the measurement of the RI using recovery-dependent resilience (Vugrin et al., 2010) in transport infrastructures. To quantify the RI, we develop variables that target an achievable or desired system performance in disaster recovery efforts. The second part focuses on applications of the RI in case studies. The examined cases are road networks in flooded areas, heavy-snowfall districts, and landslide occurrence zones. Each case is analyzed for transport costs under both normal and disaster conditions using transport demand estimation models. Finally, we quantify the RI, which is important for establishing the provision of safety, recovery, and rehabilitation of transport infrastructure in flooding, snowfall, and landslide areas.
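A minimal sketch of a recovery-dependent resilience index in the spirit of Vugrin et al. (2010) follows: systemic impact is the shortfall between targeted and actual system performance integrated over the recovery window, normalized against the worst case. The performance curves and the normalization are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: resilience index from a performance-recovery curve.
import numpy as np

t = np.linspace(0, 30, 301)              # days after the disaster
targeted = np.full_like(t, 100.0)        # targeted system performance (e.g., served trips/day)
# Actual performance: drops to 40% at t=0, then recovers exponentially (assumed).
actual = 100.0 - 60.0 * np.exp(-t / 10.0)

si = np.trapz(targeted - actual, t)      # systemic impact (performance-days lost)
max_loss = np.trapz(targeted, t)         # worst case: no performance at all
ri = 1.0 - si / max_loss                 # 1 = fully resilient, 0 = total loss
print(f"systemic impact = {si:.1f}, resilience index = {ri:.3f}")
```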
Generalized Gumbel-Softmax Gradient Estimator for Various Discrete Random Variables
Estimating the gradients of stochastic nodes is one of the crucial research questions in the deep generative modeling community, as it enables gradient descent optimization of neural network parameters. This estimation problem becomes more complex when the stochastic nodes are discrete, because pathwise derivative techniques cannot be applied. Hence, stochastic gradient estimation for discrete distributions requires either a score function method or a continuous relaxation of the discrete random variables. This paper proposes a general version of the Gumbel-Softmax estimator with continuous relaxation, which can relax the discreteness of a more diverse set of probability distributions than the categorical and Bernoulli cases. In detail, we utilize the truncation of discrete random variables and the Gumbel-Softmax trick with a linear transformation for the relaxed reparameterization. The proposed approach enables the relaxed discrete random variable to be reparameterized and backpropagated through a large-scale stochastic computational graph. Our experiments consist of (1) synthetic data analyses, which show the efficacy of our method, and (2) applications to VAEs and topic models, which demonstrate the practical value of the proposed estimator.
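For context, the sketch below shows the standard categorical Gumbel-Softmax relaxation that the paper generalizes; the truncation and linear-transformation machinery for other discrete distributions is not reproduced here, and the toy objective is an assumption.

```python
# Standard Gumbel-Softmax relaxation of a categorical sample (base case only).
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Draw a relaxed one-hot sample; differentiable w.r.t. `logits`."""
    # Gumbel(0, 1) noise via inverse transform; epsilons avoid log(0).
    gumbels = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    return F.softmax((logits + gumbels) / tau, dim=-1)

logits = torch.zeros(4, requires_grad=True)   # uniform over 4 categories
y = gumbel_softmax_sample(logits, tau=0.5)    # relaxed sample, sums to 1
loss = (y * torch.arange(4.0)).sum()          # toy objective: expected index
loss.backward()                               # gradients flow back to the logits
print(y, logits.grad)
```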
CQM: Curriculum Reinforcement Learning with a Quantized World Model
Recent curriculum reinforcement learning (RL) methods have shown notable progress in solving complex tasks by proposing sequences of surrogate tasks. However, previous approaches often face challenges when generating curriculum goals in a high-dimensional space, and thus usually rely on manually specified goal spaces. To alleviate this limitation and improve the scalability of the curriculum, we propose a novel curriculum method that automatically defines a semantic goal space containing the vital information for the curriculum process, and suggests curriculum goals over it. To define the semantic goal space, our method discretizes continuous observations via vector quantized-variational autoencoders (VQ-VAE) and restores the temporal relations between the discretized observations with a graph. Concurrently, our method suggests uncertainty- and temporal-distance-aware curriculum goals that converge to the final goals over the automatically composed goal space. We demonstrate that the proposed method enables efficient exploration in an uninformed environment using only raw goal examples. Our method also outperforms state-of-the-art curriculum RL methods in data efficiency and performance on various goal-reaching tasks, even with ego-centric visual inputs. Comment: Accepted to NeurIPS 2023
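The graph-restoration step can be sketched as follows, with the learned VQ-VAE encoder stubbed out by a nearest-codebook assignment; the codebook, the random trajectory, and the use of `networkx` are illustrative assumptions, not CQM's implementation.

```python
# Sketch: discretize observations, then connect consecutive codes into a graph.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))            # 16 codes in an 8-D latent space

def quantize(obs: np.ndarray) -> int:
    """Map an observation to its nearest codebook entry (stand-in encoder)."""
    return int(np.argmin(np.linalg.norm(codebook - obs, axis=1)))

# Walk a trajectory of raw observations and connect consecutive codes.
trajectory = rng.normal(size=(100, 8))
codes = [quantize(o) for o in trajectory]
graph = nx.DiGraph()
graph.add_nodes_from(codes)
for a, b in zip(codes, codes[1:]):
    if a != b:
        graph.add_edge(a, b)

# Temporal distance between two discrete goals = shortest path in the graph.
src, dst = codes[0], codes[-1]
print(nx.shortest_path_length(graph, src, dst))
```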
DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation
The Transformer is a deep learning language model widely used for natural language processing (NLP) services in datacenters. Among transformer models, the Generative Pre-trained Transformer (GPT) has achieved remarkable performance in text generation, or natural language generation (NLG), which requires processing a large input context in the summarization stage, followed by a generation stage that produces a single word at a time. Conventional platforms such as GPUs are specialized for the parallel processing of large inputs in the summarization stage, but their performance degrades significantly in the generation stage due to its sequential nature. Therefore, an efficient hardware platform is required to address the high latency caused by the sequential nature of text generation.
In this paper, we present DFX, a multi-FPGA acceleration appliance that executes GPT-2 model inference end-to-end with low latency and high throughput in both the summarization and generation stages. DFX uses model parallelism and an optimized, model-and-hardware-aware dataflow for fast simultaneous workload execution across devices. Its compute cores operate on custom instructions and provide GPT-2 operations end-to-end. We implement the proposed hardware architecture on four Xilinx Alveo U280 FPGAs and utilize all channels of the high-bandwidth memory (HBM) and the maximum number of compute resources for high hardware efficiency. DFX achieves a 5.58x speedup and 3.99x higher energy efficiency over four NVIDIA V100 GPUs on the modern GPT-2 model. DFX is also 8.21x more cost-effective than the GPU appliance, suggesting that it is a promising solution for text generation workloads in cloud datacenters. Comment: Extension of HOTCHIPS 2022; accepted at MICRO 2022
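As a back-of-the-envelope view of the model-parallel partitioning, the sketch below splits GPT-2's attention heads and feed-forward columns evenly across four devices; the mapping and the parameter-count approximation are assumptions for illustration, not DFX's exact design.

```python
# Rough model-parallel partitioning arithmetic for GPT-2 (small) on 4 devices.
N_DEVICES = 4
N_HEADS, D_MODEL, N_LAYERS = 12, 768, 12         # GPT-2 (small) configuration

heads_per_dev = N_HEADS // N_DEVICES             # 3 attention heads per FPGA
ffn_cols_per_dev = (4 * D_MODEL) // N_DEVICES    # 768 FFN columns per FPGA

# Per layer: QKV + output projections (4*d^2) and two FFN matrices (8*d^2).
params_per_layer = 12 * D_MODEL**2
params_per_dev = N_LAYERS * params_per_layer // N_DEVICES
print(f"{heads_per_dev} heads, {ffn_cols_per_dev} FFN columns, "
      f"~{params_per_dev / 1e6:.1f}M parameters per device")
```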