153 research outputs found

    Diversify & Conquer: Outcome-directed Curriculum RL via Out-of-Distribution Disagreement

    Reinforcement learning (RL) often faces the challenge of uninformed search, where the agent must explore without access to domain knowledge such as environment characteristics or external rewards. To tackle this challenge, this work proposes a new approach for curriculum RL called Diversify for Disagreement & Conquer (D2C). Unlike previous curriculum learning methods, D2C requires only a few examples of desired outcomes and works in any environment, regardless of its geometry or the distribution of the desired outcome examples. The proposed method diversifies goal-conditioned classifiers to identify similarities between visited and desired outcome states, and ensures that the classifiers disagree on out-of-distribution states, which enables quantifying the unexplored region and designing a goal-conditioned intrinsic reward signal in a simple and intuitive way. The method then employs bipartite matching to define a curriculum learning objective that produces a sequence of well-adjusted intermediate goals, which enable the agent to automatically explore and conquer the unexplored region. We present experimental results demonstrating that D2C outperforms prior curriculum RL methods in both quantitative and qualitative aspects, even with arbitrarily distributed desired outcome examples.
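    A minimal sketch of the two mechanisms described above, the ensemble-disagreement intrinsic reward and the bipartite-matching curriculum, is given below. The toy classifiers, states, and distance function are illustrative placeholders, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

def disagreement_reward(classifiers, states):
    """Intrinsic reward from ensemble disagreement: the classifiers agree on
    visited/desired states but are trained to disagree out-of-distribution,
    so high prediction variance marks the unexplored region."""
    preds = np.stack([clf(states) for clf in classifiers])  # (n_clf, n_states)
    return preds.var(axis=0)

def curriculum_goals(candidates, desired, distance):
    """Bipartite matching between frontier candidates and desired outcomes;
    the matched candidates serve as the next intermediate curriculum goals."""
    cost = np.array([[distance(c, d) for d in desired] for c in candidates])
    rows, _ = linear_sum_assignment(cost)
    return [candidates[r] for r in rows]

# Toy usage: an ensemble of three random linear "classifiers" on 2-D states.
classifiers = [lambda s, w=rng.normal(size=2): 1 / (1 + np.exp(-s @ w))
               for _ in range(3)]
states = rng.normal(size=(5, 2))
desired = rng.normal(size=(3, 2))
print(disagreement_reward(classifiers, states))
print(curriculum_goals(list(states), list(desired),
                       lambda a, b: np.linalg.norm(a - b)))
```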

    Growth of A Massive Black Hole Via Tidal Disruption Accretion

    Stars that are tidally disrupted by a massive black hole (MBH) may contribute significantly to the growth of the MBH, especially in dense nuclear star clusters (NSCs). Yet, this tidal disruption accretion (TDA) of stars onto the MBH has largely been overlooked compared to the gas accretion (GA) channel in most numerical experiments until now. In this work, we implement a black hole growth channel via TDA in the high-resolution adaptive mesh refinement code Enzo to investigate its influence on a MBH seed's early evolution. We find that a MBH seed grows rapidly from $10^3\,\mathrm{M}_\odot$ to $\gtrsim 10^6\,\mathrm{M}_\odot$ in 200 Myr in some of the tested simulations. Compared to a MBH seed that grows only via GA, TDA can enhance the MBH's growth rate by up to more than an order of magnitude. However, as predicted, TDA mainly helps the early growth of the MBH (from $10^{3-4}\,\mathrm{M}_\odot$ to $\lesssim 10^{5}\,\mathrm{M}_\odot$), while the later evolution is generally dominated by GA. We also observe that star formation near the MBH is suppressed when TDA is most active, sometimes with a visible cavity in the gas (of size $\sim$ a few pc) created in the vicinity of the MBH. This is because the MBH may grow expeditiously via both GA and TDA, and the massive MBH can consume its neighboring gas faster than it is replenished by gas inflows. Our study demonstrates the need to consider different channels of black hole accretion, which may provide clues for the existence of supermassive black holes at high redshifts. Comment: 17 pages and 10 figures.
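    As a rough, back-of-the-envelope illustration of the two regimes described above (not the Enzo implementation), the sketch below integrates a toy growth law in which an Eddington-like GA term proportional to the MBH mass competes with a sub-linear TDA term, alongside the classical tidal radius $r_t = R_*(M_\mathrm{BH}/M_*)^{1/3}$. All rates and initial conditions are made-up placeholders.

```python
import numpy as np

def tidal_radius_pc(m_bh, m_star=1.0, r_star_rsun=1.0):
    """Classical tidal radius r_t = R_* (M_BH / M_*)^(1/3) in parsecs,
    with masses in solar masses (1 R_sun ~ 2.25e-8 pc)."""
    return 2.25e-8 * r_star_rsun * (m_bh / m_star) ** (1.0 / 3.0)

# Toy growth: GA rate proportional to M (e-folding time t_ga, placeholder);
# TDA rate scales sub-linearly with M, so it matters mainly while M is small.
t_ga = 4.0e7          # years (placeholder e-folding time)
tda0 = 1.0e-4         # M_sun / yr at M = 1e3 M_sun (placeholder)

m, t, dt = 1.0e3, 0.0, 1.0e4   # M_sun, yr, yr
while t < 2.0e8:               # integrate for 200 Myr with forward Euler
    m += (m / t_ga + tda0 * (m / 1.0e3) ** 0.5) * dt
    t += dt
print(f"M(200 Myr) ~ {m:.2e} M_sun, r_t ~ {tidal_radius_pc(m):.2e} pc")
```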

    Locating carbon neutral mobility hubs using artificial intelligence techniques

    This research proposes a novel, three-tier AI-based scheme for the allocation of carbon-neutral mobility hubs. First, it identifies optimal sites using a genetic algorithm that optimizes travel times, achieving a high fitness value of 77,000,000. Second, it performs an ensemble-based suitability analysis of the pinpointed locations, using factors such as land-use mix, population and employment densities, and proximity to parking, biking, and transit. Each factor is weighted by its carbon-emissions contribution and incorporated into a suitability analysis model, generating scores that guide the final selection of the most suitable mobility hub sites. The final step employs a traffic assignment model to evaluate these sites' environmental and economic impacts, including measuring reductions in vehicle kilometers traveled and calculating other cost savings. Addressing Sustainable Development Goals 11 and 9, this study leverages advanced techniques to enhance transportation planning policies. The ensemble model demonstrated strong predictive accuracy, achieving an R-squared of 95% in training and 53% in testing. The identified hub sites reduced daily vehicle travel by 771,074 km, leading to annual savings of 225.5 million USD. By integrating carbon-focused analyses and post-assessment evaluations, this approach offers a comprehensive framework for sustainable mobility hub planning.
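    A hedged sketch of the first tier, the genetic-algorithm site search, appears below. The demand points, candidate grid, objective, and GA parameters are all invented for illustration; the study's real objective is built from network travel times rather than straight-line distances.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy demand points and candidate hub sites on a 2-D plane (placeholders
# for the study's real network and travel-time data).
demand = rng.uniform(0, 100, size=(200, 2))
candidates = rng.uniform(0, 100, size=(50, 2))
N_HUBS = 5

def total_travel(chromosome):
    """Objective: each demand point travels to its nearest selected hub."""
    hubs = candidates[chromosome]
    dists = np.linalg.norm(demand[:, None, :] - hubs[None, :, :], axis=2)
    return dists.min(axis=1).sum()

def evolve(pop_size=40, generations=100, mut_rate=0.2):
    """Minimal GA: elitist truncation, one-point crossover, point mutation."""
    pop = [rng.choice(len(candidates), N_HUBS, replace=False)
           for _ in range(pop_size)]
    for _ in range(generations):
        pop = sorted(pop, key=total_travel)[: pop_size // 2]  # keep the fittest
        while len(pop) < pop_size:
            a, b = rng.choice(pop_size // 2, 2, replace=False)
            cut = rng.integers(1, N_HUBS)
            child = np.unique(np.concatenate([pop[a][:cut], pop[b][cut:]]))
            while len(child) < N_HUBS:   # repair duplicates after crossover
                child = np.unique(np.append(child,
                                            rng.integers(len(candidates))))
            if rng.random() < mut_rate:
                child[rng.integers(N_HUBS)] = rng.integers(len(candidates))
            pop.append(child)
    return min(pop, key=total_travel)

best = evolve()
print("selected hub sites:", best, "travel cost:", total_travel(best))
```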

    Development of Resilience Index in Transport Systems

    This paper demonstrates the quantification of a resilience index (RI) for transport systems. Transport infrastructure can be managed using the concepts of resilience. Vugrin, Warren, Ehlen, & Camphouse (2010) emphasized enhancing the resilience of infrastructure before disasters and establishing efficient measures for the recovery of systems in an emergency. The concept of resilience has a significant influence on transport planning and operations for disaster preparation. Lee, Kim, & Lee (2013) investigated the concepts of resilience and examined case studies using asset-management techniques, arguing that resilience concepts should be introduced into transport infrastructure planning and operations. Therefore, this paper presents an RI based on Vugrin et al. (2010) and Lee et al. (2013). The first part of this paper focuses on the measurement of the RI using recovery-dependent resilience (Vugrin et al., 2010) in transport infrastructures. For quantifying the RI, we have developed various variables that target an achievable or desired system performance in disaster recovery efforts. The second part of this paper focuses on applications of the RI in case studies. The examined cases are road networks in flooded areas, heavy snowfall districts, and landslide occurrence zones. Each case is analyzed for transport costs under both normal and disaster conditions using transport demand estimation models. Finally, we quantify the RI, which is important for establishing the provision of safety, recovery, and rehabilitation of transport infrastructures in flooding, snowfall, and landslide areas.
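    In the recovery-dependent formulation of Vugrin et al. (2010), resilience is quantified from the systemic impact (the area between the targeted and actual performance curves) and the total recovery effort, normalized by the targeted performance. A minimal sketch of that quantification might look like the following; the performance curves, horizon, and weighting are illustrative assumptions, not the paper's case-study data.

```python
import numpy as np

def resilience_cost(t, tsp, sp, effort, alpha=1.0):
    """Recovery-dependent resilience in the spirit of Vugrin et al. (2010):
    systemic impact SI = integral of (TSP - SP); total recovery effort
    TRE = integral of recovery expenditure; both are normalized by the
    integral of targeted performance. Lower values mean higher resilience."""
    dt = t[1] - t[0]
    si = np.sum(tsp - sp) * dt        # performance lost to the disruption
    tre = np.sum(effort) * dt         # cost of restoring the system
    return (si + alpha * tre) / (np.sum(tsp) * dt)

# Placeholder curves: performance drops to 40% at day 10 and recovers
# linearly by day 70; recovery effort is spent during that window.
t = np.linspace(0, 100, 201)                      # days
tsp = np.ones_like(t)                             # targeted performance
sp = np.where(t < 10, 1.0, np.minimum(1.0, 0.4 + 0.6 * (t - 10) / 60))
effort = np.where((t >= 10) & (t <= 70), 0.05, 0.0)
print(f"resilience cost ~ {resilience_cost(t, tsp, sp, effort):.3f}")
```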

    Generalized Gumbel-Softmax Gradient Estimator for Various Discrete Random Variables

    Estimating the gradients of stochastic nodes is one of the crucial research questions in the deep generative modeling community, as it enables gradient-descent optimization of neural network parameters. This estimation problem becomes more complex when the stochastic nodes are discrete, because pathwise derivative techniques cannot be applied. Hence, stochastic gradient estimation for discrete distributions requires either a score function method or a continuous relaxation of the discrete random variables. This paper proposes a general version of the Gumbel-Softmax estimator with continuous relaxation; this estimator can relax the discreteness of a more diverse range of probability distributions than the categorical and Bernoulli. In detail, we utilize the truncation of discrete random variables and the Gumbel-Softmax trick with a linear transformation for the relaxed reparameterization. The proposed approach enables the relaxed discrete random variable to be reparameterized and backpropagated through a large-scale stochastic computational graph. Our experiments consist of (1) synthetic data analyses, which show the efficacy of our methods; and (2) applications to a VAE and a topic model, which demonstrate the value of the proposed estimator in practice.
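    The core of the construction can be sketched in a few lines: truncate a discrete distribution (a Poisson here) to a finite support, apply the standard Gumbel-Softmax relaxation over the truncated PMF, and map the relaxed one-hot vector back onto the support with a linear transformation. The truncation level and temperature below are illustrative choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def truncated_poisson_logits(lam, K):
    """Log-probabilities of Poisson(lam) truncated to the support {0, ..., K-1}."""
    k = np.arange(K)
    log_fact = np.array([np.sum(np.log(np.arange(1, i + 1))) for i in k])
    log_pmf = k * np.log(lam) - lam - log_fact
    return log_pmf - np.logaddexp.reduce(log_pmf)   # renormalize after truncation

def gumbel_softmax(logits, tau):
    """Standard Gumbel-Softmax relaxation of a categorical sample."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1) noise
    y = (logits + g) / tau
    return np.exp(y - np.logaddexp.reduce(y))             # softmax

K, tau = 20, 0.5
logits = truncated_poisson_logits(lam=4.0, K=K)
y = gumbel_softmax(logits, tau)
relaxed_sample = y @ np.arange(K)   # linear map from the simplex back to the support
print(f"relaxed Poisson draw ~ {relaxed_sample:.3f}")
```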

    CQM: Curriculum Reinforcement Learning with a Quantized World Model

    Recent curriculum Reinforcement Learning (RL) has shown notable progress in solving complex tasks by proposing sequences of surrogate tasks. However, previous approaches often face challenges when generating curriculum goals in a high-dimensional space, so they usually rely on manually specified goal spaces. To alleviate this limitation and improve the scalability of the curriculum, we propose a novel curriculum method that automatically defines the semantic goal space containing vital information for the curriculum process, and suggests curriculum goals over it. To define the semantic goal space, our method discretizes continuous observations via vector quantized-variational autoencoders (VQ-VAE) and restores the temporal relations between the discretized observations with a graph. Concurrently, ours suggests uncertainty- and temporal-distance-aware curriculum goals that converge to the final goals over the automatically composed goal space. We demonstrate that the proposed method enables efficient exploration in an uninformed environment with raw goal examples only. Also, ours outperforms state-of-the-art curriculum RL methods in data efficiency and performance on various goal-reaching tasks, even with ego-centric visual inputs. Comment: Accepted to NeurIPS 2023.
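    A hedged sketch of the discretization step follows: encoder latents are snapped to their nearest codebook entries (the VQ-VAE lookup), and a graph connects codes visited on consecutive timesteps, recovering the temporal structure that quantization alone discards. The codebook size, latent dimension, and random stand-in "encoder" outputs are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

codebook = rng.normal(size=(64, 8))      # 64 codes, 8-D latents (placeholders)

def quantize(z):
    """VQ-VAE-style nearest-neighbor lookup: map each latent to a code index."""
    d = np.linalg.norm(z[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

def temporal_graph(code_seq):
    """Adjacency over discrete codes: connect codes visited consecutively,
    restoring the temporal relations between discretized observations."""
    adj = np.zeros((len(codebook), len(codebook)), dtype=bool)
    for a, b in zip(code_seq[:-1], code_seq[1:]):
        if a != b:
            adj[a, b] = adj[b, a] = True
    return adj

# Toy episode: latents from a stand-in encoder, quantized and then graphed.
latents = rng.normal(size=(100, 8))
codes = quantize(latents)
adj = temporal_graph(codes)
print("visited codes:", np.unique(codes).size, "edges:", adj.sum() // 2)
```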

    DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation

    The Transformer is a deep learning language model widely used for natural language processing (NLP) services in datacenters. Among Transformer models, the Generative Pre-trained Transformer (GPT) has achieved remarkable performance in text generation, or natural language generation (NLG), which requires processing a large input context in the summarization stage, followed by a generation stage that produces a single word at a time. Conventional platforms such as GPUs are specialized for the parallel processing of large inputs in the summarization stage, but their performance degrades significantly in the generation stage due to its sequential characteristic. Therefore, an efficient hardware platform is required to address the high latency caused by the sequential characteristic of text generation. In this paper, we present DFX, a multi-FPGA acceleration appliance that executes GPT-2 model inference end-to-end with low latency and high throughput in both the summarization and generation stages. DFX uses model parallelism and an optimized dataflow that is model-and-hardware-aware for fast simultaneous workload execution among devices. Its compute cores operate on custom instructions and provide GPT-2 operations end-to-end. We implement the proposed hardware architecture on four Xilinx Alveo U280 FPGAs and utilize all channels of the high-bandwidth memory (HBM) and the maximum number of compute resources for high hardware efficiency. DFX achieves 5.58x speedup and 3.99x energy efficiency over four NVIDIA V100 GPUs on the modern GPT-2 model. DFX is also 8.21x more cost-effective than the GPU appliance, suggesting that it is a promising solution for text-generation workloads in cloud datacenters. Comment: Extension of HOTCHIPS 2022 and accepted in MICRO 2022.
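    The latency asymmetry that motivates DFX can be illustrated with a stand-in model call: summarization (prefill) handles the entire prompt in one parallel pass, while generation must run one full sequential pass per output token. The sketch below illustrates that control flow only; it is not DFX's dataflow or instruction set.

```python
import numpy as np

def model_forward(tokens):
    """Stand-in for one GPT-2 forward pass; returns a next-token id."""
    return int(np.sum(tokens) % 50257)          # 50257 = GPT-2 vocab size

def generate(prompt_tokens, n_new):
    # Summarization/prefill: the full prompt is processed in ONE batched
    # pass, which parallel hardware such as GPUs handles well.
    tokens = list(prompt_tokens)
    next_tok = model_forward(tokens)
    # Generation: each new token requires a FULL model pass that depends on
    # the previous token, so latency grows linearly with output length; this
    # sequential stage is what DFX accelerates across multiple FPGAs.
    for _ in range(n_new):
        tokens.append(next_tok)
        next_tok = model_forward(tokens)
    return tokens

out = generate(prompt_tokens=range(512), n_new=64)
print(len(out), "tokens; 1 prefill pass + 64 sequential decode passes")
```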