8,573 research outputs found
Machine learning in solar physics
The application of machine learning in solar physics has the potential to
greatly enhance our understanding of the complex processes that take place in
the atmosphere of the Sun. By using techniques such as deep learning, we are
now in the position to analyze large amounts of data from solar observations
and identify patterns and trends that may not have been apparent using
traditional methods. This can help us improve our understanding of explosive
events like solar flares, which can strongly affect Earth's environment;
predicting such hazardous events is crucial for our technological society.
Machine learning can also improve our understanding of the inner workings of
the Sun itself by allowing us to go deeper into the data
and to propose more complex models to explain them. Additionally, the use of
machine learning can help to automate the analysis of solar data, reducing the
need for manual labor and increasing the efficiency of research in this field.
Comment: 100 pages, 13 figures, 286 references, accepted for publication as a Living Review in Solar Physics (LRSP)
Contextual Pre-Planning on Reward Machine Abstractions for Enhanced Transfer in Deep Reinforcement Learning
Recent studies show that deep reinforcement learning (DRL) agents tend to
overfit to the task on which they were trained and fail to adapt to minor
environment changes. To expedite learning when transferring to unseen tasks, we
propose a novel approach to representing the current task using reward machines
(RM), state machine abstractions that induce subtasks based on the current
task's rewards and dynamics. Our method provides agents with symbolic
representations of optimal transitions from their current abstract state and
rewards them for achieving these transitions. These representations are shared
across tasks, allowing agents to exploit knowledge of previously encountered
symbols and transitions, thus enhancing transfer. Our empirical evaluation
shows that our representations improve sample efficiency and few-shot transfer
in a variety of domains.
Comment: IJCAI Workshop on Planning and Reinforcement Learning, 202
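The reward-machine idea described above can be sketched as a small finite-state machine whose transitions fire on symbolic events and emit rewards. The class and the key/door events below are hypothetical illustrations, not the paper's implementation:

```python
# Hypothetical minimal reward machine: a finite-state machine whose
# transitions are triggered by symbolic events and emit rewards.
class RewardMachine:
    def __init__(self, transitions, initial_state):
        # transitions: {(state, event): (next_state, reward)}
        self.transitions = transitions
        self.state = initial_state

    def step(self, event):
        # Unrecognized events leave the machine in place with zero reward.
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0))
        self.state = next_state
        return reward

# Toy task: pick up a key, then open a door.
rm = RewardMachine(
    transitions={
        ("start", "got_key"): ("has_key", 0.1),
        ("has_key", "opened_door"): ("done", 1.0),
    },
    initial_state="start",
)
print(rm.step("opened_door"))  # 0.0 -- the door needs the key first
print(rm.step("got_key"))      # 0.1
print(rm.step("opened_door"))  # 1.0
```

Because the machine's abstract states and events are symbolic, they can be shared across tasks, which is what enables the transfer the abstract describes.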
Emergence of Adaptive Circadian Rhythms in Deep Reinforcement Learning
Adapting to regularities of the environment is critical for biological
organisms to anticipate events and plan. A prominent example is the circadian
rhythm corresponding to the internalization by organisms of the 24-hour
period of the Earth's rotation. In this work, we study the emergence of
circadian-like rhythms in deep reinforcement learning agents. In particular, we
deployed agents in an environment with a reliable periodic variation while
solving a foraging task. We systematically characterize the agent's behavior
during learning and demonstrate the emergence of a rhythm that is endogenous
and entrainable. Interestingly, the internal rhythm adapts to shifts in the
phase of the environmental signal without any re-training. Furthermore, we show
via bifurcation and phase response curve analyses how artificial neurons
develop dynamics to support the internalization of the environmental rhythm.
From a dynamical systems view, we demonstrate that the adaptation proceeds by
the emergence of a stable periodic orbit in the neuron dynamics with a phase
response that allows an optimal phase synchronisation between the agent's
dynamics and the environmental rhythm.
Comment: ICML 202
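The entrainment behavior described above can be illustrated with a toy phase oscillator (our own sketch, unrelated to the paper's trained agents): an internal rhythm with a slightly different intrinsic period locks onto a periodic environmental signal through a simple phase response term.

```python
# Toy phase-oscillator sketch (our illustration, not the paper's agents):
# an internal rhythm with intrinsic period `own_period` entrains to an
# environmental signal of period `env_period` via a phase response term.
def simulate(steps, env_period=24.0, own_period=23.0, coupling=0.1):
    phase = 0.0                                   # internal phase in [0, 1)
    for t in range(steps):
        env_phase = (t % env_period) / env_period
        # Wrapped phase difference in [-0.5, 0.5): the "phase response" input.
        diff = (env_phase - phase + 0.5) % 1.0 - 0.5
        phase = (phase + 1.0 / own_period + coupling * diff) % 1.0
    env_phase = (steps % env_period) / env_period
    return abs((env_phase - phase + 0.5) % 1.0 - 0.5)  # residual phase error

print(simulate(500))                # small error: rhythm locks onto the cycle
print(simulate(500, coupling=0.0))  # larger error: uncoupled rhythm drifts
```

With coupling, the phase error settles at a small constant offset (a stable fixed point of the phase dynamics, analogous to the stable periodic orbit the paper analyzes); without it, the internal rhythm drifts freely.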
Reinforcement learning in large state action spaces
Reinforcement learning (RL) is a promising framework for training intelligent agents that learn to optimize long-term utility by directly interacting with the environment. Creating RL methods that scale to large state-action spaces is a critical problem for enabling real-world deployment of RL systems. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints such as decentralization, and a lack of guarantees for important properties such as performance, generalization, and robustness in potentially unseen scenarios.
This thesis is motivated by bridging the aforementioned gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings: single- and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods. In this work we present the first results on several different problems, e.g. tensorization of the Bellman equation, which allows exponential sample-efficiency gains (Chapter 4); provable suboptimality arising from structural constraints in MAS (Chapter 3); combinatorial generalization results in cooperative MAS (Chapter 5); generalization results on observation shifts (Chapter 7); and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory).
In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications.
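As background for the Bellman-equation results mentioned above, a minimal tabular value iteration on a toy two-state MDP looks as follows. This is a generic sketch of the standard backup, not the thesis's tensorized variant:

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    # P: (A, S, S) transition tensor, R: (A, S) rewards.
    V = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * P @ V                   # batched Bellman backup
        V_new = Q.max(axis=0)                   # greedy over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

# Two-state, two-action toy MDP: action 1 moves to the rewarding state 1.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],        # action 0: stay put
              [[0.0, 1.0], [0.0, 1.0]]])       # action 1: go to state 1
R = np.array([[0.0, 1.0],
              [0.0, 1.0]])                      # reward 1 for acting in state 1
print(value_iteration(P, R))                    # -> approximately [9, 10]
```

Writing the backup as a single tensor contraction over (A, S, S), as above, is the starting point that tensorized formulations generalize.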
Swarm Reinforcement Learning For Adaptive Mesh Refinement
The Finite Element Method, an important technique in engineering, is aided by
Adaptive Mesh Refinement (AMR), which dynamically refines mesh regions to allow
for a favorable trade-off between computational speed and simulation accuracy.
Classical methods for AMR depend on task-specific heuristics or expensive error
estimators, hindering their use for complex simulations. Recent learned AMR
methods tackle these problems, but so far scale only to simple toy examples. We
formulate AMR as a novel Adaptive Swarm Markov Decision Process in which a mesh
is modeled as a system of simple collaborating agents that may split into
multiple new agents. This framework allows for a spatial reward formulation
that simplifies the credit assignment problem, which we combine with Message
Passing Networks to propagate information between neighboring mesh elements. We
experimentally validate the effectiveness of our approach, Adaptive Swarm Mesh
Refinement (ASMR), showing that it learns reliable, scalable, and efficient
refinement strategies on a set of challenging problems. Our approach
significantly speeds up computation, achieving up to 30-fold improvement
compared to uniform refinements in complex simulations. Additionally, we
outperform learned baselines and achieve a refinement quality that is on par
with a traditional error-based AMR strategy without expensive oracle
information about the error signal.
Comment: Version 1 of this paper is a preliminary workshop version that was accepted as a workshop paper in the ICLR 2023 Workshop on Physics for Machine Learning
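The message-passing step that such element-agent methods rely on can be sketched generically (hypothetical weights and adjacency, not the ASMR architecture): each mesh-element agent updates its features from its neighbors' before scoring a refinement decision.

```python
import numpy as np

def message_passing_step(features, adjacency, W_self, W_neigh):
    # features: (n_elements, d); adjacency: (n, n) 0/1 element-neighbor matrix.
    deg = adjacency.sum(axis=1, keepdims=True).clip(min=1)
    neigh_mean = adjacency @ features / deg     # mean over neighboring elements
    return np.tanh(features @ W_self + neigh_mean @ W_neigh)

rng = np.random.default_rng(0)
n, d = 4, 3
feats = rng.normal(size=(n, d))
# A chain of 4 mesh elements, each adjacent to the next.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
out = message_passing_step(feats, adj,
                           rng.normal(size=(d, d)), rng.normal(size=(d, d)))
split_logits = out.sum(axis=1)                  # toy per-agent refine/keep score
print(out.shape, split_logits.shape)
```

Because every element runs the same local update, the policy scales to meshes of any size, and elements that "split" simply add new rows to the feature and adjacency matrices.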
Safe Reinforcement Learning as Wasserstein Variational Inference: Formal Methods for Interpretability
Reinforcement Learning or optimal control can provide effective reasoning for
sequential decision-making problems with variable dynamics. Such reasoning in
practical implementation, however, poses a persistent challenge in interpreting
the reward function and corresponding optimal policy. Consequently, formalizing
the sequential decision-making problem as inference has considerable value,
as probabilistic inference in principle offers diverse and powerful
mathematical tools to infer the stochastic dynamics whilst suggesting a
probabilistic interpretation of the reward design and policy convergence. In
this study, we propose a novel Adaptive Wasserstein Variational Optimization
(AWaVO) to tackle these challenges in sequential decision-making. Our approach
utilizes formal methods to provide interpretations of reward design,
transparency of training convergence, and probabilistic interpretation of
sequential decisions. To demonstrate practicality, we show convergent training
with guaranteed global convergence rates not only in simulation but also in
real robot tasks, and empirically verify a reasonable tradeoff between high
performance and conservative interpretability.
Comment: 24 pages, 8 figures, containing Appendix
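The RL-as-inference view referenced above can be illustrated with generic soft value iteration, where the Bellman max becomes a log-sum-exp and the policy an explicit Boltzmann distribution. This is a standard maximum-entropy sketch, not AWaVO itself:

```python
import numpy as np

def soft_value_iteration(P, R, gamma=0.9, iters=200):
    # P: (A, S, S) transitions, R: (A, S) rewards.
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * P @ V
        V = np.log(np.exp(Q).sum(axis=0))       # soft (log-sum-exp) backup
    policy = np.exp(Q - V)                      # Boltzmann policy over actions
    return V, policy

# Two-state, two-action toy MDP: action 1 moves to the rewarding state 1.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [0.0, 1.0]]])
R = np.array([[0.0, 1.0],
              [0.0, 1.0]])
V, pi = soft_value_iteration(P, R)
print(V, pi.sum(axis=0))                        # each column of pi sums to 1
```

The resulting policy is a proper probability distribution over actions in every state, which is what gives the inference framing its probabilistic interpretation of sequential decisions.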
A Survey on Transformers in Reinforcement Learning
Transformers have come to be regarded as the dominant neural architecture in NLP and CV, mostly under supervised settings. Recently, a similar surge of Transformer use has appeared in the domain of reinforcement learning (RL), where it faces unique design choices and challenges brought on by the nature of RL. The evolution of Transformers in RL, however, has not yet been well charted. In this paper, we systematically review the motivations and progress of using Transformers in RL, provide a taxonomy of existing works, discuss each sub-field, and summarize future prospects.
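The core operation behind Transformer-based RL methods such as Decision Transformer is causal self-attention over trajectory tokens. A minimal NumPy sketch follows; the token layout and dimensions are illustrative assumptions, not any specific paper's code:

```python
import numpy as np

def causal_attention(X, Wq, Wk, Wv):
    # X: (T, d) sequence of trajectory tokens, e.g. interleaved
    # (return-to-go, state, action) embeddings.
    T, d = X.shape
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)
    scores[np.triu_indices(T, k=1)] = -np.inf   # mask future tokens
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V                          # causally attended values

rng = np.random.default_rng(0)
T, d = 6, 4                                     # e.g. 2 timesteps x 3 token types
X = rng.normal(size=(T, d))
W = [rng.normal(size=(d, d)) for _ in range(3)] # Wq, Wk, Wv
out = causal_attention(X, *W)
print(out.shape)
```

The causal mask is the RL-specific constraint: each token may only attend to the past, so the model can be rolled out autoregressively at decision time.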
5G RAN/MEC Slicing and Admission Control using Deep Reinforcement Learning
The 5G RAN functions can be virtualized and distributed across the radio unit (RU), distributed unit (DU), and centralized unit (CU) to facilitate flexible resource management. Complemented by multi-access edge computing (MEC), these components create network slices tailored for applications with diverse quality-of-service (QoS) requirements. However, as requests for various slices arrive dynamically over time and network resources are limited, it is non-trivial for an infrastructure provider (InP) to optimize its long-term revenue from real-time admission and embedding of slice requests. Prior works have leveraged Deep Reinforcement Learning (DRL) to address this problem; however, these solutions either do not scale to realistic topologies, require re-training of the DRL agents when facing topology changes, or do not consider the slice admission and embedding problems jointly. In this thesis, we use multi-agent DRL and Graph Attention Networks (GATs) to address these limitations. Specifically, we propose novel topology-independent admission and slicing agents that are scalable and generalizable to large and diverse metropolitan networks. Results show that the proposed approach converges faster and achieves up to 35.2% and 20% gains in revenue compared to heuristics and other DRL-based approaches, respectively. Additionally, we demonstrate that our approach generalizes to scenarios and substrate networks unseen during training, maintaining superior performance without re-training or re-tuning. Finally, we extract the attention maps of the GAT and analyze them to detect potential bottlenecks, which can then be eliminated to efficiently improve network performance and InP revenue.
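The GAT mechanism the thesis builds on can be sketched as a single-head attention layer over a graph. This is a generic NumPy illustration with hypothetical weights, not the slicing agent itself; the returned attention matrix is the kind of map one can inspect for bottlenecks:

```python
import numpy as np

def gat_layer(H, adj, W, a_src, a_dst):
    # H: (n, d_in) node features; adj: (n, n) adjacency (self-loops required
    # so every row attends to at least one node).
    Z = H @ W                                  # (n, d_out) transformed features
    s, t = Z @ a_src, Z @ a_dst                # per-node source/target scores
    e = s[:, None] + t[None, :]                # pairwise attention logits
    e = np.where(e > 0, e, 0.2 * e)            # LeakyReLU
    e = np.where(adj > 0, e, -np.inf)          # attend only along edges
    attn = np.exp(e - e.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ Z, attn

rng = np.random.default_rng(1)
n, d_in, d_out = 5, 4, 3
H = rng.normal(size=(n, d_in))
adj = (rng.random((n, n)) < 0.4).astype(float)
adj = np.maximum(adj, np.eye(n))               # add self-loops
out, attn = gat_layer(H, adj, rng.normal(size=(d_in, d_out)),
                      rng.normal(size=d_out), rng.normal(size=d_out))
print(out.shape)                               # aggregated features per node
```

Because the attention weights depend only on node features and edges, not on a fixed node ordering, the same layer applies unchanged to substrate topologies of different sizes, which is the source of the topology independence claimed above.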
Augmented Behavioral Annotation Tools, with Application to Multimodal Datasets and Models: A Systematic Review
Annotation tools are an essential component in the creation of datasets for machine learning purposes. They have evolved greatly since the turn of the century and now commonly include collaborative features to divide labor efficiently, as well as automation that amplifies human effort. Recent developments in machine learning models, such as Transformers, allow for training on very large and sophisticated multimodal datasets and enable generalization across domains of knowledge. These models also herald an increasing emphasis on prompt engineering to provide qualitative fine-tuning of the model itself, adding a novel emerging layer of direct machine learning annotation. These capabilities enable machine intelligence to recognize, predict, and emulate human behavior with much greater accuracy and nuance, a noted shortfall that has contributed to algorithmic injustice in earlier techniques. However, the scale and complexity of the training data required for multimodal models present engineering challenges. Best practices for conducting annotation for large multimodal models in the safest, most ethical, and still efficient manner have not been established. This paper presents a systematic literature review of crowd- and machine-learning-augmented behavioral annotation methods to distill practices that may have value in multimodal implementations, cross-correlated across disciplines. Research questions were defined to provide an overview of the evolution of augmented behavioral annotation tools in the past, in relation to the present state of the art. (Contains five figures and four tables.)
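One recurring aggregation step in the crowd-annotation pipelines this literature reviews is merging labels from multiple annotators. A minimal majority-vote sketch with an agreement score follows; it is a generic illustration, not taken from any specific tool:

```python
from collections import Counter

# Merge labels from several annotators by majority vote, keeping the level
# of disagreement as a per-item confidence signal.
def aggregate(annotations):
    # annotations: {item_id: [label, label, ...]} from multiple annotators
    merged = {}
    for item, labels in annotations.items():
        (label, count), = Counter(labels).most_common(1)
        merged[item] = {"label": label, "agreement": count / len(labels)}
    return merged

votes = {
    "clip_01": ["walking", "walking", "running"],
    "clip_02": ["sitting", "sitting", "sitting"],
}
print(aggregate(votes))
```

Items with low agreement are the natural candidates for escalation to expert review or model-assisted re-annotation, which is where the augmented workflows surveyed here come in.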