
    Single- and multiobjective reinforcement learning in dynamic adversarial games

    This thesis uses reinforcement learning (RL) to address dynamic adversarial games in the context of air combat manoeuvring simulation. Air combat manoeuvring is a sequential decision problem commonly encountered in operations research, and its simulation has conventionally relied on agent programming methods that require significant domain knowledge to be manually encoded into the simulation environment. These methods are appropriate for determining the effectiveness of existing tactics in different simulated scenarios. However, in order to maximise the advantages provided by new technologies (such as autonomous aircraft), new tactics will need to be discovered. As a proven technique for solving sequential decision problems, RL has the potential to discover these new tactics. This thesis explores four RL approaches (tabular, deep, discrete-to-deep and multiobjective) as mechanisms for discovering new behaviours in simulations of air combat manoeuvring. It implements and tests several methods for each approach and compares those methods in terms of learning time, baseline and comparative performance, and implementation complexity. In addition to evaluating the utility of existing approaches to the specific task of air combat manoeuvring, this thesis proposes and investigates two novel methods, discrete-to-deep supervised policy learning (D2D-SPL) and discrete-to-deep supervised Q-value learning (D2D-SQL), which can be applied more generally. D2D-SPL and D2D-SQL offer the generalisability of deep RL at a cost closer to that of the tabular approach.
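
    The abstract names but does not detail the discrete-to-deep idea; one common way to picture it is a two-stage pipeline in which a tabular policy is learned first and its greedy actions are then distilled into a small neural network by supervised learning. The sketch below is an illustrative assumption of that general pattern, not the thesis's D2D-SPL algorithm: the environment sizes, the q_update helper, and the features mapping are all hypothetical placeholders.

```python
# Hypothetical two-stage sketch of a discrete-to-deep pipeline (illustrative only).
import numpy as np
import torch
import torch.nn as nn

n_states, n_actions, state_dim = 100, 4, 6     # assumed sizes for the discretised task

# Stage 1: tabular Q-learning over a discretised state space.
Q = np.zeros((n_states, n_actions))
def q_update(s, a, r, s_next, alpha=0.1, gamma=0.99):
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

rng = np.random.default_rng(0)
for _ in range(5000):                          # placeholder experience; a real run queries the simulator
    s, a = rng.integers(n_states), rng.integers(n_actions)
    q_update(s, a, r=rng.standard_normal(), s_next=rng.integers(n_states))

# Stage 2: distil the tabular greedy policy into a network over continuous features.
# `features` stands in for the continuous representation of each discrete state.
features = rng.standard_normal((n_states, state_dim)).astype(np.float32)
labels = Q.argmax(axis=1)                      # greedy action per discrete state

policy_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
opt = torch.optim.Adam(policy_net.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.from_numpy(features), torch.from_numpy(labels)
for _ in range(200):                           # supervised policy learning over the tabular labels
    opt.zero_grad()
    loss_fn(policy_net(x), y).backward()
    opt.step()
```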

    Memory, Space, and Planning: Multiscale Predictive Representations

    Memory is inherently entangled with prediction and planning. Flexible behavior in biological and artificial agents depends on the interplay of learning from the past and predicting the future in ever-changing environments. This chapter reviews computational, behavioral, and neural evidence suggesting these processes rely on learning the relational structure of experiences, known as cognitive maps, and draws two key takeaways. First, these memory structures are organized as multiscale, compact predictive representations across hippocampal and prefrontal cortex (PFC) hierarchies. Second, we argue that such predictive memory structures are crucial to the complementary functions of the hippocampus and the PFC, enabling both the recall of detailed and coherent past episodes and the generalization of experiences at varying scales for efficient prediction and planning. These insights advance our understanding of memory and planning mechanisms in the brain and hold significant implications for advancing artificial intelligence systems. Comment: To be published as a chapter in an edited volume by Oxford University Press (Editors: Sara Aronowitz and Lynn Nadel).
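
    One concrete reading of "multiscale, compact predictive representations" is the successor representation computed at several temporal horizons; the toy sketch below illustrates that reading and is an assumption added here, not material from the chapter. The ring-world transition matrix and the particular discount factors are hypothetical.

```python
# Toy multiscale predictive representations: the successor representation
# M = (I - gamma * T)^(-1) of a random walk on a ring, at several horizons.
import numpy as np

n = 8
T = np.zeros((n, n))                 # assumed toy environment: random walk on a ring of 8 states
for s in range(n):
    T[s, (s - 1) % n] = T[s, (s + 1) % n] = 0.5

def successor_representation(T, gamma):
    """Expected discounted future occupancy of each state, given transitions T."""
    return np.linalg.inv(np.eye(len(T)) - gamma * T)

# Small gamma -> local, fine-grained predictions; large gamma -> coarse, long-horizon maps.
multiscale_maps = {g: successor_representation(T, g) for g in (0.5, 0.9, 0.99)}
```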

    Intentions and Creative Insights: a Reinforcement Learning Study of Creative Exploration in Problem-Solving

    Insight is perhaps the cognitive phenomenon most closely associated with creativity. People engaged in problem-solving sometimes experience a sudden transformation: they see the problem in a radically different manner, and simultaneously feel with great certainty that they have found the right solution. The change of problem representation is called "restructuring", and the affective changes associated with sudden progress are called the "Aha!" experience. Together, restructuring and the "Aha!" experience characterize insight. Reinforcement learning is both a theory of biological learning and a subfield of machine learning. In its psychological and neuroscientific guise, it is used to model habit formation and, increasingly, executive function. In its artificial intelligence guise, it is currently the favored paradigm for modeling agents interacting with an environment. Reinforcement learning, I argue, can serve as a model of insight: its foundation in learning coincides with the role of experience in insight problem-solving; its use of an explicit "value" provides the basis for the "Aha!" experience; and finally, in a hierarchical form, it can achieve a sudden change of representation resembling restructuring. An experiment helps confirm some parallels between reinforcement learning and insight. It shows how transfer from prior tasks results in considerably accelerated learning, and how the increase in the value function resembles the sense of progress corresponding to the "Aha!" moment. However, a model of insight based on hierarchical reinforcement learning did not display the expected "insightful" behavior. A second model of insight is presented, in which temporal abstraction is based on self-prediction: by predicting its own future decisions, an agent adjusts its course of action on the basis of unexpected events. This kind of temporal abstraction, I argue, corresponds to what we call "intentions", and offers a promising model for biological insight. It explains the "Aha!" experience as resulting from a temporal difference error, whereas restructuring results from an adjustment of the agent's internal state on the basis of either new information or a stochastic interpretation of stimuli. The model is called the actor-critic-intention (ACI) architecture. Finally, the relationship between intentions, insight, and creativity is extensively discussed in light of these models: other works in the philosophical and scientific literature are related to, and sometimes illuminated by, the ACI architecture.
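
    As a rough illustration of the actor-critic machinery this work builds on (not the ACI architecture itself), the sketch below shows a plain tabular actor-critic in which the temporal-difference error is the scalar surprise signal the abstract links to the "Aha!" experience; the state and action counts, learning rates, and the example transition are assumptions.

```python
# Generic tabular actor-critic (illustrative; not the ACI model from the thesis).
import numpy as np

n_states, n_actions = 10, 3
V = np.zeros(n_states)                       # critic: state-value estimates
prefs = np.zeros((n_states, n_actions))      # actor: action preferences

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def actor_critic_step(s, a, r, s_next, alpha_v=0.1, alpha_pi=0.1, gamma=0.95):
    td_error = r + gamma * V[s_next] - V[s]  # surprise signal; large and positive on unexpected success
    V[s] += alpha_v * td_error               # critic update
    prefs[s, a] += alpha_pi * td_error       # actor update toward surprisingly good actions
    return td_error

# Example step: an unexpectedly rewarding transition yields a large positive TD error.
rng = np.random.default_rng(0)
a = int(rng.choice(n_actions, p=softmax(prefs[2])))
delta = actor_critic_step(s=2, a=a, r=1.0, s_next=3)
```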

    INVESTIGATING COLLABORATIVE EXPLAINABLE AI (CXAI)/SOCIAL FORUM AS AN EXPLAINABLE AI (XAI) METHOD IN AUTONOMOUS DRIVING (AD)

    Explainable AI (XAI) systems primarily focus on algorithms, integrating additional information into AI decisions and classifications to enhance user or developer comprehension of the system's behavior. These systems often incorporate untested concepts of explainability, lacking grounding in the cognitive and educational psychology literature (S. T. Mueller et al., 2021). Consequently, their effectiveness may be limited, as they may address problems that real users don't encounter or provide information that users do not seek. In contrast, an alternative approach called Collaborative XAI (CXAI), as proposed by S. Mueller et al. (2021), emphasizes generating explanations without relying solely on algorithms. CXAI centers on enabling users to ask questions and share explanations based on their knowledge and experience to facilitate others' understanding of AI systems. Mamun, Hoffman, et al. (2021) developed a CXAI system akin to a Social Question and Answer (SQA) platform (S. Oh, 2018a), adapting it for AI system explanations. The system passed an evaluation based on XAI metrics (Hoffman, Mueller, et al., 2018), as implemented in a master's thesis by Mamun (2021), which validated its effectiveness in a basic image classification domain and explored the types of explanations it generated. This Ph.D. dissertation builds upon this prior work, aiming to apply it in a novel context: users and potential users of self-driving semi-autonomous vehicles. This approach seeks to unravel communication patterns within a social QA platform (S. Oh, 2018a), the types of questions it can assist with, and the benefits it might offer users of widely adopted AI systems. Initially, the feasibility of using existing social QA platforms as explanatory tools for an existing AI system was investigated. The study found that users on these platforms collaboratively assist one another in problem-solving, with many issues being resolved (Linja et al., 2022). An intriguing discovery was that anger directed at the AI system drove increased engagement on the platform. The subsequent phase leverages observations from social QA platforms in the autonomous driving (AD) sector to gain insights into an AI system within a vehicle. The dissertation includes two simulation studies employing these observations as training materials. The studies explore users' Level 3 Situational Awareness (Endsley, 1995) when the autonomous vehicle exhibits abnormal behavior, investigating detection rates and users' comprehension of abnormal driving situations. Additionally, these studies measure the perception of personalization in the training process (Zhang & Curley, 2018), as well as cognitive workload (Hart & Staveland, 1988), trust, and reliance (Körber, 2018) with respect to the training process. The findings from these studies are mixed, showing higher detection rates of abnormal driving with training but diminished trust and reliance. The final study engages current Tesla FSD users in semi-structured interviews (Crandall et al., 2006) to explore their use of social QA platforms, their knowledge sources during the training phase, and their search for answers to abnormal driving scenarios. The results reveal extensive collaboration through social forums and group discussions, shedding light on differences in trust and reliance within this domain.

    Imaging fast neural activity in the brain during epilepsy with electrical impedance tomography

    Electrical impedance tomography (EIT) is a medical imaging technique which reconstructs images of the internal conductivity of an object using boundary measurements obtained by applying current through pairs of non-penetrating surface electrodes. EIT is able to image impedance changes which arise during neural activity at high spatiotemporal resolution through the rat cerebral cortex and therefore represents a novel method for understanding neuronal network dynamics in epilepsy. Additionally, it holds therapeutic potential for improving the presurgical localisation of epileptogenic foci in individuals with drug-resistant epilepsy. This thesis was aimed at developing EIT for imaging epileptiform activity in vivo and assessing its potential for clinical use. Chapter 1 is a review of existing functional neuroimaging modalities, the principles of EIT and previous studies that have used EIT for imaging epileptic events. In Chapter 2, the safety of continuous current application to the rat cortical surface at 10-100 μA and 1725 Hz, parameters that are representative of fast neural EIT protocols, was verified by histological evaluation. Chapter 3 details the development of two acute rat models of focal epilepsy, the cortical and hippocampal epileptic afterdischarges models, for assessing the feasibility of imaging epileptiform activity with fast neural EIT using epicortical electrode arrays. In Chapter 4, EIT was used to image the propagation of ictal spike-and-wave activity through the cerebral cortex at a resolution of 2 ms and ≤300 µm. In order to enable imaging of epileptiform discharges in deeper subcortical structures, the optimal carrier frequency for current application was determined in Chapter 5. Results demonstrated that the maximal signal-to-noise ratio of fast neural impedance changes during ictal discharges is obtained at 1355 Hz. Finally, in Chapter 6, epileptiform activity in the hippocampus was imaged, with a localisation accuracy of ≤400 µm, using epicortical impedance measurements obtained at this optimised carrier frequency.
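
    For readers unfamiliar with how EIT images are formed, the sketch below shows the standard linearised, Tikhonov-regularised reconstruction step in schematic form. It uses random stand-in data and an assumed regularisation heuristic, and is not the reconstruction pipeline used in the thesis; in practice the sensitivity (Jacobian) matrix comes from a finite-element forward model of the tissue and electrode geometry.

```python
# Generic linearised EIT reconstruction step (illustrative; random stand-in data).
# Boundary voltage changes dv are mapped to conductivity changes dsigma through a
# sensitivity (Jacobian) matrix J with zeroth-order Tikhonov regularisation.
import numpy as np

rng = np.random.default_rng(1)
n_measurements, n_elements = 120, 500                   # assumed measurement count and mesh size
J = rng.standard_normal((n_measurements, n_elements))   # stand-in for the FEM-derived Jacobian
dv = rng.standard_normal(n_measurements)                # stand-in for measured impedance changes

lam = 1e-2 * np.trace(J.T @ J) / n_elements             # simple heuristic for the regularisation weight
dsigma = np.linalg.solve(J.T @ J + lam * np.eye(n_elements), J.T @ dv)
```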