Machine learning in solar physics
The application of machine learning in solar physics has the potential to
greatly enhance our understanding of the complex processes that take place in
the atmosphere of the Sun. By using techniques such as deep learning, we are
now in the position to analyze large amounts of data from solar observations
and identify patterns and trends that may not have been apparent using
traditional methods. This can help us improve our understanding of explosive
events like solar flares, which can have a strong effect on the Earth's
environment. Predicting hazardous events on Earth is becoming crucial for our
technological society. Machine learning can also improve our understanding of
the inner workings of the Sun itself by allowing us to go deeper into the data
and to propose more complex models to explain them. Additionally, the use of
machine learning can help to automate the analysis of solar data, reducing the
need for manual labor and increasing the efficiency of research in this field.
Comment: 100 pages, 13 figures, 286 references, accepted for publication as a
Living Review in Solar Physics (LRSP
An investigation of entorhinal spatial representations in self-localisation behaviours
Spatial-modulated cells of the medial entorhinal cortex (MEC) and neighbouring cortices are thought to provide the neural substrate for self-localisation behaviours. These cells include grid cells of the MEC which are thought to compute path integration operations to update self-location estimates. In order to read this grid code, downstream cells are thought to reconstruct a positional estimate as a simple rate-coded representation of space.
Here, I show the coding scheme of grid cells and putative readout cells recorded from mice performing a virtual reality (VR) linear location task which engaged mice in both beaconing and path integration behaviours. I found grid cells can adopt two distinct coding schemes on the linear track, namely a position code which reflects periodic grid fields anchored to salient features of the track and a distance code which reflects periodic grid fields without this anchoring. Grid cells were found to switch between these coding schemes within sessions. When grid cells were encoding position, mice performed better on trials that required path integration but not on trials that required beaconing. This result provides the first mechanistic evidence linking grid cell activity to path integration-dependent behaviour.
Putative readout cells were found in the form of ramp cells which fire proportionally as a function of location in defined regions of the linear track. This ramping activity was found to be primarily explained by track position rather than other kinematic variables like speed and acceleration. These representations were found to be maintained across both trial types and outcomes indicating they likely result from recall of the track structure.
Together, these results support the functional importance of grid and ramp cells for self-localisation behaviours. Future investigations will look into the coherence between these two neural populations, which may together form a complete neural system for coding and decoding self-location in the brain
Modular lifelong machine learning
Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand.
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer
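The multi-fidelity idea mentioned in the abstract above can be illustrated with successive halving, a well-known multi-fidelity scheme. This is a minimal sketch and not the thesis's HPO method: the function names, the `eta` parameter, and the toy objective are all assumptions made for illustration, and no model-based component is included.

```python
import random


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Return the configuration that survives repeated low-to-high fidelity rounds.

    configs  -- candidate configurations (here: plain floats)
    evaluate -- hypothetical callable (config, budget) -> loss, lower is better
    eta      -- keep the best 1/eta of the candidates and multiply the budget by eta
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving configuration at the current (cheap) fidelity.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Keep only the best fraction, then spend more budget on what is left.
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


def toy_loss(learning_rate, budget):
    """Toy objective (an assumption): best at 0.1, and improves with budget."""
    return (learning_rate - 0.1) ** 2 + 1.0 / budget


rng = random.Random(0)
candidates = [rng.uniform(0.0, 1.0) for _ in range(16)]
best = successive_halving(candidates, toy_loss)
print("selected learning rate:", best)
```

In a modular LML setting, each candidate could instead stand for a combination of pre-trained and new modules, with the budget being the number of training steps spent evaluating it; that mapping is an interpretation, not something the abstract specifies.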
Protecting the Future: Neonatal Seizure Detection with Spatial-Temporal Modeling
A timely detection of seizures for newborn infants with electroencephalogram
(EEG) has been a common yet life-saving practice in the Neonatal Intensive Care
Unit (NICU). However, it requires great human efforts for real-time monitoring,
which calls for automated solutions to neonatal seizure detection. Moreover,
the current automated methods focusing on adult epilepsy monitoring often fail
due to (i) dynamic seizure onset location in human brains; (ii) different
montages on neonates and (iii) huge distribution shift among different
subjects. In this paper, we propose a deep learning framework, namely STATENet,
to address the exclusive challenges with exquisite designs at the temporal,
spatial and model levels. The experiments over the real-world large-scale
neonatal EEG dataset illustrate that our framework achieves significantly
better seizure detection performance.
Comment: Accepted in IEEE International Conference on Systems, Man, and
Cybernetics (SMC) 202
Improving diagnostic procedures for epilepsy through automated recording and analysis of patients’ history
Transient loss of consciousness (TLOC) is a time-limited state of profound cognitive impairment characterised by amnesia, abnormal motor control, loss of responsiveness, a short duration and complete recovery. Most instances of TLOC are caused by one of three health conditions: epilepsy, functional (dissociative) seizures (FDS), or syncope. There is often a delay before the correct diagnosis is made and 10-20% of individuals initially receive an incorrect diagnosis. Clinical decision tools based on the endorsement of TLOC symptom lists have been limited to distinguishing between two causes of TLOC. The Initial Paroxysmal Event Profile (iPEP) has shown promise but was demonstrated to have greater accuracy in distinguishing between syncope and epilepsy or FDS than between epilepsy and FDS. The objective of this thesis was to investigate whether interactional, linguistic, and communicative differences in how people with epilepsy and people with FDS describe their experiences of TLOC can improve the predictive performance of the iPEP. An online web application was designed that collected information about TLOC symptoms and medical history from patients and witnesses using a binary questionnaire and verbal interaction with a virtual agent (VA). We explored potential methods of automatically detecting these communicative differences, whether the differences were present during an interaction with a VA, to what extent these automatically detectable communicative differences improve the performance of the iPEP, and the acceptability of the application from the perspective of patients and witnesses. The two feature sets that were applied to previous doctor-patient interactions, features designed to measure formulation effort or detect semantic differences between the two groups, were able to predict the diagnosis with an accuracy of 71% and 81%, respectively.
Individuals with epilepsy or FDS provided descriptions of TLOC to the VA that were qualitatively similar to those observed in previous research. Both feature sets were effective predictors of the diagnosis when applied to the web application recordings (85.7% and 85.7%). Overall, the accuracy of machine learning models trained for the three-way classification between epilepsy, FDS, and syncope using the iPEP responses from patients that were collected through the web application was worse than the performance observed in previous research (65.8% vs 78.3%), but the performance was increased by the inclusion of features extracted from the spoken descriptions of TLOC (85.5%). Finally, most participants who provided feedback reported that the online application was acceptable. These findings suggest that it is feasible to differentiate between people with epilepsy and people with FDS using an automated analysis of spoken seizure descriptions. Furthermore, incorporating these features into a clinical decision tool for TLOC can improve the predictive performance by improving the differential diagnosis between these two health conditions. Future research should use the feedback to improve the design of the application and increase perceived acceptability of the approach
Beam scanning by liquid-crystal biasing in a modified SIW structure
A fixed-frequency beam-scanning 1D antenna based on Liquid Crystals (LCs) is designed for application in 2D scanning with lateral alignment. The 2D array environment imposes full decoupling of adjacent 1D antennas, which often conflicts with the LC requirement of DC biasing: the proposed design accommodates both. The LC medium is placed inside a Substrate Integrated Waveguide (SIW) modified to work as a Groove Gap Waveguide, with radiating slots etched on the upper broad wall, that radiates as a Leaky-Wave Antenna (LWA). This allows effective application of the DC bias voltage needed for tuning the LCs. At the same time, the RF field remains laterally confined, enabling the possibility to lay several antennas in parallel and achieve 2D beam scanning. The design is validated by simulation employing the actual properties of a commercial LC medium
Evaluation of different segmentation-based approaches for skin disorders from dermoscopic images
Final Degree Projects in Biomedical Engineering. Faculty of Medicine and Health Sciences. Universitat de Barcelona. Academic year: 2022-2023. Tutor/Director: Sala Llonch, Roser; Mata Miquel, Christian; Munuera, Josep
Skin disorders are the most common type of cancer in the world, and their incidence has been increasing over the past decades. Even with the most complex and advanced technologies, current image acquisition systems do not permit a reliable identification of the skin lesion by visual examination due to the challenging structure of the malignancy. This promotes the need for the implementation of automatic skin lesion segmentation methods in order to assist physicians’ diagnosis when determining the lesion's region and to serve as a preliminary step for the classification of the skin lesion. Accurate and precise segmentation is crucial for a rigorous screening and monitoring of the disease's progression.
To address this concern, the present project aims to provide a state-of-the-art review of the most prominent conventional segmentation models for skin lesion segmentation, along with a market analysis. With the rise of automatic segmentation tools, a large number of algorithms are currently in use, but many drawbacks arise when employing them for dermatological disorders due to the high level of artefacts present in the acquired images.
In light of the above, three segmentation techniques have been selected for the completion of the work: the level set method, an algorithm combining the GrabCut and k-means methods, and an automatic intensity-based algorithm developed by a research group at Hospital Sant Joan de Déu de Barcelona. In addition, their performance is validated with a view to further implementation in clinical training. The proposals, together with the results obtained, were produced using a publicly available skin lesion image database
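As a rough illustration of intensity-based segmentation and its validation, the sketch below clusters pixel intensities with a two-class k-means and scores the resulting mask with the Dice coefficient. It is not the project's pipeline: it omits GrabCut and the level set method entirely, the function names are hypothetical, and it assumes the lesion is darker than the surrounding skin, which does not hold for every dermoscopic image.

```python
import numpy as np


def kmeans_two_class(image, n_iter=20):
    """Two-class k-means on grey-level intensities; returns a boolean mask
    of the darker cluster (assumed here to be the lesion)."""
    pixels = image.astype(float).ravel()
    # Initialise the two centres at the intensity extremes.
    centres = np.array([pixels.min(), pixels.max()])
    for _ in range(n_iter):
        # Assign each pixel to its nearest centre.
        labels = np.abs(pixels[:, None] - centres[None, :]).argmin(axis=1)
        # Move each centre to the mean of its assigned pixels.
        for k in range(2):
            if np.any(labels == k):
                centres[k] = pixels[labels == k].mean()
    darker = centres.argmin()
    return (labels == darker).reshape(image.shape)


def dice(pred, truth):
    """Dice overlap between two binary masks (1.0 means identical)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom


# Synthetic example: a dark square "lesion" on a bright background.
image = np.full((8, 8), 200.0)
image[2:5, 2:5] = 50.0
reference = np.zeros((8, 8), dtype=bool)
reference[2:5, 2:5] = True
print("Dice score:", dice(kmeans_two_class(image), reference))
```

Real dermoscopic images contain hair, ink markings and vignetting, so a mask obtained this way would normally need artefact removal and post-processing before being compared with an expert annotation.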
Reinforcement learning in large state action spaces
Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem towards ensuring real world deployment of RL systems. However, several challenges limit the applicability of RL to large scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization and lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios.
This thesis is motivated by the goal of bridging the aforementioned gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we present the first results on several different problems, e.g. tensorization of the Bellman equation, which allows exponential sample efficiency gains (Chapter 4); provable suboptimality arising from structural constraints in MAS (Chapter 3); combinatorial generalization results in cooperative MAS (Chapter 5); generalization results on observation shifts (Chapter 7); and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we also shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory).
In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large scale, real world applications
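For readers unfamiliar with the Bellman equation mentioned in the abstract above, its standard optimality form for a discounted Markov decision process is given below. The notation (states $s$, actions $a$, reward $R$, transition kernel $P$, discount factor $\gamma$) is the usual textbook one and is an assumption here; the thesis's tensorised formulation is not reproduced.

```latex
Q^{*}(s, a) \;=\; R(s, a) \;+\; \gamma \sum_{s'} P(s' \mid s, a)\, \max_{a'} Q^{*}(s', a')
```

In a multi-agent system the action $a$ is a joint action, so the number of entries of $Q^{*}$ grows exponentially with the number of agents, which is the scaling difficulty the abstract refers to.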
Swarm Reinforcement Learning For Adaptive Mesh Refinement
The Finite Element Method, an important technique in engineering, is aided by
Adaptive Mesh Refinement (AMR), which dynamically refines mesh regions to allow
for a favorable trade-off between computational speed and simulation accuracy.
Classical methods for AMR depend on task-specific heuristics or expensive error
estimators, hindering their use for complex simulations. Recent learned AMR
methods tackle these problems, but so far scale only to simple toy examples. We
formulate AMR as a novel Adaptive Swarm Markov Decision Process in which a mesh
is modeled as a system of simple collaborating agents that may split into
multiple new agents. This framework allows for a spatial reward formulation
that simplifies the credit assignment problem, which we combine with Message
Passing Networks to propagate information between neighboring mesh elements. We
experimentally validate the effectiveness of our approach, Adaptive Swarm Mesh
Refinement (ASMR), showing that it learns reliable, scalable, and efficient
refinement strategies on a set of challenging problems. Our approach
significantly speeds up computation, achieving up to 30-fold improvement
compared to uniform refinements in complex simulations. Additionally, we
outperform learned baselines and achieve a refinement quality that is on par
with a traditional error-based AMR strategy without expensive oracle
information about the error signal.
Comment: Version 1 of this paper is a preliminary workshop version that was
accepted as a workshop paper in the ICLR 2023 Workshop on Physics for Machine
Learnin
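The classical, heuristic style of refinement that the abstract above contrasts with learned approaches can be sketched in one dimension as follows. This is not ASMR: there are no agents, rewards or message passing, and the midpoint interpolation error used as a local error indicator is an assumption chosen for brevity.

```python
import math


def refine(mesh, f, tol=1e-2, max_rounds=20):
    """Split every interval whose midpoint interpolation error exceeds tol.

    mesh -- sorted list of node positions
    f    -- function being approximated by piecewise-linear interpolation
    """
    for _ in range(max_rounds):
        new_mesh = [mesh[0]]
        changed = False
        for a, b in zip(mesh, mesh[1:]):
            mid = 0.5 * (a + b)
            # Local error indicator: gap between f and its linear interpolant.
            error = abs(f(mid) - 0.5 * (f(a) + f(b)))
            if error > tol:
                new_mesh.append(mid)  # the element splits into two children
                changed = True
            new_mesh.append(b)
        mesh = new_mesh
        if not changed:
            break
    return mesh


def steep_front(x):
    """Test function with a sharp transition around x = 0.5."""
    return math.tanh(20.0 * (x - 0.5))


coarse = [0.0, 0.25, 0.5, 0.75, 1.0]
fine = refine(coarse, steep_front)
print(len(coarse), "nodes refined to", len(fine))
```

Nodes cluster around the steep front while the flat regions keep their coarse elements, which is the speed versus accuracy trade-off that AMR targets.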