281 research outputs found
Discovering Causal Relations and Equations from Data
Physics is a field of science that has traditionally used the scientific
method to answer questions about why natural phenomena occur and to make
testable models that explain the phenomena. Discovering equations, laws and
principles that are invariant, robust and causal explanations of the world has
been fundamental in physical sciences throughout the centuries. Discoveries
emerge from observing the world and, when possible, performing interventional
studies in the system under study. With the advent of big data and the use of
data-driven methods, causal and equation discovery fields have grown and made
progress in computer science, physics, statistics, philosophy, and many applied
fields. All these domains are intertwined and can be used to discover causal
relations, physical laws, and equations from observational data. This paper
reviews the concepts, methods, and relevant works on causal and equation
discovery in the broad field of Physics and outlines the most important
challenges and promising future lines of research. We also provide a taxonomy
for observational causal and equation discovery, point out connections, and
showcase a complete set of case studies in Earth and climate sciences, fluid
dynamics and mechanics, and the neurosciences. This review demonstrates that
discovering fundamental laws and causal relations by observing natural
phenomena is being revolutionised with the efficient exploitation of
observational data, modern machine learning algorithms and the interaction with
domain knowledge. Exciting times are ahead with many challenges and
opportunities to improve our understanding of complex systems.
Comment: 137 page
Simulation Intelligence: Towards a New Generation of Scientific Methods
The original "Seven Motifs" set forth a roadmap of essential methods for the
field of scientific computing, where a motif is an algorithmic method that
captures a pattern of computation and data movement. We present the "Nine
Motifs of Simulation Intelligence", a roadmap for the development and
integration of the essential algorithms necessary for a merger of scientific
computing, scientific simulation, and artificial intelligence. We call this
merger simulation intelligence (SI), for short. We argue the motifs of
simulation intelligence are interconnected and interdependent, much like the
components within the layers of an operating system. Using this metaphor, we
explore the nature of each layer of the simulation intelligence operating
system stack (SI-stack) and the motifs therein: (1) Multi-physics and
multi-scale modeling; (2) Surrogate modeling and emulation; (3)
Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based
modeling; (6) Probabilistic programming; (7) Differentiable programming; (8)
Open-ended optimization; (9) Machine programming. We believe coordinated
efforts between motifs offer immense opportunity to accelerate scientific
discovery, from solving inverse problems in synthetic biology and climate
science, to directing nuclear energy experiments and predicting emergent
behavior in socioeconomic settings. We elaborate on each layer of the SI-stack,
detailing the state-of-the-art methods, presenting examples to highlight challenges
and opportunities, and advocating for specific ways to advance the motifs and
the synergies from their combinations. Advancing and integrating these
technologies can enable a robust and efficient hypothesis-simulation-analysis
type of scientific method, which we introduce with several use-cases for
human-machine teaming and automated science.
Some extensions to reliability modeling and optimization of networked systems
Ph.D. DOCTOR OF PHILOSOPHY
Discovering causal relations and equations from data
Physics is a field of science that has traditionally used the scientific method to answer questions about why natural phenomena occur and to make testable models that explain the phenomena. Discovering equations, laws, and principles that are invariant, robust, and causal has been fundamental in physical sciences throughout the centuries. Discoveries emerge from observing the world and, when possible, performing interventions on the system under study. With the advent of big data and data-driven methods, the fields of causal and equation discovery have developed and accelerated progress in computer science, physics, statistics, philosophy, and many applied fields. This paper reviews the concepts, methods, and relevant works on causal and equation discovery in the broad field of physics and outlines the most important challenges and promising future lines of research. We also provide a taxonomy for data-driven causal and equation discovery, point out connections, and showcase comprehensive case studies in Earth and climate sciences, fluid dynamics and mechanics, and the neurosciences. This review demonstrates that discovering fundamental laws and causal relations by observing natural phenomena is being revolutionised by the efficient exploitation of observational data and simulations, modern machine learning algorithms, and their combination with domain knowledge. Exciting times are ahead with many challenges and opportunities to improve our understanding of complex systems.
A Learning Health System for Radiation Oncology
The proposed research aims to address the challenges faced by clinical data science researchers in radiation oncology accessing, integrating, and analyzing heterogeneous data from various sources. The research presents a scalable intelligent infrastructure, called the Health Information Gateway and Exchange (HINGE), which captures and structures data from multiple sources into a knowledge base with semantically interlinked entities. This infrastructure enables researchers to mine novel associations and gather relevant knowledge for personalized clinical outcomes.
The dissertation discusses the design framework and implementation of HINGE, which abstracts structured data from treatment planning systems, treatment management systems, and electronic health records. It utilizes disease-specific smart templates for capturing clinical information in a discrete manner. HINGE performs data extraction, aggregation, and quality and outcome assessment functions automatically, connecting seamlessly with local IT/medical infrastructure.
Furthermore, the research presents a knowledge graph-based approach to map radiotherapy data to an ontology-based data repository using FAIR (Findable, Accessible, Interoperable, Reusable) concepts. This approach ensures that the data is easily discoverable and accessible for clinical decision support systems. The dissertation explores the ETL (Extract, Transform, Load) process, data model frameworks, ontologies, and provides a real-world clinical use case for this data mapping.
To improve the efficiency of retrieving information from large clinical datasets, a search engine was developed that combines ontology-based keyword searching with a synonym-based term-matching tool. The hierarchical nature of ontologies is leveraged to retrieve patient records based on parent and child classes. Additionally, patient similarity analysis is conducted using vector embedding models (Word2Vec, Doc2Vec, GloVe, and FastText) to identify similar patients based on text corpus creation methods. Results from the analysis using these models are presented.
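As a rough illustration of the retrieval idea described above, the following minimal sketch resolves a query through a synonym table and then returns every record whose diagnosis class equals the query class or lies below it in the hierarchy. The ontology, synonym table, record format, and all function names are hypothetical and are not taken from the dissertation or from HINGE.

```python
from collections import defaultdict

# Hypothetical toy ontology (child class -> parent class); not from the dissertation.
PARENT_OF = {
    "lung_cancer": "thoracic_cancer",
    "nsclc": "lung_cancer",
    "sclc": "lung_cancer",
    "esophageal_cancer": "thoracic_cancer",
}

# Hypothetical synonym table mapping free-text terms to ontology classes.
SYNONYMS = {
    "non-small cell lung cancer": "nsclc",
    "small cell lung cancer": "sclc",
    "lung carcinoma": "lung_cancer",
}


def build_children(parent_of):
    """Invert the child -> parent map into parent -> set of children."""
    children = defaultdict(set)
    for child, parent in parent_of.items():
        children[parent].add(child)
    return children


def descendants(term, children):
    """Return the term itself plus every class below it in the hierarchy."""
    found = {term}
    stack = [term]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def search(query, records, parent_of=PARENT_OF, synonyms=SYNONYMS):
    """Resolve synonyms, expand to child classes, and return matching patient IDs."""
    key = query.lower().strip()
    term = synonyms.get(key, key)
    wanted = descendants(term, build_children(parent_of))
    return sorted(pid for pid, dx in records.items() if dx in wanted)


# Hypothetical patient records: patient ID -> diagnosis class.
RECORDS = {
    "p1": "nsclc",
    "p2": "sclc",
    "p3": "esophageal_cancer",
    "p4": "lung_cancer",
}

if __name__ == "__main__":
    print(search("lung carcinoma", RECORDS))
    print(search("thoracic_cancer", RECORDS))
```

Searching on a parent class here also returns patients coded with any of its child classes, which is the behaviour the abstract attributes to the ontology hierarchy; a real system would presumably query an ontology store rather than an in-memory dictionary.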
The implementation of a learning health system for predicting radiation pneumonitis following stereotactic body radiotherapy is also discussed. 3D convolutional neural networks (CNNs) are utilized with radiographic and dosimetric datasets to predict the likelihood of radiation pneumonitis. DenseNet-121 and ResNet-50 models are employed for this study, along with integrated gradient techniques to identify salient regions within the input 3D image dataset. The predictive performance of the 3D CNN models is evaluated based on clinical outcomes.
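To make the attribution step more concrete, here is a minimal sketch of integrated gradients. It is only an illustration under strong assumptions: a toy logistic model with an analytic gradient stands in for the 3D CNNs, and the weights, inputs, baseline, and function names are all hypothetical. The attribution for each input feature is the difference from the baseline multiplied by the average gradient along the straight path from baseline to input.

```python
import math


def model(x, w, bias):
    """Toy differentiable model (logistic score); a stand-in for a CNN output."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def model_grad(x, w, bias):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x, w, bias)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, bias, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        grad = model_grad(point, w, bias)
        total = [t + g for t, g in zip(total, grad)]
    # Scale the averaged gradient by the input-baseline difference.
    return [(xi - b0) * t / steps for xi, b0, t in zip(x, baseline, total)]


# Hypothetical example values.
W = [0.5, -0.25, 0.0]
BIAS = 0.1
X = [1.0, 2.0, -1.0]
BASELINE = [0.0, 0.0, 0.0]
ATTRIBUTIONS = integrated_gradients(X, BASELINE, W, BIAS)

if __name__ == "__main__":
    print(ATTRIBUTIONS)
    print(sum(ATTRIBUTIONS), model(X, W, BIAS) - model(BASELINE, W, BIAS))
```

A useful sanity check is the completeness property: the attributions should sum approximately to the difference between the model output at the input and at the baseline. For image models the same computation is applied per voxel, typically with automatic differentiation in place of the analytic gradient used here.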
Overall, the proposed Learning Health System provides a comprehensive solution for capturing, integrating, and analyzing heterogeneous data in a knowledge base. It offers researchers the ability to extract valuable insights and associations from diverse sources, ultimately leading to improved clinical outcomes. This work can serve as a model for implementing a learning health system in other medical specialties, advancing personalized and data-driven medicine.
Conventional and Neural Architectures for Biometric Presentation Attack Detection
Facial biometrics, which enable an efficient and reliable method of person recognition, have been growing continuously as an active sub-area of computer vision. Automatic face recognition offers a natural and non-intrusive method for recognising users from their facial characteristics. However, facial recognition systems are vulnerable to presentation attacks (or spoofing attacks), in which an attacker attempts to hide their true identity and masquerades as a valid user by misleading the biometric system. Thus, Facial Presentation Attack Detection (Facial PAD) (or facial antispoofing) techniques, which aim to protect face recognition systems from such attacks, have been attracting more research attention in recent years. Various systems and algorithms have been proposed and evaluated. This thesis explores and compares some novel directions for detecting facial presentation attacks, including traditional features as well as approaches based on deep learning. In particular, different features encapsulating temporal information are developed and explored for describing the dynamic characteristics in presentation attacks. Hand-crafted features, deep neural architectures and their possible extensions are explored for their application in PAD. The proposed novel traditional features address the problem of modelling distinct representations of presentation attacks in the temporal domain and consider two possible branches: behaviour-level and texture-level temporal information. The behaviour-level feature is developed from a symbolic system that was widely used in psychological studies and automated emotion analysis. Other proposed traditional features aim to capture the distinct differences in image quality, shadings and skin reflections by using dynamic texture descriptors. This thesis then explores deep learning approaches using different pre-trained neural architectures with the aim of improving detection performance.
In doing so, this thesis also explores visualisations of the internal representation of the networks to inform the further development of such approaches for improving performance and to suggest possible new directions for future research. These directions include an interpretable capability of deep learning approaches for PAD and a fully automatic system design capability in which the network architecture and parameters are determined by the available data. The interpretable capability can produce justifications for PAD decisions through both natural language and saliency map formats. Such systems can lead to further performance improvement through the use of an attention sub-network by learning from the justifications. Designing optimum deep neural architectures for PAD is still a complex problem that requires substantial effort from human experts. For this reason, the necessity of producing a system that can automatically design the neural architecture for a particular task is clear. A gradient-based neural architecture search algorithm is explored and extended through the development of different optimisation functions for designing the neural architectures for PAD automatically. These possible extensions of the deep learning approaches for PAD were evaluated using challenging benchmark datasets, and the potential of the proposed approaches was demonstrated by comparing with the state-of-the-art techniques and published results. The proposed methods were evaluated and analysed using publicly available datasets. Results from the experiments demonstrate the usefulness of temporal information and the potential benefits of applying deep learning techniques for presentation attack detection. In particular, the use of explanations for improving usability and performance of deep learning PAD techniques and automatic techniques for the design of PAD neural architectures show considerable promise for future development.