281 research outputs found

    Discovering Causal Relations and Equations from Data

    Physics is a field of science that has traditionally used the scientific method to answer questions about why natural phenomena occur and to make testable models that explain the phenomena. Discovering equations, laws and principles that are invariant, robust and causal explanations of the world has been fundamental in physical sciences throughout the centuries. Discoveries emerge from observing the world and, when possible, performing interventional studies in the system under study. With the advent of big data and the use of data-driven methods, causal and equation discovery fields have grown and made progress in computer science, physics, statistics, philosophy, and many applied fields. All these domains are intertwined and can be used to discover causal relations, physical laws, and equations from observational data. This paper reviews the concepts, methods, and relevant works on causal and equation discovery in the broad field of Physics and outlines the most important challenges and promising future lines of research. We also provide a taxonomy for observational causal and equation discovery, point out connections, and showcase a complete set of case studies in Earth and climate sciences, fluid dynamics and mechanics, and the neurosciences. This review demonstrates that discovering fundamental laws and causal relations by observing natural phenomena is being revolutionised with the efficient exploitation of observational data, modern machine learning algorithms and the interaction with domain knowledge. Exciting times are ahead with many challenges and opportunities to improve our understanding of complex systems. Comment: 137 pages

    Simulation Intelligence: Towards a New Generation of Scientific Methods

    The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simulation, and artificial intelligence. We call this merger simulation intelligence (SI), for short. We argue the motifs of simulation intelligence are interconnected and interdependent, much like the components within the layers of an operating system. Using this metaphor, we explore the nature of each layer of the simulation intelligence operating system stack (SI-stack) and the motifs therein: (1) Multi-physics and multi-scale modeling; (2) Surrogate modeling and emulation; (3) Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based modeling; (6) Probabilistic programming; (7) Differentiable programming; (8) Open-ended optimization; (9) Machine programming. We believe coordinated efforts between motifs offer immense opportunity to accelerate scientific discovery, from solving inverse problems in synthetic biology and climate science, to directing nuclear energy experiments and predicting emergent behavior in socioeconomic settings. We elaborate on each layer of the SI-stack, detailing the state-of-the-art methods, presenting examples to highlight challenges and opportunities, and advocating for specific ways to advance the motifs and the synergies from their combinations. Advancing and integrating these technologies can enable a robust and efficient hypothesis-simulation-analysis type of scientific method, which we introduce with several use cases for human-machine teaming and automated science.

    Some extensions to reliability modeling and optimization of networked systems

    Ph.D. (Doctor of Philosophy)

    Discovering causal relations and equations from data

    Physics is a field of science that has traditionally used the scientific method to answer questions about why natural phenomena occur and to make testable models that explain the phenomena. Discovering equations, laws, and principles that are invariant, robust, and causal has been fundamental in physical sciences throughout the centuries. Discoveries emerge from observing the world and, when possible, performing interventions on the system under study. With the advent of big data and data-driven methods, the fields of causal and equation discovery have developed and accelerated progress in computer science, physics, statistics, philosophy, and many applied fields. This paper reviews the concepts, methods, and relevant works on causal and equation discovery in the broad field of physics and outlines the most important challenges and promising future lines of research. We also provide a taxonomy for data-driven causal and equation discovery, point out connections, and showcase comprehensive case studies in Earth and climate sciences, fluid dynamics and mechanics, and the neurosciences. This review demonstrates that discovering fundamental laws and causal relations by observing natural phenomena is revolutionised with the efficient exploitation of observational data and simulations, modern machine learning algorithms and the combination with domain knowledge. Exciting times are ahead with many challenges and opportunities to improve our understanding of complex systems.

    A Learning Health System for Radiation Oncology

    The proposed research aims to address the challenges faced by clinical data science researchers in radiation oncology in accessing, integrating, and analyzing heterogeneous data from various sources. The research presents a scalable intelligent infrastructure, called the Health Information Gateway and Exchange (HINGE), which captures and structures data from multiple sources into a knowledge base with semantically interlinked entities. This infrastructure enables researchers to mine novel associations and gather relevant knowledge for personalized clinical outcomes. The dissertation discusses the design framework and implementation of HINGE, which abstracts structured data from treatment planning systems, treatment management systems, and electronic health records. It utilizes disease-specific smart templates for capturing clinical information in a discrete manner. HINGE performs data extraction, aggregation, and quality and outcome assessment functions automatically, connecting seamlessly with local IT/medical infrastructure. Furthermore, the research presents a knowledge graph-based approach to map radiotherapy data to an ontology-based data repository using FAIR (Findable, Accessible, Interoperable, Reusable) concepts. This approach ensures that the data is easily discoverable and accessible for clinical decision support systems. The dissertation explores the ETL (Extract, Transform, Load) process, data model frameworks, and ontologies, and provides a real-world clinical use case for this data mapping. To improve the efficiency of retrieving information from large clinical datasets, a search engine based on ontology-based keyword searching and a synonym-based term matching tool was developed. The hierarchical nature of ontologies is leveraged to retrieve patient records based on parent and children classes.
    Additionally, patient similarity analysis is conducted using vector embedding models (Word2Vec, Doc2Vec, GloVe, and FastText) to identify similar patients based on text corpus creation methods. Results from the analysis using these models are presented. The implementation of a learning health system for predicting radiation pneumonitis following stereotactic body radiotherapy is also discussed. 3D convolutional neural networks (CNNs) are utilized with radiographic and dosimetric datasets to predict the likelihood of radiation pneumonitis. DenseNet-121 and ResNet-50 models are employed for this study, along with integrated gradient techniques to identify salient regions within the input 3D image dataset. The predictive performance of the 3D CNN models is evaluated based on clinical outcomes. Overall, the proposed Learning Health System provides a comprehensive solution for capturing, integrating, and analyzing heterogeneous data in a knowledge base. It offers researchers the ability to extract valuable insights and associations from diverse sources, ultimately leading to improved clinical outcomes. This work can serve as a model for implementing LHS in other medical specialties, advancing personalized and data-driven medicine.

    Connected Attribute Filtering Based on Contour Smoothness

    Conventional and Neural Architectures for Biometric Presentation Attack Detection

    Facial biometrics, which enable an efficient and reliable method of person recognition, have been growing continuously as an active sub-area of computer vision. Automatic face recognition offers a natural and non-intrusive method for recognising users from their facial characteristics. However, facial recognition systems are vulnerable to presentation attacks (or spoofing attacks) when an attacker attempts to hide their true identity and masquerades as a valid user by misleading the biometric system. Thus, Facial Presentation Attack Detection (Facial PAD) (or facial antispoofing) techniques, which aim to protect face recognition systems from such attacks, have been attracting more research attention in recent years. Various systems and algorithms have been proposed and evaluated. This thesis explores and compares some novel directions for detecting facial presentation attacks, including traditional features as well as approaches based on deep learning. In particular, different features encapsulating temporal information are developed and explored for describing the dynamic characteristics in presentation attacks. Hand-crafted features, deep neural architectures and their possible extensions are explored for their application in PAD. The proposed novel traditional features address the problem of modelling distinct representations of presentation attacks in the temporal domain and consider two possible branches: behaviour-level and texture-level temporal information. The behaviour-level feature is developed from a symbolic system that was widely used in psychological studies and automated emotion analysis. Other proposed traditional features aim to capture the distinct differences in image quality, shadings and skin reflections by using dynamic texture descriptors. This thesis then explores deep learning approaches using different pre-trained neural architectures with the aim of improving detection performance.
    In doing so, this thesis also explores visualisations of the internal representation of the networks to inform the further development of such approaches for improving performance and to suggest possible new directions for future research. These directions include an interpretability capability of deep learning approaches for PAD and a fully automatic system design capability in which the network architecture and parameters are determined by the available data. The interpretability capability can produce justifications for PAD decisions in both natural language and saliency map formats. Such systems can lead to further performance improvement through the use of an attention sub-network that learns from the justifications. Designing optimum deep neural architectures for PAD is still a complex problem that requires substantial effort from human experts. For this reason, the necessity of producing a system that can automatically design the neural architecture for a particular task is clear. A gradient-based neural architecture search algorithm is explored and extended through the development of different optimisation functions for designing the neural architectures for PAD automatically. These possible extensions of the deep learning approaches for PAD were evaluated using challenging benchmark datasets and the potential of the proposed approaches was demonstrated by comparing with the state-of-the-art techniques and published results. The proposed methods were evaluated and analysed using publicly available datasets. Results from the experiments demonstrate the usefulness of temporal information and the potential benefits of applying deep learning techniques for presentation attack detection. In particular, the use of explanations for improving usability and performance of deep learning PAD techniques and automatic techniques for the design of PAD neural architectures show considerable promise for future development.