Recommended from our members
Data harmonisation for information fusion in digital healthcare: A state-of-the-art systematic review, meta-analysis and future research directions.
Removing the bias and variance of multicentre data has always been a challenge in large-scale digital healthcare studies, which requires the ability to integrate clinical features extracted from data acquired by different scanners and protocols to improve stability and robustness. Previous studies have described various computational approaches to fuse single-modality multicentre datasets. However, these surveys rarely focused on evaluation metrics and lacked a checklist for computational data harmonisation studies. In this systematic review, we summarise the computational data harmonisation approaches for multi-modality data in the digital healthcare field, including harmonisation strategies and evaluation metrics based on different theories. In addition, we propose a comprehensive checklist that summarises common practices for data harmonisation studies, to guide researchers in reporting their findings more effectively. Finally, we propose flowcharts presenting possible ways to select methodologies and metrics, and we survey the limitations of different methods to inform future research.
Advances in Latent Variable and Causal Models
This thesis considers three different areas of machine learning concerned with the modelling of data, extending theoretical understanding in each of them. First, the estimation of f-divergences is considered in a setting that is naturally satisfied in the context of autoencoders. By exploiting structural assumptions on the distributions of concern, the proposed estimator is shown to exhibit fast rates of concentration and bias-decay. In contrast, in much of the existing f-divergence estimation literature, fast rates are only obtainable under strong conditions that are difficult to verify in practice. Next, novel identifiability results are presented for nonlinear Independent Component Analysis (ICA) in a multi-view setting, extending the scarce literature of known identifiability results for nonlinear ICA. A result of particular note is that if one noiseless view of the sources is supplemented by a second view that is appropriately corrupted by source-level noise, the sources can be fully reconstructed from the observations up to tolerable ambiguities. This setting is applicable to areas such as neuroimaging, where multiple data modalities may be available. Finally, a framework is introduced to evaluate when two causal models are consistent with one another, meaning that a correspondence can be established between them such that reasoning about the effects of interventions in both models agrees. This can be used to understand when two models of the same system at different levels of detail are consistent, and has application to the problem of causal variable definition. This work has broad implications for the causal modelling process in general, as there is often a mismatch between the level at which measurements are made and the level at which the underlying ‘true’ causal structure exists, yet causal inference algorithms generally seek to discover causal structure at the level of measurements.
Large-scale and Deep Spatiotemporal Point-Process Models
Many accurate spatiotemporal data sets have recently become available for research. Real-world applications create strong demand for better multivariate point-process modeling. In this thesis, we develop new multivariate models with generalization ability and scalability. The first two chapters provide a research background, real-world problems and a mathematical introduction to point-process models. In chapter 3, we develop a nonparametric method for multivariate spatiotemporal Hawkes processes with applications to network reconstruction. In contrast to prior work, which has often focused exclusively on temporal information, our approach uses spatiotemporal information and does not assume a specific parametric form. Our results demonstrate that, in comparison to using only temporal data, our approach yields improved network reconstruction, providing a basis for meaningful subsequent analysis---such as examinations of community structure and motifs---of the reconstructed networks. In chapter 4, we present a fast and accurate estimation method for multivariate Hawkes processes. Our method, with guaranteed consistency, combines two estimation approaches. Extensive numerical experiments, with synthetic data and real-world social network data, show that our method improves the accuracy, scalability and computational efficiency of prevailing estimation approaches. Moreover, it greatly boosts the performance of Hawkes process-based models on social network reconstruction and helps to understand the spatiotemporal triggering dynamics over social media. In chapter 5, we focus on multivariate spatial point processes, which can describe heterotopic data over space. However, highly multivariate intensities are computationally challenging due to the curse of dimensionality. To bridge this gap, we introduce a declustering-based hidden-variable model that leads to efficient inference via a variational autoencoder (VAE).
We also prove that this model is a generalization of the VAE-based model for collaborative filtering. This leads to an interesting application of spatial point-process models to recommender systems. Experimental results show the method's utility on both synthetic data and real-world data. Finally, in chapter 6, we show how multivariate point processes can be applied to opioid overdose events and real-time prediction of the hourly crime rate. In chapter 7, we discuss future directions and conclude the thesis.
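As a rough illustration of the kind of model the abstract refers to, the sketch below evaluates the conditional intensity of a multivariate Hawkes process. It is a simplified, purely temporal, parametric version: the exponential kernel, the function name `hawkes_intensity`, and the parameters `mu`, `A`, and `beta` are assumptions made for illustration, and it is not the nonparametric spatiotemporal method developed in the thesis. The excitation matrix `A` plays the role of the network that reconstruction methods try to recover.

```python
import math

def hawkes_intensity(t, events, mu, A, beta=1.0):
    """Intensity of each node at time t for a multivariate Hawkes process.

    Assumed (illustrative) model with an exponential triggering kernel:
        lambda_u(t) = mu[u] + sum over past events (s, v) of
                      A[u][v] * beta * exp(-beta * (t - s))
    events: list of (time, node) pairs.
    mu:     baseline rate per node.
    A:      A[u][v] is the expected number of events on node u
            triggered by a single event on node v.
    """
    rates = list(mu)
    for s, v in events:
        if s < t:  # only strictly earlier events contribute
            decay = beta * math.exp(-beta * (t - s))
            for u in range(len(mu)):
                rates[u] += A[u][v] * decay
    return rates

# Hypothetical two-node example: node 0 excites node 1 and vice versa.
mu = [0.2, 0.1]
A = [[0.0, 0.5],
     [0.3, 0.0]]
events = [(0.0, 0)]  # one event on node 0 at time 0
rates = hawkes_intensity(1.0, events, mu, A, beta=1.0)
```

In this example only node 1's intensity rises above its baseline, because the single past event occurred on node 0 and `A[0][0]` is zero.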
Uncertainty Quantification in Machine Learning for Engineering Design and Health Prognostics: A Tutorial
On top of machine learning models, uncertainty quantification (UQ) functions
as an essential layer of safety assurance that could lead to more principled
decision making by enabling sound risk assessment and management. The safety
and reliability improvement of ML models empowered by UQ has the potential to
significantly facilitate the broad adoption of ML solutions in high-stakes
decision settings, such as healthcare, manufacturing, and aviation, to name a
few. In this tutorial, we aim to provide a holistic lens on emerging UQ methods
for ML models with a particular focus on neural networks and the applications
of these UQ methods in tackling engineering design as well as prognostics and
health management problems. Toward this goal, we start with a comprehensive
classification of uncertainty types, sources, and causes pertaining to UQ of ML
models. Next, we provide a tutorial-style description of several
state-of-the-art UQ methods: Gaussian process regression, Bayesian neural
networks, neural network ensembles, and deterministic UQ methods focusing on
the spectral-normalized neural Gaussian process. Building on the mathematical
formulations, we then assess the soundness of these UQ methods
quantitatively and qualitatively (using a toy regression example) to identify
their strengths and shortcomings along different dimensions. Then, we review
quantitative metrics commonly used to assess the quality of predictive
uncertainty in classification and regression problems. Afterward, we discuss
the increasingly important role of UQ of ML models in solving challenging
problems in engineering design and health prognostics. Two case studies with
source code available on GitHub are used to demonstrate these UQ methods and
compare their performance in the early-stage life prediction of lithium-ion
batteries and the remaining useful life prediction of turbofan engines.
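To make the ensemble idea mentioned in the abstract concrete, here is a minimal sketch of ensemble-based uncertainty on a toy regression problem. It is not the tutorial's code: it swaps neural networks for simple least-squares lines, and it creates diversity by bootstrap resampling, which is an assumption made here to keep the example short. The names `fit_line` and `ensemble_predict` and the synthetic data are hypothetical. The spread of the members' predictions serves as the uncertainty estimate.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def ensemble_predict(xs, ys, x_new, n_members=50, seed=0):
    """Return (mean, standard deviation) of the ensemble prediction at x_new.

    Each member is fitted on a bootstrap resample of the training data.
    """
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(slope * x_new + intercept)
    return statistics.fmean(preds), statistics.stdev(preds)

# Synthetic toy data: y = 2x + 1 plus Gaussian noise, observed on [0, 2].
data_rng = random.Random(1)
xs = [i / 10 for i in range(21)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.2) for x in xs]

mean_in, std_in = ensemble_predict(xs, ys, 1.0)     # inside the data range
mean_out, std_out = ensemble_predict(xs, ys, 10.0)  # far outside it
```

The members agree closely inside the training range and diverge when extrapolating, so `std_out` is larger than `std_in`. That growth in spread away from the data is the behaviour an ensemble-based uncertainty estimate is meant to capture.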
- …