Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
A Bayesian Consistent Dual Ensemble Kalman Filter for State-Parameter Estimation in Subsurface Hydrology
Ensemble Kalman filtering (EnKF) is an efficient approach to addressing
uncertainties in subsurface groundwater models. The EnKF sequentially
integrates field data into simulation models to obtain a better
characterization of the model's state and parameters. These are generally
estimated following joint and dual filtering strategies, in which, at each
assimilation cycle, a forecast step by the model is followed by an update step
with incoming observations. The Joint-EnKF directly updates the augmented
state-parameter vector while the Dual-EnKF employs two separate filters, first
estimating the parameters and then estimating the state based on the updated
parameters. In this paper, we reverse the order of the forecast-update steps
following the one-step-ahead (OSA) smoothing formulation of the Bayesian
filtering problem, based on which we propose a new dual EnKF scheme, the
Dual-EnKF_OSA. Compared to the Dual-EnKF, this introduces a new update
step to the state in a fully consistent Bayesian framework, which is shown to
enhance the performance of the dual filtering approach without any significant
increase in the computational cost. Numerical experiments are conducted with a
two-dimensional synthetic groundwater aquifer model to assess the performance
and robustness of the proposed Dual-EnKF_OSA, and to evaluate its
results against those of the Joint- and Dual-EnKFs. The proposed scheme is able
to successfully recover both the hydraulic head and the aquifer conductivity,
further providing reliable estimates of their uncertainties. Compared with the
standard Joint- and Dual-EnKFs, the proposed scheme is found to be more robust to
different assimilation settings, such as the spatial and temporal distribution
of the observations, and the level of noise in the data. Based on our
experimental setups, it yields up to 25% more accurate state and parameter
estimates.
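
The forecast-then-update cycle of the standard Joint-EnKF described in this abstract can be illustrated with a minimal sketch. It is not the authors' code and does not implement the proposed OSA scheme. The scalar toy model (state multiplied by an unknown decay parameter), the function names `joint_enkf_cycle` and `run_demo`, and all numerical values are assumptions chosen only for illustration; the paper itself uses a two-dimensional groundwater model.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    # Unbiased sample covariance between two equally long ensembles.
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def joint_enkf_cycle(states, params, observation, obs_var, rng):
    """One assimilation cycle: forecast, then a joint update of the
    augmented (state, parameter) vector. The state is observed directly."""
    # Forecast step (assumed toy model): x_forecast = parameter * x_previous.
    forecast = [p * x for x, p in zip(states, params)]

    # Kalman gains estimated from ensemble covariances with the observed state.
    innovation_var = covariance(forecast, forecast) + obs_var
    gain_state = covariance(forecast, forecast) / innovation_var
    gain_param = covariance(params, forecast) / innovation_var

    # Update step with perturbed observations, one perturbation per member.
    new_states, new_params = [], []
    for xf, p in zip(forecast, params):
        perturbed = observation + rng.gauss(0.0, obs_var ** 0.5)
        innovation = perturbed - xf
        new_states.append(xf + gain_state * innovation)
        new_params.append(p + gain_param * innovation)
    return new_states, new_params


def run_demo(n_members=200, n_cycles=30, seed=0):
    # Hypothetical truth and prior settings, for illustration only.
    rng = random.Random(seed)
    true_param, true_state, obs_var = 0.9, 10.0, 0.01
    states = [rng.gauss(10.0, 1.0) for _ in range(n_members)]
    params = [rng.gauss(0.7, 0.2) for _ in range(n_members)]
    for _ in range(n_cycles):
        true_state = true_param * true_state
        observation = true_state + rng.gauss(0.0, obs_var ** 0.5)
        states, params = joint_enkf_cycle(
            states, params, observation, obs_var, rng
        )
    return mean(states), mean(params), true_state


if __name__ == "__main__":
    est_state, est_param, truth = run_demo()
    print("state estimate:", est_state, "true state:", truth)
    print("parameter estimate:", est_param)
```

A dual scheme would instead split the update into two filters, first correcting the parameters and then re-estimating the state from the updated parameters, as the abstract describes.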
Interpretable Subgroup Discovery in Treatment Effect Estimation with Application to Opioid Prescribing Guidelines
The dearth of prescribing guidelines for physicians is one key driver of the
current opioid epidemic in the United States. In this work, we analyze medical
and pharmaceutical claims data to draw insights on characteristics of patients
who are more prone to adverse outcomes after an initial synthetic opioid
prescription. Toward this end, we propose a generative model that allows
discovery from observational data of subgroups that demonstrate an enhanced or
diminished causal effect due to treatment. Our approach models these
sub-populations as a mixture distribution, using sparsity to enhance
interpretability, while jointly learning nonlinear predictors of the potential
outcomes to better adjust for confounding. The approach leads to
human-interpretable insights on discovered subgroups, improving the practical
utility for decision support.