Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration both poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
Unsupervised landmark analysis for jump detection in molecular dynamics simulations
Molecular dynamics is a versatile and powerful method to study diffusion in
solid-state ionic conductors, requiring minimal prior knowledge of equilibrium
or transition states of the system's free energy surface. However, the analysis
of trajectories for relevant but rare events, such as a jump of the diffusing
mobile ion, is still rather cumbersome, requiring prior knowledge of the
diffusive process in order to get meaningful results. In this work, we present
a novel approach to detect the relevant events in a diffusive system without
assuming prior information regarding the underlying process. We start from a
projection of the atomic coordinates into a landmark basis to identify the
dominant features in a mobile ion's environment. Subsequent clustering in
landmark space enables a discretization of any trajectory into a sequence of
distinct states. As a final step, the use of the smooth overlap of atomic
positions descriptor allows distinguishing between different environments in a
straightforward way. We apply this algorithm to ten Li-ionic systems and
conduct in-depth analyses of cubic Li₇La₃Zr₂O₁₂, tetragonal
Li₁₀GeP₂S₁₂, and the β-eucryptite LiAlSiO₄. We
compare our results to existing methods, underscoring strong points,
weaknesses, and insights into the diffusive behavior of the ionic conduction in
the materials investigated.
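
The pipeline described in this abstract (projection onto a landmark basis, clustering in landmark space, discretization of the trajectory into states) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian proximity used as the landmark projection, the greedy cosine-similarity clustering, the `width` and `threshold` values, and the toy two-site trajectory are all assumptions chosen for illustration, and the final descriptor-based comparison of environments is omitted.

```python
import numpy as np


def landmark_vectors(positions, landmarks, width=1.0):
    """Project each frame's ion position onto a landmark basis.

    Component k is a Gaussian proximity to landmark k (an assumed,
    illustrative choice); each row is normalized to unit length.
    """
    # Pairwise distances, shape (n_frames, n_landmarks).
    d = np.linalg.norm(positions[:, None, :] - landmarks[None, :, :], axis=-1)
    v = np.exp(-((d / width) ** 2))
    return v / np.linalg.norm(v, axis=1, keepdims=True)


def leader_cluster(vectors, threshold=0.9):
    """Greedy clustering in landmark space.

    A frame joins the first cluster whose leader has cosine similarity
    >= threshold with it; otherwise it starts a new cluster.
    """
    leaders = []
    labels = np.empty(len(vectors), dtype=int)
    for i, v in enumerate(vectors):
        for k, leader in enumerate(leaders):
            if v @ leader >= threshold:  # rows are unit vectors
                labels[i] = k
                break
        else:
            leaders.append(v)
            labels[i] = len(leaders) - 1
    return labels


def detect_jumps(labels):
    """Frame indices at which the state differs from the previous frame."""
    return np.flatnonzero(np.diff(labels) != 0) + 1


# Hypothetical 2D toy system: two sites A and B surrounded by fixed landmarks.
landmarks = np.array([[0.0, 1.0], [0.0, -1.0], [2.0, 1.0], [2.0, -1.0], [1.0, 0.0]])
site_a, site_b = np.array([0.0, 0.0]), np.array([2.0, 0.0])

# Small deterministic displacements standing in for thermal vibration.
wiggle = 0.05 * np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1]])

# The ion sits at A for 5 frames, hops to B for 5 frames, then hops back to A.
trajectory = np.vstack([site_a + wiggle, site_b + wiggle, site_a + wiggle])

labels = leader_cluster(landmark_vectors(trajectory, landmarks))
jumps = detect_jumps(labels)

print(labels.tolist())  # state sequence, one label per frame
print(jumps.tolist())   # [5, 10]
```

No site positions are passed to the clustering step: the two states emerge from the landmark vectors alone, which mirrors the abstract's point about not assuming prior information on the diffusive process. In a real trajectory, frames recorded mid-hop would need additional handling that this sketch leaves out.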