Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
Statistical Workflow for Feature Selection in Human Metabolomics Data.
High-throughput metabolomics investigations, when conducted in large human cohorts, represent a potentially powerful tool for elucidating the biochemical diversity underlying human health and disease. Large-scale metabolomics data sources, generated using either targeted or nontargeted platforms, are becoming more common. Appropriate statistical analysis of these complex high-dimensional data will be critical for extracting meaningful results from such large-scale human metabolomics studies. Therefore, we consider the statistical analytical approaches that have been employed in prior human metabolomics studies. Based on the lessons learned and collective experience to date in the field, we offer a step-by-step framework for pursuing statistical analyses of cohort-based human metabolomics data, with a focus on feature selection. We discuss the range of options and approaches that may be employed at each stage of data management, analysis, and interpretation and offer guidance on the analytical decisions that need to be considered over the course of implementing a data analysis workflow. Certain pervasive analytical challenges facing the field warrant ongoing focused research. Addressing these challenges, particularly those related to analyzing human metabolomics data, will allow for more standardization of, as well as advances in, how research in the field is practiced. In turn, such major analytical advances will lead to substantial improvements in the overall contributions of human metabolomics investigations.
Applying Deep Machine Learning for psycho-demographic profiling of Internet users using O.C.E.A.N. model of personality
In the modern era, each Internet user leaves enormous amounts of auxiliary
digital residuals (footprints) by using a variety of on-line services. All this
data has already been collected and stored for many years. In recent works, it was
demonstrated that it's possible to apply simple machine learning methods to
analyze collected digital footprints and to create psycho-demographic profiles
of individuals. However, while these works clearly demonstrated the
applicability of machine learning methods for such an analysis, the simple
prediction models they produced still lack the accuracy needed for
practical use. We hypothesized that using advanced deep machine learning
methods may considerably increase the accuracy of predictions. We started with
simple machine learning methods to estimate basic prediction performance and
moved further by applying advanced methods based on shallow and deep neural
networks. Then we compared the predictive power of the studied models and drew
conclusions about their performance. Finally, we proposed hypotheses on how prediction
accuracy can be further improved. As a result of this work, we provide the full
source code used in the experiments to all interested researchers and
practitioners in the corresponding GitHub repository. We believe that applying deep
machine learning for psycho-demographic profiling may have an enormous impact
on society (for better or worse) and provides a means for Artificial
Intelligence (AI) systems to better understand humans by creating their
psychological profiles. Thus AI agents may achieve the human-like ability to
participate in conversation (communication) flow by anticipating human
opponents' reactions, expectations, and behavior.