9,331 research outputs found
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues
A mathematical theory of semantic development in deep neural networks
An extensive body of empirical research has revealed remarkable regularities
in the acquisition, organization, deployment, and neural representation of
human semantic knowledge, thereby raising a fundamental conceptual question:
what are the theoretical principles governing the ability of neural networks to
acquire, organize, and deploy abstract knowledge by integrating across many
individual experiences? We address this question by mathematically analyzing
the nonlinear dynamics of learning in deep linear networks. We find exact
solutions to this learning dynamics that yield a conceptual explanation for the
prevalence of many disparate phenomena in semantic cognition, including the
hierarchical differentiation of concepts through rapid developmental
transitions, the ubiquity of semantic illusions between such transitions, the
emergence of item typicality and category coherence as factors controlling the
speed of semantic processing, changing patterns of inductive projection over
development, and the conservation of semantic similarity in neural
representations across species. Thus, surprisingly, our simple neural model
qualitatively recapitulates many diverse regularities underlying semantic
development, while providing analytic insight into how the statistical
structure of an environment can interact with nonlinear deep learning dynamics
to give rise to these regularities
Information visualization for DNA microarray data analysis: A critical review
Graphical representation may provide effective means of making sense of the complexity and sheer volume of data produced by DNA microarray experiments that monitor the expression patterns of thousands of genes simultaneously. The ability to use ldquoabstractrdquo graphical representation to draw attention to areas of interest, and more in-depth visualizations to answer focused questions, would enable biologists to move from a large amount of data to particular records they are interested in, and therefore, gain deeper insights in understanding the microarray experiment results. This paper starts by providing some background knowledge of microarray experiments, and then, explains how graphical representation can be applied in general to this problem domain, followed by exploring the role of visualization in gene expression data analysis. Having set the problem scene, the paper then examines various multivariate data visualization techniques that have been applied to microarray data analysis. These techniques are critically reviewed so that the strengths and weaknesses of each technique can be tabulated. Finally, several key problem areas as well as possible solutions to them are discussed as being a source for future work
- …