Multi-resolution Tensor Learning for Large-Scale Spatial Data
High-dimensional tensor models are notoriously computationally expensive to
train. We present a meta-learning algorithm, MMT, that can significantly speed
up the process for spatial tensor models. MMT leverages the property that
spatial data can be viewed at multiple resolutions, which are related by
coarsening and fine-graining from one resolution to another. Using this
property, MMT learns a tensor model by starting from a coarse resolution and
iteratively increasing the model complexity. In order to not "over-train" on
coarse resolution models, we investigate an information-theoretic fine-graining
criterion to decide when to transition into higher-resolution models. We
provide both theoretical and empirical evidence for the advantages of this
approach. When applied to two real-world large-scale spatial datasets for
basketball player and animal behavior modeling, our approach demonstrates three key
benefits: 1) it efficiently captures higher-order interactions (i.e., tensor
latent factors), 2) it is orders of magnitude faster than fixed resolution
learning and scales to very fine-grained spatial resolutions, and 3) it
reliably yields accurate and interpretable models.
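The coarse-to-fine idea described above can be illustrated with a minimal sketch. It is not the MMT implementation: it assumes a plain linear model over a 1-D spatial grid instead of a tensor model, and it swaps the information-theoretic fine-graining criterion for a simple loss-plateau test. The names `coarsen`, `finegrain`, and `train_multires`, and all hyperparameters, are hypothetical.

```python
import numpy as np


def coarsen(X, factor):
    """Sum groups of `factor` adjacent spatial cells (features on a coarser grid)."""
    n, d = X.shape
    return X.reshape(n, d // factor, factor).sum(axis=2)


def finegrain(w, factor):
    """Copy each coarse weight to the `factor` fine cells it covers."""
    return np.repeat(w, factor)


def train_multires(X, y, resolutions, lr=0.05, tol=1e-4, max_steps=500):
    """Gradient descent on y ~ X w, moving from coarse to fine resolutions.

    Assumption: the switch to the next resolution happens when the loss
    improvement drops below `tol` (a stand-in for the paper's criterion).
    """
    d_full = X.shape[1]
    w = np.zeros(resolutions[0])
    history = []
    for i, d in enumerate(resolutions):
        Xr = coarsen(X, d_full // d)
        prev = np.inf
        for step in range(max_steps):
            resid = Xr @ w - y
            loss = float(np.mean(resid ** 2))
            if prev - loss < tol:  # plateau: stop training at this resolution
                break
            prev = loss
            w = w - lr * (Xr.T @ resid) / len(y)
        history.append((d, step, loss))
        if i + 1 < len(resolutions):
            # Warm-start the finer model from the coarse solution.
            w = finegrain(w, resolutions[i + 1] // d)
    return w, history


# Synthetic demo: the true weights are piecewise constant over 16 cells.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
w_true = np.repeat([1.0, -1.0, 0.5, 2.0], 4)
y = X @ w_true + 0.1 * rng.normal(size=200)

w, history = train_multires(X, y, resolutions=[2, 4, 8, 16])
for d, steps, loss in history:
    print(f"resolution={d:2d} steps={steps:3d} loss={loss:.4f}")
```

Because coarsening sums cells and fine-graining repeats weights, the model's predictions are unchanged at each transition, so every finer stage starts from the loss the coarser stage ended with.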
Machine Learning and Integrative Analysis of Biomedical Big Data
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those already associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability.