Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
Deep Extreme Multi-label Learning
Extreme multi-label learning (XML) or classification has been a practical and
important problem since the boom of big data. The main challenge lies in the
exponential label space, which involves 2^L possible label sets, especially
when the label dimension L is huge, e.g., in the millions for Wikipedia labels.
This paper aims to better explore the label space by establishing an
explicit label graph. Meanwhile, deep learning has been
widely studied and used in various classification problems, including
multi-label classification; however, it has not been properly introduced to XML,
where the label space can number in the millions. In this paper, we propose
a practical deep embedding method for extreme multi-label classification, which
harvests the ideas of non-linear embedding and graph priors-based label space
modeling simultaneously. Extensive experiments on public datasets for XML show
that our method performs competitively against state-of-the-art results.
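As a rough illustration of the two ideas in this abstract, the sketch below counts the possible label sets for a given number of labels and builds a toy label graph. The abstract does not say how the label graph is constructed; label co-occurrence is assumed here as one plausible choice, and the function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def num_label_sets(num_labels: int) -> int:
    """Number of distinct label subsets over `num_labels` binary labels."""
    return 2 ** num_labels


def cooccurrence_graph(label_sets):
    """Build an undirected label graph weighted by co-occurrence counts.

    Assumption: an edge links two labels that appear on the same sample.
    This is an illustrative choice, not the construction used in the paper.
    """
    edges = Counter()
    for labels in label_sets:
        # Sort so each undirected edge has a single canonical key.
        for a, b in combinations(sorted(set(labels)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical toy data: each inner list is the label set of one sample.
toy_samples = [["cat", "pet"], ["dog", "pet"], ["cat", "pet", "animal"]]
graph = cooccurrence_graph(toy_samples)
```

Even 20 labels give over a million possible label sets, so methods typically model structure among labels (for example, through such a graph) instead of enumerating label sets.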
Challenges in Representation Learning: A report on three machine learning contests
The ICML 2013 Workshop on Challenges in Representation Learning focused on
three challenges: the black box learning challenge, the facial expression
recognition challenge, and the multimodal learning challenge. We describe the
datasets created for these challenges and summarize the results of the
competitions. We provide suggestions for organizers of future challenges and
some comments on what kind of knowledge can be gained from machine learning
competitions.
Comment: 8 pages, 2 figures