Search CORE

14,320 research outputs found

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Deep Extreme Multi-label Learning

Author: Wang Xiangfeng
Yan Junchi
Zha Hongyuan
Zhang Wenjie
Publication venue
Publication date: 08/06/2018
Field of study

Extreme multi-label learning (XML) or classification has been a practical and important problem since the boom of big data. The main challenge lies in the exponential label space which involves

2^L

possible label sets especially when the label dimension

L

is huge, e.g., in millions for Wikipedia labels. This paper is motivated to better explore the label space by originally establishing an explicit label graph. In the meanwhile, deep learning has been widely studied and used in various classification problems including multi-label classification, however it has not been properly introduced to XML, where the label space can be as large as in millions. In this paper, we propose a practical deep embedding method for extreme multi-label classification, which harvests the ideas of non-linear embedding and graph priors-based label space modeling simultaneously. Extensive experiments on public datasets for XML show that our method performs competitive against state-of-the-art result

arXiv.org e-Print Archive

Crossref

Challenges in Representation Learning: A report on three machine learning contests

Author: Athanasakis Dimitris
Bengio Yoshua
Bergstra James
Carrier Pierre Luc
Chuang Zhang
Courville Aaron
Cukierski Will
Erhan Dumitru
Feng Fangxiang
Goodfellow Ian J.
Grozea Cristian
Hamner Ben
Ionescu Radu
Lee Dong-Hyun
Li Ruifan
Milakov Maxim
Mirza Mehdi
Park John
Popescu Marius
Ramaiah Chetan
Romaszko Lukasz
Shawe-Taylor John
Tang Yichuan
Thaler David
Wang Xiaojie
Xie Jingjing
Xu Bing
Zhou Yingbo
Publication venue
Publication date: 01/01/2013
Field of study

The ICML 2013 Workshop on Challenges in Representation Learning focused on three challenges: the black box learning challenge, the facial expression recognition challenge, and the multimodal learning challenge. We describe the datasets created for these challenges and summarize the results of the competitions. We provide suggestions for organizers of future challenges and some comments on what kind of knowledge can be gained from machine learning competitions.Comment: 8 pages, 2 figure

arXiv.org e-Print Archive

Crossref

Fraunhofer-ePrints