7,156 research outputs found

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Recommended from our members

DNA methylation-based classification of central nervous system tumours.

Author: Aronica Eleonora
Becker Albert
Benner Axel
Beschorner Rudi
Bewerunge-Hudler Melanie
Bjerkvig Rolf
Braczynski Anne K
Brehmer Stefanie
Brück Wolfgang
Calaminus Gabriele
Capper David
Chavez Lukas
Coras Roland
Cryan Jane
Deckert Martina
Dohmen Hildegard
Driever Pablo Hernáiz
Engel Nils W
Farrell Michael
Fischer Roger
Fleischhack Gudrun
Frank Stephan
Frühwald Michael C
Garvalov Boyan K
Geisenberger Christoph
Giangaspero Felice
Gnekow Astrid
Gottardo Nicholas G
Haberler Christine
Hans Volkmar
Hansford Jordan R
Harter Patrick N
Hench Jürgen
Heppner Frank
Hewer Ekkehard
Hofer Silvia
Hovestadt Volker
Huang Kristin
Hänggi Daniel
Hölsken Annett
Jones Chris
Jones David TW
Jouvet Anne
Kannan Kasthuri
Keohane Catherine
Ketter Ralf
Khatib Ziad
Koch Arend
Koelsche Christian
Kohlhof Patricia
Kramm Christof M
Kratz Annekathrin
Kristensen Bjarne W
Kulozik Andreas
Lechner Matt
Lindenberg Kerstin
Lohmann Dietmar
Lopes Beatriz
Mawrin Christian
Milde Till
Monoranu Camelia-Maria
Mueller Wolf
Mühleisen Helmut
Müller Hermann L
Olar Adriana
Pages Melanie
Pajtler Kristian W
Perry Arie
Plate Karl H
Pohl Ute
Preusser Matthias
Prinz Marco
Reuss David E
Rodriguez Fausto J
Rozsnoki Stephanie
Rushing Elisabeth
Rutkowski Stefan
Sahm Felix
Scheurlen Wolfram
Schick Matthias
Schittenhelm Jens
Schrimpf Daniel
Schweizer Leonille
Seiz-Rosenhagen Marcel
Selt Florian
Serrano Jonathan
Sill Martin
Staszewski Ori
Stichel Damian
Sturm Dominik
Temming Petra
Tippelt Stephan
Tsirigos Aristotelis
Varlet Pascale
von Hoff Katja
Wani Khalida
Wefers Annika K
Witt Hendrik
Witt Olaf
Zapatka Marc
Publication venue: eScholarship, University of California
Publication date: 01/03/2018

Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging, with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show that the availability of this method may have a substantial impact on diagnostic precision compared to standard methods, resulting in a change of diagnosis in up to 12% of prospective cases. For broader accessibility, we have designed a free online classifier tool, the use of which does not require any additional onsite data processing. Our results provide a blueprint for the generation of machine-learning-based tumour classifiers across other cancer entities, with the potential to fundamentally transform tumour pathology.

eScholarship - University of California

Pan-cancer classifications of tumor histological images using deep learning

Author: Caruana Dennis
Chuang Jeffrey H.
Farahmand Saman
Foroughi pour Ali
Namburi Sandeep
Noorbakhsh Javad
Rimm David
Soltanieh-Ha Mohammad
Zarringhalam Kourosh
Publication venue: Cold Spring Harbor Laboratory
Publication date: 04/03/2020

Histopathological images are essential for the diagnosis of cancer type and selection of optimal treatment. However, the current clinical process of manual inspection of images is time-consuming and prone to intra- and inter-observer variability. Here we show that key aspects of cancer image analysis can be performed by deep convolutional neural networks (CNNs) across a wide spectrum of cancer types. In particular, we implement CNN architectures based on Google Inception v3 transfer learning to analyze 27815 H&E slides from 23 cohorts in The Cancer Genome Atlas in studies of tumor/normal status, cancer subtype, and mutation status. For 19 solid cancer types we are able to classify tumor/normal status of whole slide images with extremely high AUCs (0.995±0.008). We are also able to classify cancer subtypes within 10 tissue types with AUC values well above random expectations (micro-average 0.87±0.1). We then perform a cross-classification analysis of tumor/normal status across tumor types. We find that classifiers trained on one type are often effective in distinguishing tumor from normal in other cancer types, with the relationships among classifiers matching known cancer tissue relationships. For the more challenging problem of mutational status, we are able to classify TP53 mutations in three cancer types with AUCs from 0.65 to 0.80 using a fully-trained CNN, and with similar cross-classification accuracy across tissues. These studies demonstrate the power of CNNs not only for classifying histopathological images in diverse cancer types, but also for revealing shared biology between tumors. We have made software available at: https://github.com/javadnoorb/HistCNN

First author draft.

Boston University Institutional Repository (OpenBU)

Class imbalance impact on the prediction of complications during home hospitalization: a comparative study.

Author: Calvo González Mireia
Cano Isaac
Hernández Carmen
Jané Campos Raimon
Miralles Felip
Ribas Vicent
Roca Josep
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Home hospitalization (HH) is presented as a healthcare alternative capable of providing high standards of care when patients no longer need hospital facilities. Although HH seems to lower healthcare costs by shortening hospital stays and to improve patients' quality of life, the lack of continuous observation at home may lead to complications in some patients. Since blood tests have been proven to provide relevant prognostic information in many diseases, this paper analyzes the impact of different sampling methods on the prediction of HH outcomes. In an initial exploratory analysis, some variables extracted from routine blood tests performed at HH admission, such as hemoglobin, lymphocytes or creatinine, showed statistically significant differences between patients with successful and unsuccessful HH stays. Predictive models were then built with these data in order to identify unsuccessful cases that eventually need hospital facilities. However, since these hospital admissions during HH programs are rare, identifying them through conventional machine-learning approaches is challenging. Thus, several sampling strategies designed to address class imbalance were reviewed and compared. Among the analyzed approaches, over-sampling strategies, such as ROSE (Random Over-Sampling Examples) and conventional random over-sampling, showed the best performance. Nevertheless, further improvements should be proposed in the future to better identify those patients not benefiting from HH.

Peer reviewed. Postprint (author's final draft).

Crossref

UPCommons. Portal del coneixement obert de la UPC
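
The conventional random over-sampling mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function name `random_oversample`, the toy feature values, and the class labels are all hypothetical, and the sketch covers only plain duplication of minority-class rows. ROSE is a different technique (it generates new synthetic examples via a smoothed bootstrap) and is not shown here.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement to fill the gap to the majority count.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Made-up toy data: (hemoglobin, creatinine) pairs, for illustration only.
rows = [(13.1, 0.9), (12.4, 1.0), (14.0, 0.8), (13.5, 1.1), (9.8, 1.9)]
labels = ["success", "success", "success", "success", "failure"]

bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))
```

In practice, over-sampling of this kind is usually applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.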

G2C: A Generator-to-Classifier Framework Integrating Multi-Stained Visual Cues for Pathological Glomerulus Classification

Author: Liu Zhihong
Sun Guangyu
Wu Bingzhe
Xie Lingxi
Zeng Caihong
Zhang Xiaolu
Zhao Shiwan
Publication date: 07/03/2019

Pathological glomerulus classification plays a key role in the diagnosis of nephropathy. As the differences between subcategories are subtle, doctors often refer to slides from different staining methods to make decisions. However, creating correspondence across various stains is labor-intensive, which makes it difficult to collect data and train a vision-based algorithm to assist nephropathy diagnosis. This paper provides an alternative solution for integrating multi-stained visual cues for glomerulus classification. Our approach, named generator-to-classifier (G2C), is a two-stage framework. Given an input image from a specified stain, several generators are first applied to estimate its appearance under other staining methods, and a classifier then combines visual cues from the different stains for prediction (whether it is pathological, or which type of pathology it has). We optimize these two stages jointly. To provide a reasonable initialization, we pre-train the generators on an unlabeled reference set under an unpaired image-to-image translation task, and then fine-tune them together with the classifier. We conduct experiments on a glomerulus type classification dataset that we collected ourselves (there are no publicly available datasets for this purpose). Although joint optimization slightly harms the authenticity of the generated patches, it boosts classification performance, suggesting that more effective visual cues are extracted automatically. We also transfer our model to a public dataset for breast cancer classification, and significantly outperform the state of the art.

Comment: Accepted by AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications