4,874 research outputs found

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.

Directory of Open Access Journals

eScholarship - University of California

Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values

Author: Marko Nicholas
Razzaghi Talayeh
Roderick Oleg
Safro Ilya
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 07/04/2016

This work is motivated by the needs of predictive analytics on healthcare data as represented by Electronic Medical Records. Such data is invariably problematic: noisy, with missing entries and with imbalance in the classes of interest, leading to serious bias in predictive modeling. Since standard data mining methods often produce poor performance measures, we argue for the development of specialized techniques for data preprocessing and classification. In this paper, we propose a new method to simultaneously classify large datasets and reduce the effects of missing values. It is based on a multilevel framework of the cost-sensitive SVM and the expectation-maximization imputation method for missing values, which relies on iterated regression analyses. We compare classification results of multilevel SVM-based algorithms on public benchmark datasets with imbalanced classes and missing values as well as real data in health applications, and show that our multilevel SVM-based method produces fast, more accurate, and more robust classification results. Comment: arXiv admin note: substantial text overlap with arXiv:1503.0625

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

Spatial and temporal epidemiological analysis in the Big Data era

Author: Dirk U. Pfeiffer
Kim B. Stevens
Publication venue: 'Elsevier BV'
Publication date: 01/11/2015

DPVis: Visual Analytics with Hidden Markov Models for Disease Progression Pathways

Author: Anand Vibha
Frohnert Brigitte I
Ghosh Soumya
Kwon Bum Chul
Lundgren Markus
Ng Kenney
Severson Kristen A
Sun Zhaonan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/04/2020

Clinical researchers use disease progression models to understand patient status and characterize progression patterns from longitudinal health records. One approach for disease progression modeling is to describe patient status using a small number of states that represent distinctive distributions over a set of observed measures. Hidden Markov models (HMMs) and their variants are a class of models that both discover these states and make inferences of health states for patients. Despite the advantages of using these algorithms for discovering interesting patterns, it remains challenging for medical experts to interpret model outputs, understand complex modeling parameters, and make clinical sense of the patterns. To tackle these problems, we conducted a design study with clinical scientists, statisticians, and visualization experts, with the goal of investigating disease progression pathways of chronic diseases, namely type 1 diabetes (T1D), Huntington's disease, Parkinson's disease, and chronic obstructive pulmonary disease (COPD). As a result, we introduce DPVis, which seamlessly integrates model parameters and outcomes of HMMs into interpretable and interactive visualizations. In this study, we demonstrate that DPVis is successful in evaluating disease progression models, visually summarizing disease states, interactively exploring disease progression patterns, and building, analyzing, and comparing clinically relevant patient subgroups. Comment: to appear at IEEE Transactions on Visualization and Computer Graphics

arXiv.org e-Print Archive

Lund University Publications