31,216 research outputs found

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Consensus clustering approach to group brain connectivity matrices

Author: Angelini Leonardo
Cortes Jesus M.
Marinazzo Daniele
Pellicoro Mario
Rasero Javier
Stramaglia Sebastiano
Publication date: 01/01/2017

A novel approach rooted in the notion of consensus clustering, a strategy developed for community detection in complex networks, is proposed to cope with the heterogeneity that characterizes connectivity matrices in health and disease. The method can be summarized as follows: (i) define, for each node, a distance matrix for the set of subjects by comparing the connectivity pattern of that node in all pairs of subjects; (ii) cluster the distance matrix for each node; (iii) build the consensus network from the corresponding partitions; (iv) extract groups of subjects by finding the communities of the consensus network thus obtained. Unlike previous implementations of consensus clustering, we propose to use the consensus strategy to combine the information arising from the connectivity patterns of each node. The proposed approach may be seen either as an exploratory technique or as an unsupervised pre-training step to help the subsequent construction of a supervised classifier. Applications to a toy model and two real data sets show the effectiveness of the proposed methodology, which represents the heterogeneity of a set of subjects in terms of a weighted network, the consensus matrix.
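
Steps (i) to (iv) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the abstract does not name the distance, the clustering algorithm, or the community-detection method, so the Pearson-based distance, the small k-medoids routine, and the thresholded connected components used for step (iv) are all assumptions, as are the function names and the toy data.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def assign(dist, medoids):
    """Label every item with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[j][medoids[c]])
            for j in range(len(dist))]


def k_medoids(dist, k, iters=20):
    """Small k-medoids on a precomputed distance matrix (assumed clusterer)."""
    n = len(dist)
    medoids = [0]  # farthest-first initialisation keeps the sketch deterministic
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda j: min(dist[j][m] for m in medoids)))
    for _ in range(iters):
        labels = assign(dist, medoids)
        updated = []
        for c in range(k):
            members = [j for j in range(n) if labels[j] == c]
            updated.append(min(members, key=lambda m: sum(dist[m][j] for j in members))
                           if members else medoids[c])
        if updated == medoids:
            break
        medoids = updated
    return assign(dist, medoids)


def consensus_matrix(subjects, k=2):
    """Steps (i)-(iii): per-node distances, per-node clustering, co-assignment rate."""
    n_subj, n_nodes = len(subjects), len(subjects[0])
    consensus = [[0.0] * n_subj for _ in range(n_subj)]
    for node in range(n_nodes):
        rows = [s[node] for s in subjects]  # connectivity pattern of this node
        dist = [[1.0 - pearson(rows[a], rows[b]) for b in range(n_subj)]
                for a in range(n_subj)]
        labels = k_medoids(dist, k)
        for a in range(n_subj):
            for b in range(n_subj):
                if labels[a] == labels[b]:
                    consensus[a][b] += 1.0 / n_nodes
    return consensus


def groups_from_consensus(consensus, threshold=0.5):
    """Step (iv), simplified: connected components of the thresholded consensus."""
    n = len(consensus)
    group, current = [-1] * n, 0
    for start in range(n):
        if group[start] != -1:
            continue
        group[start], stack = current, [start]
        while stack:
            a = stack.pop()
            for b in range(n):
                if group[b] == -1 and consensus[a][b] > threshold:
                    group[b] = current
                    stack.append(b)
        current += 1
    return group


def noisy(base, rng, sd=0.02):
    """Symmetric noisy copy of a toy connectivity matrix."""
    m = [row[:] for row in base]
    for i in range(len(base)):
        for j in range(i + 1, len(base)):
            m[i][j] = m[j][i] = base[i][j] + rng.gauss(0.0, sd)
    return m


# Toy data (hypothetical): two subject groups with different block structure.
A = [[1.0, 0.8, 0.1, 0.1], [0.8, 1.0, 0.1, 0.1], [0.1, 0.1, 1.0, 0.8], [0.1, 0.1, 0.8, 1.0]]
B = [[1.0, 0.1, 0.8, 0.1], [0.1, 1.0, 0.1, 0.8], [0.8, 0.1, 1.0, 0.1], [0.1, 0.8, 0.1, 1.0]]
rng = random.Random(0)
subjects = [noisy(A, rng) for _ in range(3)] + [noisy(B, rng) for _ in range(3)]
consensus = consensus_matrix(subjects, k=2)
groups = groups_from_consensus(consensus)
```

On this toy input the first three subjects and the last three subjects end up in separate groups. A modularity-based community detector could replace `groups_from_consensus` to follow step (iv) more closely.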

arXiv.org e-Print Archive

Ghent University Academic Bibliography

Directory of Open Access Journals

Archivio istituzionale della ricerca - Università di Bari

Networks and the epidemiology of infectious disease

Author: Danon Leon
Ford Ashley P.
House Thomas A.
Jewell Chris P.
Keeling Matthew James
Roberts Gareth O.
Ross Joshua V.
Vernon Matthew C.
Publication venue: Hindawi Limited
Publication date: 27/11/2010

The science of networks has revolutionised research into the dynamics of interacting elements. It could be argued that epidemiology in particular has embraced the potential of network theory more than any other discipline. Here we review the growing body of research concerning the spread of infectious diseases on networks, focusing on the interplay between network theory and epidemiology. The review is split into four main sections, which examine: the types of network relevant to epidemiology; the multitude of ways these networks can be characterised; the statistical methods that can be applied to infer the epidemiological parameters on a realised network; and finally simulation and analytical methods to determine epidemic dynamics on a given network. Given the breadth of areas covered and the ever-expanding number of publications, a comprehensive review of all work is impossible. Instead, we provide a personalised overview of the areas of network epidemiology that have seen the greatest progress in recent years or have the greatest potential to provide novel insights. As such, considerable importance is placed on analytical approaches and statistical methods, which are both rapidly expanding fields. Throughout this review we restrict our attention to epidemiological issues.
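
As an illustration of what "simulation ... to determine epidemic dynamics on a given network" can look like, the following is a minimal sketch of a discrete-time stochastic SIR process on a contact network. It is not taken from the review: the update rule (per-contact infection probability `beta`, per-step recovery probability `gamma`), the function name `simulate_sir`, and the toy networks are assumptions chosen for brevity.

```python
import random


def simulate_sir(adjacency, seeds, beta, gamma, rng, max_steps=10_000):
    """Discrete-time SIR on a network; returns a list of (S, I, R) counts per step."""
    state = {node: "S" for node in adjacency}
    for node in seeds:
        state[node] = "I"

    def counts():
        values = list(state.values())
        return (values.count("S"), values.count("I"), values.count("R"))

    history = [counts()]
    for _ in range(max_steps):
        infected = [node for node, s in state.items() if s == "I"]
        if not infected:
            break  # the epidemic has died out
        newly_infected = set()
        for node in infected:
            for neighbour in adjacency[node]:
                # each infected node tries to infect each susceptible neighbour
                if state[neighbour] == "S" and rng.random() < beta:
                    newly_infected.add(neighbour)
        for node in infected:
            if rng.random() < gamma:
                state[node] = "R"
        for node in newly_infected:
            state[node] = "I"
        history.append(counts())
    return history


# Toy networks (hypothetical): a 5-node path and a 20-node ring.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}
ring = {i: [(i - 1) % 20, (i + 1) % 20] for i in range(20)}

# With beta = gamma = 1 the infection sweeps the path one node per step.
deterministic_run = simulate_sir(path, [0], beta=1.0, gamma=1.0, rng=random.Random(0))
stochastic_run = simulate_sir(ring, [0], beta=0.3, gamma=0.2, rng=random.Random(1))
```

Each entry of the returned history is a tuple of susceptible, infected, and recovered counts, so the final epidemic size is the last recovered count. Continuous-time (event-driven) simulation is a common alternative to this fixed-step scheme.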

arXiv.org e-Print Archive

CiteSeerX

Crossref

Adelaide Research & Scholarship

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

The University of Manchester - Institutional Repository

Lancaster E-Prints

Explore Bristol Research

Estimating sample-specific regulatory networks

Author: Glass Kimberly
Kuijjer Marieke Lydia
Quackenbush John
Tung Matthew
Yuan GuoCheng
Publication date: 28/06/2018

Biological systems are driven by intricate interactions among the complex array of molecules that comprise the cell. Many methods have been developed to reconstruct network models of those interactions. These methods often draw on large numbers of samples with measured gene expression profiles to infer connections between genes (or gene products). The result is an aggregate network model representing a single estimate for the likelihood of each interaction, or "edge," in the network. While informative, aggregate models fail to capture the heterogeneity that is represented in any population. Here we propose a method to reverse engineer sample-specific networks from aggregate network models. We demonstrate the accuracy and applicability of our approach in several data sets, including simulated data, microarray expression data from synchronized yeast cells, and RNA-seq data collected from human lymphoblastoid cell lines. We show that these sample-specific networks can be used to study changes in network topology across time and to characterize shifts in gene regulation that may not be apparent in expression data. We believe the ability to generate sample-specific networks will greatly facilitate the application of network methods to the increasingly large, complex, and heterogeneous multi-omic data sets that are currently being generated, and ultimately support the emerging field of precision network medicine.
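
The abstract does not state the estimator, so the sketch below is only one plausible reading of "reverse engineering" a sample-specific network from aggregate models: build the aggregate network with and without sample q, then extrapolate linearly, e(q) = N * (e_all - e_without_q) + e_without_q, where N is the number of samples. The formula, the function names, and the toy aggregate (`mean_product_network`, the mean of per-sample products of expression values) are assumptions made for illustration.

```python
def mean_product_network(samples):
    """Toy aggregate network: edge (i, j) is the mean over samples of x_i * x_j."""
    n, n_genes = len(samples), len(samples[0])
    return [[sum(s[i] * s[j] for s in samples) / n for j in range(n_genes)]
            for i in range(n_genes)]


def sample_specific_networks(samples, aggregate):
    """Leave-one-out extrapolation of one network per sample (assumed estimator)."""
    n = len(samples)
    full = aggregate(samples)
    networks = []
    for q in range(n):
        rest = aggregate(samples[:q] + samples[q + 1:])  # aggregate without sample q
        networks.append([[n * (f - r) + r for f, r in zip(full_row, rest_row)]
                         for full_row, rest_row in zip(full, rest)])
    return networks


# Toy expression data (hypothetical): 4 samples, 3 genes.
samples = [
    [1.0, 2.0, 0.5],
    [0.5, 1.5, 2.0],
    [2.0, 0.5, 1.0],
    [1.5, 1.0, 1.5],
]
networks = sample_specific_networks(samples, mean_product_network)
```

The toy aggregate is linear in the per-sample contributions, so in this special case the extrapolation recovers each sample's own product x_i * x_j exactly, which is a convenient sanity check. With a non-linear aggregate such as Pearson correlation, passed in through the same `aggregate` argument, the result is an estimate rather than an exact decomposition.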

arXiv.org e-Print Archive

Directory of Open Access Journals

NORA - Norwegian Open Research Archives

A statistical network analysis of the HIV/AIDS epidemics in Cuba

Author: A Clauset
A Kleczkowski
B Hill
C Moore
E Volz
F Ball
F Barbour
F Rossi
H Arazoza De
I Herman
I Kiss
J Reichardt
J Wylie
JM Roberts Jr
L Decreusefond
M Blum
M Graham
M Molloy
M Newman
R Ahuja
RM May
S Clémençon
S Fortunato
S Resnick
T Britton
T Fruchterman
T House
Y-H Hsieh
Publication date: 22/05/2015

The Cuban contact-tracing detection system, set up in 1986, allowed the reconstruction and analysis of the sexual network underlying the epidemic (5,389 vertices and 4,073 edges, with a giant component of 2,386 nodes and 3,168 edges), shedding light on the spread of HIV and the role of contact tracing. Clustering based on modularity optimization, combined with the study of covariates, provides a better visualization and understanding of the network. The graph has a globally low but heterogeneous density, with clusters of high intraconnectivity but low interconnectivity. Though descriptive, our results pave the way for incorporating structure when studying stochastic SIR epidemics spreading on social networks.
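
The quantities named in this abstract (giant component, density, modularity) can be made concrete with a short sketch. It uses the standard definitions, density = 2E / (V(V - 1)) and Newman's modularity Q, on a hypothetical toy graph. It only evaluates Q for a given partition and does not perform the modularity optimization used in the paper. The vertex and edge counts at the end are the ones reported in the abstract.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components as sets, largest first."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = {start}, [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def density(n_nodes, n_edges):
    """Fraction of possible undirected edges that are present."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def modularity(edges, community):
    """Modularity Q of a given partition (node -> community id)."""
    m = len(edges)
    inside, degree = defaultdict(int), defaultdict(int)
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            inside[community[u]] += 1
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph (hypothetical): two triangles joined by one edge, plus an isolated pair.
nodes = list(range(8))
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3), (6, 7)]
giant = connected_components(nodes, edges)[0]
giant_edges = [(u, v) for u, v in edges if u in giant and v in giant]
partition = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
q = modularity(giant_edges, partition)

# Density of the full Cuban network from the counts in the abstract (about 2.8e-4).
reported_density = density(5389, 4073)
```

The very small density computed from the reported counts is consistent with the abstract's description of a graph with "globally low" density.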

arXiv.org e-Print Archive

Crossref

HAL-Paris1

Partition Decoupling for Multi-gene Analysis of Gene Expression Profiling Data

Author: Braun Rosemary
Leibon Gregory
Pauls Scott
Rockmore Daniel
Publication date: 01/01/2011

We present the extension and application of a new unsupervised statistical learning technique, the Partition Decoupling Method (PDM), to gene expression data. Because it can reveal non-linear and non-convex geometries present in the data, the PDM is an improvement over typical gene expression analysis algorithms, permitting a multi-gene analysis that can reveal phenotypic differences even when the individual genes do not exhibit differential expression. Here, we apply the PDM to publicly available gene expression data sets and demonstrate that we are able to identify cell types and treatments with higher accuracy than is obtained through other approaches. By applying it in a pathway-by-pathway fashion, we demonstrate how the PDM may be used to find sets of mechanistically related genes that discriminate phenotypes.

arXiv.org e-Print Archive

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Dartmouth Digital Commons (Dartmouth College)