Cold Spring Harbor Laboratory Institutional Repository

Predicting age from the transcriptome of human dermal fibroblasts

Author: Fleischer J. G.
Hetzer M. W.
Huang L.
Ibarra A.
Navlakha S.
Schulte R.
Shokhirev M. N.
Tsai H. H.
Tyagi S.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/12/2018

Biomarkers of aging can be used to assess the health of individuals and to study aging and age-related diseases. We generate a large dataset of genome-wide RNA-seq profiles of human dermal fibroblasts from 133 people aged 1 to 94 years to test whether signatures of aging are encoded within the transcriptome. We develop an ensemble machine learning method that predicts age to a median error of 4 years, outperforming previous methods used to predict age. The ensemble was further validated by testing it on ten progeria patients, and our method is the only one that predicts accelerated aging in these patients.

Algorithms in nature: the convergence of systems biology and computational thinking

Author: Afek Y
Beckers R
Bishop CM
Charalambous C
Coore D
Daliot A
Deneubourg JL
Ferber J
Forrest S
Goldberg DE
Hebb DO
Jhuang H
Kennedy J
Kephart JO
Luby M
Lynch NA
Păun G
Ren H
Rendel JM
Saket Navlakha
Serre T
Shapiro LG
Szeliski R
Vazirani V
Werner‐Allen G
Wokoma I
Zhu X
Ziv Bar‐Joseph
Publication venue: Nature Publishing Group
Publication date: 08/11/2011

Biologists rely on computational methods to analyze and integrate large data sets, while several computational methods were inspired by the high-level design principles of biological systems. This Perspective discusses the recent convergence of these two ways of thinking.

Cold Spring Harbor Laboratory Institutional Repository

Southampton (e-Prints Soton)

DADA: Degree-Aware Algorithms for Network-Based Disease Gene Prioritization

Author: AM Edwards
AM Glazier
D Maglott
D Masotti
DS Goldberg
E Adie
E Nabieva
F Turner
G Bebek
Gurkan Bebek
H Tong
H Tong
HG Brunner
J Chen
K Lage
K Macropol
KI Goh
L Lovász
M Oti
M Oti
MA van Driel
Mehmet Koyutürk
MS Erten
O Vanunu
RA George
Rob M Ewing
S Aerts
S Brin
S Köhler
S Navlakha
Sinan Erten
T Ideker
VN Patel
WK Huh
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: High-throughput molecular interaction data have been used effectively to prioritize candidate genes that are linked to a disease, based on the observation that the products of genes associated with similar diseases are likely to interact with each other heavily in a network of protein-protein interactions (PPIs). An important challenge for these applications, however, is the incomplete and noisy nature of PPI data. Information-flow-based methods alleviate these problems to a certain extent by considering indirect interactions and multiplicity of paths. Results: We demonstrate that existing methods are likely to favor highly connected genes, making prioritization sensitive to the skewed degree distribution of PPI networks, as well as to ascertainment bias in available interaction and disease association data. Motivated by this observation, we propose several statistical adjustment methods to account for the degree distribution of known disease and candidate genes, using a PPI network with associated confidence scores for interactions. We show that the proposed methods can detect loosely connected disease genes that are missed by existing approaches; however, this improvement might come at the price of more false negatives for highly connected genes. Consequently, we develop a suite called DADA, which includes different uniform prioritization methods that effectively integrate existing approaches with the proposed statistical adjustment strategies. Comprehensive experimental results on the Online Mendelian Inheritance in Man (OMIM) database show that DADA outperforms existing methods in prioritizing candidate disease genes. Conclusions: These results demonstrate the importance of employing accurate statistical models and associated adjustment methods in network-based disease gene prioritization, as well as in other network-based functional inference applications. DADA is implemented in Matlab and is freely available at http://compbio.case.edu/dada/.
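The degree bias described in this abstract can be illustrated with a minimal sketch. It assumes random walk with restart as the information-flow method and, as the adjustment, division by each node's share of total degree; both are illustrative choices rather than the formulations used in DADA, and the toy network and node names are hypothetical.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, iters=200):
    """Approximate the fixed point of p = (1 - r) * W p + r * p0.

    adj maps each node to the set of its neighbours (undirected, no isolated
    nodes); W spreads a node's score evenly over its neighbours.
    """
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in adj}
    p = dict(p0)
    for _ in range(iters):
        nxt = {n: restart * p0[n] for n in adj}
        for u, nbrs in adj.items():
            share = (1.0 - restart) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        p = nxt
    return p


def degree_adjusted(adj, scores):
    """Divide each score by the node's share of total degree.

    On a connected undirected graph that share is the stationary distribution
    of a walk without restart, so the ratio measures enrichment over what
    connectivity alone would give. This is an assumed, simplified adjustment.
    """
    total_degree = sum(len(nbrs) for nbrs in adj.values())
    return {n: scores[n] / (len(adj[n]) / total_degree) for n in adj}


# Hypothetical toy network: seed S touches a low-degree candidate C and a hub H.
edges = [("S", "C"), ("S", "H"), ("H", "x1"), ("H", "x2"), ("H", "x3"), ("H", "x4")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

raw = random_walk_with_restart(adj, seeds={"S"})
adjusted = degree_adjusted(adj, raw)
print("raw score favours the hub:", raw["H"] > raw["C"])
print("adjusted score favours the low-degree candidate:", adjusted["C"] > adjusted["H"])
```

In this toy case the hub outranks the loosely connected candidate on the raw score, and the order reverses after adjustment, which mirrors the trade-off the abstract describes for highly connected genes.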

Springer - Publisher Connector

arXiv.org e-Print Archive

Concurrence of form and function in developing networks and its role in synaptic pruning

Author: A Fornito
A Holtmaat
A Holtmaat
A Litwin-Kumar
AB Bortz
AL Barabási
AY Klintsova
B Kolb
B Kolb
DH Geschwind
DS Modha
E Bullmore
E Frank
F Vazquez
F Zenke
G Chechik
G Faludi
G Liu
H Sayama
H Sompolinsky
J Berg
J Iglesias
J Ren
JJ Hopfield
JJ Torres
JM Cortés
KS Lee
LG Morelli
M Deger
M Roo De
ME Turanalp
MS Keshavan
MT Gastner
N Voges
O Williams
PR Huttenlocher
QK Telesford
R Albert
RL Gómez
S Amari
S Boccaletti
S Franciscis de
S Johnson
S Johnson
S Johnson
S Maslov
S Navlakha
S Song
T Akam
T Gross
VM Eguiluz
W Xiong
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

A fundamental question in neuroscience is how structure and function of neural systems are related. We study this interplay by combining a familiar auto-associative neural network with an evolving mechanism for the birth and death of synapses. A feedback loop then arises leading to two qualitatively different types of behaviour. In one, the network structure becomes heterogeneous and disassortative, and the system displays good memory performance; furthermore, the structure is optimised for the particular memory patterns stored during the process. In the other, the structure remains homogeneous and incapable of pattern retrieval. These findings provide an inspiring picture of brain structure and dynamics that is compatible with experimental results on early brain development, and may help to explain synaptic pruning. Other evolving networks, such as those of protein interactions, might share the basic ingredients for this feedback loop and other questions, and indeed many of their structural features are as predicted by our model. We are grateful for financial support from the Spanish MINECO (project of Excellence: FIS2017-84256-P) and from “Obra Social La Caixa”.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

University of Birmingham Research Portal

Repositorio Institucional Universidad de Granada

ProphNet: A generic prioritization method through propagation of information

Author: A Hamosh
A Molven
A Naderi
A Rökman
AL Barabási
AL Gloyn
AP Babenko
Armando Blanco
B Raghavachari
C Fenoglio
C Van Duijn
Carlos Cano
D Zhou
E Jain
EA Adie
EA Adie
EC van Hove
G Chenevix-Trench
GS Wilkie
JBJ Kwok
KJ Gaulton
LE Wold
MA van Driel
N Aziz
N Rahman
O Vanunu
O Vanunu
P Vahteristo
PJ Westenend
RD Finn
S Aerts
S Köhler
S Navlakha
S Peri
SK Ng
T Buterin
T Hwang
T Walsh
V Martínez
Víctor Martínez
W Wang
W Zhang
X Wang
X Wu
Y Li
Y Moreau
Y Wang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

This article has been published as part of BMC Bioinformatics Volume 15 Supplement 1, 2014: Integrated Bio-Search: Selected Works from the 12th International Workshop on Network Tools and Applications in Biology (NETTAB 2012). [Background] Prioritization methods have become a useful tool for mining large amounts of data to suggest promising hypotheses in early research stages. In particular, network-based prioritization tools use a network representation of the interactions between different biological entities to identify novel indirect relationships. However, current network-based prioritization tools are strongly tailored to specific domains of interest (e.g. gene-disease prioritization) and cannot handle networks with more than two types of entities (e.g. genes and diseases). Therefore, the direct application of these methods to new prioritization tasks is limited. [Results] This work presents ProphNet, a generic network-based prioritization tool that integrates an arbitrary number of interrelated biological entities to accomplish any prioritization task. We tested the performance of ProphNet in comparison with leading network-based prioritization methods, namely rcNet and DomainRBF, for gene-disease and domain-disease prioritization, respectively. The results obtained by ProphNet show a significant improvement in terms of sensitivity and specificity for both tasks. We also applied ProphNet to disease-gene prioritization on Alzheimer, Diabetes Mellitus Type 2 and Breast Cancer to validate the results and identify putative candidate genes involved in these diseases. [Conclusions] ProphNet works on top of any heterogeneous network by integrating information from different types of biological entities to rank entities of a specific type according to their degree of relationship with a query set of entities of another type. Our method works by propagating information across data networks and measuring the correlation between the propagated values for a query set and a target set of entities. ProphNet is available at: http://genome2.ugr.es/prophnet. A Matlab implementation of the algorithm is also available at the website. This work was part of projects P08-TIC-4299 of J. A., Sevilla and TIN2009-13489 of DGICT, Madrid. It was also supported by Plan Propio de Investigación, University of Granada.
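The propagate-then-correlate idea in this abstract can be sketched on a single toy network. The sketch assumes the normalized propagation rule f <- alpha * S f + (1 - alpha) * y and Pearson correlation as the similarity measure; it omits the transfer of values between networks of different entity types, so it should be read as an illustration under those assumptions, not as ProphNet's algorithm. The path graph and gene names are hypothetical.

```python
import math


def propagate(adj, start, alpha=0.5, iters=100):
    """Iterate f <- alpha * S f + (1 - alpha) * y, with S = D^-1/2 A D^-1/2.

    adj maps each node to the set of its neighbours; start is the set of
    nodes that receive an initial value of 1.
    """
    y = {n: (1.0 if n in start else 0.0) for n in adj}
    f = dict(y)
    for _ in range(iters):
        nxt = {}
        for u, nbrs in adj.items():
            spread = sum(f[v] / math.sqrt(len(nbrs) * len(adj[v])) for v in nbrs)
            nxt[u] = alpha * spread + (1.0 - alpha) * y[u]
        f = nxt
    return f


def pearson(a, b, nodes):
    """Pearson correlation of two node-indexed value maps."""
    mean_a = sum(a[n] for n in nodes) / len(nodes)
    mean_b = sum(b[n] for n in nodes) / len(nodes)
    cov = sum((a[n] - mean_a) * (b[n] - mean_b) for n in nodes)
    var_a = sum((a[n] - mean_a) ** 2 for n in nodes)
    var_b = sum((b[n] - mean_b) ** 2 for n in nodes)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical gene network: a simple path a - b - c - d - e - f.
path = ["a", "b", "c", "d", "e", "f"]
adj = {n: set() for n in path}
for u, v in zip(path, path[1:]):
    adj[u].add(v)
    adj[v].add(u)

query_genes = {"a"}            # genes already linked to the query (assumed)
candidates = ["b", "c", "e"]   # genes to rank

query_profile = propagate(adj, query_genes)
scores = {c: pearson(query_profile, propagate(adj, {c}), path) for c in candidates}
ranking = sorted(candidates, key=scores.get, reverse=True)
print(ranking)
```

Candidates whose propagated profile resembles the query's profile rank first, so the gene adjacent to the known query gene comes out on top in this toy example.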

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Repositorio Institucional Universidad de Granada

Clustering metagenomic sequences with interpolated Markov models

Author: A Brady
A Kislyuk
A McHardy
AC McHardy
AL Delcher
AL Delcher
B Rodriguez-Brito
CKK Chan
D Huson
D Rusch
D Wu
DA Benson
David R Kelley
EA Grice
EK Costello
G Celeux
G Dick
GW Tyson
H Teeling
J Bohlin
J Bohlin
J Bohlin
J Morgan
J Mrazek
J Qin
J Shi
J White
JA Eisen
JG Lawrence
K Chen
K Liolios
K Mavromatis
L Hubert
LB Koski
M Hamady
M Wu
MM Haque
N Diaz
P Smyth
P Tan
R Durbin
R Sandberg
S Chatterji
S Karlin
S Kosakovsky Pond
S Mann
S Navlakha
SF Altschul
SJ Lee
SL Salzberg
Steven L Salzberg
T Abe
T Abe
W Gerlach
YW Wu
Z Weinberg
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Sequencing of environmental DNA (often called metagenomics) has shown tremendous potential to uncover the vast number of unknown microbes that cannot be cultured and sequenced by traditional methods. Because the output from metagenomic sequencing is a large set of reads of unknown origin, clustering reads together that were sequenced from the same species is a crucial analysis step. Many effective approaches to this task rely on sequenced genomes in public databases, but these genomes are a highly biased sample that is not necessarily representative of environments interesting to many metagenomics projects. Results: We present SCIMM (Sequence Clustering with Interpolated Markov Models), an unsupervised sequence clustering method. SCIMM achieves greater clustering accuracy than previous unsupervised approaches. We examine the limitations of unsupervised learning on complex datasets, and suggest a hybrid of SCIMM and the supervised learning method Phymm, called PHYSCIMM, that performs better when evolutionarily close training genomes are available. Conclusions: SCIMM and PHYSCIMM are highly accurate methods to cluster metagenomic sequences. SCIMM operates entirely unsupervised, making it ideal for environments containing mostly novel microbes. PHYSCIMM uses supervised learning to improve clustering in environments containing microbial strains from well-characterized genera. SCIMM and PHYSCIMM are available open source from http://www.cbcb.umd.edu/software/scimm.
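Model-based sequence clustering of the kind this abstract describes can be sketched as a k-means-style loop: fit one sequence model per cluster, then move each read to the model that gives it the highest likelihood. The sketch below swaps in a fixed first-order Markov chain with add-one smoothing for the interpolated Markov models, and uses a hand-picked starting partition; the reads are made-up toy data, and none of this is taken from the SCIMM code.

```python
import math

ALPHABET = "ACGT"


def train_markov(seqs):
    """Fit first-order transition log-probabilities with add-one smoothing."""
    counts = {a: {b: 1.0 for b in ALPHABET} for a in ALPHABET}
    for s in seqs:
        for prev, cur in zip(s, s[1:]):
            counts[prev][cur] += 1.0
    model = {}
    for a, row in counts.items():
        total = sum(row.values())
        model[a] = {b: math.log(c / total) for b, c in row.items()}
    return model


def log_likelihood(model, s):
    """Sum of transition log-probabilities (the first base is not scored)."""
    return sum(model[prev][cur] for prev, cur in zip(s, s[1:]))


def cluster(reads, labels, k, max_iters=20):
    """Alternate between refitting per-cluster models and reassigning reads."""
    labels = list(labels)
    for _ in range(max_iters):
        models = [
            train_markov([r for r, lab in zip(reads, labels) if lab == j])
            for j in range(k)
        ]
        new_labels = [
            max(range(k), key=lambda j: log_likelihood(models[j], r))
            for r in reads
        ]
        if new_labels == labels:   # converged: no read changed cluster
            break
        labels = new_labels
    return labels


# Made-up toy reads: three AT-rich and three GC-rich, deliberately mixed at the start.
reads = [
    "ATATATATAT", "TATATTATAA", "GCGCGCGGCG",
    "AATTATATAT", "CGCGGCGCGC", "GGCGCGCCGC",
]
initial_labels = [0, 0, 0, 1, 1, 1]
final_labels = cluster(reads, initial_labels, k=2)
print(final_labels)
```

Starting from a partition that mixes the two compositions, the loop separates the AT-rich reads from the GC-rich ones after one reassignment pass. A practical tool would also need a principled initial partition and a way to choose the number of clusters, which this sketch leaves out.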

Springer - Publisher Connector