Springer - Publisher Connector

Brunel University Research Archive

A dynamic network approach for the study of human phenotypes

Author: A Barrat
A Hinks
A Rzhetsky
AB Begovich
AJ Lusis
Albert-László Barabási
C Benesch
CA Hidalgo
CA Hidalgo
César A. Hidalgo
D Lauderdale
DS Lee
GS Cooper
H Goehler
I Feldman
J Hatten
J Kobberling
J Lim
J Zhang
JB Mitchell
JF Rual
K-I Goh
Lauren Ancel Meyers
LI Iezzoni
MA Pujana
MC Oldham
N Craddock
Nicholas A. Christakis
Nicholas Blumm
P McGuffin
P Wordsworth
S Schneeweiss
SE Ahnert
SE Calvano
SF Grant
U Stelzl
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/04/2009

The use of networks to integrate different genetic, proteomic, and metabolic datasets has been proposed as a viable path toward elucidating the origins of specific diseases. Here we introduce a new phenotypic database summarizing correlations obtained from the disease history of more than 30 million patients in a Phenotypic Disease Network (PDN). We present evidence that the structure of the PDN is relevant to the understanding of illness progression by showing that (1) patients develop diseases close in the network to those they already have; (2) the progression of disease along the links of the network is different for patients of different genders and ethnicities; (3) patients diagnosed with diseases which are more highly connected in the PDN tend to die sooner than those affected by less connected diseases; and (4) diseases that tend to be preceded by others in the PDN tend to be more connected than diseases that precede other illnesses, and are associated with higher degrees of mortality. Our findings show that disease progression can be represented and studied using network methods, offering the potential to enhance our understanding of the origin and evolution of human diseases. The dataset introduced here, released concurrently with this publication, represents the largest relational phenotypic resource publicly available to the research community.
Comment: 28 pages (double space), 6 figures
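
As a rough illustration of how pairwise disease correlations of the kind summarized in a PDN could be derived from patient histories, the sketch below computes the phi coefficient (the Pearson correlation for two binary variables) for every pair of diagnoses that co-occur in at least one patient. The abstract does not state which correlation measure is used, so the choice of phi, the function name `phi_correlations`, and the toy diagnosis labels are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def phi_correlations(patients):
    """Return {(disease_a, disease_b): phi} for co-occurring disease pairs.

    `patients` is a list of sets; each set holds the diagnoses recorded
    for one patient. Pairs that never co-occur are omitted from the result.
    """
    n = len(patients)
    prevalence = {}  # number of patients with each disease
    co_occurrence = {}  # number of patients with both diseases of a pair

    for diagnoses in patients:
        for disease in diagnoses:
            prevalence[disease] = prevalence.get(disease, 0) + 1
        for a, b in combinations(sorted(diagnoses), 2):
            co_occurrence[(a, b)] = co_occurrence.get((a, b), 0) + 1

    phi = {}
    for (a, b), both in co_occurrence.items():
        pa, pb = prevalence[a], prevalence[b]
        denominator = sqrt(pa * pb * (n - pa) * (n - pb))
        if denominator > 0:  # undefined if a disease affects nobody or everybody
            phi[(a, b)] = (both * n - pa * pb) / denominator
    return phi


# Hypothetical toy cohort: labels are placeholders, not real records.
toy_cohort = [{"A", "B"}, {"A", "B"}, {"C"}, {"C"}]
toy_phi = phi_correlations(toy_cohort)
```

In this toy cohort, diseases A and B always appear together, so their phi value is 1.0. In a network view, each disease would be a node and each sufficiently strong correlation an edge.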

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Harvard University - DASH

The Francis Crick Institute

Data incongruence and the problem of avian louse phylogeny

Recent studies based on different types of data (i.e. morphological and molecular) have supported conflicting phylogenies for the genera of avian feather lice (Ischnocera: Phthiraptera). We analyse new and published data from morphology and from mitochondrial (12S rRNA and COI) and nuclear (EF1-α) genes to explore the sources of this incongruence and explain these conflicts. Character convergence, multiple substitutions at high divergences, and ancient radiation over a short period of time have contributed to the problem of resolving louse phylogeny with the data currently available. We show that apparent incongruence between the molecular datasets is largely attributable to rate variation and nonstationarity of base composition. In contrast, highly significant character incongruence leads to topological incongruence between the molecular and morphological data. We consider ways in which biases in the sequence data could be misleading, using several maximum likelihood models and LogDet corrections. The hierarchical structure of the data is explored using likelihood mapping and SplitsTree methods. Ultimately, we concede there is strong discordance between the molecular and morphological data and apply the conditional combination approach in this case. We conclude that higher level phylogenetic relationships within avian Ischnocera remain extremely problematic. However, consensus between datasets is beginning to converge on a stable phylogeny for avian lice, at and below the familial rank.

Enlighten

Disambiguating Proteins, Genes, and RNA in Text: A Machine Learning Approach

Author: Duboue Pablo A.
Hatzivassiloglou Vasileios
Rzhetsky Andrey
Publication venue: Proceedings of the 9th International Conference on Intelligent Systems for Molecular Biology
Publication date: 01/01/2001

We present an automated system for assigning protein, gene, or mRNA class labels to biological terms in free text. Three machine learning algorithms and several extended ways for defining contextual features for disambiguation are examined, and a fully unsupervised manner for obtaining training examples is proposed. We train and evaluate our system over a collection of 9 million words of molecular biology journal articles, obtaining accuracy rates up to 85%.

Columbia University Academic Commons

Conflicting Biomedical Assumptions for Mathematical Modeling: The Case of Cancer Metastasis

Author: Divoli Anna
Evans James A.
Mendonça Eneida A.
Rzhetsky Andrey
Publication venue
Publication date: 21/12/2023

Computational models in biomedicine rely on biological and clinical assumptions. The selection of these assumptions contributes substantially to modeling success or failure. Assumptions used by experts at the cutting edge of research, however, are rarely explicitly described in scientific publications. One can directly collect and assess some of these assumptions through interviews and surveys. Here we investigate diversity in expert views about a complex biological phenomenon, the process of cancer metastasis. We harvested individual viewpoints from 28 experts in clinical and molecular aspects of cancer metastasis and summarized them computationally. While experts predominantly agreed on the definition of individual steps involved in metastasis, no two expert scenarios for metastasis were identical. We computed the probability that any two experts would disagree on k or fewer metastatic stages and found that any two randomly selected experts are likely to disagree about several assumptions. Considering the probability that two or more of these experts review an article or a proposal about metastatic cascades, the probability that they will disagree with elements of a proposed model approaches 1. This diversity of conceptions has clear consequences for advance and deadlock in the field. We suggest that strong, incompatible views are common in biomedicine but largely invisible to biomedical experts themselves. We built a formal Markov model of metastasis to encapsulate expert convergence and divergence regarding the entire sequence of metastatic stages. This model revealed stages of greatest disagreement, including the points at which cancer enters and leaves the bloodstream. The model provides a formal probabilistic hypothesis against which researchers can evaluate data on the process of metastasis. This would enable subsequent improvement of the model through Bayesian probabilistic update. 
Practically, we propose that model assumptions and hunches be harvested systematically and made available for modelers and scientists.
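
To make the idea of a Markov model built from expert scenarios more concrete, the sketch below estimates first-order transition probabilities by counting how often one stage directly follows another across a set of stage sequences. This is only a minimal maximum-likelihood illustration, not the authors' model; the function name `transition_probabilities` and the stage labels are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next stage | current stage) from ordered stage sequences.

    Each sequence is one expert's ordered list of stages. Probabilities are
    simple relative frequencies of observed stage-to-stage transitions.
    """
    counts = defaultdict(Counter)
    for sequence in sequences:
        for current, following in zip(sequence, sequence[1:]):
            counts[current][following] += 1

    probabilities = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        probabilities[current] = {
            following: count / total for following, count in followers.items()
        }
    return probabilities


# Hypothetical scenarios from three experts; stage names are placeholders.
expert_scenarios = [
    ["primary_tumor", "intravasation", "circulation", "extravasation"],
    ["primary_tumor", "intravasation", "circulation", "extravasation"],
    ["primary_tumor", "local_invasion", "intravasation", "circulation"],
]
model = transition_probabilities(expert_scenarios)
```

A stage whose outgoing probabilities are spread over several successors (here `primary_tumor`, which is followed by `intravasation` in two scenarios and by `local_invasion` in one) is a point where the scenarios diverge.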

Knowledge UChicago

Mapping gene associations in human mitochondria using clinical disease phenotypes

Author: A Hamosh
A Spinazzola
AJ Butte
Andrey Rzhetsky
BB Lowell
C Eng
C Perez-Cerda
Curt Scharfe
D Skladal
DC Wallace
E Singer
Edward A. Allen
FW Sabb
G Benard
G Giaever
GM Enns
Gregory M. Enns
Guan-Cheng Li
H Prokisch
H Prokisch
Henry Horng-Shing Lu
HG Brunner
I Feldman
IH Witten
J Lim
JB Potash
JD Storey
JO Sass
Jutta K. Neuenburg
K Lage
KI Goh
L Franke
LG Biesecker
LM Steinmetz
MA Holliday
MA van Driel
MG Rosca
N Darin
N Freimer
O Elpeleg
P Braun
PF Chinnery
PM Nadkarni
R Delsite
R Ensenauer
R Luft
Ronald W. Davis
S Anderson
S Calvo
S DiMauro
SE Calvano
T Hernandez-Boussard
TD Vo
Thomas Klopstock
Tina M. Cowan
TM Dawson
V Tiranti
VK Mootha
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2009

Nuclear genes encode most mitochondrial proteins, and their mutations cause diverse and debilitating clinical disorders. To date, 1,200 of these mitochondrial genes have been recorded, while no standardized catalog exists of the associated clinical phenotypes. Such a catalog would be useful to develop methods to analyze human phenotypic data, to determine genotype-phenotype relations among many genes and diseases, and to support the clinical diagnosis of mitochondrial disorders. Here we establish a clinical phenotype catalog of 174 mitochondrial disease genes and study associations of diseases and genes. Phenotypic features such as clinical signs and symptoms were manually annotated from full-text medical articles and classified based on the hierarchical MeSH ontology. This classification of phenotypic features of each gene allowed for the comparison of diseases between different genes. In turn, we were then able to measure the phenotypic associations of disease genes for which we calculated a quantitative value that is based on their shared phenotypic features. The results showed that genes sharing more similar phenotypes have a stronger tendency for functional interactions, proving the usefulness of phenotype similarity values in disease gene network analysis. We then constructed a functional network of mitochondrial genes and discovered a higher connectivity for non-disease than for disease genes, and a tendency of disease genes to interact with each other. Utilizing these differences, we propose 168 candidate genes that resemble the characteristic interaction patterns of mitochondrial disease genes. Through their network associations, the candidates are further prioritized for the study of specific disorders such as optic neuropathies and Parkinson disease. 
Most mitochondrial disease phenotypes involve several clinical categories including neurologic, metabolic, and gastrointestinal disorders, which might indicate the effects of gene defects within the mitochondrial system. The accompanying knowledgebase (http://www.mitophenome.org/) supports the study of clinical diseases and associated genes.
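
One simple way to turn shared phenotypic features into a quantitative gene-to-gene value is a set-overlap score. The sketch below uses the Jaccard index over each gene's annotated feature set. The abstract does not say which measure the authors used, so Jaccard is an assumption here, and the gene and feature names are placeholders rather than entries from the catalog.

```python
from itertools import combinations


def jaccard(features_a, features_b):
    """Jaccard index: shared features divided by all features of either gene."""
    a, b = set(features_a), set(features_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(gene_features):
    """Return {(gene_x, gene_y): similarity} for every unordered gene pair."""
    return {
        (x, y): jaccard(gene_features[x], gene_features[y])
        for x, y in combinations(sorted(gene_features), 2)
    }


# Hypothetical annotations; names are illustrative placeholders only.
annotations = {
    "GENE_A": {"ataxia", "myopathy"},
    "GENE_B": {"myopathy", "seizures"},
    "GENE_C": {"hepatic failure"},
}
similarity = pairwise_similarity(annotations)
```

With hierarchical annotations such as MeSH terms, a fuller treatment might also credit features that share an ancestor term; the flat set comparison above ignores the hierarchy.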

CiteSeerX

Open Access LMU ( Ludwig-Maximilians-Univ. München)

The Francis Crick Institute

Dissecting schizophrenia phenotypic variation: the contribution of genetic variation, environmental exposures, and gene–environment interactions

Author: Khan Atif
Kushner Steven A.
Rzhetsky Andrey
Zhang Hanxin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/05/2022

Schizophrenia is among the leading causes of disability worldwide. Prior studies have conclusively demonstrated that the etiology of schizophrenia contains a strong genetic component. However, environmental contributions and gene–environment interactions have remained less well understood. Here, we estimated the genetic and environmental contributions to schizophrenia risk using a unique combination of data sources and mathematical models. We used the administrative health records of 481,657 U.S. individuals organized into 128,989 families. In addition, we employed rich geographically specific measures of air, water, and land quality across the United States. Using models of progressively increasing complexity, we examined both linear and non-linear contributions of genetic variation and environmental exposures to schizophrenia risk. Our results demonstrate that heritability estimates differ significantly when gene–environment interactions are included in the models, dropping from 79% for the simplest model, to 46% in the best-fit model which included the full set of linear and non-linear parameters. Taken together, these findings suggest that environmental factors are an important source of explanatory variance underlying schizophrenia risk. Future studies are warranted to further explore linear and non-linear environmental contributions to schizophrenia risk and investigate the causality of these associations.

EUR Research Repository

Public Library of Science (PLOS)

Divergence of Mammalian Higher Order Chromatin Structure Is Associated with Developmental Loci

Author: A Baker
A Kong
A Magklara
Andrey Rzhetsky
B Moindrot
CL Woodcock
Colin A. Semple
CP Cabrera
D Peric-Hupkes
D Schmidt
D Sproul
DM Church
DM Gilbert
DS Dimitrova
DW Huang
E de Wit
E Lieberman-Aiden
E Yaffe
E Yaffe
Emily V. Chambers
EV Linardopoulou
G Pegoraro
GK Smyth
I Hiratani
I Hiratani
I Qamar
J Cai
JGD Prendergast
JGD Prendergast
JR Dixon
K Schroder
KC Wang
L Guelen
L Parker-Katiraee
M Costantini
M Guttman
M Ku
M Lin
N Gilbert
NMV Gomes
P Tschopp
R Kalhor
R Lister
RT O'Keefe
S Xiao
T Hirayama
T Montavon
T Ryba
TS McClintock
VG Tusher
Wendy A. Bickmore
WJ Kent
Y Wang
Y Wang
YH Woo
Z Zhang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Several recent studies have examined different aspects of mammalian higher order chromatin structure - replication timing, lamina association and Hi-C inter-locus interactions - and have suggested that most of these features of genome organisation are conserved over evolution. However, the extent of evolutionary divergence in higher order structure has not been rigorously measured across the mammalian genome, and until now little has been known about the characteristics of any divergent loci present. Here, we generate a dataset combining multiple measurements of chromatin structure and organisation over many embryonic cell types for both human and mouse that, for the first time, allows a comprehensive assessment of the extent of structural divergence between mammalian genomes. Comparison of orthologous regions confirms that all measurable facets of higher order structure are conserved between human and mouse, across the vast majority of the detectably orthologous genome. This broad similarity is observed in spite of many loci possessing cell type specific structures. However, we also identify hundreds of regions (from 100 Kb to 2.7 Mb in size) showing consistent evidence of divergence between these species, constituting at least 10% of the orthologous mammalian genome and encompassing many hundreds of human and mouse genes. These regions show unusual shifts in human GC content, are unevenly distributed across both genomes, and are enriched in human subtelomeric regions. Divergent regions are also relatively enriched for genes showing divergent expression patterns between human and mouse ES cells, implying these regions cause divergent regulation. Particular divergent loci are strikingly enriched in genes implicated in vertebrate development, suggesting important roles for structural divergence in the evolution of mammalian developmental programmes. 
These data suggest that, though relatively rare in the mammalian genome, divergence in higher order chromatin structure has played important roles during evolution.

CiteSeerX