First anatomical network analysis of fore- and hindlimb musculoskeletal modularity in bonobos, common chimpanzees, and humans

Author: B Esteve-Altava
B Villmoare
C Rolian
D Rasskin-Gutman
EH Margulies
G Csardi
J Molnar
ME Newman
MW Marzke
NM Young
P Pons
R Diogo
RA Miller
S Shou
TF Hansen
Publication venue: Springer Science and Business Media LLC
Publication date: 02/05/2018

Studies of morphological integration and modularity, and of anatomical complexity in human evolution, typically focus on skeletal tissues. Here we provide the first network analysis of the musculoskeletal anatomy of both the fore- and hindlimbs of the two species of chimpanzee and of humans. Contra long-accepted ideas, network analysis reveals that the hindlimb displays a pattern opposite to that of the forelimb: the Pan big toe is typically seen as more independently mobile, but humans are actually the ones that have a separate module exclusively related to its movements. Different fore- vs. hindlimb patterns are also seen for anatomical network complexity (i.e., complexity in the arrangement of bones and muscles). For instance, the human hindlimb is as complex as that of chimpanzees, but the human forelimb is less complex than that of Pan. Importantly, in contrast to the analysis of morphological integration using morphometric approaches, network analyses do not support the prediction that forelimb and hindlimb are more dissimilar in species with functionally divergent limbs, such as bipedal humans.

arXiv.org e-Print Archive

Viral population estimation using pyrosequencing

Author: A Dempster
A Rambaut
AMN Tsibris
B Gaschen
Baback Gharizadeh
C Wang
Chunlin Wang
D O'Meara
DC Douek
E Domingo
E Halperin
EH Simpson
ES Lander
Glenn Tesler
GS Gottlieb
GW Tyson
H Fakhrai-Rad
I Malet
IM Rouzine
J Kececioglu
JE Hopcroft
JF Simons
K Chen
KJ Metzner
L Bacheler
L Doukhan
L Excoffier
Lior Pachter
LR Ford
M Breitbart
M Eigen
M Margulies
M Stephens
MA Nowak
MJ Gonzales
ML Collins
ML Sogin
Mostafa Ronaghi
MT Tammi
N Beerenwinkel
Nicholas Eriksson
Niko Beerenwinkel
P Jenkins
PA Pevzner
R Schmid
R Shankarappa
Robert W. Shafer
RP Dilworth
S Huse
S-Y Rhee
Soo-Yon Rhee
VA Johnson
Yumi Mitsuya
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2008

The diversity of virus populations within single infected hosts presents a major difficulty for the natural immune response as well as for vaccine design and antiviral drug therapy. Recently developed pyrophosphate-based sequencing technologies (pyrosequencing) can be used for quantifying this diversity by ultra-deep sequencing of virus samples. We present computational methods for the analysis of such sequence data and apply these techniques to pyrosequencing data obtained from HIV populations within patients harboring drug-resistant virus strains. Our main result is the estimation of the population structure of the sample from the pyrosequencing reads. This inference is based on a statistical approach to error correction, followed by a combinatorial algorithm for constructing a minimal set of haplotypes that explain the data. Using this set of explaining haplotypes, we apply a statistical model to infer the frequencies of the haplotypes in the population via an EM algorithm. We demonstrate that pyrosequencing reads allow for effective population reconstruction by extensive simulations and by comparison to 165 sequences obtained directly from clonal sequencing of four independent, diverse HIV populations. Thus, pyrosequencing can be used for cost-effective estimation of the structure of virus populations, promising new insights into viral evolutionary dynamics and disease control strategies.
Comment: 23 pages, 13 figures
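The frequency-estimation step mentioned in this abstract can be illustrated with a minimal EM sketch. This is not the authors' statistical model: it assumes error-free reads, assumes that the set of haplotypes each read is compatible with has already been computed, and treats a read as equally likely under every compatible haplotype. The function name `em_haplotype_frequencies` and the toy input are hypothetical.

```python
def em_haplotype_frequencies(compat, n_haplotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM (illustrative sketch only).

    compat: one collection of haplotype indices per read, listing the
            haplotypes that read is compatible with (assumed precomputed).
    """
    # Reads compatible with no haplotype carry no information; drop them.
    reads = [set(haps) for haps in compat if haps]
    if not reads:
        raise ValueError("no read is compatible with any haplotype")

    freqs = [1.0 / n_haplotypes] * n_haplotypes  # uniform starting point
    for _ in range(n_iter):
        # E-step: split each read among its compatible haplotypes,
        # in proportion to the current frequency estimates.
        counts = [0.0] * n_haplotypes
        for haps in reads:
            total = sum(freqs[h] for h in haps)
            for h in haps:
                counts[h] += freqs[h] / total
        # M-step: new frequencies are the normalized expected counts.
        new_freqs = [c / len(reads) for c in counts]
        converged = max(abs(a - b) for a, b in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


# Toy data (made up): two reads match only haplotype 0, one matches
# only haplotype 1, and one is ambiguous between the two.
toy_reads = [{0}, {0}, {0, 1}, {1}]
estimated = em_haplotype_frequencies(toy_reads, n_haplotypes=2)
print(estimated)
```

On the toy input the ambiguous read is shared between the two haplotypes according to the current estimates, so the frequencies settle near 2/3 and 1/3.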

CiteSeerX

Repository for Publications and Research Data

Caltech Authors

Detection of lineage-specific evolutionary changes among primate species

Background: Comparison of the human genome with other primates offers the opportunity to detect evolutionary events that created the diverse phenotypes among the primate species. Because the primate genomes are highly similar to one another, methods developed for analysis of more divergent species do not always detect signs of evolutionary selection.
Results: We have developed a new method, called DivE, specifically designed to find regions that have evolved either more or less rapidly than expected, for any clade within a set of very closely related species. Unlike some previous methods, DivE does not rely on rates of synonymous and nonsynonymous substitution, which enables it to detect evolutionary events in noncoding regions. We demonstrate using simulated data that DivE compares favorably to alternative methods, and we then apply DivE to the ENCODE regions in 14 primate species. We identify thousands of regions in these primates, ranging from 50 to >10000 bp in length, that appear to have experienced either constrained or accelerated rates of evolution. In particular, we detected 4942 regions that have potentially undergone positive selection in one or more primate species. Most of these regions occur outside of protein-coding genes, although we identified 20 proteins that have experienced positive selection.
Conclusions: DivE provides an easy-to-use method to predict both positive and negative selection in noncoding DNA and is particularly well suited to detecting lineage-specific selection in large genomes.

Digital Repository at the University of Maryland

Local conservation scores without a priori assumptions on neutral substitution rates

Author: A Diallo
A Siepel
A Wang
D Karolchik
DM McGaughey
E Check
E Dermitzakis
E Margulies
E Rivas
EH Margulies
G Bejerano
GM Cooper
J Felsenstein
J Hagenauer
J Kim
Jakob C Mueller
Janis Dingel
Joachim Hagenauer
Jürgen Zech
M Blanchette
M Kamal
M Pheasant
N Stojanovic
Niccolò Leonardi
PAP Moran
Pavol Hanus
R Nielsen
RC Hardison
RM Phatarfod
S Asthana
S Whelan
The ENCODE Project Consortium
TM Cover
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Comparative genomics aims to detect signals of evolutionary conservation as an indicator of functional constraint. Surprisingly, results of the ENCODE project revealed that about half of the experimentally verified functional elements found in non-coding DNA were classified as unconstrained by computational predictions. Following this observation, it has been hypothesized that this may be partly explained by biased estimates of the neutral evolutionary rates used by existing sequence conservation metrics. All methods we are aware of rely on a comparison with the neutral rate, and conservation is estimated by measuring the deviation of a particular genomic region from this rate. Consequently, it is a reasonable assumption that inaccurate neutral rate estimates may lead to biased conservation and constraint estimates.
Results: We propose a conservation signal that is produced by local Maximum Likelihood estimation of evolutionary parameters using an optimized sliding window, and we present a Kullback-Leibler projection that allows multiple different estimated parameters to be transformed into a conservation measure. This conservation measure does not rely on assumptions about neutral evolutionary substitution rates, and few a priori assumptions about the properties of the conserved regions are imposed. We show the accuracy of our approach (KuLCons) on synthetic data and compare it to the scores generated by state-of-the-art methods (phastCons, GERP, SCONE) in an ENCODE region. We find that KuLCons is most often in agreement with the conservation/constraint signatures detected by GERP and SCONE, while qualitatively very different patterns from phastCons are observed. In contrast to standard methods, KuLCons can be extended to more complex evolutionary models, e.g. by taking insertion and deletion events into account. Corresponding results show that scores obtained under this model can diverge significantly from scores obtained using the simpler model.
Conclusion: Our results suggest that discriminating among the different degrees of conservation is possible without making assumptions about neutral rates. We find, however, that the approach cannot be expected to discover constrained regions considerably different from those found by GERP and SCONE. Consequently, we conclude that the reported discrepancies between experimentally verified functional elements and computationally identified constrained elements are unlikely to be explained by biased neutral rate estimates.

MPG.PuRe

Anatomical Network Comparison of Human Upper and Lower, Newborn and Adult, and Normal and Abnormal Limbs, with Notes on Development, Pathology and Limb Serial Homology vs. Homoplasy

Author: A Goswami
A Porto
AM Leroi
B Esteve-Altava
B Hallgrimsson
B Villmoare
BA Barash
BA Villmoare
BL Shapiro
Borja Esteve-Altava
C Rolian
C Smith
CF Ross
Christopher Smith
CR Bardeen
D Rasskin-Gutman
DE Lieberman
Diego Rasskin-Gutman
DJ Watts
E Ravasz
EC Olson
EH Margulies
EN Lorenz
G Bello‐Hellegouarch
G Csardi
G Marroig
GB Müller
GP Wagner
J Hodin
J Zhu
JA Clack
JM Cheverud
Julia C. Boughner
Junming Yue
K Sears
KL Lewton
KS Crider
L Danon
LJ Daston
LR Monteiro
M Bastir
M Buchanan
M Fabrezi
MC Gondré-Lewis
MD Humphries
MI Coates
ML Moss
ML Zelditch
NH Shubin
NM Young
P Alberch
P Mitteroecker
P Pons
PL Reno
R Cornette
R Diogo
RA Raff
RR Ackermann
Rui Diogo
S Raspopovic
SJ Pont
SN Reid
VL Roth
W Bateson
WR Atchley
Publication venue: Public Library of Science (PLoS)
Publication date: 09/10/2015

How do the various anatomical parts (modules) of the animal body evolve into very different integrated forms (integration) yet still function properly without decreasing the individual's survival? This long-standing question remains unanswered for multiple reasons, including lack of consensus about conceptual definitions and approaches, as well as a reasonable bias toward the study of hard tissues over soft tissues. A major difficulty concerns the non-trivial technical hurdles of addressing this problem, specifically the lack of quantitative tools to quantify and compare variation across multiple disparate anatomical parts and tissue types. In this paper we apply for the first time a powerful new quantitative tool, Anatomical Network Analysis (AnNA), to examine and compare in detail the musculoskeletal modularity and integration of normal and abnormal human upper and lower limbs. In contrast to other morphological methods, the strength of AnNA is that it allows efficient and direct empirical comparisons among body parts with even vastly different architectures (e.g. upper and lower limbs) and diverse or complex tissue composition (e.g. bones, cartilages and muscles), by quantifying the spatial organization of these parts (their topological patterns relative to each other) using tools borrowed from network theory. Our results reveal similarities between the skeletal networks of the normal newborn/adult upper limb vs. lower limb, with the exception of the shoulder vs. pelvis. However, when muscles are included, the overall musculoskeletal network organization of the upper limb is strikingly different from that of the lower limb, particularly that of the more proximal structures of each limb. Importantly, the obtained data provide further evidence to be added to the vast amount of paleontological, gross anatomical, developmental, molecular and embryological data recently obtained that contradicts the long-standing dogma that the upper and lower limbs are serial homologues.
In addition, the AnNA of the limbs of a trisomy 18 human fetus strongly supports Pere Alberch's ill-named "logic of monsters" hypothesis, and contradicts the commonly accepted idea that birth defects often lead to lower integration (i.e. more parcellation) of anatomical structures.

FigShare

Analysis of Transposon Interruptions Suggests Selection for L1 Elements on the X Chromosome

Author: A Pavlicek
AB Conley
AFA Smit
AO Urrutia
C Chureau
C Feschotte
C Simons
CB Lowe
CI Castillo-Davis
CJ Brown
CM Bergman
DA Petrov
DE Symer
Dmitri A. Petrov
E Axelsson
E Birney
EA Montgomery
EH Margulies
G Abrusan
G Bejerano
G Churakov
G Lev-Maor
György Abrusán
IK Jordan
J Giordano
J Hasler
J Jurka
JA Bailey
JC Chow
JFY Brookfield
JK Pace
JO Kriegs
Joti Giordano
K Han
K Ng
K Plath
L Carrel
L Duret
L Marino-Ramirez
LL Handley
LN van de Lagemaat
M Dewannieux
M Hackenberg
MF Lyon
MT Ross
MT Webster
N Gal-Mark
N Gilbert
N Sela
P Medstrand
P Polak
Peter E. Warburton
PF Arndt
S Boissinot
SK Sen
TS Mikkelsen
Z Wang
Publication venue: Public Library of Science
Publication date: 01/01/2008

It has been hypothesised that the massive accumulation of L1 transposable elements on the X chromosome is due to their function in X inactivation, and that the accumulation of Alu elements near genes is adaptive. We tested the possible selective advantage of these two transposable element (TE) families with a novel method, interruption analysis. In mammalian genomes, a large number of TEs interrupt other TEs due to the high overall abundance and age of repeats, and these interruptions can be used to test whether TEs are selectively neutral. Interruptions of TEs that are beneficial for the host are expected to be deleterious and underrepresented compared with interruptions of neutral ones. We found that L1 elements in the regions of the X chromosome that contain the majority of the inactivated genes are significantly less frequently interrupted than on the autosomes, while L1s near genes that escape inactivation are interrupted with higher frequency, supporting the hypothesis that L1s on the X chromosome play a role in its inactivation. In addition, we show that TEs are less frequently interrupted in introns than in intergenic regions, probably due to selection against the expansion of introns, but the insertion pattern of Alus is comparable to that of other repeats.

CiteSeerX

Multiple organism algorithm for finding ultraconserved elements

Author: A Sandelin
A Siepel
A Woolfe
AL Delcher
B Ma
CF Cheung
D Gusfield
D Lawson
EA Glazov
EH Margulies
G Bejerano
Greg Madey
HW Mewes
JC Venter
JZ Ni
LD Stein
M Brudno
MI Abouelhoda
N Bray
Neil F Lobo
P Ferragina
RA Holt
S Kurtz
S Schwartz
Scott Christley
SF Altschul
T Tran
TJP Hubbard
U Manber
WJ Kent
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Ultraconserved elements are nucleotide or protein sequences with 100% identity (no mismatches, insertions, or deletions) in the same organism or between two or more organisms. Studies indicate that these conserved regions are associated with microRNAs, mRNA processing, development and transcription regulation. The identification and characterization of these elements among genomes is necessary for the further understanding of their functionality.
Results: We describe an algorithm and provide freely available software which can find all of the ultraconserved sequences between genomes of multiple organisms. Our algorithm takes a combinatorial approach that finds all sequences without requiring the genomes to be aligned. The algorithm is significantly faster than BLAST and is designed to handle very large genomes efficiently. We ran our algorithm on several large comparative analyses to evaluate its effectiveness; one compared 17 vertebrate genomes, where we find 123 ultraconserved elements longer than 40 bp shared by all of the organisms, and another compared the human body louse, Pediculus humanus humanus, against itself and select insects to find thousands of non-coding, potentially functional sequences.
Conclusion: Whole genome comparative analysis for multiple organisms is both feasible and desirable in our search for biological knowledge. We argue that bioinformatic programs should be forward thinking by assuming analysis on multiple (and possibly large) genomes in the design and implementation of algorithms. Our algorithm shows how a compromise design with a trade-off of disk space versus memory space allows for efficient computation while only requiring modest computer resources, and at the same time providing benefits not available with other software.
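To make the definition above concrete, the following brute-force sketch finds maximal exact substrings of a minimum length that occur in every input sequence, without aligning them. It is only an illustration of the problem statement and is not the algorithm described in the abstract, which is designed for very large genomes. The function name `shared_exact_regions` and the toy sequences are hypothetical, and reverse-complement matches are ignored.

```python
def shared_exact_regions(seqs, min_len):
    """Return maximal substrings of seqs[0] (length >= min_len) that occur
    with 100% identity in every other sequence.

    Brute-force illustration: suitable for short strings only.
    Each result is (start, end, substring) with end exclusive,
    using coordinates in the first (reference) sequence.
    """
    ref, others = seqs[0], seqs[1:]
    regions = []
    last_end = 0  # right edge of the most recently reported region
    for start in range(len(ref) - min_len + 1):
        end = start + min_len
        # Seed: the min_len-mer must be present in all other sequences.
        if not all(ref[start:end] in other for other in others):
            continue
        # Extend to the right while the longer substring is still shared.
        while end < len(ref) and all(ref[start:end + 1] in o for o in others):
            end += 1
        # Skip hits that lie entirely inside the previous region.
        if end > last_end:
            regions.append((start, end, ref[start:end]))
            last_end = end
    return regions


# Toy sequences (made up) sharing the exact 7-mer "ACGTACG".
toy_seqs = ["AAACGTACGTTT", "GGACGTACGGG", "CCCACGTACGA"]
print(shared_exact_regions(toy_seqs, min_len=5))
# -> [(2, 9, 'ACGTACG')]
```

The repeated substring tests make this quadratic or worse in sequence length, which is why a practical tool needs an indexed approach of the kind the abstract alludes to.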

Systematic identification of conserved motif modules in the human genome

Author: A Subramanian
A Visel
AL Donner
B Ren
CE Lawrence
CS Shashikant
DC King
DJ Galas
DS Johnson
E Eden
E Wingender
EH Davidson
EH Margulies
G Grahne
G Robertson
GD Stormo
GG Loots
GG Prefontaine
Haiyan Hu
HJ Bussemaker
J Han
J Hu
JC Knight
JD Hughes
KH Lee
L Narlikar
Lin Hou
M Blanchette
M Brudno
M Fried
M Gupta
MA Eid
Minghua Deng
MM Garner
Naifang Su
NB La Thangue
OV Kel-Margoulis
PR Stabach
Q Zhou
S Sinha
SA Sholl
TL Bailey
WW Wasserman
X Cai
X Li
X Zhang
Xiaohui Cai
Xiaoman Li
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The identification of motif modules, groups of multiple motifs frequently occurring in DNA sequences, is one of the most important tasks necessary for annotating the human genome. Current approaches to identifying motif modules are often restricted to searches within promoter regions or rely on multiple genome alignments. However, the promoter regions only account for a limited number of locations where transcription factor binding sites can occur, and multiple genome alignments often cannot align binding sites with their true counterparts because of the short and degenerate nature of these transcription factor binding sites.
Results: To identify motif modules systematically, we developed a computational method for the entire non-coding regions around human genes that does not rely upon the use of multiple genome alignments. First, we selected orthologous DNA blocks approximately 1 kilobase in length based on discontiguous sequence similarity. Next, we scanned the conserved segments in these blocks using known motifs in the TRANSFAC database. Finally, a frequent pattern mining technique was applied to identify motif modules within these blocks. In total, with a false discovery rate cutoff of 0.05, we predicted 3,161,839 motif modules, 90.8% of which are supported by various forms of functional evidence. Compared with experimental data from 14 ChIP-seq experiments, on average, our methods predicted 69.6% of the ChIP-seq peaks with TFBSs of multiple TFs. Our findings also show that many motif modules have distance preference and order preference among the motifs, which further supports the functionality of these predictions.
Conclusions: Our work provides a large-scale prediction of motif modules in mammals, which will facilitate the understanding of gene regulation in a systematic way.
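The final step, frequent pattern mining over motifs found in each block, can be sketched as a simple support count over motif combinations. This is a generic illustration and not the authors' pipeline: it enumerates combinations exhaustively instead of using an optimized mining algorithm, it applies a raw support threshold instead of a false discovery rate cutoff, and it ignores motif order and spacing. The function name `frequent_motif_modules`, the `max_size` limit, and the toy blocks are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_motif_modules(blocks, min_support, max_size=3):
    """Count motif combinations that co-occur in at least min_support blocks.

    blocks: iterable of collections of motif names, one per DNA block
            (the motif scan is assumed to have been done already).
    Returns a dict mapping each sorted motif tuple to its block count.
    """
    counts = Counter()
    for motifs in blocks:
        items = sorted(set(motifs))  # each motif counted once per block
        for size in range(2, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


# Toy input (made up): motif names stand in for hits from a motif scan.
toy_blocks = [
    {"SP1", "NFKB", "AP1"},
    {"SP1", "NFKB"},
    {"SP1", "AP1"},
    {"NFKB", "SP1", "GATA1"},
]
print(frequent_motif_modules(toy_blocks, min_support=3))
# -> {('NFKB', 'SP1'): 3}
```

Exhaustive enumeration grows quickly with the number of motifs per block, so a genome-scale analysis would need a pruning strategy such as Apriori or FP-growth.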

eScholarship - University of California

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

How accurately is ncRNA aligned within whole-genome multiple alignments?

Author: A Prakash
A Siepel
Adrienne X Wang
DA Pollard
E Rivas
E Torarinsson
EH Margulies
G Bourque
J Pei
JD Thompson
L Wang
M Blanchette
M Brudno
M Cline
M Errami
Martin Tompa
MS Rosenberg
S Batzoglou
S Griffiths-Jones
S Karlin
S Kumar
S Schwartz
S Washietl
SR Eddy
T Lassmann
W Miller
Walter L Ruzzo
WJ Kent
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Multiple alignment of homologous DNA sequences is of great interest to biologists since it provides a window into evolutionary processes. At present, the accuracy of whole-genome multiple alignments, particularly in noncoding regions, has not been thoroughly evaluated.
Results: We evaluate the alignment accuracy of certain noncoding regions using noncoding RNA alignments from Rfam as a reference. We inspect the MULTIZ 17-vertebrate alignment from the UCSC Genome Browser for all the human sequences in the Rfam seed alignments. In particular, we find 638 instances of chimeric and partial alignments to human noncoding RNA elements, of which at least 225 can be improved by straightforward means. As a byproduct of our procedure, we predict many novel instances of known ncRNA families that are suggested by the alignment.
Conclusion: MULTIZ does a fairly accurate job of aligning these genomes in these difficult regions. However, our experiments indicate that better alignments exist in some regions.