
53 research outputs found

Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization

Author: A Siepel
AS Schwartz
B Paten
B Rhead
Benedict Paten
C Lee
CN Dewey
David Haussler
DF Feng
G Myers
I Lumb
J Ma
JE Stajich
JS Pedersen
K Katoh
K Kryukov
K Liu
K Reinert
KM Roskin
Krishna M Roskin
M Blanchette
M Hasegawa
M Waterman
N Bray
P Di Tommaso
RC Edgar
RK Bradley
S Griffiths-Jones
S Schwartz
T Kim
U Tönges
W Gentzsch
WJ Kent
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Continuing research into the global multiple sequence alignment problem has resulted in more sophisticated and principled alignment methods. Unfortunately these new algorithms often require large amounts of time and memory to run, making it nearly impossible to run these algorithms on large datasets. As a solution, we present two general methods, Crumble and Prune, for breaking a phylogenetic alignment problem into smaller, more tractable sub-problems. We call Crumble and Prune meta-alignment methods because they use existing alignment algorithms and can be used with many current alignment programs. Crumble breaks long alignment problems into shorter sub-problems. Prune divides the phylogenetic tree into a collection of smaller trees to reduce the number of sequences in each alignment problem. These methods are orthogonal: they can be applied together to provide better scaling in terms of sequence length and in sequence depth. Both methods partition the problem such that many of the sub-problems can be solved independently. The results are then combined to form a solution to the full alignment problem. Results: Crumble and Prune each provide a significant performance improvement with little loss of accuracy. In some cases, a gain in accuracy was observed. Crumble and Prune were tested on real and simulated data. Furthermore, we have implemented a system called Job-tree that allows hierarchical sub-problems to be solved in parallel on a compute cluster, significantly shortening the run-time. Conclusions: These methods enabled us to solve gigabase alignment problems. These methods could enable a new generation of biologically realistic alignment algorithms to be applied to real world, large scale alignment problems.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Batrachochytrium dendrobatidis Shows High Genetic Diversity and Ecological Niche Specificity among Haplotypes in the Maya Mountains of Belize

Author: CG Becker
CJ Briggs
D Rödder
DG Boyle
E Rodriguez
EA Morehouse
GB Pauly
J Voyles
JA Swets
Jason E. Stajich
JAT Morgan
John Pollinger
JP Gaertner
JR Mendelson
JR Rohr
JS Piotrowski
K Goka
K Kaiser
KA Murray
KR Lips
Kristine Kaiser
L Berger
LF Skerratt
MC Fisher
R Puschendorf
RA Farrer
RG Pearson
RWR Retallick
S Federici
SGM Bridgewater
SJ Phillips
SR Ron
TL Cheng
TWJ Garner
V Kantzoura
Publication venue: Public Library of Science
Publication date: 01/01/2011

The amphibian pathogen Batrachochytrium dendrobatidis (Bd) has been implicated in amphibian declines around the globe. Although it has been found in most countries in Central America, its presence has never been assessed in Belize. We set out to determine the range, prevalence, and diversity of Bd using quantitative PCR (qPCR) and sequencing of a portion of the 5.8S and ITS1-2 regions. Swabs were collected from 524 amphibians of at least 26 species in the protected areas of the Maya Mountains of Belize. We sequenced a subset of 72 samples that had tested positive for Bd by qPCR at least once; 30 samples were verified as Bd. Eight unique Bd haplotypes were identified in the Maya Mountains, five of which were previously undescribed. We identified unique ecological niches for the two most broadly distributed haplotypes. Combined with data from other studies showing that virulence differs among strains, the 5.8S - ITS1-2 region diversity found in this study suggests that there may be substantial differences among populations or haplotypes. Future work should focus on whether specific haplotypes for other genomic regions and possibly pathogenicity can be associated with haplotypes at this locus, as well as the integration of molecular tools with other ecological tools to elucidate the ecology and pathogenicity of Bd

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

The Deadly Chytrid Fungus: A Story of an Emerging Pathogen

Author: C Weldon
CJ Briggs
DC Woodhams
DS Blehert
DW Warnock
EA Morehouse
EB Rosenblum
EP Symonds
Erica Bree Rosenblum
EW Lamirande
Hiten D. Madhani
J Voyles
Jamie Voyles
Jason E. Stajich
JD Kirshtein
JE Longcore
JJL Rowley
JS Piotrowski
K Goka
KR Lips
L Berger
L Schloegel
MC Fisher
P Daszak
RN Harris
RWR Retallick
SF Walker
SN Stuart
Thomas J. Poorten
TY James
Publication venue: Public Library of Science
Publication date: 01/01/2010

[Extract] Emerging infectious diseases present a great challenge for the health of both humans and wildlife. The increasing prevalence of drug-resistant fungal pathogens in humans [1] and recent outbreaks of novel fungal pathogens in wildlife populations [2] underscore the need to better understand the origins and mechanisms of fungal pathogenicity. One of the most dramatic examples of fungal impacts on vertebrate populations is the effect of the amphibian disease chytridiomycosis, caused by the chytrid fungus Batrachochytrium dendrobatidis (Bd). Amphibians around the world are experiencing unprecedented population losses and local extinctions [3]. While there are multiple causes of amphibian declines, many catastrophic die-offs are attributed to Bd [4],[5]. The chytrid pathogen has been documented in hundreds of amphibian species, and reports of Bd's impact on additional species and in additional geographic regions are accumulating at an alarming rate (e.g., see http://www.spatialepidemiology.net/bd). Bd is a microbial, aquatic fungus with distinct life stages. The motile stage, called a zoospore, swims using a flagellum and initiates the colonization of frog skin. Within the host epidermal cells, a zoospore forms a spherical thallus, which matures and produces new zoospores by dividing asexually, renewing the cycle of infection when zoospores are released to the skin surface (Figure 1). Bd is considered an emerging pathogen, discovered and described only a decade ago [6],[7]. Despite intensive ecological study of Bd over the last decade, a number of unanswered questions remain. Here we summarize what has been recently learned about this lethal pathogen

ResearchOnline@JCU

Crossref

Directory of Open Access Journals

ResearchOnline at James Cook University

PubMed Central

eScholarship - University of California

A Comparison of Phylogenetic Network Methods Using Computer Simulation

Author: A Rzhetsky
A Shioura
AR Templeton
B Holland
B Rannala
BA Schaal
BME Moret
D Posada
David Posada
DF Robinson
DH Huson
DL Swofford
DM Hillis
FT Bakker
G Cardona
G Jin
HJ Bandelt
I Cassens
Jason E. Stajich
JS Song
KA Crandall
Keith A. Crandall
L Excoffier
LL Cavalli-Sforza
M Clement
M Forster
M Pagel
M Perez-Losada
MH Schierup
MK Kuhner
N Nguyen
N Saitou
RC Griffiths
RR Hudson
S Schneider
S Wain-Hobson
Steven M. Woolley
TH Jukes
W-H Li
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2008

Background: We present a series of simulation studies that explore the relative performance of several phylogenetic network approaches (statistical parsimony, split decomposition, union of maximum parsimony trees, neighbor-net, simulated history recombination upper bound, median-joining, reduced median joining and minimum spanning network) compared to standard tree approaches (neighbor-joining and maximum parsimony) in the presence and absence of recombination. Principal Findings: In the absence of recombination, all methods recovered the correct topology and branch lengths nearly all of the time when the substitution rate was low, except for minimum spanning networks, which did considerably worse. At a higher substitution rate, maximum parsimony and union of maximum parsimony trees were the most accurate. With recombination, the ability to infer the correct topology was halved for all methods and no method could accurately estimate branch lengths. Conclusions: Our results highlight the need for more accurate phylogenetic network methods and the importance of detecting and accounting for recombination in phylogenetic studies. Furthermore, we provide useful information for choosing a network algorithm and a framework in which to evaluate improvements to existing methods and nove

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

Characterization of killer immunoglobulin-like receptor genetics and comprehensive genotyping by pyrosequencing in rhesus macaques

Author: AA Bashirova
AD Colantonio
Anna J Moreland
Benjamin N Bimber
BN Bimber
C McErlean
C Rosner
C Vilches
C Witt
CM Gardiner
D Hansen
D O'Connor
D Sharma
David H O'Connor
G Alter
H Li
HG Shilling
J Loffredo
J Robinson
J Sambrook
JC Beck
JD Thompson
JE Stajich
JH Blokhuis
JS Miller
K Hershberger
K Hsu
Karl W Broman
KL Hershberger
L Moretta
LA Guethlein
Lisbeth A Guethlein
LL Lanier
M Carrington
M Martin
M Wilson
ML Budde
P Parham
Peter Parham
PH Kruse
R Keith Reeves
R Paul Johnson
R Rajalingam
RE Bontrop
RL Grendell
RW Wiseman
S Khakoo
S Kim
VR Bonagura
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Human killer immunoglobulin-like receptors (KIRs) play a critical role in governing the immune response to neoplastic and infectious disease. Rhesus macaques serve as important animal models for many human diseases in which KIRs are implicated; however, the study of KIR activity in this model is hindered by incomplete characterization of KIR genetics. Results: Here we present a characterization of KIR genetics in rhesus macaques (Macaca mulatta). We conducted a survey of KIRs in this species, identifying 47 novel full-length KIR sequences. Using this expanded sequence library to build upon previous work, we present evidence supporting the existence of 22 Mamu-KIR genes, providing a framework within which to describe macaque KIRs. We also developed a novel pyrosequencing-based technique for KIR genotyping. This method provides both comprehensive KIR genotype and frequency estimates of transcript level, with implications for the study of KIRs in all species. Conclusions: The results of this study significantly improve our understanding of macaque KIR genetic organization and diversity, with implications for the study of many human diseases that use macaques as a model. The ability to obtain comprehensive KIR genotypes is of basic importance for the study of KIRs, and can easily be adapted to other species. Together these findings both advance the field of macaque KIRs and facilitate future research into the role of KIRs in human disease.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes

Author: A Fedorov
AB Rose
AG Simpson
BJ Blencowe
Chris P. Ponting
CP Robert
DC Jeffares
E Mossel
ET Wang
Eugene V. Koonin
EV Koonin
F Denoeud
F Lejeune
F Rodriguez-Trelles
G Ast
G Neu-Yilik
H Keren
H Le Hir
HD Nguyen
IB Rogozin
Igor B. Rogozin
J Felsenstein
J Muller
JE Nixon
JE Stajich
JS Farris
L Carmel
L Collins
LK Fritz-Laylin
M Csuros
M Irimia
M Lynch
Miklos Csuros
MP Hoeppner
PJ Keeling
R Nielsen
S Vanacova
SM Adl
SW Roy
T Mourier
W Li
WK Hastings
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2010

Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Positive Selection in East Asians for an EDAR Allele that Enhances NF-κB Activation

Author: A Fischer
A Franbourg
A Fujimoto
A Kumar
BF Voight
CS Carlson
David Hughes
DF Conrad
DJ Headon
Emilie Hardouin
HL Norton
HM Cann
I Thesleff
Irina Pugach
J Hey
Jarosław Bryk
Jason E. Stajich
JL Kelley
JM Akey
JP Pollinger
JS Friedlaender
JZ Li
K Tang
KR Thornton
L Frisse
LB Barreiro
M Przeworski
M Soejima
M Yan
MA Beaumont
Mark Stoneking
MR Waters
N Chassaing
N Izagirre
NA Rosenberg
NJR Fagundes
O Lao
P Koppinen
PC Sabeti
Rainer Strotmann
RC Edgar
RL Lamason
S Myles
S Wright
SA Tishkoff
SE Ptak
Sean Myles
SH Williamson
T Bersaglieri
VA Botchkarev
Y Shimomura
Publication venue: Public Library of Science
Publication date: 01/01/2008

Genome-wide scans for positive selection in humans provide a promising approach to establish links between genetic variants and adaptive phenotypes. From this approach, lists of hundreds of candidate genomic regions for positive selection have been assembled. These candidate regions are expected to contain variants that contribute to adaptive phenotypes, but few of these regions have been associated with phenotypic effects. Here we present evidence that a derived nonsynonymous substitution (370A) in EDAR, a gene involved in ectodermal development, was driven to high frequency in East Asia by positive selection prior to 10,000 years ago. With an in vitro transfection assay, we demonstrate that 370A enhances NF-κB activity. Our results suggest that 370A is a positively selected functional genetic variant that underlies an adaptive human phenotype

Crossref

Directory of Open Access Journals

PubMed Central

Bournemouth University Research Online

MPG.PuRe

University of Huddersfield Repository

Explore Bristol Research

C-type lectin-like domains in Fugu rubripes

Author: A Amores
A Krogh
A Li
A McLysaght
A Nishiyama
A Sato
A Zimek
AC Mistry
AK Jones
AL Hughes
AN Zelensky
AP Spicer
AY Gracey
C Burge
CJ Bayne
DA Shagin
E Birney
E Staub
EM Schwarz
G Blanc
G Pluschke
GR Vasta
H Ohbayashi
H Oshiumi
H Sano
H Sekine
H Williams
H Zhang
I Letunic
IM Weiss
J Felsenstein
J Wittbrodt
JC Achenbach
JD Thompson
JE Stajich
JM Maglich
JS Conery
JS Taylor
K Azumi
K Drickamer
K Fujiki
K Khalturin
K Kobuke
K Mann
K Natarajan
K Ohtani
KH Wolfe
KV Ewart
L McGregor
L Vitved
M Clamp
M Lynch
M Nei
M Remm
M Robinson-Rechavi
M Trexler
MC Loewen
MS Clark
MY Matsuo
N Matsumoto
NC Brissett
NF Ng
PE Ahlberg
PJ Neame
R Sandford
RB Dodd
RC Richards
RD Dowell
S Aparicio
S Geider
S Kijimoto-Ochiai
S Kumar
S Ohno
S Tasumi
S Wong
SA Linehan
SE Lewis
SR Eddy
SR Kim
Susumu Ohno
SV Nair
T Hubbard
T Muta
T Szyperski
W Gronwald
Wen-Hsiung Li
WH Li
WI Weis
WM Yokoyama
X Gu
XQ Yu
Y Jiang
Y Van de Peer
Y Van Kooyk
Z Ning
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2004

BACKGROUND: Members of the C-type lectin domain (CTLD) superfamily are metazoan proteins functionally important in glycoprotein metabolism, mechanisms of multicellular integration and immunity. Three genome-level studies on human, C. elegans and D. melanogaster reported previously demonstrated almost complete divergence among invertebrate and mammalian families of CTLD-containing proteins (CTLDcps). RESULTS: We have performed an analysis of CTLD family composition in Fugu rubripes using the draft genome sequence. The results show that all but two groups of CTLDcps identified in mammals are also found in fish, and that most of the groups have the same members as in mammals. We failed to detect representatives for CTLD groups V (NK cell receptors) and VII (lithostathine), while the DC-SIGN subgroup of group II is overrepresented in Fugu. Several new CTLD-containing genes, highly conserved between Fugu and human, were discovered using the Fugu genome sequence as a reference, including a CSPG family member and an SCP-domain-containing soluble protein. A distinct group of soluble dual-CTLD proteins has been identified, which may be the first reported CTLDcp group shared by invertebrates and vertebrates. We show that CTLDcp-encoding genes are selectively duplicated in Fugu, in a manner that suggests an ancient large-scale duplication event. We have verified 32 gene structures and predicted 63 new ones, and make our annotations available through a distributed annotation system (DAS) server and their sequences as additional files with this paper. CONCLUSIONS: The vertebrate CTLDcp family was essentially formed early in vertebrate evolution and is completely different from the invertebrate families. Comparison of fish and mammalian genomes revealed three groups of CTLDcps and several new members of the known groups, which are highly conserved between fish and mammals, but were not identified in the study using only mammalian genomes. 
Despite limitations of the draft sequence, the Fugu rubripes genome is a powerful instrument for gene discovery and vertebrate evolutionary analysis. The composition of the CTLDcp superfamily in fish and mammals suggests that large-scale duplication events played an important role in the evolution of vertebrates

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The Australian National University

Analysis of the Basidiomycete Coprinopsis cinerea Reveals Conservation of the Core Meiotic Expression Program over Half a Billion Years of Evolution

Author: A Koshiyama
A Moldon
A Soukas
AC Gathman
AI Saeed
AJ Enright
Allen C. Gathman
ALY Pang
AM Schurko
Andreas Rechtsteiner
AP Mitchell
Claire Burns
CM Lake
DA Hosack
DS Johnston
EE Gerecke
F Chalmel
F Chen
F Cnudde
F Klein
G Thomas
G Valentine
G Wrobel
GF Richard
Harmit S. Malik
J Andrews
J Bahler
J Ernst
J Ma
J Mata
JA Young
Jason D. Lieb
Jason E. Stajich
JE Blair
JE Stajich
JFX Diffley
JL Gerton
JS Morey
JW Taylor
K Iwabata
K Juneau
K Sakaguchi
KA Henderson
L Li
LB Li
LC Seitz
Lorna Casselton
M Ashburner
M Celerin
M Primig
MA Ramesh
ME Futschik
ME Zolan
Miriam E. Zolan
N Hunter
NB Raju
NY Stassen
Oleksandr P. Savytskyy
P Heywood
Patricia J. Pukkila
PMB Medina
Q Xia
R Padmore
RC Gentleman
RK Sherwood
RM Adkins
S Chu
S Keeney
S Namekawa
S van Dongen
SA Redhead
Sarah K. Wilke
SB Malik
Sean E. Hanlon
SF Altschul
SL Page
SN Acharya
ST Merino
SY Hwang
T Kanda
T Nara
T Yamaguchi
U Kues
U Schlecht
V Reinke
VG Tusher
W Crismani
Walt W. Lilly
WR Pearson
WX Li
Y Watanabe
Z Bozdech
Z Wang
ZL He
Publication venue: Public Library of Science
Publication date: 01/01/2010

Coprinopsis cinerea (also known as Coprinus cinereus) is a multicellular basidiomycete mushroom particularly suited to the study of meiosis due to its synchronous meiotic development and prolonged prophase. We examined the 15-hour meiotic transcriptional program of C. cinerea, encompassing time points prior to haploid nuclear fusion through tetrad formation, using a 70-mer oligonucleotide microarray. As with other organisms, a large proportion (∼20%) of genes are differentially regulated during this developmental process, with successive waves of transcription apparent in nine transcriptional clusters, including one enriched for meiotic functions. C. cinerea and the fungi Saccharomyces cerevisiae and Schizosaccharomyces pombe diverged ∼500–900 million years ago, permitting a comparison of transcriptional programs across a broad evolutionary time scale. Previous studies of S. cerevisiae and S. pombe compared genes that were induced upon entry into meiosis; inclusion of C. cinerea data indicates that meiotic genes are more conserved in their patterns of induction across species than genes not known to be meiotic. In addition, we found that meiotic genes are significantly more conserved in their transcript profiles than genes not known to be meiotic, which indicates a remarkable conservation of the meiotic process across evolutionarily distant organisms. Overall, meiotic function genes are more conserved in both induction and transcript profile than genes not known to be meiotic. However, of 50 meiotic function genes that were co-induced in all three species, 41 transcript profiles were well-correlated in at least two of the three species, but only a single gene (rad50) exhibited coordinated induction and well-correlated transcript profiles in all three species, indicating that co-induction does not necessarily predict correlated expression or vice versa. Differences may reflect differences in meiotic mechanisms or new roles for paralogs.
Similarities in induction, transcript profiles, or both, should contribute to gene discovery for orthologs without currently characterized meiotic roles

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Carolina Digital Repository

Whole Genome Resequencing Reveals Natural Target Site Preferences of Transposable Elements in Drosophila melanogaster

Transposable elements are mobile DNA sequences that integrate into host genomes using diverse mechanisms with varying degrees of target site specificity. While the target site preferences of some engineered transposable elements are well studied, the natural target preferences of most transposable elements are poorly characterized. Using population genomic resequencing data from 166 strains of Drosophila melanogaster, we identified over 8,000 new insertion sites not present in the reference genome sequence that we used to decode the natural target preferences of 22 families of transposable element in this species. We found that terminal inverted repeat transposon and long terminal repeat retrotransposon families present clade-specific target site duplications and target site sequence motifs. Additionally, we found that the sequence motifs at transposable element target sites are always palindromes that extend beyond the target site duplication. Our results demonstrate the utility of population genomics data for high-throughput inference of transposable element targeting preferences in the wild and establish general rules for terminal inverted repeat transposon and long terminal repeat retrotransposon target site selection in eukaryotic genomes

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The University of Manchester - Institutional Repository

FigShare