28 research outputs found

SeqAn An efficient, generic C++ library for sequence analysis

Author: A Darling
A Fabri
A Halpern
Andreas Döring
C Notredame
D Butt
D Vandevoorde
David Weese
DS Hirschberg
EW Myers
EW Myers
G Myers
G Navarro
J Dutheil
J Kececioglu
J Stajich
JC Venter
K Czarnecki
K Mehlhorn
Knut Reinert
M Abouelhoda
M Abouelhoda
M Brudno
M Höhl
M Li
M Pocock
M Wilson
MH Austern
MH Overmars
MI Abouelhoda
N Saitou
O Gotoh
P Bieganski
P Weiner
R Giegerich
RJ Mural
S Burkhardt
S Burkhardt
S Kurtz
SB Needleman
SF Altschul
TH Cormen
Tobias Rausch
U Manber
W Vahrson
WR Pitt
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: The use of novel algorithmic techniques is pivotal to many important problems in life science. For example, the sequencing of the human genome [1] would not have been possible without advanced assembly algorithms. However, owing to the high speed of technological progress and the urgent need for bioinformatics tools, there is a widening gap between state-of-the-art algorithmic techniques and the actual algorithmic components of tools that are in widespread use. Results: To remedy this trend we propose the use of SeqAn, a library of efficient data types and algorithms for sequence analysis in computational biology. SeqAn comprises implementations of existing, practical state-of-the-art algorithmic components to provide a sound basis for algorithm testing and development. In this paper we describe the design and content of SeqAn and demonstrate its use by giving two examples. In the first example we show an application of SeqAn as an experimental platform by comparing different exact string matching algorithms. The second example is a simple version of the well-known MUMmer tool rewritten in SeqAn. Results indicate that our implementation is very efficient and versatile to use. Conclusion: We anticipate that SeqAn greatly simplifies the rapid development of new bioinformatics tools by providing a collection of readily usable, well-designed algorithmic components which are fundamental for the field of sequence analysis. This not only facilitates the implementation of new algorithms, but also enables a sound analysis and comparison of existing algorithms.
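
As a rough illustration of what "comparing different exact string matching algorithms" can look like, the sketch below sets a naive scan against a Horspool-style search. It is plain Python, not SeqAn's C++ API; the function names `naive_search` and `horspool_search` are made up for this example, and it says nothing about the benchmark actually reported in the paper.

```python
def naive_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]


def horspool_search(text, pattern):
    """Same result as naive_search, using Horspool's bad-character shift."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # For each character of the pattern except the last one, store the distance
    # from its rightmost occurrence to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    hits = []
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            hits.append(i)
        # Shift by the table entry of the text character aligned with the
        # pattern's last position; unseen characters allow a full-length jump.
        i += shift.get(text[i + m - 1], m)
    return hits


if __name__ == "__main__":
    genome = "GATTACAGATTACA"
    print(naive_search(genome, "ATTA"))
    print(horspool_search(genome, "ATTA"))
```

Both functions should report the same positions on any input, which makes them easy to cross-check before timing them against each other.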

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

PubMed Central

Systematic identification of conserved motif modules in the human genome

Author: A Subramanian
A Visel
AL Donner
B Ren
CE Lawrence
CS Shashikant
DC King
DJ Galas
DS Johnson
E Eden
E Wingender
EH Davidson
EH Margulies
G Grahne
G Robertson
GD Stormo
GG Loots
GG Prefontaine
Haiyan Hu
HJ Bussemaker
J Han
J Hu
JC Knight
JD Hughes
KH Lee
L Narlikar
Lin Hou
M Blanchette
M Blanchette
M Brudno
M Fried
M Gupta
MA Eid
Minghua Deng
MM Garner
Naifang Su
NB La Thangue
OV Kel-Margoulis
PR Stabach
Q Zhou
S Sinha
SA Sholl
TL Bailey
WW Wasserman
WW Wasserman
X Cai
X Li
X Zhang
Xiaohui Cai
Xiaoman Li
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The identification of motif modules, groups of multiple motifs frequently occurring in DNA sequences, is one of the most important tasks necessary for annotating the human genome. Current approaches to identifying motif modules are often restricted to searches within promoter regions or rely on multiple genome alignments. However, the promoter regions only account for a limited number of locations where transcription factor binding sites can occur, and multiple genome alignments often cannot align binding sites with their true counterparts because of the short and degenerate nature of these transcription factor binding sites. Results: To identify motif modules systematically, we developed a computational method for the entire non-coding regions around human genes that does not rely upon the use of multiple genome alignments. First, we selected orthologous DNA blocks approximately 1 kilobase in length based on discontiguous sequence similarity. Next, we scanned the conserved segments in these blocks using known motifs in the TRANSFAC database. Finally, a frequent pattern mining technique was applied to identify motif modules within these blocks. In total, with a false discovery rate cutoff of 0.05, we predicted 3,161,839 motif modules, 90.8% of which are supported by various forms of functional evidence. Compared with experimental data from 14 ChIP-seq experiments, on average, our methods predicted 69.6% of the ChIP-seq peaks with TFBSs of multiple TFs. Our findings also show that many motif modules have distance preference and order preference among the motifs, which further supports the functionality of these predictions. Conclusions: Our work provides a large-scale prediction of motif modules in mammals, which will facilitate the understanding of gene regulation in a systematic way.
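
The final step described above, mining frequently co-occurring motifs across blocks, can be sketched in a few lines. This is a brute-force toy, assuming each block has already been reduced to a list of motif names; the function name `frequent_motif_sets`, the `min_support` and `max_size` parameters, and the example motif labels are illustrative only. It does not reproduce the authors' mining technique or their false discovery rate procedure.

```python
from collections import Counter
from itertools import combinations


def frequent_motif_sets(blocks, min_support, max_size=3):
    """Count motif combinations (size 2..max_size) that co-occur in blocks.

    blocks: iterable of motif-name lists, one list per conserved block.
    Returns {sorted motif tuple: number of blocks containing all of them}
    for combinations found in at least min_support blocks.
    """
    counts = Counter()
    for motifs in blocks:
        unique = sorted(set(motifs))  # ignore repeated hits within a block
        for size in range(2, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    toy_blocks = [
        ["SP1", "YY1", "GATA1"],
        ["SP1", "YY1"],
        ["YY1", "GATA1"],
        ["SP1", "YY1", "GATA1"],
    ]
    for combo, n in sorted(frequent_motif_sets(toy_blocks, min_support=3).items()):
        print(combo, n)
```

Enumerating every combination is only workable for small motif sets per block; real frequent-pattern miners prune candidates whose subsets are already infrequent.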

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Mouse Transgenesis Identifies Conserved Functional Enhancers and cis-Regulatory Motif in the Vertebrate LIM Homeobox Gene Lhx2 Locus

Author: A Kolterud
A Kolterud
A Nagy
A Sandelin
A Woolfe
AI Su
AJ Bendall
AL Marat
Alison P. Lee
AP Lee
Byrappa Venkatesh
CA Gruber
DE Rincon-Limas
Domingos Henrique
DS Wilson
E Levivier
E Wandzioch
ES Monuki
F Poulin
FB Rahmatpanah
FD Porter
FJ Diaz-Benjumea
G Pavesi
G Tornqvist
H Kikuta
H Rhee
HF Wang
HJ Mangalam
HK Wu
HK Wu
I Ovcharenko
J Asp
J Hirota
J Wu
JA van den Hurk
JM Rhee
JT Shin
KA Frazer
LA Pennacchio
M Brudno
M Joksimovic
M Sironi
ME Zuber
MJL de Hoon
MW Perry
N Frankel
N Heisterkamp
O Karlsson
PA Gray
PA Gray
PD Allaire
PD Allaire
PJ Bailey
Q Li
R Kothary
RC Milewski
S Bulchand
S Guazzi
S Nakai
S Yun
SE Lundgren
SJ Chou
SS Blair
Sydney Brenner
T Rauch
T Werner
VS Mangale
WF Odenwald
X He
Y Jeong
Y Omori
Y Xu
Y Zhao
Publication venue: Public Library of Science
Publication date: 01/01/2011

The vertebrate Lhx2 is a member of the LIM homeobox family of transcription factors. It is essential for the normal development of the forebrain, eye, olfactory system and liver as well as for the differentiation of lymphoid cells. However, despite the highly restricted spatio-temporal expression pattern of Lhx2, nothing is known about its transcriptional regulation. In mammals and chicken, Crb2, Dennd1a and Lhx2 constitute a conserved linkage block, while the intervening Dennd1a is lost in the fugu Lhx2 locus. To identify functional enhancers of Lhx2, we predicted conserved noncoding elements (CNEs) in the human, mouse and fugu Crb2-Lhx2 loci and assayed their function in transgenic mice at E11.5. Four of the eight CNE constructs tested functioned as tissue-specific enhancers in specific regions of the central nervous system and the dorsal root ganglia (DRG), recapitulating partial and overlapping expression patterns of the Lhx2 and Crb2 genes. There was considerable overlap in the expression domains of the CNEs, which suggests that the CNEs are either redundant enhancers or regulate different genes in the locus. Using a large set of CNEs (810 CNEs) associated with transcription factor-encoding genes that are expressed predominantly in the central nervous system, we predicted four over-represented 8-mer motifs that are likely to be associated with expression in the central nervous system. Mutation of one of them in a CNE that drove reporter expression in the neural tube and DRG abolished expression in both domains, indicating that this motif is essential for expression in these domains. The failure of the four functional enhancers to recapitulate the complete expression pattern of Lhx2 at E11.5 indicates that there must be other Lhx2 enhancers that are either located outside the region investigated or divergent in mammals and fishes. Other approaches such as sequence comparison between multiple mammals are required to identify and characterize such enhancers.

Crossref

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

Pregnane X Receptor and Yin Yang 1 Contribute to the Differential Tissue Expression and Induction of CYP3A5 and CYP3A4

The hepato-intestinal induction of the detoxifying enzymes CYP3A4 and CYP3A5 by the xenosensing pregnane X receptor (PXR) constitutes a key adaptive response to oral drugs and dietary xenobiotics. In contrast to CYP3A4, CYP3A5 is additionally expressed in several, mostly steroidogenic organs, which creates potential for induction-driven disturbances of steroid homeostasis. Using cell lines and mice transgenic for a CYP3A5 promoter, we demonstrate that CYP3A5 expression in these organs is non-inducible and independent of PXR. Instead, it is enabled by the loss of a suppressing yin yang 1 (YY1)-binding site from the CYP3A5 promoter, which occurred in haplorrhine primates. This YY1 site is conserved in CYP3A4, but its inhibitory effect can be offset by PXR acting on response elements such as XREM. Taken together, the loss of the YY1 binding site from promoters of the CYP3A5 gene lineage during primate evolution may have enabled the utilization of CYP3A5 both in the adaptive hepato-intestinal response to xenobiotics and as a constitutively expressed gene in other organs. Our results thus constitute a first description of uncoupling induction from constitutive expression for a major detoxifying enzyme. They also suggest an explanation for the considerable tissue expression differences between CYP3A5 and CYP3A4.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Melanism in Peromyscus Is Caused by Independent Mutations in Agouti

Author: A Ludwig
BE Horner
BJ Norris
C Mayor
CA Cottle
CB Stewart
CC Steiner
CC Steiner
Christopher D. Wiley
CJ Bult
D Henrique
DC Bennett
DI Vage
DI Våge
DI Våge
DL Nagle
DL Stern
DS Lu
E Eizirik
E Theron
EA MacDougall-Shackleton
EJ Michaud
Evan P. Kingsley
G Wlasiuk
GS Barsh
H Klungland
H Vrieling
HA Orr
HB Cott
HE Hoekstra
HE Hoekstra
HE Hoekstra
HE Hoekstra
Hopi E. Hoekstra
JA Kerns
JB Gross
JBS Haldane
JMH Kijas
JR Kornegay
JS Mogil
Justin O. Borevitz
KM Hogan
L He
LK Phan
LM Turner
LS Robbins
M Brudno
Marie Manceau
MD Rausher
MD Shapiro
MD Shapiro
ME Protas
MEN Majerus
MK Ling
MM Dickie
MM Ollmann
MW Nachman
NI Mundy
NJ Nadeau
PC Baião
PJ Wittkopp
RH ffrench-Constant
RJ Miltenberger
RT Bronson
S Rieder
S Takeuchi
SB Carroll
SB Carroll
SE Millar
SI Candille
SJ Bultman
SW Baxter
T Hiragaki
T Kuramoto
TM Anderson
TM Gunn
V Hayssen
WE Howard
WH Osgood
WL Perry
YR Chen
Publication venue: Public Library of Science
Publication date: 01/07/2009

Identifying the molecular basis of phenotypes that have evolved independently can provide insight into the ways genetic and developmental constraints influence the maintenance of phenotypic diversity. Melanic (darkly pigmented) phenotypes in mammals provide a potent system in which to study the genetic basis of naturally occurring mutant phenotypes because melanism occurs in many mammals, and the mammalian pigmentation pathway is well understood. Spontaneous alleles of a few key pigmentation loci are known to cause melanism in domestic or laboratory populations of mammals, but in natural populations, mutations at one gene, the melanocortin-1 receptor (Mc1r), have been implicated in the vast majority of cases, possibly due to its minimal pleiotropic effects. To investigate whether mutations in this or other genes cause melanism in the wild, we investigated the genetic basis of melanism in the rodent genus Peromyscus, in which melanic mice have been reported in several populations. We focused on two genes known to cause melanism in other taxa, Mc1r and its antagonist, the agouti signaling protein (Agouti). While variation in the Mc1r coding region does not correlate with melanism in any population, in a New Hampshire population, we find that a 125-kb deletion, which includes the upstream regulatory region and exons 1 and 2 of Agouti, results in a loss of Agouti expression and is perfectly associated with melanic color. In a second population from Alaska, we find that a premature stop codon in exon 3 of Agouti is associated with a similar melanic phenotype. These results show that melanism has evolved independently in these populations through mutations in the same gene, and suggest that melanism produced by mutations in genes other than Mc1r may be more common than previously thought.

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

Assessing Computational Methods of Cis-Regulatory Module Prediction

Author: A Bruhat
A Siepel
A Sosinsky
A Visel
AB Rose
AG Clark
AL Halpern
AM Moses
B Prud'homme
B Shi
BK Peterson
BP Berman
BY Chan
Christina Leslie
CM Bergman
CM Bergman
D Kolbe
D Papatsenko
DA Kleinjan
DC King
DC King
DE Schones
DM Jeziorska
DS Johnson
E Birney
E Davidson
E Emberly
E Segal
E Wingender
G Bejerano
GM Euskirchen
H Wang
H Weintraub
JB Warner
Jing Su
JL Kabat
JR Stone
JS Jakobsen
KH Surinya
KJ Won
L Li
LP Lim
M Bieda
M Blanchette
M Brudno
M Hasegawa
MC Frith
MD Schroeder
MD Wilson
MS Halfon
MS Halfon
MZ Ludwig
N Bray
N Ghanem
N Gompel
N Pierstorff
ND Heintzman
ND Heintzman
O Hallikas
O Johansson
OV Kel-Margoulis
P Van Loo
PC FitzGerald
PJ Sabo
Q Zhou
Q Zhou
R Godbout
RP Zinzen
S Aerts
S Aerts
S Batzoglou
S Karlin
S MacArthur
S Richards
S Sinha
S Sinha
S Sinha
Sarah A. Teichmann
SC Parker
SE Celniker
T Sandmann
T Strachan
T Waleev
Thomas A. Down
TL Bailey
TM Williams
V Ferretti
V Gotea
W Krivan
WW Wasserman
X He
X He
XY Li
Publication venue: Public Library of Science
Publication date: 01/01/2010

Computational methods attempting to identify instances of cis-regulatory modules (CRMs) in the genome face a challenging problem of searching for potentially interacting transcription factor binding sites while knowledge of the specific interactions involved remains limited. Without a comprehensive comparison of their performance, the reliability and accuracy of these tools remain unclear. Faced with a large number of different tools that address this problem, we summarized and categorized them based on search strategy and input data requirements. Twelve representative methods were chosen and applied to predict CRMs from the Drosophila CRM database REDfly, and across the human ENCODE regions. Our results show that the optimal choice of method varies depending on species and composition of the sequences in question. When discriminating CRMs from non-coding regions, those methods considering evolutionary conservation have a stronger predictive power than methods designed to be run on a single genome. Different CRM representations and search strategies rely on different CRM properties, and different methods can complement one another. For example, some favour homotypical clusters of binding sites, while others perform best on short CRMs. Furthermore, most methods appear to be sensitive to the composition and structure of the genome to which they are applied. We analyze the principal features that distinguish the methods that performed well, identify weaknesses leading to poor performance, and provide a guide for users. We also propose key considerations for the development and evaluation of future CRM-prediction methods.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Intronic Cis-Regulatory Modules Mediate Tissue-Specific and Microbial Control of angptl4/fiaf Transcription

Author: A Cazes
A Galaup
A Georgiadi
A Koster
A Stark
AC Meireles-Filho
AJ Belanger
Amelia L. Jazwa
AN Ng
Andrew S. McCallion
AR Folsom
B Kutlu
B Thisse
B Wang
C Grootaert
C Thisse
CA Semple
CC Martin
CE Ng
CH Chao
Chad M. Trent
CK Fleissner
D Padua
D Panne
DS Chekmenev
E Beuling
E Davidson
E Gasteiger
EC Lee
EE Hare
EJ Flynn 3rd
F Backhed
F Backhed
F Backhed
G Bikopoulos
GP Hayhurst
H Staiger
H Wang
HA Field
HJ Flint
I Letunic
IK Quigley
J Chu
J Felsenstein
J Qin
J Schug
J. Gray Camp
JC Bryne
JC Jonas
JC Yoon
JE Gunton
JF Rawls
JF Rawls
JF Rawls
JG Camp
JM Bates
John F. Rawls
K Hama
K Kaddatz
K Kawakami
K Milligan-Myhre
KA Frazer
KL Tang
KN Wallace
L Aronsson
L Palanker
L Zeng
LN Pham
LV Hooper
M Brudno
M Haeussler
M Heinaniemi
M Kanther
M Levine
M Pack
M Shapira
MD Abramoff
MH Yau
MJ Borok
MP Verzi
MR Dusing
P Navratilova
P Zhu
R Pillai
RB Sartor
RC Edgar
RD Finn
RE Ley
S Bertrand
S Curado
S Fisher
S Fisher
S Gupta
S Hedges
S Kersten
S Mandard
S Romeo
SI Han
SK Koliwad
SM Pascal
T Tsujimura
TL Bailey
TL Bailey
U Desai
V Matys
WW Wasserman
X Lei
YY Goh
Z Wang
ZD Peng
Publication venue: Public Library of Science
Publication date: 01/01/2012

The intestinal microbiota enhances dietary energy harvest, leading to increased fat storage in adipose tissues. This effect is caused in part by the microbial suppression of intestinal epithelial expression of a circulating inhibitor of lipoprotein lipase called Angiopoietin-like 4 (Angptl4/Fiaf). To define the cis-regulatory mechanisms underlying intestine-specific and microbial control of Angptl4 transcription, we utilized the zebrafish system in which host regulatory DNA can be rapidly analyzed in a live, transparent, and gnotobiotic vertebrate. We found that zebrafish angptl4 is transcribed in multiple tissues including the liver, pancreatic islet, and intestinal epithelium, which is similar to its mammalian homologs. Zebrafish angptl4 is also specifically suppressed in the intestinal epithelium upon colonization with a microbiota. In vivo transgenic reporter assays identified discrete tissue-specific regulatory modules within angptl4 intron 3 sufficient to drive expression in the liver, pancreatic islet β-cells, or intestinal enterocytes. Comparative sequence analyses and heterologous functional assays of angptl4 intron 3 sequences from 12 teleost fish species revealed differential evolution of the islet and intestinal regulatory modules. High-resolution functional mapping and site-directed mutagenesis defined the minimal set of regulatory sequences required for intestinal activity. Strikingly, the microbiota suppressed the transcriptional activity of the intestine-specific regulatory module, as it does the endogenous angptl4 gene. These results suggest that the microbiota might regulate host intestinal Angptl4 protein expression and peripheral fat storage by suppressing the activity of an intestine-specific transcriptional enhancer. This study provides a useful paradigm for understanding how microbial signals interact with tissue-specific regulatory networks to control the activity and evolution of host gene transcription.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Carolina Digital Repository

A Wide Extent of Inter-Strain Diversity in Virulent and Vaccine Strains of Alphaherpesviruses

Author: A Bartha
A Cheung
A Dolan
A Dolan
A Jons
A Jons
A Sauerbrei
A Simmons
A Simon
A Viejo-Borbolla
A Walker
AA Fonseca Jr
AC Minson
AE Granstedt
AJ Bradley
AJ Davison
AJ Davison
AK Robbins
AL Amelio
AL Gielkens
AL Greninger
AL Schmidt
AM Arvin
AM Arvin
AM Arvin
AS Kaplan
B La Scola
B Langmead
B Lomniczi
B Lomniczi
B Roizman
BB Kaufer
BD Parker
BG Klupp
BG Klupp
BG Klupp
BS Mohl
C Cunningham
C Deback
C Vlcek
CB Hwang
CL Wu
CM Chau
D Chibo
D Curanovic
D Given
D Todd
DC Bloom
DJ Dargan
DJ McGeoch
DJ McGeoch
DJ McGeoch
DM Koelle
DM Koelle
DM Koelle
DR Bentley
DR Denver
DS Nikolic
E Buschiazzo
E Portales-Casamar
E Rekabdar
EA Petrovskis
EC Hahn
EE Brittle
ES Mocarski
EV Ball
F Grey
G Aston-Jones
G Badis
G Benson
G Hatfull
G Zhang
GA Peters
GA Smith
GF Richard
GR Bedadala
GW Luxton
H Kang
H Li
H Matsuura
HC Lee
HT Orr
I Gorzer
I Steiner
I Tempera
I Tempera
J Cheval
J Loh
J Nugent
JA Liljeqvist
JC Dohm
JF Kreuze
JH Weis
JI Lee
JJ Sasadeusz
JM DeMarchi
JM Dijkstra
Joel D. Baines
JR Brouwer
JT Robinson
JW Nicol
K Nakamura
K Okazaki
K Umene
K Wang
KA Frazer
KB Platt
Kevin J. Verstrepen
KJ Verstrepen
KT Jeang
L. W. Enquist
LA Pfister
Lance Parsons
LB Strick
LE Pomeranz
LS Christensen
LW Enquist
LW Enquist
M Al Rwahnih
M Backovic
M Brudno
M Chen
M Falkenberg
M Gao
M Legendre
M Legendre
M Lynch
M Mapelli
M Ramaswamy
Matthieu Legendre
MC Schatz
MD Vinces
ME Whealy
MG Lyman
MG Myers
MI Ekstrand
MI Thurston
MJ Wagner
ML Szpara
Moriah L. Szpara
MW Wathen
N Babic
N Renzette
NO Bianchi
O de Jesus
O Elemento
O Harismendy
P Medvedev
P Norberg
PA Bates
Q Chen
R Baer
R Gemayel
R Staden
R Staden
R Willemsen
RD Everett
RF Haff
RJ Watson
RJ Watson
RL Warren
RM Presti
RS Tirabassi
S Awasthi
S Awasthi
S Bekal
S Bottcher
S Bottcher
S Bottcher
S Krobitsch
S LaBoissiere
S Levy
S Taharaguchi
S Tyler
S Yoon
S. Rafi Shamim
SA Rezaee
SD Tyler
SF Altschul
SJ Spatz
SJ Spatz
SK Mittal
SL Oliver
SW Lee
T Ben-Porat
T Muller
TC Mettenleiter
TC Mettenleiter
TC Mettenleiter
TC Mettenleiter
TK Chowdary
TM White
W Fuchs
W Fuchs
W Stedman
WA Derbigny
WJ Muller
WO Ogle
Y Bao
Y Ushijima
Yolanda R. Tafuri
Publication venue: Public Library of Science
Publication date: 01/10/2011

Alphaherpesviruses are widespread in the human population, and include herpes simplex virus 1 (HSV-1) and 2, and varicella zoster virus (VZV). These viral pathogens cause epithelial lesions, and then infect the nervous system to cause lifelong latency, reactivation, and spread. A related veterinary herpesvirus, pseudorabies (PRV), causes similar disease in livestock that results in significant economic losses. Vaccines developed for VZV and PRV serve as useful models for the development of an HSV-1 vaccine. We present full genome sequence comparisons of the PRV vaccine strain Bartha, and two virulent PRV isolates, Kaplan and Becker. These genome sequences were determined by high-throughput sequencing and assembly, and present new insights into the attenuation of a mammalian alphaherpesvirus vaccine strain. We find many previously unknown coding differences between PRV Bartha and the virulent strains, including changes to the fusion proteins gH and gB, and over forty other viral proteins. Inter-strain variation in PRV protein sequences is much closer to levels previously observed for HSV-1 than for the highly stable VZV proteome. Almost 20% of the PRV genome contains tandem short sequence repeats (SSRs), a class of nucleic acid motifs whose length variation has been associated with changes in DNA binding site efficiency, transcriptional regulation, and protein interactions. We find SSRs throughout the herpesvirus family, and provide the first global characterization of SSRs in viruses, both within and between strains. We find SSR length variation between different isolates of PRV and HSV-1, which may provide a new mechanism for phenotypic variation between strains. Finally, we detected a small number of polymorphic bases within each plaque-purified PRV strain, and we characterize the effect of passage and plaque-purification on these polymorphisms.
These data add to growing evidence that even plaque-purified stocks of stable DNA viruses exhibit limited sequence heterogeneity, which likely seeds future strain evolution.
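
To make the notion of a tandem short sequence repeat concrete, here is a minimal regex-based scan for perfect tandem repeats. The thresholds (`max_unit=6`, `min_copies=3`) and the function name `find_ssrs` are assumptions chosen for illustration; the abstract does not state the criteria or software the authors used, and this sketch ignores imperfect repeats.

```python
import re


def find_ssrs(seq, max_unit=6, min_copies=3):
    """Find perfect tandem repeats in a DNA string.

    Returns a list of (start, unit, copies) tuples for non-overlapping runs
    in which a unit of 1..max_unit bases is repeated at least min_copies
    times back to back. At each position the shortest qualifying unit wins.
    """
    # Group 1 is the repeat unit (lazy, so shorter units are tried first);
    # \1{k,} then requires at least k further copies directly after it.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    for start, unit, copies in find_ssrs("TTACACACACGGCCCCCT"):
        print(start, unit, copies)
```

Comparing the `copies` value reported for the same locus in two strains is one simple way to express the length variation the abstract refers to.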

Lirias

Public Library of Science (PLOS)

Princeton University Open Access Repository

Crossref

Directory of Open Access Journals

PubMed Central