17 research outputs found

rMotifGen: random motif generator for DNA and protein sequences

Author: A Bairoch
A Rambaut
AF Neuwald
AV Favorov
C Timothy Hardin
CE Lawrence
CT Hardin
CT Workman
E Coward
E Eskin
E Wingender
EP Xing
Eric C Rouchka
G Pavesi
G Thijs
GZ Hertz
H Matsuda
HJ van
J Hu
J Liu
JD Hughes
L Stein
M Tompa
MC Frith
ML Engle
PA Pevzner
RM Schwartz
S Sinha
TL Bailey
W Ao
W Thompson
WN Grundy
X Liu
Y Ponty
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Detection of short, subtle conserved motif regions within a set of related DNA or amino acid sequences can lead to discoveries about important regulatory domains such as transcription factor and DNA binding sites as well as conserved protein domains. To help assess motif detection algorithms on motifs with varying properties and levels of conservation, we have developed a computational tool, rMotifGen, with the sole purpose of generating a number of random DNA or protein sequences containing short sequence motifs. Each motif consensus can be user-defined, randomly generated, or created from a position-specific scoring matrix (PSSM). Insertions and mutations within these motifs are created according to user-defined parameters and substitution matrices. The resulting sequences can be helpful in mutational simulations and in testing the limits of motif detection algorithms. RESULTS: Two implementations of rMotifGen have been created, one providing a graphical user interface (GUI) for random motif construction, and the other serving as a command line interface. The second implementation has the added advantages of platform independence and of being callable in batch mode. rMotifGen was used to construct sample sets of sequences containing DNA motifs and amino acid motifs that were then tested against the Gibbs sampler and MEME packages. CONCLUSION: rMotifGen provides an efficient and convenient method for creating random DNA or amino acid sequences with a variable number of motifs, where the instance of each motif can be incorporated using a position-specific scoring matrix (PSSM) or by creating an instance mutated from its corresponding consensus using an evolutionary model based on substitution matrices. rMotifGen is freely available at: http://bioinformatics.louisville.edu/brg/rMotifGen/
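The idea of planting mutated motif instances into random background sequences can be illustrated with a minimal Python sketch. This is not rMotifGen's code or interface: the function names (mutate, plant_motifs), the uniform background, the uniform substitution model (in place of a substitution matrix), and the absence of insertions are all simplifying assumptions made for illustration.

```python
import random

DNA = "ACGT"

def mutate(consensus, rate, rng):
    """Return a copy of `consensus` in which each position is replaced,
    with probability `rate`, by a different base chosen uniformly.
    (Assumption: uniform substitutions, not a substitution matrix.)"""
    out = []
    for base in consensus:
        if rng.random() < rate:
            out.append(rng.choice([b for b in DNA if b != base]))
        else:
            out.append(base)
    return "".join(out)

def plant_motifs(n_seqs, seq_len, consensus, rate, seed=0):
    """Generate `n_seqs` uniform random DNA sequences of length `seq_len`,
    each carrying one planted motif instance at a random position.
    Returns a list of (sequence, start_position, instance) tuples."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_seqs):
        background = [rng.choice(DNA) for _ in range(seq_len)]
        instance = mutate(consensus, rate, rng)
        start = rng.randrange(seq_len - len(consensus) + 1)
        # Overwrite the background with the motif instance.
        background[start:start + len(instance)] = instance
        results.append(("".join(background), start, instance))
    return results

# Hypothetical usage: five sequences of length 50 with a 6-mer consensus.
for seq, start, instance in plant_motifs(5, 50, "TTGACA", rate=0.2, seed=1):
    print(start, instance, seq)
```

Because the planted positions are returned alongside the sequences, the output of a motif finder can be scored directly against the known answers.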

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Three allele combinations associated with Multiple Sclerosis

BACKGROUND: Multiple sclerosis (MS) is an immune-mediated disease of polygenic etiology. Dissection of its genetic background is a complex problem because of the combinatorial possibilities of gene-gene interactions. As genotyping methods improve throughput, approaches that can explore multigene interactions appropriately should lead to improved understanding of MS. METHODS: 286 unrelated patients with definite MS and 362 unrelated healthy controls of Russian descent were genotyped at polymorphic loci (including SNPs, repeat polymorphisms, and an insertion/deletion) of the DRB1, TNF, LT, TGFβ1, CCR5 and CTLA4 genes and TNFa and TNFb microsatellites. Carriership of each allele in patients and controls was compared by Fisher's exact test, and disease-associated combinations of alleles in the data set were sought using a Bayesian Markov chain Monte Carlo-based method recently developed by our group. RESULTS: We identified two previously unknown MS-associated tri-allelic combinations: -509TGFβ1*C, DRB1*18(3), CTLA4*G and -238TNF*B1,-308TNF*A2, CTLA4*G, which perfectly separate MS cases from controls, at least in the present sample. The previously described DRB1*15(2) allele, the microsatellite TNFa9 allele and the biallelic combination CCR5Δ32, DRB1*04 were also reidentified as MS-associated. CONCLUSION: These results represent an independent validation of MS association with DRB1*15(2) and TNFa9 in Russians and are the first to find the interplay of three loci in conferring susceptibility to MS. They demonstrate the efficacy of our approach for the identification of complex-disease-associated combinations of alleles.
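The per-allele comparison named in the methods, Fisher's exact test on a 2x2 carriership table, can be sketched in plain Python as follows. This is only an illustration of the test itself, not the authors' pipeline or their Bayesian MCMC method. The function name is an assumption, and the carrier counts in the usage lines are hypothetical (the abstract gives only the group sizes, 286 and 362).

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table

                 carriers   non-carriers
        cases       a            b
        controls    c            d

    The p-value sums the hypergeometric probabilities of all tables
    with the same margins that are no more likely than the observed one."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x carriers among cases, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical counts: 60 of 286 patients and 40 of 362 controls carry an allele.
p_value = fisher_exact_two_sided(60, 286 - 60, 40, 362 - 40)
print(p_value)
```

When many alleles are tested this way, the resulting p-values would normally need a multiple-testing correction; the abstract does not say which one, if any, was applied.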

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Catalytic transformations of supercoiled DNA as studied by flow linear dichroism technique

Author: Andoh T
Balasta L
Berger JM
Boles TC
Cozzarelli N
Crick FH
Favorov PV
Fleury F
Frank-Kamenetskii MD
Fuller FB
Gabibov AG
Gololobov GV
Hande KR
Iakubovskaia EA
Kjeldsen E
Lee CS
Lerman LS
Makarov VL
Morris SK
Norden B
Pulleyblank DE
Roca AI
Roca J
Sambrook J
Shuster AM
Swenberg CE
Upholt WB
Vologodskii AV
Xu R
Yoshida H
Publication venue: Wiley

Crossref

ArcA and AppY Antagonize IscR Repression of Hydrogenase-1 Expression under Anaerobic Conditions, Revealing a Novel Mode of O2

Author: Alvarez AF
Boyd D
Drapal N
Favorov AV
Fontecilla-Camps JC
Gaudu P
Giel JL
Hexter SV
Hidalgo E
Ku HH
Liu X
Lukey MJ
Mandin P
McGuire AM
Miller JH
Nesbit AD
Papenfort K
Pinske C
Richard DJ
Salmon KA
Schwartz CJ
Shepherd M
Yeo WS
Publication venue: American Society for Microbiology

Crossref

Assessing computational tools for the discovery of transcription factor binding sites.

Author: Alexander V Favorov
Andrei A Mironov
AV Favorov
Bart De Moor
C Burge
Christopher Workman
Chun Ye
CT Workman
E Eskin
E Wingender
Eleazar Eskin
G Pavesi
G Thijs
George M Church
Gert Thijs
Giulio Pavesi
Graziano Pesole
GZ Hertz
J Moult
J van Helden
Jacques van Helden
JD Hughes
M Ashburner
M Burset
M Régnier
Martin C Frith
Martin Tompa
Mathias Vandenbogaert
MC Frith
MG Reese
Mireille Régnier
Nan Li
Nicolas Simonis
P Pevzner
S Sinha
Saurabh Sinha
Timothy L Bailey
TL Bailey
Vsevolod J Makeev
W Ao
W James Kent
William Stafford Noble
Yutao Fu
Zhiping Weng
Zhou Zhu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2005

The prediction of regulatory elements is a problem where computational methods offer great hope. Over the past few years, numerous tools have become available for this task. The purpose of the current assessment is twofold: to provide some guidance to users regarding the accuracy of currently available tools in various settings, and to provide a benchmark of data sets for assessing future tools.

Lirias

Crossref

AIR Universita degli studi di Milano

HAL AMU

Archivio istituzionale della ricerca - Università di Bari

DI-fusion

University of Queensland eSpace

A study on the application of topic models to motif finding algorithms

Author: AV Favorov
CT Workman
D Blei
DM Blei
E Eskin
E Wingender
G Pavesi
G Thijs
GZ Hertz
I Abnizova
J Aitchison
J van Helden
JD Hughes
JJ Shu
Josep Basha Gutierrez
K Hornik
Kenta Nakai
M Burset
M Mitchell
M Régnier
M Tompa
MC Frith
MK Das
PA Pevzner
S Sinha
TL Bailey
W Ao
Publication venue: Springer Science and Business Media LLC

Crossref

PromoterSweep: a tool for identification of transcription factor binding sites

Author: A Sandelin
AG Pedersen
Agnes Hotz-Wagenblatt
AJ Vilella
AV Favorov
B Endre
B Lenhard
B Morgenstern
CD Schmid
CH Choi
Coral del Val
CT Workman
E Barta
Endre Barta
G Pavesi
G Robertson
GD Stormo
H Sun
Karl-Heinz Glatting
M Senger
M Tompa
Oliver Pelz
P Ernst
R Yamashita
S Sinha
S Sonnenburg
SF Altschul
SM Kielbasa
ST Smale
TL Bailey
VB Bajic
W Thompson
X Li
X Wang
X Xie
Publication venue: Springer Science and Business Media LLC

Crossref

A developed system based on nature-inspired algorithms for DNA motif finding process

Author: A Ouaaraba
AMA Tawab
AV Favorov
C Adami
C Lei
D Karaboga
MK Das
HK Dai
DK Agrafiotis
E Eskin
E Rashedi
Ebtehal S. Elewa
G Mauri
H Zheng
J Ding
J Hu
K Chandrasekaran
K Khan
Leonardo Mariño-Ramírez
M Brambilla
M Chawla
M Guerrero
M Regnier
Mai S. Mabrouk
Maoguo Gong
Mohamed B. Abdelhalim
N Li
NC Seeman
P Civicioglu
P D’haeseleer
R Bulatović
S Sinha
S Vijayvargiya
Seyedali Mirjalili
TL Bailey
V Bhargava
W Wei
XS Yang
Y Zhang
Publication venue: Springer Science and Business Media LLC

Crossref

Variants of the Coagulation and Inflammation Genes Are Replicably Associated with Myocardial Infarction and Epistatically Interact in Russians

Crossref

A fast weak motif-finding algorithm based on community detection in graphs

Author: A Tramonti
AA Sharov
ADS Cameron
AV Favorov
BK Cho
C Boucher
Caiyan Jia
CE Lawrence
CE Sammitt
CW Huang
D Nègre
E Eskin
G Pavesi
GE Crooks
GF Ames
GJ Li
GZ Hertz
H Salgado
HQ Sun
J Buhler
J Davila
J Hu
J Plumbridge
JD Hughes
Jian Yu
JL Lavrrar
JW Campbell
K Robison
L Elnitski
L Gang
M Girvan
M Ovelgonne
M Rosvall
M Tompa
Matthew B Carson
MF Sagot
MJ Newman
MK Das
ML Bulyk
O Danot
P Pevzner
PM McNicholas
PN Hengen
PP Kuksa
S Fortunato
S Georgiev
SA Gavigan
ST Jensen
T Schneiders
TL Bailey
UN Raghavan
VD Blondel
X Chen
Y Cui
Y Zhang
YL Chin
Publication venue: Springer Science and Business Media LLC
Publication date: 01/07/2013

BACKGROUND: Identification of transcription factor binding sites (also called ‘motif discovery’) in DNA sequences is a basic step in understanding genetic regulation. Although many successful programs have been developed, the problem is far from being solved on account of diversity in gene expression/regulation and the low specificity of binding sites. State-of-the-art algorithms have their own constraints (e.g., high time or space complexity for finding long motifs, low precision in identification of weak motifs, or the OOPS constraint: one occurrence of the motif instance per sequence) which limit their scope of application. RESULTS: In this paper, we present a novel and fast algorithm we call TFBSGroup. It is based on community detection from a graph and is used to discover long and weak (l,d) motifs under the ZOMOPS constraint (zero, one or multiple occurrence(s) of the motif instance(s) per sequence), where l is the length of a motif and d is the maximum number of mutations between a motif instance and the motif itself. Firstly, TFBSGroup transforms the (l, d) motif search in sequences to focus on the discovery of dense subgraphs within a graph. It identifies these subgraphs using a fast community detection method for obtaining coarse-grained candidate motifs. Next, it greedily refines these candidate motifs towards the true motif within their own communities. Empirical studies on synthetic (l, d) samples have shown that TFBSGroup is very efficient (e.g., it can find true (18, 6), (24, 8) motifs within 30 seconds). More importantly, the algorithm has succeeded in rapidly identifying motifs in a large data set of prokaryotic promoters generated from the Escherichia coli database RegulonDB. The algorithm has also accurately identified motifs in ChIP-seq data sets for 12 mouse transcription factors involved in ES cell pluripotency and self-renewal. 
CONCLUSIONS: Our novel heuristic algorithm, TFBSGroup, is able to quickly identify nearly exact matches for long and weak (l, d) motifs in DNA sequences under the ZOMOPS constraint. It is also capable of finding motifs in real applications. The source code for TFBSGroup can be obtained from http://bioinformatics.bioengr.uic.edu/TFBSGroup/
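The (l, d) motif definition used above, where an instance is any length-l window within d mismatches of the motif, can be made concrete with a short Python sketch. This is not TFBSGroup's community-detection algorithm: it only checks a known motif against sequences by brute force, and the function names are assumptions chosen for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def find_instances(motif, sequences, d):
    """For each sequence, list the start positions of every window of
    length l = len(motif) that differs from `motif` in at most d positions.
    A sequence may yield zero, one, or several hits (the ZOMOPS setting)."""
    l = len(motif)
    hits = {}
    for i, seq in enumerate(sequences):
        hits[i] = [s for s in range(len(seq) - l + 1)
                   if hamming(motif, seq[s:s + l]) <= d]
    return hits

# Toy (4, 1) example: one exact hit, no hit, and two one-mismatch hits.
seqs = ["TTACGTTT", "AAAAAAAA", "ACGAACTT"]
print(find_instances("ACGT", seqs, d=1))  # {0: [2], 1: [], 2: [0, 4]}
```

The hard part of motif discovery, which this sketch sidesteps, is that the motif itself is unknown; the brute-force check is useful mainly for validating candidates or scoring results on synthetic data.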

Crossref

PubMed Central

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)