2,132 research outputs found

More robust detection of motifs in coexpressed genes by using phylogenetic information

Author: De Keersmaecker Sigrid CJ
De Moor Bart
Fadda Abeer A
Marchal Kathleen
Monsieurs Pieter
Thijs Gert
Vanderleyden Jozef
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Several motif detection algorithms have been developed to discover overrepresented motifs in sets of coexpressed genes. However, in a noisy gene list, the number of genes containing the motif versus the number lacking the motif might not be sufficiently high to allow detection by classical motif detection tools. To still recover motifs which are not significantly enriched but still present, we developed a procedure in which we use phylogenetic footprinting to first delineate all potential motifs in each gene. Then we mutually compare all detected motifs and identify the ones that are shared by at least a few genes in the data set as potential candidates. RESULTS: We applied our methodology to a compiled test data set containing known regulatory motifs and to two biological data sets derived from genome wide expression studies. By executing four consecutive steps of 1) identifying conserved regions in orthologous intergenic regions, 2) aligning these conserved regions, 3) clustering the conserved regions containing similar regulatory regions followed by extraction of the regulatory motifs and 4) screening the input intergenic sequences with detected regulatory motif models, our methodology proves to be a powerful tool for detecting regulatory motifs when a low signal to noise ratio is present in the input data set. Comparing our results with two other motif detection algorithms points out the robustness of our algorithm. CONCLUSION: We developed an approach that can reliably identify multiple regulatory motifs lacking a high degree of overrepresentation in a set of coexpressed genes (motifs belonging to sparsely connected hubs in the regulatory network) by exploiting the advantages of using both coexpression and phylogenetic information
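Step 4 of the abstract above, screening the input intergenic sequences with a detected motif model, can be illustrated with a position weight matrix (PWM) scan. This is a minimal sketch and not the authors' implementation: the function names, the uniform background of 0.25, the pseudocount, the score threshold and the example sites are all assumptions chosen for illustration.

```python
import math

def build_log_odds(sites, background=0.25, pseudocount=1.0):
    """Build a log-odds PWM from equal-length aligned sites (hypothetical helper)."""
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((counts[base] / total) / background) for base in "ACGT"}
        )
    return matrix

def scan(sequence, matrix, threshold):
    """Return (offset, window, score) for each window scoring at or above threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Illustrative aligned sites, assumed for this sketch only.
sites = ["TGACTA", "TGACTC", "TGAGTA", "TGACTA"]
pwm = build_log_odds(sites)
hits = scan("AAGCTGACTAGGCA", pwm, threshold=5.0)
for offset, window, score in hits:
    print(offset, window, round(score, 2))
```

In practice the threshold would be calibrated against a background model of the genome rather than fixed by hand.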

Springer - Publisher Connector

Directory of Open Access Journals

Ghent University Academic Bibliography

PubMed Central

A survey of DNA motif finding algorithms

Author: Dai Ho-Kwok
Das Modan K
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Unraveling the mechanisms that regulate gene expression is a major challenge in biology. An important task in this challenge is to identify regulatory elements, especially the binding sites in deoxyribonucleic acid (DNA) for transcription factors. These binding sites are short DNA segments that are called motifs. Recent advances in genome sequence availability and in high-throughput gene expression analysis technologies have allowed for the development of computational methods for motif finding. As a result, a large number of motif finding algorithms have been implemented and applied to various motif models over the past decade. This survey reviews the latest developments in DNA motif finding algorithms. Results: Earlier algorithms use promoter sequences of coregulated genes from single genome and search for statistically overrepresented motifs. Recent algorithms are designed to use phylogenetic footprinting or orthologous sequences and also an integrated approach where promoter sequences of coregulated genes and phylogenetic footprinting are used. All the algorithms studied have been reported to correctly detect the motifs that have been previously detected by laboratory experimental approaches, and some algorithms were able to find novel motifs. However, most of these motif finding algorithms have been shown to work successfully in yeast and other lower organisms, but perform significantly worse in higher organisms. Conclusion: Despite considerable efforts to date, DNA motif finding remains a complex challenge for biologists and computer scientists. Researchers have taken many different approaches in developing motif discovery tools and the progress made in this area of research is very encouraging. Performance comparison of different motif finding tools and identification of the best tools have proven to be a difficult task because tools are designed based on algorithms and motif models that are diverse and complex and our incomplete understanding of the biology of regulatory mechanism does not always provide adequate evaluation of underlying algorithms over motif models.

Springer - Publisher Connector

PubMed Central

The University of Arizona

SHAREOK repository

Positional clustering improves computational binding site detection and identifies novel cis-regulatory sites in mammalian GABA(A) receptor subunit genes

Author: Boris E. Shakhnovich
Charles DeLisi
Daniel S. Roberts
Shelley J. Russek
Timothy E. Reddy
Publication venue: Oxford University Press
Publication date: 03/01/2007

Understanding transcription factor (TF) mediated control of gene expression remains a major challenge at the interface of computational and experimental biology. Computational techniques predicting TF-binding site specificity are frequently unreliable. On the other hand, comprehensive experimental validation is difficult and time consuming. We introduce a simple strategy that dramatically improves robustness and accuracy of computational binding site prediction. First, we evaluate the rate of recurrence of computational TFBS predictions by commonly used sampling procedures. We find that the vast majority of results are biologically meaningless. However, clustering results based on nucleotide position improves predictive power. Additionally, we find that positional clustering increases robustness to long or imperfectly selected input sequences. Positional clustering can also be used as a mechanism to integrate results from multiple sampling approaches for improvements in accuracy over each one alone. Finally, we predict and validate regulatory sequences partially responsible for transcriptional control of the mammalian type A γ-aminobutyric acid receptor (GABA(A)R) subunit genes. Positional clustering is useful for improving computational binding site predictions, with potential application to improving our understanding of mammalian gene expression. In particular, predicted regulatory mechanisms in the mammalian GABA(A)R subunit gene family may open new avenues of research towards understanding this pharmacologically important neurotransmitter receptor system.
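The idea of clustering predictions by nucleotide position can be sketched as follows. This is an illustrative simplification and not the procedure from the paper: grouping by a fixed gap between sorted positions, the parameter names `max_gap` and `min_support`, and the example predictions are all assumptions.

```python
from collections import defaultdict

def positional_clusters(predictions, max_gap=5, min_support=3):
    """Group predicted sites that recur at nearby positions in the same sequence.

    predictions: iterable of (sequence_id, position) pairs pooled from
    repeated or different sampling runs (hypothetical input format).
    Returns (sequence_id, first_position, last_position, support) tuples.
    """
    by_sequence = defaultdict(list)
    for seq_id, position in predictions:
        by_sequence[seq_id].append(position)

    clusters = []
    for seq_id, positions in by_sequence.items():
        positions.sort()
        current = [positions[0]]
        for position in positions[1:]:
            if position - current[-1] <= max_gap:
                current.append(position)  # still within the same cluster
            else:
                if len(current) >= min_support:
                    clusters.append((seq_id, current[0], current[-1], len(current)))
                current = [position]
        # Flush the last open cluster for this sequence.
        if len(current) >= min_support:
            clusters.append((seq_id, current[0], current[-1], len(current)))
    return clusters

# Made-up predictions: recurrent sites near 101-104 and 40-43, plus isolated noise.
predictions = [
    ("geneA", 102), ("geneA", 104), ("geneA", 101), ("geneA", 350),
    ("geneB", 40), ("geneB", 43), ("geneB", 41), ("geneB", 900),
]
clusters = positional_clusters(predictions)
print(clusters)
```

Isolated predictions are discarded because they fail the support cutoff, which mirrors the abstract's observation that recurrence at a position is more informative than any single sampling result.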

Crossref

Boston University Institutional Repository (OpenBU)

PubMed Central

Genome Biol.

Author: Brazma A.
Coulson R.
Manke T.
Palin K.
Sand O.
Ukkonen E.
van Helden J.
Vingron M.
Publication venue: Springer Science and Business Media LLC
Publication date: 30/01/2009

With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome

MPG.PuRe

Computational identification of transcriptional regulatory elements in DNA sequence

Author: GuhaThakurta Debraj
Publication venue: Oxford University Press
Publication date: 01/01/2006

Identification and annotation of all the functional elements in the genome, including genes and the regulatory sequences, is a fundamental challenge in genomics and computational biology. Since regulatory elements are frequently short and variable, their identification and discovery using computational algorithms is difficult. However, significant advances have been made in the computational methods for modeling and detection of DNA regulatory elements. The availability of complete genome sequence from multiple organisms, as well as mRNA profiling and high-throughput experimental methods for mapping protein-binding sites in DNA, have contributed to the development of methods that utilize these auxiliary data to inform the detection of transcriptional regulatory elements. Progress is also being made in the identification of cis-regulatory modules and higher order structures of the regulatory sequences, which is essential to the understanding of transcription regulation in the metazoan genomes. This article reviews the computational approaches for modeling and identification of genomic regulatory elements, with an emphasis on the recent developments, and current challenges

CiteSeerX

Crossref

PubMed Central

SIGffRid: A tool to search for sigma factor binding sites in bacterial genomes using comparative approach and biologically driven statistics

Background: Many programs have been developed to identify transcription factor binding sites. However, most of them are not able to infer two-word motifs with variable spacer lengths. This case is encountered for RNA polymerase Sigma (σ) Factor Binding Sites (SFBSs), usually composed of two boxes, called -35 and -10 in reference to the transcription initiation point. Our goal is to design an algorithm detecting SFBSs by using combinational and statistical constraints deduced from biological observations. Results: We describe a new approach to identify SFBSs by comparing two related bacterial genomes. The method, named SIGffRid (SIGma Factor binding sites Finder using R'MES to select Input Data), performs a simultaneous analysis of pairs of promoter regions of orthologous genes. SIGffRid uses a prior identification of over-represented patterns in whole genomes as selection criteria for potential -35 and -10 boxes. These patterns are then grouped using pairs of short seeds (of which one is possibly gapped), allowing a variable-length spacer between them. Next, the motifs are extended guided by statistical considerations, a feature that ensures a selection of motifs with statistically relevant properties. We applied our method to the pair of related bacterial genomes of Streptomyces coelicolor and Streptomyces avermitilis. Cross-check with the well-defined SFBSs of the SigR regulon in S. coelicolor is detailed, validating the algorithm. SFBSs for HrdB and BldN were also found, and the results suggested some new targets for these σ factors. In addition, consensus motifs for BldD and new SFBSs were defined, overlapping previously proposed consensuses. Relevant tests were also carried out on bacteria with moderate GC content (i.e. Escherichia coli/Salmonella typhimurium and Bacillus subtilis/Bacillus licheniformis pairs). Motifs of house-keeping σ factors were found as well as other SFBSs such as that of SigW in Bacillus strains. Conclusion: We demonstrate that our approach combining statistical and biological criteria was successful in predicting SFBSs. The method's versatility allows the recognition of other kinds of two-box regulatory sites.
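The notion of a two-box motif with a variable-length spacer can be illustrated with a simple exact-match search. This sketch is not SIGffRid, which selects over-represented seeds and extends them statistically; here the two boxes are given in advance, matching is exact, and the function name, the box strings and the spacer range are assumptions for illustration only.

```python
import re

def find_two_box_sites(sequence, box35, box10, min_spacer, max_spacer):
    """Return (start, spacer_length) for each exact two-box occurrence.

    A lookahead is used so that candidate sites starting at every position
    are examined, including ones that overlap each other.
    """
    pattern = re.compile(
        f"(?=({re.escape(box35)})([ACGT]{{{min_spacer},{max_spacer}}}?)({re.escape(box10)}))"
    )
    return [(m.start(), len(m.group(2))) for m in pattern.finditer(sequence)]

# Illustrative boxes and a synthetic promoter with a 17 bp spacer.
promoter = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
sites = find_two_box_sites(promoter, "TTGACA", "TATAAT", 15, 19)
print(sites)
```

Because the spacer length is returned with each hit, the distribution of spacer lengths across a genome can be inspected before any statistical filtering is applied.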

HAL - Lille 3

Crossref

Directory of Open Access Journals

INRIA a CCSD electronic archive server

PubMed Central

HAL Descartes

ProdInra

Evaluation of phylogenetic footprint discovery for predicting bacterial cis-regulatory elements and revealing their evolution

Author: Janky Rekin's
van Helden Jacques
Publication venue: BioMed Central
Publication date: 01/01/2008

The detection of conserved motifs in promoters of orthologous genes (phylogenetic footprints) has become a common strategy to predict cis-acting regulatory elements. Several software tools are routinely used to raise hypotheses about regulation. However, these tools are generally used as black boxes, with default parameters. A systematic evaluation of optimal parameters for a footprint discovery strategy can bring a sizeable improvement to the predictions.

Lirias

Springer - Publisher Connector

HAL AMU

Directory of Open Access Journals

PubMed Central

DI-fusion