A universally applicable method of operon map prediction on minimally annotated genomes using conserved genomic context

By Martin T. Edwards, Stuart C. G. Rison, Neil G. Stoker and Lorenz Wernisch


An important step in understanding the regulation of a prokaryotic genome is the generation of its transcription unit map. The current strongest operon predictor depends on the distributions of intergenic distances (IGD) separating adjacent genes within and between operons. Unfortunately, experimental data on these distance distributions are limited to Escherichia coli and Bacillus subtilis. We suggest a new graph algorithmic approach based on comparative genomics to identify clusters of conserved genes independent of IGD and conservation of gene order. As a consequence, distance distributions of operon pairs can be inferred for any prokaryotic genome. For E. coli, the algorithm predicts 854 conserved adjacent pairs with a precision of 85%. The IGD distribution for these pairs is virtually identical to the E. coli operon pair distribution. Statistical analysis of the predicted pair IGD distribution allows estimation of a genome-specific operon IGD cut-off, obviating the requirement for a training set in IGD-based operon prediction. We apply the method to a representative set of eight genomes, and show that these genome-specific IGD distributions differ considerably from each other and from the distribution in E. coli.
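To make the idea of IGD-based operon prediction concrete, the following is a minimal sketch of how a distance cut-off could be applied once it is known. It is not the authors' implementation: the `Gene` record, the function names, the toy coordinates and the cut-off of 50 bp are all assumptions chosen for illustration, and the sketch does not attempt the comparative-genomics step or the statistical estimation of the cut-off described in the abstract.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record (not a format from the paper)."""
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def intergenic_distance(a: Gene, b: Gene) -> int:
    """Number of bases between the end of a and the start of b.

    The value is negative when the two genes overlap.
    """
    return b.start - a.end - 1


def predict_operons(genes: list[Gene], cutoff: int) -> list[list[str]]:
    """Group adjacent same-strand genes whose IGD is at most `cutoff`.

    `cutoff` stands in for a genome-specific value; here it is simply
    supplied by the caller.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    if not ordered:
        return []
    units = [[ordered[0].name]]
    for prev, cur in zip(ordered, ordered[1:]):
        same_strand = prev.strand == cur.strand
        if same_strand and intergenic_distance(prev, cur) <= cutoff:
            units[-1].append(cur.name)   # extend the current transcription unit
        else:
            units.append([cur.name])     # start a new transcription unit
    return units


# Toy coordinates invented for this example.
genes = [
    Gene("geneA", 100, 400, "+"),
    Gene("geneB", 410, 900, "+"),    # 9 bp after geneA
    Gene("geneC", 1200, 1800, "+"),  # 299 bp after geneB
    Gene("geneD", 1790, 2300, "+"),  # overlaps geneC by 11 bp
    Gene("geneE", 2320, 2900, "-"),  # close to geneD but on the other strand
]

print(predict_operons(genes, cutoff=50))
```

With these toy inputs the short gap between geneA and geneB and the overlap between geneC and geneD place each pair in one predicted unit, while the long gap and the strand change start new units. In the approach the abstract describes, the cut-off would instead be estimated per genome from the IGD distribution of predicted conserved pairs.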

Topics: Article
Publisher: Oxford University Press
Year: 2005
DOI identifier: 10.1093/nar/gki634
Provided by: PubMed Central

