Allegro: Analyzing expression and sequence in concert to discover regulatory programs

By Yonit Halperin, Chaim Linhart, Igor Ulitsky and Ron Shamir


A major goal of systems biology is the characterization of transcription factors and microRNAs (miRNAs) and the transcriptional programs they regulate. We present Allegro, a method for de novo discovery of cis-regulatory transcriptional programs through joint analysis of genome-wide expression data and promoter or 3′ UTR sequences. The algorithm uses a novel log-likelihood-based, non-parametric model to describe the expression pattern shared by a group of co-regulated genes. We show that Allegro is more accurate and sensitive than existing techniques, and can simultaneously analyze multiple expression datasets with more than 100 conditions. We apply Allegro to datasets from several species and report on the transcriptional modules it uncovers. Our analysis reveals a novel motif over-represented in the promoters of genes highly expressed in murine oocytes, and several new motifs related to fly development. Finally, using stem-cell expression profiles, we identify three miRNA families with pivotal roles in human embryogenesis.

Topics: Computational Biology
Publisher: Oxford University Press
Provided by: PubMed Central

