112 research outputs found

COUGER-co-factors associated with uniquely-bound genomic regions

Author: Gordân R.
Munteanu A.
Ohler U.
Publication venue: Oxford University Press (OUP)
Publication date: 26/05/2014

Most transcription factors (TFs) belong to protein families that share a common DNA binding domain and have very similar DNA binding preferences. However, many paralogous TFs (i.e. members of the same TF family) perform different regulatory functions and interact with different genomic regions in the cell. A potential mechanism for achieving this differential in vivo specificity is through interactions with protein co-factors. Computational tools for studying the genomic binding profiles of paralogous TFs and identifying their putative co-factors are currently lacking. Here, we present an interactive web implementation of COUGER, a classification-based framework for identifying protein co-factors that might provide specificity to paralogous TFs. COUGER takes as input two sets of genomic regions bound by paralogous TFs, and it identifies a small set of putative co-factors that best distinguish the two sets of sequences. To achieve this task, COUGER uses a classification approach, with features that reflect the DNA-binding specificities of the putative co-factors. The identified co-factors are presented in a user-friendly output page, together with information that allows the user to understand and to explore the contributions of individual co-factor features. COUGER can be run as a stand-alone tool or through a web interface: http://couger.oit.duke.edu

PubMed Central

MDC Repository

Finding regulatory DNA motifs using alignment-free evolutionary conservation information

Author: Gordân Raluca
Hartemink Alexander J.
Narlikar Leelavati
Publication venue: Oxford University Press
Publication date: 01/01/2010

As an increasing number of eukaryotic genomes are being sequenced, comparative studies aimed at detecting regulatory elements in intergenic sequences are becoming more prevalent. Most comparative methods for transcription factor (TF) binding site discovery make use of global or local alignments of orthologous regulatory regions to assess whether a particular DNA site is conserved across related organisms, and thus more likely to be functional. Since binding sites are usually short, sometimes degenerate, and often independent of orientation, alignment algorithms may not align them correctly. Here, we present a novel, alignment-free approach for using conservation information for TF binding site discovery. We relax the definition of conserved sites: we consider a DNA site within a regulatory region to be conserved in an orthologous sequence if it occurs anywhere in that sequence, irrespective of orientation. We use this definition to derive informative priors over DNA sequence positions, and incorporate these priors into a Gibbs sampling algorithm for motif discovery. Our approach is simple and fast. It requires neither sequence alignments nor the phylogenetic relationships between the orthologous sequences, yet it is more effective on real biological data than methods that do

CiteSeerX

PubMed Central

A Nucleosome-Guided Map of Transcription Factor Binding Sites in Yeast

Author: Alexander J Hartemink
Leelavati Narlikar
Raluca Gordân
Satoru Miyano
Publication venue: Public Library of Science
Publication date: 01/01/2007

Finding functional DNA binding sites of transcription factors (TFs) throughout the genome is a crucial step in understanding transcriptional regulation. Unfortunately, these binding sites are typically short and degenerate, posing a significant statistical challenge: many more matches to known TF motifs occur in the genome than are actually functional. However, information about chromatin structure may help to identify the functional sites. In particular, it has been shown that active regulatory regions are usually depleted of nucleosomes, thereby enabling TFs to bind DNA in those regions. Here, we describe a novel motif discovery algorithm that employs an informative prior over DNA sequence positions based on a discriminative view of nucleosome occupancy. When a Gibbs sampling algorithm is applied to yeast sequence-sets identified by ChIP-chip, the correct motif is found in 52% more cases with our informative prior than with the commonly used uniform prior. This is the first demonstration that nucleosome occupancy information can be used to improve motif discovery. The improvement is dramatic, even though we are using only a statistical model to predict nucleosome occupancy; we expect our results to improve further as high-resolution genome-wide experimental nucleosome occupancy data becomes increasingly available

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Human-chimpanzee differences in a FZD8 enhancer alter cell-cycle dynamics in the developing neocortex.

Author: Bepler T
Boyd JL
Gordân R
Pilaz LJ
Rouanet JP
Silver DL
Skove SL
Wray GA
Publication date: 16/03/2015

The human neocortex differs from that of other great apes in several notable regards, including altered cell cycle, prolonged corticogenesis, and increased size [1-5]. Although these evolutionary changes most likely contributed to the origin of distinctively human cognitive faculties, their genetic basis remains almost entirely unknown. Highly conserved non-coding regions showing rapid sequence changes along the human lineage are candidate loci for the development and evolution of uniquely human traits. Several studies have identified human-accelerated enhancers [6-14], but none have linked an expression difference to a specific organismal trait. Here we report the discovery of a human-accelerated regulatory enhancer (HARE5) of FZD8, a receptor of the Wnt pathway implicated in brain development and size [15, 16]. Using transgenic mice, we demonstrate dramatic differences in human and chimpanzee HARE5 activity, with human HARE5 driving early and robust expression at the onset of corticogenesis. Similar to HARE5 activity, FZD8 is expressed in neural progenitors of the developing neocortex [17-19]. Chromosome conformation capture assays reveal that HARE5 physically and specifically contacts the core Fzd8 promoter in the mouse embryonic neocortex. To assess the phenotypic consequences of HARE5 activity, we generated transgenic mice in which Fzd8 expression is under control of orthologous enhancers (Pt-HARE5::Fzd8 and Hs-HARE5::Fzd8). In comparison to Pt-HARE5::Fzd8, Hs-HARE5::Fzd8 mice showed marked acceleration of neural progenitor cell cycle and increased brain size. Changes in HARE5 function unique to humans thus alter the cell-cycle dynamics of a critical population of stem cells during corticogenesis and may underlie some distinctive anatomical features of the human brain

DukeSpace

GRISOTTO: A greedy approach to improve combinatorial algorithms for motif discovery with prior knowledge

Author: A Valouev
Alexandra M Carvalho
AM Carvalho
AP Fejes
Arlindo L Oliveira
C Deremble
C Lee
CT Harbison
D Ucar
E Segal
E Valen
F Daenen
G Paillard
G Pavesi
GC Yuan
I Lafontaine
IV Kulakovskiy
JV Ponomarenko
KD MacIsaac
L Marsan
L Narlikar
M Hu
M Kellis
MF Sagot
N Pisanti
R Gordân
R Pudimat
R Siddharthan
RA O'Flanagan
RG Beiko
S Sinha
T Wang
TL Bailey
V Matys
WW Wasserman
X Chen
Y Liu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Position-specific priors (PSPs) have been used with success to boost EM and Gibbs sampler-based motif discovery algorithms. PSP information has been computed from different sources, including orthologous conservation, DNA duplex stability, and nucleosome positioning. Prior information has not yet been used in the context of combinatorial algorithms. Moreover, priors have been used only independently, and the gain from combining priors from different sources has not yet been studied. Results: We extend RISOTTO, a combinatorial algorithm for motif discovery, by post-processing its output with a greedy procedure that uses prior information. PSPs from different sources are combined into a scoring criterion that guides the greedy search procedure. The resulting method, called GRISOTTO, was evaluated over 156 yeast TF ChIP-chip sequence-sets commonly used to benchmark prior-based motif discovery algorithms. Results show that GRISOTTO is at least as accurate as twelve other state-of-the-art approaches for the same task, even without combining priors. Furthermore, by considering combined priors, GRISOTTO is considerably more accurate than the state-of-the-art approaches for the same task. We also show that PSPs improve GRISOTTO's ability to retrieve motifs from mouse ChIP-seq data, indicating that the proposed algorithm can be applied to data from a different technology and for a higher eukaryote. Conclusions: The conclusions of this work are twofold. First, post-processing the output of combinatorial algorithms by incorporating prior information leads to a very efficient and effective motif discovery method. Second, combining priors from different sources is even more beneficial than considering them separately.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Estimation of pairwise sequence similarity of mammalian enhancers with word neighbourhood counts

Author: Benson
Blaisdell
Blow
Burden
Carpenter
Dai
Doering
Forêt S.
Gardiner-Garden
Gordân R.
Goto
Hide
Jonathan Göke
Julia Lasserre
Kantorovitz
Kunarso
Lee
Lippert
Marcel H. Schulz
Martin Vingron
Needleman
Reinert
Robin
Small
Smith
Thomas-Chollier
van Helden
Vinga
Visel
Wilson
Wu
Zemojtel
Zinzen
Publication venue: Oxford University Press
Publication date: 01/01/2012

Motivation: The identity of cells and tissues is to a large degree governed by transcriptional regulation. A major part is accomplished by the combinatorial binding of transcription factors at regulatory sequences, such as enhancers. Even though binding of transcription factors is sequence-specific, estimating the sequence similarity of two functionally similar enhancers is very difficult. However, a similarity measure for regulatory sequences is crucial to detect and understand functional similarities between two enhancers and will facilitate large-scale analyses like clustering, prediction and classification of genome-wide datasets

The vitamin D receptor gene as a determinant of survival in pancreatic cancer patients: Genomic analysis and experimental validation

Author: Bentrem David J.
Etheridge Amy S.
Glubb Dylan
Gordân Raluca
Innocenti Federico
Jiang Chen
Kindler Hedy L.
McLeod Howard
Mulkey Flora
Nakamura Yusuke
Neel Nicole
Niedzwiecki Donna
Owzar Kouros
Ratain Mark J.
Seiser Eric
Sibley Alexander B.
Talamonti Mark S.
Van Loon Katherine
Venook Alan P.
Yeh Jen Jen
Publication venue: Public Library of Science (PLoS)
Publication date: 21/06/2023

Purpose: Advanced pancreatic cancer is a highly refractory disease almost always associated with survival of little more than a year. New interventions based on novel targets are needed. We aim to identify new genetic determinants of overall survival (OS) in patients after treatment with gemcitabine using genome-wide screens of germline DNA. We aim also to support these findings with in vitro functional analysis. Patients and methods: Genome-wide screens of germline DNA in two independent cohorts of pancreatic cancer patients (from the Cancer and Leukemia Group B (CALGB) 80303 and the Mayo Clinic) were used to select new genes associated with OS. The vitamin D receptor gene (VDR) was selected, and the interactions of genetic variation in VDR with circulating vitamin D levels and gemcitabine treatment were evaluated. Functional effects of common VDR variants were also evaluated in experimental assays in human cell lines. Results: The rs2853564 variant in VDR was associated with OS in patients from both the Mayo Clinic (HR 0.81, 95% CI 0.70–0.94, p = 0.0059) and CALGB 80303 (HR 0.74, 0.63–0.87, p = 0.0002). rs2853564 interacted with high pre-treatment levels of 25-hydroxyvitamin D (25(OH)D, a measure of endogenous vitamin D) (p = 0.0079 for interaction) and with gemcitabine treatment (p = 0.024 for interaction) to confer increased OS. rs2853564 increased transcriptional activity in luciferase assays and reduced the binding of the IRF4 transcription factor. Conclusion: Our findings propose VDR as a novel determinant of survival in advanced pancreatic cancer patients. Common functional variation in this gene might interact with endogenous vitamin D and gemcitabine treatment to determine improved patient survival. These results support evidence for a modulatory role of the vitamin D pathway for the survival of advanced pancreatic cancer patients.

Knowledge UChicago

Is Transcription Factor Binding Site Turnover a Sufficient Explanation for Cis-Regulatory Sequence Divergence?

Author: Borneman
Bradley
Brown
Chambers
Chan
Costas
Dermitzakis
Doniger
Frazer
Gordân
Hancock
Harbison
Hertz
Hogues
Ihmels
Justin C. Fay
Kasowski
Kent
Lavoie
Li
Ludwig
MacIsaac
Margulies
Martchenko
Moses
Nishi
Odom
Otto
Perez
Piskur
Pollard
Prabhakar
Sandeep Venkataram
Schmidt
Siepel
Simpson
Tanay
Tatusov
Tautz
Tsong
Tuch
Wang
Ward
Weirauch
Woolfe
Zheng
Publication venue: Oxford University Press
Publication date: 01/01/2010

The molecular evolution of cis-regulatory sequences is not well understood. Comparisons of closely related species show that cis-regulatory sequences contain a large number of sites constrained by purifying selection. In contrast, there are a number of examples from distantly related species where cis-regulatory sequences retain little to no sequence similarity but drive similar patterns of gene expression. Binding site turnover, whereby the gain of a redundant binding site enables loss of a previously functional site, is one model by which cis-regulatory sequences can diverge without a concurrent change in function. To determine whether cis-regulatory sequence divergence is consistent with binding site turnover, we examined binding site evolution within orthologous intergenic sequences from 14 yeast species defined by their syntenic relationships with adjacent coding sequences. Both local and global alignments show that nearly all distantly related orthologous cis-regulatory sequences have no significant level of sequence similarity but are enriched for experimentally identified binding sites. Yet, a significant proportion of experimentally identified binding sites that are conserved in closely related species are absent in distantly related species and so cannot be explained by binding site turnover. Depletion of binding sites depends on the transcription factor but is detectable for a quarter of all transcription factors examined. Our results imply that binding site turnover is not a sufficient explanation for cis-regulatory sequence evolution

Crossref

PubMed Central

Digital Commons@Becker

Efficient large-scale protein sequence comparison and gene matching to identify orthologs and co-orthologs

Author: Altschul
Arun S. Konagurthu
Arunachalam
Bandyopadhyay
Bansal
Calabrese
Dehal
Dice
Edgar
Flicek
Fukuhara
Geoffrey I. Webb
Gordân
Haas
Hachiya
James C. Whisstock
Jiangning Song
Jun
Khalid Mahmood
Koohy
Koonin
Kriventseva
Kuhn
Kärkkäinen
Li
Mahmood
Needleman
Papadimitriou
Pearson
Pruess
Remm
Sakarya
Sankoff
Santini
Sjolander
Smith
Sonnhammer
Sorensen
Swidan
Vandepoele
Vinga
Vingron
Widmann
Woolfe
Xu
Yu
Zhi
Publication venue: Oxford University Press
Publication date: 01/01/2012

Broadly, computational ortholog assignment is a three-step process: (i) identify all putative homologs between the genomes, (ii) identify gene anchors and (iii) link anchors to identify best gene matches given their order and context. In this article, we engineer two methods to improve two important aspects of this pipeline [specifically steps (ii) and (iii)]. First, computing sequence similarity data [step (i)] is a computationally intensive task for large sequence sets, creating a bottleneck in the ortholog assignment pipeline. We have designed a fast and highly scalable sort-join method (afree) based on k-mer counts to rapidly compare all pairs of sequences in a large protein sequence set to identify putative homologs. Second, the availability of complex genomes containing large gene families with a prevalence of complex evolutionary events, such as duplications, has made the task of assigning orthologs and co-orthologs difficult. Here, we have developed an iterative graph matching strategy where at each iteration the best gene assignments are identified, resulting in a set of orthologs and co-orthologs. We find that the afree algorithm is faster than existing methods and maintains high accuracy in identifying similar genes. The iterative graph matching strategy also showed high accuracy in identifying complex gene relationships. Standalone afree is available from http://vbc.med.monash.edu.au/∼kmahmood/afree. EGM2, the complete ortholog assignment pipeline (including afree and the iterative graph matching method), is available from http://vbc.med.monash.edu.au/∼kmahmood/EGM2

Crossref

PubMed Central

Monash University Research Portal

University of Melbourne Institutional Repository

KIRMES: kernel-based identification of regulatory modules in euchromatic sequences

Author: Bailey
Ben-Hur
Boser
Busch
Frith
Giardine
Gordân
Gunnar Rätsch
Gupta
Harbison
Jan U. Lohmann
Joachims
Lawrence
Leibfried
Leslie
Matys
Meinicke
Mikolajczyk
Müller
Noble
Nowak
Oliver Kohlbacher
Redman
Rätsch
Sandelin
Schneider
Schölkopf
Sebastian J. Schultheiss
Segal
Sinha
Smith
Sonnenburg
Stormo
Swarbreck
Thijs
Wolfgang Busch
Yada
Zien
Publication venue: Oxford University Press
Publication date: 01/01/2009

Motivation: Understanding transcriptional regulation is one of the main challenges in computational biology. An important problem is the identification of transcription factor (TF) binding sites in promoter regions of potential TF target genes. It is typically approached by position weight matrix-based motif identification algorithms using Gibbs sampling, or heuristics to extend seed oligos. Such algorithms succeed in identifying single, relatively well-conserved binding sites, but tend to fail when it comes to the identification of combinations of several degenerate binding sites, as those often found in cis-regulatory modules

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe