24 research outputs found

Conservation anchors in the vertebrate genome

Author: Aloni Ronny
Lancet Doron
Publication venue: BioMed Central
Publication date: 01/01/1973

Genomic segments that do not code for proteins yet show high conservation among vertebrates have recently been identified by various computational methodologies. We refer to them as ANCORs (ancestral non-coding conserved regions). The frequency of individual ANCORs within the genome, along with their (correlated) inter-species identity scores, helps in assessing the probability that they function in transcription regulation or RNA coding.

Crossref

PubMed Central

Longer First Introns Are a General Property of Eukaryotic Gene Structure

Author: A Rogers
A Sakurai
AB Rose
AE Vinogradov
Alan Christoffels
BY Chung
CB Russell
D Mascarenhas
D Swarbreck
DA Benson
DJ Gaffney
DL Halligan
E Gazave
EV Kriventseva
G Marais
H Akashi
H Nielsen
HK Stenoien
Ian Korf
J Majewski
J Spieth
JC Venter
JD Hawkins
JJ Jonsson
JS Jeon
JV Chamary
K Lin
Keith R. Bradnam
KR Kalari
L Collins
L Duret
L Morello
M Deutsch
M Stanke
MG Reese
MW Smith
PD Keightley
RD Palmiter
RJ Wilson
S Levy
SH Ho
SW Li
T Blumenthal
W Gilbert
X Hong
XY Ren
Y Chen
Publication venue: Public Library of Science
Publication date: 01/01/2008

While many properties of eukaryotic gene structure are well characterized, differences in the form and function of introns that occur at different positions within a transcript are less well understood. In particular, the dynamics of intron length variation with respect to intron position have received relatively little attention. This study analyzes all available data on intron lengths in GenBank and finds a significant trend of increased length in first introns throughout a wide range of species. This trend was found to be even stronger when using high-confidence gene annotation data for three model organisms (Arabidopsis thaliana, Caenorhabditis elegans, and Drosophila melanogaster), which show that the first intron in the 5′ UTR is, on average, significantly longer than all downstream introns within a gene. A partial explanation for increased first intron length in A. thaliana is suggested by the increased frequency of certain motifs that are present in first introns. The phenomenon of longer first introns can potentially be used to improve gene prediction software and also to detect errors in existing gene annotations.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

The PAZAR database of gene regulatory information coupled to the ORCA toolkit for the study of regulatory sequences

Author: Aerts
Altschul
Anthony McCallum
Astanehe
Bailey
Barrett
Benson
Bray
Brudno
Corcoran
David Arenillas
Elodie Portales-Casamar
Fickett
Flicek
Ho Sui
Jonathan Lim
Karolchik
Lenhard
Levy
Magdalena I. Swanson
Montgomery
Needleman
Parkinson
Portales-Casamar
Sandelin
Sherry
Siepel
Stajich
Stefan Kirov
Steven Jiang
Tagle
Trinklein
Vlieghe
Wyeth W. Wasserman
Publication venue: Oxford University Press

The PAZAR database unites independently created and maintained data collections of transcription factor and regulatory sequence annotation. The flexible PAZAR schema permits the representation of diverse information derived from experiments ranging from biochemical protein–DNA binding to cellular reporter gene assays. Data collections can be made available to the public, or restricted to specific system users. The data ‘boutiques’ within the shopping-mall-inspired system facilitate the analysis of genomics data and the creation of predictive models of gene regulation. Since its initial release, PAZAR has grown in terms of data, features and through the addition of an associated package of software tools called the ORCA toolkit (ORCAtk). ORCAtk allows users to rapidly develop analyses based on the information stored in the PAZAR system. PAZAR is available at http://www.pazar.info. ORCAtk can be accessed through convenient buttons located in the PAZAR pages or via our website at http://www.cisreg.ca/ORCAtk

Crossref

PubMed Central

Linking disease-associated genes to regulatory networks via promoter organization

Author: de Angelis M. Hrabé
Döhr S.
Klingenhoff A.
Maier H.
Schneider R.
Werner T.
Publication venue: Oxford University Press
Publication date: 01/01/2005

Pathway- or disease-associated genes may participate in more than one transcriptional co-regulation network. Such gene groups can be readily obtained by literature analysis or by high-throughput techniques such as microarrays or protein-interaction mapping. We developed a strategy that defines regulatory networks by in silico promoter analysis, finding potentially co-regulated subgroups without a priori knowledge. Pairs of transcription factor binding sites conserved in orthologous genes (vertically) as well as in promoter sequences of co-regulated genes (horizontally) were used as seeds for the development of promoter models representing potential co-regulation. This approach was applied to a Maturity Onset Diabetes of the Young (MODY)-associated gene list, which yielded two models connecting functionally interacting genes within MODY-related insulin/glucose signaling pathways. Additional genes functionally connected to our initial gene list were identified by database searches with these promoter models. Thus, data-driven in silico promoter analysis allowed integrating molecular mechanisms with biological functions of the cell.

Exploration for Functional Nucleotide Sequence Candidates within Coding Regions of Mammalian Genes

Author: Benjamini
Bhalla
Blencowe
Delgado
Donehower
Drummond
Eyre-Walker
Hershberg
Ikemura
Ikemura
Ikemura
Kanaya
Kurland
Lareau
Levy
Licatalosi
Lin
Makalowski
N. Saitou
R. Suzuki
Reed
Sandelin
Schattner
Sharp
Sharp
Takahashi
Publication venue: Oxford University Press

The primary role of a protein coding gene is to encode amino acids. Therefore, synonymous sites of codons, which do not change the encoded amino acid, are regarded as evolving neutrally. However, if a certain region of a protein coding gene contains a functional nucleotide element (e.g. splicing signals), synonymous sites in the region may be under selective pressure. The existence of such elements could be detected by searching for regions of low nucleotide substitution. We explored invariant nucleotide sequences in 10 790 orthologous genes of six mammalian species (Homo sapiens, Macaca mulatta, Mus musculus, Rattus norvegicus, Bos taurus, and Canis familiaris), and extracted 4150 sequences whose conservation is significantly stronger than other regions of the gene and named them significantly conserved coding sequences (SCCSs). SCCSs are observed in 2273 genes. The genes are mainly involved with development, transcriptional regulation, and the neurons, and are expressed in the nervous system and the head and neck organs. No strong influence of conventional factors that affect synonymous substitution was observed in SCCSs. These results imply that SCCSs may have a double function, as nucleotide elements and as protein coding sequences, and have been retained in the course of mammalian evolution.

Crossref

PubMed Central

Position specific variation in the rate of evolution in transcription factor binding sites

Author: Chiang Derek Y
Eisen Michael B
Kellis Manolis
Lander Eric S
Moses Alan M
Publication venue: BioMed Central
Publication date: 01/01/2003

BACKGROUND: The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution. RESULTS: Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms. CONCLUSION: As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. 
The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA.

DSpace@MIT

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

UNT Digital Library

In Vivo Validation of a Computationally Predicted Conserved Ath5 Target Gene Set

Author: Dorota Skowronska-Krawczyk
Ewan Birney
Filippo Del Bene
Herwig Baier
Jean-Marc Matter
Joachim Wittbrodt
Laurence Ettwiller
Stuart K Kim
Publication venue: Public Library of Science
Publication date: 01/01/2007

So far, the computational identification of transcription factor binding sites is hampered by the complexity of vertebrate genomes. Here we present an in silico procedure to predict target sites of a transcription factor in complex genomes using its binding site. In a first step, sequence comparison of closely related genomes identifies the binding sites in conserved cis-regulatory regions (phylogenetic footprinting). Subsequently, more remote genomes are introduced into the comparison to identify highly conserved and therefore putatively functional binding sites (phylogenetic filtering). When applied to the binding site of atonal homolog 5 (Ath5 or ATOH7), this procedure efficiently filters evolutionarily conserved binding sites out of more than 300,000 instances in a vertebrate genome. We validate a selection of the linked target genes by showing coexpression with and transcriptional regulation by Ath5. Finally, chromatin immunoprecipitation demonstrates the occupancy of the target gene promoters by Ath5. Thus, our procedure, applied to whole genomes, is a fast and predictive tool to filter in silico the target genes of a given transcription factor with a defined binding site.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Computational prediction of transcription-factor binding site locations

Author: Bulyk Martha L
Publication venue: BioMed Central
Publication date: 23/12/2003

Identifying genomic locations of transcription-factor binding sites, particularly in higher eukaryotic genomes, has been an enormous challenge. Various experimental and computational approaches have been used to detect these sites; methods involving computational comparisons of related genomes have been particularly successful.

CiteSeerX

Harvard University - DASH

PubMed Central

A Method for the Structure-Based, Genome-Wide Analysis of Bacterial Intergenic Sequences Identifies Shared Compositional and Functional Features

Author: Di Patti F.
Fani R.
Fondi M.
Lenzini L.
Livi R.
Mengoni A.
Publication venue: MDPI AG
Publication date: 01/01/2019

Florence Research