Computational genomics of regulatory elements and regulatory territories
Whole-genome comparison of metazoan genomes reveals an extremely high level of noncoding conservation over tens to hundreds of base pairs across distant species. These sequences are termed conserved noncoding elements (CNEs). Arrays of CNEs span the loci of developmental regulatory genes, and their extent defines genomic regulatory blocks (GRBs). CNEs are currently known to be involved in transcriptional regulation and development as long-range enhancers. However, no molecular mechanism can yet explain their exceptional degree of conservation. As a first step towards the genome-wide study of these elements, I developed two R/Bioconductor packages, CNEr and TFBSTools, to detect and analyse regulatory elements. Next, I designed a novel CNE detection pipeline for duplicated regions in the ameiotic Adineta vaga genome. Identification of CNEs in this genome suggests that the principal function of CNEs is regulation of developmental gene expression rather than copy number sensing. In addition, I performed a de novo genome annotation of the European common carp, Cyprinus carpio. This genome is an ideal candidate for comparative study with the zebrafish genome. Its analysis revealed a wealth of previously undetected fish regulatory elements and an unexpectedly high level of conservation between the two genomes. Finally, I presented a computational method for the identification of GRB boundaries and prediction of the corresponding target genes under long-range regulation. The predicted target genes are implicated in development, transcriptional regulation and axon guidance. Disruption of the regulation of these target genes is likely to cause complex diseases, including cancer. The GRB boundaries and predicted target genes are a valuable resource for investigating developmental regulation and interpreting genome-wide association studies.
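As a rough illustration of what detecting conserved elements over "tens to hundreds of base pairs" can involve, the sketch below scans a pairwise alignment with a sliding window and reports regions where the number of identical columns meets a threshold. It is a toy under stated assumptions, not the CNEr implementation: the function name `conserved_windows`, the input format (two equal-length aligned strings with `-` for gaps) and the default thresholds are all hypothetical.

```python
def conserved_windows(aln_a, aln_b, window=10, min_identical=9):
    """Return merged half-open (start, end) alignment-column ranges in which
    every `window`-wide window has at least `min_identical` identical,
    non-gap columns. Inputs are two aligned sequences of equal length."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")

    # 1 where the two sequences agree on a base, 0 for mismatches and gaps.
    match = [int(a == b and a != "-") for a, b in zip(aln_a, aln_b)]
    n = len(match)
    regions = []
    if n < window:
        return regions

    count = sum(match[:window])  # identical columns in the first window
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one column: add the entering one, drop the leaving one.
            count += match[start + window - 1] - match[start - 1]
        if count >= min_identical:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    a = "ACGTACGTACGTTTTT"
    b = "ACGTACGTACGTAAAA"
    print(conserved_windows(a, b, window=8, min_identical=8))
```

The coordinates are alignment columns, so mapping them back to genomic positions and filtering out coding sequence would be separate steps that this sketch does not attempt.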
Motifs and cis-regulatory modules mediating the expression of genes co-expressed in presynaptic neurons
An integrative strategy of comparative genomics, experimental and computational approaches reveals aspects of a regulatory network controlling neuronal-specific expression in presynaptic neurons.
Computational identification of transcriptional regulatory elements in DNA sequence
Identification and annotation of all the functional elements in the genome, including genes and regulatory sequences, is a fundamental challenge in genomics and computational biology. Since regulatory elements are frequently short and variable, their identification and discovery using computational algorithms is difficult. However, significant advances have been made in the computational methods for modeling and detection of DNA regulatory elements. The availability of complete genome sequences from multiple organisms, as well as mRNA profiling and high-throughput experimental methods for mapping protein-binding sites in DNA, has contributed to the development of methods that use these auxiliary data to inform the detection of transcriptional regulatory elements. Progress is also being made in the identification of cis-regulatory modules and higher-order structures of the regulatory sequences, which is essential to understanding transcriptional regulation in metazoan genomes. This article reviews the computational approaches for modeling and identification of genomic regulatory elements, with an emphasis on recent developments and current challenges.
Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles
Reconstructing transcriptional regulatory networks is an important task in
functional genomics. Data obtained from experiments that perturb genes by
knockouts or RNA interference contain useful information for addressing this
reconstruction problem. However, such data can be limited in size and/or are
expensive to acquire. On the other hand, observational data of the organism in
steady state (e.g. wild-type) are more readily available, but their
informational content is inadequate for the task at hand. We develop a
computational approach to appropriately utilize both data sources for
estimating a regulatory network. The proposed approach is based on a three-step
algorithm to estimate the underlying directed but cyclic network, that uses as
input both perturbation screens and steady state gene expression data. In the
first step, the algorithm determines causal orderings of the genes that are
consistent with the perturbation data, by combining an exhaustive search method
with a fast heuristic that in turn couples a Monte Carlo technique with a fast
search algorithm. In the second step, for each obtained causal ordering, a
regulatory network is estimated using a penalized likelihood based method,
while in the third step a consensus network is constructed from the highest
scored ones. Extensive computational experiments show that the algorithm
performs well in reconstructing the underlying network and clearly outperforms
competing approaches that rely only on a single data source. Further, it is
established that the algorithm produces a consistent estimate of the regulatory
network. Comment: 24 pages, 4 figures, 6 tables
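The first and third steps of the algorithm described above can be sketched on a toy scale. The example below is an illustration under several assumptions, not the authors' method: perturbation results are reduced to hypothetical (perturbed gene, affected gene) pairs, orderings are found by exhaustive enumeration only (no Monte Carlo heuristic), the effects are assumed to be acyclic, and the consensus is a simple majority vote over candidate edge sets rather than a selection of the highest-scored networks. The penalized-likelihood estimation of step two is left out entirely.

```python
from collections import Counter
from itertools import permutations


def consistent_orderings(genes, effects):
    """Exhaustively list gene orderings in which each perturbed gene comes
    before every gene its perturbation affected.

    effects: set of (perturbed, affected) pairs (hypothetical input format).
    Only feasible for a handful of genes, since all permutations are tried.
    """
    result = []
    for order in permutations(genes):
        position = {gene: i for i, gene in enumerate(order)}
        if all(position[src] < position[dst] for src, dst in effects):
            result.append(order)
    return result


def consensus_network(networks, min_fraction=0.5):
    """Keep the directed edges found in at least `min_fraction` of the
    candidate networks, each given as a collection of (source, target) edges."""
    counts = Counter(edge for net in networks for edge in set(net))
    threshold = min_fraction * len(networks)
    return {edge for edge, seen in counts.items() if seen >= threshold}


if __name__ == "__main__":
    genes = ["A", "B", "C"]
    effects = {("A", "B"), ("A", "C")}  # knocking out A changes B and C
    print(consistent_orderings(genes, effects))

    candidates = [
        {("A", "B"), ("A", "C")},
        {("A", "B")},
        {("A", "B"), ("B", "C")},
    ]
    print(consensus_network(candidates))
```

In a fuller pipeline, each ordering returned by `consistent_orderings` would constrain which regulators a gene may have before a network is fitted to the steady-state expression data.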
Discovering structural cis-regulatory elements by modeling the behaviors of mRNAs
Gene expression is regulated at each step from chromatin remodeling through translation and degradation. Several known RNA-binding regulatory proteins interact with specific RNA secondary structures in addition to specific nucleotides. To provide a more comprehensive understanding of the regulation of gene expression, we developed an integrative computational approach that leverages functional genomics data and nucleotide sequences to discover RNA secondary structure-defined cis-regulatory elements (SCREs). We applied our structural cis-regulatory element detector (StructRED) to microarray and mRNA sequence data from Saccharomyces cerevisiae, Drosophila melanogaster, and Homo sapiens. We recovered the known specificities of Vts1p in yeast and Smaug in flies. In addition, we discovered six putative SCREs in flies and three in humans. We characterized the SCREs based on their condition-specific regulatory influences, the annotation of the transcripts that contain them, and their locations within transcripts. Overall, we show that modeling functional genomics data in terms of combined RNA structure and sequence motifs is an effective method for discovering the specificities and regulatory roles of RNA-binding proteins.
Systematic genetic analysis of the MHC region reveals mechanistic underpinnings of HLA type associations with disease.
The MHC region is highly associated with autoimmune and infectious diseases. Here we conduct an in-depth interrogation of associations between genetic variation, gene expression and disease. We create a comprehensive map of regulatory variation in the MHC region using WGS from 419 individuals to call eight-digit HLA types and RNA-seq data from matched iPSCs. Building on this regulatory map, we explore GWAS signals for 4083 traits, detecting colocalization for 180 disease loci with eQTLs. We show that eQTL analyses taking HLA type haplotypes into account have substantially greater power than analyses using only single variants. We examine the association between the 8.1 ancestral haplotype and delayed colonization in cystic fibrosis, postulating that downregulation of RNF5 expression is the likely causal mechanism. Our study provides insights into the genetic architecture of the MHC region and pinpoints disease associations that are due to differential expression of HLA genes and non-HLA genes.
Understanding the Dynamics of Gene Regulatory Systems : Characterisation and Clinical Relevance of cis-Regulatory Polymorphisms
Identification of small RNAs abundant in Burkholderia cenocepacia biofilms reveal putative regulators with a potential role in carbon and iron metabolism
Small RNAs play a regulatory role in many central metabolic processes of bacteria, as well as in developmental processes such as biofilm formation. Small RNAs of Burkholderia cenocepacia, an opportunistic pathogenic beta-proteobacterium, are to date not well characterised. To address this, we performed genome-wide transcriptome structure analysis of biofilm-grown B. cenocepacia J2315. 41 unannotated short transcripts were identified in intergenic regions of the B. cenocepacia genome. 15 of these short transcripts, highly abundant in biofilms, widely conserved in Burkholderia sp. and without known function, were selected for in-depth analysis. Expression profiling showed that most of these sRNAs are more abundant in biofilms than in planktonic cultures. Many are also highly abundant in cells grown in minimal media, suggesting they are involved in adaptation to nutrient limitation and growth arrest. Their computationally predicted targets include a high proportion of genes involved in carbon metabolism. The expression and target genes of one sRNA suggest a potential role in regulating iron homoeostasis. The strategy used in this study to detect sRNAs expressed in B. cenocepacia biofilms has successfully identified sRNAs with a regulatory function.