133,862 research outputs found

    Genome-Wide Identification of Bcl11b Gene Targets Reveals Role in Brain-Derived Neurotrophic Factor Signaling

    B-cell leukemia/lymphoma 11B (Bcl11b) is a transcription factor showing predominant expression in the striatum. To date, there are no known gene targets of Bcl11b in the nervous system. Here, we define targets for Bcl11b in striatal cells by performing chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) in combination with genome-wide expression profiling. Transcriptome-wide analysis revealed that 694 genes were significantly altered in striatal cells over-expressing Bcl11b, including genes showing striatal-enriched expression similar to Bcl11b. ChIP-seq analysis demonstrated that Bcl11b bound a mixture of coding and non-coding sequences that were within 10 kb of the transcription start site of an annotated gene. Integrating all ChIP-seq hits with the microarray expression data, 248 direct targets of Bcl11b were identified. Functional analysis on the integrated gene target list identified several zinc-finger encoding genes as Bcl11b targets, and further revealed a significant association of Bcl11b to brain-derived neurotrophic factor/neurotrophin signaling. Analysis of ChIP-seq binding regions revealed significant consensus DNA binding motifs for Bcl11b. These data implicate Bcl11b as a novel regulator of the BDNF signaling pathway, which is disrupted in many neurological disorders. Specific targeting of the Bcl11b-DNA interaction could represent a novel therapeutic approach to lowering BDNF signaling specifically in striatal cells
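The integration step in the abstract above (ChIP-seq binding within 10 kb of a transcription start site, intersected with differentially expressed genes) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, gene labels, and coordinates are all hypothetical, and strand and peak scoring are ignored.

```python
def genes_near_peaks(peaks, tss, window=10_000):
    """Return genes whose TSS lies within `window` bp of any peak.

    peaks: list of (chrom, start, end) intervals (hypothetical format).
    tss:   dict mapping gene -> (chrom, position).
    """
    hits = set()
    for gene, (chrom, pos) in tss.items():
        for p_chrom, start, end in peaks:
            if p_chrom != chrom:
                continue
            # Distance from the TSS to the nearest peak edge (0 if inside the peak).
            dist = max(start - pos, pos - end, 0)
            if dist <= window:
                hits.add(gene)
                break
    return hits


def direct_targets(peaks, tss, de_genes, window=10_000):
    """Candidate direct targets: bound near the TSS AND differentially expressed."""
    return genes_near_peaks(peaks, tss, window) & set(de_genes)


# Made-up example data, for illustration only.
tss = {"gene1": ("chr1", 50_000), "gene2": ("chr1", 200_000), "gene3": ("chr2", 50_000)}
peaks = [("chr1", 55_000, 55_400), ("chr2", 10_000, 10_300)]
de_genes = {"gene1", "gene2"}

print(direct_targets(peaks, tss, de_genes))
```

Only `gene1` satisfies both criteria here: `gene2` is differentially expressed but has no nearby peak, and the `chr2` peak is more than 10 kb from the `gene3` TSS.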

    Identifying Consensus Disease Pathways in Parkinson's Disease Using an Integrative Systems Biology Approach

    Six genome-wide association studies (GWAS) and several gene expression studies have been conducted in Parkinson's disease (PD). However, only variants in MAPT and SNCA have been consistently replicated. To improve the utility of these approaches, we applied pathway analyses integrating both GWAS and gene expression. The top 5000 SNPs (p < 0.01) from a joint analysis of three existing PD GWAS were identified and each assigned to a gene. For gene expression, rather than the traditional comparison of one anatomical region between sets of patients and controls, we identified differentially expressed genes between adjacent Braak regions in each individual and adjusted using average control expression profiles. Over-represented pathways were calculated using a hypergeometric statistical comparison. An integrated, systems meta-analysis of the over-represented pathways combined the expression and GWAS results using Fisher's combined probability test. Four of the top seven pathways from each approach were identical. The top three pathways in the meta-analysis, with their corrected p-values, were axonal guidance (p = 2.8E-07), focal adhesion (p = 7.7E-06) and calcium signaling (p = 2.9E-05). These results suggest that a systems biology (pathway) approach will provide additional insight into the genetic etiology of PD and that these pathways have both biological and statistical support for a role in PD.
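The two statistical steps named in the abstract above, a hypergeometric over-representation test per pathway and Fisher's combined probability test across the GWAS and expression results, can be sketched with the standard library alone. This is a generic illustration under simple assumptions (independent p-values, no multiple-testing correction); the function names are hypothetical and it is not the authors' code.

```python
import math


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for a hypergeometric variable.

    population: total genes considered; successes: genes in the pathway;
    draws: size of the candidate gene list; k: observed overlap.
    """
    upper = min(successes, draws)
    total = math.comb(population, draws)
    tail = sum(
        math.comb(successes, i) * math.comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return tail / total


def fisher_combined(p_values):
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k df.

    For even degrees of freedom the chi-square survival function has the
    closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!, so no SciPy is needed.
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    return math.exp(-half_x) * sum(half_x**i / math.factorial(i) for i in range(k))


# Toy numbers: 3 of 10 genes are in the pathway, 4 genes are drawn.
print(hypergeom_sf(1, 10, 3, 4))
# Combining two hypothetical pathway p-values (e.g. GWAS and expression).
print(fisher_combined([0.01, 0.03]))
```

With a single p-value, `fisher_combined` returns that p-value unchanged, which is a quick sanity check on the closed form.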

    Novel insight into the etiology of autism spectrum disorder gained by integrating expression data with genome-wide association statistics

    Background: A recent genome-wide association study (GWAS) of autism spectrum disorders (ASD) (Ncases=18,381, Ncontrols=27,969) has provided novel opportunities for investigating the aetiology of ASD. Here, we integrate the ASD GWAS summary statistics with summary-level gene expression data to infer differential gene expression in ASD, an approach called transcriptome-wide association study (TWAS). Methods: Using FUSION software, ASD GWAS summary statistics were integrated with predictors of gene expression from 16 human datasets, including adult and fetal brain. A novel adaptation of established statistical methods was then used to test for enrichment within candidate pathways, specific tissues, and at different stages of brain development. The proportion of ASD heritability explained by predicted expression of genes in the TWAS was estimated using stratified linkage disequilibrium-score regression. Results: This study identified 14 genes as significantly differentially expressed in ASD, 13 of which were outside of known genome-wide significant loci (±500kb). XRN2, a gene proximal to an ASD GWAS locus, was inferred to be significantly upregulated in ASD, providing insight into the functional consequence of this associated locus. One novel transcriptome-wide significant association from this study is the downregulation of PDIA6, which showed minimal evidence of association in the GWAS, and in gene-based analysis using MAGMA. Predicted gene expression in this study accounted for 13.0% of the total ASD SNP-heritability. Conclusion: This study has implicated several genes as significantly up-/down-regulated in ASD, providing novel and useful information for subsequent functional studies. This study also explores the utility of TWAS-based enrichment analysis and compares TWAS results with a functionally agnostic approach.

    Multi-Platform Whole-Genome Microarray Analyses Refine the Epigenetic Signature of Breast Cancer Metastasis with Gene Expression and Copy Number

    BACKGROUND: We have previously identified genome-wide DNA methylation changes in a cell line model of breast cancer metastasis. These complex epigenetic changes that we observed, along with concurrent karyotype analyses, have led us to hypothesize that complex genomic alterations in cancer cells (deletions, translocations and ploidy) are superimposed over promoter-specific methylation events that are responsible for gene-specific expression changes observed in breast cancer metastasis. METHODOLOGY/PRINCIPAL FINDINGS: We undertook simultaneous high-resolution, whole-genome analyses of MDA-MB-468GFP and MDA-MB-468GFP-LN human breast cancer cell lines (an isogenic, paired lymphatic metastasis cell line model) using Affymetrix gene expression (U133), promoter (1.0R), and SNP/CNV (SNP 6.0) microarray platforms to correlate data from gene expression, epigenetic (DNA methylation), and combination copy number variant/single nucleotide polymorphism microarrays. Using Partek Software and Ingenuity Pathway Analysis we integrated datasets from these three platforms and detected multiple hypomethylation and hypermethylation events. Many of these epigenetic alterations correlated with gene expression changes. In addition, gene dosage events correlated with the karyotypic differences observed between the cell lines and were reflected in specific promoter methylation patterns. Gene subsets were identified that correlated hyper (and hypo) methylation with the loss (or gain) of gene expression and in parallel, with gene dosage losses and gains, respectively. Individual gene targets from these subsets were also validated for their methylation, expression and copy number status, and susceptible gene pathways were identified that may indicate how selective advantage drives the processes of tumourigenesis and metastasis. 
CONCLUSIONS/SIGNIFICANCE: Our approach allows more precise profiling of functionally relevant epigenetic signatures that are associated with cancer progression and metastasis.

    Quantitative and Automated High-throughput Genome-wide RNAi Screens in C. elegans.

    RNA interference is a powerful method to understand gene function, especially when conducted at a whole-genome scale and in a quantitative context. In C. elegans, gene function can be knocked down simply and efficiently by feeding worms with bacteria expressing a dsRNA corresponding to a specific gene (1). While the creation of libraries of RNAi clones covering most of the C. elegans genome (2,3) opened the way for true functional genomic studies (see for example (4-7)), most established methods are laborious. Moy and colleagues have developed semi-automated protocols that facilitate genome-wide screens (8). The approach relies on microscopic imaging and image analysis. Here we describe an alternative protocol for a high-throughput genome-wide screen, based on robotic handling of bacterial RNAi clones, quantitative analysis using the COPAS Biosort (Union Biometrica (UBI)), and integrated software: the MBioLIMS (Laboratory Information Management System from Modul-Bio), a technology that provides increased throughput for data management and sample tracking. The method allows screens to be conducted on solid medium plates. This is particularly important for some studies, such as those addressing host-pathogen interactions in C. elegans, since certain microbes do not efficiently infect worms in liquid culture. We show how the method can be used to quantify the importance of genes in anti-fungal innate immunity in C. elegans. In this case, the approach relies on the use of a transgenic strain carrying an epidermal infection-inducible fluorescent reporter gene, with GFP under the control of the promoter of the antimicrobial peptide gene nlp 29 and a red fluorescent reporter that is expressed constitutively in the epidermis. The latter provides an internal control for the functional integrity of the epidermis and nonspecific transgene silencing (9). When control worms are infected by the fungus they fluoresce green.
Knocking down a gene required for nlp 29 expression by RNAi results in diminished fluorescence after infection. Currently, this protocol allows more than 3,000 RNAi clones to be tested and analyzed per week, opening the possibility of screening the entire genome in less than 2 months.

    Genomic convergence and network analysis approach to identify candidate genes in Alzheimer's disease

    BACKGROUND: Alzheimer’s disease (AD) is a leading genetically complex and heterogeneous disorder that is influenced by both genetic and environmental factors. The underlying risk factors remain largely unclear for this heterogeneous disorder. In recent years, high-throughput methodologies, such as genome-wide linkage analysis (GWL), genome-wide association (GWA) studies, and genome-wide expression profiling (GWE), have led to the identification of several candidate genes associated with AD. However, due to a lack of consistency among their findings, an integrative approach is warranted. Here, we have designed a rank-based gene prioritization approach involving convergent analysis of multi-dimensional data and protein-protein interaction (PPI) network modelling. RESULTS: Our approach integrates three different AD datasets (GWL, GWA and GWE) to identify overlapping candidate genes, ranked using a novel cumulative rank score (S(R)) based method, followed by prioritization using clusters derived from the PPI network. S(R) for each gene is calculated by adding the ranks assigned to that gene, based on either p value or score, in the three datasets. This analysis yielded 108 plausible AD genes. Network modelling, creating a PPI network from the proteins encoded by these genes and their direct interactors, resulted in a layered network of 640 proteins. Clustering of these proteins further helped us identify 6 significant clusters with 7 proteins (EGFR, ACTB, CDC2, IRAK1, APOE, ABCA1 and AMPH) forming the central hub nodes. Functional annotation of the 108 genes revealed their role in several biological activities such as neurogenesis, regulation of MAP kinase activity, response to calcium ion, and endocytosis, paralleling the AD-specific attributes. Finally, 3 potential biochemical biomarkers were found from the overlap of the 108 AD proteins with proteins from the CSF and plasma proteomes. EGFR and ACTB were found to be the two most significant AD risk genes.
CONCLUSIONS: On the assumption that common genetic signals obtained from different methodological platforms might serve as more robust AD risk markers than candidates identified using a single-dimension approach, we demonstrated an integrated genomic convergence approach for disease candidate gene prioritization from heterogeneous data sources linked to AD. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-199) contains supplementary material, which is available to authorized users.
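The cumulative rank score described in the abstract above (the sum of a gene's ranks across the three datasets) might look something like the sketch below. It is an illustration under stated assumptions rather than the published method: genes are ranked within each dataset before restricting to the overlap, ties are broken arbitrarily by sort order, and every gene name and value is invented.

```python
def rank_genes(values, ascending=True):
    """Map gene -> rank (1 = best). Use ascending=True for p-values,
    ascending=False for scores where larger means stronger evidence."""
    ordered = sorted(values, key=values.get, reverse=not ascending)
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def cumulative_rank_score(rankings):
    """Sum each gene's rank over all datasets; keep only genes present in all."""
    shared = set.intersection(*(set(r) for r in rankings))
    return {gene: sum(r[gene] for r in rankings) for gene in shared}


# Invented inputs: a linkage score (higher is better) and two sets of p-values.
gwl_scores = {"geneA": 3.2, "geneB": 1.1, "geneC": 2.0}
gwa_pvalues = {"geneA": 0.01, "geneB": 0.001, "geneC": 0.2, "geneD": 0.0001}
gwe_pvalues = {"geneA": 0.002, "geneB": 0.03, "geneC": 0.04}

scores = cumulative_rank_score([
    rank_genes(gwl_scores, ascending=False),
    rank_genes(gwa_pvalues),
    rank_genes(gwe_pvalues),
])

# A lower cumulative rank score means higher priority.
for gene in sorted(scores, key=scores.get):
    print(gene, scores[gene])
```

`geneD` is dropped because it appears in only one dataset, which mirrors the emphasis on overlapping candidates.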

    Reconciling gene expression data with regulatory network models – a stimulon-based approach for integrated metabolic and regulatory modeling of Bacillus subtilis

    The reconstruction of genome-scale metabolic models from genome annotations has become a routine practice in Systems Biology research. The potential of metabolic models for predictive biology is widely accepted by the scientific community, but these same models still lack the capability to account for the effect of gene regulation on metabolic activity. Our focus organism, Bacillus subtilis, is most commonly found in soil, where it is subject to a wide variety of external environmental conditions. This reinforces the importance of the regulatory mechanisms that allow the bacteria to survive and adapt to such conditions. Integrated metabolic regulatory models are currently available for only a small number of well-known organisms (e.g., E. coli and B. subtilis). The E. coli integrated model was proposed by Covert et al. in 2004 and has slowly improved over the years. Goelzer et al. introduced the B. subtilis integrated model in 2008, covering only the central metabolic pathways. Different strategies were used in the two modeling efforts. The E. coli model is defined by a set of Boolean rules (turning genes ON and OFF) accounting mostly for transcription factors, gene interactions, involved metabolites, and some external conditions such as heat shock. The B. subtilis model introduces a set of more complex rules and also incorporates sigma factor activity into the modeling abstraction. Here we propose a genome-scale model for the regulatory network of B. subtilis, using a new stimulon-based approach. A stimulon is defined as the set of genes (that can be a part of the same operon(s) and regulon(s)) that respond to the same set of stimuli. The proposed stimulon-based approach allows for the inclusion of more types of regulation in the model.
This methodology also abstracts away much of the complexity of regulatory mechanisms by directly connecting the activity of genes to the presence or absence of associated stimuli, a necessity in the many cases where details of regulatory mechanisms are poorly understood. Our model integrates regulatory network data from the Goelzer et al model, in addition to other available literature data. We then reconciled our model against a large set of high-quality gene expression data (tiled microarrays for 104 different conditions). The stimulons in our model were split or extended to improve consistency with our expression data, and the stimuli in our model were adjusted to improve consistency with the conditions of our expression experiments. The reconciliation with gene expression data revealed a significant number of exact or nearly exact matches between the manually curated regulons/stimulons and pure correlation-based regulons. Our reconciliation analysis of the 2011 SubtiWiki regulon release suggested many gene candidates for regulon extension that were subsequently included in the 2013 SubtiWiki update. Our enhanced model also includes an improved coverage of a wide range of different stress conditions. We then integrated our regulatory model with the latest metabolic reconstruction for B. subtilis, the iBsu1103V2 model (Tanaka et al. 2012). We applied this integrated metabolic regulatory model to the simulation of all growth phenotype data currently available for B. subtilis, demonstrating how the addition of regulatory constraints improved consistency of model predictions with experimentally observed phenotype data. This analysis of growth phenotype data unveiled phenotypes that could only be characterized with the addition of regulatory network constraints. All tools applied in the reconstruction, simulation, and curation of our new regulatory model are now publicly available as a part of the KBase framework. 
These tools permit the direct simulation of gene expression data using the regulon model alone, as well as the simulation of phenotypes and growth conditions using an integrated metabolic and regulatory model. We will highlight these new tools in the context of our reconstruction and analysis of the B. subtilis regulatory model.
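The core idea in the abstract above, connecting gene activity directly to the presence or absence of stimuli, can be illustrated with a toy Boolean sketch. This is not the KBase implementation or the published model: the `Stimulon` record, the rule that all of a stimulon's stimuli must be present, and the gene and stimulus names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stimulon:
    """Hypothetical record: genes that respond together to one set of stimuli."""
    name: str
    stimuli: frozenset  # assumption: all listed stimuli must be present
    genes: frozenset


def active_genes(stimulons, present_stimuli):
    """Return the genes switched ON under the given set of present stimuli."""
    present = set(present_stimuli)
    on = set()
    for s in stimulons:
        if s.stimuli <= present:  # every triggering stimulus is present
            on |= s.genes
    return on


# Made-up stimulons, not taken from the B. subtilis model.
stimulons = [
    Stimulon("heat_response", frozenset({"heat_shock"}), frozenset({"geneA", "geneB"})),
    Stimulon("starvation_response", frozenset({"glucose_absent"}), frozenset({"geneB", "geneC"})),
]

print(sorted(active_genes(stimulons, {"heat_shock"})))
```

In an integrated model, the resulting ON/OFF gene states would then constrain which metabolic reactions are allowed to carry flux; that coupling is outside the scope of this sketch.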

    Systematic identification of functional plant modules through the integration of complementary data sources

    A major challenge is to unravel how genes interact and are regulated to exert specific biological functions. The integration of genome-wide functional genomics data, followed by the construction of gene networks, provides a powerful approach to identify functional gene modules. Large-scale expression data, functional gene annotations, experimental protein-protein interactions, and transcription factor-target interactions were integrated to delineate modules in Arabidopsis (Arabidopsis thaliana). The different experimental input data sets showed little overlap, demonstrating the advantage of combining multiple data types to study gene function and regulation. In the set of 1,563 modules covering 13,142 genes, most modules displayed strong coexpression, but functional and cis-regulatory coherence was less prevalent. Highly connected hub genes showed a significant enrichment toward embryo lethality and evidence for cross talk between different biological processes. Comparative analysis revealed that 58% of the modules showed conserved coexpression across multiple plants. Using module-based functional predictions, 5,562 genes were annotated, and an evaluation experiment disclosed that, based on 197 recently experimentally characterized genes, 38.1% of these functions could be inferred through the module context. Examples of confirmed genes of unknown function related to cell wall biogenesis, xylem and phloem pattern formation, cell cycle, hormone stimulus, and circadian rhythm highlight the potential to identify new gene functions. The module-based predictions offer new biological hypotheses for functionally unknown genes in Arabidopsis (1,701 genes) and six other plant species (43,621 genes). Furthermore, the inferred modules provide new insights into the conservation of coexpression and coregulation as well as a starting point for comparative functional annotation

    Automated data integration for developmental biological research

    In an era exploding with genome-scale data, a major challenge for developmental biologists is how to extract significant clues from these publicly available data to benefit our studies of individual genes, and how to use them to improve our understanding of development at a systems level. Several studies have successfully demonstrated new approaches to classic developmental questions by computationally integrating various genome-wide data sets. Such computational approaches have shown great potential for facilitating research: instead of testing 20,000 genes, researchers might test 200 to the same effect. We discuss the nature and state of this art as it applies to developmental research.