
    Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network

    Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.

    Pathway Analysis Approaches for Rare and Common Variants: Insights From Genetic Analysis Workshop 18

    Pathway analysis, broadly defined as a group of methods incorporating a priori biological information from public databases, has emerged as a promising approach for analyzing high-dimensional genomic data. As part of Genetic Analysis Workshop 18, seven research groups applied pathway analysis techniques to whole-genome sequence data from the San Antonio Family Study. Overall, the groups found that the potential of pathway analysis to improve detection of causal variants by lowering the multiple-testing burden and incorporating biologic insight remains largely unrealized. Specifically, there is a lack of best practices at each stage of the pathway approach: annotation, analysis, interpretation, and follow-up. Annotation of genetic variants is inconsistent across databases, incomplete, and biased toward known genes. At the analysis stage insufficient statistical power remains a major challenge. Analyses combining rare and common variants may have an inflated type I error rate and may not improve detection of causal genes. Inclusion of known causal genes may not improve statistical power, although the fraction of explained phenotypic variance may be a more appropriate metric. Interpretation of findings is further complicated by evidence in support of interactions between pathways and by the lack of consensus on how to best incorporate functional information. Finally, all presented approaches warranted follow-up studies, both to reduce the likelihood of false-positive findings and to identify specific causal variants within a given pathway. Despite the initial promise of pathway analysis for modeling biological complexity of disease phenotypes, many methodological challenges currently remain to be addressed

    A collection of enhancer trap insertional mutants for functional genomics in tomato

    With the completion of genome sequencing projects, the next challenge is to close the gap between gene annotation and gene functional assignment. Genomic tools to identify gene functions are based on the analysis of phenotypic variations between a wild type and its mutant; hence, mutant collections are a valuable resource. In this sense, T-DNA collections allow for an easy and straightforward identification of the tagged gene, serving as the basis of both forward and reverse genetic strategies. This study reports on the phenotypic and molecular characterization of an enhancer trap T-DNA collection in tomato (Solanum lycopersicum L.), which has been produced by Agrobacterium-mediated transformation using a binary vector bearing a minimal promoter fused to the uidA reporter gene. Two genes have been isolated from different T-DNA mutants, one of these genes codes for a UTP-glucose-1-phosphate uridylyltransferase involved in programmed cell death and leaf development, which means a novel gene function reported in tomato. Together, our results support that enhancer trapping is a powerful tool to identify novel genes and regulatory elements in tomato and that this T-DNA mutant collection represents a highly valuable resource for functional analyses in this fleshy-fruited model species.
    This work was supported by research grants from the Spanish Ministerio de Economia y Competitividad (AGL2012-40150-C02-01, AGL2012-40150-C02-02, AGL2015-64991-C3-1-R and AGL2015-64991-C3-3-R), Junta de Andalucia (P10-AGR-6931) and UE-FEDER. B.P. received a JAE-Doc research contract from the CSIC (Spain). PhD fellowships were funded by the FPU (M.G-A. and R.F.) and the FPI (M.P.A., S.S.A. F-L., A.O-A and L.C.) Programmes of the Ministerio de Ciencia e Innovacion, the JAE predoc Programme of the Spanish CSIC (G.G.), the CONACYT and Universidad de Sinaloa of Mexico (J.S.) and the LASPAU (J.L.Q.). The authors thank research facilities provided by the Campus de Excelencia Internacional Agroalimentario (CeiA3).
    Pérez-Martín, F.; Yuste-Lisbona, FJ.; Pineda Chaza, BJ.; Angarita-Díaz, MP.; García Sogo, B.; Antón Martínez, MT.; Sanchez Martín-Sauceda, S.... (2017). A collection of enhancer trap insertional mutants for functional genomics in tomato. Plant Biotechnology Journal. 15(11):1439-1452. https://doi.org/10.1111/pbi.12728

    Diagnostic and prognostic factors in patients with prostate cancer : a systematic review

    Funding: PIONEER is funded through the IMI2 Joint Undertaking and is listed under Grant Agreement No. 777492 and is part of the Big Data for Better Outcomes Programme (BD4BO). IMI2 receives support from the European Union’s Horizon 2020 research and innovation programme and the European Federation of Pharmaceutical Industries and Associations (EFPIA). The views communicated within are those of PIONEER. Neither the IMI nor the European Union, EFPIA, or any Associated Partners are responsible for any use that may be made of the information contained herein.

    A clonal expression biomarker associates with lung cancer mortality

    An aim of molecular biomarkers is to stratify patients with cancer into disease subtypes predictive of outcome, improving diagnostic precision beyond clinical descriptors such as tumor stage [1]. Transcriptomic intratumor heterogeneity (RNA-ITH) has been shown to confound existing expression-based biomarkers across multiple cancer types [2-6]. Here, we analyze multi-region whole-exome and RNA sequencing data for 156 tumor regions from 48 patients enrolled in the TRACERx study to explore and control for RNA-ITH in non-small cell lung cancer. We find that chromosomal instability is a major driver of RNA-ITH, and existing prognostic gene expression signatures are vulnerable to tumor sampling bias. To address this, we identify genes expressed homogeneously within individual tumors that encode expression modules of cancer cell proliferation and are often driven by DNA copy-number gains selected early in tumor evolution. Clonal transcriptomic biomarkers overcome tumor sampling bias, associate with survival independent of clinicopathological risk factors, and may provide a general strategy to refine biomarker design across cancer types.

    Differential usage of transcriptional start sites and polyadenylation sites in FMR1 premutation alleles†

    5′- and 3′-untranslated regions (UTRs) are important regulators of gene expression and play key roles in disease progression and susceptibility. The 5′-UTR of the fragile X mental retardation 1 (FMR1) gene contains a CGG repeat element that is expanded (>200 CGG repeats; full mutation) and methylated in fragile X syndrome (FXS), the most common form of inherited intellectual disability (ID) and known cause of autism. Significant phenotypic involvement has also emerged in some individuals with the premutation (55–200 CGG repeats), including fragile X-associated premature ovarian insufficiency (FXPOI) in females, and the neurodegenerative disorder, fragile X-associated tremor/ataxia syndrome (FXTAS), in older adult carriers. Here, we show that FMR1 mRNA in human and mouse brain is expressed as a combination of multiple isoforms that use alternative transcriptional start sites and different polyadenylation sites. Furthermore, we have identified a novel human transcription start site used in brain but not in lymphoblastoid cells, and have detected FMR1 isoforms generated through the use of both canonical and non-canonical polyadenylation signals. Importantly, in both human and mouse, a specific regulation of the UTRs is observed in brain of FMR1 premutation alleles, suggesting that the transcript variants may play a role in premutation-related pathologies

    Neural progenitor cells from an adult patient with fragile X syndrome

    BACKGROUND: Currently, there is no adequate animal model to study the detailed molecular biochemistry of fragile X syndrome, the leading heritable form of mental impairment. In this study, we sought to establish the use of immature neural cells derived from adult tissues as a novel model of fragile X syndrome that could be used to more fully understand the pathology of this neurogenetic disease. METHODS: By modifying published methods for the harvest of neural progenitor cells from the post-mortem human brain, neural cells were successfully harvested and grown from post-mortem brain tissue of a 25-year-old adult male with fragile X syndrome, and from brain tissue of a patient with no neurological disease. RESULTS: The cultured fragile X cells displayed many of the characteristics of neural progenitor cells, including nestin and CD133 expression, as well as the biochemical hallmarks of fragile X syndrome, including CGG repeat expansion and a lack of FMRP expression. CONCLUSION: The successful production of neural cells from an individual with fragile X syndrome opens a new avenue for the scientific study of the molecular basis of this disorder, as well as an approach for studying the efficacy of new therapeutic agents

    Allostatic load and subsequent all-cause mortality: which biological markers drive the relationship? Findings from a UK birth cohort

    The concept of allostatic load (AL) refers to the idea of a global physiological ‘wear and tear’ resulting from the adaptation to the environment through the stress response systems over the life span. The link between socioeconomic position (SEP) and mortality has now been established, and there is evidence that AL may capture the link between SEP and mortality. In order to quantitatively assess the role of AL on mortality, we use data from the 1958 British birth cohort including eleven-year mortality in 8,113 adults. Specifically, we interrogate the hypothesis of a cumulative biological risk (allostatic load) reflecting 4 physiological systems potentially predicting future risk of death (N = 132). AL was defined using 14 biomarkers assayed in blood from a biosample collected at 44 years of age. Cox proportional hazard regression analysis revealed that higher allostatic load at 44 years old was a significant predictor of mortality 11 years later [HR = 3.56 (2.3 to 5.53)]. We found that this relationship was not solely related to early-life SEP, adverse childhood experiences and young adulthood health status, behaviours and SEP [HR = 2.57 (1.59 to 4.15)]. Regarding the ability of each physiological system and biomarkers to predict future death, our results suggest that the cumulative measure was advantageous compared to evaluating each physiological system sub-score and biomarker separately. Our findings add some evidence of a biological embodiment in response to stress which ultimately affects mortality.

    Common polygenic variation in coeliac disease and confirmation of ZNF335 and NFIA as disease susceptibility loci

    Coeliac disease (CD) is a chronic immune-mediated disease triggered by the ingestion of gluten. It has an estimated prevalence of approximately 1% in European populations. Specific HLA-DQA1 and HLA-DQB1 alleles are established coeliac susceptibility genes and are required for the presentation of gliadin to the immune system resulting in damage to the intestinal mucosa. In the largest association analysis of CD to date, 39 non-HLA risk loci were identified, 13 of which were new, in a sample of 12 014 individuals with CD and 12 228 controls using the Immunochip genotyping platform. Including the HLA, this brings the total number of known CD loci to 40. We have replicated this study in an independent Irish CD case–control population of 425 CD and 453 controls using the Immunochip platform. Using a binomial sign test, we show that the direction of the effects of previously described risk alleles were highly correlated with those reported in the Irish population (P=2.2 × 10^−16). Using the Polygene Risk Score (PRS) approach, we estimated that up to 35% of the genetic variance could be explained by loci present on the Immunochip (P=9 × 10^−75). When this is limited to non-HLA loci, we explain a maximum of 4.5% of the genetic variance (P=3.6 × 10^−18). Finally, we performed a meta-analysis of our data with the previous reports, identifying two further loci harbouring the ZNF335 and NFIA genes which now exceed genome-wide significance, taking the total number of CD susceptibility loci to 42.

    HVint: a strategy for identifying novel protein-protein interactions in Herpes Simplex Virus Type 1

    Human herpesviruses are widespread human pathogens with a remarkable impact on worldwide public health. Despite intense decades of research, the molecular details in many aspects of their function remain to be fully characterized. To unravel the details of how these viruses operate, a thorough understanding of the relationships between the involved components is key. Here, we present HVint, a novel protein-protein intra-viral interaction resource for herpes simplex virus type 1 (HSV-1) integrating data from five external sources. To assess each interaction, we used a scoring scheme that takes into consideration aspects such as the type of detection method and the number of lines of evidence. The coverage of the initial interactome was further increased using evolutionary information, by importing interactions reported for other human herpesviruses. These latter interactions constitute, therefore, computational predictions for potential novel interactions in HSV-1. An independent experimental analysis was performed to confirm a subset of our predicted interactions. This subset covers proteins that contribute to nuclear egress and primary envelopment events, including VP26, pUL31, pUL40 and the recently characterized pUL32 and pUL21. Our findings support a coordinated crosstalk between VP26 and proteins such as pUL31, pUS9 and the CSVC complex, contributing to the development of a model describing the nuclear egress and primary envelopment pathways of newly synthesized HSV-1 capsids. The results are also consistent with recent findings on the involvement of pUL32 in capsid maturation and early tegumentation events. Further, they open the door to new hypotheses on virus-specific regulators of pUS9-dependent transport. To make this repository of interactions readily accessible for the scientific community, we also developed a user-friendly and interactive web interface. 
    Our approach demonstrates the power of computational predictions to assist in the design of targeted experiments for the discovery of novel protein-protein interactions.