    Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network

    Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism

    Expression of seven members of the gene family encoding secretory aspartyl proteinases in Candida albicans

    The opportunistic fungal pathogen Candida albicans produces secretory aspartyl proteinases, which are believed to be virulence factors in infection. We have studied the in vitro expression of seven known members of the SAP gene family in a range of strains and serotypes by Northern analysis. SAP1 and SAP3 were regulated during phenotypic switching between the white and opaque forms of the organism. The SAP2 mRNA, which was the dominant transcript in the yeast form, was found to be autoinduced by peptide products of Sap2 activity and to be repressed by amino acids. The expression of the closely related SAP4-SAP6 genes was observed only at neutral pH during serum-induced yeast to hyphal transition. No SAP7 mRNA was detected under any of the conditions or in any of the strains tested. Our data suggest that the various members of the SAP gene family may have distinct roles in the colonization and invasion of the host

    A promoter-level mammalian expression atlas

    Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly ‘housekeeping’, whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research

    Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network

    10.1038/s41467-021-23143-7 Nature Communications 121329

    ALICE: Physics performance report, volume I

    ALICE is a general-purpose heavy-ion experiment designed to study the physics of strongly interacting matter and the quark-gluon plasma in nucleus-nucleus collisions at the LHC. It currently includes more than 900 physicists and senior engineers, from both nuclear and high-energy physics, from about 80 institutions in 28 countries. The experiment was approved in February 1997. The detailed design of the different detector systems has been laid down in a number of Technical Design Reports issued between mid-1998 and the end of 2001, and construction has started for most detectors. Since the last comprehensive information on detector and physics performance was published in the ALICE Technical Proposal in 1996, the detector as well as simulation, reconstruction and analysis software have undergone significant development. The Physics Performance Report (PPR) will give an updated and comprehensive summary of the current status and performance of the various ALICE subsystems, including updates to the Technical Design Reports, where appropriate, as well as a description of systems which have not been published in a Technical Design Report. The PPR will be published in two volumes. The current Volume I contains: 1. a short theoretical overview and an extensive reference list concerning the physics topics of interest to ALICE, 2. relevant experimental conditions at the LHC, 3. a short summary and update of the subsystem designs, and 4. a description of the offline framework and Monte Carlo generators. Volume II, which will be published separately, will contain detailed simulations of combined detector performance, event reconstruction, and analysis of a representative sample of relevant physics observables from global event characteristics to hard processes. © 2004 IOP Publishing Ltd