    GENCODE: reference annotation for the human and mouse genomes in 2023.

    GENCODE produces high quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org

    GENCODE reference annotation for the human and mouse genomes

    The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature, to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org. National Human Genome Research Institute of the National Institutes of Healt

    Amino Acid-Coupled Bromophenols and a Sulfated Dimethylsulfonium Lanosol from the Red Alga Vertebrata lanosa

    Vertebrata lanosa is a red alga that can commonly be found along the shores of Europe and North America. Its composition of bromophenols has been studied intensely. The aim of the current study was therefore to further investigate the phytochemistry of this alga, focusing more on the polar components. In total, 23 substances were isolated, including lanosol-4,7-disulfate (4) and the new compounds 3,5-dibromotyrosine (12), 3-bromo-5-sulfodihydroxyphenylalanine (13), 3-bromo-6-lanosyl dihydroxyphenylalanine (14), 3-(6′-lanosyl lanosyl) tyrosine (15) and 5-sulfovertebratol (16). In addition, 4-sulfo-7-dimethylsulfonium lanosol (7) was identified. While, in general, the dimethylsulfonium moiety is widespread in algae, its appearance in bromophenol is unique. Moreover, the major glycerogalactolipids, including the new ((5Z,8Z,11Z,14Z,17Z)-eicosapentaenoic acid 3′-[(6′′-O-α-galactopyranosyl-β-D-galactopyranosyl)]-1-glycerol ester (23), and mycosporine-like amino acids, porphyra-334 (17), aplysiapalythine A (18) and palythine (19), were identified.