125 research outputs found

    Using equivalence class counts for fast and accurate testing of differential transcript usage [version 2; peer review: 1 approved, 2 approved with reservations]

    Get PDF
    Background: RNA sequencing has enabled high-throughput and fine-grained quantitative analyses of the transcriptome. While differential gene expression is the most widely used application of this technology, RNA-seq data also has the resolution to infer differential transcript usage (DTU), which can elucidate the role of different transcript isoforms between experimental conditions, cell types or tissues. DTU has typically been inferred from exon-count data, which has issues with assigning reads unambiguously to counting bins, and requires alignment of reads to the genome. Recently, approaches have emerged that use transcript quantification estimates directly for DTU. Transcript counts can be inferred from 'pseudo' or lightweight aligners, which are significantly faster than traditional genome alignment. However, recent evaluations show lower sensitivity in DTU analysis compared to exon-level analysis. Transcript abundances are estimated from equivalence classes (ECs), which determine the transcripts that any given read is compatible with. Recent work has proposed performing a variety of RNA-seq analysis directly on equivalence class counts (ECCs). Methods: Here we demonstrate that ECCs can be used effectively with existing count-based methods for detecting DTU. We evaluate this approach on simulated human and drosophila data, as well as on a real dataset through subset testing. Results: We find that ECCs have similar sensitivity and false discovery rates as exon-level counts but can be generated in a fraction of the time through the use of pseudo-aligners. Conclusions: We posit that equivalence class read counts are a natural unit on which to perform differential transcript usage analysis

    The Fourteenth Data Release of the Sloan Digital Sky Survey: First Spectroscopic Data from the extended Baryon Oscillation Spectroscopic Survey and from the second phase of the Apache Point Observatory Galactic Evolution Experiment

    Get PDF
    The fourth generation of the Sloan Digital Sky Survey (SDSS-IV) has been in operation since July 2014. This paper describes the second data release from this phase, and the fourteenth from SDSS overall (making this, Data Release Fourteen or DR14). This release makes public data taken by SDSS-IV in its first two years of operation (July 2014-2016). Like all previous SDSS releases, DR14 is cumulative, including the most recent reductions and calibrations of all data taken by SDSS since the first phase began operations in 2000. New in DR14 is the first public release of data from the extended Baryon Oscillation Spectroscopic Survey (eBOSS); the first data from the second phase of the Apache Point Observatory (APO) Galactic Evolution Experiment (APOGEE-2), including stellar parameter estimates from an innovative data driven machine learning algorithm known as "The Cannon"; and almost twice as many data cubes from the Mapping Nearby Galaxies at APO (MaNGA) survey as were in the previous release (N = 2812 in total). This paper describes the location and format of the publicly available data from SDSS-IV surveys. We provide references to the important technical papers describing how these data have been taken (both targeting and observation details) and processed for scientific use. The SDSS website (www.sdss.org) has been updated for this release, and provides links to data downloads, as well as tutorials and examples of data use. SDSS-IV is planning to continue to collect astronomical data until 2020, and will be followed by SDSS-V.Comment: SDSS-IV collaboration alphabetical author data release paper. DR14 happened on 31st July 2017. 19 pages, 5 figures. Accepted by ApJS on 28th Nov 2017 (this is the "post-print" and "post-proofs" version; minor corrections only from v1, and most of errors found in proofs corrected

    Francisella tularensis novicida proteomic and transcriptomic data integration and annotation based on semantic web technologies

    Get PDF
    This paper summarises the lessons and experiences gained from a case study of the application of semantic web technologies to the integration of data from the bacterial species Francisella tularensis novicida (Fn). Fn data sources are disparate and heterogeneous, as multiple laboratories across the world, using multiple technologies, perform experiments to understand the mechanism of virulence. It is hard to integrate these data sources in a flexible manner that allows new experimental data to be added and compared when required

    Tailoring therapies—improving the management of early breast cancer: St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2015

    Get PDF
    The 14th St Gallen International Breast Cancer Conference (2015) reviewed new evidence on locoregional and systemic therapies for early breast cancer. This manuscript presents news and progress since the 2013 meeting, provides expert opinion on almost 200 questions posed to Consensus Panel members, and summarizes treatment-oriented classification of subgroups and treatment recommendation

    The Fourteenth Data Release of the Sloan Digital Sky Survey: First Spectroscopic Data from the Extended Baryon Oscillation Spectroscopic Survey and from the Second Phase of the Apache Point Observatory Galactic Evolution Experiment

    Get PDF
    The fourth generation of the Sloan Digital Sky Survey (SDSS-IV) has been in operation since 2014 July. This paper describes the second data release from this phase, and the 14th from SDSS overall (making this Data Release Fourteen or DR14). This release makes the data taken by SDSS-IV in its first two years of operation (2014–2016 July) public. Like all previous SDSS releases, DR14 is cumulative, including the most recent reductions and calibrations of all data taken by SDSS since the first phase began operations in 2000. New in DR14 is the first public release of data from the extended Baryon Oscillation Spectroscopic Survey; the first data from the second phase of the Apache Point Observatory (APO) Galactic Evolution Experiment (APOGEE-2), including stellar parameter estimates from an innovative data-driven machine-learning algorithm known as "The Cannon"; and almost twice as many data cubes from the Mapping Nearby Galaxies at APO (MaNGA) survey as were in the previous release (N = 2812 in total). This paper describes the location and format of the publicly available data from the SDSS-IV surveys. We provide references to the important technical papers describing how these data have been taken (both targeting and observation details) and processed for scientific use. The SDSS web site (www.sdss.org) has been updated for this release and provides links to data downloads, as well as tutorials and examples of data use. SDSS-IV is planning to continue to collect astronomical data until 2020 and will be followed by SDSS-V
    corecore