177 research outputs found

    MIBEN: Robust Multiple Imputation with the Bayesian Elastic Net

    Correctly specifying the imputation model when conducting multiple imputation remains one of the most significant challenges in missing data analysis. This dissertation introduces a robust multiple imputation technique, Multiple Imputation with the Bayesian Elastic Net (MIBEN), as a remedy for this difficulty. A Monte Carlo simulation study was conducted to assess the performance of the MIBEN technique and compare it to several state-of-the-art multiple imputation methods.

    Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level: A Monte Carlo Simulation to Assess the Tenability of the SuperMatrix Approach

    A Monte Carlo simulation study was conducted to assess the tenability of a novel treatment of missing data. By aggregating multiply imputed data sets prior to model estimation, the proposed technique allows researchers to reap the benefits of a principled missing data tool (i.e., multiple imputation) while maintaining the simplicity of complete case analysis. In terms of the accuracy of model fit indices derived from confirmatory factor analyses, the proposed technique was found to perform universally better than a naive ad hoc technique consisting of averaging the multiple estimates of model fit derived from a traditionally conceived implementation of multiple imputation. However, the proposed technique performed considerably worse in this task than did full information maximum likelihood (FIML) estimation. Absolute fit indices and residual-based fit indices derived from the proposed technique demonstrated an unacceptable degree of bias in assessing direct model fit, but incremental fit indices led to acceptable conclusions regarding model fit. Chi-squared difference values derived from the proposed technique were unbiased across all study conditions (except for those with very poor parameterizations) and were consistently more accurate than such values derived from the ad hoc comparison condition. It was also found that chi-squared difference values derived from FIML-based models were negatively biased to an unacceptable degree in any condition with more than 10% missing data. Implications, limitations, and future directions of the current work are discussed.

    US Cosmic Visions: New Ideas in Dark Matter 2017: Community Report

    This white paper summarizes the workshop "U.S. Cosmic Visions: New Ideas in Dark Matter" held at the University of Maryland on March 23-25, 2017. Comment: 102 pages + references

    The Fourteenth Data Release of the Sloan Digital Sky Survey: First Spectroscopic Data from the extended Baryon Oscillation Spectroscopic Survey and from the second phase of the Apache Point Observatory Galactic Evolution Experiment

    The fourth generation of the Sloan Digital Sky Survey (SDSS-IV) has been in operation since July 2014. This paper describes the second data release from this phase, and the fourteenth from SDSS overall (making this Data Release Fourteen, or DR14). This release makes public data taken by SDSS-IV in its first two years of operation (July 2014-2016). Like all previous SDSS releases, DR14 is cumulative, including the most recent reductions and calibrations of all data taken by SDSS since the first phase began operations in 2000. New in DR14 is the first public release of data from the extended Baryon Oscillation Spectroscopic Survey (eBOSS); the first data from the second phase of the Apache Point Observatory (APO) Galactic Evolution Experiment (APOGEE-2), including stellar parameter estimates from an innovative data-driven machine learning algorithm known as "The Cannon"; and almost twice as many data cubes from the Mapping Nearby Galaxies at APO (MaNGA) survey as were in the previous release (N = 2812 in total). This paper describes the location and format of the publicly available data from SDSS-IV surveys. We provide references to the important technical papers describing how these data have been taken (both targeting and observation details) and processed for scientific use. The SDSS website (www.sdss.org) has been updated for this release, and provides links to data downloads, as well as tutorials and examples of data use. SDSS-IV is planning to continue to collect astronomical data until 2020, and will be followed by SDSS-V. Comment: SDSS-IV collaboration alphabetical author data release paper. DR14 happened on 31st July 2017. 19 pages, 5 figures. Accepted by ApJS on 28th Nov 2017 (this is the "post-print" and "post-proofs" version; minor corrections only from v1, and most errors found in proofs corrected)

    The Eighth Data Release of the Sloan Digital Sky Survey: First Data from SDSS-III

    The Sloan Digital Sky Survey (SDSS) started a new phase in August 2008, with new instrumentation and new surveys focused on Galactic structure and chemical evolution, measurements of the baryon oscillation feature in the clustering of galaxies and the quasar Ly alpha forest, and a radial velocity search for planets around ~8000 stars. This paper describes the first data release of SDSS-III (and the eighth counting from the beginning of the SDSS). The release includes five-band imaging of roughly 5200 deg^2 in the Southern Galactic Cap, bringing the total footprint of the SDSS imaging to 14,555 deg^2, or over a third of the Celestial Sphere. All the imaging data have been reprocessed with an improved sky-subtraction algorithm and a final, self-consistent photometric recalibration and flat-field determination. This release also includes all data from the second phase of the Sloan Extension for Galactic Understanding and Evolution (SEGUE-2), consisting of spectroscopy of approximately 118,000 stars at both high and low Galactic latitudes. All the more than half a million stellar spectra obtained with the SDSS spectrograph have been reprocessed through an improved stellar parameters pipeline, which has better determination of metallicity for high metallicity stars. Comment: Astrophysical Journal Supplements, in press (minor updates from submitted version)

    Pan-Cancer Analysis of lncRNA Regulation Supports Their Targeting of Cancer Genes in Each Tumor Context

    Long noncoding RNAs (lncRNAs) are commonly dysregulated in tumors, but only a handful are known to play pathophysiological roles in cancer. We inferred lncRNAs that dysregulate cancer pathways, oncogenes, and tumor suppressors (cancer genes) by modeling their effects on the activity of transcription factors, RNA-binding proteins, and microRNAs in 5,185 TCGA tumors and 1,019 ENCODE assays. Our predictions included hundreds of candidate onco- and tumor-suppressor lncRNAs (cancer lncRNAs) whose somatic alterations account for the dysregulation of dozens of cancer genes and pathways in each of 14 tumor contexts. To demonstrate proof of concept, we showed that perturbations targeting OIP5-AS1 (an inferred tumor suppressor) and TUG1 and WT1-AS (inferred onco-lncRNAs) dysregulated cancer genes and altered proliferation of breast and gynecologic cancer cells. Our analysis indicates that, although most lncRNAs are dysregulated in a tumor-specific manner, some, including OIP5-AS1, TUG1, NEAT1, MEG3, and TSIX, synergistically dysregulate cancer pathways in multiple tumor contexts.

    Pan-cancer Alterations of the MYC Oncogene and Its Proximal Network across the Cancer Genome Atlas

    Although the MYC oncogene has been implicated in cancer, a systematic assessment of alterations of MYC, related transcription factors, and co-regulatory proteins, forming the proximal MYC network (PMN), across human cancers is lacking. Using computational approaches, we define genomic and proteomic features associated with MYC and the PMN across the 33 cancers of The Cancer Genome Atlas. Pan-cancer, 28% of all samples had at least one of the MYC paralogs amplified. In contrast, the MYC antagonists MGA and MNT were the most frequently mutated or deleted members, proposing a role as tumor suppressors. MYC alterations were mutually exclusive with PIK3CA, PTEN, APC, or BRAF alterations, suggesting that MYC is a distinct oncogenic driver. Expression analysis revealed MYC-associated pathways in tumor subtypes, such as immune response and growth factor signaling; chromatin, translation, and DNA replication/repair were conserved pan-cancer. This analysis reveals insights into MYC biology and is a reference for biomarkers and therapeutics for cancers with alterations of MYC or the PMN.

    Genomic, Pathway Network, and Immunologic Features Distinguishing Squamous Carcinomas

    This integrated, multiplatform PanCancer Atlas study co-mapped and identified distinguishing molecular features of squamous cell carcinomas (SCCs) from five sites associated with smoking

    Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images

    Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL maps are derived through computational staining using a convolutional neural network trained to classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and correlation with overall survival. TIL map structural patterns were grouped using standard histopathological parameters. These patterns are enriched in particular T cell subpopulations derived from molecular measures. TIL densities and spatial structure were differentially enriched among tumor types, immune subtypes, and tumor molecular subtypes, implying that spatial infiltrate state could reflect particular tumor cell aberration states. Obtaining spatial lymphocytic patterns linked to the rich genomic characterization of TCGA samples demonstrates one use for the TCGA image archives with insights into the tumor-immune microenvironment.