22 research outputs found

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006
    Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.
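
The comparison described in this abstract can be illustrated with a minimal sketch. It assumes each call set is a dictionary keyed by a hypothetical `(chrom, pos, ref, alt)` tuple with `(alt_reads, total_reads)` values, and it measures overlap as shared calls over the union of calls. Neither the data layout nor the overlap definition comes from the abstract; only the 15% VAF threshold does.

```python
def vaf(alt_reads, total_reads):
    """Variant allele fraction: supporting reads over total reads."""
    return alt_reads / total_reads if total_reads else 0.0


def compare_calls(wes, wgs, low_vaf=0.15):
    """Compare two call sets (hypothetical layout: key -> (alt_reads, total_reads)).

    Overlap is defined here as shared / union, which is an assumption.
    """
    shared = wes.keys() & wgs.keys()
    union = wes.keys() | wgs.keys()
    private_wgs = wgs.keys() - wes.keys()
    # Private WGS calls whose VAF falls below the low-VAF threshold
    low = [k for k in private_wgs if vaf(*wgs[k]) < low_vaf]
    return {
        "overlap_fraction": len(shared) / len(union) if union else 0.0,
        "private_wgs": len(private_wgs),
        "private_wgs_low_vaf": len(low),
    }


# Toy data, invented purely for illustration
wes_calls = {("chr1", 100, "A", "T"): (10, 40), ("chr2", 200, "G", "C"): (5, 50)}
wgs_calls = {
    ("chr1", 100, "A", "T"): (8, 30),
    ("chr3", 300, "C", "T"): (2, 30),
    ("chr4", 400, "T", "G"): (12, 30),
}
summary = compare_calls(wes_calls, wgs_calls)
```

A real comparison would first restrict both call sets to the covered exonic regions, as the abstract states; that step is omitted here.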

    Optical calibration of the SuperCam instrument body unit spectrometers

    The SuperCam remote sensing instrument on NASA's Perseverance rover is capable of four spectroscopic techniques, remote micro-imaging, and audio recording. These analytical techniques provide details of the chemistry and mineralogy of the rocks and soils probed in the Jezero Crater on Mars. Here we present the methods used for optical calibration of the three spectrometers covering the 243-853 nm range used by three of the four spectroscopic techniques. We derive the instrument optical response, which characterizes the instrument sensitivity to incident radiation as a function of wavelength. The instrument optical response function derived here is an essential step in the interpretation of the spectra returned by SuperCam as it converts the observed spectra, reported by the instrument as "digital counts" from an analog to digital converter, into physical values of spectral radiance.
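
As a rough illustration of how an optical response function turns digital counts into radiance, the sketch below assumes the simplest possible model: the response at each wavelength channel is the ratio of counts to known radiance measured on a calibration source, and an observed spectrum is divided by that response. The function names and the purely multiplicative model are assumptions. The abstract does not give the actual calibration equation, which may also depend on terms such as integration time, gain, or dark signal.

```python
def optical_response(cal_counts, cal_radiance):
    """Per-channel response (counts per unit radiance) from a calibration source.

    Assumes a purely multiplicative response; this is a simplification.
    """
    return [c / r for c, r in zip(cal_counts, cal_radiance)]


def counts_to_radiance(counts, response):
    """Convert observed digital counts to radiance by dividing out the response."""
    return [c / r for c, r in zip(counts, response)]


# Toy two-channel example with invented numbers
response = optical_response([200.0, 400.0], [2.0, 8.0])
radiance = counts_to_radiance([50.0, 100.0], response)
```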

    MULTIVARIATE AND ENSEMBLE MANGANESE CALIBRATION MODELS FOR SUPERCAM

    Operator (LASSO) multivariate techniques with blended submodels; similar to the calibration model used from ChemCam [6], and then compared this model to ensemble methods [5,7]. Blended submodels split the data into smaller portions, train linear models on these portions, and then optimize the blend ranges of the submodels to cover the full data range [8]. The process of creating optimized submodels is time-consuming and may not yield the best model possible. Ensemble methods are non-linear, and would negate the need to train and optimize submodels. The response of the instrument to atomic emission is likely non-linear, and thus ensemble methods are likely to have better success in calibration than our previous attempts using LASSO, PLS, etc. Ensemble methods tested include Gradient Boosting, Random Forests, and Extra Trees [9]. Methods: Data Collection and Pre-processing. A standard set, for which MnO content is known, consisting of 252 training and 70 test standards, was analyzed using the SuperCam flight model from 1.6 m distance (3 average spectra were collected on each standard consisting of 50 shots averaged in each point) under a Mars-like atmosphere [5]. The standard set covers a range of Mn compositions from 0.0009-76 wt% MnO and contains a variety of rock matrices (e.g., rock, mineral, Mn ores). No outliers were removed. We use the Python Hyperspectral Analysis Tool [10] and the associated graphical interface for point spectra analysis [10] to preprocess the data and evaluate multivariate regression models. Ensemble methods were trained using Python scikit-learn [7,9]. Each spectrum is normalized by the sum of the total emission for each detector [5]. A "peak area" (PA) preprocessing technique is used [6], where local minima and maxima of the average spectrum of the dataset are determined. The process then bins the emission between each pair of minima and assigns the result to the wavelength of the corresponding maximum. We compared full spectra with peak area spectra for this work. Based on preliminary work, we masked wavelengths ≥750 nm, where there are no Mn emission lines, to remove lines from alkali, minor elements, and oxygen, all of which had some influence on the LASSO model.
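
The normalization and "peak area" steps described above can be sketched as follows. This is a minimal pure-Python illustration, not the Python Hyperspectral Analysis Tool implementation: the function names are hypothetical, a single detector segment is assumed, and the first and last channels are treated as bin edges, which the abstract does not specify.

```python
def normalize_by_total(spectrum):
    """Normalize one detector's spectrum by its total emission."""
    total = sum(spectrum)
    return [v / total for v in spectrum]


def local_minima(values):
    """Indices of interior local minima, with both endpoints added as bin edges."""
    edges = [0]
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] <= values[i + 1]:
            edges.append(i)
    edges.append(len(values) - 1)
    return edges


def peak_area(wavelengths, spectrum, mean_spectrum):
    """Sum emission between consecutive minima of the dataset's mean spectrum.

    Each bin's total is assigned to the wavelength of the mean spectrum's
    maximum inside that bin. Returns {peak_wavelength: summed_emission}.
    """
    edges = local_minima(mean_spectrum)
    binned = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Half-open bins so each channel is counted once; the last bin
        # also includes the final channel.
        stop = hi + 1 if hi == edges[-1] else hi
        peak = max(range(lo, stop), key=lambda i: mean_spectrum[i])
        binned[wavelengths[peak]] = sum(spectrum[lo:stop])
    return binned


# Toy 7-channel example with invented values
wl = [400, 401, 402, 403, 404, 405, 406]
mean = [1, 5, 2, 1, 7, 3, 1]
areas = peak_area(wl, mean, mean)
```

The bin edges come from the mean spectrum of the whole dataset, so every individual spectrum is reduced to the same set of peak wavelengths and can be fed to a regression model as a fixed-length feature vector.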

    Dark microbiome and extremely low organics in Atacama fossil delta unveil Mars life detection limits

    Identifying unequivocal signs of life on Mars is one of the most important objectives for sending missions to the red planet. Here we report Red Stone, a 163-100 My alluvial fan–fan delta that formed under arid conditions in the Atacama Desert, rich in hematite and mudstones containing clays such as vermiculite and smectites, and therefore geologically analogous to Mars. We show that Red Stone samples display an important number of microorganisms with an unusually high rate of phylogenetic indeterminacy, what we refer to as “dark microbiome”, and a mix of biosignatures from extant and ancient microorganisms that can be barely detected with state-of-the-art laboratory equipment. Our analyses by testbed instruments that are on or will be sent to Mars unveil that although the mineralogy of Red Stone matches that detected by ground-based instruments on the red planet, similarly low levels of organics will be hard, if not impossible, to detect in Martian rocks depending on the instrument and technique used. Our results stress the importance of returning samples to Earth for conclusively addressing whether life ever existed on Mars.

    Post-landing major element quantification using SuperCam laser induced breakdown spectroscopy

    The SuperCam instrument on the Perseverance Mars 2020 rover uses a pulsed 1064 nm laser to ablate targets at a distance and conduct laser induced breakdown spectroscopy (LIBS) by analyzing the light from the resulting plasma. SuperCam LIBS spectra are preprocessed to remove ambient light, noise, and the continuum signal present in LIBS observations. Prior to quantification, spectra are masked to remove noisier spectrometer regions and spectra are normalized to minimize signal fluctuations and effects of target distance. In some cases, the spectra are also standardized or binned prior to quantification. To determine quantitative elemental compositions of diverse geologic materials at Jezero crater, Mars, we use a suite of 1198 laboratory spectra of 334 well-characterized reference samples. The samples were selected to span a wide range of compositions and include typical silicate rocks, pure minerals (e.g., silicates, sulfates, carbonates, oxides), more unusual compositions (e.g., Mn ore and sodalite), and replicates of the sintered SuperCam calibration targets (SCCTs) onboard the rover. For each major element (SiO2, TiO2, Al2O3, FeOT, MgO, CaO, Na2O, K2O), the database was subdivided into five "folds" with similar distributions of the element of interest. One fold was held out as an independent test set, and the remaining four folds were used to optimize multivariate regression models relating the spectrum to the composition. We considered a variety of models, and selected several for further investigation for each element, based primarily on the root mean squared error of prediction (RMSEP) on the test set, when analyzed at 3 m. In cases with several models of comparable performance at 3 m, we incorporated the SCCT performance at different distances to choose the preferred model. Shortly after landing on Mars and collecting initial spectra of geologic targets, we selected one model per element. Subsequently, with additional data from geologic targets, some models were revised to ensure results that are more consistent with geochemical constraints. The calibration discussed here is a snapshot of an ongoing effort to deliver the most accurate chemical compositions with SuperCam LIBS.
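
The fold construction and the RMSEP metric mentioned in this abstract can be sketched in a few lines. The abstract does not say how folds with "similar distributions" were built; the version below assumes one common approach, sorting samples by the element's abundance and dealing them round-robin into five folds. The function names are hypothetical.

```python
import math


def stratified_folds(values, n_folds=5):
    """Split sample indices into folds with similar distributions of `values`.

    Assumed approach: rank samples by value and deal them round-robin.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    folds = [[] for _ in range(n_folds)]
    for rank, idx in enumerate(order):
        folds[rank % n_folds].append(idx)
    return folds


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction on a held-out set."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Toy abundances (wt%), invented for illustration
abundance = [10.0, 50.0, 20.0, 70.0, 30.0, 60.0, 40.0, 80.0, 90.0, 100.0]
folds = stratified_folds(abundance)
test_fold, train_folds = folds[0], folds[1:]  # one fold held out, four for tuning
```

Because spectra of the same reference sample are correlated, a practical split would assign folds per sample rather than per spectrum, so that no sample appears in both the training and test sets.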