75 research outputs found
Stabilizing Estimates of Shapley Values with Control Variates
Shapley values are among the most popular tools for explaining predictions of
blackbox machine learning models. However, their high computational cost
motivates the use of sampling approximations, inducing a considerable degree of
uncertainty. To stabilize these model explanations, we propose ControlSHAP, an
approach based on the Monte Carlo technique of control variates. Our
methodology is applicable to any machine learning model and requires virtually
no extra computation or modeling effort. On several high-dimensional datasets,
we find it can produce dramatic reductions in the Monte Carlo variability of
Shapley estimates
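As a rough illustration of the control-variates technique this abstract refers to, the sketch below applies it to a toy Monte Carlo problem. It is not ControlSHAP and not the authors' code. The target function exp(U), the control U with known mean 0.5, and all function names are assumptions chosen only to show how a correlated quantity with a known expectation can reduce the variance of a sampled estimate.

```python
import math
import random
import statistics


def control_variate_estimate(f_samples, g_samples, g_mean):
    """Adjust the sample mean of f using a correlated control g whose true mean is known."""
    n = len(f_samples)
    f_bar = statistics.fmean(f_samples)
    g_bar = statistics.fmean(g_samples)
    var_g = statistics.variance(g_samples)
    if var_g == 0:
        return f_bar  # the control carries no information
    # Sample covariance between f and g.
    cov_fg = sum((f - f_bar) * (g - g_bar)
                 for f, g in zip(f_samples, g_samples)) / (n - 1)
    beta = cov_fg / var_g  # variance-minimising coefficient, estimated from the sample
    return f_bar - beta * (g_bar - g_mean)


def run_trial(rng, n=200):
    """Return (plain estimate, control-variate estimate) of E[exp(U)], U ~ Uniform(0, 1)."""
    xs = [rng.random() for _ in range(n)]
    f = [math.exp(x) for x in xs]  # toy target; true mean is e - 1
    return statistics.fmean(f), control_variate_estimate(f, xs, 0.5)


rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [run_trial(rng) for _ in range(300)]
plain_sd = statistics.stdev([t[0] for t in trials])
cv_sd = statistics.stdev([t[1] for t in trials])
cv_mean = statistics.fmean([t[1] for t in trials])
print(f"plain sd: {plain_sd:.5f}  control-variate sd: {cv_sd:.5f}")
```

In this toy setting the two quantities are strongly correlated, so the adjusted estimator varies much less across repeated runs than the plain sample mean. How large the gain is in a Shapley setting depends on the control chosen, which the abstract does not specify.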
Longitudinal cohort study of the impact of specialist cancer services for teenagers and young adults on quality of life: outcomes from the BRIGHTLIGHT study.
OBJECTIVES: In England, healthcare policy advocates specialised age-appropriate services for teenagers and young adults (TYA), those aged 13 to 24 years at diagnosis. Specialist Principal Treatment Centres (PTC) provide enhanced TYA age-specific care, although many still receive care in adult or children's cancer services. We present the first prospective structured analysis of quality of life (QOL) associated with the amount of care received in a TYA-PTC. DESIGN: Longitudinal cohort study. SETTING: Hospitals delivering inpatient cancer care in England. PARTICIPANTS: 1114 young people aged 13 to 24 years newly diagnosed with cancer. INTERVENTION: Exposure to the TYA-PTC, comparing patients receiving NO-TYA-PTC care with those receiving ALL-TYA-PTC and SOME-TYA-PTC care. PRIMARY OUTCOME: Quality of life measured at five time points: 6, 12, 18, 24 and 36 months after diagnosis. RESULTS: Group mean total QOL improved over time for all patients, but for those receiving NO-TYA-PTC care it was an average of 5.63 points higher (95% CI 2.77 to 8.49) than in young people receiving SOME-TYA-PTC care, and 4.17 points higher (95% CI 1.07 to 7.28) than in those receiving ALL-TYA-PTC care. Differences were greatest 6 months after diagnosis, reduced over time and did not meet the 8-point level that is proposed to be clinically significant. Young people receiving NO-TYA-PTC care were more likely to have been offered a choice of place of care, be older, from more deprived areas, in work and have less severe disease. However, analyses adjusting for confounding factors did not explain the differences between TYA groups. CONCLUSIONS: Receipt of some or all care in a TYA-PTC was associated with lower QOL shortly after cancer diagnosis. The NO-TYA-PTC group had higher QOL 3 years after diagnosis; however, those receiving all or some care in a TYA-PTC experienced more rapid QOL improvements.
Receipt of some care in a TYA-PTC requires further study. This paper presents independent research funded by the National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-1209-10013). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. The BRIGHTLIGHT Team acknowledges the support of the NIHR, through the Cancer Research Network. LAF and LH are funded by Teenage Cancer Trust, DPS holds research grant funding from Teenage Cancer Trust, and RR was (in part) supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care (CLAHRC) North Thames at Bart’s Health NHS Trust. RMT is a National Institute for Health Research (NIHR) Senior Nurse Research Leader. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care. None of the funding bodies have been involved with study concept, design or decision to submit the manuscript. JA-G was subsidised by the Ramon & Cajal programme operated by the Ministry of Economy and Business (RYC-2016-19353), and the European Social Fund
The Paleocene/Eocene boundary Global Standard Stratotype-section and Point (GSSP): Criteria for Characterisation and Correlation
The choice of a Paleocene/Eocene (P/E) Global Standard Stratotype-section and Point (GSSP) is complicated by the fact that there exists confusion on the exact denotation of the Paleocene and Eocene Series and their constituent lower rank (stage) units. While we can now resolve this problem by recourse to rigorous historical analysis, actual placement of the GSSP is further complicated by an embarrassment of riches (with regard to 7 criteria suitable for characterising and correlating a P/E GSSP but which span a temporal interval of greater than 2 my).
Following the precept that the boundaries between higher level chronostratigraphic units are to be founded upon the boundaries of their lowest constituent stages in a nested hierarchy, we note that one of the criteria providing global correlation potential (a stable isotope excursion in marine and terrestrial stratigraphies) lies at a stratigraphic level more than 1 my older than the base of the stratotypic Ypresian Stage to which the base of the Eocene Series has been subordinated until now. Lowering a chronostratigraphic unit by this extent risks a significant modification to the original geohistorical denotation of the Ypresian Stage and the Eocene Series.
We discuss here four options that are open to Voting Members of the Paleogene Subcommission. One solution consists in adjusting slightly the base of the Ypresian Stage (and, thus, the Eocene Series) so as to be correlatable on the basis of the lowest occurrence/First Appearance Datum (LO/FAD) of the calcareous nannofossil species Tribrachiatus digitalis. Another solution would be to decouple series and stages so that the Ypresian Stage remains essentially unaltered but the base of the Eocene is relocated so as to be correlated on the basis of the Carbon Isotope Excursion (CIE).
Two (compromise) solutions consist in erecting a new stage for the upper/terminal Paleocene (between the Thanetian [sensu Dollfus] and Ypresian Stages) characterised at its base by the global stable isotope excursion. The P/E GSSP may then be placed at the base of the stratotypic Ypresian Stage (thus preserving historical continuity and conceptual denotation and stability) or at the base of the newly erected stage (facilitating correlation of the base of the Eocene series, at least in principle). Both GSSPs should be placed in suitable marine stratigraphic sections yet to be determined but upon which there is considerable current investigative activity
Processes of care and survival associated with treatment in specialist teenage and young adult cancer centres: results from the BRIGHTLIGHT cohort study.
OBJECTIVE: Survival gains in teenagers and young adults (TYA) are reported to be lower than children and adults for some cancers. Place of care is implicated, influencing access to specialist TYA professionals and research. Consequently, age-appropriate specialist cancer care is advocated for TYA although systematic investigation of associated outcomes is lacking. In England, age-appropriate care is delivered through 13 Principal Treatment Centres (TYA-PTC). BRIGHTLIGHT is the national evaluation of TYA cancer services to examine outcomes associated with differing places and levels of care. We aimed to examine the association between exposure to TYA-PTC care, survival and documentation of clinical processes of care. DESIGN: Prospective cohort study. SETTING: 109 National Health Service (NHS) hospitals across England. PARTICIPANTS: 1114 TYA, aged 13-24, newly diagnosed with cancer between 2012 and 2014. INTERVENTION: Participants were assigned a TYA-PTC category dependent on the proportion of care delivered in a TYA-PTC in the first year after diagnosis: all care in a TYA-PTC (ALL-TYA-PTC, n=270), no care in a TYA-PTC (NO-TYA-PTC, n=359), and some care in a TYA-PTC with additional care in a children's/adult unit (SOME-TYA-PTC, n=419). PRIMARY OUTCOME: Data were collected on documented processes indicative of age-appropriate care using clinical report forms, and survival through linkage to NHS databases. RESULTS: TYA receiving NO-TYA-PTC care were less likely to have documentation of molecular diagnosis, be reviewed by a children's or TYA multidisciplinary team, be assessed by supportive care services or have a fertility discussion. There was no significant difference in survival according to category of care. There was weak evidence that the association between care category and survival differed by age (p=0.08) with higher HRs for those over 19 receiving ALL or SOME-TYA-PTC compared with NO-TYA-PTC.
CONCLUSION: TYA-PTC care was associated with better documentation of clinical processes associated with age-appropriate care but not improved survival
An ocean-colour time series for use in climate studies: the experience of the ocean-colour climate change initiative (OC-CCI)
Ocean colour is recognised as an Essential Climate Variable (ECV) by the Global Climate Observing System (GCOS); and spectrally-resolved water-leaving radiances (or remote-sensing reflectances) in the visible domain, and chlorophyll-a concentration are identified as required ECV products. Time series of the products at the global scale and at high spatial resolution, derived from ocean-colour data, are key to studying the dynamics of phytoplankton at seasonal and inter-annual scales; their role in marine biogeochemistry; the global carbon cycle; the modulation of how phytoplankton distribute solar-induced heat in the upper layers of the ocean; and the response of the marine ecosystem to climate variability and change. However, generating a long time series of these products from ocean colour data is not a trivial task: algorithms that are best suited for climate studies have to be selected from a number that are available for atmospheric correction of the satellite signal and for retrieval of chlorophyll-a concentration; since satellites have a finite life span, data from multiple sensors have to be merged to create a single time series, and any uncorrected inter-sensor biases could introduce artefacts in the series, e.g., different sensors monitor radiances at different wavebands such that producing a consistent time series of reflectances is not straightforward. Another requirement is that the products have to be validated against in situ observations. Furthermore, the uncertainties in the products have to be quantified, ideally on a pixel-by-pixel basis, to facilitate applications and interpretations that are consistent with the quality of the data. 
This paper outlines an approach that was adopted for generating an ocean-colour time series for climate studies, using data from the MERIS (MEdium spectral Resolution Imaging Spectrometer) sensor of the European Space Agency; the SeaWiFS (Sea-viewing Wide-Field-of-view Sensor) and MODIS-Aqua (Moderate-resolution Imaging Spectroradiometer-Aqua) sensors from the National Aeronautics and Space Administration (USA); and VIIRS (Visible and Infrared Imaging Radiometer Suite) from the National Oceanic and Atmospheric Administration (USA). The time series now covers the period from late 1997 to end of 2018. To ensure that the products meet, as well as possible, the requirements of the user community, marine-ecosystem modellers, and remote-sensing scientists were consulted at the outset on their immediate and longer-term requirements as well as on their expectations of ocean-colour data for use in climate research. Taking the user requirements into account, a series of objective criteria were established, against which available algorithms for processing ocean-colour data were evaluated and ranked. The algorithms that performed best with respect to the climate user requirements were selected to process data from the satellite sensors. Remote-sensing reflectance data from MODIS-Aqua, MERIS, and VIIRS were band-shifted to match the wavebands of SeaWiFS. Overlapping data were used to correct for mean biases between sensors at every pixel. The remote-sensing reflectance data derived from the sensors were merged, and the selected in-water algorithm was applied to the merged data to generate maps of chlorophyll concentration, inherent optical properties at SeaWiFS wavelengths, and the diffuse attenuation coefficient at 490 nm. The merged products were validated against in situ observations. The uncertainties established on the basis of comparisons with in situ data were combined with an optical classification of the remote-sensing reflectance data using a fuzzy-logic approach, and were used to generate uncertainties (root mean square difference and bias) for each product at each pixel
A Compilation of Global Bio-Optical In Situ Data for Ocean-Colour Satellite Applications
A compiled set of in situ data is important to evaluate the quality of ocean-colour satellite-data records. Here we describe the data compiled for the validation of the ocean-colour products from the ESA Ocean Colour Climate Change Initiative (OC-CCI). The data were acquired from several sources (MOBY, BOUSSOLE, AERONET-OC, SeaBASS, NOMAD, MERMAID, AMT, ICES, HOT, GeP&CO), span between 1997 and 2012, and have a global distribution. Observations of the following variables were compiled: spectral remote-sensing reflectances, concentrations of chlorophyll a, spectral inherent optical properties and spectral diffuse attenuation coefficients. The data were from multi-project archives acquired via the open internet services or from individual projects, acquired directly from data providers. Methodologies were implemented for homogenisation, quality control and merging of all data. No changes were made to the original data, other than averaging of observations that were close in time and space, elimination of some points after quality control and conversion to a standard format. The final result is a merged table designed for validation of satellite-derived ocean-colour products and available in text format. Metadata of each in situ measurement (original source, cruise or experiment, principal investigator) were preserved throughout the work and made available in the final table. Using all the data in a validation exercise increases the number of matchups and enhances the representativeness of different marine regimes. By making available the metadata, it is also possible to analyse each set of data separately. The compiled data are available at doi:10.1594/PANGAEA.854832 (Valente et al., 2015)
A compilation of global bio-optical in situ data for ocean-colour satellite applications - version three
A global in situ data set for validation of ocean colour products from the ESA Ocean Colour Climate Change Initiative (OC-CCI) is presented. This version of the compilation, starting in 1997, now extends to 2021, which is important for the validation of the most recent satellite optical sensors such as Sentinel 3B OLCI and NOAA-20 VIIRS. The data set comprises in situ observations of the following variables: spectral remote-sensing reflectance, concentration of chlorophyll-a, spectral inherent optical properties, spectral diffuse attenuation coefficient, and total suspended matter. Data were obtained from multi-project archives acquired via open internet services or from individual projects acquired directly from data providers. Methodologies were implemented for homogenization, quality control, and merging of all data. Minimal changes were made to the original data, other than conversion to a standard format, elimination of some points after quality control, and averaging of observations that were close in time and space. The result is a merged table available in text format. Overall, the size of the data set grew with 148 432 rows, with each row representing a unique station in space and time (cf. 136 250 rows in previous version; Valente et al., 2019). Observations of remote-sensing reflectance increased to 68 641 (cf. 59 781 in previous version; Valente et al., 2019). There was also a near tenfold increase in chlorophyll data since 2016. Metadata of each in situ measurement (original source, cruise or experiment, principal investigator) are included in the final table. By making the metadata available, provenance is better documented and it is also possible to analyse each set of data separately. The compiled data are available at https://doi.org/10.1594/PANGAEA.941318 (Valente et al., 2022)
The Inflation Expectations of Firms: What Do They Look Like, Are They Accurate, and Do They Matter?
- …