54 research outputs found

    Stratified Learning: a general-purpose statistical method for improved learning under Covariate Shift

    Covariate shift arises when the labelled training (source) data is not representative of the unlabelled (target) data due to systematic differences in the covariate distributions. A supervised model trained on the source data subject to covariate shift may suffer from poor generalization on the target data. We propose a novel, statistically principled and theoretically justified method to improve learning under covariate shift conditions, based on propensity score stratification, a well-established methodology in causal inference. We show that the effects of covariate shift can be reduced or altogether eliminated by conditioning on propensity scores. In practice, this is achieved by fitting learners on subgroups ("strata") constructed by partitioning the data based on the estimated propensity scores, leading to balanced covariates and much-improved target prediction. We demonstrate the effectiveness of our general-purpose method on contemporary research questions in observational cosmology, and on additional benchmark examples, matching or outperforming state-of-the-art importance weighting methods, widely studied in the covariate shift literature. We obtain the best reported AUC (0.958) on the updated "Supernovae photometric classification challenge" and improve upon existing conditional density estimation of galaxy redshift from Sloan Digital Sky Survey (SDSS) data.
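
    The stratification step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the propensity scores (here taken to be the estimated probability that an item belongs to the labelled source set) have already been estimated, it uses equal-size quantile strata, and it substitutes a per-stratum mean of the source labels for a real learner. The function names `stratify_by_propensity` and `fit_predict_by_stratum` are hypothetical.

```python
from statistics import mean


def stratify_by_propensity(scores, n_strata=5):
    """Assign each item a stratum index by the rank of its propensity score.

    Strata are equal-size quantile bins; this binning rule is an assumption
    made for illustration.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = rank * n_strata // len(scores)
    return strata


def fit_predict_by_stratum(scores, is_source, labels, n_strata=5):
    """Fit a toy learner per stratum on source items, then predict target items.

    The "learner" is simply the mean source label within the stratum
    (a stand-in for any supervised model). Source items get None.
    """
    strata = stratify_by_propensity(scores, n_strata)
    fitted = {}
    for s in range(n_strata):
        src = [labels[i] for i in range(len(scores))
               if strata[i] == s and is_source[i]]
        fitted[s] = mean(src) if src else None  # no source data in stratum
    return [None if is_source[i] else fitted[strata[i]]
            for i in range(len(scores))]


# Toy data: three high-score and three low-score items, two strata.
scores = [0.9, 0.8, 0.85, 0.2, 0.1, 0.15]
is_source = [True, True, False, True, False, False]
labels = [1.0, 1.0, None, 0.0, None, None]
predictions = fit_predict_by_stratum(scores, is_source, labels, n_strata=2)
```

    Each target item is predicted only from source items in the same stratum, which is the sense in which the fit conditions on the propensity score.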

    Incorporating Uncertainties in Atomic Data Into the Analysis of Solar and Stellar Observations: A Case Study in Fe XIII

    Information about the physical properties of astrophysical objects cannot be measured directly but is inferred by interpreting spectroscopic observations in the context of atomic physics calculations. Ratios of emission lines, for example, can be used to infer the electron density of the emitting plasma. Similarly, the relative intensities of emission lines formed over a wide range of temperatures yield information on the temperature structure. A critical component of this analysis is understanding how uncertainties in the underlying atomic physics propagate to the uncertainties in the inferred plasma parameters. At present, however, atomic physics databases do not include uncertainties on the atomic parameters and there is no established methodology for using them even if they did. In this paper we develop simple models for the uncertainties in the collision strengths and decay rates for Fe XIII and apply them to the interpretation of density-sensitive lines observed with the EUV Imaging Spectrometer (EIS) on Hinode. We incorporate these uncertainties in a Bayesian framework. We consider both a pragmatic Bayesian method where the atomic physics information is unaffected by the observed data, and a fully Bayesian method where the data can be used to probe the physics. The former generally increases the uncertainty in the inferred density by about a factor of 5 compared with models that incorporate only statistical uncertainties. The latter reduces the uncertainties on the inferred densities, but identifies areas of possible systematic problems with either the atomic physics or the observed intensities. Comment: in press at Ap

    Bayesian Hierarchical Modelling of Initial-Final Mass Relations Across Star Clusters

    The initial-final mass relation (IFMR) of white dwarfs (WDs) plays an important role in stellar evolution. To derive precise estimates of IFMRs and explore how they may vary among star clusters, we propose a Bayesian hierarchical model that pools photometric data from multiple star clusters. After performing a simulation study to show the benefits of the Bayesian hierarchical model, we apply this model to five star clusters: the Hyades, M67, NGC 188, NGC 2168, and NGC 2477, leading to reasonable and consistent estimates of IFMRs for these clusters. We illustrate how a cluster-specific analysis of NGC 188 using its own photometric data can produce an unreasonable IFMR since its WDs have a narrow range of zero-age main sequence (ZAMS) masses. However, the Bayesian hierarchical model corrects the cluster-specific analysis by borrowing strength from other clusters, thus generating more reliable estimates of IFMR parameters. The data analysis presents the benefits of Bayesian hierarchical modelling over conventional cluster-specific methods, which motivates us to elaborate the powerful statistical techniques in this article.

    A Bayesian Analysis of the Ages of Four Open Clusters

    In this paper we apply a Bayesian technique to determine the best fit of stellar evolution models to find the main sequence turn-off age and other cluster parameters of four intermediate-age open clusters: NGC 2360, NGC 2477, NGC 2660, and NGC 3960. Our algorithm utilizes a Markov chain Monte Carlo technique to fit these various parameters, objectively finding the best-fit isochrone for each cluster. The result is a high-precision isochrone fit. We compare these results with those of traditional “by-eye” isochrone fitting methods. By applying this Bayesian technique to NGC 2360, NGC 2477, NGC 2660, and NGC 3960, we determine the ages of these clusters to be 1.35 ± 0.05, 1.02 ± 0.02, 1.64 ± 0.04, and 0.860 ± 0.04 Gyr, respectively. The results of this paper continue our effort to determine cluster ages to higher precision than that offered by these traditional methods of isochrone fitting.

    Improving White Dwarfs as Chronometers with Gaia Parallaxes and Spectroscopic Metallicities

    White dwarfs (WDs) offer unrealized potential in solving two problems in astrophysics: stellar age accuracy and precision. WD cooling ages can be inferred from surface temperatures and radii, which can be constrained with precision by high-quality photometry and parallaxes. Accurate and precise Gaia parallaxes along with photometric surveys provide information to derive cooling and total ages for vast numbers of WDs. Here we analyze 1372 WDs found in wide binaries with main-sequence (MS) companions and report on the cooling and total age precision attainable in these WD+MS systems. The total age of a WD can be further constrained if its original metallicity is known because the MS lifetime depends on metallicity at fixed mass, yet metallicity is unavailable via spectroscopy of the WD. We show that incorporating spectroscopic metallicity constraints from 38 wide binary MS companions substantially decreases internal uncertainties in WD total ages compared to a uniform constraint. Averaged over the 38 stars in our sample, the total (internal) age uncertainty improves from 21.04% to 16.77% when incorporating the spectroscopic constraint. Higher mass WDs yield better total age precision; for eight WDs with zero-age MS masses ≥ 2.0 M☉, the mean uncertainty in total ages improves from 8.61% to 4.54% when incorporating spectroscopic metallicities. We find that it is often possible to achieve 5% total age precision for WDs with progenitor masses above 2.0 M☉ if parallaxes wit

    Mismatch Repair Deficiency, Microsatellite Instability, and Survival: An Exploratory Analysis of the Medical Research Council Adjuvant Gastric Infusional Chemotherapy (MAGIC) Trial

    Importance: Mismatch repair (MMR) deficiency (MMRD) and microsatellite instability (MSI) are prognostic for survival in many cancers and for resistance to fluoropyrimidines in early colon cancer. However, the effect of MMRD and MSI in curatively resected gastric cancer treated with perioperative chemotherapy is unknown. Objective: To examine the association among MMRD, MSI, and survival in patients with resectable gastroesophageal cancer randomized to surgery alone or perioperative epirubicin, cisplatin, and fluorouracil chemotherapy in the Medical Research Council Adjuvant Gastric Infusional Chemotherapy (MAGIC) trial. Design, Setting, and Participants: This secondary post hoc analysis of the MAGIC trial included participants who were treated with surgery alone or perioperative chemotherapy plus surgery for operable gastroesophageal cancer from July 1, 1994, through April 30, 2002. Tumor sections were assessed for expression of the MMR proteins mutL homologue 1, mutS homologue 2, mutS homologue 6, and PMS1 homologue 2. The association among MSI, MMRD, and survival was assessed. Main Outcomes and Measures: Interaction between MMRD and MSI status and overall survival (OS). Results: Of the 503 study participants, MSI results were available for 303 patients (283 with microsatellite stability or low MSI [median age, 62 years; 219 males (77.4%)] and 20 with high MSI [median age, 66 years; 14 males (70.0%)]). A total of 254 patients had MSI and MMR results available. Patients treated with surgery alone who had high MSI or MMRD had a median OS that was not reached (95% CI, 11.5 months to not reached) compared with a median OS among those who had neither high MSI nor MMRD of 20.5 months (95% CI, 16.7-27.8 months; hazard ratio, 0.42; 95% CI, 0.15-1.15; P = .09). In contrast, patients treated with chemotherapy plus surgery who had either high MSI or MMRD had a median OS of 9.6 months (95% CI, 0.1-22.5 months) compared with a median OS among those who were neither high MSI nor MMRD of 19.5 months (95% CI, 15.4-35.2 months; hazard ratio, 2.18; 95% CI, 1.08-4.42; P = .03). Conclusions and Relevance: In the MAGIC trial, MMRD and high MSI were associated with a positive prognostic effect in patients treated with surgery alone and a differentially negative prognostic effect in patients treated with chemotherapy. If independently validated, MSI or MMRD determined by preoperative biopsies could be used to select patients for perioperative chemotherapy.