54 research outputs found
Stratified Learning: a general-purpose statistical method for improved learning under Covariate Shift
Covariate shift arises when the labelled training (source) data is not
representative of the unlabelled (target) data due to systematic differences in
the covariate distributions. A supervised model trained on the source data
subject to covariate shift may suffer from poor generalization on the target
data. We propose a novel, statistically principled and theoretically justified
method to improve learning under covariate shift conditions, based on
propensity score stratification, a well-established methodology in causal
inference. We show that the effects of covariate shift can be reduced or
altogether eliminated by conditioning on propensity scores. In practice, this
is achieved by fitting learners on subgroups ("strata") constructed by
partitioning the data based on the estimated propensity scores, leading to
balanced covariates and much-improved target prediction. We demonstrate the
effectiveness of our general-purpose method on contemporary research questions
in observational cosmology, and on additional benchmark examples, matching or
outperforming state-of-the-art importance weighting methods, widely studied in
the covariate shift literature. We obtain the best reported AUC (0.958) on the
updated "Supernovae photometric classification challenge" and improve upon
existing conditional density estimation of galaxy redshift from Sloan Digital Sky
Survey (SDSS) data.
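The stratification step described in this abstract can be illustrated with a minimal sketch. It assumes the propensity scores (taken here to be the estimated probability that a point belongs to the labelled source set) have already been computed, and it uses a trivial per-stratum learner (the mean source label) purely as a stand-in. The function names `assign_strata` and `fit_per_stratum` are hypothetical and are not taken from the paper.

```python
from statistics import mean


def assign_strata(scores, n_strata=5):
    """Assign each propensity score to one of n_strata equal-count bins.

    Stratum 0 holds the lowest scores, stratum n_strata - 1 the highest.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = rank * n_strata // len(scores)
    return strata


def fit_per_stratum(scores, labels, is_source, n_strata=5):
    """Fit a placeholder learner (mean label) on the source points of each stratum.

    labels may be None for unlabelled target points; they are never read.
    A stratum with no source points gets None instead of a model.
    """
    strata = assign_strata(scores, n_strata)
    models = {}
    for s in range(n_strata):
        ys = [y for y, k, src in zip(labels, strata, is_source) if k == s and src]
        models[s] = mean(ys) if ys else None
    return strata, models


# Toy example with made-up numbers: four source points, four target points.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
is_source = [True, True, True, False, True, False, False, False]
labels = [1, 1, 0, None, 0, None, None, None]
strata, models = fit_per_stratum(scores, labels, is_source, n_strata=2)
# Each target point is predicted with the model fitted in its own stratum.
target_predictions = [models[k] for k, src in zip(strata, is_source) if not src]
```

In a realistic setting the per-stratum learner would be a proper classifier or regressor, and the number of strata would be chosen so that each stratum still contains enough source data.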
Incorporating Uncertainties in Atomic Data Into the Analysis of Solar and Stellar Observations: A Case Study in Fe XIII
Information about the physical properties of astrophysical objects cannot be
measured directly but is inferred by interpreting spectroscopic observations in
the context of atomic physics calculations. Ratios of emission lines, for
example, can be used to infer the electron density of the emitting plasma.
Similarly, the relative intensities of emission lines formed over a wide range
of temperatures yield information on the temperature structure. A critical
component of this analysis is understanding how uncertainties in the underlying
atomic physics propagate to the uncertainties in the inferred plasma
parameters. At present, however, atomic physics databases do not include
uncertainties on the atomic parameters and there is no established methodology
for using them even if they did. In this paper we develop simple models for the
uncertainties in the collision strengths and decay rates for Fe XIII and apply
them to the interpretation of density-sensitive lines observed with the EUV
Imaging Spectrometer (EIS) on Hinode. We incorporate these uncertainties in a
Bayesian framework. We consider both a pragmatic Bayesian method where the
atomic physics information is unaffected by the observed data, and a fully
Bayesian method where the data can be used to probe the physics. The former
generally increases the uncertainty in the inferred density by about a factor
of 5 compared with models that incorporate only statistical uncertainties. The
latter reduces the uncertainties on the inferred densities, but identifies
areas of possible systematic problems with either the atomic physics or the
observed intensities. Comment: in press at Ap
Bayesian Hierarchical Modelling of Initial-Final Mass Relations Across Star Clusters
The initial-final mass relation (IFMR) of white dwarfs (WDs) plays an
important role in stellar evolution. To derive precise estimates of IFMRs and
explore how they may vary among star clusters, we propose a Bayesian
hierarchical model that pools photometric data from multiple star clusters.
After performing a simulation study to show the benefits of the Bayesian
hierarchical model, we apply this model to five star clusters: the Hyades,
M67, NGC 188, NGC 2168, and NGC 2477, leading to reasonable and consistent
estimates of IFMRs for these clusters. We illustrate how a cluster-specific
analysis of NGC 188 using its own photometric data can produce an unreasonable
IFMR since its WDs have a narrow range of zero-age main sequence (ZAMS) masses.
However, the Bayesian hierarchical model corrects the cluster-specific analysis
by borrowing strength from other clusters, thus generating more reliable
estimates of IFMR parameters. The data analysis presents the benefits of
Bayesian hierarchical modelling over conventional cluster-specific methods,
which motivates us to elaborate the powerful statistical techniques in this
article. Comment: 29 pages, 12 figures
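The idea of "borrowing strength" across clusters can be sketched with a generic normal-normal partial-pooling calculation. This is not the authors' IFMR model: it assumes each cluster supplies a single point estimate with a known standard error, and that the between-cluster scatter `tau` is known. The function name `partial_pool` and all numbers are hypothetical.

```python
def partial_pool(estimates, std_errors, tau):
    """Shrink per-cluster estimates toward a shared mean.

    Assumes y_j ~ Normal(theta_j, se_j^2) and theta_j ~ Normal(mu, tau^2),
    with tau treated as known. Returns the precision-weighted estimate of mu
    and the conditional posterior mean of each theta_j given that mu.
    """
    # Marginally y_j ~ Normal(mu, se_j^2 + tau^2), so weight by inverse variance.
    weights = [1.0 / (se ** 2 + tau ** 2) for se in std_errors]
    mu = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    pooled = []
    for y, se in zip(estimates, std_errors):
        own_weight = tau ** 2 / (tau ** 2 + se ** 2)  # weight on the cluster's own data
        pooled.append(own_weight * y + (1.0 - own_weight) * mu)
    return mu, pooled


# Toy numbers: the third "cluster" is poorly constrained and is pulled
# strongly toward the shared mean, while the well-measured ones barely move.
mu, pooled = partial_pool([1.0, 1.0, 3.0], [0.1, 0.1, 2.0], tau=0.2)
```

A cluster whose own data constrain the parameter poorly (large standard error) receives a small weight on its own estimate, which is the mechanism by which a hierarchical model can correct an unreasonable cluster-specific fit.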
A Bayesian Analysis of the Ages of Four Open Clusters
In this paper we apply a Bayesian technique to determine the best fit of stellar evolution models to find the main sequence turn off age and other cluster parameters of four intermediate-age open clusters: NGC 2360, NGC 2477, NGC 2660, and NGC 3960. Our algorithm utilizes a Markov chain Monte Carlo technique to fit these various parameters, objectively finding the best-fit isochrone for each cluster. The result is a high-precision isochrone fit. We compare these results with those of traditional “by-eye” isochrone fitting methods. By applying this Bayesian technique to NGC 2360, NGC 2477, NGC 2660, and NGC 3960, we determine the ages of these clusters to be 1.35 ± 0.05, 1.02 ± 0.02, 1.64 ± 0.04, and 0.860 ± 0.04 Gyr, respectively. The results of this paper continue our effort to determine cluster ages to higher precision than that offered by these traditional methods of isochrone fitting.
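As a rough illustration of how a Markov chain Monte Carlo fit explores a parameter, the following is a minimal random-walk Metropolis sketch for a single positive parameter with a Gaussian likelihood and a flat prior. It is a toy stand-in, not the authors' algorithm or an isochrone model; the function names, the measurements, and the tuning values are all made up for the example.

```python
import math
import random


def log_posterior(theta, data, sigma):
    """Log posterior up to a constant: flat prior on theta > 0, Gaussian likelihood."""
    if theta <= 0:
        return -math.inf
    return -0.5 * sum(((d - theta) / sigma) ** 2 for d in data)


def metropolis(data, sigma, start, step, n_steps, seed=0):
    """Random-walk Metropolis sampler; start must satisfy start > 0."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, data, sigma)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, data, sigma)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


# Toy measurements (made up) scattered around 1.35 with noise sigma = 0.05.
data = [1.30, 1.38, 1.35, 1.40, 1.32]
chain = metropolis(data, sigma=0.05, start=1.0, step=0.05, n_steps=5000)
samples = chain[1000:]  # discard burn-in
posterior_mean = sum(samples) / len(samples)
```

The retained samples approximate the posterior, so their mean and spread give a point estimate and an uncertainty without any by-eye judgement.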
Improving White Dwarfs as Chronometers with Gaia Parallaxes and Spectroscopic Metallicities
White dwarfs (WDs) offer unrealized potential in solving two problems in astrophysics: stellar age accuracy and precision. WD cooling ages can be inferred from surface temperatures and radii, which can be constrained with precision by high-quality photometry and parallaxes. Accurate and precise Gaia parallaxes along with photometric surveys provide information to derive cooling and total ages for vast numbers of WDs. Here we analyze 1372 WDs found in wide binaries with main-sequence (MS) companions and report on the cooling and total age precision attainable in these WD+MS systems. The total age of a WD can be further constrained if its original metallicity is known because the MS lifetime depends on metallicity at fixed mass, yet metallicity is unavailable via spectroscopy of the WD. We show that incorporating spectroscopic metallicity constraints from 38 wide binary MS companions substantially decreases internal uncertainties in WD total ages compared to a uniform constraint. Averaged over the 38 stars in our sample, the total (internal) age uncertainty improves from 21.04% to 16.77% when incorporating the spectroscopic constraint. Higher mass WDs yield better total age precision; for eight WDs with zero-age MS masses >= 2.0 M⊙, the mean uncertainty in total ages improves from 8.61% to 4.54% when incorporating spectroscopic metallicities. We find that it is often possible to achieve 5% total age precision for WDs with progenitor masses above 2.0 M⊙ if parallaxes wit
Mismatch Repair Deficiency, Microsatellite Instability, and Survival: An Exploratory Analysis of the Medical Research Council Adjuvant Gastric Infusional Chemotherapy (MAGIC) Trial
Importance:
Mismatch repair (MMR) deficiency (MMRD) and microsatellite instability (MSI) are prognostic for survival in many cancers and for resistance to fluoropyrimidines in early colon cancer. However, the effect of MMRD and MSI in curatively resected gastric cancer treated with perioperative chemotherapy is unknown.
Objective:
To examine the association among MMRD, MSI, and survival in patients with resectable gastroesophageal cancer randomized to surgery alone or perioperative epirubicin, cisplatin, and fluorouracil chemotherapy in the Medical Research Council Adjuvant Gastric Infusional Chemotherapy (MAGIC) trial.
Design, Setting, and Participants:
This secondary post hoc analysis of the MAGIC trial included participants who were treated with surgery alone or perioperative chemotherapy plus surgery for operable gastroesophageal cancer from July 1, 1994, through April 30, 2002. Tumor sections were assessed for expression of the MMR proteins mutL homologue 1, mutS homologue 2, mutS homologue 6, and PMS1 homologue 2. The association among MSI, MMRD, and survival was assessed.
Main Outcomes and Measures:
Interaction between MMRD and MSI status and overall survival (OS).
Results:
Of the 503 study participants, MSI results were available for 303 patients (283 with microsatellite stability or low MSI [median age, 62 years; 219 males (77.4%)] and 20 with high MSI [median age, 66 years; 14 males (70.0%)]). A total of 254 patients had MSI and MMR results available. Patients treated with surgery alone who had high MSI or MMRD had a median OS that was not reached (95% CI, 11.5 months to not reached) compared with a median OS among those who had neither high MSI nor MMRD of 20.5 months (95% CI, 16.7-27.8 months; hazard ratio, 0.42; 95% CI, 0.15-1.15; P = .09). In contrast, patients treated with chemotherapy plus surgery who had either high MSI or MMRD had a median OS of 9.6 months (95% CI, 0.1-22.5 months) compared with a median OS among those who were neither high MSI nor MMRD of 19.5 months (95% CI, 15.4-35.2 months; hazard ratio, 2.18; 95% CI, 1.08-4.42; P = .03).
Conclusions and Relevance:
In the MAGIC trial, MMRD and high MSI were associated with a positive prognostic effect in patients treated with surgery alone and a differentially negative prognostic effect in patients treated with chemotherapy. If independently validated, MSI or MMRD determined by preoperative biopsies could be used to select patients for perioperative chemotherapy.