
    Blending of Cepheids in M33

    A precise and accurate determination of the Hubble constant based on Cepheid variables requires proper characterization of many sources of systematic error. One of these is stellar blending, which biases the measured fluxes of Cepheids and the resulting distance estimates. We study the blending of 149 Cepheid variables in M33 by matching archival Hubble Space Telescope data with images obtained at the WIYN 3.5-m telescope, which differ by a factor of 10 in angular resolution. We find that 55 ± 4% of the Cepheids have no detectable nearby companions that could bias the WIYN V-band photometry, while the fraction of Cepheids affected below the 10% level is 73 ± 4%. The corresponding values for the I band are 60 ± 4% and 72 ± 4%, respectively. We find no statistically significant difference in blending statistics as a function of period or surface brightness. Additionally, we report all the detected companions within 2 arcseconds of the Cepheids (equivalent to 9 pc at the distance of M33), which may be used to derive empirical blending corrections for Cepheids at larger distances. Comment: v2: Fixed incorrect description of Figure 2 in text. Accepted for publication in AJ. Full data tables can be found in ASCII format as part of the source distribution. A version of the paper with higher-resolution figures can be found at http://faculty.physics.tamu.edu/lmacri/papers/chavez12.pd
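
    The blending statistics quoted above reduce to a counting exercise once companions have been matched to each Cepheid. Below is a minimal sketch of that bookkeeping; the 2-arcsecond search radius and the 10% flux threshold come from the abstract, but the data structures and function are hypothetical illustrations, not the authors' pipeline.

```python
import numpy as np

def blending_fractions(cepheid_flux, companion_flux, companion_sep,
                       radius_arcsec=2.0, bias_threshold=0.10):
    """Fraction of Cepheids that are (a) free of detected companions within
    `radius_arcsec` and (b) blended below `bias_threshold` of their own flux.

    cepheid_flux   : (N,) array of Cepheid fluxes in one band
    companion_flux : list of N arrays with the fluxes of nearby companions
    companion_sep  : list of N arrays with the companion separations (arcsec)
    """
    unblended = below_threshold = 0
    for f_cep, f_comp, sep in zip(cepheid_flux, companion_flux, companion_sep):
        nearby = f_comp[sep <= radius_arcsec]
        blend = nearby.sum() / f_cep          # fractional flux contamination
        unblended += nearby.size == 0
        below_threshold += blend < bias_threshold
    n = len(cepheid_flux)
    return unblended / n, below_threshold / n
```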

    Predicting the Clustering of X-Ray Selected Galaxy Clusters in Flux-Limited Surveys

    (abridged) We present a model to predict the clustering properties of X-ray clusters in flux-limited surveys. Our technique correctly accounts for past light-cone effects on the observed clustering and follows the non-linear evolution in redshift of the underlying dark matter (DM) correlation function and cluster bias factor. The conversion of the limiting flux of a survey into the corresponding minimum mass of the hosting DM haloes is obtained by using theoretical and empirical relations between mass, temperature and X-ray luminosity of clusters. Finally, our model is calibrated to reproduce the observed cluster counts by adopting a temperature-luminosity relation that evolves moderately with redshift. We apply our technique to three existing catalogues: the BCS, XBACs and REFLEX samples. Moreover, we consider an example of possible future space missions with fainter limiting fluxes. In general, we find that the amplitude of the spatial correlation function is a decreasing function of the limiting flux and that the EdS models always give smaller correlation amplitudes than open or flat models with a low matter density parameter. In the case of XBACs, the comparison with previous estimates of the observed spatial correlation shows that only the predictions of models with $\Omega_{0m}=0.3$ are in good agreement with the data, while the EdS models have too low a correlation strength. Finally, we use our technique to discuss the best strategy for future surveys. Our results show that a wide-area catalogue, even with a brighter limiting flux, is preferable to a deeper survey covering a smaller area. Comment: 20 pages, LaTeX using MN style, 11 figures enclosed. Version accepted for publication in MNRAS
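
    The central modelling step, converting a survey's limiting flux into a minimum host-halo mass, chains an X-ray luminosity-temperature relation with a mass-temperature relation. The sketch below shows that chain with placeholder power-law coefficients; the calibrations actually used in the paper, and their redshift evolution, are not reproduced here.

```python
import numpy as np

def limiting_luminosity(flux_limit, luminosity_distance_cm):
    """Minimum rest-frame luminosity (erg/s) detectable at a given flux limit
    (erg/s/cm^2) for a cluster at the given luminosity distance."""
    return 4.0 * np.pi * luminosity_distance_cm**2 * flux_limit

def minimum_halo_mass(flux_limit, luminosity_distance_cm,
                      lt_norm=3e44, lt_slope=3.0,   # L = lt_norm * (T / 6 keV)^lt_slope
                      mt_norm=1e15, mt_slope=1.5):  # M = mt_norm * (T / 6 keV)^mt_slope
    """Convert a limiting flux into a minimum halo mass (Msun) by inverting an
    illustrative L-T relation and then applying an illustrative M-T relation."""
    L_min = limiting_luminosity(flux_limit, luminosity_distance_cm)
    T_min = 6.0 * (L_min / lt_norm) ** (1.0 / lt_slope)   # keV
    return mt_norm * (T_min / 6.0) ** mt_slope

# Example: a 3e-12 erg/s/cm^2 flux limit for a cluster at ~1 Gpc (~3.1e27 cm).
print(f"{minimum_halo_mass(3e-12, 3.1e27):.2e} Msun")
```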

    The First Comparison Between Swarm-C Accelerometer-Derived Thermospheric Densities and Physical and Empirical Model Estimates

    The first systematic comparison between Swarm-C accelerometer-derived thermospheric density and both empirical and physics-based model results, using multiple model performance metrics, is presented. This comparison is performed at the satellite's high 10-s temporal resolution, which provides a meaningful evaluation of the models' fidelity for orbit prediction and other space weather forecasting applications. The comparison against the physical model is influenced by the specification of the lower atmospheric forcing, the high-latitude ionospheric plasma convection, and solar activity. Some insights into the model response to thermosphere-driving mechanisms are obtained through a machine learning exercise. The results of this analysis show that the short-timescale variations observed by Swarm-C during periods of high solar and geomagnetic activity were better captured by the physics-based model than by the empirical models. It is concluded that Swarm-C data agree well with the climatologies inherent within the models and are, therefore, a useful data set for further model validation and scientific research. Comment: https://goo.gl/n4QvU
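
    Density comparisons of this kind come down to standard skill metrics evaluated on the observed and modelled time series at the common 10-s cadence. The metric choices below are a generic illustration, not necessarily the set used in the study.

```python
import numpy as np

def density_metrics(rho_obs, rho_model):
    """Simple skill metrics for modelled vs. observed thermospheric density.
    Both inputs are 1-D arrays sampled on the same 10-s time grid."""
    ratio = rho_model / rho_obs
    return {
        "mean_ratio": float(ratio.mean()),                    # multiplicative bias
        "ratio_std": float(ratio.std()),                      # scatter of the ratio
        "rmse": float(np.sqrt(np.mean((rho_model - rho_obs) ** 2))),
        "correlation": float(np.corrcoef(rho_obs, rho_model)[0, 1]),
    }

# Example with synthetic densities (kg/m^3) over one hour at 10-s cadence.
t = np.arange(0, 3600, 10)
rho_obs = 4e-12 * (1.0 + 0.1 * np.sin(2 * np.pi * t / 5400))
rho_model = 1.05 * rho_obs + np.random.default_rng(0).normal(0, 1e-13, t.size)
print(density_metrics(rho_obs, rho_model))
```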

    Reducing the number of inputs in nonlocal games

    In this work we show how a vector-valued version of Schechtman's empirical method can be used to reduce the number of inputs in a nonlocal game $G$ while preserving the quotient $\beta^*(G)/\beta(G)$ of the quantum over the classical bias. We apply our method to the Khot-Vishnoi game, which has exponentially many questions per player, to produce another game with polynomially many ($N \approx n^8$) questions such that the quotient of the quantum over the classical bias is $\Omega(n/\log^2 n)$.
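
    For orientation, the biases being compared here are the standard ones for a two-player XOR game; the display below is that textbook definition and may differ in detail from the paper's more general setting. Here $\pi$ is the distribution over question pairs, $G(x,y)\in\{\pm 1\}$ is the target parity, and the quantum bias $\beta^*(G)$ is the analogous maximum over quantum strategies.

```latex
\[
\beta(G) \;=\; \max_{a,\,b \,\in\, \{\pm 1\}^{N}}
\Big|\sum_{x,y} \pi(x,y)\, G(x,y)\, a_x b_y\Big|,
\qquad
\frac{\beta^*(G)}{\beta(G)} \;=\; \Omega\!\left(\frac{n}{\log^2 n}\right)
\;\text{ with }\; N \approx n^8 .
\]
```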

    A Taxonomy of Big Data for Optimal Predictive Machine Learning and Data Mining

    Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever-increasing flows of data begging to be analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of the input space and the sample size n are usually the main ingredients in the characterization of data bigness. The specific statistical machine learning technique used to handle a particular big data set depends on which category it falls into within the bigness taxonomy; large p, small n data sets, for instance, require a different set of tools from the large n, small p variety. Among other tools, we discuss Preprocessing, Standardization, Imputation, Projection, Regularization, Penalization, Compression, Reduction, Selection, Kernelization, Hybridization, Parallelization, Aggregation, Randomization, Replication and Sequentialization. Indeed, it is important to emphasize right away that the so-called no free lunch theorem applies here, in the sense that there is no universally superior method that outperforms all other methods on all categories of bigness. It is also important to stress that simplicity, in the sense of Ockham's razor non-plurality principle of parsimony, tends to reign supreme when it comes to massive data. We conclude with a comparison of the predictive performance of some of the most commonly used methods on a few data sets. Comment: 18 pages, 2 figures, 3 tables
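
    The central point, that the p-versus-n regime dictates the toolbox, can be made concrete with a toy dispatcher. The regime boundaries and tool lists below are illustrative choices of mine, not the taxonomy proposed in the paper.

```python
def suggest_tools(n, p):
    """Toy dispatcher mapping a data set's sample size n and dimensionality p
    to an illustrative set of techniques from the taxonomy."""
    if p > n:
        # "Large p, small n": regularize, select or project before/while fitting.
        return ["regularization / penalization (lasso, ridge)",
                "selection", "projection (PCA, random projections)",
                "kernelization"]
    if n > 1_000_000:
        # "Large n": distribute, subsample, compress or aggregate.
        return ["parallelization", "sequentialization / subsampling",
                "aggregation (bagging, ensembles)", "compression"]
    # Moderate n and p: most standard learners apply directly.
    return ["standard learners with cross-validation"]

print(suggest_tools(n=200, p=10_000))      # large p, small n regime
print(suggest_tools(n=50_000_000, p=20))   # large n, small p regime
```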

    Experimental Comparison of Empirical Material Decomposition Methods for Spectral CT

    Material composition can be estimated from spectral information acquired using photon-counting x-ray detectors with pulse height analysis. Non-ideal effects in photon-counting x-ray detectors such as charge sharing, k-escape, and pulse pileup distort the detected spectrum, which can cause material decomposition errors. This work compared the performance of two empirical decomposition methods: a neural network estimator and a linearized maximum likelihood estimator with correction (A-table method). The two investigated methods differ in how they model the nonlinear relationship between the spectral measurements and the material decomposition estimates. The bias and standard deviation of the material decomposition estimates were compared for the two methods, using both simulations and experiments with a photon-counting x-ray detector. Both the neural network and A-table methods demonstrated similar performance for the simulated data. The neural network had lower standard deviation for nearly all thicknesses of the test materials in the collimated (low scatter) and uncollimated (higher scatter) experimental data. In the experimental study of Teflon thicknesses, non-ideal detector effects introduced a potential bias of 11–28%, which was reduced to 0.1–11% using the proposed empirical methods. Overall, the results demonstrate the preliminary experimental feasibility of empirical material decomposition for spectral CT using photon-counting detectors.
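
    As a sketch of what an empirical, learning-based decomposition looks like in general: a small regressor is trained on calibration measurements with known material thicknesses and then inverts new spectral bin counts. Everything below (the synthetic attenuation model, bin counts, network size, and the use of scikit-learn) is an illustrative assumption, not the authors' implementation.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic calibration set: counts in three energy bins for known thickness
# pairs (mm of basis materials A and B); the attenuation model is made up.
rng = np.random.default_rng(0)
thicknesses = rng.uniform(0, 30, size=(500, 2))
bin_response = np.array([[0.050, 0.035, 0.020],    # material A per-bin attenuation
                         [0.030, 0.045, 0.060]])   # material B per-bin attenuation
counts = np.exp(-thicknesses @ bin_response)
counts += rng.normal(scale=0.005, size=counts.shape)  # detector noise

# Empirical decomposition: learn the inverse map from bin counts to thicknesses.
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(32, 32),
                                   max_iter=5000, random_state=0))
model.fit(counts, thicknesses)

print(np.round(model.predict(counts[:3]), 1))  # estimated thicknesses
print(np.round(thicknesses[:3], 1))            # ground truth
```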

    Statistical unfolding of elementary particle spectra: Empirical Bayes estimation and bias-corrected uncertainty quantification

    We consider the high energy physics unfolding problem, where the goal is to estimate the spectrum of elementary particles given observations distorted by the limited resolution of a particle detector. This important statistical inverse problem, arising in data analysis at the Large Hadron Collider at CERN, consists in estimating the intensity function of an indirectly observed Poisson point process. Unfolding typically proceeds in two steps: one first produces a regularized point estimate of the unknown intensity and then uses the variability of this estimator to form frequentist confidence intervals that quantify the uncertainty of the solution. In this paper, we propose forming the point estimate using empirical Bayes estimation, which enables a data-driven choice of the regularization strength through marginal maximum likelihood estimation. Observing that neither Bayesian credible intervals nor standard bootstrap confidence intervals succeed in achieving good frequentist coverage in this problem, due to the inherent bias of the regularized point estimate, we introduce an iteratively bias-corrected bootstrap technique for constructing improved confidence intervals. We show using simulations that this enables us to achieve nearly nominal frequentist coverage with only a modest increase in interval length. The proposed methodology is applied to unfolding the $Z$ boson invariant mass spectrum as measured in the CMS experiment at the Large Hadron Collider. Comment: Published at http://dx.doi.org/10.1214/15-AOAS857 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org). arXiv admin note: substantial text overlap with arXiv:1401.827
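
    The shape of the bias-correction idea can be sketched with a deliberately biased toy estimator and a Gaussian parametric bootstrap; this is not the paper's unfolding algorithm, only the generic iteration it relies on: estimate the estimator's bias at the current corrected value by simulation, subtract it from the raw estimate, and repeat.

```python
import numpy as np

def shrunken_mean(x, lam=0.3):
    """Toy 'regularized' estimator: shrinks the sample mean toward zero,
    standing in for a regularized (and therefore biased) unfolding estimate."""
    return (1.0 - lam) * x.mean()

def iterated_bias_correction(x, estimator, n_boot=500, n_iter=5, seed=0):
    """Iterative bias correction via a parametric bootstrap (toy version).
    At each step the bias of `estimator` is evaluated at the current corrected
    value by simulating new data there, and the raw estimate is re-corrected."""
    rng = np.random.default_rng(seed)
    sigma = x.std(ddof=1)
    raw = estimator(x)                 # biased point estimate from the data
    corrected = raw
    for _ in range(n_iter):
        boot = np.array([estimator(rng.normal(corrected, sigma, size=x.size))
                         for _ in range(n_boot)])
        bias = boot.mean() - corrected     # estimated bias at the current value
        corrected = raw - bias             # re-correct the raw estimate
    return corrected

x = np.random.default_rng(1).normal(loc=2.0, scale=1.0, size=200)
print(round(shrunken_mean(x), 3), round(iterated_bias_correction(x, shrunken_mean), 3))
```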

    On statistical approaches to generate Level 3 products from satellite remote sensing retrievals

    Satellite remote sensing of trace gases such as carbon dioxide (CO$_2$) has increased our ability to observe and understand Earth's climate. However, these remote sensing data, specifically Level 2 retrievals, tend to be irregular in space and time, and hence spatio-temporal prediction is required to infer values at any location and time point. Such inferences are not only required to answer important questions about our climate, but they are also needed for validating the satellite instrument, since Level 2 retrievals are generally not co-located with ground-based remote sensing instruments. Here, we discuss statistical approaches to construct Level 3 products from Level 2 retrievals, placing particular emphasis on the strengths and potential pitfalls when using statistical prediction in this context. Following this discussion, we use a spatio-temporal statistical modelling framework known as fixed rank kriging (FRK) to obtain global predictions and prediction standard errors of column-averaged carbon dioxide based on Version 7r and Version 8r retrievals from the Orbiting Carbon Observatory-2 (OCO-2) satellite. The FRK predictions allow us to validate the Level 2 retrievals statistically and globally, even though the data are at locations and time points that do not coincide with the validation data. Importantly, the validation takes into account the prediction uncertainty, which depends both on the temporally varying density of observations around the ground-based measurement sites and on the spatio-temporal high-frequency components of the trace gas field that are not explicitly modelled. Here, for validation of remotely sensed CO$_2$ data, we use observations from the Total Carbon Column Observing Network (TCCON). We demonstrate that the resulting FRK product based on Version 8r compares better with TCCON data than that based on Version 7r. Comment: 28 pages, 10 figures, 4 tables
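
    A bare-bones illustration of the low-rank idea behind FRK: the field is expanded in a fixed set of basis functions with random coefficients, which makes global prediction and prediction standard errors cheap even for large data sets. The Gaussian basis, known variance parameters, and absence of a fine-scale term are simplifications of mine; a full FRK analysis also estimates these quantities.

```python
import numpy as np

def gaussian_basis(locs, centers, scale):
    """Low-rank basis: one Gaussian bump per centre, evaluated at locs (n x d)."""
    d2 = ((locs[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / scale**2)

def low_rank_kriging(obs_locs, z, pred_locs, centers, scale=5.0,
                     tau2=1.0, sigma2=0.1):
    """Minimal fixed-rank-kriging-style predictor: the field is a linear
    combination of basis functions with Gaussian coefficients (variance tau2)
    plus measurement error (variance sigma2); the prediction is the posterior
    mean, with its standard error from the posterior coefficient covariance."""
    S = gaussian_basis(obs_locs, centers, scale)        # n x r
    S_pred = gaussian_basis(pred_locs, centers, scale)  # m x r
    prec = S.T @ S / sigma2 + np.eye(S.shape[1]) / tau2
    cov_eta = np.linalg.inv(prec)                       # posterior coefficient cov
    mean_eta = cov_eta @ S.T @ z / sigma2               # posterior coefficient mean
    pred_mean = S_pred @ mean_eta
    pred_se = np.sqrt(np.einsum("ij,jk,ik->i", S_pred, cov_eta, S_pred))
    return pred_mean, pred_se

# Example: noisy observations of a smooth 1-D field, predicted on a fine grid.
rng = np.random.default_rng(0)
obs = rng.uniform(0, 100, size=(80, 1))
z = np.sin(obs[:, 0] / 10.0) + rng.normal(0, 0.3, 80)
grid = np.linspace(0, 100, 201)[:, None]
centers = np.linspace(0, 100, 15)[:, None]
mean, se = low_rank_kriging(obs, z, grid, centers)
print(mean[:5].round(2), se[:5].round(2))
```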