
    Sequential Quantiles via Hermite Series Density Estimation

    Sequential quantile estimation refers to incorporating observations into quantile estimates in an incremental fashion, thus furnishing an online estimate of one or more quantiles at any given point in time; it is also known as online quantile estimation. This area is relevant to the analysis of data streams and to the one-pass analysis of massive data sets. Applications include network traffic and latency analysis, real-time fraud detection and high-frequency trading. We introduce new techniques for online quantile estimation based on Hermite series estimators in the settings of static quantile estimation and dynamic quantile estimation. In the static setting we apply the existing Gauss-Hermite expansion in a novel manner; in particular, we exploit the fact that the Gauss-Hermite coefficients can be updated sequentially. To treat dynamic quantile estimation we introduce a novel expansion with an exponentially weighted estimator for the Gauss-Hermite coefficients, which we term the Exponentially Weighted Gauss-Hermite (EWGH) expansion. These algorithms go beyond existing sequential quantile estimation algorithms in that they allow arbitrary quantiles (as opposed to pre-specified quantiles) to be estimated at any point in time. In doing so we provide a solution to online distribution function and online quantile function estimation on data streams. In particular, we derive an analytical expression for the CDF and prove consistency results for the CDF under certain conditions. In addition we analyse the associated quantile estimator. Simulation studies and tests on real data reveal the Gauss-Hermite based algorithms to be competitive with a leading existing algorithm.
    Comment: 43 pages, 9 figures. Improved version incorporating referee comments, as appears in the Electronic Journal of Statistics.
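The sequential coefficient updates the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: it maintains the Gauss-Hermite coefficients a_k = E[h_k(X)] as a plain running mean (static case) or an exponentially weighted mean (the EWGH idea). Recovering the CDF and quantiles from these coefficients involves further analytical work not shown here; the truncation order K and decay rate lam are illustrative choices.

```python
import numpy as np

def hermite_functions(x, K):
    """Normalised Hermite functions h_0(x)..h_K(x) via the standard recurrence."""
    h = np.empty(K + 1)
    h[0] = np.pi ** -0.25 * np.exp(-0.5 * x * x)
    if K >= 1:
        h[1] = np.sqrt(2.0) * x * h[0]
    for k in range(2, K + 1):
        h[k] = np.sqrt(2.0 / k) * x * h[k - 1] - np.sqrt((k - 1.0) / k) * h[k - 2]
    return h

class SequentialHermite:
    """Online estimates of the Gauss-Hermite coefficients a_k ~ E[h_k(X)].

    lam=None -> static case: plain running mean over all observations.
    lam=0.01 -> EWGH-style case: exponentially weighted mean, so the
                estimate can track a slowly drifting distribution.
    """
    def __init__(self, K, lam=None):
        self.K, self.lam, self.n = K, lam, 0
        self.a = np.zeros(K + 1)

    def update(self, x):
        h = hermite_functions(x, self.K)
        self.n += 1
        if self.lam is None:
            self.a += (h - self.a) / self.n          # sequential mean update
        else:
            self.a = (1 - self.lam) * self.a + self.lam * h

    def density(self, x):
        """Truncated Hermite series estimate of the density at x."""
        return float(hermite_functions(x, self.K) @ self.a)
```

For standard normal input the density estimate at 0 converges to 1/sqrt(2*pi), about 0.399; each `update` costs O(K), which is what makes the estimator usable on a stream.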

    A statistical investigation into the properties and dynamics of biological populations experiencing environmental variability

    Student Number: 9908888R - MSc research report - School of Statistics and Actuarial Science - Faculty of Science
    Much research has been devoted to the understanding of population behaviour. Such understanding has often been furthered through the development of theoretical population models. This research report explores a variety of population models and their implications, using both analytical results and simulations. Specific aspects of population behaviour studied include gross fluctuation characteristics and extinction probabilities for a population. The report starts with an overview of Deterministic Models. This is followed by a study of Birth and Death Processes, Branching Processes and Models that incorporate environmental variability. Finally, we study the maximum likelihood approach to population parameter estimation. The more notable theoretical results derived include: the development of models that incorporate the population’s history; models that incorporate discontinuous environmental changes; and the development of a means of parameter estimation for a Stochastic Differential Equation.
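One of the simplest stochastic models in this family, the linear birth and death process, already exhibits the extinction behaviour studied in such reports, and its extinction probability has a classical closed form: starting from n0 individuals with per-capita birth rate b greater than death rate d, the population goes extinct with probability (d/b)^n0. The sketch below (rates, caps, and run counts are illustrative choices, not from the report) checks this against a Gillespie simulation.

```python
import random

def goes_extinct(n0, b, d, t_max=50.0, n_cap=200, rng=random):
    """One Gillespie realisation of a linear birth-death process.

    Each of n individuals gives birth at rate b and dies at rate d.
    Returns True if the population reaches 0 before t_max; runs that
    grow past n_cap are treated as having escaped extinction.
    """
    n, t = n0, 0.0
    while 0 < n < n_cap and t < t_max:
        t += rng.expovariate(n * (b + d))             # time to next event
        n += 1 if rng.random() < b / (b + d) else -1  # birth vs death
    return n == 0

random.seed(1)
b, d, n0, runs = 1.0, 0.5, 3, 2000
p_hat = sum(goes_extinct(n0, b, d) for _ in range(runs)) / runs
p_exact = (d / b) ** n0   # classical result for a supercritical process (b > d)
```

Treating populations that reach the cap as survivors is safe here because, once a supercritical population is large, its residual extinction probability is negligible.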

    Nonparametric Transient Classification using Adaptive Wavelets

    Classifying transients based on multi-band light curves is a challenging but crucial problem in the era of GAIA and LSST, since the sheer volume of transients will make spectroscopic classification unfeasible. Here we present a nonparametric classifier that uses the transient's light curve measurements to predict its class given training data. It implements two novel components: the first is the use of the BAGIDIS wavelet methodology, a characterization of functional data using hierarchical wavelet coefficients; the second is the introduction of a ranked probability classifier on the wavelet coefficients that handles both the heteroscedasticity of the data and the potential non-representativity of the training set. The ranked classifier is simple and quick to implement, while a major advantage of the BAGIDIS wavelets is that they are translation invariant and hence do not need the light curves to be aligned before feature extraction. Further, BAGIDIS is nonparametric, so it can be used for blind searches for new objects. We demonstrate the effectiveness of our ranked wavelet classifier against the well-tested Supernova Photometric Classification Challenge dataset, in which the challenge is to correctly classify light curves as Type Ia or non-Ia supernovae. We train our ranked probability classifier on the spectroscopically confirmed subsample (which is not representative) and show that it gives good results for all supernovae with observed light curve timespans greater than 100 days (roughly 55% of the dataset). For such data, we obtain a Ia efficiency of 80.5% and a purity of 82.4%, yielding a highly competitive score of 0.49 whilst implementing a truly "model-blind" approach to supernova classification. Consequently this approach may be particularly suitable for the classification of astronomical transients in the era of large synoptic sky surveys.
    Comment: 14 pages, 8 figures. Published in MNRAS.
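The translation-invariance claim can be illustrated with a toy example. This is not the BAGIDIS construction itself (BAGIDIS builds a per-curve hierarchy of unbalanced Haar wavelets and keeps breakpoint information); it only shows the underlying intuition that ordering detail coefficients by size discards position: two identical spikes at different locations have different raw Haar coefficient vectors but identical size-ranked magnitudes.

```python
import numpy as np

def haar_coeffs(x):
    """Orthonormal Haar transform of a signal whose length is a power of two:
    all detail coefficients, finest level first, plus the overall average."""
    x = np.asarray(x, dtype=float)
    out = []
    while len(x) > 1:
        out.extend((x[0::2] - x[1::2]) / np.sqrt(2))  # details at this scale
        x = (x[0::2] + x[1::2]) / np.sqrt(2)          # smoothed signal
    out.append(x[0])
    return np.array(out)

# Two identical spikes at different positions: a crude stand-in for the
# same light-curve feature occurring at different times.
spike_a = np.zeros(8); spike_a[2] = 1.0
spike_b = np.zeros(8); spike_b[5] = 1.0

raw_a, raw_b = haar_coeffs(spike_a), haar_coeffs(spike_b)
rank_a = np.sort(np.abs(raw_a))[::-1]   # size-ranked magnitudes
rank_b = np.sort(np.abs(raw_b))[::-1]
# raw_a and raw_b differ, but rank_a equals rank_b
```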

    Towards the Future of Supernova Cosmology

    For future surveys, spectroscopic follow-up for all supernovae will be extremely difficult. However, one can use light curve fitters to obtain the probability that an object is a Type Ia. One may consider applying a probability cut to the data, but we show that the resulting non-Ia contamination can lead to biases in the estimation of cosmological parameters. A different method, which allows the use of the full dataset and results in unbiased cosmological parameter estimation, is Bayesian Estimation Applied to Multiple Species (BEAMS). BEAMS is a Bayesian approach to the problem which includes the uncertainty in the types in the evaluation of the posterior. Here we outline the theory of BEAMS and demonstrate its effectiveness using both simulated datasets and SDSS-II data. We also show that it is possible to use BEAMS if the data are correlated, by introducing a numerical marginalisation over the types of the objects. This is largely a pedagogical introduction to BEAMS with references to the main BEAMS papers.
    Comment: Replaced under married name Lochner (formerly Knights). 3 pages, 2 figures. To appear in the Proceedings of the 13th Marcel Grossmann Meeting (MG13), Stockholm, Sweden, 1-7 July 2012.
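The core of BEAMS is a per-object mixture: each candidate contributes a probability-weighted sum of the Ia and non-Ia likelihoods to the posterior, so no hard cut is needed. The sketch below illustrates this on a deliberately simple toy problem, estimating a location parameter from contaminated one-dimensional data; the population shapes, offsets, and classifier probabilities are invented for illustration and are not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, mu_true, offset = 400, 0.0, 0.5
is_ia = rng.random(N) < 0.7                         # ~70% genuine "Ia" objects
x = np.where(is_ia,
             rng.normal(mu_true, 0.1, N),           # Ia population
             rng.normal(mu_true + offset, 0.3, N))  # contaminant population
# imperfect classifier probabilities, analogous to light-curve-fitter output
p = np.clip(np.where(is_ia, 0.9, 0.1) + rng.normal(0.0, 0.05, N), 0.0, 1.0)

def norm_pdf(v, m, s):
    return np.exp(-0.5 * ((v - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def beams_loglike(mu):
    """Each object contributes a p-weighted mixture of both likelihoods,
    so no object is hard-classified or thrown away."""
    return float(np.sum(np.log(p * norm_pdf(x, mu, 0.1)
                               + (1.0 - p) * norm_pdf(x, mu + offset, 0.3))))

grid = np.linspace(-0.2, 0.4, 601)
mu_beams = grid[np.argmax([beams_loglike(m) for m in grid])]
mu_naive = x.mean()   # treating everything as Ia inherits the contamination bias
```

The mixture estimate recovers the true location while the naive mean is pulled toward the contaminants, which is the bias the abstract warns a probability cut cannot fully remove.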

    Extending BEAMS to incorporate correlated systematic uncertainties

    New supernova surveys such as the Dark Energy Survey, Pan-STARRS and the LSST will produce an unprecedented number of photometric supernova candidates, most with no spectroscopic data. Avoiding biases in cosmological parameters due to the resulting inevitable contamination from non-Ia supernovae can be achieved with the BEAMS formalism, allowing for fully photometric supernova cosmology studies. Here we extend BEAMS to deal with the case in which the supernovae are correlated by systematic uncertainties. The analytical form of the full BEAMS posterior requires evaluating 2^N terms, where N is the number of supernova candidates. This `exponential catastrophe' is computationally unfeasible even for N of order 100. We circumvent the exponential catastrophe by marginalising numerically instead of analytically over the possible supernova types: we augment the cosmological parameters with nuisance parameters describing the covariance matrix and the types of all the supernovae, \tau_i, which we include in our MCMC analysis. We show that this method deals well even with large, unknown systematic uncertainties without a major increase in computational time, whereas ignoring the correlations can lead to significant biases and incorrect credible contours. We then compare the numerical marginalisation technique with a perturbative expansion of the posterior, based on the insight that future surveys will have exquisite light curves and hence the probability that a given candidate is a Type Ia will be close to unity or zero for most objects. Although this perturbative approach changes computation of the posterior from a 2^N problem into an N^2 or N^3 one, we show that it leads to biases in general through a small number of misclassifications, implying that numerical marginalisation is superior.
    Comment: Resubmitted under married name Lochner (formerly Knights). Version 3: major changes, including a large-scale analysis with thousands of MCMC chains. Matches version published in JCAP. 23 pages, 8 figures.
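The 2^N structure can be made concrete with a toy calculation (all numbers invented, at a fixed value of the cosmological parameters): the full posterior sums over every type assignment, but for independent objects that sum factorises into a product of N per-object terms. Correlated systematics break the factorisation, which is why sampling the types as extra MCMC parameters becomes attractive.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
N = 12                            # small enough that 2^N = 4096 terms is tractable
p = rng.uniform(0.05, 0.95, N)    # P(object i is a Ia)
L_ia = rng.uniform(0.1, 1.0, N)   # per-object likelihood at fixed theta, if Ia
L_non = rng.uniform(0.1, 1.0, N)  # ... and if non-Ia (toy numbers)

# Brute force: sum over every type assignment tau in {0,1}^N -- this is the
# 2^N "exponential catastrophe" in the analytical posterior.
brute = 0.0
for tau in itertools.product((0, 1), repeat=N):
    term = 1.0
    for i, t in enumerate(tau):
        term *= p[i] * L_ia[i] if t else (1.0 - p[i]) * L_non[i]
    brute += term

# With *independent* objects the sum factorises into a product of N terms;
# correlated uncertainties break this factorisation, which is why the paper
# instead samples the types tau_i as nuisance parameters in the MCMC.
factored = float(np.prod(p * L_ia + (1.0 - p) * L_non))
```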

    Photometric Supernova Cosmology with BEAMS and SDSS-II

    Supernova cosmology without spectroscopic confirmation is an exciting new frontier, which we address here with the Bayesian Estimation Applied to Multiple Species (BEAMS) algorithm and the full three years of data from the Sloan Digital Sky Survey II Supernova Survey (SDSS-II SN). BEAMS is a Bayesian framework for using data from multiple species in statistical inference when one has the probability that each data point belongs to a given species, corresponding in this context to different types of supernovae with their probabilities derived from their multi-band lightcurves. We run the BEAMS algorithm on both Gaussian and more realistic SNANA simulations with of order 10^4 supernovae, testing the algorithm against various pitfalls one might expect in the new and somewhat uncharted territory of photometric supernova cosmology. We compare the performance of BEAMS to that of both mock spectroscopic surveys and photometric samples which have been cut using typical selection criteria. The latter are typically either biased due to contamination or have significantly larger contours in the cosmological parameters due to small datasets. We then apply BEAMS to the 792 SDSS-II photometric supernovae with host spectroscopic redshifts. In this case, BEAMS reduces the area of the (\Omega_m,\Omega_\Lambda) contours by a factor of three relative to the case where only spectroscopically confirmed data are used (297 supernovae). In the case of flatness, the constraint obtained on the matter density by applying BEAMS to the photometric SDSS-II data is \Omega_m(BEAMS)=0.194\pm0.07. This illustrates the potential power of BEAMS for future large photometric supernova surveys such as LSST.
    Comment: 25 pages, 15 figures, submitted to ApJ.

    What does it mean to be affiliated with care?: Delphi consensus on the definition of unaffiliation and specialist in sickle cell disease

    Accruing evidence reveals best practices for how to help individuals living with Sickle Cell Disease (SCD); yet the implementation of these evidence-based practices in healthcare settings is lacking. The Sickle Cell Disease Implementation Consortium (SCDIC) is a national consortium that uses implementation science to identify and address barriers to care in SCD. The SCDIC seeks to understand how and why patients become unaffiliated from care and to determine strategies to identify and connect patients to care. A challenge, however, is the lack of an agreed-upon definition of what it means to be unaffiliated and what it means to be an SCD expert provider. In this study, we conducted a Delphi process to obtain expert consensus on what it means to be an unaffiliated patient with SCD and to define an SCD specialist, as no standard definition is available. Twenty-eight SCD experts participated in three rounds of questions. Consensus was defined as 80% or more of respondents agreeing. Experts reached consensus that an individual with SCD who is unaffiliated from care is someone who has not been seen by a sickle cell specialist in at least a year. A sickle cell specialist was defined as someone with knowledge of and experience in SCD. Having knowledge means being knowledgeable of the 2014 NIH guidelines "Evidence-Based Management of SCD", trained in hydroxyurea management and transfusions, trained in screening for organ damage in SCD, trained in pain management and in SCD emergencies, and aware of psychosocial and cognitive issues in SCD. Experience expected of an SCD specialist includes working with SCD patients, being mentored by an SCD specialist, regular attendance at SCD conferences, and obtaining continuing medical education on SCD every two years.
    The results have strong implications for future research, practice, and policy related to SCD by helping to lay a foundation for a new area of research (e.g., identifying subpopulations of unaffiliation and targeted interventions) and for policies that support reaffiliation and increase accessibility to quality care.

    Results from the Supernova Photometric Classification Challenge

    We report results from the Supernova Photometric Classification Challenge (SNPCC), a publicly released mix of simulated supernovae (SNe), with types (Ia, Ibc, and II) selected in proportion to their expected rate. The simulation was realized in the griz filters of the Dark Energy Survey (DES) with realistic observing conditions (sky noise, point-spread function and atmospheric transparency) based on years of recorded conditions at the DES site. Simulations of non-Ia type SNe are based on spectroscopically confirmed light curves that include unpublished non-Ia samples donated by the Carnegie Supernova Project (CSP), the Supernova Legacy Survey (SNLS), and the Sloan Digital Sky Survey-II (SDSS-II). A spectroscopically confirmed subset was provided for training. We challenged scientists to run their classification algorithms and report a type and photo-z for each SN. Participants from 10 groups contributed 13 entries for the sample that included a host-galaxy photo-z for each SN, and 9 entries for the sample that had no redshift information. Several different classification strategies resulted in similar performance, and for all entries the performance was significantly better for the training subset than for the unconfirmed sample. For the spectroscopically unconfirmed subset, the entry with the highest average figure of merit for classifying SNe Ia has an efficiency of 0.96 and an SN Ia purity of 0.79. As a public resource for the future development of photometric SN classification and photo-z estimators, we have released updated simulations with improvements based on our experience from the SNPCC, added samples corresponding to the Large Synoptic Survey Telescope (LSST) and the SDSS, and provided the answer keys so that developers can evaluate their own analyses.
    Comment: accepted by PASP.
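The challenge's figure of merit combines efficiency with a pseudo-purity in which each false positive carries an extra penalty; the factor of 3 used below is the penalty weight quoted for the challenge, and the example counts are invented to reproduce the wavelet-classifier entry earlier in this listing (efficiency 0.805, ordinary purity 0.824, score about 0.49).

```python
def snpcc_fom(n_ia_total, n_ia_passed, n_false, w_false=3.0):
    """SNPCC classification figure of merit: efficiency multiplied by a
    pseudo-purity in which each false positive counts w_false times."""
    efficiency = n_ia_passed / n_ia_total            # fraction of true Ia kept
    pseudo_purity = n_ia_passed / (n_ia_passed + w_false * n_false)
    return efficiency * pseudo_purity

# Invented counts: 805 of 1000 true Ia pass the cut along with 172 impostors,
# giving ordinary purity 805/977 ~ 0.824 and a figure of merit near 0.49.
fom = snpcc_fom(n_ia_total=1000, n_ia_passed=805, n_false=172)
```

Because false positives are triple-counted, a classifier can report a high ordinary purity yet a much lower figure of merit, which is why the two numbers are quoted separately in the entries above.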