    Statistical Methods for Time-Conditional Survival Probability and Equally Spaced Count Data

    This dissertation develops statistical methods for time-conditional survival probability and for equally spaced count data. Time-conditional survival probabilities offer an alternative measure of future survival that accounts for the time already elapsed since diagnosis, and they are estimated as a ratio of survival probabilities. In Chapter 2, we derive the asymptotic distribution of a vector of nonparametric estimators and use weighted least squares methodology for the analysis of time-conditional survival probabilities. We show that the proposed test statistics for evaluating the relationship between time-conditional survival probabilities and additional time survived have central chi-square distributions under the null hypotheses. We conduct simulation studies to assess the empirical type I error rate for one of the hypothesis tests developed and to assess the power of the various models and statistics proposed. We also use weighted least squares techniques to fit regression models for the log time-conditional survival probabilities as a function of time survived after diagnosis, addressing clinically relevant questions. In Chapter 3, we derive the asymptotic distribution of time-conditional survival probability estimators from a Weibull parametric regression model and from a logistic-Weibull cure model, adjusting for continuous covariates, and implement the weighted least squares methodology to assess relevant hypotheses. Together these methods form a statistical framework for investigating the relationship between estimated time-conditional survival probabilities, time survived, and patient prognostic factors. Over-dispersed count data are often encountered in longitudinal studies. In Chapter 4, we implement a maximum-likelihood-based method for the analysis of equally spaced longitudinal count data with over-dispersion. The key features of this approach are first-order antedependence and linearity of the conditional expectations: we assume a first-order Markov model, so that the value of an outcome for a subject at a given measurement occasion depends only on the value at the previous occasion. Our maximum likelihood approach using the Poisson model for count data benefits from a simple interpretation of the regression parameters, similar to that in a GEE analysis of count data.
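
    The basic quantity is simple to compute in practice. Below is a minimal sketch in Python that estimates P(T > s + t | T > s) as a ratio of Kaplan-Meier survival estimates, matching the nonparametric estimator described above. The lifelines package and all simulated data are illustrative assumptions, not part of the dissertation.

```python
import numpy as np
from lifelines import KaplanMeierFitter

# Simulated survival data (illustrative only): times since diagnosis
# in years, with roughly 20% right-censoring.
rng = np.random.default_rng(0)
durations = rng.exponential(scale=5.0, size=300)
events = rng.uniform(size=300) < 0.8

kmf = KaplanMeierFitter().fit(durations, event_observed=events)

def conditional_survival(kmf, s, t):
    """P(T > s + t | T > s): a ratio of Kaplan-Meier survival estimates."""
    return kmf.predict(s + t) / kmf.predict(s)

# Probability of surviving 5 more years given 0, 2, or 4 years already survived.
for s in (0.0, 2.0, 4.0):
    print(f"P(T > {s + 5} | T > {s}) = {conditional_survival(kmf, s, 5.0):.3f}")
```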

    Semiparametric bivariate modelling with flexible extremal dependence

    Inference on multivariate tails often requires assumptions that may affect the assessment of the extremal dependence structure. Models are usually constructed so that extreme components are either asymptotically dependent or asymptotically independent of each other. Recently, there has been increasing interest in modelling multivariate extremes more flexibly, allowing models to bridge both asymptotic dependence regimes. Here we propose a novel semiparametric approach that accommodates a variety of dependence patterns, whether extremal or not, by using the full dataset in a model-based fashion. We build on previous work on inference for marginal exceedances over a high, unknown threshold, combining it with flexible semiparametric copula specifications to investigate extreme dependence, thus modelling the marginals and the dependence structure separately. Because of the generality of our approach, we restrict attention to bivariate problems here for computational reasons, but multivariate extensions are readily available. Empirical results suggest that our approach can provide sound uncertainty statements about the possibility of asymptotic independence, and we propose a criterion for quantifying which extremal regime is present that performs well in our applications compared with available alternatives. Estimation of functions of interest for extremes is performed via MCMC algorithms, and attention is also devoted to the prediction of new extreme observations. Our approach is evaluated through simulations, applied to real data, and assessed against competing approaches. The evidence shows that using the bulk of the data does not bias, and in fact improves, the inferential process for extremal dependence in our applications.
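
    To illustrate the distinction between the two extremal regimes that such a criterion must separate, the sketch below computes the standard empirical tail-dependence diagnostic chi(u) on rank-transformed margins. This is a common textbook diagnostic, not the paper's own criterion, and the Gaussian example data are hypothetical.

```python
import numpy as np

def empirical_chi(x, y, u):
    """Empirical chi(u) = P(F_Y(Y) > u | F_X(X) > u) on rank margins.
    chi(u) tending to a positive limit as u -> 1 suggests asymptotic
    dependence; chi(u) -> 0 suggests asymptotic independence."""
    n = len(x)
    # Rank-transform to approximately uniform margins (empirical copula).
    ux = np.argsort(np.argsort(x)) / (n + 1)
    uy = np.argsort(np.argsort(y)) / (n + 1)
    marginal = np.mean(ux > u)
    joint = np.mean((ux > u) & (uy > u))
    return joint / marginal if marginal > 0 else np.nan

rng = np.random.default_rng(1)
# A Gaussian copula with rho < 1 is asymptotically independent,
# so chi(u) should decay towards zero as u increases.
z = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=5000)
for u in (0.90, 0.95, 0.99):
    print(f"chi({u}) = {empirical_chi(z[:, 0], z[:, 1], u):.3f}")
```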

    Maximum Likelihood Based Analysis of Equally Spaced Longitudinal Count Data with Specified Marginal Means, First-order Antedependence, and Linear Conditional Expectations

    This manuscript implements a maximum-likelihood-based approach appropriate for equally spaced longitudinal count data with over-dispersion, i.e. data whose outcome variance is larger than expected under the assumed Poisson distribution. We apply the proposed method to two data sets and compare it with the semi-parametric generalized estimating equations (GEE) approach, which incorrectly ignores the over-dispersion. Simulations demonstrate that the proposed method has better small-sample efficiency than GEE. We also provide R code that can be used to recreate the analysis results reported in this manuscript.
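
    The flavour of the likelihood can be sketched in a few lines: assume a log-linear marginal mean and a Poisson conditional distribution whose mean is linear in the previous observation (first-order antedependence). This is an illustrative simplification, not the manuscript's exact model (which also accommodates over-dispersion), and the toy data and parameterization are invented.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

# Toy data: one subject, equally spaced counts over time (invented).
y = np.array([3, 5, 4, 6, 8, 7, 9, 11, 10, 12], dtype=float)
t = np.arange(len(y), dtype=float)

def negloglik(params):
    b0, b1, rho = params
    mu = np.exp(b0 + b1 * t)  # marginal means, log-linear in time
    # Linear conditional expectation under first-order antedependence:
    # E[Y_t | Y_{t-1}] = mu_t + rho * (y_{t-1} - mu_{t-1})
    lam = np.empty_like(mu)
    lam[0] = mu[0]
    lam[1:] = mu[1:] + rho * (y[:-1] - mu[:-1])
    lam = np.clip(lam, 1e-8, None)  # keep conditional means positive
    # Poisson log-likelihood given the previous observations.
    return -np.sum(y * np.log(lam) - lam - gammaln(y + 1))

fit = minimize(negloglik, x0=[1.0, 0.1, 0.3], method="Nelder-Mead")
print(fit.x)  # estimated (b0, b1, rho)
```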

    A Solution to the Galactic Foreground Problem for LISA

    Low-frequency gravitational wave detectors, such as the Laser Interferometer Space Antenna (LISA), will have to contend with large foregrounds produced by millions of compact binaries in our galaxy. While these galactic signals are interesting in their own right, the unresolved component can obscure other sources. The science yield of the LISA mission can be improved if the brighter and more isolated foreground sources can be identified and regressed from the data. Since the signals overlap with one another, we face a "cocktail party" problem of picking out individual conversations in a crowded room. Here we present and implement an end-to-end solution to the galactic foreground problem that is able to resolve tens of thousands of sources from across the LISA band. Our algorithm employs a variant of the Markov chain Monte Carlo (MCMC) method, which we call the Blocked Annealed Metropolis-Hastings (BAM) algorithm. Following a description of the algorithm and its implementation, we give several examples ranging from searches for a single source to searches for hundreds of overlapping sources. Our examples include data sets from the first round of the Mock LISA Data Challenges. (19 pages, 27 figures)
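
    The annealing half of the idea can be shown in miniature: an inverse temperature beta is ramped from near zero to one, flattening the posterior early on so the chain can migrate between modes created by overlapping sources before the target sharpens. The toy below anneals a generic Metropolis-Hastings sampler on a two-mode stand-in likelihood; the real BAM algorithm additionally updates blocks of correlated source parameters against LISA waveform models, which is far beyond this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)

def log_likelihood(theta):
    # Stand-in target: two well-separated modes at +1 and -1, mimicking
    # the multimodality caused by overlapping sources.
    return -10.0 * min(np.sum((theta - 1.0) ** 2), np.sum((theta + 1.0) ** 2))

def annealed_mh(theta0, n_steps=20000, beta_ramp=5000):
    """Metropolis-Hastings with simulated annealing: beta rises from
    ~0 to 1 so early proposals are accepted freely and the chain can
    explore before settling into a mode."""
    theta = np.array(theta0, dtype=float)
    logl = log_likelihood(theta)
    for step in range(n_steps):
        beta = min(1.0, (step + 1) / beta_ramp)  # heating schedule
        prop = theta + rng.normal(scale=0.2, size=theta.size)
        logl_prop = log_likelihood(prop)
        if np.log(rng.uniform()) < beta * (logl_prop - logl):
            theta, logl = prop, logl_prop
    return theta

print(annealed_mh([0.0, 0.0]))  # ends near one of the two modes
```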

    Influence on disease spread dynamics of herd characteristics in a structured livestock industry

    Studies of between-herd contacts can provide important insight into disease transmission dynamics. By comparing results from models with different levels of detail in the description of animal movement, we studied how herd characteristics influence both the final epidemic size and the dynamic behaviour of an outbreak. We investigated the effect of contact heterogeneity among pig herds in Sweden arising from herd size, between-herd distance, and production type. Our comparative study suggests that the production-type structure is the most influential factor; hence, production type is the most important factor to obtain valid data for and to include when modelling and analysing this system. The study also revealed that all of the included factors reduce the final epidemic size, and that they have more diverse effects on the initial rate of disease spread, implying that a large set of factors ought to be included to obtain reliable predictions when modelling disease spread between herds. Furthermore, our results show that a more detailed model changes predictions regarding the variability in outbreak dynamics, and we conclude that this is an important factor to consider in risk assessment.
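
    To make the role of production-type heterogeneity concrete, here is a toy stochastic between-herd simulation in which the daily contact rate depends on a herd's production type. All type labels, rates, and probabilities are hypothetical, and the model is far cruder than those compared in the study.

```python
import numpy as np

rng = np.random.default_rng(3)
n_herds = 500
# Hypothetical production types with different onward-contact rates.
types = rng.choice(["farrow", "finisher", "mixed"], size=n_herds, p=[0.3, 0.5, 0.2])
contact_rate = {"farrow": 0.8, "finisher": 0.2, "mixed": 0.5}

def simulate_outbreak(p_transmit=0.3, t_max=100):
    """SIR-type spread between herds; an infected herd is detected and
    cleared (removed) after 14 days."""
    status = np.zeros(n_herds, int)        # 0 = S, 1 = I, 2 = R
    days_infected = np.zeros(n_herds, int)
    status[rng.integers(n_herds)] = 1      # index case
    for _ in range(t_max):
        for i in np.flatnonzero(status == 1):
            # Daily number of animal-movement contacts depends on type.
            n_contacts = rng.poisson(contact_rate[types[i]])
            for j in rng.integers(n_herds, size=n_contacts):
                if status[j] == 0 and rng.uniform() < p_transmit:
                    status[j] = 1
        days_infected[status == 1] += 1
        status[(status == 1) & (days_infected >= 14)] = 2
    return int(np.sum(status > 0))         # final epidemic size

print(np.mean([simulate_outbreak() for _ in range(20)]))
```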

    Tests of Bayesian Model Selection Techniques for Gravitational Wave Astronomy

    The analysis of gravitational wave data involves many model selection problems. The most important example is the detection problem: deciding whether the data are consistent with instrument noise alone, or with instrument noise plus a gravitational wave signal. The analysis of data from ground-based gravitational wave detectors is mostly conducted using classical statistics, with methods such as the Neyman-Pearson criterion used for model selection. Future space-based detectors, such as the Laser Interferometer Space Antenna (LISA), are expected to produce rich data streams containing the signals of many millions of sources. Determining how many sources are resolvable, and the most appropriate description of each source, poses a challenging model selection problem that may best be addressed in a Bayesian framework. An important class of LISA sources are the millions of low-mass binary systems within our own galaxy, tens of thousands of which will be detectable. Not only is the number of sources unknown, but so is the number of parameters required to model each waveform. For example, a significant subset of the resolvable galactic binaries will exhibit orbital frequency evolution, while a smaller number will have measurable eccentricity. In the Bayesian approach to model selection one needs to compute the Bayes factor between competing models. Here we explore various methods for computing Bayes factors in the context of determining which galactic binaries have measurable frequency evolution. The methods explored include a reversible jump Markov chain Monte Carlo (RJMCMC) algorithm, Savage-Dickey density ratios, the Schwarz-Bayes information criterion (BIC), and the Laplace approximation to the model evidence. We find good agreement between all of the approaches. (11 pages, 6 figures)
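
    Of the methods listed, the Savage-Dickey density ratio is the easiest to sketch: for nested models in which the restricted model fixes a parameter (here, a frequency derivative at zero), the Bayes factor in favour of the restricted model is the ratio of posterior to prior density at that fixed value. The toy below assumes Gaussian posterior samples and a Gaussian prior; both are invented stand-ins, not the paper's actual posteriors.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(4)

# Toy posterior samples for a frequency-derivative parameter fdot
# (as if from an MCMC run under the larger model allowing fdot != 0).
posterior_fdot = rng.normal(loc=0.5, scale=0.1, size=20000)

# Assumed prior on fdot in the larger model: zero-mean Gaussian, width 1.
prior = norm(loc=0.0, scale=1.0)

# Savage-Dickey: BF for the restricted model (fdot = 0) equals the
# posterior density at 0 divided by the prior density at 0.
post_density_at_0 = gaussian_kde(posterior_fdot)(0.0)[0]
bf_restricted = post_density_at_0 / prior.pdf(0.0)
print(f"Bayes factor (fdot = 0 vs free fdot): {bf_restricted:.2e}")
# Here the posterior mass sits far from 0, so the data strongly favour
# the model with measurable frequency evolution.
```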

    Dynamic analysis of survival models and related processes.

    This thesis presents new methods for the analysis of survival data based on a dynamic Bayesian approach, in which model parameters are allowed to change with time. The analysis is tractable and emphasises the predictive aspects of the models. The survival problems covered include linear and non-linear regression, analysis of random samples, time-dependent covariates, life tables, and competing risks. The analysis is also extended to a number of point processes. Numerical applications are provided, and the microcomputer software used to perform them is described.
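
    One way to see the "parameters change with time" idea is a discount-factor update for a piecewise hazard: old evidence is down-weighted so the hazard estimate can drift, while new deaths and exposure are added conjugately. The sketch below is a crude stand-in for the thesis's dynamic survival models, with all numbers invented.

```python
import numpy as np

rng = np.random.default_rng(5)

# Discrete-time toy data: the daily death probability jumps at day 10.
n, t_max = 2000, 20
true_h = np.where(np.arange(t_max) < 10, 0.05, 0.15)
alive = np.ones(n, bool)
deaths, at_risk = np.zeros(t_max), np.zeros(t_max)
for t in range(t_max):
    at_risk[t] = alive.sum()
    dies = alive & (rng.uniform(size=n) < true_h[t])
    deaths[t] = dies.sum()
    alive &= ~dies

# Dynamic Bayesian hazard estimate: conjugate Gamma(a, b) updates with a
# discount factor delta < 1, letting the hazard level evolve over time.
delta, a, b = 0.7, 1.0, 10.0
for t in range(t_max):
    a = delta * a + deaths[t]   # discount old evidence, add new deaths
    b = delta * b + at_risk[t]  # ... and new exposure
    print(f"day {t:2d}: true h = {true_h[t]:.2f}, estimate = {a / b:.3f}")
```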

    Nonparametric Reconstruction of the Dark Energy Equation of State from Diverse Data Sets

    The cause of the accelerated expansion of the Universe poses one of the most fundamental questions in physics today. In the absence of a compelling theory to explain the observations, a first task is to develop a robust phenomenology. If the acceleration is driven by some form of dark energy, the phenomenology is determined by the dark energy equation of state w. A major aim of ongoing and upcoming cosmological surveys is to measure w and its time dependence at high accuracy. Since w(z) is not directly accessible to measurement, powerful reconstruction methods are needed to extract it reliably from observations. We have recently introduced a new reconstruction method for w(z) based on Gaussian process modeling. This method can capture nontrivial time dependence in w(z) and, most importantly, yields controlled and unbiased error estimates. In this paper we extend the method to include a diverse set of measurements: baryon acoustic oscillations, cosmic microwave background measurements, and supernova data. We analyze currently available data sets and present the resulting constraints on w(z), finding that current observations are in very good agreement with a cosmological constant. In addition, we explore how well our method captures nontrivial behavior of w(z) by analyzing simulated data assuming high-quality observations from future surveys. We find that the baryon acoustic oscillation measurements by themselves already lead to remarkably good reconstruction results, and that the combination of different high-quality probes allows us to reconstruct w(z) very reliably with small error bounds. (14 pages, 9 figures, 3 tables)
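
    The core smoothing step of a Gaussian-process reconstruction can be sketched as follows. In the real analysis w(z) is never observed directly and enters only through distance integrals over the expansion history, so the toy below, which fits a GP (via scikit-learn, an assumed tool) to hypothetical direct "measurements" of w(z), illustrates only the regression-with-error-bars machinery.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(6)

# Hypothetical noisy measurements of w(z) on a redshift grid.
z = np.linspace(0.0, 1.5, 25)
w_true = -1.0 + 0.3 * np.tanh(2.0 * (z - 0.5))  # mild time dependence
w_obs = w_true + rng.normal(scale=0.05, size=z.size)

# Squared-exponential covariance plus a white-noise term for the errors.
kernel = RBF(length_scale=0.5) + WhiteKernel(noise_level=0.05**2)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(z[:, None], w_obs)

# Posterior mean and pointwise uncertainty of the reconstructed w(z).
z_fine = np.linspace(0.0, 1.5, 200)
w_mean, w_std = gp.predict(z_fine[:, None], return_std=True)
print(f"w(0) = {w_mean[0]:.3f} +/- {w_std[0]:.3f}")
```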