Statistical Methods for Time-Conditional Survival Probability and Equally Spaced Count Data
This dissertation develops statistical methods for time-conditional survival probability and for equally spaced count data. Time-conditional survival probabilities are an alternative measure of future survival that accounts for the time already survived since diagnosis; they are estimated as a ratio of survival probabilities. In Chapter 2, we derive the asymptotic distribution of a vector of nonparametric estimators and use weighted least squares methodology for the analysis of time-conditional survival probabilities. We show that the proposed test statistics for evaluating the relationship between time-conditional survival probabilities and additional time survived have central chi-square distributions under the null hypotheses. Further, we conduct simulation studies to assess the empirical type I error rate for one of the hypothesis tests developed and to assess the power of the various models and statistics proposed. Additionally, we use weighted least squares techniques to fit regression models for the log time-conditional survival probabilities as a function of time survived after diagnosis to address clinically relevant questions. In Chapter 3, we derive the asymptotic distribution of time-conditional survival probability estimators from a Weibull parametric regression model and from a Logistic-Weibull cure model, adjusting for continuous covariates. We implement the weighted least squares methodology to assess relevant hypotheses. We create a statistical framework for investigating time-conditional survival probability by developing additional methodological approaches to address the relationship between estimated time-conditional survival probabilities, time survived, and patient prognostic factors. Over-dispersed count data are often encountered in longitudinal studies. In Chapter 4, we implement a maximum-likelihood-based method for the analysis of equally spaced longitudinal count data with over-dispersion.
The key features of this approach are first-order antedependence and linearity of the conditional expectations. We also assume a first-order Markov model, so that the value of an outcome for a subject at a given measurement occasion depends only on the value at the previous occasion. Our maximum likelihood approach using the Poisson model for count data benefits from a simple interpretation of the regression parameters, similar to that in a GEE analysis of count data.
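The ratio construction described above can be illustrated with a minimal Kaplan-Meier sketch: the time-conditional survival probability P(T > t + s | T > t) is estimated as S(t + s)/S(t). The data below are hypothetical, and the function names are illustrative, not the dissertation's implementation.

```python
# Time-conditional survival probability as a ratio of Kaplan-Meier estimates.
# Hypothetical data: (time, event) pairs with event = 1 (death) or 0 (censored).

def kaplan_meier(data):
    """Return a step function S(t) giving the Kaplan-Meier survival estimate."""
    event_times = sorted({t for t, e in data if e == 1})
    steps = []          # (event time, cumulative survival after that time)
    s = 1.0
    for t in event_times:
        at_risk = sum(1 for u, _ in data if u >= t)
        deaths = sum(1 for u, e in data if u == t and e == 1)
        s *= 1.0 - deaths / at_risk
        steps.append((t, s))
    def S(t):
        surv = 1.0
        for u, s_u in steps:
            if u <= t:
                surv = s_u
        return surv
    return S

def conditional_survival(S, t, s):
    """P(T > t + s | T > t) = S(t + s) / S(t)."""
    return S(t + s) / S(t)

data = [(2, 1), (3, 0), (4, 1), (5, 1), (8, 0), (9, 1), (12, 0)]
S = kaplan_meier(data)
p = conditional_survival(S, 2, 3)   # survive 3 more units given 2 already survived
```

For these toy data, S(2) = 6/7 and S(5) = 18/35, so the conditional probability is exactly 0.6; the dissertation's contribution is the asymptotic theory for vectors of such estimators, not the point estimate itself.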
Semiparametric bivariate modelling with flexible extremal dependence
Inference on multivariate tails often requires a number of assumptions that may affect the assessment of the extremal dependence structure. Models are usually constructed so that extreme components can either be asymptotically dependent or asymptotically independent of each other. Recently, there has been increasing interest in modelling multivariate extremes more flexibly, allowing models to bridge both asymptotic dependence regimes. Here we propose a novel semiparametric approach which allows for a variety of dependence patterns, be they extremal or not, by using the full dataset in a model-based fashion. We build on previous work for inference on marginal exceedances over a high, unknown threshold, combining it with flexible, semiparametric copula specifications to investigate extreme dependence, thus modelling the marginals and the dependence structure separately. Because of the generality of our approach, only bivariate problems are investigated here, owing to computational challenges, but multivariate extensions are readily available. Empirical results suggest that our approach can provide sound uncertainty statements about the possibility of asymptotic independence, and we propose a criterion to quantify the presence of either extremal regime which performs well in our applications when compared to alternatives. Estimation of functions of interest for extremes is performed via MCMC algorithms. Attention is also devoted to the prediction of new extreme observations. Our approach is evaluated through simulations, applied to real data, and assessed against competing approaches. The evidence demonstrates that using the bulk of the data does not bias, and in fact improves, the inferential process for extremal dependence in our applications.
Maximum Likelihood Based Analysis of Equally Spaced Longitudinal Count Data with Specified Marginal Means, First-order Antedependence, and Linear Conditional Expectations
This manuscript implements a maximum-likelihood-based approach that is appropriate for equally spaced longitudinal count data with over-dispersion, so that the variance of the outcome variable is larger than expected under the assumed Poisson distribution. We implement the proposed method in the analysis of two data sets and make comparisons with the semi-parametric generalized estimating equations (GEE) approach, which incorrectly ignores the over-dispersion. Our simulations demonstrate that the proposed method has better small-sample efficiency than GEE. We also provide R code that can be used to recreate the analysis results presented in this manuscript.
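The over-dispersion this abstract refers to is easy to diagnose informally: under a Poisson model the variance equals the mean, so a dispersion index (sample variance over sample mean) well above 1 signals over-dispersion. A minimal sketch with hypothetical counts:

```python
# Dispersion-index check for count data. Under a Poisson model Var(Y) = E(Y),
# so variance/mean >> 1 suggests over-dispersion. Data here are hypothetical.
from statistics import mean, variance

counts = [0, 2, 1, 5, 0, 7, 3, 0, 9, 1, 4, 12]   # toy longitudinal counts

m = mean(counts)
v = variance(counts)        # sample variance (n - 1 denominator)
dispersion = v / m          # ≈ 1 for Poisson-like data, > 1 if over-dispersed
```

This is only a screening check; the manuscript's approach handles over-dispersion within the likelihood itself rather than by a post hoc correction.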
A Solution to the Galactic Foreground Problem for LISA
Low-frequency gravitational wave detectors, such as the Laser Interferometer Space Antenna (LISA), will have to contend with large foregrounds produced by millions of compact galactic binaries in our galaxy. While these galactic signals are interesting in their own right, the unresolved component can obscure other sources. The science yield for the LISA mission can be improved if the brighter and more isolated foreground sources can be identified and regressed from the data. Since the signals overlap with one another, we are faced with a "cocktail party" problem of picking out individual conversations in a crowded room. Here we present and implement an end-to-end solution to the galactic foreground problem that is able to resolve tens of thousands of sources from across the LISA band. Our algorithm employs a variant of the Markov Chain Monte Carlo (MCMC) method, which we call the Blocked Annealed Metropolis-Hastings (BAM) algorithm. Following a description of the algorithm and its implementation, we give several examples ranging from searches for a single source to searches for hundreds of overlapping sources. Our examples include data sets from the first round of Mock LISA Data Challenges. (19 pages, 27 figures)
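The BAM algorithm itself involves blocked updates and annealing, but its core accept/reject step is that of a standard Metropolis-Hastings sampler. A minimal random-walk sketch of that core step, on a toy one-dimensional target (not the LISA likelihood), looks like this:

```python
# Minimal random-walk Metropolis-Hastings sampler: a sketch of the core
# accept/reject step only, not the Blocked Annealed variant of the paper.
import math
import random

def metropolis_hastings(log_target, x0, n_steps, step=1.0, seed=0):
    """Sample from an unnormalised log-density via a Gaussian random walk."""
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    chain, accepted = [], 0
    for _ in range(n_steps):
        prop = x + rng.gauss(0.0, step)
        lp_prop = log_target(prop)
        # Accept with probability min(1, target(prop) / target(x)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
            accepted += 1
        chain.append(x)
    return chain, accepted / n_steps

# Toy target: standard normal, log-density -x^2/2 up to an additive constant.
chain, acc_rate = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 5000)
```

In the paper's setting the target is the posterior over the parameters of many overlapping binaries; blocking updates groups of correlated parameters and annealing the likelihood are what make the search over tens of thousands of sources tractable.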
Influence on disease spread dynamics of herd characteristics in a structured livestock industry
Studies of between-herd contacts may provide important insight into disease transmission dynamics. By comparing results from models with different levels of detail in the description of animal movement, we studied how various factors influence the final epidemic size as well as the dynamic behaviour of an outbreak. We investigated the effect of contact heterogeneity among pig herds in Sweden due to herd size, between-herd distance, and production type. Our comparative study suggests that the production-type structure is the most influential factor. Hence, our results imply that production type is the most important factor to obtain valid data for, and to include, when modelling and analysing this system. The study also revealed that all the included factors reduce the final epidemic size and have still more diverse effects on the initial rate of disease spread. This implies that a large set of factors ought to be included to obtain relevant predictions when modelling disease spread between herds. Furthermore, our results show that a more detailed model changes predictions regarding the variability in outbreak dynamics, and we conclude that this is an important factor to consider in risk assessment.
Tests of Bayesian Model Selection Techniques for Gravitational Wave Astronomy
The analysis of gravitational wave data involves many model selection problems. The most important example is the detection problem of selecting between the data being consistent with instrument noise alone, or instrument noise and a gravitational wave signal. The analysis of data from ground-based gravitational wave detectors is mostly conducted using classical statistics, and methods such as the Neyman-Pearson criterion are used for model selection. Future space-based detectors, such as the Laser Interferometer Space Antenna (LISA), are expected to produce rich data streams containing the signals from many millions of sources. Determining the number of sources that are resolvable, and the most appropriate description of each source, poses a challenging model selection problem that may best be addressed in a Bayesian framework. An important class of LISA sources are the millions of low-mass binary systems within our own galaxy, tens of thousands of which will be detectable. Not only is the number of sources unknown, but so is the number of parameters required to model the waveforms. For example, a significant subset of the resolvable galactic binaries will exhibit orbital frequency evolution, while a smaller number will have measurable eccentricity. In the Bayesian approach to model selection one needs to compute the Bayes factor between competing models. Here we explore various methods for computing Bayes factors in the context of determining which galactic binaries have measurable frequency evolution. The methods explored include a Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm, Savage-Dickey density ratios, the Schwarz-Bayes Information Criterion (BIC), and the Laplace approximation to the model evidence. We find good agreement between all of the approaches. (11 pages, 6 figures)
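Of the methods the abstract lists, the BIC route is the simplest to illustrate: the log Bayes factor between two models is approximated by half the difference of their BICs. A sketch on toy coin-flip data (not the gravitational-wave waveform models of the paper):

```python
# BIC approximation to a Bayes factor: ln BF_10 ≈ (BIC_0 - BIC_1) / 2,
# with BIC = k * ln(n) - 2 * max log-likelihood. Toy binomial data only.
import math

heads, n = 70, 100            # hypothetical data: 70 successes in 100 trials

# Model 0: fair coin, p = 0.5 fixed, k = 0 free parameters.
loglik0 = n * math.log(0.5)
bic0 = 0 * math.log(n) - 2 * loglik0

# Model 1: free success probability, maximised at p_hat = heads / n, k = 1.
p_hat = heads / n
loglik1 = heads * math.log(p_hat) + (n - heads) * math.log(1 - p_hat)
bic1 = 1 * math.log(n) - 2 * loglik1

# Approximate Bayes factor in favour of the richer model.
bayes_factor_10 = math.exp((bic0 - bic1) / 2)
```

The same logic applies to the paper's question: a binary with frequency evolution adds one waveform parameter, and the Bayes factor weighs the improved fit against that extra parameter's Occam penalty.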
Dynamic analysis of survival models and related processes.
This thesis presents new methods of analysis of survival data based on a dynamic Bayesian approach. The models allow the parameters to change with time. The analysis is tractable and emphasises the predictive aspects of the models. The survival problems covered include linear and non-linear regression, analysis of random samples, time-dependent covariates, life tables, and competing risks. The analysis is also extended to a number of point processes. Numerical applications are provided and the microcomputer software to perform them is described.
Nonparametric Reconstruction of the Dark Energy Equation of State from Diverse Data Sets
The cause of the accelerated expansion of the Universe poses one of the most fundamental questions in physics today. In the absence of a compelling theory to explain the observations, a first task is to develop a robust phenomenology. If the acceleration is driven by some form of dark energy, then the phenomenology is determined by the dark energy equation of state w. A major aim of ongoing and upcoming cosmological surveys is to measure w and its time dependence at high accuracy. Since w(z) is not directly accessible to measurement, powerful reconstruction methods are needed to extract it reliably from observations. We have recently introduced a new reconstruction method for w(z) based on Gaussian process modeling. This method can capture nontrivial time dependences in w(z) and, most importantly, it yields controlled and unbiased error estimates. In this paper we extend the method to include a diverse set of measurements: baryon acoustic oscillations, cosmic microwave background measurements, and supernova data. We analyze currently available data sets and present the resulting constraints on w(z), finding that current observations are in very good agreement with a cosmological constant. In addition, we explore how well our method captures nontrivial behavior of w(z) by analyzing simulated data assuming high-quality observations from future surveys. We find that the baryon acoustic oscillation measurements by themselves already lead to remarkably good reconstruction results and that the combination of different high-quality probes allows us to reconstruct w(z) very reliably with small error bounds. (14 pages, 9 figures, 3 tables)
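The Gaussian-process machinery behind such a reconstruction can be sketched in a few lines: condition an RBF-kernel prior on noisy observations and read off the posterior mean. The data below are hypothetical measurements near the cosmological-constant value w = -1, and this toy mean-only predictor is illustrative, not the paper's pipeline.

```python
# Sketch of Gaussian-process regression (posterior mean only): the kind of
# nonparametric smoother used to reconstruct w(z). Toy data, toy kernel.
import math

def rbf(x1, x2, length=0.5):
    """Squared-exponential kernel."""
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)

def solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def gp_predict(xs, ys, x_star, noise=1e-4):
    """Posterior mean at x_star: k_*^T (K + noise * I)^{-1} y."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = solve(K, ys)
    return sum(rbf(x_star, xi) * a for xi, a in zip(xs, alpha))

# Hypothetical noisy w(z) measurements scattered around w = -1.
zs = [0.0, 0.25, 0.5, 0.75, 1.0]
ws = [-1.02, -0.98, -1.01, -0.99, -1.00]
w_hat = gp_predict(zs, ws, 0.5)
```

The full method also propagates the posterior covariance, which is what gives the "controlled and unbiased error estimates" the abstract emphasises; this sketch returns only the mean.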