Transformations in regression, estimation, testing and modelling
Transformation is a powerful tool for model building. In regression the response variable is transformed in order to achieve the usual assumptions of normality, constant variance and additivity of effects. Here the normality assumption is replaced by the Laplace distributional assumption, appropriate when more large errors occur than would be expected if the errors were normally distributed. The parametric model is enlarged to include a transformation parameter and a likelihood procedure is adopted for estimating this parameter simultaneously with other parameters of interest. Diagnostic methods are described for assessing the influence of individual observations on the choice of transformation. Examples are presented. In distribution methodology the independent responses are transformed in order that a distributional assumption is satisfied for the transformed data. Here the interest is in the family of distributions that do not depend on an unknown shape parameter. The gamma distribution (known order), with the exponential distribution as a special case, is a member of this family. An information number approach is proposed for transforming a known distribution to the gamma distribution (known order). The approach provides an insight into the large-sample behaviour of the likelihood procedure considered by Draper and Guttman (1968) for investigating transformations of data which allow the transformed observations to follow a gamma distribution. The information number approach is illustrated for three examples, and the improvement towards the gamma distribution introduced by transformation is measured numerically and graphically. A graphical procedure is proposed for the general case of investigating transformations of data which allow the transformed observations to follow a distribution dependent on unknown threshold and scale parameters.
The procedure is extended to include model testing and estimation for any distribution which, with the aid of a power transformation, can be put in the simple form of a distribution that does not depend on an unknown shape parameter. The procedure is based on a ratio, R(y), which is constructed from the power transformation. Also described is a ratio-based technique for estimating the threshold parameter in important parametric models, including the three-parameter Weibull and lognormal distributions. Ratio estimation for the Weibull distribution is assessed and compared with the modified maximum likelihood estimation of Cohen and Whitten (1982) in terms of bias and root mean squared error, by means of a simulation study. The methods are illustrated with several examples and extend naturally to singly Type I and Type II censored data.
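The simultaneous estimation of a transformation parameter under Laplace errors can be sketched numerically. The toy below is an intercept-only illustration, not the thesis's full regression procedure: it profiles the Laplace likelihood of a Box-Cox parameter over a grid, using the facts that for fixed transformation the Laplace location and scale MLEs are the median and the mean absolute deviation about the median. All names and the grid are illustrative choices.

```python
import numpy as np

def boxcox(y, lam):
    """Box-Cox power transform; lam near 0 gives the log transform."""
    if abs(lam) < 1e-8:
        return np.log(y)
    return (y ** lam - 1.0) / lam

def laplace_profile_loglik(y, lam):
    """Profile log-likelihood of lam under Laplace (double-exponential) errors.

    For fixed lam the Laplace MLEs are the median (location) and the mean
    absolute deviation about the median (scale); the Jacobian of the power
    transformation contributes (lam - 1) * sum(log y).
    """
    z = boxcox(y, lam)
    b = np.mean(np.abs(z - np.median(z)))     # Laplace scale MLE
    n = len(y)
    return -n * np.log(2.0 * b) - n + (lam - 1.0) * np.sum(np.log(y))

def estimate_lambda(y, grid=np.linspace(-2.0, 2.0, 81)):
    """Maximize the profile log-likelihood over a grid of lam values."""
    ll = [laplace_profile_loglik(y, lam) for lam in grid]
    return grid[int(np.argmax(ll))]

rng = np.random.default_rng(0)
# Data that are Laplace on the log scale, so the true lam is 0.
y = np.exp(rng.laplace(loc=2.0, scale=0.5, size=500))
lam_hat = estimate_lambda(y)
```

The same profile can be inspected graphically, which is in the spirit of the diagnostic and graphical procedures the abstract describes.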
Nonanticipating estimation applied to sequential analysis and changepoint detection
Suppose a process yields independent observations whose distributions belong
to a family parameterized by \theta\in\Theta. When the process is in control,
the observations are i.i.d. with a known parameter value \theta_0. When the
process is out of control, the parameter changes. We apply an idea of Robbins
and Siegmund [Proc. Sixth Berkeley Symp. Math. Statist. Probab. 4 (1972) 37-41]
to construct a class of sequential tests and detection schemes whereby the
unknown post-change parameters are estimated. This approach is especially
useful in situations where the parametric space is intricate and mixture-type
rules are operationally or conceptually difficult to formulate. We exemplify
our approach by applying it to the problem of detecting a change in the shape
parameter of a Gamma distribution, in both a univariate and a multivariate
setting.
Comment: Published at http://dx.doi.org/10.1214/009053605000000183 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
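A minimal sketch of the nonanticipating idea in the Gamma-shape setting, under simplifying assumptions that are mine, not the paper's: the scale is fixed at 1, the post-change shape is estimated by a simple method-of-moments plug-in, and what is shown is a plain power-one sequential test rather than the paper's detection schemes.

```python
import numpy as np
from math import lgamma

def nonanticipating_test(x, theta0, threshold):
    """Sequential test of H0: theta = theta0 for the shape of a
    Gamma(theta, scale=1) distribution.  Following the Robbins--Siegmund
    idea, the unknown alternative is replaced at step i by an estimate
    built only from the first i - 1 observations, so each log-likelihood
    ratio increment is nonanticipating.
    """
    w = 0.0
    theta_hat = theta0                 # no data yet: plug in the null value
    for i, xi in enumerate(x, start=1):
        # log f_{theta_hat}(xi) - log f_{theta0}(xi), scale fixed at 1
        w += (theta_hat - theta0) * np.log(xi) - (lgamma(theta_hat) - lgamma(theta0))
        if w >= threshold:
            return i, w                # stop and reject H0
        # method-of-moments update (E[X] = theta when scale = 1),
        # floored away from zero to keep the density well defined
        theta_hat = max(float(np.mean(x[:i])), 1e-3)
    return None, w                     # no rejection within the sample

rng = np.random.default_rng(7)
x = rng.gamma(shape=3.0, scale=1.0, size=200)   # out of control from the start
stop, w = nonanticipating_test(x, theta0=1.0, threshold=np.log(1 / 0.01))
```

Because each increment uses only past observations, exp(w) is a nonnegative martingale with unit mean under H0, so stopping at threshold log(1/alpha) bounds the false-rejection probability by alpha.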
A non-Gaussian continuous state space model for asset degradation
The degradation model plays an essential role in asset life prediction and condition-based maintenance. Various degradation models have been proposed. Within these models, the state space model has the ability to combine degradation data and failure event data. The state space model is also an effective approach to dealing with multiple-observation and missing-data issues. Using the state space degradation model, the deterioration process of assets is represented by a system state process which can be revealed by a sequence of observations. Current research largely assumes that the underlying system development process is discrete in time or state. Although some models have been developed to consider continuous time and space, these state space models are based on the Wiener process under a Gaussian assumption. This paper proposes a Gamma-based state space degradation model in order to remove the Gaussian assumption. Both condition monitoring observations and failure events are considered in the model so as to improve the accuracy of asset life prediction. A simulation study is carried out to illustrate the application procedure of the proposed model.
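One common non-Gaussian choice consistent with this motivation is a gamma process, whose increments are gamma distributed and whose paths are monotone non-decreasing. The sketch below only simulates such degradation paths to a failure threshold; the paper's model additionally fuses condition-monitoring observations and failure events through a state space formulation, which is not reproduced here, and all parameter values are illustrative.

```python
import numpy as np

def simulate_gamma_degradation(alpha, beta, dt, horizon, threshold, rng):
    """Simulate one gamma-process degradation path to failure.

    Increments over each dt are Gamma(shape=alpha * dt, scale=beta), so the
    path is non-decreasing and non-Gaussian; 'failure' is the first time the
    cumulative degradation crosses the threshold.
    """
    t, x = 0.0, 0.0
    while t < horizon:
        x += rng.gamma(shape=alpha * dt, scale=beta)
        t += dt
        if x >= threshold:
            return t                  # failure time
    return np.inf                     # survived the horizon

rng = np.random.default_rng(42)
alpha, beta = 2.0, 0.5                # mean degradation rate alpha * beta = 1
lives = np.array([simulate_gamma_degradation(alpha, beta, 0.1, 200.0, 10.0, rng)
                  for _ in range(2000)])
mean_life = lives.mean()              # crude Monte Carlo life estimate
```

With threshold 10 and mean degradation rate 1 per unit time, the simulated mean life lands near 10, illustrating how such a forward model supports life prediction.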
Flexible Tweedie regression models for continuous data
Tweedie regression models provide a flexible family of distributions to deal
with non-negative highly right-skewed data as well as symmetric and heavy
tailed data and can handle continuous data with probability mass at zero. The
estimation and inference of Tweedie regression models based on the maximum
likelihood method are challenged by the presence of an infinite sum in the
probability function and non-trivial restrictions on the power parameter space.
In this paper, we propose two approaches for fitting Tweedie regression models,
namely, quasi- and pseudo-likelihood. We discuss the asymptotic properties of
the two approaches and perform simulation studies to compare our methods with
the maximum likelihood method. In particular, we show that the quasi-likelihood
method provides asymptotically efficient estimation for regression parameters.
The computational implementation of the alternative methods is faster and
easier than the orthodox maximum likelihood, relying on a simple Newton scoring
algorithm. Simulation studies showed that the quasi- and pseudo-likelihood
approaches yield estimates, standard errors and coverage rates similar to those
of the maximum likelihood method. Furthermore, the second-moment assumptions
required by the quasi- and pseudo-likelihood methods enable us to extend the
Tweedie regression models to the class of quasi-Tweedie regression models in
Wedderburn's style. Moreover, they allow us to eliminate the non-trivial
restriction on the power parameter space, and thus provide a flexible
regression model for continuous data. We provide an \texttt{R}
implementation and illustrate the application of Tweedie regression models
using three data sets.
Comment: 34 pages, 8 figures.
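The Newton scoring idea for quasi-likelihood estimation of the regression parameters can be sketched as follows; the log link, the fixed power, and the simulated gamma responses (power p = 2) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def newton_scoring(X, y, p, n_iter=25, tol=1e-10):
    """Quasi-likelihood fit of a Tweedie-type mean model with log link.

    Only second-moment assumptions are used: E[y] = mu = exp(X @ beta) and
    Var[y] proportional to mu**p.  Each step solves the quasi-score equation
    X' diag(mu**(1-p)) (y - mu) = 0 by Newton scoring with the expected
    sensitivity matrix X' diag(mu**(2-p)) X; the dispersion cancels.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        score = X.T @ ((mu ** (1.0 - p)) * (y - mu))
        sens = X.T @ ((mu ** (2.0 - p))[:, None] * X)
        step = np.linalg.solve(sens, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

rng = np.random.default_rng(1)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 0.5])
mu = np.exp(X @ beta_true)
# gamma responses: mean mu, variance mu**2 / 4, i.e. Tweedie power p = 2
y = rng.gamma(shape=4.0, scale=mu / 4.0)
beta_hat = newton_scoring(X, y, p=2.0)
```

No probability function, and hence no infinite series, is evaluated anywhere in the loop, which is the computational advantage the abstract points to.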
Fast and scalable non-parametric Bayesian inference for Poisson point processes
We study the problem of non-parametric Bayesian estimation of the intensity
function of a Poisson point process. The observations are independent
realisations of a Poisson point process on a fixed interval. We propose two
related approaches. In both approaches we model the intensity function as
piecewise constant on bins forming a partition of the interval. In
the first approach the coefficients of the intensity function are assigned
independent gamma priors, leading to a closed-form posterior distribution. On
the theoretical side, we prove that as the sample size grows the posterior
asymptotically concentrates around the "true", data-generating intensity
function at an optimal rate for H\"older-regular intensity functions. In
the second approach we employ a gamma Markov chain prior on the
coefficients of the intensity function. The posterior distribution is no longer
available in closed form, but inference can be performed using a
straightforward version of the Gibbs sampler. Both approaches scale well with
sample size, but the second is much less sensitive to the choice of the number
of bins. Practical performance of our methods is first demonstrated via
synthetic-data examples. We compare our second method with other existing
approaches on the UK coal mining disasters data. Furthermore, we apply it to
the US mass shootings data and Donald Trump's Twitter data.
Comment: 45 pages, 22 figures.
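The closed-form posterior in the first approach comes from gamma-Poisson conjugacy, which a short sketch can make concrete. Equal-width bins, the homogeneous example, and all parameter values are illustrative choices, not the paper's setup.

```python
import numpy as np

def intensity_posterior(events, T, n_bins, n_obs, a=1.0, b=1.0):
    """Conjugate posterior for a piecewise-constant Poisson intensity.

    The interval [0, T] is split into n_bins equal bins.  With independent
    Gamma(a, rate=b) priors on the bin heights and n_obs i.i.d. realisations
    of the process pooled into 'events', the count in bin k is
    Poisson(lambda_k * n_obs * T / n_bins), so the posterior of lambda_k is
    Gamma(a + count_k, rate=b + n_obs * T / n_bins) -- closed form, bin by bin.
    """
    counts, _ = np.histogram(events, bins=n_bins, range=(0.0, T))
    width = T / n_bins
    return a + counts, b + n_obs * width    # posterior (shape, rate)

rng = np.random.default_rng(3)
T, n_obs, lam_true = 1.0, 200, 5.0
n_events = rng.poisson(lam_true * n_obs * T)    # homogeneous example
events = rng.uniform(0.0, T, size=n_events)     # pooled event times
shape, rate = intensity_posterior(events, T, n_bins=10, n_obs=n_obs)
post_mean = shape / rate
```

The whole computation is a histogram plus two additions, which is why this approach scales so well with sample size.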
Bayesian spectral modeling for multiple time series
We develop a novel Bayesian modeling approach to spectral density estimation for multiple time series. The log-periodogram distribution for each series is modeled as a mixture of Gaussian distributions with frequency-dependent weights and mean functions. The implied model for the log-spectral density is a mixture of linear mean functions with frequency-dependent weights. The mixture weights are built through successive differences of a logit-normal distribution function with frequency-dependent parameters. Building from the construction for a single spectral density, we develop a hierarchical extension for multiple time series. Specifically, we set the mean functions to be common to all spectral densities and make the weights specific to the time series through the parameters of the logit-normal distribution. In addition to accommodating flexible spectral density shapes, a practically important feature of the proposed formulation is that it allows for ready posterior simulation through a Gibbs sampler with closed-form full conditional distributions for all model parameters. The modeling approach is illustrated with simulated datasets, and used for spectral analysis of multichannel electroencephalographic recordings (EEGs), which provide a key motivating application for the proposed methodology.
General Semiparametric Shared Frailty Model Estimation and Simulation with frailtySurv
The R package frailtySurv for simulating and fitting semi-parametric shared
frailty models is introduced. Package frailtySurv implements semi-parametric
consistent estimators for a variety of frailty distributions, including gamma,
log-normal, inverse Gaussian and power variance function, and provides
consistent estimators of the standard errors of the parameter estimates. The
parameter estimators are asymptotically normally distributed, and therefore
statistical inference based on the results of this package, such as hypothesis
testing and confidence intervals, can be performed using the normal
distribution. Extensive simulations demonstrate the flexibility and correct
implementation of the estimator. Two case studies performed with publicly
available datasets demonstrate the applicability of the package. In the Diabetic
Retinopathy Study, the onset of blindness is clustered by patient, and in a
large hard-drive failure dataset, failure times are thought to be clustered by
hard-drive manufacturer and model.