    Saddlepoint approximation for moment generating functions of truncated random variables

    We consider the problem of approximating the moment generating function (MGF) of a truncated random variable in terms of the MGF of the underlying (i.e., untruncated) random variable. The purpose of approximating the MGF is to enable the application of saddlepoint approximations to certain distributions determined by truncated random variables. Two important statistical applications are the following: the approximation of certain multivariate cumulative distribution functions; and the approximation of passage time distributions in ion channel models which incorporate time interval omission. We derive two types of representation for the MGF of a truncated random variable. One of these representations is obtained by exponential tilting. The second type of representation, which has two versions, is referred to as an exponential convolution representation. Each representation motivates a different approximation. It turns out that each of the three approximations is extremely accurate in those cases "to which it is suited." Moreover, there is a simple rule of thumb for deciding which approximation to use in a given case, and if this rule is followed, then our numerical and theoretical results indicate that the resulting approximation will be extremely accurate. Comment: Published at http://dx.doi.org/10.1214/009053604000000689 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
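The exponential tilting idea mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard identity that the MGF of a variable truncated to (a, b) equals the untruncated MGF times the ratio of the interval probability under the tilted law to the interval probability under the original law. The normal distribution is chosen here only because its tilted law is again normal with mean mu + t*sigma^2; the function names are hypothetical, and this is the exact identity for that one case, not the approximations derived in the paper.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_mgf(t, mu, sigma, a, b):
    """MGF at t of N(mu, sigma^2) truncated to the interval (a, b).

    Tilting identity: M_trunc(t) = M(t) * P_t(a < X < b) / P_0(a < X < b),
    where P_t is the exponentially tilted law. For a normal variable the
    tilted law is N(mu + t * sigma^2, sigma^2).
    """
    untruncated_mgf = math.exp(mu * t + 0.5 * sigma ** 2 * t ** 2)
    tilted_mu = mu + t * sigma ** 2
    p_tilted = norm_cdf((b - tilted_mu) / sigma) - norm_cdf((a - tilted_mu) / sigma)
    p_orig = norm_cdf((b - mu) / sigma) - norm_cdf((a - mu) / sigma)
    return untruncated_mgf * p_tilted / p_orig


def numeric_truncated_mgf(t, mu, sigma, a, b, n=20000):
    """Midpoint-rule estimate of the same quantity, used as a sanity check."""
    h = (b - a) / n
    num = 0.0
    den = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        # Unnormalized normal density; the constant cancels in the ratio.
        w = math.exp(-0.5 * ((x - mu) / sigma) ** 2)
        num += math.exp(t * x) * w
        den += w
    return num / den


if __name__ == "__main__":
    args = (0.7, 0.0, 1.0, -1.0, 2.0)  # t, mu, sigma, a, b (illustrative values)
    print(truncated_normal_mgf(*args), numeric_truncated_mgf(*args))
```

The two printed values should agree closely, which is a quick way to check the identity before substituting an approximate MGF for the exact one.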

    EMMIXcskew: an R Package for the Fitting of a Mixture of Canonical Fundamental Skew t-Distributions

    This paper presents an R package EMMIXcskew for the fitting of the canonical fundamental skew t-distribution (CFUST) and finite mixtures of this distribution (FM-CFUST) via maximum likelihood (ML). The CFUST distribution provides a flexible family of models to handle non-normal data, with parameters for capturing skewness and heavy tails in the data. It formally encompasses the normal, t, and skew-normal distributions as special and/or limiting cases. A few other versions of the skew t-distribution are also nested within the CFUST distribution. In this paper, an Expectation-Maximization (EM) algorithm is described for computing the ML estimates of the parameters of the FM-CFUST model, and different strategies for initializing the algorithm are discussed and illustrated. The methodology is implemented in the EMMIXcskew package, and examples are presented using two real datasets. The EMMIXcskew package contains functions to fit the FM-CFUST model, including procedures for generating different initial values. Additional features include random sample generation and contour visualization in 2D and 3D.

    Multivariate Bernoulli distribution

    In this paper, we consider the multivariate Bernoulli distribution as a model to estimate the structure of graphs with binary nodes. This distribution is discussed in the framework of the exponential family, and its statistical properties regarding independence of the nodes are demonstrated. Importantly, the model can estimate not only the main effects and pairwise interactions among the nodes but also higher-order interactions, allowing for the existence of complex clique effects. We compare the multivariate Bernoulli model with existing graphical inference models, namely the Ising model and the multivariate Gaussian model, in which only the pairwise interactions are considered. On the other hand, the multivariate Bernoulli distribution has an interesting property in that independence and uncorrelatedness of the component random variables are equivalent. Both the marginal and conditional distributions of a subset of variables in the multivariate Bernoulli distribution still follow the multivariate Bernoulli distribution. Furthermore, the multivariate Bernoulli logistic model is developed under generalized linear model theory by utilizing the canonical link function in order to include covariate information on the nodes, edges and cliques. We also consider variable selection techniques such as LASSO in the logistic model to impose sparsity structure on the graph. Finally, we discuss extending the smoothing spline ANOVA approach to the multivariate Bernoulli logistic model to enable estimation of non-linear effects of the predictor variables. Comment: Published at http://dx.doi.org/10.3150/12-BEJSP10 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
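The equivalence of independence and uncorrelatedness stated in this abstract is easy to check in the bivariate case, where the covariance of the two components reduces to p11*p00 - p10*p01. The sketch below is only an illustration for two binary variables with hypothetical function names and made-up probability tables; it does not reproduce the paper's general multivariate argument.

```python
def bivariate_bernoulli_cov(p00, p01, p10, p11):
    """Covariance of (Y1, Y2), where pij = P(Y1 = i, Y2 = j)."""
    assert abs(p00 + p01 + p10 + p11 - 1.0) < 1e-9, "probabilities must sum to 1"
    p1 = p10 + p11  # P(Y1 = 1)
    p2 = p01 + p11  # P(Y2 = 1)
    return p11 - p1 * p2  # E[Y1 * Y2] - E[Y1] * E[Y2]


def is_independent(p00, p01, p10, p11, tol=1e-12):
    """True when every joint cell equals the product of its marginals."""
    joint = {(0, 0): p00, (0, 1): p01, (1, 0): p10, (1, 1): p11}
    marg1 = {0: p00 + p01, 1: p10 + p11}
    marg2 = {0: p00 + p10, 1: p01 + p11}
    return all(
        abs(joint[(i, j)] - marg1[i] * marg2[j]) <= tol
        for i in (0, 1)
        for j in (0, 1)
    )


if __name__ == "__main__":
    # Illustrative tables (assumed values, not taken from the paper).
    product_table = (0.28, 0.42, 0.12, 0.18)    # built from P(Y1=1)=0.3, P(Y2=1)=0.6
    dependent_table = (0.40, 0.10, 0.10, 0.40)  # mass concentrated on the diagonal
    for table in (product_table, dependent_table):
        print(bivariate_bernoulli_cov(*table), is_independent(*table))
```

For the product table the covariance is zero up to rounding and the independence check passes; for the diagonal-heavy table both fail together, as the equivalence predicts.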

    Closed-Form Bayesian Inferences for the Logit Model via Polynomial Expansions

    Articles in the marketing and choice literatures have demonstrated the need for incorporating person-level heterogeneity into behavioral models (e.g., logit models for multiple binary outcomes as studied here). However, the logit likelihood extended with a population distribution of heterogeneity doesn't yield closed-form inferences, and therefore numerical integration techniques are relied upon (e.g., MCMC methods). We present here an alternative: closed-form Bayesian inferences for the logit model, which we obtain by approximating the logit likelihood via a polynomial expansion, and then positing a distribution of heterogeneity from a flexible family that is now conjugate and integrable. For problems where the response coefficients are independent, choosing the Gamma distribution leads to rapidly convergent closed-form expansions; if there are correlations among the coefficients, one can still obtain rapidly convergent closed-form expansions by positing a distribution of heterogeneity from a Multivariate Gamma distribution. The solution then comes from the moment generating function of the Multivariate Gamma distribution or, in general, from the multivariate heterogeneity distribution assumed. Closed-form Bayesian inferences, derivatives (useful for elasticity calculations), population distribution parameter estimates (useful for summarization) and starting values (useful for complicated algorithms) are hence directly available. Two simulation studies demonstrate the efficacy of our approach. Comment: 30 pages, 2 figures, corrected some typos. Appears in Quantitative Marketing and Economics vol 4 (2006), no. 2, 173--20