1,159 research outputs found
State space mixed models for longitudinal observations with binary and binomial responses
We propose a new class of state space models for longitudinal discrete response data where the observation equation is specified in an additive form involving both deterministic and random linear predictors. These models allow us to explicitly address the effects of trend, seasonal or other time-varying covariates while preserving the power of state space models in modeling serial dependence in the data. We develop a Markov chain Monte Carlo algorithm to carry out statistical inference for models with binary and binomial responses, in which we invoke de Jong and Shephard's (1995) simulation smoother to establish an efficient sampling procedure for the state variables. To quantify and control the sensitivity of the posteriors to the priors on the variance parameters, we add a signal-to-noise ratio type parameter in the specification of these priors. Finally, we illustrate the applicability of the proposed state space mixed models for longitudinal binomial response data in both simulation studies and data examples.
Composite likelihood Bayesian information criteria for model selection in high dimensional data
For high-dimensional data sets with complicated dependency structures, the full likelihood approach often leads to intractable computational complexity. This makes model selection difficult, as most of the traditionally used information criteria require the evaluation of the full likelihood. We propose a composite likelihood version of the Bayesian information criterion (BIC) and establish its consistency property for the selection of the true underlying model. Under some mild regularity conditions, the proposed BIC is shown to be selection consistent, where the number of potential model parameters is allowed to increase to infinity at a certain rate of the sample size. Simulation studies demonstrate the empirical performance of this new BIC criterion, especially in the scenario where the number of parameters increases with the sample size.
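For orientation, a composite likelihood BIC is often written in the general composite likelihood literature in the form below. This is a sketch of that generic form, not necessarily the exact criterion of this paper: the symbols $c\ell$, $H$, $J$ and $d^{*}$ are notation assumed here, and the abstract does not state the penalty used when the number of parameters grows with the sample size.

```latex
% Generic composite likelihood BIC (assumed notation, not taken from the abstract):
%   c\ell(\theta; y)  composite log-likelihood, maximized at \hat\theta_{c}
%   H(\theta)         sensitivity matrix, E[-\nabla^2 c\ell(\theta; Y)]
%   J(\theta)         variability matrix, Var[\nabla c\ell(\theta; Y)]
\[
  \mathrm{BIC}_{c\ell}
    = -2\, c\ell(\hat\theta_{c}; y) + d^{*} \log n ,
  \qquad
  d^{*} = \operatorname{tr}\!\bigl\{ H(\theta)^{-1} J(\theta) \bigr\} .
\]
% Under a correctly specified full likelihood, H = J and d^{*} reduces to
% the usual parameter count, recovering the ordinary BIC penalty.
```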
Time-Deformation Modeling Of Stock Returns Directed By Duration Processes
This paper presents a new class of time-deformation (or stochastic volatility) models for stock returns sampled in transaction time and directed by a generalized duration process. Stochastic volatility in this model is driven by an observed duration process and a latent autoregressive process. Parameter estimation in the model is carried out by using the method of simulated moments (MSM) due to its analytical feasibility and numerical stability for the proposed model. Simulations are conducted to validate the choices of the moments used in the formulation of the MSM. Both the simulation and empirical results obtained in this paper indicate that this approach works well for the proposed model. The main empirical findings for the IBM transaction return data can be summarized as follows: (i) the return distribution conditional on the duration process is not Gaussian, even though the duration process itself can marginally function as a directing process; (ii) the return process is highly leveraged; (iii) a longer trade duration tends to be associated with a higher return volatility; and (iv) the proposed model is capable of reproducing returns whose marginal density function is close to that of the empirical returns. Keywords: Duration process; Ergodicity; Method of simulated moments; Return process; Stationarity.
Efficient Estimation of the Partly Linear Additive Hazards Model with Current Status Data
This paper focuses on efficient estimation, optimal rates of convergence and effective algorithms in the partly linear additive hazards regression model with current status data. We use polynomial splines to estimate both cumulative baseline hazard function with monotonicity constraint and nonparametric regression functions with no such constraint. We propose a simultaneous sieve maximum likelihood estimation for regression parameters and nuisance parameters and show that the resultant estimator of regression parameter vector is asymptotically normal and achieves the semiparametric information bound. In addition, we show that rates of convergence for the estimators of nonparametric functions are optimal. We implement the proposed estimation through a backfitting algorithm on generalized linear models. We conduct simulation studies to examine the finite-sample performance of the proposed estimation method and present an analysis of renal function recovery data for illustration. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/110752/1/sjos12108.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/110752/2/sjos12108-sup-0001-supinfo.pd
fastMI: a fast and consistent copula-based estimator of mutual information
As a fundamental concept in information theory, mutual information (MI) has
been commonly applied to quantify association between random variables. Most
existing estimators of MI have unstable statistical performance since they
involve parameter tuning. We develop a consistent and powerful estimator,
called fastMI, that does not incur any parameter tuning. Based on a copula
formulation, fastMI estimates MI by leveraging Fast Fourier transform-based
estimation of the underlying density. Extensive simulation studies reveal that
fastMI outperforms state-of-the-art estimators with improved estimation
accuracy and reduced run time for large data sets. fastMI provides a powerful
test for independence that exhibits satisfactory type I error control.
Anticipating that it will be a powerful tool in estimating mutual information
in a broad range of data, we develop an R package fastMI for broader
dissemination.
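The copula formulation mentioned in this abstract rests on the identity that mutual information equals the expected log copula density of the rank-transformed variables. The sketch below illustrates that identity only. It is not the fastMI estimator: it swaps the FFT-based density estimate for a plain histogram, which does need a tuning choice (`bins`), and the function name `copula_mi_histogram` is hypothetical.

```python
import numpy as np

def copula_mi_histogram(x, y, bins=16):
    """Plug-in MI estimate from a histogram of the empirical copula.

    Illustrative only: fastMI uses an FFT-based density estimate instead.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # Rank-transform each margin to pseudo-observations in (0, 1).
    u = (np.argsort(np.argsort(x)) + 1) / (n + 1)
    v = (np.argsort(np.argsort(y)) + 1) / (n + 1)
    # Histogram estimate of the copula on the unit square.
    counts, _, _ = np.histogram2d(u, v, bins=bins, range=[[0, 1], [0, 1]])
    p = counts / n            # cell probabilities
    c = p * bins * bins       # copula density per cell (uniform margins)
    mask = p > 0
    # MI = E[log c(U, V)], approximated by a sum over occupied cells.
    return float(np.sum(p[mask] * np.log(c[mask])))

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
mi_independent = copula_mi_histogram(x, rng.normal(size=5000))
mi_dependent = copula_mi_histogram(x, x + 0.1 * rng.normal(size=5000))
```

With independent inputs the estimate should sit near zero (a small positive bias is expected from the histogram), while strongly dependent inputs should give a clearly larger value.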
A Class of Directed Acyclic Graphs with Mixed Data Types in Mediation Analysis
We propose a unified class of generalized structural equation models (GSEMs)
with data of mixed types in mediation analysis, including continuous,
categorical, and count variables. Such models extend substantially the
classical linear structural equation model to accommodate many data types
arising from the application of mediation analysis. Invoking the hierarchical
modeling approach, we specify GSEMs by a copula joint distribution of outcome
variable, mediator and exposure variable, in which marginal distributions are
built upon generalized linear models (GLMs) with confounding factors. We
discuss the identifiability conditions for the causal mediation effects in the
counterfactual paradigm as well as the issue of mediation leakage, and develop
an asymptotically efficient profile maximum likelihood estimation and inference
for two key mediation estimands, natural direct effect and natural indirect
effect, in different scenarios of mixed data types. The proposed new
methodology is illustrated by a motivating epidemiological study that aims to
investigate whether the tempo of reaching infancy BMI peak (delay or on time),
an important early life growth milestone, may mediate the association between
prenatal exposure to phthalates and pubertal health outcomes. Comment: 33 pages, 3 figures, 3 tables
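The two estimands named in this abstract are usually defined in the counterfactual literature as follows. This is the standard textbook form, given here as a reading aid. The notation ($a$, $a^{*}$, $Y(a, m)$, $M(a)$) is assumed and may differ from the paper's own.

```latex
% Standard counterfactual definitions (assumed notation):
%   a, a^{*}   exposure level and reference level
%   M(a)       mediator value that would arise under exposure a
%   Y(a, m)    outcome that would arise under exposure a and mediator m
\[
  \mathrm{NDE} = E\bigl[ Y\bigl(a, M(a^{*})\bigr) \bigr]
               - E\bigl[ Y\bigl(a^{*}, M(a^{*})\bigr) \bigr],
  \qquad
  \mathrm{NIE} = E\bigl[ Y\bigl(a, M(a)\bigr) \bigr]
               - E\bigl[ Y\bigl(a, M(a^{*})\bigr) \bigr].
\]
% The two sum to the total effect:
%   NDE + NIE = E[Y(a, M(a))] - E[Y(a^{*}, M(a^{*}))].
```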