
    Estimation of Dynamic Latent Variable Models Using Simulated Nonparametric Moments

    Abstract. Given a model that can be simulated, conditional moments at a trial parameter value can be calculated with high accuracy by applying kernel smoothing methods to a long simulation. With such conditional moments in hand, standard method of moments techniques can be used to estimate the parameter. Because conditional moments are calculated using kernel smoothing rather than simple averaging, it is not necessary that the model be simulable subject to the conditioning information that is used to define the moment conditions. For this reason, the proposed estimator is applicable to general dynamic latent variable models. It is shown that as the number of simulations diverges, the estimator is consistent and a higher-order expansion reveals the stochastic difference between the infeasible GMM estimator based on the same moment conditions and the simulated version. In particular, we show how to adjust standard errors to account for the simulations. Monte Carlo results show how the estimator may be applied to a range of dynamic latent variable (DLV) models, and that it performs well in comparison to several other estimators that have been proposed for DLV models.
    Keywords: dynamic latent variable models; simulation-based estimation; simulated moments; kernel regression; nonparametric estimation
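    As a rough illustration of the idea, the sketch below computes a conditional moment by Nadaraya-Watson kernel smoothing of a long simulation and then minimizes a GMM criterion over a parameter grid. All concrete choices (the AR(1) simulator, Gaussian kernel, bandwidth, and the single moment condition) are hypothetical stand-ins, not taken from the paper:

```python
import numpy as np

def simulate_model(theta, n, rng):
    # Hypothetical simulable model: a Gaussian AR(1). The estimator in the
    # abstract only requires that the model can be simulated at a trial theta.
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = theta * y[t - 1] + rng.standard_normal()
    return y

def kernel_conditional_mean(x_sim, y_sim, x_eval, h):
    # Nadaraya-Watson (Gaussian kernel) estimate of E[y | x] from the simulation.
    w = np.exp(-0.5 * ((x_eval[:, None] - x_sim[None, :]) / h) ** 2)
    return (w @ y_sim) / w.sum(axis=1)

def gmm_criterion(theta, data, n_sim, h):
    # Single illustrative moment: E[(y_t - E[y_t | y_{t-1}]) * y_{t-1}] = 0,
    # with the conditional mean computed from a long simulation at theta.
    sim = simulate_model(theta, n_sim, np.random.default_rng(1))  # common random numbers
    m_hat = kernel_conditional_mean(sim[:-1], sim[1:], data[:-1], h)
    g = np.mean((data[1:] - m_hat) * data[:-1])
    return g ** 2

rng = np.random.default_rng(0)
data = simulate_model(0.5, 500, rng)       # "observed" data, true theta = 0.5
grid = np.linspace(0.1, 0.9, 17)
theta_hat = grid[int(np.argmin([gmm_criterion(t, data, 10_000, 0.3) for t in grid]))]
```

    Because the conditional mean is recovered by smoothing rather than averaging within conditioning cells, the same code pattern applies even when the model cannot be simulated conditionally on the observed state.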

    Estimation Algorithm for Mixture of Experts Recurrent Event Model

    This paper proposes a mixture of experts recurrent events model. This general model accommodates an unobservable frailty variable, intervention effects, the influence of accumulating event occurrences, and covariate effects. A latent class variable is utilized to deal with a heterogeneous population and associated covariates. A homogeneous nonparametric baseline hazard and heterogeneous parametric covariate effects are assumed. The maximum likelihood principle is employed to obtain parameter estimates. Since the frailty variable and latent classes are unobserved, an estimation procedure is derived through the EM algorithm. A simulated data set is generated to illustrate the data structure of recurrent events for a heterogeneous population.
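    The E-step/M-step alternation the abstract relies on can be sketched on a deliberately simplified stand-in: a two-class Gaussian mixture rather than the paper's recurrent-event model with frailty and hazards. All numbers below are illustrative:

```python
import numpy as np

def em_two_class(y, n_iter=200):
    # EM for a two-class Gaussian mixture: the E-step computes posterior class
    # probabilities given current parameters; the M-step performs weighted
    # maximum likelihood updates. The paper's model adds frailty, intervention
    # effects and a nonparametric baseline hazard on top of this pattern.
    pi = 0.5
    mu = np.array([y.min(), y.max()])      # spread-out initial means
    sd = np.array([y.std(), y.std()])
    for _ in range(n_iter):
        # E-step: responsibility of class 0 for each observation
        d0 = pi * np.exp(-0.5 * ((y - mu[0]) / sd[0]) ** 2) / sd[0]
        d1 = (1 - pi) * np.exp(-0.5 * ((y - mu[1]) / sd[1]) ** 2) / sd[1]
        w = d0 / (d0 + d1)
        # M-step: weighted ML updates of mixing proportion, means, and sds
        pi = w.mean()
        mu = np.array([np.average(y, weights=w), np.average(y, weights=1 - w)])
        sd = np.array([np.sqrt(np.average((y - mu[0]) ** 2, weights=w)),
                       np.sqrt(np.average((y - mu[1]) ** 2, weights=1 - w))])
    return pi, mu, sd

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0.0, 1.0, 400), rng.normal(5.0, 1.0, 600)])
pi, mu, sd = em_two_class(y)               # recovers the two latent classes
```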

    Nonparametric Censored and Truncated Regression

    The nonparametric censored regression model, with a fixed, known censoring point (normalized to zero), is y = max[0, m(x) + e], where both the regression function m(x) and the distribution of the error e are unknown. This paper provides estimators of m(x) and its derivatives. The convergence rate is the same as for an uncensored nonparametric regression and its derivatives. We also provide root-n estimates of weighted average derivatives of m(x), which equal the coefficients in linear or partly linear specifications for m(x). An extension permits estimation in the presence of a general form of heteroscedasticity. We also extend the estimator to the nonparametric truncated regression model, in which only uncensored data points are observed. The estimators are based on the relationship ∂E(y^k | x)/∂m(x) = k·E[y^(k-1) 1(y > 0) | x], which we show holds for positive integers k.
    Keywords: Semiparametric, nonparametric, censored regression, truncated regression, Tobit, latent variable
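    The moment relationship at the end of the abstract is easy to check numerically. The sketch below (sample size, the value of m, and the standard-normal error are arbitrary choices) verifies ∂E[y^k]/∂m = k·E[y^(k-1) 1(y > 0)] for k = 2 at a fixed x, using common random numbers for the finite difference:

```python
import numpy as np

rng = np.random.default_rng(0)
e = rng.standard_normal(1_000_000)   # error draws (distribution choice is arbitrary)
m, k, eps = 0.7, 2, 1e-3             # condition on a single x, so m(x) is just m

def moment(m_val):
    # E[y^k] for the censored outcome y = max(0, m + e)
    y = np.maximum(0.0, m_val + e)
    return np.mean(y ** k)

# Left side: numerical derivative of E[y^k] with respect to m.
# Reusing the same error draws makes the finite difference very accurate.
lhs = (moment(m + eps) - moment(m - eps)) / (2 * eps)

# Right side: k * E[y^(k-1) * 1(y > 0)]
y = np.maximum(0.0, m + e)
rhs = k * np.mean(y ** (k - 1) * (y > 0))
```

    The two quantities agree up to simulation and discretization error, which is the identity the paper's estimators exploit to recover m(x) from conditional moments of the censored outcome.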

    Identification and Inference of Nonlinear Models Using Two Samples with Arbitrary Measurement Errors

    This paper considers identification and inference of a general latent nonlinear model using two samples, where a covariate contains arbitrary measurement errors in both samples, and neither sample contains an accurate measurement of the corresponding true variable. The primary sample consists of some dependent variables, some error-free covariates, and an error-ridden covariate, where the measurement error has unknown distribution and could be arbitrarily correlated with the latent true values. The auxiliary sample consists of another noisy measurement of the mismeasured covariate and some error-free covariates. We first show that a general latent nonlinear model is nonparametrically identified using the two samples when both could have nonclassical errors, without requiring instrumental variables or independence between the two samples. When the two samples are independent and the latent nonlinear model is parameterized, we propose sieve quasi maximum likelihood estimation (MLE) for the parameter of interest, and establish its root-n consistency and asymptotic normality under possible misspecification, and its semiparametric efficiency under correct specification. We also provide a sieve likelihood ratio model selection test to compare two possibly misspecified parametric latent models. A small Monte Carlo simulation and an empirical example are presented.
    Keywords: Data combination, Nonlinear errors-in-variables model, Nonclassical measurement error, Nonparametric identification, Misspecified parametric latent model, Sieve likelihood estimation and inference

    Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis

    Factor analysis aims to determine latent factors, or traits, which summarize a given data set. Inter-battery factor analysis extends this notion to multiple views of the data. In this paper we show how a nonlinear, nonparametric version of these models can be recovered through the Gaussian process latent variable model. This gives us a flexible formalism for multi-view learning where the latent variables can be used both for exploratory purposes and for learning representations that enable efficient inference for ambiguous estimation tasks. Learning is performed in a Bayesian manner through the formulation of a variational compression scheme which gives a rigorous lower bound on the log likelihood. Our Bayesian framework provides strong regularization during training, allowing the structure of the latent space to be determined efficiently and automatically. We demonstrate this by producing the first (to our knowledge) published results of learning from dozens of views, even when data is scarce. We further show experimental results on several different types of multi-view data sets and for different kinds of tasks, including exploratory data analysis, generation, ambiguity modelling through latent priors, and classification.
    Comment: 49 pages including appendix

    Essays in the econometrics of dynamic duration models with application to tick by tick financial data.

    This thesis collects three contributions on the econometrics of durations in the context of high-frequency financial data. First, we provide existence conditions and analytical expressions for the moments of Log-ACD models; we focus on the dispersion index and the autocorrelation function and compare them with those of ACD and SCD models. Second, we apply the efficient importance sampling (EIS) method to compute the high-dimensional integral required to evaluate the likelihood function of the stochastic conditional duration (SCD) model. We compare EIS-based ML estimation with QML estimation based on the Kalman filter and find that EIS-ML estimation is statistically more precise, at the cost of an acceptable loss in computational speed. We illustrate this with simulated and real data, and show that the EIS-ML method extends easily to generalizations of the SCD model. Third, we carry out a nonparametric analysis of financial durations, using an existing algorithm to describe nonparametrically the dynamics of the process in terms of its lagged realizations and a latent variable, its conditional mean. The devices needed to apply the algorithm effectively to our dataset are presented. We show that, on simulated data, the nonparametric procedure yields better estimates than those delivered by an incorrectly specified parametric method, while on a real dataset the nonparametric analysis can convey information about the nature of the data generating process that may not be captured by the parametric specification.

    Leveraging the Exact Likelihood of Deep Latent Variable Models

    Deep latent variable models (DLVMs) combine the approximation abilities of deep neural networks and the statistical foundations of generative models. Variational methods are commonly used for inference; however, the exact likelihood of these models has been largely overlooked. The purpose of this work is to study the general properties of this quantity and to show how they can be leveraged in practice. We focus on important inferential problems that rely on the likelihood: estimation and missing data imputation. First, we investigate maximum likelihood estimation for DLVMs: in particular, we show that most unconstrained models used for continuous data have an unbounded likelihood function. This problematic behaviour is demonstrated to be a source of mode collapse. We also show how to ensure the existence of maximum likelihood estimates, and draw useful connections with nonparametric mixture models. Finally, we describe an algorithm for missing data imputation using the exact conditional likelihood of a deep latent variable model. On several data sets, our algorithm consistently and significantly outperforms the usual imputation scheme used for DLVMs.

    Nonparametric Identification of Multivariate Mixtures

    This article analyzes the identifiability of k-variate, M-component finite mixture models in which each component distribution has independent marginals, including models in latent class analysis. Without making parametric assumptions on the component distributions, we investigate how one can identify the number of components and the component distributions from the distribution function of the observed data. We reveal an important link between the number of variables (k), the number of values each variable can take, and the number of identifiable components. A lower bound on the number of components (M) is nonparametrically identifiable if k >= 2, and the maximum identifiable number of components is determined by the number of different values each variable takes. When M is known, the mixing proportions and the component distributions are nonparametrically identified from matrices constructed from the distribution function of the data if (i) k >= 3, (ii) two of the k variables take at least M different values, and (iii) these matrices satisfy some rank and eigenvalue conditions. For the unknown M case, we propose an algorithm that possibly identifies M and the component distributions from data. We discuss a condition for nonparametric identification and its observable implications. In case M cannot be identified, we use our identification condition to develop a procedure that consistently estimates a lower bound on the number of components by estimating the rank of a matrix constructed from the distribution function of observed variables.
    Keywords: finite mixture, latent class analysis, latent class model, model selection, number of components, rank estimation
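    The rank argument can be illustrated directly: with two discrete variables that are independent within each latent class, the joint pmf is a sum of M rank-one matrices, so the rank of that matrix reveals (a lower bound on) M. A toy sketch with made-up component distributions:

```python
import numpy as np

# Illustrative 2-component mixture of two discrete variables (k = 2 here,
# each variable taking 4 values); all probabilities below are made up.
p = np.array([[0.7, 0.1, 0.1, 0.1],    # P(X1 = x | class m), rows = classes
              [0.1, 0.1, 0.1, 0.7]])
q = np.array([[0.6, 0.2, 0.1, 0.1],    # P(X2 = x | class m)
              [0.1, 0.1, 0.2, 0.6]])
pi = np.array([0.3, 0.7])              # mixing proportions

# Conditional independence within each class makes the joint pmf a sum of
# M rank-one matrices, so rank(P) <= M, with equality when the component
# marginals are linearly independent.
P = sum(pi[m] * np.outer(p[m], q[m]) for m in range(2))
rank = int(np.linalg.matrix_rank(P))   # recovers M = 2 from the joint pmf alone
```

    In practice the article works with an estimated distribution function, which is why its procedure estimates the rank of such a matrix rather than reading it off exactly.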