On Efficient Range-Summability of IID Random Variables in Two or Higher Dimensions
d-dimensional (for d > 1) efficient range-summability (dD-ERS) of random variables (RVs) is a fundamental algorithmic problem with applications to two important families of database problems, namely, fast approximate wavelet tracking (FAWT) on data streams and approximately answering range-sum queries over a data cube. Whether there are efficient solutions to the dD-ERS problem, or to the latter database problem, have been two long-standing open questions; both are solved in this work. Specifically, we propose a novel solution framework for dD-ERS on RVs that have a Gaussian or Poisson distribution. Our dD-ERS solutions are the first with polylogarithmic time complexity. Furthermore, we develop a novel k-wise independence theory that gives our dD-ERS solutions both high computational efficiency and strong provable independence guarantees. Finally, we show that under a sufficient and likely necessary condition, certain existing solutions for 1D-ERS can be generalized to higher dimensions.
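For intuition, the one-dimensional Gaussian case admits a simple dyadic-splitting sketch. This is purely illustrative (function names and the hash-based per-node seeding are assumptions of this sketch, not the paper's construction; the paper's k-wise independence theory is the rigorous replacement for the ad hoc hashing used here): the sum of n IID N(0,1) variables over a dyadic range can be generated top-down in O(log n) node visits, drawing each node's left-half sum conditionally on the parent sum, since for a node of width m, L | S ~ N(S/2, m/4).

```python
import hashlib
import math
import random

def _node_draw(seed, node_id, mean, var):
    """Deterministic N(mean, var) draw keyed by (seed, node_id), so that
    overlapping range queries see mutually consistent values."""
    h = hashlib.sha256(f"{seed}:{node_id}".encode()).hexdigest()
    rng = random.Random(int(h, 16))
    return rng.gauss(mean, math.sqrt(var))

def range_sum(a, b, n, seed=0):
    """Sum X_a + ... + X_{b-1} of n IID N(0,1) variables, n a power of two,
    0 <= a <= b <= n, computed by conditional dyadic splitting."""
    def rec(lo, hi, s, node_id):
        if a <= lo and hi <= b:          # node fully inside the query range
            return s
        if b <= lo or hi <= a:           # node disjoint from the query range
            return 0.0
        mid = (lo + hi) // 2
        # Given L + R = s with L, R IID N(0, (hi-lo)/2): L | s ~ N(s/2, (hi-lo)/4).
        # The left child's value is keyed by its own id (2 * node_id).
        left = _node_draw(seed, 2 * node_id, s / 2, (hi - lo) / 4)
        return rec(lo, mid, left, 2 * node_id) + rec(mid, hi, s - left, 2 * node_id + 1)

    root = _node_draw(seed, 1, 0.0, n)   # total sum of n IID N(0,1) ~ N(0, n)
    return rec(0, n, root, 1)
```

By construction, adjacent ranges are additively consistent: range_sum(0, 4, 8) + range_sum(4, 8, 8) equals range_sum(0, 8, 8) exactly.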
Asymptotic inference for nonstationary fractionally integrated processes
This paper studies the asymptotics of nonstationary fractionally integrated (NFI) multivariate processes with memory parameter d > 1/2. We provide conditions under which a functional central limit theorem and weak convergence of stochastic integrals hold for NFI processes. More specifically, we obtain the rates of convergence and limiting distributions of the OLS estimators of cointegrating vectors in triangular representations. Further, we extend Sims, Stock and Watson's (1990) analysis of estimation and hypothesis testing in vector autoregressions with integrated processes and deterministic components to the more general fractional framework. We show that their main conclusions remain valid when dealing with NFI processes. That is, whenever a block of coefficients can be written as coefficients on zero-mean I(0) regressors in a model that includes a constant term, they will have a joint asymptotic normal distribution, so that the corresponding restrictions can be tested using standard asymptotic chi-square distribution theory. Otherwise, in general, the associated statistics will have nonstandard limiting distributions.
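As a hedged illustration (not taken from the paper), a type-II fractionally integrated process can be simulated from the truncated MA expansion of (1-L)^{-d}, whose coefficients satisfy the recursion psi_0 = 1, psi_k = psi_{k-1} (k - 1 + d) / k; values d > 1/2 produce the nonstationary regime studied here.

```python
import random

def frac_integrate(eps, d):
    """Type-II fractionally integrated series: x_t = sum_{k=0}^{t} psi_k * eps_{t-k},
    with psi_0 = 1 and psi_k = psi_{k-1} * (k - 1 + d) / k."""
    n = len(eps)
    psi = [1.0]
    for k in range(1, n):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return [sum(psi[k] * eps[t - k] for k in range(t + 1)) for t in range(n)]

rng = random.Random(0)
eps = [rng.gauss(0, 1) for _ in range(200)]
x = frac_integrate(eps, d=0.75)  # d > 1/2: nonstationary fractional integration
```

Note that d = 1 recovers plain cumulative summation (a unit root), a useful sanity check on the recursion.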
A Sieve-SMM Estimator for Dynamic Models
This paper proposes a Sieve Simulated Method of Moments (Sieve-SMM) estimator
for the parameters and the distribution of the shocks in nonlinear dynamic
models where the likelihood and the moments are not tractable. An important
concern with SMM, which matches sample moments with simulated moments, is that a
parametric distribution is required. However, economic quantities that depend
on this distribution, such as welfare and asset-prices, can be sensitive to
misspecification. The Sieve-SMM estimator addresses this issue by flexibly
approximating the distribution of the shocks with a Gaussian and tails mixture
sieve. The asymptotic framework provides consistency, rate of convergence and
asymptotic normality results, extending existing results to a new framework
with more general dynamics and latent variables. An application to asset
pricing in a production economy shows a large decline in the estimates of
relative risk-aversion, highlighting the empirical relevance of
misspecification bias.
Sparsely Observed Functional Time Series: Estimation and Prediction
Functional time series analysis, whether based on time or frequency domain
methodology, has traditionally been carried out under the assumption of
complete observation of the constituent series of curves, assumed stationary.
Nevertheless, as is often the case with independent functional data, it may
well happen that the data available to the analyst are not the actual sequence
of curves, but relatively few and noisy measurements per curve, potentially at
different locations in each curve's domain. Under this sparse sampling regime,
neither the established estimators of the time series' dynamics, nor their
corresponding theoretical analysis will apply. The subject of this paper is to
tackle the problem of estimating the dynamics and of recovering the latent
process of smooth curves in the sparse regime. Assuming smoothness of the
latent curves, we construct a consistent nonparametric estimator of the series'
spectral density operator and use it to develop a frequency-domain recovery
approach that predicts the latent curve at a given time by borrowing strength
from the (estimated) dynamic correlations in the series across time. Further to
predicting the latent curves from their noisy point samples, the method fills
in gaps in the sequence (curves nowhere sampled), denoises the data, and serves
as a basis for forecasting. Means of providing corresponding confidence bands
are also investigated. A simulation study interestingly suggests that sparse
observation for a longer time period may provide better performance than
dense observation for a shorter period, in the presence of smoothness. The
methodology is further illustrated by application to an environmental data set
on fair-weather atmospheric electricity, which naturally leads to a sparse
functional time series.
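The "borrowing strength" idea rests on pooling cross-products of noisy measurements across time. Below is a minimal sketch of the raw pooling step only (names and the binning scheme are assumptions of this illustration; the paper's actual estimator smooths such raw averages into an estimate of the spectral density operator):

```python
from collections import defaultdict

def lag_cov_surface(times, obs, h, n_bins=10):
    """Raw estimate of the lag-h covariance kernel c_h(u, v) = Cov(X_t(u), X_{t+h}(v))
    from sparse noisy measurements. times[t] lists sampling locations in [0,1];
    obs[t] the corresponding noisy values. Cross-products Y_{t,i} * Y_{t+h,j}
    are averaged within cells of an n_bins x n_bins grid over [0,1]^2."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    T = len(obs)
    for t in range(T - h):
        for u, y1 in zip(times[t], obs[t]):
            for v, y2 in zip(times[t + h], obs[t + h]):
                key = (min(int(u * n_bins), n_bins - 1),
                       min(int(v * n_bins), n_bins - 1))
                sums[key] += y1 * y2
                counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums}
```

For h = 0 the diagonal cells absorb the measurement-error variance, which is why such procedures typically treat the diagonal separately; cross-products at distinct times (h > 0) are free of that bias.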
Essays on Simulation-Based Estimation
Complex nonlinear dynamic models with an intractable likelihood or moments are increasingly common in economics. A popular approach to estimating these models is to match informative sample moments with simulated moments from a fully parameterized model using SMM or Indirect Inference. This dissertation consists of three chapters exploring different aspects of such simulation-based estimation methods. The following chapters are presented in the order in which they were written during my thesis.
Chapter 1, written with Serena Ng, provides an overview of existing frequentist and Bayesian simulation-based estimators. These estimators are seemingly computationally similar in the sense that they all make use of simulations from the model in order to do the estimation. To better understand the relationship between these estimators, this chapter introduces a Reverse Sampler which expresses the Bayesian posterior moments as a weighted average of frequentist estimates. As such, it highlights a deeper connection between the two classes of estimators beyond the simulation aspect. This Reverse Sampler also allows us to compare the higher-order bias properties of these estimators. We find that while all estimators have an automatic bias correction property (Gourieroux et al., 1993), the Bayesian estimator introduces two additional biases. The first is due to computing a posterior mean rather than the mode. The second is due to the prior, which penalizes the estimates in a particular direction.
Chapter 2, also written with Serena Ng, proves that the Reverse Sampler described above targets the desired Approximate Bayesian Computation (ABC) posterior distribution. The idea relies on a change of variable argument: the frequentist optimization step implies a non-linear transformation. As a result, the unweighted draws follow a distribution that depends on the likelihood that comes from the simulations, and a Jacobian term that arises from the non-linear transformation. Hence, solving the frequentist estimation problem multiple times, with different numerical seeds, leads to an optimization-based importance sampler where the weights depend on the prior and the volume of the Jacobian of the non-linear transformation. In models where optimization is relatively fast, this Reverse Sampler is shown to compare favourably to existing ABC-MCMC or ABC-SMC sampling methods.
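In a toy location model the mechanics of the Reverse Sampler collapse to closed form, which makes them easy to see. A minimal sketch, assuming a 1D model y = theta + eps with the sample mean as the moment (so the frequentist optimization step is exact and the Jacobian term equals one; all names here are illustrative):

```python
import random

def reverse_sampler(psi_hat, prior_pdf, n_draws=500, n_sim=100, seed=0):
    """Optimization-based importance sampler for the location model.
    Each draw s: simulate noise with a fresh seed, solve
    min_theta (psi_hat - theta - mean(noise))^2 in closed form,
    then weight the solution by the prior (Jacobian = 1 here).
    Returns the weighted average, i.e. the approximate posterior mean."""
    rng = random.Random(seed)
    draws, weights = [], []
    for _ in range(n_draws):
        noise = [rng.gauss(0, 1) for _ in range(n_sim)]
        theta_s = psi_hat - sum(noise) / n_sim  # exact argmin of the SMM objective
        draws.append(theta_s)
        weights.append(prior_pdf(theta_s))
    total = sum(weights)
    return sum(t * w for t, w in zip(draws, weights)) / total
```

With a flat prior the weights are constant and the posterior mean recovers the sample moment; an informative prior shifts it, illustrating the prior-induced bias discussed in Chapter 1.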
Chapter 3 relaxes the parametric assumptions on the distribution of the shocks in simulation-based estimation. It extends the existing SMM literature, where even though the choice of moments is flexible and potentially nonparametric, the model itself is assumed to be fully parametric. The large sample theory in this chapter allows for both time series and short panels, which are the two most common data types found in empirical applications. Using a flexible sieve density reduces the sensitivity of estimates and counterfactuals to an ad hoc choice of distribution such as the Gaussian density. Compared to existing work on sieve estimation, the Sieve-SMM estimator involves dynamically generated data, which implies non-standard bias and dependence properties. First, the dynamics imply an accumulation of the bias, resulting in a larger nonparametric approximation error than in static models. To ensure that it does not accumulate too much, a set of decay conditions on the data generating process is given and the resulting bias is derived. Second, by construction, the dependence properties of the simulated data vary with the parameter values, so that standard empirical process results, which rely on a coupling argument, do not apply in this setting. This non-standard dependent empirical process is handled through an inequality built by adapting results from the existing literature. The results hold for bounded empirical processes under a geometric ergodicity condition. This is illustrated in the paper with Monte Carlo simulations and two empirical applications.
Essays on Heterogeneity and Non-Linearity in Panel Data and Time Series Models
In recent years, advances in data collection and storage allow us to observe and analyze many financial, economic or environmental processes with higher precision. This in turn reveals new features of the underlying processes and creates a demand for the development of new econometric techniques. The aim of this thesis is to tackle some of these challenges in the field of panel data and time series models. In particular, the first and the last chapters contribute to the issue of testing and estimating heterogeneous panel models with random coefficients. The second chapter discusses a generalization of the classical linear time series models to asymmetric ones and presents a test statistic to help empirical researchers choose the appropriate modeling framework in this context. Finally, the objective of the third chapter is to extend the available (nonlinear) time series techniques to big data sets or functional data. In more detail, Chapter 1, which is joint work with Joerg Breitung and Christoph Roling, employs the Lagrange Multiplier (LM) principle to test parameter homogeneity across cross-section units in panel data models. The test can be seen as a generalization of the Breusch-Pagan test against random individual effects to all regression coefficients. While the original test procedure assumes a likelihood framework under normality, several useful variants of the LM test are presented to allow for non-normality, heteroskedasticity and serially correlated errors. Moreover, the tests can be conveniently computed via simple artificial regressions. We derive the limiting distribution of the LM test and show that if the errors are not normally distributed, the original LM test is asymptotically valid if the number of time periods tends to infinity. A simple modification of the score statistic yields an LM test that is robust to non-normality if the number of time periods is fixed.
Further adjustments provide versions of the LM test that are robust to heteroskedasticity and serial correlation. We compare the local power of our tests and the statistic proposed by Pesaran and Yamagata. The results of the Monte Carlo experiments suggest that the LM-type test can be substantially more powerful, in particular, when the number of time periods is small. Chapter 2, which is joint work with Thomas Nebeling, develops a Lagrange multiplier test statistic and its variants to test the null hypothesis of no asymmetric effects of shocks on time series. In asymmetric time series models that allow for different responses to positive and negative past shocks, the likelihood functions are, in general, non-differentiable. By making use of the theory of generalized functions, Lagrange multiplier type tests and the resulting asymptotics are derived. The test statistics possess standard asymptotic limiting behavior under the null hypothesis. Monte Carlo experiments illustrate the accuracy of the asymptotic approximation and show that conventional model selection criteria can be used to estimate the required lag length. We provide an empirical application to the U.S. unemployment rate. In Chapter 3, written in collaboration with Alexander Gleim, statistical tools for forecasting functional time series are developed, which for example can be used to analyze big data sets. To tackle the issue of time dependence we introduce the notion of functional dependence through scores of the spectral representation. We investigate the impact of time dependence thus quantified on the estimation of functional principal components. The rate of mean squared convergence of the estimator of the covariance operator is derived under long range dependence of the functional time series. After that, we suggest two forecasting techniques for functional time series satisfying our measure of time dependence and derive the asymptotic properties of their predictors.
The first is the functional autoregressive model, which is commonly used to describe linear processes. As our notion of functional dependence covers a broader class of processes, we also study the functional additive autoregressive model and construct its forecasts using the k-nearest neighbors approach. The accuracy of the proposed tools is verified through Monte Carlo simulations. The empirical relevance of the theory is illustrated through an application to electricity consumption in the Nordic countries. In Chapter 4, which was jointly done with Joerg Breitung, three main estimation procedures for panel data models with heterogeneous slopes are discussed: pooling, generalized LS and the mean-group estimator. In our analysis we take explicit account of the statistical dependence that may exist between the regressors and the heterogeneous effects of the slopes. It is shown that under systematic slope variation: (i) pooling gives inconsistent and highly misleading estimates; (ii) generalized LS is in general not consistent even in settings where N and T are large; (iii) the mean-group estimator always provides consistent results at the price of higher variance. We contribute to the literature by suggesting a simple robustified version of the pooled estimator based on Mundlak-type corrections. This estimator provides consistent results and is asymptotically equivalent to the mean-group estimator for large N and T. Monte Carlo experiments confirm our theoretical findings and show that for large N and fixed T the new estimator can be an attractive option compared to its competitors.
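The k-nearest-neighbors forecast mentioned above can be sketched in a few lines. A simplified, discretized version (illustrative only; the chapter's estimator and its asymptotics are more involved): find the k past curves closest to the last observed curve and average their successors.

```python
def knn_functional_forecast(curves, k=3):
    """One-step-ahead k-nearest-neighbour forecast for a functional time series.
    curves: list of discretized curves (lists of floats on a common grid)."""
    def dist2(f, g):  # squared discretized L2 distance
        return sum((a - b) ** 2 for a, b in zip(f, g))

    last = curves[-1]
    # candidates: every past curve that has an observed successor
    order = sorted(range(len(curves) - 1), key=lambda t: dist2(curves[t], last))
    nbrs = order[:k]
    m = len(last)
    # forecast = pointwise average of the successors of the k nearest curves
    return [sum(curves[t + 1][j] for t in nbrs) / k for j in range(m)]
```

A natural refinement would weight the successors by inverse distance instead of averaging them uniformly.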
Functional Principal Component Analysis of Cointegrated Functional Time Series
Functional principal component analysis (FPCA) has played an important role
in the development of functional time series (FTS) analysis. This paper
investigates how FPCA can be used to analyze cointegrated functional time
series and proposes a modification of FPCA as a novel statistical tool. Our
modified FPCA not only provides an asymptotically more efficient estimator of
the cointegrating vectors, but also leads to novel KPSS-type tests for
examining some essential properties of cointegrated time series. As an
empirical illustration, our methodology is applied to the time series of
log-earning densities.
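For readers unfamiliar with FPCA itself, here is a minimal discretized version via power iteration (standard FPCA only, not the modified variant this paper proposes for the cointegrated setting; names are illustrative):

```python
def fpca_first_component(curves, n_iter=200):
    """First functional principal component of discretized curves:
    eigenvector of the empirical covariance matrix, found by power iteration,
    plus the score of each curve on that component."""
    n, m = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(m)]
    centered = [[c[j] - mean[j] for j in range(m)] for c in curves]
    # empirical covariance matrix C[j][k] = (1/n) sum_i x_i(j) x_i(k)
    C = [[sum(x[j] * x[k] for x in centered) / n for k in range(m)]
         for j in range(m)]
    v = [1.0] * m
    for _ in range(n_iter):  # power iteration toward the top eigenvector
        w = [sum(C[j][k] * v[k] for k in range(m)) for j in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    scores = [sum(x[j] * v[j] for j in range(m)) for x in centered]
    return v, scores
```

In the cointegrated setting the leading eigendirections capture the common stochastic trends, which is what motivates modifying plain FPCA as the paper does.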