    Nonparametric inference in hidden Markov models using P-splines

    Hidden Markov models (HMMs) are flexible time series models in which the distributions of the observations depend on unobserved serially correlated states. The state-dependent distributions in HMMs are usually taken from some class of parametrically specified distributions. The choice of this class can be difficult, and an unfortunate choice can have serious consequences, for example on state estimates, on forecasts, and generally on the resulting model complexity and interpretation, in particular with respect to the number of states. We develop a novel approach for estimating the state-dependent distributions of an HMM in a nonparametric way, which is based on the idea of representing the corresponding densities as linear combinations of a large number of standardized B-spline basis functions, imposing a penalty term on non-smoothness in order to maintain a good balance between goodness-of-fit and smoothness. We illustrate the nonparametric modeling approach in a real data application concerned with vertical speeds of a diving beaked whale, demonstrating that compared to parametric counterparts it can lead to models that are more parsimonious in terms of the number of states yet fit the data equally well.
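
    As a rough illustration of the density representation described in this abstract, the following minimal sketch builds one state-dependent density as a weighted sum of standardized B-spline basis functions and evaluates a smoothness penalty. Several details are assumptions made here for illustration and are not taken from the abstract: equally spaced knots, cubic splines, a softmax map from unconstrained parameters `beta` to weights that sum to one, and a penalty on squared second-order differences of the weights. The function names are hypothetical.

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Degree 0: indicator functions of the half-open knot intervals.
    B = np.array([(knots[i] <= x) & (x < knots[i + 1])
                  for i in range(len(knots) - 1)], dtype=float)
    for d in range(1, degree + 1):
        rows = []
        for i in range(len(knots) - d - 1):
            left = (x - knots[i]) / (knots[i + d] - knots[i]) * B[i]
            right = ((knots[i + d + 1] - x)
                     / (knots[i + d + 1] - knots[i + 1]) * B[i + 1])
            rows.append(left + right)
        B = np.array(rows)
    return B  # shape: (len(knots) - degree - 1, len(x))

def softmax(beta):
    """Map unconstrained parameters to non-negative weights summing to one."""
    w = np.exp(beta - np.max(beta))
    return w / w.sum()

def spline_density(x, beta, knots, degree=3):
    """Density as a convex combination of standardized B-splines.

    With equally spaced knots (spacing h), each basis function integrates
    to h, so dividing by h makes every basis function a density itself.
    """
    h = knots[1] - knots[0]
    return softmax(beta) @ (bspline_basis(x, knots, degree) / h)

def roughness_penalty(beta, lam):
    """Assumed penalty: lam/2 times squared second differences of the weights."""
    return 0.5 * lam * np.sum(np.diff(softmax(beta), n=2) ** 2)

# Usage: 25 equally spaced knots give 25 - 3 - 1 = 21 cubic basis functions.
knots = np.linspace(-3.0, 3.0, 25)
beta = np.random.default_rng(0).normal(size=21)
grid = np.linspace(-3.0, 3.0, 6001)
density = spline_density(grid, beta, knots)
```

    In a full HMM fit, one such density per state would enter the likelihood, and the penalty (one smoothing parameter per state) would be subtracted from the log-likelihood before maximization; that step is omitted here.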

    A comparison of alternative approaches to sup-norm goodness of fit tests with estimated parameters

    Goodness of fit tests based on sup-norm statistics of empirical processes have nonstandard limiting distributions when the null hypothesis is composite, that is, when parameters of the null model are estimated. Several solutions to this problem have been suggested, including the calculation of adjusted critical values for these nonstandard distributions and the transformation of the empirical process such that statistics based on the transformed process are asymptotically distribution-free. The approximation methods proposed by Durbin (1985) can be applied to compute appropriate critical values for tests based on sup-norm statistics. The resulting tests have quite accurate size, a fact which has gone unrecognized in the econometrics literature. Some justification for this accuracy lies in the similar features that Durbin's approximation methods share with the theory of extrema for Gaussian random fields and for Gauss-Markov processes. These adjustment techniques are also related to the transformation methodology proposed by Khmaladze (1981) through the score function of the parametric model. Monte Carlo experiments suggest that these two testing strategies are roughly comparable to one another and more powerful than a simple bootstrap procedure.
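
    To make the composite-null problem concrete, the sketch below computes a sup-norm (Kolmogorov-Smirnov type) statistic against a normal model whose mean and standard deviation are estimated from the data, and calibrates it with a parametric bootstrap. This is only a generic bootstrap baseline of the kind the abstract compares against; it is neither Durbin's approximation nor Khmaladze's transformation. The normal null, the function names, and the number of bootstrap replications are assumptions chosen for illustration.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def normal_cdf(x, mu, sigma):
    """Normal cdf evaluated elementwise."""
    return 0.5 * (1.0 + _erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_stat_estimated(x):
    """Sup-norm distance between the empirical cdf and the fitted normal cdf."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    mu, sigma = x.mean(), x.std(ddof=1)      # parameters estimated from the data
    F = normal_cdf(x, mu, sigma)
    upper = np.arange(1, n + 1) / n - F      # ecdf just after each point
    lower = F - np.arange(0, n) / n          # ecdf just before each point
    return float(max(upper.max(), lower.max()))

def bootstrap_pvalue(x, n_boot=500, seed=0):
    """Parametric bootstrap: re-estimate the parameters in every replication."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed = ks_stat_estimated(x)
    mu, sigma = x.mean(), x.std(ddof=1)
    sims = np.array([ks_stat_estimated(rng.normal(mu, sigma, x.size))
                     for _ in range(n_boot)])
    p_value = (1 + np.sum(sims >= observed)) / (n_boot + 1)
    return observed, float(p_value)

# Usage: skewed data tested against a normal model.
sample = np.random.default_rng(1).exponential(size=200)
stat, p = bootstrap_pvalue(sample, n_boot=200)
```

    Because the parameters are re-estimated inside each replication, the simulated statistics reflect the same estimation effect as the observed one, which is why standard Kolmogorov-Smirnov tables do not apply here.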

    Confidence Corridors for Multivariate Generalized Quantile Regression

    We focus on the construction of confidence corridors for multivariate nonparametric generalized quantile regression functions. This construction is based on asymptotic results for the maximal deviation between a suitable nonparametric estimator and the true function of interest which follow after a series of approximation steps including a Bahadur representation, a new strong approximation theorem and exponential tail inequalities for Gaussian random fields. As a byproduct we also obtain confidence corridors for the regression function in the classical mean regression. In order to deal with the problem of slowly decreasing error in coverage probability of the asymptotic confidence corridors, which results in meager coverage for small sample sizes, a simple bootstrap procedure is designed based on the leading term of the Bahadur representation. The finite sample properties of both procedures are investigated by means of a simulation study and it is demonstrated that the bootstrap procedure considerably outperforms the asymptotic bands in terms of coverage accuracy. Finally, the bootstrap confidence corridors are used to study the efficacy of the National Supported Work Demonstration, which is a randomized employment enhancement program launched in the 1970s. This article has supplementary materials.

    New L2-type exponentiality tests

    We introduce new consistent and scale-free goodness-of-fit tests for the exponential distribution based on the Puri-Rubin characterization. For the construction of test statistics we employ weighted L2 distance between V-empirical Laplace transforms of random variables that appear in the characterization. We derive the asymptotic behaviour under the null hypothesis as well as under fixed alternatives. We compare our tests, in terms of the Bahadur efficiency, to the likelihood ratio test, as well as some recent characterization based goodness-of-fit tests for the exponential distribution. We also compare the power of our tests to the power of some recent and classical exponentiality tests. According to both criteria, our tests are shown to be strong and outperform most of their competitors.
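
    The construction can be sketched numerically as follows. The Puri-Rubin characterization states that X is exponential exactly when X and |X1 - X2| have the same distribution, where X1 and X2 are independent copies of X. The code compares the empirical Laplace transform of the sample with the V-empirical Laplace transform of all pairwise absolute differences and integrates the squared gap against a weight. The weight exp(-a t), the rescaling by the sample mean, the truncation of the integral at `t_max`, and the function name are assumptions made here for illustration; the paper's exact statistic and normalization may differ.

```python
import numpy as np

def puri_rubin_l2_stat(x, a=1.0, t_max=40.0, n_grid=2001):
    """Weighted L2 distance between two empirical Laplace transforms.

    Numerically approximates
        integral_0^inf (L_diff(t) - L_x(t))**2 * exp(-a * t) dt,
    where L_x is the empirical Laplace transform of the rescaled sample and
    L_diff is the V-empirical transform of |Y_i - Y_j| over all pairs (i, j).
    """
    x = np.asarray(x, dtype=float)
    y = x / x.mean()                                  # rescaling gives scale-freeness
    t = np.linspace(0.0, t_max, n_grid)
    # V-statistic: all n**2 pairs, including i == j.
    abs_diff = np.abs(y[:, None] - y[None, :]).ravel()
    lt_diff = np.array([np.exp(-s * abs_diff).mean() for s in t])
    lt_x = np.array([np.exp(-s * y).mean() for s in t])
    integrand = (lt_diff - lt_x) ** 2 * np.exp(-a * t)
    dt = t[1] - t[0]
    # Trapezoidal rule on the truncated range [0, t_max].
    return float(dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])))

# Usage: the statistic should be small for exponential data
# and larger for data far from exponential.
rng = np.random.default_rng(0)
stat_exponential = puri_rubin_l2_stat(rng.exponential(scale=2.0, size=100))
stat_uniform = puri_rubin_l2_stat(rng.uniform(1.0, 1.1, size=100))
```

    Critical values are not computed here; in practice they would come from the null distribution of the suitably scaled statistic, for example by simulation from the standard exponential distribution.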

    Optimal Calibration for Multiple Testing against Local Inhomogeneity in Higher Dimension

    Based on two independent samples X_1,...,X_m and X_{m+1},...,X_n drawn from multivariate distributions with unknown Lebesgue densities p and q respectively, we propose an exact multiple test in order to identify simultaneously regions of significant deviations between p and q. The construction is built from randomized nearest-neighbor statistics. It does not require any preliminary information about the multivariate densities such as compact support, strict positivity or smoothness and shape properties. The properly adjusted multiple testing procedure is shown to be sharp-optimal for typical arrangements of the observation values which appear with probability close to one. The proof relies on a new coupling Bernstein type exponential inequality, reflecting the non-subgaussian tail behavior of a combinatorial process. For power investigation of the proposed method a reparametrized minimax set-up is introduced, reducing the composite hypothesis "p=q" to a simple one with the multivariate mixed density (m/n)p+(1-m/n)q as infinite dimensional nuisance parameter. Within this framework, the test is shown to be spatially and sharply asymptotically adaptive with respect to uniform loss on isotropic Hölder classes. The exact minimax risk asymptotics are obtained in terms of solutions of the optimal recovery

    Inference on distribution functions under measurement error

    This paper is concerned with inference on the cumulative distribution function (cdf) F_{X*} in the classical measurement error model X = X* + ε. We consider the case where the density of the measurement error ε is unknown and estimated by repeated measurements, and show validity of a bootstrap approximation for the distribution of the deviation in the sup-norm between the deconvolution cdf estimator and F_{X*}. We allow the density of ε to be ordinary or super smooth. We also provide several theoretical results on the bootstrap and asymptotic Gumbel approximations of the sup-norm deviation for the case where the density of ε is known. Our approximation results are applicable to various contexts, such as confidence bands for F_{X*} and its quantiles, and for performing various cdf-based tests such as goodness-of-fit tests for parametric models of X*, two sample homogeneity tests, and tests for stochastic dominance. Simulation and real data examples illustrate satisfactory performance of the proposed methods.
