    Infinitesimally Robust Estimation in General Smoothly Parametrized Models

    We describe the shrinking neighborhood approach of Robust Statistics, which applies to general smoothly parametrized models, especially exponential families. Equal generality is achieved by an object-oriented implementation of the optimally robust estimators. We evaluate the estimates on real datasets from the literature by means of our R packages ROptEst and RobLox.

    Moment-based parameter estimation in binomial random intersection graph models

    Binomial random intersection graphs can be used as parsimonious statistical models of large and sparse networks, with one parameter for the average degree and another for transitivity, the tendency of neighbours of a node to be connected. This paper discusses the estimation of these parameters from a single observed instance of the graph, using moment estimators based on observed degrees and frequencies of 2-stars and triangles. The observed data set is assumed to be a subgraph induced by a set of $n_0$ nodes sampled from the full set of $n$ nodes. We prove the consistency of the proposed estimators by showing that the relative estimation error is small with high probability for $n_0 \gg n^{2/3} \gg 1$. As a byproduct, our analysis confirms that the empirical transitivity coefficient of the graph is with high probability close to the theoretical clustering coefficient of the model. Comment: 15 pages, 6 figures
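The empirical transitivity coefficient mentioned in the abstract can be computed directly from an adjacency matrix. The sketch below is a minimal illustration of that quantity (three times the triangle count over the number of 2-stars), not the paper's moment estimator itself:

```python
import numpy as np

def transitivity(adj: np.ndarray) -> float:
    """Empirical transitivity of a simple undirected graph.

    adj is a symmetric 0/1 adjacency matrix with zero diagonal.
    """
    a = np.asarray(adj, dtype=float)
    # trace(A^3) counts each triangle 6 times (3 starting nodes x 2 directions)
    triangles = np.trace(a @ a @ a) / 6.0
    degrees = a.sum(axis=1)
    # a node of degree d is the center of d*(d-1)/2 unordered 2-stars
    two_stars = (degrees * (degrees - 1) / 2.0).sum()
    return 3.0 * triangles / two_stars if two_stars > 0 else 0.0
```

For a triangle graph this returns 1.0; for a path of length two, where the single 2-star is not closed, it returns 0.0.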

    Maximum Likelihood Estimator for Hidden Markov Models in continuous time

    The paper studies large sample asymptotic properties of the Maximum Likelihood Estimator (MLE) for the parameter of a continuous time Markov chain, observed in white noise. Using the method of weak convergence of likelihoods due to I. Ibragimov and R. Khasminskii, consistency, asymptotic normality and convergence of moments are established for the MLE under certain strong ergodicity conditions on the chain. Comment: Warning: due to a flaw in the publishing process, some of the references in the published version of the article are confused

    Quantifying Robotic Swarm Coverage

    In the field of swarm robotics, the design and implementation of spatial density control laws has received much attention, with less emphasis being placed on performance evaluation. This work fills that gap by introducing an error metric that provides a quantitative measure of coverage for use with any control scheme. The proposed error metric is continuously sensitive to changes in the swarm distribution, unlike commonly used discretization methods. We analyze the theoretical and computational properties of the error metric and propose two benchmarks to which error metric values can be compared. The first uses the realizable extrema of the error metric to compute the relative error of an observed swarm distribution. We also show that the error metric extrema can be used to help choose the swarm size and effective radius of each robot required to achieve a desired level of coverage. The second benchmark compares the observed distribution of error metric values to the probability density function of the error metric when robot positions are randomly sampled from the target distribution. We demonstrate the utility of this benchmark in assessing the performance of stochastic control algorithms. We prove that the error metric obeys a central limit theorem, develop a streamlined method for performing computations, and place the standard statistical tests used here on a firm theoretical footing. We provide rigorous theoretical development, computational methodologies, numerical examples, and MATLAB code for both benchmarks. Comment: To appear in Springer series Lecture Notes in Electrical Engineering (LNEE). This book contribution is an extension of our ICINCO 2018 conference paper arXiv:1806.02488. 27 pages, 8 figures, 2 tables
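To make the idea of a coverage error that varies continuously with robot positions concrete, here is a hedged sketch of one such score: each robot is smoothed into a Gaussian blob of effective radius r, and the L2 distance between the resulting empirical density and the target density is taken on a grid. This illustrates the general concept only; it is not the specific metric defined in the paper:

```python
import numpy as np

def coverage_error(robots, target_pdf, r=0.1, n=100):
    """Approximate L2 error between a smoothed swarm density and a
    target density on the unit square, evaluated on an n x n grid."""
    xs = np.linspace(0.0, 1.0, n)
    gx, gy = np.meshgrid(xs, xs)
    cell = (1.0 / n) ** 2                     # approximate cell area
    # empirical density: one Gaussian blob of radius r per robot
    swarm = np.zeros_like(gx)
    for (px, py) in robots:
        swarm += np.exp(-((gx - px) ** 2 + (gy - py) ** 2) / (2 * r ** 2))
    swarm /= swarm.sum() * cell               # normalize to integrate to 1
    target = target_pdf(gx, gy)
    target /= target.sum() * cell
    return np.sqrt(((swarm - target) ** 2).sum() * cell)
```

Moving a single robot changes this score smoothly, unlike a bin-counting (discretization) approach where the score jumps only when a robot crosses a bin boundary.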

    Characterization of stochastic orders by L-functionals

    Random variables may be compared with respect to their location by comparing certain functionals ad hoc, such as the mean or median, or by means of stochastic ordering based directly on the properties of the corresponding distribution functions. These alternative approaches are brought together in this paper. We focus on the class of L-functionals discussed by Bickel and Lehmann (1975) and characterize the comparison of random variables in terms of these measures by means of several stochastic orders based on iterated integrals, including the increasing convex order.
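An L-functional is a linear combination of order statistics, sum_i w_i X_(i); the median and the trimmed mean are the classic examples. The sketch below is my own minimal illustration of that definition, not code from the paper, and the trimming is simplified to drop whole observations with no fractional weight at the edges:

```python
import numpy as np

def l_functional(x, weights):
    """Linear combination of the order statistics of x; weights sum to 1."""
    return float(np.sort(np.asarray(x, dtype=float)) @ np.asarray(weights))

def trimmed_mean_weights(n, alpha=0.1):
    """Weights implementing a simplified symmetric alpha-trimmed mean:
    the floor(alpha*n) smallest and largest observations get weight 0,
    the rest are averaged equally."""
    k = int(alpha * n)
    w = np.zeros(n)
    w[k:n - k] = 1.0 / (n - 2 * k)
    return w
```

With x = [1, 2, 3, 4, 100] and alpha = 0.2, one observation is trimmed from each tail and the functional evaluates to (2 + 3 + 4) / 3 = 3, illustrating how the choice of weights controls the location measure's robustness to outliers.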

    Gene-based multiple trait analysis for exome sequencing data

    The common genetic variants identified through genome-wide association studies explain only a small proportion of the genetic risk for complex diseases. The advancement of next-generation sequencing technologies has enabled the detection of rare variants that are expected to contribute significantly to the missing heritability. Some genetic association studies provide multiple correlated traits for analysis. Multiple trait analysis has the potential to improve the power to detect pleiotropic genetic variants that influence multiple traits. We propose a gene-level association test for multiple traits that accounts for correlation among the traits. Gene- or region-level testing for association involves both common and rare variants. Statistical tests for common variants may have limited power for individual rare variants because of their low frequency and multiple testing issues. To address these concerns, we use the weighted-sum pooling method to test the joint association of multiple rare and common variants within a gene. The proposed method is applied to the Genetic Association Workshop 17 (GAW17) simulated mini-exome data to analyze multiple traits. Because of the nature of the GAW17 simulation model, increased power was not observed for multiple-trait analysis compared to single-trait analysis. However, multiple-trait analysis did not result in a substantial loss of power because of the testing of multiple traits. We conclude that this method would be useful for identifying pleiotropic genes.
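The weighted-sum pooling idea can be sketched in the spirit of Madsen and Browning (2009): each variant is down-weighted by an estimate of its allele-frequency standard deviation, so rare variants contribute more to a single per-individual score, which is then tested against the trait. This is an assumed, generic form of the weighting; the exact weights and test used in the paper may differ:

```python
import numpy as np

def weighted_sum_scores(genotypes):
    """Pool variants within a gene into one score per individual.

    genotypes: (n_individuals, n_variants) matrix of 0/1/2 allele counts.
    """
    g = np.asarray(genotypes, dtype=float)
    n = g.shape[0]
    # estimated minor allele frequency per variant, with a pseudo-count
    # to avoid zero weights for unobserved alleles
    q = (g.sum(axis=0) + 1.0) / (2.0 * n + 2.0)
    # rare variants (small q) receive larger weight
    w = 1.0 / np.sqrt(n * q * (1.0 - q))
    return g @ w  # one pooled score per individual
```

A carrier of one copy of a rare variant thus receives a larger score increment than a carrier of one copy of a common variant, which is the mechanism by which pooling recovers power lost to per-variant multiple testing.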

    Detecting the direction of a signal on high-dimensional spheres: Non-null and Le Cam optimality results

    We consider one of the most important problems in directional statistics, namely the problem of testing the null hypothesis that the spike direction $\theta$ of a Fisher-von Mises-Langevin distribution on the $p$-dimensional unit hypersphere is equal to a given direction $\theta_0$. After a reduction through invariance arguments, we derive local asymptotic normality (LAN) results in a general high-dimensional framework where the dimension $p_n$ goes to infinity at an arbitrary rate with the sample size $n$, and where the concentration $\kappa_n$ behaves in a completely free way with $n$, which offers a spectrum of problems ranging from arbitrarily easy to arbitrarily challenging ones. We identify various asymptotic regimes, depending on the convergence/divergence properties of $(\kappa_n)$, that yield different contiguity rates and different limiting experiments. In each regime, we derive Le Cam optimal tests under specified $\kappa_n$ and we compute, from the Le Cam third lemma, asymptotic powers of the classical Watson test under contiguous alternatives. We further establish LAN results with respect to both spike direction and concentration, which allows us to discuss optimality also under unspecified $\kappa_n$. To investigate the non-null behavior of the Watson test outside the parametric framework above, we derive its local asymptotic powers through martingale CLTs in the broader, semiparametric, model of rotationally symmetric distributions. A Monte Carlo study shows that the finite-sample behaviors of the various tests remarkably agree with our asymptotic results. Comment: 47 pages, 4 figures

    The interplay of microscopic and mesoscopic structure in complex networks

    Not all nodes in a network are created equal. Differences and similarities exist at both individual node and group levels. Disentangling single node from group properties is crucial for network modeling and structural inference. Based on unbiased generative probabilistic exponential random graph models and employing distributive message passing techniques, we present an efficient algorithm that allows one to separate the contributions of individual nodes and groups of nodes to the network structure. This leads to improved detection accuracy of latent class structure in real world data sets compared to models that focus on group structure alone. Furthermore, the inclusion of hitherto neglected group specific effects in models used to assess the statistical significance of small subgraph (motif) distributions in networks may be sufficient to explain most of the observed statistics. We show the predictive power of such generative models in forecasting putative gene-disease associations in the Online Mendelian Inheritance in Man (OMIM) database. The approach is suitable for both directed and undirected uni-partite as well as for bipartite networks.

    High Dimensional Sparse Econometric Models: An Introduction

    In this chapter we discuss high-dimensional sparse econometric models conceptually, as well as the estimation of these models using L1-penalization and post-L1-penalization methods. Focusing on linear and nonparametric regression frameworks, we discuss various econometric examples, present basic theoretical results, and illustrate the concepts and methods with Monte Carlo simulations and an empirical application. In the application, we examine and confirm the empirical validity of the Solow-Swan model for international economic growth.
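The L1-penalized estimator at the core of the chapter is the Lasso. As a compact illustration (not the chapter's own code), the sketch below solves the penalized least-squares problem (1/2n)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent with soft-thresholding, the standard workhorse for such estimators:

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; the proximal operator of the L1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + lam*||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with coordinate j removed from the fit
            r = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b
```

On a design with a sparse true coefficient vector, the soft-thresholding step sets the irrelevant coordinates exactly to zero while mildly shrinking the active ones, which is precisely the sparsity-inducing behavior the theory in the chapter analyzes.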