Infinitesimally Robust Estimation in General Smoothly Parametrized Models
We describe the shrinking neighborhood approach of Robust Statistics, which
applies to general smoothly parametrized models, especially, exponential
families. Equal generality is achieved by object oriented implementation of the
optimally robust estimators. We evaluate the estimates on real datasets from
literature by means of our R packages ROptEst and RobLox.
Moment-based parameter estimation in binomial random intersection graph models
Binomial random intersection graphs can be used as parsimonious statistical
models of large and sparse networks, with one parameter for the average degree
and another for transitivity, the tendency of neighbours of a node to be
connected. This paper discusses the estimation of these parameters from a
single observed instance of the graph, using moment estimators based on
observed degrees and frequencies of 2-stars and triangles. The observed data
set is assumed to be a subgraph induced by a set of nodes sampled from
the full set of nodes. We prove the consistency of the proposed estimators
by showing that the relative estimation error is small with high probability
for . As a byproduct, our analysis confirms that the
empirical transitivity coefficient of the graph is with high probability close
to the theoretical clustering coefficient of the model.
Comment: 15 pages, 6 figures
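The observed quantities named in the abstract above (degrees, 2-stars, triangles, and the empirical transitivity coefficient) can be computed directly from an edge list. The following is a minimal sketch of those summary statistics only; it does not reproduce the paper's estimators, which map such moments to the model parameters. The function name `graph_moments` and the edge-list input format are assumptions made for illustration.

```python
from itertools import combinations


def graph_moments(edges):
    """Summary statistics of a simple undirected graph given as (u, v) pairs.

    Hypothetical helper: returns the mean degree, the number of 2-stars
    (paths of length two), the number of triangles, and the empirical
    transitivity 3 * triangles / 2-stars.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    two_stars = 0
    closed = 0  # 2-stars whose two end nodes are themselves adjacent
    for nbrs in adj.values():
        for a, b in combinations(nbrs, 2):
            two_stars += 1
            if b in adj[a]:
                closed += 1

    n_nodes = len(adj)
    degree_sum = sum(len(nbrs) for nbrs in adj.values())
    return {
        "mean_degree": degree_sum / n_nodes if n_nodes else 0.0,
        "two_stars": two_stars,
        # every triangle is seen once from each of its three nodes
        "triangles": closed // 3,
        "transitivity": closed / two_stars if two_stars else 0.0,
    }
```

For a triangle with one pendant edge, `graph_moments([(1, 2), (2, 3), (1, 3), (3, 4)])` counts five 2-stars and one triangle, giving a transitivity of 0.6.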
Maximum Likelihood Estimator for Hidden Markov Models in continuous time
The paper studies large sample asymptotic properties of the Maximum
Likelihood Estimator (MLE) for the parameter of a continuous time Markov chain,
observed in white noise. Using the method of weak convergence of likelihoods
due to I.Ibragimov and R.Khasminskii, consistency, asymptotic normality and
convergence of moments are established for MLE under certain strong ergodicity
conditions of the chain.
Comment: Warning: due to a flaw in the publishing process, some of the
references in the published version of the article are confused
Quantifying Robotic Swarm Coverage
In the field of swarm robotics, the design and implementation of spatial
density control laws has received much attention, with less emphasis being
placed on performance evaluation. This work fills that gap by introducing an
error metric that provides a quantitative measure of coverage for use with any
control scheme. The proposed error metric is continuously sensitive to changes
in the swarm distribution, unlike commonly used discretization methods. We
analyze the theoretical and computational properties of the error metric and
propose two benchmarks to which error metric values can be compared. The first
uses the realizable extrema of the error metric to compute the relative error
of an observed swarm distribution. We also show that the error metric extrema
can be used to help choose the swarm size and effective radius of each robot
required to achieve a desired level of coverage. The second benchmark compares
the observed distribution of error metric values to the probability density
function of the error metric when robot positions are randomly sampled from the
target distribution. We demonstrate the utility of this benchmark in assessing
the performance of stochastic control algorithms. We prove that the error
metric obeys a central limit theorem, develop a streamlined method for
performing computations, and place the standard statistical tests used here on
a firm theoretical footing. We provide rigorous theoretical development,
computational methodologies, numerical examples, and MATLAB code for both
benchmarks.
Comment: To appear in Springer series Lecture Notes in Electrical Engineering
(LNEE). This book contribution is an extension of our ICINCO 2018 conference
paper arXiv:1806.02488. 27 pages, 8 figures, 2 tables
Characterization of stochastic orders by L-functionals
Random variables may be compared with respect to their location either by comparing certain ad hoc functionals, such as the mean or median, or by means of stochastic orderings based directly on the properties of the corresponding distribution functions. These alternative approaches are brought together in this paper. We focus on the class of L-functionals discussed by Bickel and Lehmann (1975) and characterize the comparison of random variables in terms of these measures by means of several stochastic orders based on iterated integrals, including the increasing convex order.
Gene-based multiple trait analysis for exome sequencing data
The common genetic variants identified through genome-wide association studies explain only a small proportion of the genetic risk for complex diseases. The advancement of next-generation sequencing technologies has enabled the detection of rare variants that are expected to contribute significantly to the missing heritability. Some genetic association studies provide multiple correlated traits for analysis. Multiple trait analysis has the potential to improve the power to detect pleiotropic genetic variants that influence multiple traits. We propose a gene-level association test for multiple traits that accounts for correlation among the traits. Gene- or region-level testing for association involves both common and rare variants. Statistical tests for common variants may have limited power for individual rare variants because of their low frequency and multiple testing issues. To address these concerns, we use the weighted-sum pooling method to test the joint association of multiple rare and common variants within a gene. The proposed method is applied to the Genetic Analysis Workshop 17 (GAW17) simulated mini-exome data to analyze multiple traits. Because of the nature of the GAW17 simulation model, increased power was not observed for multiple-trait analysis compared to single-trait analysis. However, multiple-trait analysis did not result in a substantial loss of power because of the testing of multiple traits. We conclude that this method would be useful for identifying pleiotropic genes.
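The weighted-sum pooling idea mentioned above collapses all variants in a gene into one score per individual, with rarer variants weighted more heavily. The sketch below illustrates that idea only. The abstract does not state the weights used, so the inverse-standard-deviation weights, the add-one frequency smoothing, and the function name `weighted_sum_scores` are assumptions rather than the authors' method.

```python
import math


def weighted_sum_scores(genotypes):
    """Collapse the variants of one gene into a single score per individual.

    genotypes: one list per individual holding minor-allele counts
    (0, 1 or 2) for each variant in the gene.
    Assumption: variant j gets weight 1 / sqrt(q_j * (1 - q_j)), where q_j
    is an add-one smoothed minor-allele frequency, so rare variants count
    for more than common ones.
    """
    n = len(genotypes)
    n_variants = len(genotypes[0])
    weights = []
    for j in range(n_variants):
        minor_count = sum(row[j] for row in genotypes)
        q = (minor_count + 1) / (2 * n + 2)  # smoothed, never exactly 0 or 1
        weights.append(1.0 / math.sqrt(q * (1.0 - q)))
    scores = [sum(w * g for w, g in zip(weights, row)) for row in genotypes]
    return scores, weights
```

The per-individual scores would then be tested for association with the traits, for example by regression; that testing step is outside this sketch.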
Detecting the direction of a signal on high-dimensional spheres: Non-null and Le Cam optimality results
We consider one of the most important problems in directional statistics,
namely the problem of testing the null hypothesis that the spike direction
of a Fisher-von Mises-Langevin distribution on the -dimensional
unit hypersphere is equal to a given direction . After a reduction
through invariance arguments, we derive local asymptotic normality (LAN)
results in a general high-dimensional framework where the dimension goes
to infinity at an arbitrary rate with the sample size , and where the
concentration behaves in a completely free way with , which
offers a spectrum of problems ranging from arbitrarily easy to arbitrarily
challenging ones. We identify various asymptotic regimes, depending on the
convergence/divergence properties of , that yield different
contiguity rates and different limiting experiments. In each regime, we derive
Le Cam optimal tests under specified and we compute, from the Le Cam
third lemma, asymptotic powers of the classical Watson test under contiguous
alternatives. We further establish LAN results with respect to both spike
direction and concentration, which allows us to discuss optimality also under
unspecified . To investigate the non-null behavior of the Watson test
outside the parametric framework above, we derive its local asymptotic powers
through martingale CLTs in the broader, semiparametric, model of rotationally
symmetric distributions. A Monte Carlo study shows that the finite-sample
behaviors of the various tests remarkably agree with our asymptotic results.
Comment: 47 pages, 4 figures
The interplay of microscopic and mesoscopic structure in complex networks
Not all nodes in a network are created equal. Differences and similarities
exist at both individual node and group levels. Disentangling single node from
group properties is crucial for network modeling and structural inference.
Based on unbiased generative probabilistic exponential random graph models and
employing distributive message passing techniques, we present an efficient
algorithm that allows one to separate the contributions of individual nodes and
groups of nodes to the network structure. This leads to improved detection
accuracy of latent class structure in real world data sets compared to models
that focus on group structure alone. Furthermore, the inclusion of hitherto
neglected group specific effects in models used to assess the statistical
significance of small subgraph (motif) distributions in networks may be
sufficient to explain most of the observed statistics. We show the predictive
power of such generative models in forecasting putative gene-disease
associations in the Online Mendelian Inheritance in Man (OMIM) database. The
approach is suitable for both directed and undirected uni-partite as well as
for bipartite networks.
High Dimensional Sparse Econometric Models: An Introduction
In this chapter we discuss conceptually high dimensional sparse econometric
models as well as estimation of these models using L1-penalization and
post-L1-penalization methods. Focusing on linear and nonparametric regression
frameworks, we discuss various econometric examples, present basic theoretical
results, and illustrate the concepts and methods with Monte Carlo simulations
and an empirical application. In the application, we examine and confirm the
empirical validity of the Solow-Swan model for international economic growth.
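To make the two estimation steps named in the abstract above concrete, the following is a minimal sketch of L1-penalized least squares (Lasso) fitted by coordinate descent, followed by a post-L1 refit that reruns least squares on the selected variables only. It is a generic textbook version and not the chapter's own procedure: the objective scaling, the fixed iteration count, the choice of penalty level `lam`, and the function names are all assumptions.

```python
def soft_threshold(z, t):
    """Shrink z towards zero by t; return 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, active=None, n_iter=500):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of rows, y a list of responses. If `active` is given, only
    those coefficients are updated and the rest stay at zero.
    """
    n, p = len(X), len(X[0])
    cols = list(range(p)) if active is None else list(active)
    b = [0.0] * p
    resid = list(y)  # residual y - X b for the current b
    for _ in range(n_iter):
        for j in cols:
            xj = [row[j] for row in X]
            norm = sum(v * v for v in xj) / n
            if norm == 0.0:
                continue
            # covariance of column j with the residual that excludes column j
            rho = sum(v * r for v, r in zip(xj, resid)) / n + norm * b[j]
            new = soft_threshold(rho, lam) / norm
            delta = new - b[j]
            if delta != 0.0:
                resid = [r - delta * v for r, v in zip(resid, xj)]
                b[j] = new
    return b


def post_lasso(X, y, lam):
    """Select variables with the Lasso, then refit them without a penalty."""
    b = lasso_cd(X, y, lam)
    support = [j for j, v in enumerate(b) if v != 0.0]
    refit = lasso_cd(X, y, 0.0, active=support)
    return refit, support
```

With `lam = 0.0` the coordinate updates reduce to ordinary least squares on the selected columns, so the refit removes the shrinkage that the penalty applies to the retained coefficients.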