A Topologically Valid Definition of Depth for Functional Data
The main focus of this work is on providing a formal definition of statistical depth for functional data on the basis of six properties, recognising topological features such as continuity, smoothness and contiguity. Amongst our depth-defining properties is one that addresses the delicate challenge of the inherent partial observability of functional data, with fulfillment giving rise to a minimal guarantee on the performance of the empirical depth beyond the idealised and practically infeasible case of full observability. As an incidental product, functional depths satisfying our definition achieve a robustness that is commonly ascribed to depth, despite the absence of a formal guarantee in the multivariate definition of depth. We demonstrate the fulfillment or otherwise of our properties for six widely used functional depth proposals, thereby providing a systematic basis for the selection of a depth function.
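One of the widely used proposals the abstract alludes to is the modified band depth of López-Pintado and Romo, which can be sketched as follows: each curve's depth is the average fraction of the domain on which it lies inside the band spanned by every pair of sample curves. This is an illustrative sketch of that general idea, not the paper's own definition.

```python
import numpy as np

def modified_band_depth(curves):
    """Modified band depth (J = 2): for each curve, the average
    fraction of the domain on which it lies inside the band
    spanned by every pair of sample curves.

    curves: array of shape (n, m), n curves observed on m grid points.
    """
    n, m = curves.shape
    depths = np.zeros(n)
    n_pairs = n * (n - 1) / 2
    for k in range(n):
        inside = 0.0
        for i in range(n):
            for j in range(i + 1, n):
                lo = np.minimum(curves[i], curves[j])  # lower envelope of the pair
                hi = np.maximum(curves[i], curves[j])  # upper envelope of the pair
                inside += np.mean((curves[k] >= lo) & (curves[k] <= hi))
        depths[k] = inside / n_pairs
    return depths
```

Central curves receive depth close to 1, while outlying curves, which rarely fall inside pairwise bands, receive low depth, giving the centre-outward ordering that depth-based methods rely on.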
Distributed Estimation and Inference with Statistical Guarantees
This paper studies hypothesis testing and parameter estimation in the context of the divide-and-conquer algorithm. In a unified likelihood-based framework, we propose new test statistics and point estimators obtained by aggregating various statistics from k subsamples of size n/k, where n is the sample size. In both low-dimensional and high-dimensional settings, we address the important question of how to choose k as n grows large, providing a theoretical upper bound on k such that the information loss due to the divide-and-conquer algorithm is negligible. In other words, the resulting estimators have the same inferential efficiencies and estimation rates as a practically infeasible oracle with access to the full sample. Thorough numerical results are provided to back up the theory.
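The aggregation idea can be illustrated with ordinary least squares: fit the model on each of k subsamples and average the subsample estimates. This is a minimal sketch of the divide-and-conquer principle only; the paper's actual test statistics and aggregation rules are more general.

```python
import numpy as np

def dc_ols(X, y, k):
    """Divide-and-conquer least squares: split the n observations into
    k subsamples of roughly equal size, fit OLS on each, and average
    the resulting coefficient estimates."""
    n = X.shape[0]
    subsets = np.array_split(np.arange(n), k)
    betas = [np.linalg.lstsq(X[s], y[s], rcond=None)[0] for s in subsets]
    return np.mean(betas, axis=0)
```

When k grows slowly enough relative to n, the averaged estimator is numerically very close to the full-sample fit, which is the "negligible information loss" regime the abstract describes.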
Improving confidence set estimation when parameters are weakly identified
We consider inference in weakly identified moment condition models when additional partially identifying moment inequality constraints are available. We detail the limiting distribution of the estimation criterion function and consequently propose a confidence set estimator for the true parameter. National Science Foundation of China (Grant ID: 71301026). This is the author accepted manuscript. The final version is available from Elsevier via http://dx.doi.org/10.1016/j.spl.2016.06.01
Nonparametric estimation of multivariate elliptic densities via finite mixture sieves
This paper considers the class of p-dimensional elliptic distributions (p ≥ 1) satisfying the consistency property (Kano, 1994) and within this general framework presents a two-stage semiparametric estimator for the Lebesgue density based on Gaussian mixture sieves. Under the online Exponentiated Gradient (EG) algorithm of Helmbold et al. (1997) and without restricting the mixing measure to have compact support, the estimator produces estimates converging uniformly in probability to the true elliptic density at a rate that is independent of the dimension of the problem, hence circumventing the familiar curse of dimensionality inherent to many semiparametric estimators. The rate performance of our estimator depends on the tail behaviour of the underlying mixing density (and hence that of the data) rather than smoothness properties. In fact, our method achieves a rate of at least O_p(n^{-1/4}), provided only some positive moment exists. When further moments exist, the rate improves, reaching O_p(n^{-3/8}) as the tails of the true density converge to those of a normal. Unlike the elliptic density estimator of Liebscher (2005), our sieve estimator always yields an estimate that is a valid density, and is also attractive from a practical perspective as it accepts data as a stream, thus significantly reducing computational and storage requirements. Monte Carlo experimentation indicates encouraging finite sample performance over a range of elliptic densities. The estimator is also implemented in a binary classification task using the well-known Wisconsin breast cancer dataset.
Dynamics of value-tracking in financial markets
The efficiency of a modern economy depends on value-tracking: that market prices of key assets broadly track some underlying value. This can be expected if a sufficient weight of market participants are valuation-based traders, buying and selling an asset when its price is, respectively, below and above their well-informed private valuations. Such tracking will never be perfect, and we propose a natural unit of tracking error, the 'deciblack'. We then use a simple discrete-time model to show how large tracking errors can arise if enough market participants are not valuation-based traders, regardless of how much information the valuation-based traders have. Similarly to Lux [17] and others who study subtly different models, we find a threshold above which value-tracking breaks down without any changes in the underlying value of the asset. We propose an estimator of the tracking error and establish its statistical properties. Because financial markets are increasingly dominated by non-valuation-based traders, assessing how much valuation-based investing is required for reasonable value tracking is of urgent practical interest.
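The qualitative mechanism can be demonstrated with a toy discrete-time model: a fraction w of traders pull the price back toward a fixed fundamental value, while the remaining 1 - w chase recent momentum. This toy dynamic, and all of its parameters (alpha, beta, noise), are illustrative assumptions, not the paper's model or its 'deciblack' estimator; the sketch only shows how tracking error grows as w shrinks.

```python
import numpy as np

def simulate_prices(w, T=2000, value=100.0, alpha=0.1, beta=0.9,
                    noise=0.5, seed=0):
    """Toy price dynamics (hypothetical, for illustration): a weight w
    of valuation-based traders pushes the price toward `value`, while
    weight 1 - w of momentum traders amplifies the latest price move."""
    rng = np.random.default_rng(seed)
    p = np.full(T, value)
    for t in range(2, T):
        reversion = alpha * w * (value - p[t - 1])          # valuation-based pull
        momentum = beta * (1 - w) * (p[t - 1] - p[t - 2])   # trend-chasing push
        p[t] = p[t - 1] + reversion + momentum + noise * rng.standard_normal()
    return p

def rms_tracking_error(prices, value=100.0):
    """Root-mean-square deviation of the price path from the value."""
    return float(np.sqrt(np.mean((prices - value) ** 2)))
```

Running the toy model with a high versus a low share of valuation-based traders shows the tracking error blowing up as w falls, mirroring the threshold behaviour described in the abstract.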
Functional Symmetry and Statistical Depth for the Analysis of Movement Patterns in Alzheimer’s Patients
Black-box techniques have been applied with outstanding results to classify, in a supervised manner, the movement patterns of Alzheimer’s patients according to their stage of the disease. However, these techniques do not provide information on the differences between the patterns across stages. We make use of functional data analysis to provide insight into the nature of these differences. In particular, we calculate the center of symmetry of the underlying distribution at each stage and use it to compute the functional depth of the movements of each patient. This results in an ordering of the data to which we apply nonparametric permutation tests to check for differences in the distribution, the median and the deviance from the median. We consistently find that the movement pattern at each stage differs significantly from those of the preceding and following stages in terms of the deviance from the median applied to the depth. The approach is validated by simulation.
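The final testing step can be sketched as a two-sample permutation test on depth values, comparing the groups' deviance from the median. This is a generic illustration of that kind of test, assuming depth values have already been computed per patient; the paper's exact statistics may differ.

```python
import numpy as np

def perm_test_deviance(d1, d2, n_perm=5000, seed=0):
    """Two-sample permutation test on the difference in mean absolute
    deviance from the median of depth values d1 and d2.
    Returns an approximate p-value for the null of equal deviance."""
    rng = np.random.default_rng(seed)

    def deviance_gap(a, b):
        # |mean deviance from median in group a - same in group b|
        return abs(np.mean(np.abs(a - np.median(a)))
                   - np.mean(np.abs(b - np.median(b))))

    observed = deviance_gap(d1, d2)
    pooled = np.concatenate([d1, d2])
    n1 = len(d1)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)           # relabel group membership
        if deviance_gap(perm[:n1], perm[n1:]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)            # add-one to avoid zero p-values
```

Because only group labels are shuffled, the test is distribution-free and matches the nonparametric setting of the abstract: a small p-value indicates that the two stages differ in how spread out their depth values are around the median.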