
    A general asymptotic scheme for inference under order restrictions

    Limit distributions for the greatest convex minorant and its derivative are considered for a general class of stochastic processes including partial sum processes and empirical processes, for independent, weakly dependent and long range dependent data. The results are applied to isotonic regression, isotonic regression after kernel smoothing, estimation of convex regression functions, and estimation of monotone and convex density functions. Various pointwise limit distributions are obtained, and the rate of convergence depends on the self-similarity properties and on the rate of convergence of the processes considered. Comment: Published at http://dx.doi.org/10.1214/009053606000000443 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
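
To make the link between isotonic regression and the greatest convex minorant concrete: the least-squares nondecreasing fit equals the left-hand slope of the greatest convex minorant of the cumulative sum diagram, and it can be computed with the pool-adjacent-violators algorithm. The sketch below is an illustration only and is not taken from the paper; the function name `isotonic_regression` is a hypothetical choice, and equal weights are assumed.

```python
def isotonic_regression(y):
    """Least-squares nondecreasing fit to y (equal weights assumed),
    computed with the pool-adjacent-violators algorithm."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge the last two blocks while their means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fit = []
    for total, count in blocks:
        # Every point in a block receives the block mean.
        fit.extend([total / count] * count)
    return fit


print(isotonic_regression([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

The block means are exactly the slopes of the linear pieces of the minorant, which is why limit results for the minorant's derivative carry over to the estimator.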

    Generalizing univariate signed rank statistics for testing and estimating a multivariate location parameter.

    We generalize signed rank statistics to dimensions higher than one. This results in a class of orthogonally invariant and distribution free tests that can be used for testing spherical symmetry/location parameter. The corresponding estimator is orthogonally equivariant. Both the test and estimator can be chosen with asymptotic efficiency 1. The breakdown point of the estimator depends only on the scores, not on the dimension of the data. For elliptical distributions, we obtain an affine invariant test with the same asymptotic properties, if the signed rank statistic is applied to standardized data. We also present a method for computing the estimator numerically, and consider a real data example and some simulations. Finally, an application to detection of time-varying signals in spherically symmetric noise is given. Keywords: Affine invariant tests; Asymptotic normality; Breakdown point; Distribution free tests.

    Generalized S-estimators.

    In this paper we introduce a new type of positive-breakdown regression method, called a generalized S-estimator (or GS-estimator), based on the minimization of a generalized M-estimator of residual scale. We compare the class of GS-estimators with the usual S-estimators, including least median of squares. It turns out that GS-estimators attain a much higher efficiency than S-estimators, at the cost of a slightly increased worst-case bias. We investigate the breakdown point, the maxbias curve and the influence function of GS-estimators. We also give an algorithm for computing GS-estimators, and apply it to real and simulated data. Keywords: Breakdown point; Influence function; Maxbias curve; Regression analysis; Robustness.

    A Fast Algorithm for Robust Regression with Penalised Trimmed Squares

    The presence of groups containing high leverage outliers makes linear regression a difficult problem due to the masking effect. The available high breakdown estimators based on Least Trimmed Squares (LTS) often do not succeed in detecting masked high leverage outliers in finite samples. An alternative to the LTS estimator, called the Penalised Trimmed Squares (PTS) estimator, was introduced by the authors in [ZiouAv:05, ZiAvPi:07], and it appears to be less sensitive to the masking problem. This estimator is defined by a Quadratic Mixed Integer Programming (QMIP) problem, in which the objective function includes a penalty cost for each observation that serves as an upper bound on the residual error for any feasible regression line. Since the PTS does not require presetting the number of outliers to delete from the data set, it has better efficiency than other estimators. However, due to the high computational complexity of the resulting QMIP problem, computing exact solutions for moderately large regression problems is infeasible. In this paper we further establish the theoretical properties of the PTS estimator, such as high breakdown and efficiency, and propose an approximate algorithm called Fast-PTS to compute the PTS estimator for large data sets efficiently. Extensive computational experiments on sets of benchmark instances with varying degrees of outlier contamination indicate that the proposed algorithm performs well in identifying groups of high leverage outliers in reasonable computational time. Comment: 27 pages.
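
For orientation, the Least Trimmed Squares criterion that the abstract uses as its baseline is the sum of the h smallest squared residuals, where the coverage h must be chosen in advance. The sketch below evaluates that criterion for a candidate line in simple regression. It illustrates LTS only, not the PTS objective or the Fast-PTS algorithm, whose exact form the abstract does not give; the function name `lts_objective` and its parameters are hypothetical.

```python
def lts_objective(x, y, intercept, slope, h):
    """Sum of the h smallest squared residuals of the line
    y = intercept + slope * x (the Least Trimmed Squares criterion)."""
    squared_residuals = sorted(
        (yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)
    )
    # Trimming: the n - h largest squared residuals are ignored.
    return sum(squared_residuals[:h])


x = [0, 1, 2, 3]
y = [0, 1, 2, 30]  # the last point is a gross outlier
print(lts_objective(x, y, intercept=0, slope=1, h=3))  # 0
print(lts_objective(x, y, intercept=0, slope=1, h=4))  # 729
```

With h = 3 the outlier is trimmed and the line through the clean points scores zero; with h = 4 nothing is trimmed and the criterion reduces to ordinary least squares.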

    Statistical quality assessment and outlier detection for liquid chromatography-mass spectrometry experiments

    Background: Quality assessment methods, which are commonplace in engineering and industrial production, are not widespread in large-scale proteomics experiments. Yet modern technologies such as Multi-Dimensional Liquid Chromatography coupled to Mass Spectrometry (LC-MS) produce large quantities of proteomic data. These data are prone to measurement errors and reproducibility problems, so automatic quality assessment and control are becoming increasingly important. Results: We propose a methodology to assess the quality and reproducibility of data generated in quantitative LC-MS experiments. We introduce quality descriptors that capture different aspects of the quality and reproducibility of LC-MS data sets. Our method is based on the Mahalanobis distance and a robust Principal Component Analysis. Conclusion: We evaluate our approach on several data sets of different complexities and show that we are able to precisely detect LC-MS runs of poor signal quality in large-scale studies.

    Temporal Dynamics of Host Molecular Responses Differentiate Symptomatic and Asymptomatic Influenza A Infection

    Exposure to influenza viruses is necessary, but not sufficient, for healthy human hosts to develop symptomatic illness. The host response is an important determinant of disease progression. In order to delineate host molecular responses that differentiate symptomatic and asymptomatic Influenza A infection, we inoculated 17 healthy adults with live influenza (H3N2/Wisconsin) and examined changes in host peripheral blood gene expression at 16 timepoints over 132 hours. Here we present distinct transcriptional dynamics of host responses unique to asymptomatic and symptomatic infections. We show that symptomatic hosts simultaneously invoke multiple pattern recognition receptor-mediated antiviral and inflammatory responses that may relate to virus-induced oxidative stress. In contrast, asymptomatic subjects tightly regulate these responses and exhibit elevated expression of genes that function in antioxidant responses and cell-mediated responses. We reveal an ab initio molecular signature that strongly correlates with symptomatic clinical disease, and biomarkers whose expression patterns best discriminate early from late phases of infection. Our results establish a temporal pattern of host molecular responses that differentiates symptomatic from asymptomatic infections and reveals an asymptomatic host-unique non-passive response signature, suggesting novel putative molecular targets for both prognostic assessment and ameliorative therapeutic intervention in seasonal and pandemic influenza.

    From basic to reduced bias kernel density estimators: links via Taylor series approximations

    The transformation kernel density estimator of Ruppert and Cline (1994) achieves bias of order h^4 (as the bandwidth h→0), an improvement over the order h^2 bias associated with the basic kernel density estimator. Hössjer and Ruppert (1994) use Taylor series expansions to build a bridge between the two, displaying an infinite sequence of O(h^4) bias estimators in the process. In this paper, we extend the work of Hössjer and Ruppert (i) by investigating three other natural Taylor series expansions, and (ii) by applying the approach to two other O(h^4) bias estimators, namely the variable bandwidth and multiplicative bias correction methods. Several further infinite sequences of O(h^4) bias estimators result.
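
As a reference point for the "basic kernel density estimator" mentioned in the abstract, here is a minimal sketch of that estimator, the average of kernels of bandwidth h centred at the observations. The Gaussian kernel and the function name `gaussian_kde` are assumptions made for illustration; the abstract does not specify a kernel, and none of the reduced-bias variants is implemented here.

```python
import math


def gaussian_kde(data, h):
    """Return the basic kernel density estimate as a function of x:
    f_hat(x) = (1 / (n * h)) * sum_i K((x - X_i) / h),
    with K the standard normal density (an assumed kernel choice)."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


# Usage: estimate the density of a small sample at one point.
f_hat = gaussian_kde([-1.2, -0.3, 0.1, 0.4, 1.5], h=0.5)
print(f_hat(0.0))
```

For a second-order kernel such as the Gaussian, the bias of this estimator is of order h^2, which is the baseline that the O(h^4) methods in the abstract improve on.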

    Generalizing univariate signed rank statistics for testing and estimating a multivariate location parameter

    We generalize signed rank statistics to dimensions higher than one. This results in a class of orthogonally invariant and distribution free tests that can be used for testing spherical symmetry/location parameter. The corresponding estimator is orthogonally equivariant. Both the test and estimator can be chosen with asymptotic efficiency 1. The breakdown point of the estimator depends only on the scores, not on the dimension of the data. For elliptical distributions, we obtain an affine invariant test with the same asymptotic properties, if the signed rank statistic is applied to standardized data. We also present a method for computing the estimator numerically, and consider a real data example and some simulations. Finally, an application to detection of time-varying signals in spherically symmetric noise is given. Status: published.

    On the effect of estimating the error density in nonparametric deconvolution

    It is quite common in the statistical literature on nonparametric deconvolution to assume that the error density is perfectly known. Since this seems to be unrealistic in many practical applications, we study the effect of estimating the unknown error density. We derive minimax rates of convergence and propose a modification of the usual kernel-based estimation scheme, which takes the uncertainty about the error density into account. A simulation study quantifies the possible gains by this new method in finite sample situations.