Covariate-assisted ranking and screening for large-scale two-sample inference
Two-sample multiple testing has a wide range of applications. The conventional practice first reduces the original observations to a vector of p-values and then chooses a cutoff to adjust for multiplicity. However, this data reduction step could cause significant loss of information and thus lead to suboptimal testing procedures. We introduce a new framework for two-sample multiple testing by incorporating a carefully constructed auxiliary variable in inference to improve the power. A data-driven multiple-testing procedure is developed by employing a covariate-assisted ranking and screening (CARS) approach that optimally combines the information from both the primary and the auxiliary variables. The proposed CARS procedure is shown to be asymptotically valid and optimal for false discovery rate control. The procedure is implemented in the R package CARS. Numerical results confirm the effectiveness of CARS in false discovery rate control and show that it achieves substantial power gain over existing methods. CARS is also illustrated through an application to the analysis of a satellite imaging data set for supernova detection.
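The "conventional practice" this abstract contrasts CARS against (reduce each test to a p-value, then choose a multiplicity-adjusting cutoff) can be illustrated with the Benjamini-Hochberg step-up rule for false discovery rate control. This is a hedged sketch of that baseline only, not of the CARS procedure; the function name `bh_reject` and the toy p-values are ours.

```python
import numpy as np

def bh_reject(pvals, alpha=0.1):
    """Benjamini-Hochberg step-up rule: reject the k smallest p-values,
    where k is the largest index with p_(k) <= alpha * k / m."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()   # largest index passing its cutoff
        reject[order[: k + 1]] = True    # reject the k+1 smallest p-values
    return reject

# 0.03 fails its own cutoff of 0.025, so only the two smallest are rejected
print(bh_reject([0.001, 0.004, 0.03, 0.095, 0.5, 0.9], alpha=0.05))
```

Because the rule looks only at the p-values, any information in the original observations beyond them is discarded, which is exactly the loss CARS's auxiliary variable is designed to recover.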
Modification of S₁ statistic with Hodges-Lehmann as the central tendency measure
Normality and variance homogeneity assumptions are usually the main concern of parametric procedures such as tests of the equality of central tendency measures. Violation of these assumptions can seriously inflate Type I error rates, causing spurious rejection of null hypotheses. Parametric procedures such as ANOVA and the t-test rely heavily on assumptions that are rarely met in real data. Alternatively, nonparametric procedures do not rely on the distribution of the data, but they are less powerful. To overcome these issues, robust procedures are recommended. The S₁ statistic is one such robust procedure; it uses the median as the location parameter to test the equality of central tendency measures among groups, and it works with the original data without trimming or transforming them to attain normality. Previous work on S₁ showed a lack of robustness under some conditions of the balanced design. Hence, the objective of this study is to improve the original S₁ statistic by substituting the Hodges-Lehmann estimator for the median. The substitution was also made in the scale estimator, using the variance of Hodges-Lehmann as well as several robust scale estimators. To examine the strengths and weaknesses of the proposed procedures, several factors were manipulated: type of distribution, number of groups, balanced and unbalanced group sizes, equal and unequal variances, and the nature of pairings. The findings show that all proposed procedures are robust across all conditions for every group case. Moreover, three of the proposed procedures, namely S₁(MADn), S₁(Tn) and S₁(Sn), perform better than the original S₁ procedure under extremely skewed distributions. Overall, the proposed procedures control the inflation of Type I error. Hence, the objective of this study has been achieved, as the three proposed procedures show improved robustness under skewed distributions.
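The Hodges-Lehmann estimator substituted for the median above is a standard robust location estimate: the median of all pairwise (Walsh) averages. A minimal sketch of the one-sample version, with an illustrative function name of our choosing (the S₁ statistic itself and the MADn, Tn and Sn scale estimators are not reproduced here):

```python
import itertools
import statistics

def hodges_lehmann(x):
    """One-sample Hodges-Lehmann location estimator: the median of all
    Walsh averages (x_i + x_j) / 2 over pairs with i <= j."""
    walsh = [(a + b) / 2
             for a, b in itertools.combinations_with_replacement(x, 2)]
    return statistics.median(walsh)

# the single outlier barely moves the estimate off the bulk of the data
print(hodges_lehmann([1, 2, 3, 100]))   # -> 2.75, versus a mean of 26.5
```

Like the median, the estimator has a high breakdown point, but it uses the data more efficiently, which is consistent with the power gains the study reports.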
A parametric multiclass Bayes error estimator for the multispectral scanner spatial model performance evaluation
The author has identified the following significant results. The probability of correct classification of the various populations in the data was defined as the primary performance index. The multispectral data, being multiclass in nature, required a Bayes error estimation procedure that depends on a set of class statistics alone. The classification error was expressed as an N-dimensional integral, where N is the dimensionality of the feature space. The multispectral scanner spatial model was represented by a linear shift-invariant multiple-port system in which the N spectral bands comprise the input processes. The scanner characteristic function, the relationship governing the transformation of the input spatial (and hence spectral) correlation matrices through the system, was developed.
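The quantity the abstract's parametric estimator targets, the multiclass Bayes error, can also be approximated by simulation, which is a useful cross-check on any closed-form or numerical-integration estimate. The sketch below is our own illustration under strong simplifying assumptions (one-dimensional Gaussian class densities, known parameters); it is not the paper's parametric N-dimensional-integral procedure.

```python
import math
import random

def gauss_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def bayes_error_mc(mus, sigmas, priors, n=100_000, seed=0):
    """Monte Carlo estimate of the multiclass Bayes error for 1-D
    Gaussian class densities: sample from the mixture and count how
    often the maximum-a-posteriori class differs from the true class."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n):
        u, c = rng.random(), 0            # draw the true class from the priors
        while u > priors[c]:
            u -= priors[c]
            c += 1
        x = rng.gauss(mus[c], sigmas[c])  # draw an observation from that class
        post = [p * gauss_pdf(x, m, s) for p, m, s in zip(priors, mus, sigmas)]
        if post.index(max(post)) != c:
            errors += 1
    return errors / n

# two equiprobable unit-variance classes with means 2 apart:
# the exact Bayes error is Phi(-1), about 0.159
print(bayes_error_mc([0.0, 2.0], [1.0, 1.0], [0.5, 0.5]))
```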
Sequential Estimation Methodologies with Observations Gathered in Groups: Theory, Practice and Data Analysis
Purely sequential procedures have been widely studied for different inference problems. In a purely sequential procedure, however, observations are taken only one at a time. In real life, items purchased in bulk often cost less per unit than items purchased individually. This dissertation addresses this situation, in which observations are gathered in groups. First, two fundamental problems in purely sequential estimation are revisited: (i) the fixed-width confidence interval (FWCI) estimation problem, and (ii) the minimum risk point estimation (MRPE) problem, in the context of estimating the unknown mean of a normal population with unknown variance. We begin by laying down general frameworks for the second-order asymptotic analyses of both problems under sequential sampling of one observation at a time. We then consider sequentially sampling k observations at a time in defining our proposed estimation strategies. In a first attempt, tentative estimators are used to study feasibility. We then replace this simple class of estimators with more elaborate estimators that are unbiased and consistent under permutations within each group. These new estimators, incorporated in the definitions of the stopping boundaries, lead to tighter estimation of the requisite optimal fixed sample sizes. In both scenarios, first-order and second-order asymptotic properties are analyzed under appropriate requirements on the pilot sample size.
Such estimators can also be used in two-sample comparisons. The last part of this dissertation presents the second-order asymptotic properties for comparing treatment means. Two separate situations are considered: (i) σ1 = σ2 = σ, with σ unknown, and (ii) σ1 and σ2 unequal and unknown. For datasets with possible outliers, robust estimators are employed in the purely sequential estimation strategies. For each problem, large-scale computer simulations and substantial data analysis validate the corresponding results. The methodologies are illustrated with real-world data.
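The one-observation-at-a-time FWCI baseline that the dissertation starts from can be sketched with a Chow-Robbins-type stopping rule: keep sampling until the sample size is large enough that a fixed-width interval reaches the nominal confidence. This is a minimal sketch of that classical one-at-a-time rule under our own naming, not the dissertation's k-at-a-time, permutation-based procedure.

```python
import random
import statistics

def sequential_fwci(draw, d, z=1.96, pilot=10, max_n=100_000):
    """Chow-Robbins-type purely sequential rule for a fixed-width
    confidence interval for a normal mean with unknown variance:
    after a pilot sample, draw one observation at a time and stop
    at the first n with n >= z^2 * s_n^2 / d^2."""
    xs = [draw() for _ in range(pilot)]
    while len(xs) < max_n:
        n = len(xs)
        if n >= z * z * statistics.variance(xs) / (d * d):
            xbar = statistics.fmean(xs)
            return n, (xbar - d, xbar + d)   # interval of half-width d
        xs.append(draw())
    raise RuntimeError("stopping rule did not terminate by max_n")

random.seed(1)
n, ci = sequential_fwci(lambda: random.gauss(0.0, 2.0), d=0.5)
print(n, ci)   # the oracle fixed-sample size here is z^2 * sigma^2 / d^2, about 61
```

Replacing the single `draw()` in the loop with a group of k draws, and `statistics.variance` with the group-permutation-based variance estimators, is the direction the dissertation develops.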
Multivariate Statistical Process Control Charts: An Overview
In this paper we discuss the basic procedures for the implementation of multivariate statistical process control via control charting. Furthermore, we review multivariate extensions of all the main univariate control charts, such as multivariate Shewhart-type control charts, multivariate CUSUM control charts and multivariate EWMA control charts. In addition, we review procedures unique to the construction of multivariate control charts, based on multivariate statistical techniques such as principal components analysis (PCA) and partial least squares (PLS). Finally, we describe the most significant methods for the interpretation of an out-of-control signal.
Keywords: quality control, process control, multivariate statistical process control, Hotelling's T-square, CUSUM, EWMA, PCA, PLS
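The simplest of the multivariate Shewhart-type charts reviewed here is based on Hotelling's T-square statistic, which measures each observation's covariance-scaled distance from the in-control mean. A minimal Phase-I sketch, with an illustrative function name of our choosing:

```python
import numpy as np

def hotelling_t2(X):
    """Phase-I Hotelling T^2 statistics: for each observation x_i,
    T2_i = (x_i - xbar)^T S^{-1} (x_i - xbar), where xbar and S are
    the sample mean vector and sample covariance matrix of X."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    # row-wise quadratic form diff_i^T S^{-1} diff_i
    return np.einsum("ij,jk,ik->i", diff, S_inv, diff)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))   # 50 in-control observations on 3 variables
t2 = hotelling_t2(X)
# points whose T^2 exceeds the chart's upper control limit would signal
print(t2.max())
```

A single T^2 value says only that some direction is out of control, which is why the paper's final section on interpreting an out-of-control signal (and the PCA/PLS-based charts) matters in practice.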
Some two-step sampling procedures
Spring 1966. Includes bibliographic references (pages 69-71). Covers not scanned. Print version deaccessioned 2020.
Two-step sampling procedures are presented to estimate the variance of a normal distribution and the mean of a Poisson distribution to within d units with a specified confidence coefficient. The procedure for estimating the variance of a normal distribution is based on a Tchebycheff-type inequality derived especially for the gamma distribution. A different type of argument, which could be applied to many other distributions, is used to solve the problem for the Poisson distribution. Sample sizes are presented in tables and graphs to implement the two solutions. Favorable comparisons are also made with existing methods.
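The general shape of a two-step procedure (a pilot first step that sizes the second step) can be sketched for the Poisson-mean problem. This is only an illustrative normal-approximation sketch under assumed names, not the thesis's procedure, which rests on a Tchebycheff-type inequality and a distinct argument for the Poisson case.

```python
import math

def two_step_poisson_mean(draw, d, z=1.96, pilot=30):
    """Hypothetical two-step sketch for estimating a Poisson mean to
    within about d: a pilot sample estimates lambda (which is also the
    variance), a normal approximation sets the total sample size
    n = ceil(z^2 * lambda_hat / d^2), and the remaining n - pilot
    observations are taken in a single second step."""
    xs = [draw() for _ in range(pilot)]
    lam_hat = sum(xs) / pilot
    n = max(pilot, math.ceil(z * z * lam_hat / (d * d)))
    xs += [draw() for _ in range(n - pilot)]
    return n, sum(xs) / n

# with a pilot estimate lambda_hat = 4, the rule asks for 62 observations in total
print(two_step_poisson_mean(lambda: 4, d=0.5))
```

Unlike a purely sequential rule, the total sample size is fixed after the pilot, which is what makes bulk (second-step) sampling cheap when per-item cost falls with batch size.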