    Approximating the Distribution of the Median and other Robust Estimators on Uncertain Data

    Robust estimators, like the median of a point set, are important for data analysis in the presence of outliers. We study robust estimators for locationally uncertain points with discrete distributions. That is, each point in a data set has a discrete probability distribution describing its location. The probabilistic nature of uncertain data makes it challenging to compute such estimators, since the true value of the estimator is now described by a distribution rather than a single point. We show how to construct and estimate the distribution of the median of a point set. Building the approximate support of the distribution takes near-linear time, and assigning probability to that support takes quadratic time. We also develop a general approximation technique for distributions of robust estimators with respect to ranges with bounded VC dimension. This includes the geometric median for high dimensions and the Siegel estimator for linear regression. Comment: Full version of a paper to appear at SoCG 201
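
To make "the distribution of the median" concrete, here is a minimal brute-force sketch for one-dimensional uncertain points. It is not the near-linear or quadratic-time construction the abstract describes: it enumerates every joint realisation, so it is exponential in the number of points. The function name `median_distribution`, the use of the lower median, and the toy data are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product
from statistics import median_low


def median_distribution(uncertain_points):
    """Exact distribution of the lower median, by full enumeration.

    uncertain_points: list of points; each point is a list of
    (location, probability) pairs whose probabilities sum to 1.
    Assumes the points are independent of each other.
    """
    dist = defaultdict(float)
    # Each element of the product picks one possible location per point.
    for realisation in product(*uncertain_points):
        prob = 1.0
        for _, p in realisation:
            prob *= p
        locations = [loc for loc, _ in realisation]
        dist[median_low(locations)] += prob
    return dict(dist)


# Hypothetical toy input: three uncertain points on the real line.
points = [
    [(1.0, 0.5), (9.0, 0.5)],
    [(2.0, 1.0)],
    [(3.0, 0.25), (4.0, 0.75)],
]
dist = median_distribution(points)
print(dist)
```

The median here is a random variable supported on the input locations, which is why the result is a probability table rather than a single number.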

    Calculation of Weibull strength parameters and Batdorf flaw-density constants for volume- and surface-flaw-induced fracture in ceramics

    The calculation of shape and scale parameters of the two-parameter Weibull distribution is described using least-squares analysis and maximum likelihood methods for volume- and surface-flaw-induced fracture in ceramics with complete and censored samples. Detailed procedures are given for evaluating 90 percent confidence intervals for maximum likelihood estimates of shape and scale parameters, the unbiased estimates of the shape parameters, and the Weibull mean values and corresponding standard deviations. Furthermore, the necessary steps are described for detecting outliers and for calculating the Kolmogorov-Smirnov and the Anderson-Darling goodness-of-fit statistics and 90 percent confidence bands about the Weibull distribution. It also shows how to calculate the Batdorf flaw-density constants by using the Weibull distribution statistical parameters. The techniques described were verified with several example problems from the open literature, and were coded in the Structural Ceramics Analysis and Reliability Evaluation (SCARE) design program.
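
As a rough illustration of the least-squares route for a complete (uncensored) sample, the sketch below fits a straight line to the linearised Weibull CDF, ln(-ln(1 - F)) = m ln(x) - m ln(scale). The rank estimator F_i = (i - 0.3)/(n + 0.4) is an assumption chosen for this example; the abstract does not say which estimator the report uses, and none of this reproduces the SCARE code. The sample is synthetic, generated from known parameters so the fit can be checked.

```python
import math


def median_ranks(n):
    """Approximate median ranks F_i = (i - 0.3) / (n + 0.4); an assumed choice."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def weibull_least_squares(strengths):
    """Estimate (shape, scale) of a two-parameter Weibull distribution
    by ordinary least squares on the linearised CDF (complete sample only)."""
    xs = sorted(strengths)
    n = len(xs)
    u = [math.log(x) for x in xs]
    v = [math.log(-math.log(1.0 - f)) for f in median_ranks(n)]
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    slope = (sum((a - u_mean) * (b - v_mean) for a, b in zip(u, v))
             / sum((a - u_mean) ** 2 for a in u))
    intercept = v_mean - slope * u_mean
    shape = slope                          # Weibull modulus m
    scale = math.exp(-intercept / slope)   # characteristic strength
    return shape, scale


# Synthetic strengths placed exactly on the Weibull quantiles, so the
# fit should recover the generating parameters up to rounding.
true_shape, true_scale = 10.0, 400.0
sample = [true_scale * (-math.log(1.0 - f)) ** (1.0 / true_shape)
          for f in median_ranks(8)]
shape_hat, scale_hat = weibull_least_squares(sample)
print(shape_hat, scale_hat)
```

Censored samples and the maximum likelihood estimates mentioned in the abstract need a different treatment and are not covered by this sketch.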

    The Stochastic Fluctuation of the Quantile Regression Curve

    Let (X_1, Y_1), ..., (X_n, Y_n) be i.i.d. random variables and let l(x) be the unknown p-quantile regression curve of Y on X. A quantile smoother l_n(x) is a localised, nonlinear estimator of l(x). The strong uniform consistency rate is established under general conditions. In many applications it is necessary to know the stochastic fluctuation of the process {l_n(x) - l(x)}. Using strong approximations of the empirical process and extreme value theory allows us to consider the asymptotic maximal deviation sup_{0 ≤ x ≤ 1} |l_n(x) - l(x)|. The derived result helps in the construction of a uniform confidence band for the quantile curve l(x). This confidence band can be applied as a model check, e.g. in econometrics. An application considers a labour market discrimination effect. Keywords: Quantile Regression, Consistency Rate, Confidence Band, Check Function, Kernel Smoothing, Nonparametric Fitting
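
One common way to build a localised quantile estimator is to minimise a kernel-weighted sum of check-function losses; the sketch below assumes that form, which the abstract's keywords suggest but do not spell out. The function names, the Epanechnikov-shaped kernel, the bandwidth `h`, and the toy data are all assumptions for illustration.

```python
def check_loss(u, p):
    """Check function rho_p(u) = u * (p - 1{u < 0})."""
    return u * (p - (1.0 if u < 0 else 0.0))


def kernel_quantile(x0, xs, ys, p, h):
    """Local p-quantile estimate at x0.

    Minimises sum_i K((x_i - x0) / h) * rho_p(y_i - q) over q.
    The objective is piecewise linear and convex in q, so it is
    enough to search over the responses that carry positive weight.
    """
    # Unnormalised Epanechnikov-shaped weights, zero outside |x - x0| < h.
    weights = [max(0.0, 1.0 - ((x - x0) / h) ** 2) for x in xs]
    candidates = sorted(y for y, w in zip(ys, weights) if w > 0.0)
    if not candidates:
        raise ValueError("no observations within bandwidth h of x0")
    return min(
        candidates,
        key=lambda q: sum(w * check_loss(y - q, p)
                          for y, w in zip(ys, weights)),
    )


# Hypothetical data: five observations at x = 0.5 and one far away,
# which receives zero weight and so cannot affect the local median.
xs = [0.5, 0.5, 0.5, 0.5, 0.5, 3.0]
ys = [1.0, 2.0, 3.0, 4.0, 100.0, -999.0]
fitted = kernel_quantile(0.5, xs, ys, p=0.5, h=1.0)
print(fitted)
```

Evaluating `kernel_quantile` on a grid of `x0` values traces out an estimated curve l_n(x); the uniform confidence band in the abstract concerns how far such a curve can stray from l(x) over the whole interval.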

    On the 3D structure of the mass, metallicity, and SFR space for SF galaxies

    We demonstrate that the space formed by the star-formation rate (SFR), gas-phase metallicity (Z), and stellar mass (M) can be reduced to a plane, as first proposed by Lara-Lopez et al. We study three different approaches to find the best representation of this 3D space, using a principal component analysis (PCA), a regression fit, and binning of the data. The PCA shows that this 3D space can be adequately represented in only 2 dimensions, i.e., a plane. We find that the plane that minimises the chi^2 for all variables, and hence provides the best representation of the data, corresponds to a regression fit to the stellar mass as a function of SFR and Z, M=f(Z,SFR). We find that the distribution resulting from the median values in bins for our data gives the highest chi^2. We also show that the empirical calibrations to the oxygen abundance used to derive the Fundamental Metallicity Relation (Nagao et al.) have important limitations, which contribute to the apparent inconsistencies. The main problem is that these empirical calibrations do not consider the ionization degree of the gas. Furthermore, the N2 index cannot be used to estimate oxygen abundances above 12+log(O/H) ~ 8.8 because of the saturation of the [NII]6584 line in the high-metallicity regime. Finally, we provide an update of the Fundamental Plane derived by Lara-Lopez et al. Comment: ApJ, accepted. 15 pages, 13 figures

    Coherent frequentism

    By representing the range of fair betting odds according to a pair of confidence set estimators, dual probability measures on parameter space called frequentist posteriors secure the coherence of subjective inference without any prior distribution. The closure of the set of expected losses corresponding to the dual frequentist posteriors constrains decisions without arbitrarily forcing optimization under all circumstances. This decision theory reduces to those that maximize expected utility when the pair of frequentist posteriors is induced by an exact or approximate confidence set estimator or when an automatic reduction rule is applied to the pair. In such cases, the resulting frequentist posterior is coherent in the sense that, as a probability distribution of the parameter of interest, it satisfies the axioms of the decision-theoretic and logic-theoretic systems typically cited in support of the Bayesian posterior. Unlike the p-value, the confidence level of an interval hypothesis derived from such a measure is suitable as an estimator of the indicator of hypothesis truth since it converges in sample-space probability to 1 if the hypothesis is true or to 0 otherwise under general conditions. Comment: The confidence-measure theory of inference and decision is explicitly extended to vector parameters of interest. The derivation of upper and lower confidence levels from valid and nonconservative set estimators is formalized.