
    Modeling the variability of rankings

    For better or for worse, rankings of institutions, such as universities, schools and hospitals, play an important role today in conveying information about relative performance. They inform policy decisions and budgets, and are often reported in the media. While overall rankings can vary markedly over relatively short time periods, it is not unusual to find that the ranks of a small number of "highly performing" institutions remain fixed, even when the data on which the rankings are based are extensively revised, and even when a large number of new institutions are added to the competition. In the present paper, we endeavor to model this phenomenon. In particular, we interpret as a random variable the value of the attribute on which the ranking should ideally be based. More precisely, if p items are to be ranked then the true, but unobserved, attributes are taken to be values of p independent and identically distributed variates. However, each attribute value is observed only with noise, and via a sample of size roughly equal to n, say. These noisy approximations to the true attributes are the quantities that are actually ranked. We show that, if the distribution of the true attributes is light-tailed (e.g., normal or exponential) then the number of institutions whose ranking is correct, even after recalculation using new data and even after many new institutions are added, is essentially fixed. Formally, p is taken to be of order n^C for any fixed C>0, and the number of institutions whose ranking is reliable depends very little on p.
    Comment: Published at http://dx.doi.org/10.1214/10-AOS794 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
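    The phenomenon described in this abstract is easy to reproduce numerically. The sketch below is an illustration only, not the paper's construction: the choices of n, p, the exponential attribute distribution and the noise scale n^{-1/2} are all assumptions. It draws p true attributes, observes each with noise, and counts how many of the top-k places are occupied by the same items in two independently perturbed rankings.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 2000                       # sample size per item, number of items ranked
theta = rng.exponential(size=p)        # true, unobserved attributes: iid and light-tailed

def observed_ranking():
    # each attribute is observed only with noise of order n^{-1/2}
    x = theta + rng.normal(scale=n ** -0.5, size=p)
    return np.argsort(-x)              # item indices ordered best-first

r1 = observed_ranking()
r2 = observed_ranking()                # "revised" data: fresh noise, same truths
# how many of the top-k places hold the same items in both rankings?
for k in (5, 20, 100):
    agree = np.intersect1d(r1[:k], r2[:k]).size
    print(k, agree)
```

    With a light-tailed attribute distribution the gaps between the very best items dwarf the noise, so the top of the list tends to agree across replications while lower places churn.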

    Using the bootstrap to quantify the authority of an empirical ranking

    The bootstrap is a popular and convenient method for quantifying the authority of an empirical ordering of attributes, for example of a ranking of the performance of institutions or of the influence of genes on a response variable. In the first of these examples, the number, p, of quantities being ordered is sometimes only moderate in size; in the second it can be very large, often much greater than sample size. However, we show that in both types of problem the conventional bootstrap can produce inconsistency. Moreover, the standard n-out-of-n bootstrap estimator of the distribution of an empirical rank may not converge in the usual sense; the estimator may converge in distribution, but not in probability. Nevertheless, in many cases the bootstrap correctly identifies the support of the asymptotic distribution of ranks. In some contemporary problems, bootstrap prediction intervals for ranks are particularly long, and in this context, we also quantify the accuracy of bootstrap methods, showing that the standard bootstrap gets the order of magnitude of the interval right, but not the constant multiplier of interval length. The m-out-of-n bootstrap can improve performance and produce statistical consistency, but it requires empirical choice of m; we suggest a tuning solution to this problem. We show that in genomic examples, where it might be expected that the standard, "synchronous" bootstrap will successfully accommodate nonindependence of vector components, that approach can produce misleading results. An "independent component" bootstrap can overcome these difficulties, even in cases where components are not strictly independent.
    Comment: Published at http://dx.doi.org/10.1214/09-AOS699 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
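    A minimal numerical sketch of the idea, with a toy Gaussian data matrix and an ad hoc choice of m (the paper proposes an empirical tuning rule for m, which is not reproduced here): resample rows of the data, recompute the sample means, and record the rank of a chosen attribute across bootstrap replicates.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 8                           # observations per attribute, attributes ranked
data = rng.normal(loc=np.linspace(0.0, 1.0, p), size=(n, p))

def rank_of(means, j):
    # empirical rank of attribute j (1 = largest sample mean)
    return 1 + int(np.sum(means > means[j]))

def bootstrap_rank_dist(j, m, B=2000):
    # m-out-of-n bootstrap: resample m rows with replacement, recompute the rank
    counts = np.zeros(p + 1)
    for _ in range(B):
        idx = rng.integers(0, n, size=m)
        counts[rank_of(data[idx].mean(axis=0), j)] += 1
    return counts[1:] / B               # estimated distribution over ranks 1..p

print(bootstrap_rank_dist(j=4, m=n))        # conventional n-out-of-n bootstrap
print(bootstrap_rank_dist(j=4, m=n // 4))   # m-out-of-n with a smaller m
```

    Taking m smaller than n spreads the resampled ranks over more values, which is the mechanism by which the m-out-of-n bootstrap restores consistency for non-regular functionals such as ranks.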

    Feature selection when there are many influential features

    Recent discussion of the success of feature selection methods has argued that focusing on a relatively small number of features has been counterproductive. Instead, it is suggested, the number of significant features can be in the thousands or tens of thousands, rather than (as is commonly supposed at present) approximately in the range from five to fifty. This change, in orders of magnitude, in the number of influential features, necessitates alterations to the way in which we choose features and to the manner in which the success of feature selection is assessed. In this paper, we suggest a general approach that is suited to cases where the number of relevant features is very large, and we consider particular versions of the approach in detail. We propose ways of measuring performance, and we study both theoretical and numerical properties of the proposed methodology.
    Comment: Published at http://dx.doi.org/10.3150/13-BEJ536 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
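    The shift in scale can be illustrated with a toy marginal-screening sketch. All sizes and the 0.3 effect size below are arbitrary assumptions, and marginal correlation merely stands in for the paper's methodology: when thousands of features carry weak signal, a rule that retains only a few dozen of them necessarily misses almost everything influential.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 100, 20000, 4000              # samples, features, truly influential features
beta = np.zeros(p)
beta[:k] = 0.3                          # thousands of weak signals, not a handful
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)

scores = np.abs(X.T @ y) / n            # marginal association of each feature with y
order = np.argsort(-scores)             # features ranked by score, strongest first

# keep progressively larger feature sets and count how many are genuine signals
for keep in (50, 1000, 5000):
    hits = int(np.sum(order[:keep] < k))
    print(keep, hits)
```

    Keeping 50 features can capture at most 50 of the 4000 signals, so any performance measure tied to recovering the influential set must be rethought once the influential set is this large.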

    The distribution and excretion of inulin in man

    A technique has been devised for performing inulin clearances in man which is a great simplification on the method used by earlier workers. The simplification is threefold.
    First, a single injection is used in lieu of a large priming injection followed by a sustained infusion to maintain the blood inulin concentration. As a result, the blood concentration falls constantly throughout the experiment, so that some difficulty may be experienced in obtaining the exact mean blood concentration for each observation.
    Second, urine specimens were voided in the natural way; catheterisation of the bladder followed by washing out with saline was not practised, since it was felt that so drastic a procedure was not warranted by a clinical experiment. This is an obvious source of error on account of large differences between blood and urine concentration, but it can be minimised if the urine flow is kept at a high level by giving the subject large quantities of water to drink.
    Third, a marked improvement in the manner of estimating blood inulin has been introduced. This method is both simpler and more accurate than the original ones used in the estimation of inulin, and it is on account of the greater accuracy that a relatively small single injection can be used. It is much simpler than a similar method introduced by Alving, Rubin and Miller in America, which is now fairly generally used in that country, although it is possible that their method may be of slightly greater accuracy. I am sceptical of the use of that method without removing the blood glucose. The authors state that it is applicable at blood concentrations of over 30 mg./100 c.c. The colour due to fructose in Herbert's method is 88% developed after 15 minutes, and this follows a rectilinear relationship; at this time the colour due to glucose is negligible. By the other method, however, the specimens are incubated for 1 hour, and though the proportions of the reagents are different, one feels that there must be considerable colour development due to glucose.
    The results obtained in a series of subjects by this method are analysed and discussed, and certain of the observations are compared with synchronous urea clearances. In the course of these investigations the distribution of inulin in the human body is also studied, and the fact that inulin is excreted by a simple physical process of filtration is demonstrated by observations showing that the rate of excretion increases in proportion to the blood inulin concentration. As further proof of filtration, it has also been shown that when the fall of blood concentration is plotted logarithmically against time, a rectilinear relationship results.
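    The filtration argument can be made concrete with a small worked example, using entirely hypothetical dose, volume and rate values rather than the thesis's own data: under first-order filtration the logarithm of the blood concentration is rectilinear in time, so the slope of that line recovers the elimination rate, and clearance follows from it.

```python
import numpy as np

# Hypothetical values chosen for illustration; the point is the method, not the numbers.
D = 5000.0             # mg of inulin injected
V = 12000.0            # apparent volume of distribution, ml
k = 0.008              # first-order elimination rate, per minute
t = np.array([10.0, 20.0, 40.0, 60.0, 90.0])    # sampling times, minutes
C = (D / V) * np.exp(-k * t)                     # falling blood concentration

# simple filtration makes log C rectilinear in t; the slope recovers k,
# and clearance follows as k * V
slope, _ = np.polyfit(t, np.log(C), 1)
clearance = -slope * V                           # ml per minute
print(round(clearance, 1))                       # 96.0 with these hypothetical values
```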

    Food, Money & Sex


    Hugh A. Barr to H. R. Miller (5 November)

    News concerning Miller's election

    Treatment by hypnotism and post-hypnotic suggestion


    DNA Blueprints, Personhood, and Genetic Privacy

