329 research outputs found

    Rejoinder of: Statistical analysis of an archeological find

    Full text link
    Rejoinder of ``Statistical analysis of an archeological find'' [arXiv:0804.0079]Comment: Published in at http://dx.doi.org/10.1214/08-AOAS99REJ the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Discussion of: Brownian distance covariance

    Full text link
    Discussion on "Brownian distance covariance" by G\'{a}bor J. Sz\'{e}kely, Maria L. Rizzo [arXiv:1010.0297]Comment: Published in at http://dx.doi.org/10.1214/09-AOAS312D the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Statistical Significance of the Netflix Challenge

    Full text link
    Inspired by the legacy of the Netflix contest, we provide an overview of what has been learned---from our own efforts, and those of others---concerning the problems of collaborative filtering and recommender systems. The data set consists of about 100 million movie ratings (from 1 to 5 stars) involving some 480 thousand users and some 18 thousand movies; the associated ratings matrix is about 99% sparse. The goal is to predict ratings that users will give to movies; systems which can do this accurately have significant commercial applications, particularly on the world wide web. We discuss, in some detail, approaches to "baseline" modeling, singular value decomposition (SVD), as well as kNN (nearest neighbor) and neural network models; temporal effects, cross-validation issues, ensemble methods and other considerations are discussed as well. We compare existing models in a search for new models, and also discuss the mission-critical issues of penalization and parameter shrinkage which arise when the dimensions of a parameter space reaches into the millions. Although much work on such problems has been carried out by the computer science and machine learning communities, our goal here is to address a statistical audience, and to provide a primarily statistical treatment of the lessons that have been learned from this remarkable set of data.Comment: Published in at http://dx.doi.org/10.1214/11-STS368 the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Reduced-bias estimator of the Conditional Tail Expectation of heavy-tailed distributions

    Get PDF
    International audienceSeveral risk measures have been proposed in the literature. In this paper, we focus on the estimation of the Conditional Tail Expectation (CTE). Its asymptotic normality has been first established in the literature under the classical assumption that the second moment of the loss variable is finite, this condition being very restrictive in practical applications. Such a result has been extended by Necir {\it et al.} (2010) in the case of infinite second moment. In this framework, we propose a reduced-bias estimator of the CTE. We illustrate the efficiency of our approach on a small simulation study and a real data analysis

    Using statistical smoothing to date medieval manuscripts

    Full text link
    We discuss the use of multivariate kernel smoothing methods to date manuscripts dating from the 11th to the 15th centuries, in the English county of Essex. The dataset consists of some 3300 dated and 5000 undated manuscripts, and the former are used as a training sample for imputing dates for the latter. It is assumed that two manuscripts that are ``close'', in a sense that may be defined by a vector of measures of distance for documents, will have close dates. Using this approach, statistical ideas are used to assess ``similarity'', by smoothing among distance measures, and thus to estimate dates for the 5000 undated manuscripts by reference to the dated ones.Comment: Published in at http://dx.doi.org/10.1214/193940307000000248 the IMS Collections (http://www.imstat.org/publications/imscollections.htm) by the Institute of Mathematical Statistics (http://www.imstat.org

    Heavy-Tailed Distributions and Rating

    Get PDF

    On the estimation of the heavy-tail exponent in time series using the max-spectrum

    Full text link
    This paper addresses the problem of estimating the tail index Α of distributions with heavy, Pareto-type tails for dependent data, that is of interest in the areas of finance, insurance, environmental monitoring and teletraffic analysis. A novel approach based on the max self-similarity scaling behavior of block maxima is introduced. The method exploits the increasing lack of dependence of maxima over large size blocks, which proves useful for time series data. We establish the consistency and asymptotic normality of the proposed max-spectrum estimator for a large class of m -dependent time series, in the regime of intermediate block-maxima. In the regime of large block-maxima, we demonstrate the distributional consistency of the estimator for a broad range of time series models including linear processes. The max-spectrum estimator is a robust and computationally efficient tool, which provides a novel time-scale perspective to the estimation of the tail exponents. Its performance is illustrated over synthetic and real data sets. Copyright © 2009 John Wiley & Sons, Ltd.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/77436/1/764_ftp.pd
    corecore