    Asymptotics of hierarchical clustering for growing dimension

    Modern-day science presents many challenges to data analysts. Advances in data collection provide data sets that are very large in both number of observations and number of dimensions. In many areas of data analysis, an informative task is to find natural separations of data into homogeneous groups, i.e. clusters. In this paper we study the asymptotic behavior of hierarchical clustering in situations where both sample size and dimension grow to infinity. We derive explicit signal vs. noise boundaries between different types of clustering behavior. We also show that the clustering behavior within the boundaries is the same across a wide spectrum of asymptotic settings.

    Clustering Financial Time Series: How Long is Enough?

    Researchers have used from 30 days to several years of daily returns as source data for clustering financial time series based on their correlations. This paper sets up a statistical framework to study the validity of such practices. We first show that clustering correlated random variables from their observed values is statistically consistent. We then give a first empirical answer to the much-debated question: how long should the time series be? If too short, the clusters found can be spurious; if too long, dynamics can be smoothed out. Comment: Accepted at IJCAI 201

    A proposal of a methodological framework with experimental guidelines to investigate clustering stability on financial time series

    In this paper we present an empirical framework motivated by the practitioner's point of view on stability. The goal is both to assess clustering validity and to yield market insights: the data perturbations we propose provide a multi-view of the assets' clustering behaviour. The perturbation framework is illustrated on an extensive credit default swap time series database available online at www.datagrapple.com. Comment: Accepted at ICMLA 201

    Discrete scale invariance and complex dimensions

    We discuss the concept of discrete scale invariance and how it leads to complex critical exponents (or dimensions), i.e. to log-periodic corrections to scaling. After their initial suggestion as formal solutions of renormalization group equations in the seventies, complex exponents were studied in the eighties in relation to various problems of physics embedded in hierarchical systems. Only recently has it been realized that discrete scale invariance and its associated complex exponents may appear "spontaneously" in Euclidean systems, i.e. without the need for a pre-existing hierarchy. Examples are diffusion-limited-aggregation clusters, rupture in heterogeneous systems, earthquakes, and animals (a generalization of percolation), among many other systems. We review the known mechanisms for the spontaneous generation of discrete scale invariance and provide an extensive list of situations where complex exponents have been found. This is done in order to provide a basis for a better fundamental understanding of discrete scale invariance. The main motivation to study discrete scale invariance and its signatures is that it provides new insights into the underlying mechanisms of scale invariance. It may also be very interesting for prediction purposes. Comment: significantly extended version (Oct. 27, 1998) with new examples in several domains of the review paper with the same title published in Physics Reports 297, 239-270 (1998)