Asymptotics of hierarchical clustering for growing dimension

Authors: Borysov Petro, Hannig Jan, Marron J.S.
Publication date: 01/01/2014

Modern-day science presents many challenges to data analysts. Advances in data collection provide very large data sets, both in the number of observations and in the number of dimensions. In many areas of data analysis, an informative task is to find natural separations of data into homogeneous groups, i.e. clusters. In this paper we study the asymptotic behavior of hierarchical clustering in situations where both sample size and dimension grow to infinity. We derive explicit signal vs. noise boundaries between different types of clustering behaviors. We also show that the clustering behavior within the boundaries is the same across a wide spectrum of asymptotic settings.

Carolina Digital Repository

Clustering Financial Time Series: How Long is Enough?

Authors: Andler Sébastien, Donnat Philippe, Marti Gautier, Nielsen Frank
Publication date: 14/04/2016

Researchers have used anywhere from 30 days to several years of daily returns as source data for clustering financial time series based on their correlations. This paper sets up a statistical framework to study the validity of such practices. We first show that clustering correlated random variables from their observed values is statistically consistent. We then give a first empirical answer to the much-debated question: how long should the time series be? If too short, the clusters found can be spurious; if too long, dynamics can be smoothed out. Comment: Accepted at IJCAI 201

arXiv.org e-Print Archive

HAL-ENS-LYON

HAL-Polytechnique

A proposal of a methodological framework with experimental guidelines to investigate clustering stability on financial time series

Authors: Donnat Philippe, Marti Gautier, Nielsen Frank, Very Philippe
Publication date: 17/09/2015

In this paper we present an empirical framework motivated by the practitioner's point of view on stability. The goal is both to assess clustering validity and to yield market insights: the data perturbations we propose provide a multi-view of the assets' clustering behaviour. The perturbation framework is illustrated on an extensive credit default swap time series database available online at www.datagrapple.com. Comment: Accepted at ICMLA 201

arXiv.org e-Print Archive

Crossref

Discrete scale invariance and complex dimensions

Author: Sornette Didier
Publication venue: Elsevier BV
Publication date: 01/01/1998

We discuss the concept of discrete scale invariance and how it leads to complex critical exponents (or dimensions), i.e. to log-periodic corrections to scaling. After their initial suggestion as formal solutions of renormalization group equations in the seventies, complex exponents were studied in the eighties in relation to various problems of physics embedded in hierarchical systems. Only recently has it been realized that discrete scale invariance and its associated complex exponents may appear "spontaneously" in Euclidean systems, i.e. without the need for a pre-existing hierarchy. Examples are diffusion-limited-aggregation clusters, rupture in heterogeneous systems, earthquakes, and animals (a generalization of percolation), among many other systems. We review the known mechanisms for the spontaneous generation of discrete scale invariance and provide an extensive list of situations where complex exponents have been found. This is done in order to provide a basis for a better fundamental understanding of discrete scale invariance. The main motivation for studying discrete scale invariance and its signatures is that it provides new insights into the underlying mechanisms of scale invariance. It may also be very interesting for prediction purposes. Comment: significantly extended version (Oct. 27, 1998), with new examples in several domains, of the review paper with the same title published in Physics Reports 297, 239-270 (1998)

arXiv.org e-Print Archive

CiteSeerX

Crossref