Toward a generic representation of random variables for machine learning

Authors: Donnat Philippe, Marti Gautier, Very Philippe
Publication date: 03/09/2015

This paper presents a pre-processing step and a distance that improve the performance of machine learning algorithms working on independent and identically distributed stochastic processes. We introduce a novel non-parametric approach to representing random variables that separates dependency from distribution without losing any information. We also propose an associated metric that leverages this representation, along with its statistical estimate. Besides experiments on synthetic datasets, the benefits of our contribution are illustrated through the example of clustering financial time series, for instance prices from the credit default swaps market. Results are available on the website www.datagrapple.com and an IPython Notebook tutorial is available at www.datagrapple.com/Tech for reproducible research. Comment: submitted to Pattern Recognition Letter

arXiv.org e-Print Archive

HAL-Polytechnique

Populaire à l’épreuve de la diversité culturelle (Le)

Author: Donnat Olivier

Bibliothèque numérique de l'enssib

A proposal of a methodological framework with experimental guidelines to investigate clustering stability on financial time series

Authors: Donnat Philippe, Marti Gautier, Nielsen Frank, Very Philippe
Publication date: 17/09/2015

We present in this paper an empirical framework motivated by the practitioner's point of view on stability. The goal is both to assess clustering validity and to yield market insights by providing, through the data perturbations we propose, a multi-view of the assets' clustering behaviour. The perturbation framework is illustrated on an extensive credit default swap time series database available online at www.datagrapple.com. Comment: Accepted at ICMLA 201

arXiv.org e-Print Archive

Crossref