
A fast and recursive algorithm for clustering large datasets with $k$-medians

Authors: Cardot Hervé, Cénac Peggy, Monnez Jean-Marie
Publication date: 18/10/2011

Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which are known to perform better, and a data-driven procedure that allows automatic selection of the value of the descent step is proposed. The performance of the averaged sequential estimator is compared in a simulation study, in terms of both computation speed and estimation accuracy, with more classical partitioning techniques such as $k$-means, trimmed $k$-means and PAM (partitioning around medoids). Finally, this new online clustering technique is illustrated by determining television audience profiles from a sample of more than 5000 individual television audiences measured every minute over a period of 24 hours.

Comment: Under revision for Computational Statistics and Data Analysis
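The recursive scheme described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `online_k_medians`, the parameters `c` and `alpha`, and the step size `c / n_r**alpha` (with `n_r` the number of points assigned so far to cluster `r`) are assumptions chosen for illustration, and the data-driven step selection mentioned in the abstract is not reproduced. Each incoming point moves its nearest center by a step along the unit vector pointing toward the point, which is a stochastic gradient step for the $k$-medians loss, and a running mean of the iterates stands in for the averaged version.

```python
import numpy as np


def online_k_medians(data, init_centers, c=1.0, alpha=0.75):
    """Sketch of a recursive stochastic gradient k-medians with averaging.

    Assumptions (not taken from the paper): the step size is
    c / n_r**alpha, where n_r counts the points assigned to cluster r,
    and the averaged estimate is the running mean of the iterates of
    each center.
    """
    centers = np.array(init_centers, dtype=float)
    averaged = centers.copy()
    counts = np.zeros(len(centers), dtype=int)

    for z in np.asarray(data, dtype=float):
        # Assign the new observation to its nearest current center.
        dists = np.linalg.norm(centers - z, axis=1)
        r = int(np.argmin(dists))
        counts[r] += 1

        # Gradient step for the k-medians loss: move along the unit
        # vector from the center to the observation (skip if they coincide).
        if dists[r] > 0:
            step = c / counts[r] ** alpha
            centers[r] += step * (z - centers[r]) / dists[r]

        # Running mean of the iterates of center r (averaged version).
        averaged[r] += (centers[r] - averaged[r]) / counts[r]

    return centers, averaged
```

Because each observation is processed once and then discarded, the sketch uses memory proportional to the number of centers rather than to the sample size, which is the property that makes this family of methods suited to sequentially arriving data.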

arXiv.org e-Print Archive

HAL-uB

HAL - Université de Franche-Comté

INRIA a CCSD electronic archive server

Polyhedral Predictive Regions For Power System Applications

Authors: Golestaneh Faranak, Gooi Hoay Beng, Pinson Pierre
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Despite substantial improvement in the development of forecasting approaches, conditional and dynamic uncertainty estimates ought to be accommodated in decision-making in power system operation and markets, in order to yield either cost-optimal decisions in expectation or decisions with probabilistic guarantees. The representation of uncertainty serves as an interface between forecasting and decision-making problems, with different approaches handling various objects and their parameterization as input. Following substantial developments based on scenario-based stochastic methods, robust and chance-constrained optimization approaches have gained increasing attention. These often rely on polyhedra as a representation of the convex envelope of uncertainty. In this work, we aim to bridge the gap between the probabilistic forecasting literature and such optimization approaches by generating forecasts in the form of polyhedra with probabilistic guarantees. For that, we see polyhedra as parameterized objects under alternative definitions (under $L_1$ and $L_\infty$ norms), the parameters of which may be modelled and predicted. We additionally discuss assessing the predictive skill of such multivariate probabilistic forecasts. An application and related empirical investigation results allow us to verify the probabilistic calibration and predictive skill of our polyhedra.

Comment: 8 pages
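To make the idea of a polyhedron defined through a norm concrete, here is a small Python sketch. The abstract does not give the paper's parameterization, so everything here is an assumption for illustration: the region is taken to be a norm ball with a hypothetical `center` and `scale`, and the function names `in_region` and `empirical_coverage` are invented. Under the $L_1$ norm such a ball is a cross-polytope (a diamond in two dimensions); under the $L_\infty$ norm it is a box. The fraction of observations falling inside the region gives a simple empirical check of a nominal coverage level.

```python
import numpy as np


def in_region(points, center, scale, norm="l1"):
    """Test membership in a norm-ball polyhedron (illustrative assumption).

    The region is {x : ||(x - center) / scale|| <= 1}, with the L1 or
    L-infinity norm. `center` and `scale` are hypothetical parameters,
    not the parameterization used in the paper.
    """
    u = np.abs((np.asarray(points, dtype=float) - center) / scale)
    if norm == "l1":
        return u.sum(axis=1) <= 1.0   # cross-polytope (diamond in 2-D)
    if norm == "linf":
        return u.max(axis=1) <= 1.0   # axis-aligned box
    raise ValueError("norm must be 'l1' or 'linf'")


def empirical_coverage(points, center, scale, norm="l1"):
    """Fraction of observed points that fall inside the region."""
    return float(np.mean(in_region(points, center, scale, norm)))
```

For the same `center` and `scale`, the $L_1$ region is contained in the $L_\infty$ region, so its empirical coverage can never be higher.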

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Online Research Database In Technology

Discussion of "Multivariate quantiles and multiple-output regression quantiles: From $L_1$ optimization to halfspace depth"

Authors: Serfling Robert, Zuo Yijun
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 24/02/2010

Discussion of "Multivariate quantiles and multiple-output regression quantiles: From $L_1$ optimization to halfspace depth" by M. Hallin, D. Paindaveine and M. Siman [arXiv:1002.4486].

Comment: Published at http://dx.doi.org/10.1214/09-AOS723B in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref