Quantization/clustering: when and why does k-means work?
Though mostly used as a clustering algorithm, k-means was originally designed
as a quantization algorithm. Namely, it aims at providing a compression of a
probability distribution with k points. Building upon [21, 33], we investigate
how and when these two approaches are compatible. Namely, we show that,
provided the sample distribution satisfies a margin-like condition (in the
sense of [27] for supervised learning), both the associated empirical risk
minimizer and the output of Lloyd's algorithm provide almost optimal
classification in certain cases (in the sense of [6]). Besides, we also show
that they achieve fast and optimal convergence rates in terms of sample size
and compression risk.
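Lloyd's algorithm, whose output the abstract analyzes, alternates nearest-centroid assignment with centroid recomputation. A minimal NumPy sketch of the quantization view (the data, k, seeds, and function name here are chosen purely for illustration, not taken from the paper):

```python
import numpy as np

def lloyd(points, k, n_iter=50, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid recomputation; returns the k codebook points and labels."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest centroid
        dists = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return centers, labels

# two tight, well-separated blobs: a margin-like condition plausibly holds,
# so the quantizer's cells also recover the clusters
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
centers, labels = lloyd(pts, k=2)
```

When the blobs are this well separated, the two codebook points settle near the blob means, illustrating how the compression objective and the clustering objective coincide under a margin condition.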
Can jumps improve the futures margin level? An empirical study based on an SE-SVCJ-GPD model
In addition to leptokurtic, fat-tailed distributions,
financial series also exhibit pronounced volatility and jumps.
Moreover, jumps exhibit self-exciting and clustering characteristics
under extreme events. However, studies on dynamic margin levels
often ignore jumps. In this study, we combine the self-exciting
stochastic volatility with correlated jumps (SE-SVCJ) model with a
generalized Pareto distribution (GPD) to measure the optimal
margin level for the stock index futures market. Value at risk (VaR)
is estimated and forecasted using the SE-SVCJ-GPD, SVCJ-GPD,
and generalized autoregressive conditional heteroskedasticity with
GPD (GARCH-GPD) models. The SE-SVCJ-GPD model can cover more risk
in both the long and short trading positions of stock index futures contracts.
Moreover, the backtesting experiment results show that
the SE-SVCJ-GPD model provides a more accurate margin level
forecast than the other methods in both positions. This study’s
findings have practical significance and theoretical value for
assessing the level of risk and taking corresponding risk-prevention
measures.
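The GPD tail component shared by all three models can be sketched with the standard peaks-over-threshold (POT) VaR estimator. This is a minimal sketch of that one ingredient, not the paper's SE-SVCJ-GPD model; the synthetic losses, threshold choice, and the helper name `pot_var` are assumptions made here for illustration:

```python
import numpy as np
from scipy.stats import genpareto

def pot_var(losses, threshold, p=0.99):
    """Peaks-over-threshold VaR: fit a GPD to the exceedances over
    `threshold`, then invert the standard POT tail estimator at level p."""
    exceed = losses[losses > threshold] - threshold
    n, n_u = len(losses), len(exceed)
    xi, _, sigma = genpareto.fit(exceed, floc=0)  # shape, loc, scale
    return threshold + (sigma / xi) * ((n / n_u * (1 - p)) ** (-xi) - 1.0)

# heavy-tailed synthetic losses stand in for futures returns
rng = np.random.default_rng(0)
losses = np.abs(rng.standard_t(3, size=5000))
u = np.quantile(losses, 0.9)        # threshold at the 90th percentile
var99 = pot_var(losses, u, p=0.99)  # a candidate margin level
```

In the paper's setting, the conditional volatility and jump intensity from the SE-SVCJ filter would scale this tail quantile dynamically rather than applying it to raw losses.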
Minimum Density Hyperplanes
Associating distinct groups of objects (clusters) with contiguous regions of
high probability density (high-density clusters) is central to many
statistical and machine learning approaches to the classification of unlabelled
data. We propose a novel hyperplane classifier for clustering and
semi-supervised classification which is motivated by this objective. The
proposed minimum density hyperplane minimises the integral of the empirical
probability density function along it, thereby avoiding intersection with high
density clusters. We show that the minimum density and the maximum margin
hyperplanes are asymptotically equivalent, thus linking this approach to
maximum margin clustering and semi-supervised support vector classifiers. We
propose a projection pursuit formulation of the associated optimisation problem
which allows us to find minimum density hyperplanes efficiently in practice,
and evaluate its performance on a range of benchmark datasets. The proposed
approach is found to be very competitive with state-of-the-art methods for
clustering and semi-supervised classification.
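For a fixed projection direction, the integral of the density along a hyperplane orthogonal to that direction reduces to the projected one-dimensional density at the split point. The sketch below shows that one-direction step only (the paper additionally optimises the direction by projection pursuit); the data, KDE choice, and the name `min_density_split` are illustrative assumptions:

```python
import numpy as np
from scipy.stats import gaussian_kde

def min_density_split(points, direction):
    """Project the data onto `direction`, estimate the projected density
    with a KDE, and split at the lowest-density point away from the tails."""
    v = np.asarray(direction, float)
    v /= np.linalg.norm(v)
    proj = points @ v
    kde = gaussian_kde(proj)
    # search between the 10th and 90th percentiles to avoid trivial
    # low-density splits in the tails
    grid = np.linspace(np.quantile(proj, 0.1), np.quantile(proj, 0.9), 200)
    split = grid[np.argmin(kde(grid))]
    return split, (proj > split).astype(int)

# two well-separated Gaussian blobs; the low-density valley lies between them
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal([0, 0], 0.3, (50, 2)),
                 rng.normal([5, 0], 0.3, (50, 2))])
split, labels = min_density_split(pts, direction=[1.0, 0.0])
```

The hyperplane through `split`, orthogonal to the chosen direction, passes through the density valley and so avoids intersecting either high-density cluster.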
Maximum Margin Clustering for State Decomposition of Metastable Systems
When studying a metastable dynamical system, a prime concern is how to
decompose the phase space into a set of metastable states. Unfortunately, the
metastable state decomposition based on simulation or experimental data is
still a challenge. The most popular and simplest approach is geometric
clustering, which builds on classical clustering techniques.
However, the prerequisites of this approach are: (1) data are obtained from
simulations or experiments which are in global equilibrium and (2) the
coordinate system is appropriately selected. Recently, the kinetic clustering
approach based on phase space discretization and transition probability
estimation has drawn much attention due to its applicability to more general
cases, but the choice of discretization policy is a difficult task. In this
paper, a new decomposition method designated as maximum margin metastable
clustering is proposed, which converts the problem of metastable state
decomposition to a semi-supervised learning problem so that the large margin
technique can be utilized to search for the optimal decomposition without phase
space discretization. Moreover, several simulation examples are given to
illustrate the effectiveness of the proposed method.
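The core idea of casting decomposition as a semi-supervised large-margin problem can be sketched as follows: label only the frames that lie confidently inside each presumed metastable core, then let a max-margin classifier place the boundary between states. This is a minimal sketch of that idea, not the paper's full algorithm; the one-dimensional double-well trajectory and the core thresholds are assumptions made here:

```python
import numpy as np
from sklearn.svm import SVC

# synthetic trajectory from a 1-D double-well system: two metastable wells
rng = np.random.default_rng(0)
traj = np.concatenate([rng.normal(-1, 0.2, 300), rng.normal(1, 0.2, 300)])
X = traj.reshape(-1, 1)

# label only frames deep inside each presumed metastable core
core_a = X[traj < -1.0]   # confidently in the left well
core_b = X[traj > 1.0]    # confidently in the right well
X_lab = np.vstack([core_a, core_b])
y_lab = np.array([0] * len(core_a) + [1] * len(core_b))

# a max-margin (linear SVM) boundary decomposes the phase space
clf = SVC(kernel="linear", C=10.0).fit(X_lab, y_lab)
states = clf.predict(X)   # assign every frame, with no discretization
```

The margin-maximising boundary falls in the sparsely populated transition region between the wells, which is exactly where a metastable state boundary should lie.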
Anisotropic oracle inequalities in noisy quantization
The effect of errors in variables in quantization is investigated. We prove
general exact and non-exact oracle inequalities with fast rates for an
empirical minimization based on a noisy sample Z_i = X_i + ε_i, i = 1, ..., n,
where the X_i are i.i.d. with density f and the ε_i are i.i.d. with density η.
These rates depend on the geometry of the density f and the asymptotic
behaviour of the characteristic function of η.
This general study can be applied to the problem of k-means clustering with
noisy data. For this purpose, we introduce a deconvolution k-means stochastic
minimization which reaches fast rates of convergence under Pollard's standard
regularity assumptions.