Estimation of intrinsic dimension via clustering

Author: Crovella Mark
Eriksson Brian
Publication venue: Computer Science Department, Boston University
Publication date: 12/05/2011

The problem of estimating the intrinsic dimension of a set of points in high-dimensional space is a critical issue for a wide range of disciplines, including genomics, finance, and networking. The computational complexity of current estimation techniques depends on either the ambient or the intrinsic dimension, which may make these methods intractable for large data sets. In this paper, we present a clustering-based methodology that exploits the inherent self-similarity of data to efficiently estimate the intrinsic dimension of a set of points. When the data satisfies a specified general clustering condition, we prove that the estimated dimension approaches the true Hausdorff dimension. Experiments show that the clustering-based approach allows for more efficient and accurate intrinsic dimension estimation compared with all prior techniques, even when the data does not conform to obvious self-similarity structure. Finally, we present empirical results which show the clustering-based estimation allows for a natural partitioning of the data points that lie on separate manifolds of varying intrinsic dimension.

Boston University Institutional Repository (OpenBU)

A New Estimator of Intrinsic Dimension Based on the Multipoint Morisita Index

Author: Golay Jean
Kanevski Mikhail
Publication date: 06/05/2015

The size of datasets has been increasing rapidly, both in the number of variables and in the number of events. As a result, the empty space phenomenon and the curse of dimensionality complicate the extraction of useful information. In general, however, data lie on non-linear manifolds of much lower dimension than that of the spaces in which they are embedded. In many pattern recognition tasks, learning these manifolds is a key issue and requires knowledge of their true intrinsic dimension. This paper introduces a new estimator of intrinsic dimension based on the multipoint Morisita index. It is applied to both synthetic and real datasets of varying complexity, and comparisons with other existing estimators are carried out. The proposed estimator turns out to be fairly robust to sample size and noise, unaffected by edge effects, able to handle large datasets, and computationally efficient.

arXiv.org e-Print Archive

CiteSeerX

Serveur académique lausannois

Investigating dynamic dependence using copulae

Author: Bouyé Eric
Gaussel Nicolas
Salmon Mark H.
Publication venue: Warwick Business School Financial Econometrics Research Centre
Publication date: 01/01/2001

A general methodology for time series modelling is developed which works down from distributional properties to implied structural models, including the standard regression relationship. This general-to-specific approach is important since it can avoid spurious assumptions such as linearity in the form of the dynamic relationship between variables. It is based on splitting the multivariate distribution of a time series into two parts: (i) the marginal unconditional distribution, and (ii) the serial dependence encompassed in a general function, the copula. General properties of the class of copula functions that fulfill the necessary requirements for Markov chain construction are set out. Special cases for the Gaussian copula with AR(p) dependence structure and for Archimedean copulae are presented. We also develop copula-based dynamic dependency measures: auto-concordance in place of autocorrelation. Finally, we provide empirical applications using financial returns and transactions-based forex data. Our model encompasses the AR(p) model and allows non-linearity. Moreover, we introduce non-linear time dependence functions that generalize the autocorrelation function.

CiteSeerX

Warwick Research Archives Portal Repository
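
The split described in the abstract above (marginal distribution on one side, serial dependence carried by a copula on the other) can be illustrated with a small simulation. This is only a minimal sketch, not the authors' method: it assumes a Gaussian copula with AR(1) dependence (the simplest case of AR(p)) and an exponential marginal chosen purely for illustration. The function names `simulate_gaussian_copula_ar1` and `exponential_inverse_cdf` are hypothetical.

```python
import math

import numpy as np


def simulate_gaussian_copula_ar1(n, rho, inverse_cdf, seed=0):
    """Simulate a series whose serial dependence is a Gaussian copula with AR(1) structure.

    Returns (x, u, z): the observed series, its uniform ranks, and the latent Gaussian series.
    """
    rng = np.random.default_rng(seed)
    z = np.empty(n)
    z[0] = rng.standard_normal()
    # Scaling the innovations keeps the latent series at unit variance.
    innovation_scale = math.sqrt(1.0 - rho ** 2)
    for t in range(1, n):
        z[t] = rho * z[t - 1] + innovation_scale * rng.standard_normal()

    # Standard normal CDF maps the latent series to uniform(0, 1) values.
    u = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])
    # Guard against values of exactly 0 or 1 before inverting the marginal CDF.
    u = np.clip(u, 1e-12, 1.0 - 1e-12)

    # The marginal distribution is imposed separately from the dependence.
    return inverse_cdf(u), u, z


def exponential_inverse_cdf(u, rate=1.0):
    """Inverse CDF of an exponential distribution (illustrative marginal only)."""
    return -np.log1p(-u) / rate


x, u, z = simulate_gaussian_copula_ar1(5000, rho=0.6, inverse_cdf=exponential_inverse_cdf, seed=1)
lag1 = np.corrcoef(z[:-1], z[1:])[0, 1]
print(f"latent lag-1 correlation: {lag1:.2f}, sample mean of x: {x.mean():.2f}")
```

Swapping `exponential_inverse_cdf` for another inverse CDF changes the marginal without touching the dependence structure, which is the point of the decomposition.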

Exact Dimensionality Selection for Bayesian PCA

Author: Bouveyron Charles
Latouche Pierre
Mattei Pierre-Alexandre
Publication date: 21/05/2019

We present a Bayesian model selection approach to estimate the intrinsic dimensionality of a high-dimensional dataset. To this end, we introduce a novel formulation of the probabilistic principal component analysis model based on a normal-gamma prior distribution. In this context, we exhibit a closed-form expression of the marginal likelihood, which allows us to infer an optimal number of components. We also propose a heuristic based on the expected shape of the marginal likelihood curve in order to choose the hyperparameters. In non-asymptotic frameworks, we show on simulated data that this exact dimensionality selection approach is competitive with both Bayesian and frequentist state-of-the-art methods.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

A comparative evaluation of nonlinear dynamics methods for time series prediction

Author: Camastra F.
Filippone M.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

A key problem in time series prediction using autoregressive models is to fix the model order, namely the number of past samples required to model the time series adequately. Estimating the model order using cross-validation may be a long process. In this paper, we investigate alternatives to cross-validation based on nonlinear dynamics methods, namely the Grassberger-Procaccia, Kégl, Levina-Bickel and False Nearest Neighbors algorithms. The experiments have been performed in two different ways. In the first case, the model order has been used to carry out the prediction, performed by an SVM for regression on three real data time series, showing that nonlinear dynamics methods perform very close to cross-validation. In the second case, we have tested the accuracy of nonlinear dynamics methods in predicting the known model order of synthetic time series. In this case, most of the methods have yielded a correct estimate and, when the estimate was not correct, the value was very close to the real one.

Archivio della ricerca - Università degli studi di Napoli "Parthenope"

Enlighten

White Rose Research Online
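
Of the methods named in the abstract above, the Levina-Bickel estimator is compact enough to sketch. The following is a minimal illustration of the nearest-neighbour maximum-likelihood idea, not the code used in the paper. It assumes distinct points, a dense distance matrix (so only a few thousand points), and the 1/(k-1) normalization; the function name `levina_bickel_dimension` and the test data are hypothetical.

```python
import numpy as np


def levina_bickel_dimension(points, k=20):
    """Average the Levina-Bickel local MLE of intrinsic dimension over all points.

    Assumes the rows of `points` are distinct, so no neighbour distance is zero.
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    if not 2 <= k < n:
        raise ValueError("k must satisfy 2 <= k < number of points")

    # Dense pairwise Euclidean distances: fine for a few thousand points.
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diffs, axis=2)

    # After sorting each row, column 0 is the point itself (distance 0);
    # columns 1..k hold the distances T_1 <= ... <= T_k to the k nearest neighbours.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]

    # Local estimate: inverse of the mean of log(T_k / T_j) for j = 1..k-1.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    local = (k - 1) / log_ratios.sum(axis=1)
    return float(local.mean())


# Hypothetical check: a 2-D unit square embedded isometrically in 5-D ambient space.
rng = np.random.default_rng(0)
latent = rng.random((1000, 2))
basis, _ = np.linalg.qr(rng.standard_normal((5, 2)))  # orthonormal columns
embedded = latent @ basis.T
estimate = levina_bickel_dimension(embedded, k=20)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

For model-order selection as described in the abstract, the rows of `points` would be delay vectors built from the time series rather than synthetic samples; that step is omitted here.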

Lumpy Price Adjustments: A Microeconometric Analysis

Author: Dhyne Emmanuel
Fuss Catherine
Pesaran M. Hashem
Sevestre Patrick
Publication venue: Faculty of Economics
Publication date: 01/04/2007

This paper presents a simple model of state-dependent pricing that allows identification of the relative importance of the degree of price rigidity that is inherent to the price setting mechanism (intrinsic) and that which is due to the price’s driving variables (extrinsic). Using two data sets consisting of a large fraction of the price quotes used to compute the Belgian and French CPI, we are able to assess the role of intrinsic and extrinsic price stickiness in explaining the occurrence and magnitude of price changes at the outlet level. We find that infrequent price changes are not necessarily associated with large adjustment costs. Indeed, extrinsic rigidity appears to be significant in many cases. We also find that asymmetry in the price adjustment could be due to trends in marginal costs and/or desired mark-ups rather than asymmetric cost of adjustment bands.

Apollo (Cambridge)