
    Forecasting Time Series from Clusters.

    Forecasting large numbers of time series is a costly and time-consuming exercise. Before forecasting a large number of series that are logically connected in some way, one can first cluster them into groups of similar series. In this paper the authors investigate forecasting the series in each cluster. Similar series are first grouped together using a clustering procedure that is based on a test of hypothesis. The series in each cluster are then pooled together and forecasts are obtained. Simulated results show that this procedure for forecasting similar series performs reasonably well.
    Keywords: Autoregressive models; Clustering technique; Mean square forecast error; Pooled series.
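
The pooling step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the AR(1) order, the zero-mean series, the least-squares estimator, and the function names are all hypothetical choices, since the abstract does not specify the model order or the estimation method.

```python
def pooled_ar1_coefficient(cluster):
    """Least-squares AR(1) coefficient estimated from all series in a cluster.

    Assumption: each series is zero-mean. Lagged pairs (x[t-1], x[t]) from
    every series are pooled into one regression through the origin.
    """
    num = 0.0
    den = 0.0
    for series in cluster:
        for prev, curr in zip(series[:-1], series[1:]):
            num += prev * curr
            den += prev * prev
    if den == 0.0:
        raise ValueError("cluster has no variation to estimate from")
    return num / den


def one_step_forecasts(cluster):
    """One-step-ahead forecast for each series using the pooled coefficient."""
    phi = pooled_ar1_coefficient(cluster)
    return [phi * series[-1] for series in cluster]
```

Calling `one_step_forecasts(cluster)` on a list of similar series returns one forecast per series, all driven by the same pooled coefficient, so only one model is estimated per cluster.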

    Comparison of time series with unequal length in the frequency domain

    In statistical data analysis it is often important to compare, classify, and cluster different time series. For these purposes various methods have been proposed in the literature, but they usually assume time series with the same sample size. In this paper, we propose a spectral domain method for handling time series of unequal length. The method makes the spectral estimates comparable by producing statistics at the same frequencies. The procedure is compared with other methods proposed in the literature by a Monte Carlo simulation study. As an illustrative example, the proposed spectral method is applied to cluster industrial production series of some developed countries.
    Keywords: Autocorrelation function; Cluster analysis; Interpolated periodogram; Reduced periodogram; Spectral analysis; Time series; Zero-padding.
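
One way to put periodograms of unequal-length series on a common frequency grid is zero-padding, which appears among the keywords. The sketch below is not necessarily the authors' procedure: the direct DFT, the division by the original length, the Euclidean distance, and the function names are assumptions made for illustration.

```python
import cmath
import math


def padded_periodogram(x, n_pad):
    """Periodogram of x evaluated at the Fourier frequencies of length n_pad.

    The series is mean-centred, padded with zeros to n_pad points, and the
    squared DFT modulus is divided by the original length n (an assumed
    normalisation). Returns ordinates for j = 1 .. n_pad // 2.
    """
    n = len(x)
    if n_pad < n:
        raise ValueError("n_pad must be at least len(x)")
    mean = sum(x) / n
    padded = [v - mean for v in x] + [0.0] * (n_pad - n)
    ordinates = []
    for j in range(1, n_pad // 2 + 1):
        w = 2.0 * math.pi * j / n_pad
        dft = sum(v * cmath.exp(-1j * w * t) for t, v in enumerate(padded))
        ordinates.append(abs(dft) ** 2 / n)
    return ordinates


def periodogram_distance(x, y):
    """Euclidean distance between periodograms on a common frequency grid."""
    n_pad = max(len(x), len(y))
    px = padded_periodogram(x, n_pad)
    py = padded_periodogram(y, n_pad)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(px, py)))
```

Because both series are padded to the length of the longer one, the two lists of ordinates have the same size and refer to the same frequencies, so they can be compared element by element.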

    Fuzzy clustering of univariate and multivariate time series by genetic multiobjective optimization

    Given a set of time series, it is of interest to discover subsets that share similar properties. For instance, this may be useful for identifying and estimating a single model that conveniently fits several time series, instead of performing the usual identification and estimation steps for each one. On the other hand, time series in the same cluster are related with respect to the measures assumed for cluster analysis and are suitable for building multivariate time series models. Though many approaches to clustering time series exist, in this view the most effective method seems to rely on choosing some features relevant to the problem at hand and seeking clusters according to their measurements, for instance the autoregressive coefficients, spectral measures or the eigenvectors of the covariance matrix. Some new indexes based on goodness-of-fit criteria are proposed in this paper for fuzzy clustering of multivariate time series. A general purpose fuzzy clustering algorithm may be used to estimate the proper cluster structure according to some internal criteria of cluster validity. Such indexes are known to measure distinct, often conflicting, cluster properties: compactness or connectedness, for instance, or distribution, orientation, size and shape. It is argued that multiobjective optimization supported by genetic algorithms is a most effective choice in such a difficult context. In this paper we use the Xie-Beni index and the C-means functional as objective functions to evaluate the cluster validity in a multiobjective optimization framework. The concept of Pareto optimality in multiobjective genetic algorithms is used to evolve a set of potential solutions towards a set of optimal non-dominated solutions. Genetic algorithms are well suited to difficult optimization problems where objective functions do not usually have good mathematical properties such as continuity, differentiability or convexity. In addition, genetic algorithms, as population-based methods, may yield a complete Pareto front at each step of the iterative evolutionary procedure. The method is illustrated by means of a set of real data and an artificial multivariate time series data set.
    Keywords: Fuzzy clustering; Internal criteria of cluster validity; Genetic algorithms; Multiobjective optimization; Time series; Pareto optimality.
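
The two objective functions named above can be written down compactly. The sketch below uses the commonly stated forms of the fuzzy C-means functional and the Xie-Beni index; the fuzzifier default `m=2.0`, the data layout (`memberships[i][k]` is the membership of point `k` in cluster `i`), and the function names are assumptions, and the feature vectors would be whatever the analyst extracts from each series (for example autoregressive coefficients).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def cmeans_functional(points, centers, memberships, m=2.0):
    """J_m = sum_i sum_k u[i][k]**m * ||x_k - v_i||^2 (lower is more compact)."""
    return sum(
        memberships[i][k] ** m * sq_dist(x, v)
        for i, v in enumerate(centers)
        for k, x in enumerate(points)
    )


def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni index: J_m / (n * minimum squared distance between centers).

    Requires at least two distinct centers; lower values indicate compact,
    well-separated clusters.
    """
    min_sep = min(
        sq_dist(centers[i], centers[j])
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    return cmeans_functional(points, centers, memberships, m) / (len(points) * min_sep)
```

In a multiobjective setting, a candidate partition would be scored on both values at once, and one candidate dominates another only if it is no worse on both and strictly better on at least one.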

    Cluster analysis of financial time series

    Master's in Mathematical Finance. This thesis applies the Signature method as a measure of similarity between two time-series objects, using the Signature properties of order 2, and applies it to Asymmetric Spectral Clustering. The method is compared with a more traditional clustering approach where similarities are measured using Dynamic Time Warping, which was developed to work with time-series data. The intention is to consider the traditional approach as a benchmark and compare it to the Signature method in terms of computation time, performance, and applications. These methods are applied to a financial time series data set of Mutual Exchange Funds from Luxembourg. After the literature review, we introduce the Dynamic Time Warping method and the Signature method. We continue with the explanation of traditional clustering approaches, namely k-Means, and asymmetric clustering techniques, namely the k-Axes algorithm, developed by Atev (2011). The last chapter is dedicated to practical research where the previous methods are applied to the data set. Results confirm that the Signature method does indeed have potential for machine learning and prediction, as suggested by Levin, Lyons, and Ni (2013).
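
For readers unfamiliar with the benchmark similarity measure, a minimal sketch of the textbook Dynamic Time Warping recursion follows. It is not taken from the thesis: the absolute-difference local cost, the absence of a warping window, and the function name are assumptions.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost.

    cost[i][j] holds the cheapest cumulative cost of aligning the first i
    points of a with the first j points of b.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a, step in b, or step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

The quadratic table makes the cost grow with the product of the two series lengths, which is one reason computation time is a natural point of comparison between similarity measures.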

    Long-Range Dependence in Financial Markets: a Moving Average Cluster Entropy Approach

    A perspective is taken on the intangible complexity of economic and social systems by investigating the underlying dynamical processes that produce, store and transmit information in financial time series in terms of the moving average cluster entropy. An extensive analysis has evidenced market and horizon dependence of the moving average cluster entropy in real world financial assets. The origin of the behavior is scrutinized by applying the moving average cluster entropy approach to long-range correlated stochastic processes such as the Autoregressive Fractionally Integrated Moving Average (ARFIMA) and Fractional Brownian motion (FBM). To that end, an extensive set of series is generated with a broad range of values of the Hurst exponent $H$ and of the autoregressive, differencing and moving average parameters $p, d, q$. A systematic relation between moving average cluster entropy, Market Dynamic Index and long-range correlation parameters $H$, $d$ is observed. This study shows that the characteristic behaviour exhibited by the horizon dependence of the cluster entropy is related to long-range positive correlation in financial markets. Specifically, long-range positively correlated ARFIMA processes with differencing parameter $d \simeq 0.05$, $d \simeq 0.15$ and $d \simeq 0.25$ are consistent with moving average cluster entropy results obtained in time series of DJIA, S&P500 and NASDAQ.
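
To make the role of the differencing parameter concrete, the sketch below generates an ARFIMA(0, d, 0) series from a truncated moving-average expansion of $(1 - B)^{-d}$. It is an illustration only and not the authors' generator: the truncation length `n_terms`, the Gaussian noise, the seed handling, and the function names are assumptions. For this process the usual correspondence with the Hurst exponent is $H = d + 1/2$.

```python
import random


def ma_weights(d, n_terms):
    """Coefficients psi_k of (1 - B)^(-d), via psi_k = psi_{k-1} * (k - 1 + d) / k."""
    psi = [1.0]
    for k in range(1, n_terms):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi


def simulate_arfima_0d0(d, n, seed=0, n_terms=500):
    """Truncated MA(infinity) simulation of ARFIMA(0, d, 0) with Gaussian noise.

    Each output value is a weighted sum of the most recent n_terms noise
    draws, so the first n_terms draws act as burn-in history.
    """
    rng = random.Random(seed)
    psi = ma_weights(d, n_terms)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + n_terms)]
    return [
        sum(psi[k] * noise[t + n_terms - k] for k in range(n_terms))
        for t in range(n)
    ]
```

For positive d the weights decay slowly (hyperbolically rather than geometrically), which is the source of the long-range positive correlation discussed in the abstract.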

    Efficient Optimization of Echo State Networks for Time Series Datasets

    Echo State Networks (ESNs) are recurrent neural networks that only train their output layer, thereby precluding the need to backpropagate gradients through time, which leads to significant computational gains. Nevertheless, a common issue in ESNs is determining their hyperparameters, which are crucial in instantiating a well-performing reservoir, but are often set manually or using heuristics. In this work we optimize the ESN hyperparameters using Bayesian optimization which, given a limited budget of function evaluations, outperforms a grid search strategy. In the context of large volumes of time series data, such as light curves in the field of astronomy, we can further reduce the optimization cost of ESNs. In particular, we wish to avoid tuning hyperparameters per individual time series as this is costly; instead, we want to find ESNs with hyperparameters that perform well not just on individual time series but rather on groups of similar time series without sacrificing predictive performance significantly. This naturally leads to a notion of clusters, where each cluster is represented by an ESN tuned to model a group of time series of similar temporal behavior. We demonstrate this approach both on synthetic datasets and real world light curves from the MACHO survey. We show that our approach results in a significant reduction in the number of ESN models required to model a whole dataset, while retaining predictive performance for the series in each cluster.

    Bursts of extensive air showers: chaos vs. stochasticity

    Bursts of the count rate of extensive air showers (EAS) lead to the appearance of clusters in time series that represent EAS arrival times. We apply methods of nonlinear time series analysis to twenty EAS cluster events found in the data set obtained with the EAS-1000 prototype array. In particular, we use the Grassberger-Procaccia algorithm to compute the correlation dimension of the time series in the vicinity of the clusters. We find that four cluster events produce signs of chaos in the corresponding time series. By applying a number of supplementary methods we assess that the nature of the observed behaviour of the correlation dimension is likely to be deterministic. We suggest a simple qualitative model that might explain the origin of clusters in general and "possibly chaotic" clusters in particular. Finally, we compare our conclusions with the results of similar investigations performed by the EAS-TOP and LAAS groups.
    Comment: An extended version of the paper to be submitted to Astroparticle Physics. Version 2: 22 pages, discussion extended, the main part shortened, accepted for publication. Version 1 is still valid (up to a number of typos).
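
The core quantity in the Grassberger-Procaccia approach is the correlation sum C(r), whose log-log slope against r estimates the correlation dimension. The sketch below is a bare-bones version under stated assumptions: the delay embedding parameters, the Euclidean norm, the two-radius slope estimate, and the function names are illustrative choices and are not taken from the paper.

```python
import math


def delay_embed(x, dim, lag=1):
    """Delay-coordinate vectors (x[t], x[t+lag], ..., x[t+(dim-1)*lag])."""
    n_vectors = len(x) - (dim - 1) * lag
    return [tuple(x[t + k * lag] for k in range(dim)) for t in range(n_vectors)]


def correlation_sum(vectors, r):
    """Fraction of distinct vector pairs whose Euclidean distance is below r."""
    n = len(vectors)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(vectors[i], vectors[j]) < r:
                close += 1
    return 2.0 * close / (n * (n - 1))


def correlation_dimension_slope(vectors, r1, r2):
    """Slope of log C(r) against log r between two radii r1 < r2.

    Both correlation sums must be positive for the logarithms to exist.
    """
    c1 = correlation_sum(vectors, r1)
    c2 = correlation_sum(vectors, r2)
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))
```

In practice the slope is read off over a range of radii and for increasing embedding dimensions; a slope that saturates as the dimension grows is the usual hint of low-dimensional deterministic behaviour, whereas for noise it keeps increasing.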