Essays in High Dimensional Time Series Analysis

Author: Yousuf Kashif
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2019

Due to rapid improvements in information technology, high dimensional time series datasets are frequently encountered in a variety of fields such as macroeconomics, finance, neuroscience, and meteorology. Some examples in economics and finance include forecasting low frequency macroeconomic indicators, such as GDP or the inflation rate, or financial asset returns using a large number of macroeconomic and financial time series and their lags as possible covariates. In these settings, the number of candidate predictors (p_T) can be much larger than the number of samples (T), and accurate estimation and prediction are made possible by relying on some form of dimension reduction. Given this ubiquity of time series data, it is surprising that few works on high dimensional statistics discuss the time series setting, and even fewer have developed methods which utilize the unique features of time series data. This dissertation consists of three chapters, and each one is self contained. The first chapter deals with high dimensional predictive regressions, which are widely used in economics and finance. However, the theory and methodology are mainly developed assuming that the model is stationary with time invariant parameters. This is at odds with the prevalent evidence for parameter instability in economic time series. To remedy this, we present two L2 boosting algorithms for estimating high dimensional models in which the coefficients are modeled as functions evolving smoothly over time and the predictors are locally stationary. The first method uses componentwise local constant estimators as base learners, while the second relies on componentwise local linear estimators. We establish consistency of both methods, and address the practical issues of choosing the bandwidth for the base learners and the number of boosting iterations. 
In an extensive application to macroeconomic forecasting with many potential predictors, we find that the benefits of modeling time variation are substantial and are present across a wide range of economic series. Furthermore, these benefits increase with the forecast horizon and with the length of the time series available for estimation. This chapter is jointly written with Serena Ng. The second chapter deals with high dimensional non-linear time series models and addresses variable screening, that is, targeting predictors. Rather than assume a specific parametric model a priori, this chapter introduces several model free screening methods based on the partial distance correlation and developed specifically to deal with time dependent data. Methods are developed both for univariate models, such as nonlinear autoregressive models with exogenous predictors (NARX), and for multivariate models, such as linear or nonlinear VAR models. Sure screening properties are proved for our methods; these depend on the moment conditions and on the strength of dependence in the response and covariate processes, amongst other factors. Finite sample performance of our methods is shown through extensive simulation studies, and we show the effectiveness of our algorithms at forecasting US market returns. This chapter is jointly written with Yang Feng. The third chapter deals with variable selection for high dimensional linear stationary time series models. This chapter analyzes the theoretical properties of Sure Independence Screening (SIS), and its two stage combination with the adaptive Lasso, for high dimensional linear models with dependent and/or heavy tailed covariates and errors. We also introduce a generalized least squares screening (GLSS) procedure which utilizes the serial correlation present in the data. By utilizing this serial correlation when estimating our marginal effects, GLSS is shown to outperform SIS in many cases. 
For both procedures we prove two stage variable selection consistency when combined with the adaptive Lasso.
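
As a rough illustration of the screening idea mentioned in the abstract above, the following is a minimal sketch of marginal-correlation screening in the spirit of SIS: each predictor is scored by the absolute value of its sample correlation with the response, and only the top-ranked predictors are kept. It is not the thesis's procedure. The function names (`pearson`, `sis_screen`), the toy data, and the choice of keeping five predictors are assumptions made for illustration, and the sketch ignores serial dependence, so it does not reflect GLSS or the adaptive Lasso second stage.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no marginal signal
    return sxy / math.sqrt(sxx * syy)


def sis_screen(columns, y, keep):
    """Indices of the `keep` predictors with the largest |marginal correlation|."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Toy data (assumption: i.i.d. Gaussian draws, not data from the thesis).
rng = random.Random(0)
T, p = 150, 200  # more candidate predictors than samples
columns = [[rng.gauss(0, 1) for _ in range(T)] for _ in range(p)]

# Only predictors 3 and 17 drive the response; the rest are pure noise.
y = [2.0 * columns[3][t] - 2.0 * columns[17][t] + 0.5 * rng.gauss(0, 1)
     for t in range(T)]

selected = sis_screen(columns, y, keep=5)
print(selected)
```

In this toy setting the two active predictors have much larger marginal correlations than the noise predictors, so they survive screening; a second-stage selector would then work on the reduced set.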

Columbia University Academic Commons

Forecasting the CATS benchmark with the Double Vector Quantization method

Authors: Cottrell Marie, Lee John, Simon Geoffroy, Verleysen Michel
Publication date: 01/01/2007

The Double Vector Quantization method, a long-term forecasting method based on the SOM algorithm, has been used to predict the 100 missing values of the CATS competition data set. An analysis of the proposed time series is provided to estimate the dimension of the auto-regressive part of this nonlinear auto-regressive forecasting method. Based on this analysis, experimental results using the Double Vector Quantization (DVQ) method are presented and discussed. As one of the features of the DVQ method is its ability to predict scalars as well as vectors of values, the number of iterative predictions needed to reach the prediction horizon is further observed. The long-term stability of the method yields reliable values over a rather long forecasting horizon. Comment: Accepted for publication in Neurocomputing, Elsevier

arXiv.org e-Print Archive

CiteSeerX

DIAL UCLouvain

HAL-Paris1

Relationship between degree of efficiency and prediction in stock price changes

Authors: Eom Cheoljun, Jung Woo-Sung, Oh Gabjin
Publication venue: 'Elsevier BV'
Publication date: 30/08/2007

This study investigates empirically whether the degree of stock market efficiency is related to the prediction power of future price changes, using the indices of twenty-seven stock markets. Efficiency refers to the weak-form efficient market hypothesis (EMH) in terms of the information of past price changes. The prediction power corresponds to the hit-rate, which is the rate of consistency between the direction of the actual price change and that of the predicted one, calculated out-of-sample by the nearest neighbor prediction method (NN method). In this manuscript, the Hurst exponent and the approximate entropy (ApEn) are used as quantitative measurements of the degree of efficiency. The relationship between the Hurst exponent, reflecting the various time correlation properties, and the ApEn value, reflecting the randomness in the time series, shows a negative correlation. However, the average prediction power on the direction of future price change has a strongly positive correlation with the Hurst exponent, and a negative correlation with the ApEn. Therefore, a market index with less market efficiency has higher prediction power for future price change than one with higher market efficiency when we analyze the market using the past price change pattern. Furthermore, we show that the Hurst exponent, a measurement of the long-term memory property, provides more significant information in terms of prediction of future price changes than the ApEn and the NN method. Comment: 10 pages
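
Two of the quantities named in the abstract above can be sketched briefly. The block below is a hedged illustration, not the authors' code: `hit_rate` computes the share of periods in which the predicted and actual price changes have the same direction, and `approximate_entropy` follows the standard ApEn(m, r) definition with the maximum-coordinate distance and self-matches counted. The function names, the absolute tolerance `r`, and the toy series are assumptions. The nearest neighbor predictor and the Hurst exponent are not implemented here.

```python
import math
import random


def hit_rate(actual_changes, predicted_changes):
    """Fraction of periods where predicted and actual changes share a direction.

    A change of exactly zero is treated as "not up" (an assumption).
    """
    pairs = list(zip(actual_changes, predicted_changes))
    hits = sum(1 for a, p in pairs if (a > 0) == (p > 0))
    return hits / len(pairs)


def approximate_entropy(series, m=2, r=0.2):
    """ApEn(m, r): phi(m) - phi(m + 1), where r is an absolute tolerance."""
    n = len(series)

    def phi(dim):
        # All overlapping template vectors of length `dim`.
        vectors = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for v in vectors:
            # Count templates within tolerance r (self-match included).
            matches = sum(
                1 for w in vectors
                if max(abs(a - b) for a, b in zip(v, w)) <= r
            )
            total += math.log(matches / len(vectors))
        return total / len(vectors)

    return phi(m) - phi(m + 1)


# Toy inputs (assumptions, not market data).
rate = hit_rate([1.0, -1.0, 2.0, -3.0], [0.5, -0.2, -1.0, -4.0])

periodic = [1.0, 2.0] * 30               # perfectly regular series
rng = random.Random(0)
noisy = [rng.random() for _ in range(60)]  # irregular series

apen_periodic = approximate_entropy(periodic)
apen_noisy = approximate_entropy(noisy)
print(rate, apen_periodic, apen_noisy)
```

A regular series gives an ApEn close to zero, while an irregular one gives a larger value, which matches the abstract's reading of ApEn as a measure of randomness.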

arXiv.org e-Print Archive

Crossref

포항공과대학교

Leverage Financial News to Predict Stock Price Movements Using Word Embeddings and Deep Neural Networks

Authors: Jiang Hui, Peng Yangtuo
Publication date: 23/06/2015

Financial news contains useful information on public companies and the market. In this paper we apply popular word embedding methods and deep neural networks to leverage financial news to predict stock price movements in the market. Experimental results show that our proposed methods are simple but very effective, and that they significantly improve stock prediction accuracy on a standard financial database over the baseline system, which uses only historical price information. Comment: 5 pages, 2 figures, technical report

arXiv.org e-Print Archive

Crossref

Measures of Analysis of Time Series (MATS): A MATLAB Toolkit for Computation of Multiple Measures on Time Series Data Bases

Authors: Kugiumtzis Dimitris, Tsimpiris Alkiviadis
Publication date: 01/01/2010

In many applications, such as physiology and finance, large time series data bases are to be analyzed, requiring the computation of linear, nonlinear and other measures. Such measures have been developed and implemented in commercial and freeware software rather selectively and independently. The Measures of Analysis of Time Series (MATS) MATLAB toolkit is designed to handle an arbitrarily large set of scalar time series and compute a large variety of measures on them, allowing for the specification of varying measure parameters as well. The variety of options, with added facilities for visualization of the results, supports different settings of time series analysis, such as the detection of dynamics changes in long data records, resampling (surrogate or bootstrap) tests for independence and linearity with various test statistics, and the discrimination power of different measures and of different combinations of their parameters. The basic features of MATS are presented and the implemented measures are briefly described. The usefulness of MATS is illustrated on some empirical examples along with screenshots. Comment: 25 pages, 9 figures, two tables, the software can be downloaded at http://eeganalysis.web.auth.gr/indexen.ht

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Journal of Statistical Software