Essays in High Dimensional Time Series Analysis
Due to rapid improvements in information technology, high dimensional time series datasets are frequently encountered in a variety of fields such as macroeconomics, finance, neuroscience, and meteorology. Examples in economics and finance include forecasting low frequency macroeconomic indicators, such as GDP or the inflation rate, or financial asset returns using a large number of macroeconomic and financial time series and their lags as possible covariates. In these settings, the number of candidate predictors (pT) can be much larger than the number of samples (T), and accurate estimation and prediction are made possible by relying on some form of dimension reduction. Given this ubiquity of time series data, it is surprising that few works on high dimensional statistics discuss the time series setting, and even fewer have developed methods which utilize the unique features of time series data. This work consists of three chapters, each of which is self contained.
The first chapter deals with high dimensional predictive regressions, which are widely used in economics and finance. However, the theory and methodology are mainly developed assuming that the model is stationary with time invariant parameters. This is at odds with the prevalent evidence for parameter instability in economic time series. To remedy this, we present two L2 boosting algorithms for estimating high dimensional models in which the coefficients are modeled as functions evolving smoothly over time and the predictors are locally stationary. The first method uses componentwise local constant estimators as base learners, while the second relies on componentwise local linear estimators. We establish consistency of both methods, and address the practical issues of choosing the bandwidth for the base learners and the number of boosting iterations. In an extensive application to macroeconomic forecasting with many potential predictors, we find that the benefits of modeling time variation are substantial and are present across a wide range of economic series. Furthermore, these benefits increase with the forecast horizon and with the length of the time series available for estimation. This chapter is jointly written with Serena Ng.
The second chapter deals with high dimensional non-linear time series models, focusing on variable screening (targeting predictors). Rather than assume a specific parametric model a priori, this chapter introduces several model free screening methods based on the partial distance correlation and developed specifically to deal with time dependent data. Methods are developed both for univariate models, such as nonlinear autoregressive models with exogenous predictors (NARX), and for multivariate models, such as linear or nonlinear VAR models. Sure screening properties are proved for our methods; these depend on the moment conditions and the strength of dependence in the response and covariate processes, amongst other factors. Finite sample performance of our methods is shown through extensive simulation studies, and we show the effectiveness of our algorithms at forecasting US market returns. This chapter is jointly written with Yang Feng.
The third chapter deals with variable selection for high dimensional linear stationary time series models. This chapter analyzes the theoretical properties of Sure Independence Screening (SIS), and its two stage combination with the adaptive Lasso, for high dimensional linear models with dependent and/or heavy tailed covariates and errors. We also introduce a generalized least squares screening (GLSS) procedure which utilizes the serial correlation present in the data. By utilizing this serial correlation when estimating our marginal effects, GLSS is shown to outperform SIS in many cases. For both procedures we prove two stage variable selection consistency when combined with the adaptive Lasso.
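As a rough illustration of the screening idea mentioned in the third chapter, the sketch below ranks predictors by the absolute value of their marginal correlation with the response and keeps the top d. This is plain correlation-based SIS on simulated independent Gaussian data, not the GLSS procedure or the authors' implementation; the function name `sis_screen`, the sample sizes, and the cutoff `d` are assumptions made for the example.

```python
import numpy as np

def sis_screen(X, y, d):
    """Return indices of the d columns of X with the largest
    absolute marginal correlation with y (basic SIS ranking)."""
    Xc = X - X.mean(axis=0)          # center each predictor
    yc = y - y.mean()                # center the response
    # Componentwise sample correlation between each column and y
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    corr = (Xc.T @ yc) / denom
    # Sort by decreasing absolute correlation and keep the first d
    return np.argsort(-np.abs(corr))[:d]

# Simulated example (hypothetical sizes): T = 100 samples, p = 500 predictors,
# only the first two predictors enter the true model.
rng = np.random.default_rng(0)
T, p = 100, 500
X = rng.standard_normal((T, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.standard_normal(T)

selected = sis_screen(X, y, d=20)
print(sorted(selected.tolist()))
```

In a two stage procedure of the kind the abstract describes, a penalized regression such as the adaptive Lasso would then be fitted using only the columns in `selected`.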
Forecasting the CATS benchmark with the Double Vector Quantization method
The Double Vector Quantization method, a long-term forecasting method based
on the SOM algorithm, has been used to predict the 100 missing values of the
CATS competition data set. An analysis of the proposed time series is provided
to estimate the dimension of the auto-regressive part of this nonlinear
auto-regressive forecasting method. Based on this analysis experimental results
using the Double Vector Quantization (DVQ) method are presented and discussed.
As one of the features of the DVQ method is its ability to predict scalars as
well as vectors of values, the number of iterative predictions needed to reach
the prediction horizon is further observed. The long-term stability of the
method makes it possible to obtain reliable values over a rather long
forecasting horizon. Comment: Accepted for publication in Neurocomputing, Elsevier
Relationship between degree of efficiency and prediction in stock price changes
This study investigates empirically whether the degree of stock market
efficiency is related to the prediction power of future price change using the
indices of twenty seven stock markets. Efficiency refers to weak-form efficient
market hypothesis (EMH) in terms of the information of past price changes. The
prediction power corresponds to the hit rate, that is, the rate at which the
direction of the predicted price change agrees with that of the actual one,
calculated by the nearest neighbor prediction method (NN method) on
out-of-sample data. In this manuscript, the Hurst exponent and the approximate
entropy (ApEn) are used as the quantitative measurements of the degree of
efficiency. The Hurst exponent, which reflects the time correlation properties
of the series, and the ApEn value, which reflects the randomness in the time
series, are negatively correlated. However, the average prediction power
on the direction of future price change has a strongly positive correlation
with the Hurst exponent, and a negative correlation with the ApEn. Therefore,
the market index with less market efficiency has higher prediction power for
future price change than one with higher market efficiency when we analyze the
market using the past price change pattern. Furthermore, we show that the Hurst
exponent, a measurement of the long-term memory property, provides more
significant information in terms of prediction of future price changes than the
ApEn and the NN method. Comment: 10 pages
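The hit rate described in this abstract can be sketched as follows, under stated assumptions: a 1-nearest-neighbor forecast with Euclidean distance on windows of the last m price changes, a neighbor library built only from the first `n_train` observations, and zero actual changes skipped. The function name `nn_hit_rate` and all parameter values are hypothetical; the paper's exact NN configuration is not given in the abstract.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def nn_hit_rate(changes, m, n_train):
    """Fraction of out-of-sample steps where the sign of the change that
    followed the nearest in-sample pattern matches the actual sign."""
    r = np.asarray(changes, dtype=float)
    # In-sample library: every window of m changes and the change after it
    windows = sliding_window_view(r[:n_train - 1], m)
    successors = r[m:n_train]
    hits, total = 0, 0
    for t in range(n_train, len(r)):
        if r[t] == 0.0:
            continue                      # no direction to compare against
        query = r[t - m:t]                # most recent m changes before t
        nearest = np.argmin(np.linalg.norm(windows - query, axis=1))
        hits += int(np.sign(successors[nearest]) == np.sign(r[t]))
        total += 1
    return hits / total

# Simulated price changes (white noise), for illustration only
rng = np.random.default_rng(1)
changes = rng.standard_normal(600)
print(nn_hit_rate(changes, m=3, n_train=400))
```

For a series with no exploitable pattern the hit rate should hover around one half; values well above that on real index data are what the abstract interprets as higher prediction power.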
Leverage Financial News to Predict Stock Price Movements Using Word Embeddings and Deep Neural Networks
Financial news contains useful information on public companies and the
market. In this paper we apply the popular word embedding methods and deep
neural networks to leverage financial news to predict stock price movements in
the market. Experimental results show that our proposed methods are simple
but very effective and can significantly improve the stock prediction
accuracy on a standard financial database over a baseline system using only
historical price information. Comment: 5 pages, 2 figures, technical report
Measures of Analysis of Time Series (MATS): A MATLAB Toolkit for Computation of Multiple Measures on Time Series Data Bases
In many applications, such as physiology and finance, large time series data
bases are to be analyzed requiring the computation of linear, nonlinear and
other measures. Such measures have been developed and implemented in commercial
and freeware software rather selectively and independently. The Measures of
Analysis of Time Series (MATS) MATLAB toolkit is designed to handle
an arbitrarily large set of scalar time series and compute a large variety of
measures on them, allowing for the specification of varying measure parameters
as well. The variety of options with added facilities for visualization of the
results support different settings of time series analysis, such as the
detection of dynamics changes in long data records, resampling (surrogate or
bootstrap) tests for independence and linearity with various test statistics,
and discrimination power of different measures and for different combinations
of their parameters. The basic features of MATS are presented and the
implemented measures are briefly described. The usefulness of MATS is
illustrated on some empirical examples along with screenshots. Comment: 25
pages, 9 figures, two tables, the software can be downloaded at
http://eeganalysis.web.auth.gr/indexen.ht