Discrete representation strategies for foreign exchange prediction
This is an extended version of the paper presented at the 4th International Workshop NFMCP 2015 held in conjunction with ECML PKDD 2015. The initial version has been published in NFMCP 2015 conference proceedings as part of Springer Series. This paper presents a novel approach to financial time series (FTS) prediction by mapping hourly foreign exchange data to string representations and deriving simple trading strategies from them. To measure the degree of similarity in these market strings we apply familiar string kernels, bag of words and n-grams, whilst also introducing a new kernel, time-decay n-grams, that captures the temporal nature of FTS. In the process we propose a sequential Parzen windows algorithm based on discrete representations where trading decisions for each string are learned in an online manner and are thus subject to temporal fluctuations. We evaluate the strength of a number of representations using both the string version and its continuous counterpart, whilst also comparing the performance of different learning algorithms on these representations, namely support vector machines, Parzen windows and Fisher discriminant analysis. Our extensive experiments show that the simple string representation coupled with the sequential Parzen windows approach is capable of outperforming other more exotic approaches, supporting the idea that when it comes to working in high noise environments often the simplest approach is the most effective.
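The string mapping and the familiar kernels named in this abstract can be illustrated with a minimal sketch. The three-symbol alphabet (`U`, `D`, `F`), the `threshold` value and all function names below are assumptions made for illustration; the abstract does not specify the paper's actual encoding, and the time-decay n-gram kernel is not reproduced here because its definition is not given.

```python
from collections import Counter
from math import sqrt


def returns_to_string(prices, threshold=0.0005):
    """Map consecutive relative price changes to symbols.

    Hypothetical alphabet: U (up), D (down), F (flat within threshold).
    """
    symbols = []
    for prev, curr in zip(prices, prices[1:]):
        change = (curr - prev) / prev
        if change > threshold:
            symbols.append("U")
        elif change < -threshold:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)


def ngram_counts(s, n):
    """Count every contiguous substring of length n in s."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def ngram_kernel(s, t, n=2):
    """Normalised n-gram kernel: cosine similarity of n-gram count vectors.

    With n=1 this reduces to a bag-of-words kernel over single symbols.
    """
    a, b = ngram_counts(s, n), ngram_counts(t, n)
    dot = sum(a[g] * b[g] for g in a)
    norm = sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

Two market strings with the same n-gram profile score 1.0 and strings sharing no n-grams score 0.0, so the value can serve as the similarity measure inside a kernel method such as Parzen windows or an SVM.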
State of the Art of Financial Decision Support Systems based on Problem, Requirement, Component and Evaluation Categories
Financial decision support has become an important information systems research topic and is also of highest interest to practitioners. Two rapidly emerging trends, the increasing amount of available data and the evolution of data mining methods, pose challenges for researchers. Thus, a review of existing research with the goal to guide future research efforts in this domain is timely. To structure our literature review and future research in this area, we propose a framework in the paper that integrates elements of decision support systems, design theory, and information mining. The framework is then applied in the paper. Our analysis reveals that the focus of existing research can be grouped into three major domain categories. More research is needed in two of the categories for which we found only very few IS studies, despite the high relevance of these topics due to increased turbulence in worldwide financial markets. Furthermore, we discuss the opportunities to make stronger use of heterogeneous data and of combined data mining techniques and to build upon the rich set of available evaluation methods.
Exploring the International Application of Machine Learning in Asset Pricing: An Empirical Study
This thesis delves into the application of machine learning models for predicting cross-sectional returns in diverse markets. Chapter One explores the predictive abilities of XG-Boost, Random Forest, and neural network models in relation to fund performance and fund manager information characteristics. The findings indicate that fund performance characteristics prove to be more informative of future fund performance than the characteristics of fund managers. Chapter Two probes the presence of bimodality in momentum stocks and examines the profitability of deep momentum, a machine learning return prediction model, in the UK, Japan, and South Korea. The findings demonstrate that bimodality is a phenomenon linked to developed markets and can cause losses for JT strategy investors. However, the deep momentum model generates substantial profits in all markets by relieving bimodality in long-short portfolios. Chapter Three investigates the efficacy of the momentum factor in Chinese stock markets. We compare the performance of the traditional linear JT model, the XG-Boost model, the neural network model, and neural network reclassification models as developed by Han (2022). The study finds that machine learning models based on the momentum factor outperform the traditional JT linear regression model, indicating a non-linear relationship between the momentum factor and stock returns in China. Han's reclassification models perform the most strongly after reclassification of the true target distribution within high-return deciles moves from a bimodal shape to a right-skewed distribution. The study also observes a significant positive correlation between the return of the long-only portfolio developed using the momentum factor in the machine learning framework and the size and sentiment index. 
Overall, this thesis attests to the practicality of machine learning models for predicting cross-sectional returns in various markets, with potentially gainful implications for investors and policymakers.
Doctor of Philosophy
Due to the popularity of Web 2.0 and social media in the last decade, the percolation of user-generated content (UGC) has rapidly increased. In the financial realm, this has resulted in the emergence of virtual investing communities (VIC) open to the investing public. There is an ongoing debate among scholars and practitioners on whether such UGC contains valuable investing information or mainly noise. I conduct two major studies in my dissertation. First, I examine the relationship between peer influence and information quality in the context of individual characteristics in stock microblogging. Surprisingly, I discover that the set of individual characteristics that relate to peer influence is not the same as the set that relates to high information quality. With respect to information quality, influentials who are frequently mentioned by peers due to their name value are likely to possess higher information quality, while those who are better at diffusing information via retweets are likely to be associated with lower information quality. Second, I propose a study to explore the predictability of stock price directional movements from stock microblog dimensions and features using data mining classification techniques. I find that the author-ticker-day dimension produces the highest predictive accuracy, suggesting that this dimension captures both relevant author and ticker information better than the author-day and ticker-day dimensions. In addition to these two studies, I also explore two topics: the network structure of co-tweeted tickers and sentiment annotation via crowdsourcing. I do this in order to understand and uncover new features as well as new outcome indicators, with the objective of improving the predictive accuracy of the classification or the saliency of the explanatory models. My dissertation work extends the frontier in understanding the relationship between financial UGC, specifically stock microblogging, and relevant phenomena as well as predictive outcomes.
Essays on Financial Applications of Nonlinear Models
In this thesis, we examine the relationship between news and the
stock market. Further, we explore methods and build new nonlinear
models for forecasting stock price movement and portfolio
optimization based on past stock prices and on one type of big
data, news items, which are obtained through the RavenPack News
Analytics Global Equities editions.
The thesis consists of three essays. In Essay 1, we investigate
the relationship between news items and stock prices using the
artificial neural network (ANN) model. First, we use Granger
causality to ascertain how news items affect stock prices. The
results show that news volume is not the Granger cause of stock
price change; rather, news sentiment is. Second, we test the
semi-strong form efficient market hypothesis, whereas most
existing research testing the efficient market hypothesis focuses on
the weak-form version. Our ANN strategies consistently
outperform the passive buy-and-hold strategy and this finding
is apparently at odds with the notion of the efficient market
hypothesis. Finally, using news sentiment analytics from
RavenPack Dow Jones News Analytics, we show positive
profitability with out-of-sample prediction using the
proposed ANN strategies for Google Inc. (NASDAQ: GOOG).
In Essay 2, we expand the utility of the information from news
volume and news sentiments to encompass portfolio
diversification. For the Dow Jones Industrial Average (DJIA)
components, we assign different weights to build portfolios
according to their weekly news volumes or news sentiments. Our
results show that news volume contributes to portfolio variance
both in-sample and out-of-sample: positive news sentiment
contributes to the portfolio return in-sample, while negative
sentiment contributes to the portfolio return out-of-sample, which is a
consequence of investors overreacting to the news sentiment.
Further, we propose a novel approach to portfolio diversification
using the k-Nearest Neighbors (kNN) algorithm based on the idea
that news sentiment correlates with stock returns.
Out-of-sample results indicate that such a strategy dominates
the benchmark DJIA index portfolio.
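Essay 2 describes weighting the DJIA components by weekly news volume or news sentiment, but the abstract does not give the weighting scheme. The following is a minimal sketch that assumes weights proportional to a non-negative weekly score per stock; this proportional rule, the function names and the tickers and numbers in the usage note are all hypothetical, not the thesis's stated method.

```python
def proportional_weights(scores):
    """Turn non-negative per-stock scores (e.g. weekly news volume) into
    portfolio weights that sum to 1. Assumed scheme, for illustration only."""
    if any(v < 0 for v in scores.values()):
        raise ValueError("scores must be non-negative")
    total = sum(scores.values())
    if total == 0:
        raise ValueError("at least one score must be positive")
    return {ticker: v / total for ticker, v in scores.items()}


def portfolio_return(weights, returns):
    """Weighted sum of the individual stock returns for one period."""
    return sum(w * returns[ticker] for ticker, w in weights.items())
```

For example, `proportional_weights({"AAA": 30, "BBB": 10})` assigns three quarters of the portfolio to the stock with three times the news volume; `portfolio_return` then evaluates that allocation against the realised returns of the following week.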
In Essay 3, we propose a new model called the Combined Markov and
Hidden Markov Model (CMHMM), in which observation is affected by
a Markov model and a Hidden Markov Model (HMM). The three
fundamental questions of the CMHMM are discussed. Further, the
application of the CMHMM, in which the news sentiment is one
observation and the stock return is the other, is discussed. The
empirical results of the trading strategy based on the CMHMM show
the potential applications of the proposed model in finance.
This thesis contributes to the literature in a number of ways.
First, it extends the literature on financial applications of
nonlinear models. We explore the applications of the ANNs and kNN
in the financial market. Besides, the proposed new CMHMM model
adheres to the nature of the stock market and has better
potential prediction ability. Second, the empirical results from
this dissertation contribute to the understanding of the
relationship between news and the stock market. For instance, our
research found that news volume contributes to the portfolio
return and that investors overreact to news sentiment, a
phenomenon that has been discussed by other scholars from
different angles.
Online computational algorithms for portfolio-selection problems
Abstract: This thesis contributes to the problem of equity portfolio management using computational intelligence methodologies. The focus is on generating automated financial reasoning, with a basis in computational finance research, through searching a space of semantically meaningful propositions. In comparison with classical financial modelling, our proposed algorithms allow continual adaptation to changing market conditions and nonlinear solution representations in most cases. When compared with other computational intelligence approaches, the focus is on a holistic design that integrates financial research with machine learning. The major aim of the thesis is to develop portfolio allocation techniques for learning investment decision-making that can easily adapt to changes in market processes together with speed and accuracy. We evaluate the algorithms developed in an out-of-sample trading framework using historical data sets. The testing is designed to be realistic; for instance, considering factors such as transaction costs, stock splits and data snooping. To demonstrate the robustness of our approach we perform extensive historical simulations using previously untested real market datasets. On all data sets considered, our proposed algorithms significantly outperform existing portfolio allocation techniques, sometimes in a spectacular way, without any additional computational demand or modeling complexity. Before proceeding any further, we stress that setting up abstract and complex mathematical models is neither the intention nor the scope of this thesis. Our aim rather is to investigate empirically and possibly capture any existing nonlinearities or non-stochasticities that are apparent in the dynamics of cross sectional returns of stock prices.
In doing so we utilise some novel techniques, which are mostly based on methodologies that have been used successfully in the physical sciences, where the deterministic dynamics of the phenomena are more easily detected. Our intention is to provide an additional empirical analysis framework that could shed new light on the investigation of the nature of financial time-series data generating processes.
Ph.D. (Economics and Financial Sciences)
Essays on Some Recent Penalization Methods with Applications in Finance and Marketing
The subject of this PhD research is within the areas of Econometrics and Artificial Intelligence. More concretely, it deals with the tasks of statistical regression and classification analysis. New classification methods have been proposed, as well as new applications of established ones in the areas of Finance and Marketing.
The bulk of this PhD research centers on extending standard methods that fall under the general term of loss-versus-penalty classification techniques. These techniques build on the premises that a model that uses a finite amount of available data to be trained on should neither be too complex nor too simple in order to possess a good forecasting ability. New proposed classification techniques in this area are Support Hyperplanes, Nearest Convex Hull classification and Soft Nearest Neighbor.
Next to the new techniques, new applications of some standard loss-versus-penalty methods have been put forward. Specifically, these are the application of the so-called Support Vector Machines (SVMs) for classification and regression analysis to financial time series forecasting, solving the Market Share Attraction model and solving and interpreting binary classification tasks in Marketing.
In addition, this research focuses on new efficient solutions to SVMs using the so-called majorization algorithm. This algorithm provides for the possibility to incorporate various so-called loss functions while solving general SVM-like methods.
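The loss-versus-penalty idea described in this abstract is commonly written as a single objective: a loss term that rewards fitting the training data plus a penalty term that discourages overly complex models. The notation below is a generic textbook form, not taken from the thesis; the symbols $L$, $\Omega$, $\lambda$, $w$ and $b$ are assumptions introduced for illustration. The second line gives the familiar soft-margin linear SVM as one instance, with the hinge loss and a squared-norm penalty.

```latex
% Generic loss-versus-penalty objective over a model f,
% with training pairs (x_i, y_i), i = 1, ..., n, and trade-off \lambda > 0:
\min_{f}\; \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr) + \lambda\, \Omega(f)

% Instance: soft-margin linear SVM with labels y_i \in \{-1, +1\}:
\min_{w,\,b}\; \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\,(w^{\top} x_i + b)\bigr) + \lambda\, \lVert w \rVert^{2}
```

A large $\lambda$ favours simpler models at the cost of training fit, and a small $\lambda$ does the opposite, which matches the abstract's premise that a model should be neither too complex nor too simple. Swapping the hinge loss for another loss function changes the method while leaving the structure of the objective intact.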
The predictive power of stock micro-blogging sentiment in forecasting stock market behaviour
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
Online stock forums have become a vital investing platform on which to publish relevant and valuable user-generated content (UGC) data such as investment recommendations and other stock-related information that allow investors to view the opinions of a large number of users and share trading ideas. This thesis applies methods from computational linguistics and text-mining techniques to analyse and extract, on a daily basis, sentiments from stock-related micro-blogging messages called “StockTwits”. The primary aim of this research is to provide an understanding of the predictive ability of stock micro-blogging sentiments to forecast future stock price behavioural movements by investigating the various roles played by investor sentiments in determining asset pricing on the stock market.
The empirical analysis in this thesis consists of four main parts based on the predictive power and the role of investor sentiment in the stock market. The first part discusses the findings of the text-mining procedure for extracting and predicting sentiments from stock-related micro-blogging data. The purpose is to provide a comparative textual analysis of different machine learning algorithms for the purpose of selecting the most accurate text-mining techniques for predicting sentiment analysis on StockTwits through the provision of two different applications of feature selection, namely filter and wrapper approaches. The second part of the analysis focuses on investigating the predictive correlations between StockTwits features and the stock market indicators. It aims to examine the explanatory power of StockTwits variables in explaining the dynamic nature of different financial market indicators. The third part of the analysis investigates the role played by noise traders in determining asset prices. The aim is to show that stock returns, volatility and trading volumes are affected by investor sentiment; it also seeks to investigate whether changes in sentiment (bullish or bearish) will have different effects on stock market prices. The fourth part offers an in-depth analysis of some tweet-market relationships which represent an open problem in the empirical literature (e.g. sentiment-return relations and volume-disagreement relations).
The results suggest that StockTwits sentiments exhibit explanatory power in explaining the dynamics of stock prices in the U.S. market. Taking different approaches by combining text-mining techniques with feature selection methods has proved successful in predicting StockTwits sentiments. The applications of the approach presented in this thesis offer real-time investment ideas that may provide investors and their peers with a decision support mechanism. Investor sentiment plays a critical role in determining asset prices in capital markets. Overall, the findings suggest that investor sentiment among noise traders is a priced factor. The findings confirm the existence of asymmetric spillover effects of bullish and bearish sentiments on the stock market. They also suggest that sentiment is a significant factor in explaining stock price behaviour in the capital market and imply the positive role of the stock market in the formation of investor sentiment in stock markets. Furthermore, the research findings demonstrate that disagreement is not only an important factor in determining trading volumes but it is also considered a very significant factor in influencing asset prices and returns in capital markets.
Overall, the findings of the thesis provide empirical evidence that failure to consider the role of investor sentiment in traditional finance theory could lead to an imperfect picture when explaining the behaviour of stock prices in the stock market.