33 research outputs found

    Discrete representation strategies for foreign exchange prediction

    This is an extended version of the paper presented at the 4th International Workshop NFMCP 2015 held in conjunction with ECML PKDD 2015. The initial version has been published in NFMCP 2015 conference proceedings as part of Springer Series. This paper presents a novel approach to financial time series (FTS) prediction by mapping hourly foreign exchange data to string representations and deriving simple trading strategies from them. To measure the degree of similarity in these market strings we apply familiar string kernels, bag of words and n-grams, whilst also introducing a new kernel, time-decay n-grams, that captures the temporal nature of FTS. In the process we propose a sequential Parzen windows algorithm based on discrete representations where trading decisions for each string are learned in an online manner and are thus subject to temporal fluctuations. We evaluate the strength of a number of representations using both the string version and its continuous counterpart, whilst also comparing the performance of different learning algorithms on these representations, namely support vector machines, Parzen windows and Fisher discriminant analysis. Our extensive experiments show that the simple string representation coupled with the sequential Parzen windows approach is capable of outperforming other more exotic approaches, supporting the idea that in high-noise environments the simplest approach is often the most effective.

    State of the Art of Financial Decision Support Systems based on Problem, Requirement, Component and Evaluation Categories

    Financial decision support has become an important information systems research topic and is also of highest interest to practitioners. Two rapidly emerging trends, the increasing amount of available data and the evolution of data mining methods, pose challenges for researchers. Thus, a review of existing research with the goal of guiding future research efforts in this domain is timely. To structure our literature review and future research in this area, we propose a framework that integrates elements of decision support systems, design theory, and information mining, and we then apply it. Our analysis reveals that the focus of existing research can be grouped into three major domain categories. More research is needed in two of the categories for which we found only very few IS studies, despite the high relevance of these topics due to increased turbulence in worldwide financial markets. Furthermore, we discuss the opportunities to make stronger use of heterogeneous data and of combined data mining techniques and to build upon the rich set of available evaluation methods.

    Exploring the International Application of Machine Learning in Asset Pricing: An Empirical Study

    This thesis delves into the application of machine learning models for predicting cross-sectional returns in diverse markets. Chapter One explores the predictive abilities of XG-Boost, Random Forest, and neural network models in relation to fund performance and fund manager information characteristics. The findings indicate that fund performance characteristics prove to be more informative of future fund performance than the characteristics of fund managers. Chapter Two probes the presence of bimodality in momentum stocks and examines the profitability of deep momentum, a machine learning return prediction model, in the UK, Japan, and South Korea. The findings demonstrate that bimodality is a phenomenon linked to developed markets and can cause losses for JT strategy investors. However, the deep momentum model generates substantial profits in all markets by relieving bimodality in long-short portfolios. Chapter Three investigates the efficacy of the momentum factor in Chinese stock markets. We compare the performance of the traditional linear JT model, the XG-Boost model, the neural network model, and neural network reclassification models as developed by Han (2022). The study finds that machine learning models based on the momentum factor outperform the traditional JT linear regression model, indicating a non-linear relationship between the momentum factor and stock returns in China. Han's reclassification models perform the most strongly after reclassification of the true target distribution within high-return deciles moves from a bimodal shape to a right-skewed distribution. The study also observes a significant positive correlation between the return of the long-only portfolio developed using the momentum factor in the machine learning framework and the size and sentiment index. 
Overall, this thesis attests to the practicality of machine learning models for predicting cross-sectional returns in various markets, with potentially gainful implications for investors and policymakers.

    Doctor of Philosophy

    Due to the popularity of Web 2.0 and Social Media in the last decade, the percolation of user generated content (UGC) has rapidly increased. In the financial realm, this results in the emergence of virtual investing communities (VIC) to the investing public. There is an on-going debate among scholars and practitioners on whether such UGC contain valuable investing information or mainly noise. I investigate two major studies in my dissertation. First, I examine the relationship between peer influence and information quality in the context of individual characteristics in stock microblogging. Surprisingly, I discover that the set of individual characteristics that relate to peer influence is not synonymous with those that relate to high information quality. In relating to information quality, influentials who are frequently mentioned by peers due to their name value are likely to possess higher information quality, while those who are better at diffusing information via retweets are likely to associate with lower information quality. Second, I propose a study to explore predictability of stock microblog dimensions and features over stock price directional movements using data mining classification techniques. I find that the author-ticker-day dimension produces the highest predictive accuracy, inferring that this dimension is able to capture both relevant author and ticker information as compared to author-day and ticker-day. In addition to these two studies, I also explore two topics: network structure of co-tweeted tickers and sentiment annotation via crowdsourcing. I do this in order to understand and uncover new features as well as new outcome indicators with the objective of improving predictive accuracy of the classification or saliency of the explanatory models. My dissertation work extends the frontier in understanding the relationship between financial UGC, specifically stock microblogging, with relevant phenomena as well as predictive outcomes.

    Essays on Financial Applications of Nonlinear Models

    In this thesis, we examine the relationship between news and the stock market. Further, we explore methods and build new nonlinear models for forecasting stock price movement and portfolio optimization based on past stock prices and on one type of big data, news items, which are obtained through the RavenPack News Analytics Global Equities editions. The thesis consists of three essays. In Essay 1, we investigate the relationship between news items and stock prices using the artificial neural network (ANN) model. First, we use Granger causality to ascertain how news items affect stock prices. The results show that news volume is not the Granger cause of stock price change; rather, news sentiment is. Second, we test the semi–strong form efficient market hypothesis, whereas most existing research testing efficient market hypothesis focuses on the weak–form version. Our ANN strategies consistently outperform the passive buy–and–hold strategy and this finding is apparently at odds with the notion of the efficient market hypothesis. Finally, using news sentiment analytics from RavenPack Dow Jones News Analytics, we show positive profitability with out–of–sample prediction using the proposed ANN strategies for Google Inc. (NASDAQ: GOOG). In Essay 2, we expand the utility of the information from news volume and news sentiments to encompass portfolio diversification. For the Dow Jones Industrial Average (DJIA) components, we assign different weights to build portfolios according to their weekly news volumes or news sentiments. Our results show that news volume contributes to portfolio variance both in–sample and out–of–sample: positive news sentiment contributes to the portfolio return in–sample, while negative contributes to the portfolio return out–of–sample, which is a consequence of investors overreacting to the news sentiment. 
Further, we propose a novel approach to portfolio diversification using the k–Nearest Neighbors (kNN) algorithm based on the idea that news sentiment correlates with stock returns. Out–of–sample results indicate that such a strategy dominates the benchmark DJIA index portfolio. In Essay 3, we propose a new model called the Combined Markov and Hidden Markov Model (CMHMM), in which observation is affected by a Markov model and a Hidden Markov Model (HMM). The three fundamental questions of the CMHMM are discussed. Further, the application of the CMHMM, in which the news sentiment is one observation and the stock return is the other, is discussed. The empirical results of the trading strategy based on the CMHMM show the potential applications of the proposed model in finance. This thesis contributes to the literature in a number of ways. First, it extends the literature on financial applications of nonlinear models. We explore the applications of the ANNs and kNN in the financial market. Besides, the proposed new CMHMM model adheres to the nature of the stock market and has better potential prediction ability. Second, the empirical results from this dissertation contribute to the understanding of the relationship between news and the stock market. For instance, our research found that news volume contributes to the portfolio return and that investors overreact to news sentiment, a phenomenon that has been discussed by other scholars from different angles.

    Online computational algorithms for portfolio-selection problems

    Abstract: This thesis contributes to the problem of equity portfolio management using computational intelligence methodologies. The focus is on generating automated financial reasoning, with a basis in computational finance research, through searching a space of semantically meaningful propositions. In comparison with classical financial modelling, our proposed algorithms allow continual adaptation to changing market conditions and non-linear solution representations in most cases. When compared with other computational intelligence approaches, the focus is on a holistic design that integrates financial research with machine learning. The major aim of the thesis is to develop portfolio allocation techniques for learning investment-decision making that can easily adapt to changes in market processes together with speed and accuracy. We evaluate the algorithms developed in an out-of-sample trading framework using historical data sets. The testing is designed to be realistic; for instance, considering factors such as transaction costs, stock splits and data snooping. To demonstrate the robustness of our approach we perform extensive historical simulations using previously untested real market datasets. On all data sets considered, our proposed algorithms significantly outperform existing portfolio allocation techniques, sometimes in a spectacular way, without any additional computational demand or modeling complexity. Before proceeding any further, we stress that setting up abstract and complex mathematical models is neither the intention nor the scope of this thesis. Our aim rather is to investigate empirically and possibly capture any existing nonlinearities or non-stochasticities that are apparent in the dynamics of cross sectional returns of stock prices. 
In doing so we utilise some novel techniques, which are mostly based on such methodologies that have been used successfully in the physical sciences where the deterministic dynamics of the phenomena are more easily detected. Our intention is to provide an additional empirical analysis framework that could shed new light in the investigation of the nature of financial time-series data generating processes. Ph.D. (Economics and Financial Sciences)

    Essays on Some Recent Penalization Methods with Applications in Finance and Marketing

    The subject of this PhD research is within the areas of Econometrics and Artificial Intelligence. More concretely, it deals with the tasks of statistical regression and classification analysis. New classification methods have been proposed, as well as new applications of established ones in the areas of Finance and Marketing. The bulk of this PhD research centers on extending standard methods that fall under the general term of loss-versus-penalty classification techniques. These techniques build on the premise that a model trained on a finite amount of available data should be neither too complex nor too simple in order to possess a good forecasting ability. New proposed classification techniques in this area are Support Hyperplanes, Nearest Convex Hull classification and Soft Nearest Neighbor. Next to the new techniques, new applications of some standard loss-versus-penalty methods have been put forward. Specifically, these are the application of the so-called Support Vector Machines (SVMs) for classification and regression analysis to financial time series forecasting, solving the Market Share Attraction model and solving and interpreting binary classification tasks in Marketing. In addition, this research focuses on new efficient solutions to SVMs using the so-called majorization algorithm. This algorithm provides for the possibility to incorporate various so-called loss functions while solving general SVM-like methods.

    Quantitative methods in high-frequency financial econometrics: modeling univariate and multivariate time series
