1,041 research outputs found

    DISCOVERING INTERESTING PATTERNS FOR INVESTMENT DECISION MAKING WITH GLOWER - A GENETIC LEARNER OVERLAID WITH ENTROPY REDUCTION

    Prediction in financial domains is notoriously difficult for a number of reasons. First, theories tend to be weak or non-existent, which makes problem formulation open-ended by forcing us to consider a large number of independent variables and thereby increasing the dimensionality of the search space. Second, the weak relationships among variables tend to be nonlinear, and may hold only in limited areas of the search space. Third, in financial practice, where analysts conduct extensive manual analysis of historically well-performing indicators, a key is to find the hidden interactions among variables that perform well in combination. Unfortunately, these are exactly the patterns that the greedy search biases incorporated by many standard rule algorithms will miss. In this paper, we describe and evaluate several variations of a new genetic learning algorithm (GLOWER) on a variety of data sets. The design of GLOWER has been motivated by financial prediction problems, but incorporates successful ideas from tree induction and rule learning. We examine the performance of several GLOWER variants on two UCI data sets as well as on a standard financial prediction problem (S&P500 stock returns), using the results to identify and use one of the better variants for further comparisons. We introduce a new (to KDD) financial prediction problem (predicting positive and negative earnings surprises), and experiment with GLOWER, contrasting it with tree- and rule-induction approaches. Our results are encouraging, showing that GLOWER has the ability to uncover effective patterns for difficult problems that have weak structure and significant nonlinearities. Information Systems Working Papers Series

    The information content of cash flows versus accrual-based income numbers

    Word processed copy. Includes bibliographical references (leaves 115-123)

    Ensemble Committees for Stock Return Classification and Prediction

    This paper considers a portfolio trading strategy formulated by algorithms in the field of machine learning. The profitability of the strategy is measured by the algorithm's capability to consistently and accurately identify stock indices with positive or negative returns, and to generate a preferred portfolio allocation on the basis of a learned model. Stocks are characterized by time series data sets consisting of technical variables that reflect market conditions in a previous time interval, which are utilized to produce binary classification decisions in subsequent intervals. The learned model is constructed as a committee of random forest classifiers, a non-linear support vector machine classifier, a relevance vector machine classifier, and a constituent ensemble of k-nearest neighbors classifiers. The Global Industry Classification Standard (GICS) is used to explore the ensemble model's efficacy within the context of various fields of investment including Energy, Materials, Financials, and Information Technology. Data from 2006 to 2012, inclusive, are considered, chosen because they provide a range of market circumstances for evaluating the model. The model is observed to achieve an accuracy of approximately 70% when predicting stock price returns three months in advance. Comment: 15 pages, 4 figures, Neukom Institute Computational Undergraduate Research prize - second place

    Earnings prediction using machine learning methods and analyst comparison

    This dissertation presents an experimental study of how technical, macroeconomic, and financial variables, alongside analysts’ forecasts, can be used to optimize the prediction of the subsequent quarter’s earnings results using machine learning, and compares the performance of the models to analysts’ forecasts. The dissertation includes three steps. In step one, an event study is conducted to test for abnormal returns in firms’ stock prices on the day following the earnings announcement, grouped by earnings per share (EPS) growth in classes of size 3, 6 and 9, computed for each quarter. In step two, several machine learning models are built to maximize the accuracy of EPS predictions. In the last step, investment strategies are constructed to take advantage of investors’ expectations, which are closely correlated with analysts’ predictions. Against the backdrop of an exhaustive analysis of quarterly earnings predictions using machine learning methods, conclusions are drawn about the superiority of the CatBoost classifier. All machine learning models tested underperform analyst predictions, which could be explained by the time and privileged information at analysts’ disposal, as well as their selection of firms to cover. Regardless, machine learning models can be used as a confirmation of analyst predictions, and statistically significant investment strategies are built on that basis. Importantly, high-confidence predictions by machine learning models are significantly more accurate than the average accuracy of forecasts.

    Developing a Machine Learning based Systematic Investment Strategy: A case study for the Construction Industry

    In this research work, an end-to-end systematic investment strategy is implemented, based on machine learning models and leveraging knowledge of the construction industry's operational and management practices. First, a literature review in the field of behavioral finance is carried out, presenting the current state of knowledge and trends in the industry. A suitable investment opportunity exploiting prevailing market inefficiencies around earnings announcements is identified. Second, an extensive literature review is performed to identify the most relevant characteristics of construction companies’ operations and the major risk factors they are exposed to. These insights are used to engineer a set of relevant variables. Third, advanced statistical techniques are used to select the most relevant subset of features, which includes market and analysts’ expectation data, macroeconomic indicators, the delay in reporting earnings, and the most important financial dimensions for construction firms. Fourth, the earnings surprise classification problem is characterized by class imbalance and asymmetric misclassification costs. These issues are a consequence of the desired business application, and are addressed by selecting an appropriate evaluation metric. Additionally, the temporal dimension and generative process of the data are considered in order to select an appropriate validation scheme. Five different state-of-the-art machine learning algorithms are considered: a multinomial logistic regression, a bagging classifier, a random forest, an XGBoost model and a linear Support Vector Machine. The multinomial logistic regression is found to be the most suitable model, exhibiting a bias towards predicting positive earnings surprises over the other classes. Firm size, together with the profitability and valuation measures captured by Return on Assets and Enterprise Value multiples, are found to be the most important variables when predicting earnings surprises. 
To conclude, the systematic investment strategy based on the investment signals produced by the selected machine learning model is back-tested; the performance of the long-short portfolio is driven by the positive-surprise portfolio as a consequence of the selected model's bias. Keywords: Quantitative Investing, Machine Learning, Behavioral Finance

    Forecasting stock market return with nonlinearity: a genetic programming approach

    Whether returns in the stock market are predictable remains unresolved. This paper attempts to establish new return forecasting models to help address this issue. In contrast to the existing literature, we first show that forecasting accuracy can be improved through better model specification without adding any new variables. Instead of having a unified return forecasting model, we argue that stock markets in different countries should have different forecasting models. Furthermore, we adopt an evolutionary procedure called genetic programming (GP) to develop our new nonlinear models. Our newly developed forecasting models are shown to be more accurate than traditional AR-family models. More importantly, the trading strategy we propose based on our forecasting models is shown to be highly profitable in different types of stock markets in terms of stock index futures trading.

    The use of predictive analytics in finance

    Statistical and computational methods are being increasingly integrated into Decision Support Systems to aid management and help with strategic decisions. Researchers need to fully understand the use of such techniques in order to make predictions when using financial data. This paper therefore presents a method-based literature review focused on the predictive analytics domain. The study comprehensively covers classification, regression, clustering, association and time series models. It expands existing explanatory statistical modelling into the realm of computational modelling. The methods explored enable the prediction of the future through the analysis of financial time series and cross-sectional data that are collected, stored and processed in Information Systems. The output of such models allows financial managers and risk oversight professionals to achieve better outcomes. This review brings the various predictive analytic methods in finance together under one domain.

    Uncovering hidden information and relations in time series data with wavelet analysis: three case studies in finance

    This thesis aims to provide new insights into the importance of decomposing aggregate time series data using the Maximum Overlap Discrete Wavelet Transform. In particular, the analysis throughout this thesis involves decomposing aggregate financial time series data into approximation (low-frequency) and detail (high-frequency) components. Following this, information and hidden relations can be extracted for different investment horizons, as matched with the detail components. The first study examines the ability of different GARCH models to forecast stock return volatility in eight international stock markets. The results demonstrate that de-noising the returns improves the accuracy of volatility forecasts regardless of the statistical test employed. After de-noising, the asymmetric GARCH approach tends to be preferred, although that result is not universal. Furthermore, wavelet de-noising is found to be more important at the key 99% Value-at-Risk level compared to the 95% level. The second study examines the impact of fourteen macroeconomic news announcements on the stock and bond return dynamic correlation in the U.S. from the day of the announcement up to sixteen days afterwards. Results over the full sample offer very little evidence that macroeconomic news announcements affect the stock-bond return dynamic correlation. However, after controlling for the financial crisis of 2007-2008, several announcements become significant both on the announcement day and afterwards. Furthermore, the study observes that news released early in the day, i.e. before 12 pm, and in the first half of the month, exhibits a slower effect on the dynamic correlation than news released later in the month or later in the day. While several announcements exhibit significance in the 2008 crisis period, only CPI and Housing Starts show significant and consistent effects on the correlation outside the 2001, 2008 and 2011 crisis periods. 
The final study investigates whether recent returns and the time-scaled return can predict subsequent trading in ten stock markets. The study finds little evidence that recent returns predict subsequent trading, though this predictability is observed more over the long-run horizon. The study also finds a statistical relation between trading and return over the long investment horizons of [8-16] and [16-32] day periods. Yet this relation is mostly negative, being positive only for developing countries. It also tends to be economically stronger during bull periods.