4,140 research outputs found

    Textual Information and IPO Underpricing: A Machine Learning Approach

    This study examines the predictive power of textual information from S-1 filings in explaining IPO underpricing. Our empirical approach differs from previous research in that we use several machine learning algorithms to predict whether or not an IPO will be underpriced. We analyze a large sample of 2,481 U.S. IPOs from 1997 to 2016 and find that textual information can effectively complement traditional financial variables in terms of prediction accuracy. In fact, models that use both textual data and financial variables as inputs outperform models that use a single type of input. We attribute our findings to the fact that textual information can reduce the ex-ante valuation uncertainty of IPO firms, thus leading to more accurate estimates.

    Essays on Financial Applications of Nonlinear Models

    In this thesis, we examine the relationship between news and the stock market. Further, we explore methods and build new nonlinear models for forecasting stock price movement and for portfolio optimization, based on past stock prices and on one type of big data, news items, obtained through the RavenPack News Analytics Global Equities editions. The thesis consists of three essays. In Essay 1, we investigate the relationship between news items and stock prices using the artificial neural network (ANN) model. First, we use Granger causality to ascertain how news items affect stock prices. The results show that news volume is not the Granger cause of stock price change; rather, news sentiment is. Second, we test the semi-strong form of the efficient market hypothesis, whereas most existing research testing the efficient market hypothesis focuses on the weak-form version. Our ANN strategies consistently outperform the passive buy-and-hold strategy, and this finding is apparently at odds with the notion of the efficient market hypothesis. Finally, using news sentiment analytics from RavenPack Dow Jones News Analytics, we show positive profitability with out-of-sample prediction using the proposed ANN strategies for Google Inc. (NASDAQ: GOOG). In Essay 2, we expand the utility of the information from news volume and news sentiment to encompass portfolio diversification. For the Dow Jones Industrial Average (DJIA) components, we assign different weights to build portfolios according to their weekly news volumes or news sentiments. Our results show that news volume contributes to portfolio variance both in-sample and out-of-sample; positive news sentiment contributes to the portfolio return in-sample, while negative news sentiment contributes to the portfolio return out-of-sample, which is a consequence of investors overreacting to the news sentiment. Further, we propose a novel approach to portfolio diversification using the k-Nearest Neighbors (kNN) algorithm, based on the idea that news sentiment correlates with stock returns. Out-of-sample results indicate that such a strategy dominates the benchmark DJIA index portfolio. In Essay 3, we propose a new model called the Combined Markov and Hidden Markov Model (CMHMM), in which the observation is affected by both a Markov model and a Hidden Markov Model (HMM). The three fundamental questions of the CMHMM are discussed. Further, we discuss an application of the CMHMM in which the news sentiment is one observation and the stock return is the other. The empirical results of the trading strategy based on the CMHMM show the potential applications of the proposed model in finance. This thesis contributes to the literature in a number of ways. First, it extends the literature on financial applications of nonlinear models. We explore the applications of ANNs and kNN in the financial market. In addition, the proposed CMHMM adheres to the nature of the stock market and has better potential prediction ability. Second, the empirical results from this dissertation contribute to the understanding of the relationship between news and the stock market. For instance, our research found that news volume contributes to the portfolio return and that investors overreact to news sentiment, a phenomenon that has been discussed by other scholars from different angles.

    A system to predict the S&P 500 using a bio-inspired algorithm

    The goal of this research was to develop an algorithmic system capable of predicting the directional trend of the S&P 500 financial index. The approach I have taken was inspired by the biology of the human retina. Extensive research has been published attempting to predict different financial markets using historical data, testing on an in-sample and trend basis, with many studies employing sophisticated mathematical techniques. In reviewing and evaluating these in-sample methodologies, it became evident that this approach was unable to achieve sufficiently reliable prediction performance for commercial exploitation. For these reasons, I moved to an out-of-sample strategy and am able to predict tomorrow’s (t+1) directional trend of the S&P 500 with 55.1% accuracy. The key elements that underpin my bio-inspired out-of-sample system are: (1) identification of 51 financial market data (FMD) inputs, including other indices, currency pairs, and swap rates, that affect the 500 component companies of the S&P 500; (2) the use of an extensive historical data set, comprising the actual daily closing prices of the chosen 51 FMD inputs and the S&P 500; and (3) the ability to compute this large data set in a time frame of less than 24 hours. The data set was fed into a linear regression algorithm to determine the predicted value of tomorrow’s (t+1) S&P 500 closing price. This process was initially carried out in MATLAB, which proved the concept of my approach, but (3) above was not met. In order to handle such a large data set and complete the prediction on time, I decided to adopt a novel graphics processing unit (GPU) based computational architecture. Through extensive optimisation of my GPU engine, I was able to achieve a speedup of 150x, which was sufficient to meet (3). In achieving my optimum directional trend accuracy of 55.1%, an extensive range of tests exploring a number of trade-offs was carried out using an 8-year data set. The results I have obtained will form the basis of a commercial investment fund. It should be noted that my algorithm uses financial data from the past 60 days, and as such would not be able to predict rapid market changes such as a stock market crash.
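    As a rough illustration of how a directional hit rate such as the 55.1% figure above might be measured, the sketch below compares the sign of each predicted next-day move with the sign of the actual move. It is not the author's system: the function name `directional_accuracy` and the sample prices are hypothetical, and the predicted closes are assumed to come from some regression model that is not shown here.

```python
def directional_accuracy(prev_close, predicted_close, actual_close):
    """Fraction of days on which the predicted move (up, down, or flat)
    has the same direction as the actual move.

    All three arguments are equal-length sequences: the close on day t,
    the model's prediction for day t+1, and the realised close on day t+1.
    """
    if not (len(prev_close) == len(predicted_close) == len(actual_close)):
        raise ValueError("all inputs must have the same length")
    if len(prev_close) == 0:
        raise ValueError("inputs must not be empty")

    def sign(x):
        # Returns 1, -1, or 0 for positive, negative, or zero moves.
        return (x > 0) - (x < 0)

    hits = 0
    for prev, pred, actual in zip(prev_close, predicted_close, actual_close):
        if sign(pred - prev) == sign(actual - prev):
            hits += 1
    return hits / len(prev_close)


# Hypothetical prices, for illustration only.
prev = [100.0, 100.0, 100.0, 100.0]
pred = [101.0, 99.0, 102.0, 98.0]
real = [102.0, 98.0, 99.0, 99.0]
print(directional_accuracy(prev, pred, real))
```

    Only the direction of each move is scored, so a prediction that is far off in magnitude still counts as a hit if it lands on the correct side of the previous close.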

    Predicting Startup Success Using Publicly Available Data

    Predicting the success of an early-stage startup has always been a major effort for investors and venture funds. Statistically, there are about 305 million total startups created in a year, but fewer than 10% of them succeed in becoming profitable businesses. Accurately identifying the signs of startup growth is the work of countless investors, and in recent years, research has turned to machine learning in hopes of improving the accuracy and speed of startup success prediction. To learn about a startup, investors have to navigate many different internet sources and often rely on personal intuition to determine the startup’s potential and likelihood of success. This thesis explores whether online data about a company, particularly general company data, previous funding events, published news articles, internet presence, and social media activity, can be used to identify fast-growing startups. Data collected from Crunchbase, the Google Search API, and Twitter was used to predict whether a company will raise a round of funding within a fixed time horizon. A total of ten machine learning models were evaluated, and the CatBoost ensemble method achieved the best performance, with precision, recall, and F1 scores of 0.663, 0.827, and 0.736 respectively for predicting funding within 3 years. The same ensemble method achieved F1 scores of 0.528, 0.683, 0.736, 0.763, and 0.777 at predicting funding 1-5 years into the future. The final objective was to predict whether a startup that had already raised an angel or seed round would raise another investment within a one-year horizon. The CatBoost model with a 0.75 cutoff achieved precision and F0.1 scores of 0.790 and 0.774, beating the results of previous work in this field.
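    For readers unfamiliar with the metrics quoted above, the following minimal sketch shows how F1 and F0.1 relate to precision and recall through the standard F-beta formula. The helper name `f_beta` is hypothetical and is not taken from the thesis. The recall behind the reported F0.1 score is not stated in the abstract, so only the 3-year F1 figure is recomputed here.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score: a weighted harmonic mean of precision and recall.

    beta < 1 weights precision more heavily (e.g. beta = 0.1);
    beta = 1 gives the usual F1 score.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # Conventional value when both inputs are zero.
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall reported in the abstract for the 3-year horizon.
f1 = f_beta(0.663, 0.827)
print(round(f1, 3))  # agrees with the reported F1 of 0.736
```

    With beta = 0.1, the score is dominated by precision, which is consistent with the abstract reporting precision and F0.1 together for the one-year follow-on funding task.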

    Can Deep Learning Techniques Improve the Risk Adjusted Returns from Enhanced Indexing Investment Strategies

    Deep learning techniques have been widely applied in the field of stock market prediction, particularly with respect to the implementation of active trading strategies. However, the area of portfolio management, and passive portfolio management in particular, has been much less well served by research to date. This research project investigates the science underlying the implementation of portfolio management strategies in practice, focusing on enhanced indexing strategies. Enhanced indexing is a passive management approach that introduces an element of active management, with the aim of achieving a level of active return through small adjustments to the portfolio weights. The project then investigates current applications of deep learning techniques in the field of financial market prediction and in the specific area of portfolio management. A series of successively deeper neural network models was then developed and assessed in terms of their ability to accurately predict whether a sample of stocks would outperform or underperform the selected benchmark index. The predictions generated by these models were then used to guide the adjustment of portfolio weightings in order to implement and forward test an enhanced indexing strategy on a hypothetical stock portfolio.
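    The idea of making small adjustments to benchmark weights based on outperform/underperform predictions can be sketched as below. This is one simple tilting rule under stated assumptions, not the method used in the research project: the function name `tilt_weights`, the 10% tilt size, the tickers, and the multiplicative rule followed by renormalisation are all hypothetical choices made for illustration.

```python
def tilt_weights(benchmark_weights, predicted_outperform, tilt=0.10):
    """Scale each benchmark weight up or down by `tilt`, then renormalise.

    benchmark_weights: dict mapping ticker -> benchmark weight (summing to 1).
    predicted_outperform: dict mapping ticker -> True if the model predicts
        the stock will outperform the benchmark, False otherwise.
    tilt: relative size of the adjustment (assumed value, 10% by default).
    """
    tilted = {}
    for ticker, weight in benchmark_weights.items():
        factor = 1.0 + tilt if predicted_outperform[ticker] else 1.0 - tilt
        tilted[ticker] = weight * factor
    total = sum(tilted.values())
    # Renormalise so the portfolio remains fully invested.
    return {ticker: weight / total for ticker, weight in tilted.items()}


# Hypothetical three-stock benchmark, for illustration only.
benchmark = {"AAA": 0.5, "BBB": 0.3, "CCC": 0.2}
predictions = {"AAA": True, "BBB": False, "CCC": True}
print(tilt_weights(benchmark, predictions))
```

    Keeping the tilt small leaves the portfolio close to the benchmark, which matches the description of enhanced indexing as a passive approach with a limited active component.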