594 research outputs found

    The Stock Exchange Prediction using Machine Learning Techniques: A Comprehensive and Systematic Literature Review

    Get PDF
    This literature review identifies and analyzes research topic trends, types of data sets, learning algorithm, methods improvements, and frameworks used in stock exchange prediction. A total of 81 studies were investigated, which were published regarding stock predictions in the period January 2015 to June 2020 which took into account the inclusion and exclusion criteria. The literature review methodology is carried out in three major phases: review planning, implementation, and report preparation, in nine steps from defining systematic review requirements to presentation of results. Estimation or regression, clustering, association, classification, and preprocessing analysis of data sets are the five main focuses revealed in the main study of stock prediction research. The classification method gets a share of 35.80% from related studies, the estimation method is 56.79%, data analytics is 4.94%, the rest is clustering and association is 1.23%. Furthermore, the use of the technical indicator data set is 74.07%, the rest are combinations of datasets. To develop a stock prediction model 48 different methods have been applied, 9 of the most widely applied methods were identified. The best method in terms of accuracy and also small error rate such as SVM, DNN, CNN, RNN, LSTM, bagging ensembles such as RF, boosting ensembles such as XGBoost, ensemble majority vote and the meta-learner approach is ensemble Stacking. Several techniques are proposed to improve prediction accuracy by combining several methods, using boosting algorithms, adding feature selection and using parameter and hyper-parameter optimization

    An academic review: applications of data mining techniques in finance industry

    Get PDF
    With the development of Internet techniques, data volumes are doubling every two years, faster than predicted by Moore’s Law. Big Data Analytics becomes particularly important for enterprise business. Modern computational technologies will provide effective tools to help understand hugely accumulated data and leverage this information to get insights into the finance industry. In order to get actionable insights into the business, data has become most valuable asset of financial organisations, as there are no physical products in finance industry to manufacture. This is where data mining techniques come to their rescue by allowing access to the right information at the right time. These techniques are used by the finance industry in various areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering, marketing and prediction of price movements to name a few. This work aims to survey the research on data mining techniques applied to the finance industry from 2010 to 2015.The review finds that Stock prediction and Credit rating have received most attention of researchers, compared to Loan prediction, Money Laundering and Time Series prediction. Due to the dynamics, uncertainty and variety of data, nonlinear mapping techniques have been deeply studied than linear techniques. Also it has been proved that hybrid methods are more accurate in prediction, closely followed by Neural Network technique. This survey could provide a clue of applications of data mining techniques for finance industry, and a summary of methodologies for researchers in this area. Especially, it could provide a good vision of Data Mining Techniques in computational finance for beginners who want to work in the field of computational finance

    Modifying Hidden Layer in Neural Network Models to Improve Prediction Accuracy: A Combined Model for Estimating Stock Price

    Get PDF
    Investment experts, who deal with stock price estimation, commonly look for the most accurate and appropriate statistical techniques to make decisions on investment. The aim of this study is to improve the accuracy of stock price prediction models through modifying the structure of a combined neural network model with time-series data, in which the main contribution is to insert the time-series analysis prediction into the hidden layer of the neural network. The proposed structure is made up of neural networks and time-series analysis, with variable reduction used to remove attributes with inter-correlations. Data has been collected over six years (72 months) from the Iranian stock market, including the number of trades, new-coin price, gold-18 price, US Dollar and Euro equivalent currencies, oil-index price, Brent-oil price, industry index, and balanced stock index, followed by developing the prediction models. Comparing the performance criteria of the proposed structure to the traditional ones in terms of the mean square and mean absolute errors revealed that inserting time-series estimated variables into hidden layers would improve the performance of neural network models to estimate stock prices for making investment decisions. Doi: 10.28991/HIJ-2022-03-01-05 Full Text: PD

    Technical and Fundamental Features Analysis for Stock Market Prediction with Data Mining Methods

    Get PDF
    Predicting stock prices is an essential objective in the financial world. Forecasting stock returns and their risk represents one of the most critical concerns of market decision makers. This thesis investigates the stock price forecasting with three approaches from the data mining concept and shows how different elements in the stock price can help to enhance the accuracy of our prediction. For this reason, the first and second approaches capture many fundamental indicators from the stocks and implement them as explanatory variables to do stock price classification and forecasting. In the third approach, technical features from the candlestick representation of the share prices are extracted and used to enhance the accuracy of the forecasting. In each approach, different tools and techniques from data mining and machine learning are employed to justify why the forecasting is working. Furthermore, since the idea is to evaluate the potential of features in the stock trend forecasting, therefore we diversify our experiments using both technical and fundamental features. Therefore, in the first approach, a three-stage methodology is developed while in the first step, a comprehensive investigation of all possible features which can be effective on stocks risk and return are identified. Then, in the next stage, risk and return are predicted by applying data mining techniques for the given features. Finally, we develop a hybrid algorithm, based on some filters and function-based clustering; and re-predicted the risk and return of stocks. In the second approach, instead of using single classifiers, a fusion model is proposed based on the use of multiple diverse base classifiers that operate on a common input and a meta-classifier that learns from base classifiers’ outputs to obtain a more precise stock return and risk predictions. A set of diversity methods, including Bagging, Boosting, and AdaBoost, is applied to create diversity in classifier combinations. Moreover, the number and procedure for selecting base classifiers for fusion schemes are determined using a methodology based on dataset clustering and candidate classifiers’ accuracy. Finally, in the third approach, a novel forecasting model for stock markets based on the wrapper ANFIS (Adaptive Neural Fuzzy Inference System) – ICA (Imperialist Competitive Algorithm) and technical analysis of Japanese Candlestick is presented. Two approaches of Raw-based and Signal-based are devised to extract the model’s input variables and buy and sell signals are considered as output variables. To illustrate the methodologies, for the first and second approaches, Tehran Stock Exchange (TSE) data for the period from 2002 to 2012 are applied, while for the third approach, we used General Motors and Dow Jones indexes.Predicting stock prices is an essential objective in the financial world. Forecasting stock returns and their risk represents one of the most critical concerns of market decision makers. This thesis investigates the stock price forecasting with three approaches from the data mining concept and shows how different elements in the stock price can help to enhance the accuracy of our prediction. For this reason, the first and second approaches capture many fundamental indicators from the stocks and implement them as explanatory variables to do stock price classification and forecasting. In the third approach, technical features from the candlestick representation of the share prices are extracted and used to enhance the accuracy of the forecasting. In each approach, different tools and techniques from data mining and machine learning are employed to justify why the forecasting is working. Furthermore, since the idea is to evaluate the potential of features in the stock trend forecasting, therefore we diversify our experiments using both technical and fundamental features. Therefore, in the first approach, a three-stage methodology is developed while in the first step, a comprehensive investigation of all possible features which can be effective on stocks risk and return are identified. Then, in the next stage, risk and return are predicted by applying data mining techniques for the given features. Finally, we develop a hybrid algorithm, based on some filters and function-based clustering; and re-predicted the risk and return of stocks. In the second approach, instead of using single classifiers, a fusion model is proposed based on the use of multiple diverse base classifiers that operate on a common input and a meta-classifier that learns from base classifiers’ outputs to obtain a more precise stock return and risk predictions. A set of diversity methods, including Bagging, Boosting, and AdaBoost, is applied to create diversity in classifier combinations. Moreover, the number and procedure for selecting base classifiers for fusion schemes are determined using a methodology based on dataset clustering and candidate classifiers’ accuracy. Finally, in the third approach, a novel forecasting model for stock markets based on the wrapper ANFIS (Adaptive Neural Fuzzy Inference System) – ICA (Imperialist Competitive Algorithm) and technical analysis of Japanese Candlestick is presented. Two approaches of Raw-based and Signal-based are devised to extract the model’s input variables and buy and sell signals are considered as output variables. To illustrate the methodologies, for the first and second approaches, Tehran Stock Exchange (TSE) data for the period from 2002 to 2012 are applied, while for the third approach, we used General Motors and Dow Jones indexes.154 - Katedra financívyhově

    An Ensemble Classifier for Stock Trend Prediction Using Sentence-Level Chinese News Sentiment and Technical Indicators

    Get PDF
    In the financial market, predicting stock trends based on stock market news is a challenging task, and researchers are devoted to developing forecasting models. From the existing literature, the performance of the forecasting model is better when news sentiment and technical analysis are considered than when only one of them is used. However, analyzing news sentiment for trend forecasting is a difficult task, especially for Chinese news, because it is unstructured data and extracting the most important features is difficult. Moreover, positive or negative news does not always affect stock prices in a certain way. Therefore, in this paper, we propose an approach to build an ensemble classifier using sentiment in Chinese news at sentence level and technical indicators to predict stock trends. In the training stages, we first divide each news item into a set of sentences. TextRank and word2vec are then used to generate a predefined number of key sentences. The sentiment scores of these key sentences are computed using the given financial lexicon. The sentiment values of the key phrases, the three values of the technical indicators and the stock trend label are merged as a training instance. Based on the sentiment values of the key sets, the corpora are divided into positive and negative news datasets. The two datasets formed are then used to build positive and negative stock trend prediction models using the support vector machine. To increase the reliability of the prediction model, a third classifier is created using the Bollinger Bands. These three classifiers are combined to form an ensemble classifier. In the testing phase, a voting mechanism is used with the trained ensemble classifier to make the final decision based on the trading signals generated by the three classifiers. Finally, experiments were conducted on five years of news and stock prices of one company to show the effectiveness of the proposed approach, and results show that the accuracy and P / L ratio of the proposed approach are 61% and 4.0821 are better than the existing approach

    Data mining in computational finance

    Get PDF
    Computational finance is a relatively new discipline whose birth can be traced back to early 1950s. Its major objective is to develop and study practical models focusing on techniques that apply directly to financial analyses. The large number of decisions and computationally intensive problems involved in this discipline make data mining and machine learning models an integral part to improve, automate, and expand the current processes. One of the objectives of this research is to present a state-of-the-art of the data mining and machine learning techniques applied in the core areas of computational finance. Next, detailed analysis of public and private finance datasets is performed in an attempt to find interesting facts from data and draw conclusions regarding the usefulness of features within the datasets. Credit risk evaluation is one of the crucial modern concerns in this field. Credit scoring is essentially a classification problem where models are built using the information about past applicants to categorise new applicants as ‘creditworthy’ or ‘non-creditworthy’. We appraise the performance of a few classical machine learning algorithms for the problem of credit scoring. Typically, credit scoring databases are large and characterised by redundant and irrelevant features, making the classification task more computationally-demanding. Feature selection is the process of selecting an optimal subset of relevant features. We propose an improved information-gain directed wrapper feature selection method using genetic algorithms and successfully evaluate its effectiveness against baseline and generic wrapper methods using three benchmark datasets. One of the tasks of financial analysts is to estimate a company’s worth. In the last piece of work, this study predicts the growth rate for earnings of companies using three machine learning techniques. We employed the technique of lagged features, which allowed varying amounts of recent history to be brought into the prediction task, and transformed the time series forecasting problem into a supervised learning problem. This work was applied on a private time series dataset
    • …
    corecore