828 research outputs found

    Developing a Machine Learning based Systematic Investment Strategy: A case study for the Construction Industry

    This research implements an end-to-end systematic investment strategy that combines machine learning models with knowledge of the construction industry’s operational and management practices. First, a literature review in the field of behavioral finance presents the current state of knowledge and trends in the industry, and a suitable investment opportunity exploiting prevailing market inefficiencies around earnings announcements is identified. Second, an extensive literature review identifies the most relevant characteristics of construction companies’ operations and the major risk factors they are exposed to. These insights are used to engineer a set of relevant variables. Third, advanced statistical techniques are used to select the most relevant subset of features, which includes market and analysts’ expectation data, macroeconomic indicators, the delay in reporting earnings, and the most important financial dimensions for construction firms. Fourth, the earnings surprise classification problem is characterized by class imbalance and asymmetric misclassification costs. These issues are a consequence of the desired business application and are addressed by selecting an appropriate evaluation metric. Additionally, the temporal dimension and generative process of the data are considered in order to select an appropriate validation scheme. Five state-of-the-art machine learning algorithms are considered: a multinomial logistic regression, a bagging classifier, a random forest, XGBoost, and a linear Support Vector Machine. The multinomial logistic regression is found to be the most suitable model, exhibiting a bias towards predicting positive earnings surprises over the other classes. Firm size, profitability, and valuation measures, the latter two portrayed by Return on Assets and Enterprise Value multiples, are found to be the most important variables when predicting earnings surprises. To conclude, the systematic investment strategy based on the investment signals produced by the selected machine learning model is back-tested; the performance of the long-short portfolio is driven by the positive-surprise leg as a consequence of the selected model’s bias. Keywords: Quantitative Investing, Machine Learning, Behavioral Finance
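
The abstract above mentions a time-aware validation scheme and an evaluation metric that reflects asymmetric misclassification costs, but it does not say which ones were used. The following is a minimal sketch under assumptions: an expanding-window split (every training period precedes its test period) and an average-cost metric. The function names, the class labels and the cost values are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def expanding_window_splits(
    n_periods: int, min_train: int, test_size: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) period indices; training data always precedes test data."""
    start = min_train
    while start + test_size <= n_periods:
        yield list(range(start)), list(range(start, start + test_size))
        start += test_size


def expected_cost(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    cost: Dict[Tuple[str, str], float],
) -> float:
    """Average misclassification cost; cost[(true, predicted)] is 0 when absent."""
    total = sum(cost.get((t, p), 0.0) for t, p in zip(y_true, y_pred))
    return total / len(y_true)


# Hypothetical costs: a false "pos" signal is assumed to be 5x worse than a missed one.
costs = {("neg", "pos"): 5.0, ("pos", "neg"): 1.0}
splits = list(expanding_window_splits(n_periods=5, min_train=3))
score = expected_cost(["pos", "neg", "neg"], ["pos", "pos", "neg"], costs)
```

With five periods and a minimum of three training periods, the sketch produces two folds, each tested on the period immediately after its training window.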

    A Back Propagation Neural Network Model with the Synthetic Minority Over-Sampling Technique for Construction Company Bankruptcy Prediction

    Improving model accuracy is one of the most frequently addressed issues in bankruptcy prediction. Several previous studies employed artificial neural networks (ANNs) to improve the accuracy with which construction company bankruptcy can be predicted. However, most of these studies use the sample-matching technique and all of the available company quarters or company years in the dataset, resulting in sample selection biases and between-class imbalances. This study integrates a back propagation neural network (BPNN) with the synthetic minority over-sampling technique (SMOTE) and uses all of the available company-year samples during the sample period to improve the accuracy with which bankruptcy in construction companies can be predicted. In addition to eliminating the sample selection biases of sample matching and the between-class imbalance, these methods also achieve high accuracy rates. Furthermore, the approach used in this study identifies the optimal over-sampling times, number of hidden-layer neurons, and learning rate, all of which are major parameters in the BPNN and SMOTE-BPNN models. The traditional BPNN model is provided as a benchmark for evaluating the predictive abilities of the SMOTE-BPNN model. The empirical results of this paper show that the SMOTE-BPNN model outperforms the traditional BPNN.
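
The core idea of SMOTE named in the abstract above can be sketched in a few lines: each synthetic minority sample is an interpolation between a minority sample and one of its nearest minority neighbours. This is a simplified illustration, not the study's implementation. The function name `smote`, the random choice of base samples, and the parameters `n_new`, `k` and `seed` are assumptions; the BPNN itself is not shown.

```python
import math
import random
from typing import List, Sequence


def smote(
    minority: List[Sequence[float]], n_new: int, k: int = 3, seed: int = 0
) -> List[List[float]]:
    """Create n_new synthetic points by interpolating between minority neighbours.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (the base itself excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


# Hypothetical minority class: four "bankrupt" company-years with two features each.
bankrupt = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(bankrupt, n_new=10)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class.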

    Critical factors for insolvency prediction: Towards a theoretical model for the construction industry

    © 2016 Informa UK Limited, trading as Taylor & Francis Group. Many construction industry insolvency prediction model (CI-IPM) studies have arbitrarily employed different insolvency factors, or simply adopted them from previous studies without justification, leading to poorly performing CI-IPMs. This is due to the absence of a framework for selecting relevant factors. To identify the most important insolvency factors for a high-performance CI-IPM, this study used three approaches. Firstly, a systematic review was used to identify all existing factors. Secondly, the frequency of factor use and the accuracy of models in the reviewed studies were analysed to establish the important factors. Finally, using a questionnaire survey of CI professionals, the importance levels of factors were validated using the Cronbach's alpha reliability coefficient and significant index ranking. The findings show that the important quantitative factors are profitability, liquidity, leverage, management efficiency and cash flow. The important qualitative factors are management/owner characteristics, internal strategy, management decision making, macroeconomic firm characteristics and sustainability. These factors, which align with existing insolvency-related theories, including Porter's five competitive forces and Mintzberg's 5Ps (plan, ploy, pattern, position and perspective) of strategy, were used to develop a theoretical framework. This study contributes to the debate on the need to amalgamate qualitative and quantitative factors to develop a valid CI-IPM.
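
For readers unfamiliar with the reliability coefficient named above, Cronbach's alpha for k questionnaire items is k/(k-1) multiplied by (1 - sum of item variances / variance of respondents' total scores). The sketch below computes it with the Python standard library. The function name and the sample scores are hypothetical and are not taken from the study's survey.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(items: List[Sequence[float]]) -> float:
    """Cronbach's alpha; items[i][j] is respondent j's score on item i.

    Needs at least two items, at least two respondents, and total scores
    that are not all identical (otherwise the total variance is zero).
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: three items answered identically by four respondents.
consistent = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
alpha = cronbach_alpha(consistent)
```

Items that move together across respondents push alpha towards 1, while unrelated items push it towards 0.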

    Is the Financial Report Quality Important in the Default Prediction? SME Portuguese Construction Sector Evidence

    This work analyses whether financial information quality is relevant to explaining firms’ probability of default. A financial default prediction model for SMEs (Small and Medium Enterprises) is presented, which includes not only traditional measures but also financial reporting quality (FRQ) measures. FRQ influences decision-making through its impact on financial information, which has repercussions on the informativeness of accounting ratios. A panel data set of 1,560 Portuguese SMEs in the construction sector, from 2012 to 2018, is analysed. First, firms are classified as default or compliant using an ex-ante criterion which allows us to identify signs of financial constraints in advance. Then, the stepwise method is employed to identify which variables are more relevant to explaining the default probability. Results show that FRQ measures, namely accruals quality and timeliness, impact firms’ defaulting, supporting their relevance in predicting financial difficulties. Finally, using a logit approach, the accuracy of the model increased when FRQ variables were included. Results are confirmed using “new age” classifiers, namely the random forest methodology. This work is not only relevant to the extant financial distress literature but also has relevant implications for practice, since stakeholders can understand the impact of financial reporting quality to prevent additional risks.

    A framework for big data analytics approach to failure prediction of construction firms

    This study explored the use of big data analytics (BDA) to analyse data from a large number of construction firms in order to develop a construction business failure prediction model (CB-FPM). Careful analysis of the literature revealed financial ratios as the best form of variable for this problem. Because MapReduce is unsuitable for the iteration problems involved in developing CB-FPMs, various BDA initiatives for iteration problems were identified. A BDA framework for developing CB-FPMs was proposed. It was validated using 150,000 datacells of 30,000 construction firms, an artificial neural network, Amazon Elastic Compute Cloud, Apache Spark and the R software. The BDA CB-FPM was developed in eight seconds, while the same process without BDA was aborted after nine hours without success. This shows that the reluctance to use large datasets to develop CB-FPMs because of long processing times can be resolved by applying BDA techniques. The BDA CB-FPM largely outperformed an ordinary CB-FPM developed with a dataset of 200 construction firms, showing that the use of a larger sample size with the aid of BDA leads to better-performing CB-FPMs. The high financial and social cost associated with misclassifications (i.e. model error) thus makes adoption of BDA CB-FPMs very important for, among others, financiers, clients and policy makers.

    Corporate Bankruptcy Prediction

    Bankruptcy prediction is one of the most important research areas in corporate finance. Bankruptcies are an indispensable element of the functioning of the market economy, and at the same time generate significant losses for stakeholders. Hence, this book was compiled to collect the results of research on the latest trends in predicting the bankruptcy of enterprises. It presents models developed for different countries using both traditional and more advanced methods. Problems connected with predicting bankruptcy during periods of prosperity and recession, the selection of appropriate explanatory variables, and the dynamization of models are presented. The reliability of financial data and the validity of the audit are also addressed. Thus, I hope that this book will inspire you to undertake new research in the field of forecasting the risk of bankruptcy.

    ‘Big data analytics’ for construction firms insolvency prediction models

    In a pioneering effort, this study is the first to develop a construction firms insolvency prediction model (CF-IPM) with Big Data Analytics (BDA); to combine qualitative and quantitative variables; to use advanced artificial intelligence tools such as Random Forest and Bart Machine; and to use data from construction firms (CFs) of all sizes, ensuring wide applicability. The pragmatism paradigm was employed to allow the use of mixed methods. This was necessary to allow the views of the top management team (TMT) of failed and existing construction firms to be captured using a qualitative approach. TMT members of 13 existing and 14 failed CFs were interviewed. The interview results were used to create a questionnaire with over a hundred qualitative variables. A total of 272 and 259 (531) usable questionnaires were returned for existing and failed CFs respectively. The data from the 531 questionnaires were oversampled to obtain a total questionnaire sample of 1052 CFs. The original and matched sample financial data of the firms were downloaded. Using Cronbach’s alpha and factor analysis, the qualitative variables were reduced to 13 (Q1 to Q13), while 11 financial ratios (i.e. quantitative variables) (R1 to R11) reported by large and MSM CFs were identified for the sample CFs. The BDA system was set up with the Amazon Web Services Elastic Compute Cloud using five ‘Instances’ as Hadoop DataNodes and one as NameNode. The NameNode was configured as Spark Master. Eleven variable selection methods and three voting systems were used to select the final seven qualitative and seven quantitative variables, which were used to develop 13 BDA-CF-IPMs. The Decision Tree BDA-CF-IPM was the model of choice in this study because it had high accuracy, low Type I error and transparency. The most important variables (factors) affecting insolvency of construction firms according to the best model are return on total assets; liquidity; solvency ratio; top management characteristics; strategic issues and external relations; finance and conflict related issues; and industry contract/project knowledge.

    Impact of macroeconomic indicators on bankruptcy prediction models: case of the Portuguese construction sector

    The importance of macroeconomic indicators for the performance of bankruptcy prediction models has been a contentious issue, due in part to a lack of empirical evidence. Most indicators are primarily centered around a company's internal environment, overlooking the impact of the economic cycle on the status of the company. This research draws attention to the combination of microeconomic and macroeconomic factors. To do this, a new model based on logistic regression was combined with principal component analysis to determine the indicators that best explained the variations in the dataset studied. The sample used comprised data from 1,832 Portuguese construction companies from 2009 to 2019. The empirical results demonstrated an average accuracy rate of 90% up to three years before bankruptcy. The microeconomic indicators with statistical significance fell within the categories of liquidity ratios and solvency and financial autonomy ratios. Regarding the macroeconomic indicators, including the gross domestic product and the birth rate of enterprises increased the accuracy of bankruptcy prediction compared with using only microeconomic factors. A practical implication of the results obtained is that construction companies, as well as investors, government agencies and banks, can use the suggested model as a decision-support system. Furthermore, consistent use can lead to an effective method of preventing bankruptcy by spotting early warning indicators.

    Do Fraudulent Companies Employ Different Linguistic Features in Their Annual Reports? An Empirical Study Using Logistic Regression and Random Forest Methodologies

    The use of textual analysis to uncover fraudulent actions in 10-K filings is widespread. Previous studies have looked at the Management's Discussion and Analysis (MD&A) section of annual reports to predict illicit behaviour by analysing the tone of executives, with the majority of those studies dating back 10 years or more. The primary goal of this research is to find patterns in the linguistic features of entire annual reports of convicted public businesses, which were found using the Corporate Prosecution Registry database, and compare them to non-fraudulent equivalents in the same industry. Logistic regression and random forest algorithms are implemented to discover important factors and make accurate predictions. The accuracy rate, ROC-AUC value, and 10-fold cross-validation are used to validate the success of each method. The results of the logistic regression revealed that corrupt organisations utilise a more negative, uncertain, and litigious tone. Furthermore, these businesses employ more words with high lexical diversity and minimal complexity. Based on the Random Forest machine learning technique, the litigious variable is the most important variable in the prediction of untruthful corporations. Moreover, each of the validation methods demonstrates that the Random Forest methodology outperforms logistic regression.