350 research outputs found

    Using machine learning techniques for early cost prediction of structural systems of buildings

    Thesis (Doctoral)--İzmir Institute of Technology, Architecture, İzmir, 2005. Includes bibliographical references (leaves: 111). Text in English; abstract in Turkish and English. x, 111 leaves. It is desirable to predict construction costs in the early design stages in order to make sure that target costs are met and competitive prices are realized. This study investigates the possibility of predicting the cost of construction early in the design phase by using machine learning (ML) techniques. To achieve this objective, artificial neural network (ANN) and case based reasoning (CBR) prediction models were developed in a spreadsheet-based format. An investigation of the impacts of weight generation methods on the ANN and CBR models was conducted. The performance of the ANN model was enhanced by experimenting with the weight generation methods of simplex optimization, back propagation training, and genetic algorithms, while the CBR model was augmented by feature counting, gradient descent, genetic algorithms (GA), and the decision tree methods of binary-dtree, info-top and info-dtree. Cost data belonging to the superstructure of low-rise residential buildings were used to test these models. It was found that both approaches were capable of providing high prediction accuracy: 96% for ANN using simplex optimization for weight determination, and 84% for CBR using GA for attribute weight selection. A comparison of the Excel-based ANN and CBR models was made in terms of prediction accuracy, preprocessing effort, explanatory value, improvement potential and ease of use. The study demonstrated the practicality of using spreadsheets in developing ANN and CBR models for use in construction management as well as the potential benefits of enhancing ANN and CBR models by using different weight generation methods.
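The case-based reasoning idea in this abstract (retrieve the most similar past cases under attribute weights, then reuse their costs) can be sketched roughly as follows. This is a minimal illustration, not the thesis's spreadsheet model: the function names, the two normalised features, the weights and the costs are all hypothetical values made up for the example, and the weights are fixed by hand rather than found by a genetic algorithm.

```python
def weighted_distance(a, b, weights):
    """Weighted L1 distance between two feature vectors scaled to [0, 1]."""
    return sum(w * abs(x - y) for x, y, w in zip(a, b, weights))


def predict_cost(query, cases, weights, k=1):
    """Average the costs of the k stored cases closest to the query."""
    ranked = sorted(cases, key=lambda case: weighted_distance(query, case[0], weights))
    nearest = ranked[:k]
    return sum(cost for _, cost in nearest) / len(nearest)


# Hypothetical case base: (normalised features, known cost). Not data from the thesis.
cases = [
    ((0.2, 0.5), 100.0),
    ((0.8, 0.4), 180.0),
    ((0.3, 0.9), 120.0),
]
weights = (0.7, 0.3)  # assumed attribute weights; a GA would search for these

print(predict_cost((0.25, 0.6), cases, weights, k=1))  # 100.0
print(predict_cost((0.25, 0.6), cases, weights, k=2))  # 110.0
```

Changing the weights changes which cases count as similar, which is why the choice of weight generation method can affect prediction accuracy.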

    A Comprehensive Survey on Enterprise Financial Risk Analysis: Problems, Methods, Spotlights and Applications

    Enterprise financial risk analysis aims at predicting the enterprises' future financial risk. Due to its wide application, enterprise financial risk analysis has always been a core research issue in finance. Although there are already some valuable and impressive surveys on risk management, these surveys introduce approaches in a relatively isolated way and lack the recent advances in enterprise financial risk analysis. Due to the rapid expansion of enterprise financial risk analysis, especially from the computer science and big data perspective, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing enterprise financial risk research, as well as to summarize and interpret the mechanisms and the strategies of enterprise financial risk analysis in a comprehensive way, which may help readers have a better understanding of the current research status and ideas. This paper provides a systematic literature review of over 300 articles published on enterprise risk analysis modelling over a 50-year period, 1968 to 2022. We first introduce the formal definition of enterprise risk as well as the related concepts. Then, we categorize the representative works in terms of risk type and summarize the three aspects of risk analysis. Finally, we compare the analysis methods used to model enterprise financial risk. Our goal is to clarify current cutting-edge research and its possible future directions to model enterprise risk, aiming to fully understand the mechanisms of enterprise risk communication and influence and its application to corporate governance, financial institutions and government regulation.

    A multiagent system for the analysis of sequence data

    The analysis of sequence data requires the processing of the data obtained from sequencers for their subsequent comparison with genomes. The information recovered from the sequencers must be assembled and aligned in order to recover the variations that exist in the patient DNA. This study proposes a system to detect and classify variations by integrating information taken from biomedical databases. The system incorporates different algorithms to search for differences as compared to the reference genome for patients

    Ensemble methods for meningitis aetiology diagnosis

    In this work, we explore data-driven techniques for the fast and early diagnosis of the etiological origin of meningitis, more specifically with regard to differentiating between viral and bacterial meningitis. We study how machine learning can be used to predict meningitis aetiology once a patient has been diagnosed with this disease. We have a dataset of 26,228 patients described by 19 attributes, mainly about the patient's observable symptoms and the early results of the cerebrospinal fluid analysis. Using this dataset, we have explored several techniques of dataset sampling, feature selection and classification models based both on ensemble methods and on simple techniques (mainly, decision trees). Experiments with 27 classification models (19 of them involving ensemble methods) have been conducted for this paper. Our main finding is that the combination of ensemble methods with decision trees leads to the best meningitis aetiology classifiers. The best performance indicator values (precision, recall and f-measure of 89% and an AUC value of 95%) have been achieved by the synergy between bagging and NBTrees. Nonetheless, our results also suggest that the combination of ensemble methods with certain decision trees clearly improves diagnostic performance in comparison with that obtained with only the corresponding decision tree. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. We would like to thank the Health Department of the Brazilian Government for providing the dataset and for authorizing its use in this study. We would also like to express our gratitude to the reviewers for their thoughtful comments and efforts towards improving our manuscript. Funding for open access charge: Universidad de Málaga / CBUA
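Bagging, the ensemble method this abstract pairs with decision trees, trains each base model on a bootstrap resample of the data and combines the models by majority vote. The sketch below illustrates that mechanism only. It assumes a one-feature threshold rule (a decision stump) as a stand-in base learner rather than the NBTree used in the paper, and the feature values and labels are invented toy data, not the Brazilian dataset.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a threshold rule on one feature: left label if x <= t, else right label."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        l_label = Counter(left).most_common(1)[0][0]
        r_label = Counter(right).most_common(1)[0][0] if right else l_label
        errors = sum(y != l_label for y in left) + sum(y != r_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_label, r_label)
    _, t, l_label, r_label = best
    return lambda x: l_label if x <= t else r_label


def fit_bagging(xs, ys, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap resample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict(models, x):
    """Majority vote over the ensemble."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Invented toy data: one hypothetical lab value per patient.
xs = [1.0, 1.2, 1.5, 2.0, 7.0, 7.5, 8.0, 9.0]
ys = ["viral"] * 4 + ["bacterial"] * 4

models = fit_bagging(xs, ys)
print(predict(models, 0.5), predict(models, 10.0))
```

Because each stump sees a different resample, their thresholds differ slightly, and the vote averages out that variance.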

    An Intelligent Framework for Estimating Software Development Projects using Machine Learning

    The IT industry has faced many challenges related to software effort and cost estimation. A cost assessment is conducted after software effort estimation, which benefits customers as well as developers. The purpose of this paper is to discuss various methods for the estimation of software effort and cost in the context of software engineering, such as algorithmic methods, expert judgment methods, analogy-based estimation methods, and machine learning methods, as well as their different aspects. Despite these methods, estimates of the effort involved in software development remain subject to uncertainty. Several methods have been developed in the literature for improving estimation accuracy, many of which involve the use of machine learning techniques. A machine learning framework is proposed in this paper to address this challenging problem. In addition to being completely independent of algorithmic models and estimation problems, this framework also features a modular architecture. It has high interpretability, learning capability, and robustness to imprecise and uncertain inputs.

    Personalized Finance Advisory through Case-based Recommender Systems and Diversification Strategies

    Recommendation of financial investment strategies is a complex and knowledge-intensive task. Typically, financial advisors have to discuss at length with their wealthy clients and have to sift through several investment proposals before finding one able to completely meet investors' needs and constraints. As a consequence, a recent trend in wealth management is to improve the advisory process by exploiting recommendation technologies. This paper proposes a framework for recommendation of asset allocation strategies which combines case-based reasoning with a novel diversification strategy to support financial advisors in the task of proposing diverse and personalized investment portfolios. The performance of the framework has been evaluated by means of an experimental session conducted with 1172 real users, and results show that the yield obtained by recommended portfolios exceeds that of portfolios proposed by human advisors in most experimental settings while meeting the preferred risk profile. Furthermore, our diversification strategy shows promising results in terms of both diversity and average yield.

    Strength Predictive Modelling of Soils Treated with Calcium-Based Additives Blended with Eco-Friendly Pozzolans—A Machine Learning Approach

    The unconfined compressive strength (UCS) of a stabilised soil is a major mechanical parameter in understanding and developing geomechanical models, and it can be estimated directly by lab testing of either retrieved core samples or remoulded samples. However, due to the effort, high cost and time associated with these methods, there is a need to develop a new technique for predicting UCS values in real time. An artificial intelligence paradigm of machine learning (ML) using the gradient boosting (GB) technique is applied in this study to model the unconfined compressive strength of soils stabilised by cementitious additive-enriched agro-based pozzolans. Both ML regression and multinomial classification of the UCS of the stabilised mix are investigated. Rigorous sensitivity-driven diagnostic testing is also performed to validate and provide an understanding of the intricacies of the decisions made by the algorithm. Results indicate that the well-tuned and optimised GB algorithm has a very high capacity to distinguish between positive and negative UCS categories (‘firm’, ‘very stiff’ and ‘hard’). An overall accuracy of 0.920, and weighted recall rates and precision scores of 0.920 and 0.938, respectively, were produced by the GB model. Multiclass prediction in this regard shows that only 12.5% of instances were misclassified. When applied to a regression problem, a coefficient of determination of approximately 0.900 and a mean error of about 0.335 were obtained, thus lending further credence to the high performance of the GB algorithm used. Finally, among the eight input features utilised as independent variables, the additives seemed to exhibit the strongest influence on the ML predictive modelling.

    Machine learning for network based intrusion detection: an investigation into discrepancies in findings with the KDD cup '99 data set and multi-objective evolution of neural network classifier ensembles from imbalanced data.

    For the last decade it has become commonplace to evaluate machine learning techniques for network based intrusion detection on the KDD Cup '99 data set. This data set has served well to demonstrate that machine learning can be useful in intrusion detection. However, it has undergone some criticism in the literature, and it is out of date. Therefore, some researchers question the validity of the findings reported based on this data set. Furthermore, as identified in this thesis, there are also discrepancies in the findings reported in the literature. In some cases the results are contradictory. Consequently, it is difficult to analyse the current body of research to determine the value in the findings. This thesis reports on an empirical investigation to determine the underlying causes of the discrepancies. Several methodological factors, such as choice of data subset, validation method and data preprocessing, are identified and are found to affect the results significantly. These findings have also enabled a better interpretation of the current body of research. Furthermore, the criticisms in the literature are addressed and future use of the data set is discussed, which is important since researchers continue to use it due to a lack of better publicly available alternatives. Due to the nature of the intrusion detection domain, there is an extreme imbalance among the classes in the KDD Cup '99 data set, which poses a significant challenge to machine learning. In other domains, researchers have demonstrated that well known techniques such as Artificial Neural Networks (ANNs) and Decision Trees (DTs) often fail to learn the minor class(es) due to class imbalance. However, this has not been recognized as an issue in intrusion detection previously. This thesis reports on an empirical investigation that demonstrates that it is the class imbalance that causes the poor detection of some classes of intrusion reported in the literature. 
An alternative approach to training ANNs is proposed in this thesis, using Genetic Algorithms (GAs) to evolve the weights of the ANNs, referred to as an Evolutionary Neural Network (ENN). When employing evaluation functions that calculate the fitness proportionally to the instances of each class, thereby avoiding a bias towards the major class(es) in the data set, significantly improved true positive rates are obtained whilst maintaining a low false positive rate. These findings demonstrate that the issues of learning from imbalanced data are not due to limitations of the ANNs but rather to the training algorithm. Moreover, the ENN is capable of detecting a class of intrusion that has been reported in the literature to be undetectable by ANNs. One limitation of the ENN is a lack of control over the classification trade-off the ANNs obtain. This is identified as a general issue with current approaches to creating classifiers. Striving to create a single best classifier that obtains the highest accuracy may give an unfruitful classification trade-off, which is demonstrated clearly in this thesis. Therefore, an extension of the ENN is proposed, using a Multi-Objective GA (MOGA), which treats the classification rate on each class as a separate objective. This approach produces a Pareto front of non-dominated solutions that exhibit different classification trade-offs, from which the user can select one with the desired properties. The multi-objective approach is also utilised to evolve classifier ensembles, which yields an improved Pareto front of solutions. Furthermore, the selection of classifier members for the ensembles is investigated, demonstrating how this affects the performance of the resultant ensembles. This is key to explaining why some classifier combinations fail to give fruitful solutions.
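Two ideas from this abstract lend themselves to a short illustration: a fitness that weighs every class equally regardless of its size, and the selection of non-dominated solutions when each class's classification rate is a separate objective. The sketch below is one plausible reading under stated assumptions, not the thesis's actual evaluation functions: the mean of per-class recalls stands in for the class-proportional fitness, and the function names, labels and objective vectors are hypothetical.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Map each class to the fraction of its instances that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def balanced_fitness(y_true, y_pred):
    """Mean of per-class recalls, so a rare class counts as much as a common one."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the objective vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# A classifier that always predicts the major class: 80% accurate, yet blind to attacks.
y_true = ["normal"] * 8 + ["attack"] * 2
y_pred = ["normal"] * 10
print(balanced_fitness(y_true, y_pred))  # 0.5

# Hypothetical (recall on class A, recall on class B) pairs for four candidate classifiers.
candidates = [(0.9, 0.2), (0.7, 0.7), (0.6, 0.6), (0.3, 0.95)]
print(pareto_front(candidates))  # [(0.9, 0.2), (0.7, 0.7), (0.3, 0.95)]
```

The always-majority classifier scores 0.8 on plain accuracy but only 0.5 here, which is the kind of bias a class-proportional fitness is meant to expose. The Pareto front keeps three different trade-offs instead of forcing a single best classifier.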