1,944 research outputs found

    Handling Uncertainty in Social Lending Credit Risk Prediction with a Choquet Fuzzy Integral Model

    As one of the main business models in the financial technology field, peer-to-peer (P2P) lending has disrupted traditional financial services by providing an online platform for lending money that has remarkably reduced financial costs. However, the inherent uncertainty in P2P loans can result in huge financial losses for P2P platforms. Therefore, accurate risk prediction is critical to the success of P2P lending platforms. Indeed, even a small improvement in credit risk prediction would be of benefit to P2P lending platforms. This paper proposes an innovative credit risk prediction framework that fuses base classifiers based on a Choquet fuzzy integral. Choquet integral fusion improves creditworthiness evaluations by synthesizing the prediction results of multiple classifiers and finding the largest consistency between outcomes among conflicting and consistent results. The proposed model was validated through experimental analysis on a real-world dataset from a well-known P2P lending marketplace. The empirical results indicate that the combination of multiple classifiers based on fuzzy Choquet integrals outperforms the best base classifiers used in credit risk prediction to date. In addition, the proposed methodology is superior to some conventional combination techniques.
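The fusion step described in this abstract can be illustrated with a minimal sketch of the discrete Choquet integral. This is not the paper's implementation: the classifier names, scores, and fuzzy measure values are hypothetical, and `additive_measure` is an illustrative helper that builds the simplest possible measure (one with no interaction between classifiers).

```python
from itertools import combinations


def choquet_integral(scores, measure):
    """Discrete Choquet integral.

    scores:  dict mapping each source (classifier) to a value in [0, 1].
    measure: dict mapping each non-empty frozenset of sources to its
             fuzzy measure in [0, 1] (monotone, full set = 1).
    """
    # Order the sources from highest to lowest score.
    ordered = sorted(scores, key=scores.get, reverse=True)
    total = 0.0
    for i, source in enumerate(ordered):
        # Score of the next-ranked source, or 0 after the last one.
        next_score = scores[ordered[i + 1]] if i + 1 < len(ordered) else 0.0
        # Coalition of the i+1 highest-scoring sources.
        coalition = frozenset(ordered[: i + 1])
        total += (scores[source] - next_score) * measure[coalition]
    return total


def additive_measure(weights):
    """Hypothetical helper: additive measure built from per-source weights."""
    sources = list(weights)
    return {
        frozenset(subset): sum(weights[s] for s in subset)
        for r in range(1, len(sources) + 1)
        for subset in combinations(sources, r)
    }


# Hypothetical default-probability scores from three base classifiers.
scores = {"logreg": 0.9, "forest": 0.6, "svm": 0.3}
measure = additive_measure({"logreg": 0.5, "forest": 0.3, "svm": 0.2})
fused = choquet_integral(scores, measure)
```

With an additive measure the integral reduces to a weighted average of the scores. The technique becomes more expressive when the measure of a coalition differs from the sum of its members' weights, which lets the fusion reward or discount agreement between particular classifiers.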

    One-Class Classification: Taxonomy of Study and Review of Techniques

    One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled or not well defined. This situation constrains the learning of efficient classifiers, because the class boundary must be defined using only knowledge of the positive class. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, the algorithms used and the application domains. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of OCC algorithms, techniques and methodologies with a focus on their significance, limitations and applications. We conclude our paper by discussing some open research problems in the field of OCC and present our vision for future research. Comment: 24 pages + 11 pages of references, 8 figures
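As a rough illustration of learning a class boundary from positive examples alone, the sketch below fits per-feature statistics on the positive class and accepts a new point if its standardized distance falls within a threshold taken from the training data. It is one simple density-style approach chosen for brevity, not a method taken from the review; the function names, the `quantile` parameter, and the sample data are all assumptions.

```python
import math
import statistics


def _distance(x, means, stds):
    """Euclidean distance after standardizing each feature."""
    return math.sqrt(sum(((xi - m) / s) ** 2 for xi, m, s in zip(x, means, stds)))


def fit_one_class(positives, quantile=0.95):
    """Learn a boundary from positive samples only.

    Returns (means, stds, threshold); the threshold is the distance at the
    given quantile of the training points' own distances.
    """
    dims = list(zip(*positives))
    means = [statistics.fmean(d) for d in dims]
    # Fall back to 1.0 when a feature has zero variance.
    stds = [statistics.pstdev(d) or 1.0 for d in dims]
    dists = sorted(_distance(x, means, stds) for x in positives)
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return means, stds, dists[idx]


def is_target(x, model):
    """True if x lies inside the learned boundary of the positive class."""
    means, stds, threshold = model
    return _distance(x, means, stds) <= threshold


# Hypothetical positive-class samples; no negative examples are used.
positives = [(1.0, 2.0), (1.1, 1.9), (0.9, 2.1), (1.0, 2.05), (1.05, 1.95)]
model = fit_one_class(positives)
```

Lowering `quantile` tightens the boundary at the cost of rejecting more genuine positives, which is the basic trade-off any one-class method has to manage without negative data to calibrate against.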

    Rails Quality Data Modelling via Machine Learning-Based Paradigms


    Cost-Sensitive Learning-based Methods for Imbalanced Classification Problems with Applications

    Analysis and predictive modeling of massive datasets is an extremely significant problem that arises in many practical applications. The task of predictive modeling becomes even more challenging when data are imperfect or uncertain. The real data are frequently affected by outliers, uncertain labels, and uneven distribution of classes (imbalanced data). Such uncertainties create bias and make predictive modeling an even more difficult task. In the present work, we introduce a cost-sensitive learning method (CSL) to deal with the classification of imperfect data. Typically, most traditional approaches for classification demonstrate poor performance in an environment with imperfect data. We propose the use of CSL with Support Vector Machine, which is a well-known data mining algorithm. The results reveal that the proposed algorithm produces more accurate classifiers and is more robust with respect to imperfect data. Furthermore, we explore the best performance measures to tackle imperfect data along with addressing real problems in quality control and business analytics.
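One common way to make a Support Vector Machine cost-sensitive is to weight each class's margin violations by a separate misclassification cost. The sketch below evaluates such a weighted hinge objective for a linear model. The abstract does not give the authors' exact formulation, so the function name, the cost parameters, and the toy data are assumptions used only to show the idea.

```python
def cost_sensitive_hinge_loss(w, b, X, y, cost_pos=1.0, cost_neg=1.0, reg=0.0):
    """Regularized hinge loss with a separate cost per class.

    w, b:     weights and bias of a linear classifier.
    X, y:     feature vectors and labels in {+1, -1}.
    cost_pos: penalty multiplier for margin violations on +1 samples.
    cost_neg: penalty multiplier for margin violations on -1 samples.
    reg:      L2 regularization strength.
    """
    loss = 0.5 * reg * sum(wi * wi for wi in w)
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        margin = yi * score
        cost = cost_pos if yi == 1 else cost_neg
        # Only samples inside the margin or misclassified contribute.
        loss += cost * max(0.0, 1.0 - margin)
    return loss


# Toy data: one sample from each class at the same location.
X = [[0.5], [0.5]]
y = [1, -1]
equal = cost_sensitive_hinge_loss([1.0], 0.0, X, y)
skewed = cost_sensitive_hinge_loss([1.0], 0.0, X, y, cost_pos=10.0)
```

Raising `cost_pos` makes errors on the positive (typically minority) class dominate the objective, so an optimizer minimizing it shifts the decision boundary to protect that class.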

    Foundation and methodologies in computer-aided diagnosis systems for breast cancer detection

    Breast cancer is the most prevalent cancer affecting women worldwide. Early detection and treatment of breast cancer could reduce the mortality rate. Some issues, such as technical factors related to imaging quality and human error, increase the misdiagnosis of breast cancer by radiologists. Computer-aided detection systems (CADs) have been developed to overcome these restrictions and have been studied in many imaging modalities for breast cancer detection in recent years. CAD systems improve radiologists’ performance in finding and discriminating between normal and abnormal tissues. These systems act only as a double reader, and the final decisions are still made by the radiologist. In this study, recent CAD systems for breast cancer detection on different modalities such as mammography, ultrasound, MRI, and biopsy histopathological images are introduced. CAD systems generally consist of four stages: pre-processing, segmentation, feature extraction, and classification. The approaches applied to design the different stages of a CAD system are summarised. Advantages and disadvantages of different segmentation, feature extraction and classification techniques are listed. In addition, the impact of imbalanced datasets on classification outcomes and appropriate methods to address it are discussed. Performance evaluation metrics for the various stages of breast cancer detection CAD systems are also reviewed.

    Dealing with imbalanced and weakly labelled data in machine learning using fuzzy and rough set methods


    An academic review: applications of data mining techniques in finance industry

    With the development of Internet techniques, data volumes are doubling every two years, faster than predicted by Moore’s Law. Big Data Analytics has become particularly important for enterprise business. Modern computational technologies provide effective tools to help understand the huge volumes of accumulated data and leverage this information to gain insights into the finance industry. Because the finance industry has no physical products to manufacture, data has become the most valuable asset of financial organisations and the source of actionable business insights. Data mining techniques come to their rescue by allowing access to the right information at the right time. These techniques are used by the finance industry in various areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering, marketing and prediction of price movements, to name a few. This work surveys the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared to loan prediction, money laundering and time series prediction. Due to the dynamics, uncertainty and variety of data, nonlinear mapping techniques have been studied more deeply than linear techniques. It has also been shown that hybrid methods are more accurate in prediction, closely followed by the neural network technique. This survey could provide a guide to applications of data mining techniques in the finance industry, and a summary of methodologies for researchers in this area. In particular, it could give beginners who want to work in computational finance a good overview of data mining techniques in the field.