Handling Uncertainty in Social Lending Credit Risk Prediction with a Choquet Fuzzy Integral Model
As one of the main business models in the financial technology field,
peer-to-peer (P2P) lending has disrupted traditional financial services by
providing an online platform for lending money that has remarkably reduced
financial costs. However, the inherent uncertainty in P2P loans can result in
huge financial losses for P2P platforms. Therefore, accurate risk prediction is
critical to the success of P2P lending platforms. Indeed, even a small
improvement in credit risk prediction would be of benefit to P2P lending
platforms. This paper proposes an innovative credit risk prediction framework
that fuses base classifiers based on a Choquet fuzzy integral. Choquet integral
fusion improves creditworthiness evaluations by synthesizing the prediction
results of multiple classifiers and finding the largest consistency between
outcomes among conflicting and consistent results. The proposed model was
validated through experimental analysis on a real-world dataset from a
well-known P2P lending marketplace. The empirical results indicate that the
combination of multiple classifiers based on fuzzy Choquet integrals
outperforms the best base classifiers used in credit risk prediction to date.
In addition, the proposed methodology is superior to some conventional
combination techniques.
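The fusion step described in the abstract above can be illustrated with a discrete Choquet integral. This is a minimal sketch, not the paper's implementation: the classifier names (`lr`, `rf`, `nn`), the scores, and the fuzzy-measure values are all hypothetical, and the abstract does not say how the fuzzy measure is obtained, so it is simply written out by hand here.

```python
def choquet_integral(scores, measure):
    """Discrete Choquet integral of classifier scores.

    scores:  dict mapping a classifier name to its score in [0, 1]
    measure: dict mapping frozenset(names) to a fuzzy-measure value in [0, 1],
             with the full set mapped to 1.0 (values assumed for illustration)
    """
    # Order classifiers from highest score to lowest.
    ordered = sorted(scores, key=scores.get, reverse=True)
    total = 0.0
    for i, name in enumerate(ordered):
        # Score of the next classifier in the ordering (0.0 past the end).
        next_score = scores[ordered[i + 1]] if i + 1 < len(ordered) else 0.0
        # Coalition of all classifiers scoring at least as high as this one.
        coalition = frozenset(ordered[: i + 1])
        total += (scores[name] - next_score) * measure[coalition]
    return total


# Hypothetical default-probability scores from three base classifiers.
scores = {"lr": 0.8, "rf": 0.6, "nn": 0.3}

# Hypothetical fuzzy measure: worth of each coalition of classifiers.
measure = {
    frozenset({"lr"}): 0.4,
    frozenset({"rf"}): 0.5,
    frozenset({"nn"}): 0.3,
    frozenset({"lr", "rf"}): 0.8,
    frozenset({"lr", "nn"}): 0.6,
    frozenset({"rf", "nn"}): 0.7,
    frozenset({"lr", "rf", "nn"}): 1.0,
}

fused = choquet_integral(scores, measure)
print(round(fused, 4))
```

When the measure is additive (each coalition is worth the sum of its members' weights), the integral reduces to an ordinary weighted average; a non-additive measure is what lets the fusion reward or discount agreement between particular classifiers.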
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers, because
the class boundary must be defined using only knowledge of the positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and presenting
our vision for future research.
Comment: 24 pages + 11 pages of references, 8 figures
Cost-Sensitive Learning-based Methods for Imbalanced Classification Problems with Applications
Analysis and predictive modeling of massive datasets is an extremely significant problem that arises in many practical applications. The task of predictive modeling becomes even more challenging when data are imperfect or uncertain. Real data are frequently affected by outliers, uncertain labels, and uneven distribution of classes (imbalanced data). Such uncertainties create bias and make predictive modeling an even more difficult task. In the present work, we introduce a cost-sensitive learning method (CSL) to deal with the classification of imperfect data. Typically, most traditional approaches for classification demonstrate poor performance in an environment with imperfect data. We propose the use of CSL with Support Vector Machine, which is a well-known data mining algorithm. The results reveal that the proposed algorithm produces more accurate classifiers and is more robust with respect to imperfect data. Furthermore, we explore the best performance measures to tackle imperfect data along with addressing real problems in quality control and business analytics.
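One common way to make a margin-based classifier such as an SVM cost-sensitive is to weight the hinge loss by a per-class misclassification cost. The sketch below illustrates that general idea only; the abstract does not give the authors' formulation, and the function name, margins, and cost values are assumptions chosen for illustration.

```python
def cost_sensitive_hinge_loss(margins, labels, cost_pos, cost_neg):
    """Sum of hinge losses, each scaled by the cost of its true class.

    margins:  decision values f(x) for each sample
    labels:   true labels, +1 or -1
    cost_pos: cost applied to errors on the +1 class (assumed value)
    cost_neg: cost applied to errors on the -1 class (assumed value)
    """
    total = 0.0
    for f, y in zip(margins, labels):
        cost = cost_pos if y == 1 else cost_neg
        # Standard hinge loss max(0, 1 - y * f), scaled by the class cost.
        total += cost * max(0.0, 1.0 - y * f)
    return total


# Hypothetical decision values; +1 is the rare class, so its errors cost more.
margins = [0.5, -2.0, 0.2]
labels = [1, -1, -1]
loss = cost_sensitive_hinge_loss(margins, labels, cost_pos=5.0, cost_neg=1.0)
print(round(loss, 4))
```

Raising the cost of the rare class pushes the learned boundary away from it, which is the usual motivation for cost-sensitive training on imbalanced data.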
Foundation and methodologies in computer-aided diagnosis systems for breast cancer detection
Breast cancer is the most prevalent cancer affecting women worldwide. Early detection
and treatment of breast cancer could reduce the mortality rate. Technical issues related to
imaging quality, together with human error, increase the rate of misdiagnosis of breast cancer
by radiologists. Computer-aided detection (CAD) systems have been developed to overcome these
restrictions and have been studied in many imaging modalities for breast cancer detection in recent
years. CAD systems improve radiologists’ performance in finding and discriminating between
normal and abnormal tissues. These systems serve only as a second reader; the
final decisions are still made by the radiologist. In this study, recent CAD systems for
breast cancer detection on different modalities such as mammography, ultrasound, MRI, and biopsy
histopathological images are introduced. CAD systems generally consist of four
stages: pre-processing, segmentation, feature extraction, and classification. The approaches
applied to design the different stages of a CAD system are summarised. Advantages and disadvantages
of different segmentation, feature extraction and classification techniques are listed.
In addition, the impact of imbalanced datasets on classification outcomes and appropriate methods to
address these issues are discussed. Performance evaluation metrics for the various stages of
breast cancer detection CAD systems are also reviewed.
An academic review: applications of data mining techniques in finance industry
With the development of Internet techniques, data volumes are doubling every two years, faster than predicted by Moore’s Law. Big Data Analytics is becoming particularly important for enterprise business. Modern computational technologies provide effective tools to help understand the huge volumes of accumulated data and to leverage this information to gain insights into the finance industry. Data has become the most valuable asset of financial organisations seeking actionable business insights, as the finance industry has no physical products to manufacture. This is where data mining techniques come to their rescue by allowing access to the right information at the right time. These techniques are used by the finance industry in various areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering, marketing and prediction of price movements, to name a few. This work aims to survey the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared to loan prediction, money laundering and time series prediction. Due to the dynamics, uncertainty and variety of data, nonlinear mapping techniques have been studied more deeply than linear techniques. It has also been shown that hybrid methods are the most accurate in prediction, closely followed by neural network techniques. This survey could offer guidance on applications of data mining techniques in the finance industry, and a summary of methodologies for researchers in this area. In particular, it could provide a good overview of data mining techniques for beginners who want to work in the field of computational finance.