Deep Generative Models for Reject Inference in Credit Scoring
Credit scoring models based on accepted applications may be biased and their
consequences can have a statistical and economic impact. Reject inference is
the process of attempting to infer the creditworthiness status of the rejected
applications. In this research, we use deep generative models to develop two
new semi-supervised Bayesian models for reject inference in credit scoring, in
which we model the data generating process to be dependent on a Gaussian
mixture. The goal is to improve the classification accuracy in credit scoring
models by adding reject applications. Our proposed models infer the unknown
creditworthiness of the rejected applications by exact enumeration of the two
possible outcomes of the loan (default or non-default). The efficient
stochastic gradient optimization technique used in deep generative models makes
our models suitable for large data sets. Finally, the experiments in this
research show that our proposed models perform better than classical and
alternative machine learning models for reject inference in credit scoring.
Detection of Review Abuse via Semi-Supervised Binary Multi-Target Tensor Decomposition
Product reviews and ratings on e-commerce websites provide customers with
detailed insights about various aspects of the product such as quality,
usefulness, etc. Since they influence customers' buying decisions, product
reviews have become a fertile ground for abuse by sellers (colluding with
reviewers) to promote their own products or to tarnish the reputation of
competitors' products. In this paper, we focus on detecting such abusive
entities (both sellers and reviewers) by applying tensor decomposition on the
product reviews data. While tensor decomposition is mostly unsupervised, we
formulate our problem as a semi-supervised binary multi-target tensor
decomposition, to take advantage of currently known abusive entities. We
empirically show that our multi-target semi-supervised model achieves higher
precision and recall in detecting abusive entities as compared to unsupervised
techniques. Finally, we show that our proposed stochastic partial natural
gradient inference for our model empirically achieves faster convergence than
stochastic gradient and Online-EM with sufficient statistics.
Comment: Accepted to the 25th ACM SIGKDD Conference on Knowledge Discovery and
Data Mining, 2019. Contains supplementary material. arXiv admin note: text
overlap with arXiv:1804.0383
Informative sample generation using class aware generative adversarial networks for classification of chest Xrays
Training robust deep learning (DL) systems for disease detection from medical
images is challenging due to limited images covering different disease types
and severity. The problem is especially acute when there is severe class
imbalance. We propose an active learning (AL) framework to select the most
informative samples for training our model using a Bayesian neural network.
Informative samples are then used within a novel class aware generative
adversarial network (CAGAN) to generate realistic chest X-ray images for data
augmentation by transferring characteristics from one class label to another.
Experiments show our proposed AL framework is able to achieve state-of-the-art
performance by using about of the full dataset, thus saving significant
time and effort over conventional methods.