Self-Discriminative Modeling for Anomalous Graph Detection
This paper studies the problem of detecting anomalous graphs using a machine
learning model trained only on normal graphs, which has many applications in
molecular, biological, and social network data analysis. We present a
self-discriminative modeling framework for anomalous graph detection. The key
idea, mathematically and numerically illustrated, is to learn a discriminator
(classifier) from the given normal graphs together with pseudo-anomalous graphs
generated by a jointly trained model. We never use any true anomalous graphs;
the generated pseudo-anomalous graphs are intended to interpolate between
normal graphs and (real) anomalous ones. Under the framework, we provide
three algorithms with different computational efficiencies and stabilities for
anomalous graph detection. The three algorithms are compared with several
state-of-the-art graph-level anomaly detection baselines on nine popular graph
datasets (four with small size and five with moderate size) and show
significant improvement in terms of AUC. The success of our algorithms stems
from the integration of the discriminative classifier and the well-posed
pseudo-anomalous graphs, which provide new insights for anomaly detection.
Moreover, we investigate our algorithms for large-scale imbalanced graph
datasets. Surprisingly, our algorithms, though fully unsupervised, are able to
significantly outperform supervised learning algorithms of anomalous graph
detection. The corresponding reason is also analyzed. Comment: This work was submitted to NeurIPS 2023 but was unfortunately
rejecte
INTEGRATION OF SVM AND SMOTE-NC FOR CLASSIFICATION OF HEART FAILURE PATIENTS
SMOTE (Synthetic Minority Over-sampling Technique) is an oversampling algorithm for imbalanced datasets, and SMOTE-NC (SMOTE for Nominal and Continuous features) is a variant of it designed for datasets that mix continuous and nominal features. The two differ primarily in how they generate synthetic minority-class examples when nominal features are present. We used a dataset of heart failure patients comprising both continuous and nominal features. The distribution of patient status (deceased or alive) was imbalanced. To address this, we balanced the data with SMOTE-NC before classifying with SVM. The combination of SVM and SMOTE-NC gave better results than SVM alone, as shown by higher accuracy and a higher F1 score. The F1 score is less sensitive to class imbalance than accuracy. When the number of instances differs significantly between classes, the F1 score can therefore be a more informative metric for evaluating a classifier's performance, especially when the minority class is of interest.
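The point about F1 versus accuracy can be illustrated with a small worked example. The sketch below is not taken from the paper: the labels, the 90/10 class split, and both sets of predictions are hypothetical, chosen only to show how a classifier that ignores the minority class can still score high accuracy while its F1 score on that class is zero.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) treating `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical imbalanced labels: 90 majority (0) and 10 minority (1) cases.
y_true = [0] * 90 + [1] * 10

# Hypothetical classifier A: always predicts the majority class.
majority_only = [0] * 100

# Hypothetical classifier B: 5 false alarms on the majority class,
# but it recovers 8 of the 10 minority cases.
minority_aware = [0] * 85 + [1] * 5 + [1] * 8 + [0] * 2

for name, y_pred in [("majority_only", majority_only),
                     ("minority_aware", minority_aware)]:
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.3f}, "
          f"F1={f1_score(y_true, y_pred):.3f}")
```

Under these assumed predictions, the two classifiers differ by only a few points of accuracy, while the F1 score on the minority class separates them sharply, which is the behavior the abstract appeals to.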