Soft Methodology for Cost-and-error Sensitive Classification
Many real-world data mining applications need varying cost for different
types of classification errors and thus call for cost-sensitive classification
algorithms. Existing algorithms for cost-sensitive classification are
successful in terms of minimizing the cost, but can result in a high error rate
as the trade-off. The high error rate holds back the practical use of those
algorithms. In this paper, we propose a novel cost-sensitive classification
methodology that takes both the cost and the error rate into account. The
methodology, called soft cost-sensitive classification, is established from a
multicriteria optimization problem of the cost and the error rate, and can be
viewed as regularizing cost-sensitive classification with the error rate. The
simple methodology allows immediate improvements of existing cost-sensitive
classification algorithms. Experiments on the benchmark and the real-world data
sets show that our proposed methodology indeed achieves lower test error rates
and similar (sometimes lower) test costs than existing cost-sensitive
classification algorithms. We also demonstrate that the methodology can be
extended for considering the weighted error rate instead of the original error
rate. This extension is useful for tackling unbalanced classification problems. Comment: A shorter version appeared in KDD '1
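The idea of regularizing cost-sensitive classification with the error rate can be sketched as a blend of the cost matrix with the 0/1 error matrix. This is a minimal illustration, not the authors' formulation: the abstract does not give the exact form of the trade-off, so the mixing weight `alpha`, the scaling by the largest cost, and the function names below are all assumptions.

```python
def soft_cost_matrix(cost, alpha):
    """Blend a cost matrix with the 0/1 error matrix (hypothetical form).

    cost[y][p] is the cost of predicting p when the true class is y.
    alpha = 0 keeps the original costs; alpha = 1 leaves only the error rate.
    The 0/1 term is scaled by the largest cost so both terms share a scale
    (an assumption made for this sketch).
    """
    k = len(cost)
    scale = max(max(row) for row in cost)
    return [
        [(1 - alpha) * cost[y][p] + alpha * scale * (0 if y == p else 1)
         for p in range(k)]
        for y in range(k)
    ]


def min_expected_cost_label(probs, cost):
    """Return the prediction with the lowest expected cost under probs."""
    k = len(cost)
    expected = [sum(probs[y] * cost[y][p] for y in range(k)) for p in range(k)]
    return min(range(k), key=expected.__getitem__)


# Missing class 1 is ten times as costly as a false alarm.
cost = [[0, 1], [10, 0]]
probs = [0.85, 0.15]  # estimated P(class 0), P(class 1) for one example

hard = min_expected_cost_label(probs, cost)                          # 1
soft = min_expected_cost_label(probs, soft_cost_matrix(cost, 0.5))   # 0
```

With the pure cost matrix the unlikely but expensive class is predicted; once the error term is blended in, the prediction moves back to the more probable class, which is the kind of lower error rate at similar cost that the abstract describes.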
Classification hardness for supervised learners on 20 years of intrusion detection data
This article consolidates the analysis of established (NSL-KDD) and new intrusion detection datasets (ISCXIDS2012, CICIDS2017, CICIDS2018) using supervised machine learning (ML) algorithms. The uniform analysis procedure makes the obtained results directly comparable and provides a stronger foundation for conclusions about the efficacy of supervised learners on the main classification task in network security. This research is motivated in part by the lack of adoption of these modern datasets. The scope is deliberately broad, covering classification by algorithms from different families on both established and new datasets, in order to expand the existing foundation and reveal the most opportune avenues for further inquiry. After baseline results were obtained, the classification task was made more difficult by reducing the data available to learn from, both horizontally and vertically. This data reduction serves as a stress test to verify whether the very high baseline results hold up under increasingly harsh constraints. Ultimately, this work contains the most comprehensive set of results on the topic of intrusion detection through supervised machine learning. Researchers working on algorithmic improvements can compare their results to this collection, knowing that all results reported here were gathered through a uniform framework. This work's main contributions are the outstanding classification results on the current state-of-the-art datasets for intrusion detection and the conclusion that these methods show remarkable resilience in classification performance even when the amount of data to learn from is aggressively reduced.
Transfer Learning with Deep Convolutional Neural Network (CNN) for Pneumonia Detection using Chest X-ray
Pneumonia is a life-threatening disease of the lungs caused by either
bacterial or viral infection. It can be life-endangering if not acted upon in
time, so an early diagnosis of pneumonia is vital. The aim of this paper is to
automatically detect bacterial and viral pneumonia using digital x-ray images.
It provides a detailed report on advances made in the accurate detection of
pneumonia and then presents the methodology adopted by the authors. Four
different pre-trained deep Convolutional Neural Networks (CNNs), namely
AlexNet, ResNet18, DenseNet201, and SqueezeNet, were used for transfer
learning. A total of 5247 bacterial, viral and normal chest x-ray images
underwent preprocessing, and the modified images were used to train the
networks for the transfer learning based classification task. The authors
report three classification schemes: normal vs pneumonia, bacterial vs viral
pneumonia, and normal vs bacterial vs viral pneumonia. The classification
accuracies for these three schemes were 98%, 95%, and 93.3% respectively. In
every scheme, this is higher than the accuracies reported in the literature.
Therefore, the proposed study can help radiologists diagnose pneumonia faster
and can help in the fast airport screening of pneumonia patients. Comment: 13
Figures, 5 tables. arXiv admin note: text overlap with
arXiv:2003.1314
Construction of embedded fMRI resting state functional connectivity networks using manifold learning
We construct embedded functional connectivity networks (FCN) from benchmark
resting-state functional magnetic resonance imaging (rsfMRI) data acquired from
patients with schizophrenia and healthy controls based on linear and nonlinear
manifold learning algorithms, namely, Multidimensional Scaling (MDS), Isometric
Feature Mapping (ISOMAP) and Diffusion Maps. Furthermore, based on key global
graph-theoretical properties of the embedded FCN, we compare their
classification potential using machine learning techniques. We also assess the
performance of two metrics that are widely used for the construction of FCN
from fMRI, namely the Euclidean distance and the lagged cross-correlation
metric. We show that the FCN constructed with Diffusion Maps and the lagged
cross-correlation metric outperform the other combinations.
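A lagged cross-correlation metric between two regional time series can be sketched as follows. The abstract does not define the metric, so this is only one common reading, taken here as an assumption: the distance is one minus the largest absolute Pearson correlation over a window of lags. The function names and the `max_lag` parameter are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def max_lagged_correlation(x, y, max_lag):
    """Largest absolute correlation between x and y over lags in [-max_lag, max_lag]."""
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[lag:], y[:len(y) - lag]   # pairs (x[t + lag], y[t])
        else:
            a, b = x[:len(x) + lag], y[-lag:]  # pairs (x[t], y[t - lag])
        if len(a) < 2:
            continue
        best = max(best, abs(pearson(a, b)))
    return best


def lagged_distance(x, y, max_lag):
    """Distance in [0, 1]: small when the series match at some lag."""
    return 1.0 - max_lagged_correlation(x, y, max_lag)


# Two copies of the same oscillation, offset by one sample.
signal = [0, 1, 0, -1] * 4
x, y = signal[1:13], signal[0:12]
d_no_lag = lagged_distance(x, y, max_lag=0)   # 1.0: uncorrelated at lag 0
d_lagged = lagged_distance(x, y, max_lag=1)   # 0.0: identical after shifting
```

A pairwise matrix of such distances is the kind of input that MDS, ISOMAP or Diffusion Maps would then embed. A plain Euclidean distance on the same two series would treat them as dissimilar.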
Finding groups in data: Cluster analysis with ants
We present in this paper a modification of Lumer and Faieta's algorithm for data clustering. This approach
mimics the clustering behavior observed in real ant colonies. The algorithm automatically discovers
clusters in numerical data without prior knowledge of the possible number of clusters. In this paper we focus
on ant-based clustering algorithms, a particular kind of swarm intelligent system, and on the effects on
the final clustering of using different dissimilarity metrics during classification: Euclidean, Cosine,
and Gower measures. Clustering with swarm-based algorithms is emerging as an alternative to more
conventional clustering methods such as k-means. Among the many bio-inspired techniques, ant
clustering algorithms have received special attention, especially because they still require much
investigation to improve performance, stability and other key features that would make such algorithms
mature tools for data mining.
As a case study, this paper focuses on the behavior of clustering procedures in those new approaches.
The proposed algorithm and its modifications are evaluated on a number of well-known benchmark
datasets. Empirical results clearly show that ant-based clustering algorithms perform well when
compared to other techniques.
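The three dissimilarity measures named above can be sketched for purely numerical data as follows. This is an illustrative sketch rather than the paper's implementation: the Gower measure is shown only in its numeric form (mean of range-normalized absolute differences), and the helper names are hypothetical.

```python
import math


def euclidean(x, y):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_dissimilarity(x, y):
    """One minus the cosine of the angle between x and y (1.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 1.0
    return 1.0 - dot / (norm_x * norm_y)


def feature_ranges(data):
    """Per-feature range (max - min) over the whole dataset."""
    return [max(col) - min(col) for col in zip(*data)]


def gower_numeric(x, y, ranges):
    """Gower dissimilarity for numeric features: mean range-normalized difference."""
    parts = [abs(a - b) / r if r > 0 else 0.0 for a, b, r in zip(x, y, ranges)]
    return sum(parts) / len(parts)


data = [[0, 0], [3, 4], [6, 8]]
ranges = feature_ranges(data)                   # [6, 8]

d_euc = euclidean(data[0], data[1])             # 5.0
d_cos = cosine_dissimilarity(data[1], data[2])  # 0.0, same direction
d_gow = gower_numeric(data[0], data[1], ranges) # 0.5
```

The second and third points lie on the same ray from the origin, so the cosine measure treats them as identical while the Euclidean and Gower measures do not. Differences of this kind are what can change the final clustering.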