An Effective Deep Learning Based Multi-Class Classification of DoS and DDoS Attack Detection
In the past few years, cybersecurity has become increasingly important due to the
rise in internet users. Internet attacks such as Denial of Service (DoS)
and Distributed Denial of Service (DDoS) attacks severely harm a website or
server and make it unavailable to other users. Network monitoring and control
systems have found it challenging to identify the many classes of DoS and DDoS
attacks since each operates uniquely. Hence a powerful technique is required
for attack detection. Traditional machine learning techniques are inefficient
in handling extensive network data and cannot extract high-level features for
attack detection. Therefore, an effective deep learning-based intrusion
detection system is developed in this paper for DoS and DDoS attack
classification. The model has several phases. It starts with a Deep
Convolutional Generative Adversarial Network (DCGAN) based technique to
address the class imbalance issue in the dataset. Then a deep learning
algorithm based on ResNet-50 extracts the critical features for each class in
the dataset. After that, an optimized AlexNet-based classifier is implemented
for detecting the attacks separately, and the essential parameters of the
classifier are optimized using the Atom search optimization algorithm. The
proposed approach was evaluated on benchmark datasets, CICIDS2019 and UNSW-NB15,
using key classification metrics and achieved 99.37% accuracy for the UNSW-NB15
dataset and 99.33% for the CICIDS2019 dataset. The experimental results
demonstrate that the proposed approach outperforms other competitive
techniques in identifying DoS and DDoS attacks.
An Empirical Study on the Effectiveness of Testing Metrics to Test Deep Learning Models
In recent years, Deep Learning (DL) models have been widely applied to develop safety- and security-critical systems. The recent evolution of Deep Neural Networks (DNNs) is the key reason behind the
unprecedented achievements in image classification, object detection, medical image analysis, speech recognition, and autonomous driving. However, DL models often remain a black box for their practitioners due
to the lack of interpretability and explainability. DL practitioners generally use standard metrics such as
Precision, Recall, and F1 score to evaluate the performance of DL models on the test dataset. However, as
high-quality test data is often unavailable, the level of accuracy that these standard metrics report on
test datasets cannot by itself establish the testing adequacy, generality, and robustness of DL models.
The way we ensure the quality of DL models is still in its infancy; hence, a scalable DL model testing framework is in high demand in the context of software testing. The existing techniques for testing traditional
software systems cannot be directly applied to DL models because of the fundamental differences in programming paradigm, systems development methodologies, and processes. However, several testing metrics
(e.g., Neuron Coverage (NC), Confusion and Bias error metrics, and Multi-granularity metrics) have been
proposed leveraging the concept of test coverage in traditional software testing to measure the robustness of
DL models and the quality of the test datasets. Although test coverage is highly effective for testing traditional
software systems, the effectiveness of DL coverage metrics must be evaluated in testing the robustness of DL
models and measuring the quality of the test datasets. In addition, the selected testing metrics work on the
activated neurons of a DL model. In our study, we consider the neuron count of a DL model differently than
the existing studies. For example, according to our calculation the LeNet-5 model has 6508 neurons, whereas
other studies count only 268 neurons in the LeNet-5 model. Therefore, it is also important to investigate how the concept of a neuron (e.g., the idea of having neurons in the DL model and the way we calculate
the number of neurons a DL model has) impacts the testing metrics. In this thesis, we thus conduct
an exploratory study for evaluating the effectiveness of the testing metrics to test DL models not only in
measuring their robustness but also in assessing the quality of the test datasets. Furthermore, since selected
testing metrics work on the activated neurons of a DL model, we also investigate the impact of the neuron
concept on the testing metrics. To conduct our experiments, we select popular publicly available datasets
(e.g., MNIST, Fashion MNIST, CIFAR-10, ImageNet and so on) and train DL models on them. We also
select state-of-the-art DL models (e.g., VGG-16, VGG-19, ResNet-50, ResNet-101 and so on) trained on the
ImageNet dataset. Our experimental results demonstrate that, regardless of the neuron concept used, NC and
Multi-granularity testing metrics are ineffective in evaluating the robustness of DL models and in assessing
the quality of the test datasets. In addition, the selection of threshold values has a negligible impact on the
NC metric. Increasing the coverage values of the Multi-granularity testing metrics cannot separate regular
test data from adversarial test data. Our exploratory study also shows that the accurate predictions of DL models
come with higher coverage values of the Multi-granularity metrics than their false predictions do. Therefore, it is not always true that increasing the coverage values of the Multi-granularity testing metrics finds more defects
of DL models. Finally, the Precision and Recall scores show that the Confusion and Bias error metrics are
adequate to detect class-level violations of the DL models.
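
As an illustration of why the neuron count matters for Neuron Coverage, the following minimal Python sketch computes the fraction of neurons whose activation exceeds a threshold for at least one test input, under two counting conventions. Everything in it is an assumption made for illustration and is not taken from the thesis: the function name `neuron_coverage`, the `per_channel` switch, the random stand-in activations, and the classic LeNet-5 layer shapes (6 maps of 28x28, 16 maps of 10x10, then 120 and 84 dense units). Counting every convolutional output unit and every hidden dense unit happens to give 4704 + 1600 + 120 + 84 = 6508, which matches the figure quoted above, but the abstract does not state that this is how the authors counted. Per-layer activation scaling, which some coverage tools apply before thresholding, is omitted.

```python
import numpy as np

def neuron_coverage(layer_activations, threshold=0.0, per_channel=False):
    """Return (coverage, neuron_count) for a list of per-layer activations.

    Each array has shape (n_inputs, ...). Conv layers are assumed to be
    (n_inputs, channels, height, width); dense layers (n_inputs, units).
    A neuron is "covered" if its activation exceeds `threshold` for at
    least one input. With per_channel=True, each conv feature map is
    treated as one neuron whose activation is the mean over the map
    (an assumed convention, chosen here only to contrast the counts).
    """
    covered, total = 0, 0
    for act in layer_activations:
        act = np.asarray(act, dtype=float)
        if per_channel and act.ndim == 4:
            act = act.mean(axis=(2, 3))          # -> (n_inputs, channels)
        flat = act.reshape(act.shape[0], -1)      # -> (n_inputs, neurons)
        fired = (flat > threshold).any(axis=0)    # covered by any input?
        covered += int(fired.sum())
        total += fired.size
    return covered / total, total

# Random stand-in activations with LeNet-5-like shapes (assumed, not real data).
rng = np.random.default_rng(0)
n_inputs = 8
lenet5_like = [
    rng.standard_normal((n_inputs, 6, 28, 28)),   # first conv layer
    rng.standard_normal((n_inputs, 16, 10, 10)),  # second conv layer
    rng.standard_normal((n_inputs, 120)),         # dense layer
    rng.standard_normal((n_inputs, 84)),          # dense layer
]

cov_unit, n_unit = neuron_coverage(lenet5_like, threshold=0.5)
cov_chan, n_chan = neuron_coverage(lenet5_like, threshold=0.5, per_channel=True)
print("per-unit neurons:", n_unit)      # 6508
print("per-channel neurons:", n_chan)   # 226
print("coverage (per unit):", cov_unit)
print("coverage (per channel):", cov_chan)
```

The same activations yield different denominators, and therefore different coverage values, depending on the convention. The per-channel count in this sketch (226) does not reproduce the 268 reported by other studies, since that figure depends on which layers are included, which the abstract does not specify.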