1,420 research outputs found
Optimal Representation of Anuran Call Spectrum in Environmental Monitoring Systems Using Wireless Sensor Networks
The analysis and classification of the sounds produced by certain animal species, notably anurans, have revealed these amphibians to be a potentially strong indicator of temperature fluctuations and therefore of the existence of climate change. Environmental monitoring systems using Wireless Sensor Networks are therefore of interest to obtain indicators of global warming. For the automatic classification of the sounds recorded on such systems, the proper representation of the sound spectrum is essential since it contains the information required for cataloguing anuran calls. The present paper focuses on this process of feature extraction by exploring three alternatives: the standardized MPEG-7, the Filter Bank Energy (FBE), and the Mel Frequency Cepstral Coefficients (MFCC). Moreover, various values for every option in the extraction of spectrum features have been considered. Throughout the paper, it is shown that representing the frame spectrum with pure FBE offers slightly worse results than using the MPEG-7 features. This performance can easily be increased, however, by rescaling the FBE along two dimensions: vertically, by taking the logarithm of the energies; and, horizontally, by applying mel scaling in the filter banks. Furthermore, representing the spectrum in the cepstral domain, as in MFCC, has shown additional marginal improvements in classification performance. University of Seville: Telefónica Chair "Intelligence Networks
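The two rescalings mentioned in the abstract (logarithm of the energies, mel spacing of the filters) and the final move to the cepstral domain can be sketched as follows. This is only an illustrative outline using NumPy, not the authors' implementation: the function names, the Hamming window, the 20 filters, the 12 cepstral coefficients, and the synthetic test tone are all assumptions made for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                            n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def frame_features(frame, sample_rate, n_filters=20, n_ceps=12):
    """Return (mel FBE, log mel FBE, MFCC) for one audio frame."""
    n_fft = frame.size
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    # "Horizontal" rescaling: mel-spaced filter bank energies.
    fbe = mel_filter_bank(n_filters, n_fft, sample_rate) @ power
    # "Vertical" rescaling: logarithm of the energies (offset avoids log(0)).
    log_fbe = np.log(fbe + 1e-10)
    # Cepstral domain: DCT-II of the log energies.
    n = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    mfcc = dct_basis @ log_fbe
    return fbe, log_fbe, mfcc

# Usage on a synthetic 1 kHz tone (hypothetical stand-in for a recorded call).
sample_rate = 8000
t = np.arange(256) / sample_rate
fbe, log_fbe, mfcc = frame_features(np.sin(2 * np.pi * 1000 * t), sample_rate)
print(fbe.shape, log_fbe.shape, mfcc.shape)
```

Each of the three returned vectors could be fed to a classifier in turn, which mirrors the comparison of representations described in the abstract.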
Data mining for detecting Bitcoin Ponzi schemes
Soon after its introduction in 2009, Bitcoin was adopted by
cyber-criminals, who rely on its pseudonymity to implement virtually
untraceable scams. Typical scams that operate on Bitcoin include the
so-called Ponzi schemes. These are fraudulent investments which repay users
with the funds invested by new users that join the scheme, and implode when it
is no longer possible to find new investments. Despite being illegal in many
countries, Ponzi schemes are now proliferating on Bitcoin, and they keep
alluring new victims, who are plundered of millions of dollars. We apply data
mining techniques to detect Bitcoin addresses related to Ponzi schemes. Our
starting point is a dataset of features of real-world Ponzi schemes, which we
construct by analysing, on the Bitcoin blockchain, the transactions used to
perform the scams. We use this dataset to experiment with various machine
learning algorithms, and we assess their effectiveness through standard
validation protocols and performance metrics. The best of the classifiers we
have experimented with can identify most of the Ponzi schemes in the dataset, with a
low number of false positives.
A New Data-Balancing Approach Based on Generative Adversarial Network for Network Intrusion Detection System
An intrusion detection system (IDS) plays a critical role in maintaining network security by
continuously monitoring network traffic and host systems to detect any potential security breaches
or suspicious activities. With the recent surge in cyberattacks, there is a growing need for automated
and intelligent IDSs. Many of these systems are designed to learn the normal patterns of
network traffic, enabling them to identify any deviations from the norm, which can be indicative of
anomalous or malicious behavior. Machine learning methods have proven to be effective in detecting
malicious payloads in network traffic. However, the increasing volume of data generated by IDSs
poses significant security risks and emphasizes the need for stronger network security measures. The
performance of traditional machine learning methods heavily relies on the dataset and its balanced
distribution. Unfortunately, many IDS datasets suffer from imbalanced class distributions, which
hampers the effectiveness of machine learning techniques and leads to missed detection and false
alarms in conventional IDSs. To address this challenge, this paper proposes a novel model-based
generative adversarial network (GAN) called TDCGAN, which aims to improve the detection rate
of the minority class in imbalanced datasets while maintaining efficiency. The TDCGAN model
comprises a generator and three discriminators, with an election layer incorporated at the end of the
architecture. This allows for the selection of the optimal outcome from the discriminators’ outputs.
The UGR’16 dataset is employed for evaluation and benchmarking purposes. Various machine
learning algorithms are used for comparison to demonstrate the efficacy of the proposed TDCGAN
model. Experimental results reveal that TDCGAN offers an effective solution for addressing imbalanced
intrusion detection and outperforms other traditionally used oversampling techniques. By
leveraging the power of GANs and incorporating an election layer, TDCGAN demonstrates superior
performance in detecting security threats in imbalanced IDS datasets. PID2020-113462RB-I00, PID2020-115570GB-C22
and PID2020-115570GB-C21 granted by Ministerio Español de Economía y Competitividad. Project TED2021-129938B-I0, granted by Ministerio Español de Ciencia e Innovació
Towards Effective Wireless Intrusion Detection using AWID Dataset
In network security, intrusion detection systems (IDSs) play a vital role, and machine learning (ML) techniques are increasingly applied to their datasets. This study reviews the literature on ML-based IDSs that use the AWID dataset. An analysis of these works points to a need for balancing the dataset and for examining the existing balancing approaches. Because imbalance is rarely treated, a taxonomy of balancing techniques is introduced. The taxonomy provides a structure defined at all levels, and the collected papers are arranged into hierarchical groups. A comparative study of the aspects proposed or treated in these papers follows. The main findings from the surveyed papers are that existing taxonomies are not covered in detail and that imbalance in the dataset used is left untreated. This study therefore gathers the available information on these aspects. Even so, weaknesses remain in every adaptation of the intrusion detection system. In this context, a few findings contribute in multiple ways. To the best of our knowledge, the study integrates the observation of a threshold limit with a feature-drop selection method based on random samples. The work thus contributes a better understanding of the techniques for handling imbalance found in the surveyed literature. This research should therefore benefit the development of ML-based IDSs.
Applying machine learning to categorize distinct categories of network traffic
The recent rapid growth of the field of data science has made available to all fields opportunities to leverage machine learning. Computer network traffic classification has traditionally been performed using static, pre-written rules that are easily made ineffective if changes, legitimate or not, are made to the applications or protocols underlying a particular category of network traffic. This paper explores the problem of network traffic classification and analyzes the viability of having the process performed using a multitude of classical machine learning techniques against significant statistical similarities between classes of network traffic as opposed to traditional static traffic identifiers.
To accomplish this, network data was captured, processed, and evaluated for 10 application labels under the categories of video conferencing, video streaming, video gaming, and web browsing as described later in Table 1. Flow-based statistical features for the dataset were derived from the network captures in accordance with the “Flow Data Feature Creation” section and were analyzed against a nearest centroid, k-nearest neighbors, Gaussian naïve Bayes, support vector machine, decision tree, random forest, and multi-layer perceptron classifier. Tools and techniques broadly available to organizations and enthusiasts were used. Observations were made on working with network data in a machine learning context, strengths and weaknesses of different models on such data, and the overall efficacy of the tested models.
Ultimately, it was found that simple models freely available to anyone can achieve high accuracy, recall, and F1 scores in network traffic classification, with the best-performing model, random forest, having 89% accuracy, a macro average F1 score of 0.77, and a macro average recall of 76%, with the most common feature of successful classification being related to maximum packet sizes in a network flow.
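The gap between accuracy and the macro-averaged scores reported above is typical of imbalanced classes, because macro averaging weights every class equally regardless of its size. A minimal sketch of how these metrics are computed is shown below; the function name and the toy labels are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter

def macro_scores(y_true, y_pred):
    """Return (accuracy, macro-average recall, macro-average F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    recalls, f1s = [], []
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(recalls) / len(labels), sum(f1s) / len(labels)

# Toy example: 8 "web" flows and 2 "game" flows, one "game" flow misclassified.
y_true = ["web"] * 8 + ["game"] * 2
y_pred = ["web"] * 8 + ["web", "game"]
accuracy, macro_recall, macro_f1 = macro_scores(y_true, y_pred)
print(accuracy, macro_recall, macro_f1)
```

In this toy case a single error on the minority class leaves accuracy at 0.9 while pulling macro recall down to 0.75, the same kind of spread seen between the 89% accuracy and 76% macro recall in the abstract.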
Automatic Driver Drowsiness Detection System
The proposed system aims to lessen the number of accidents that occur due to drivers’ drowsiness and fatigue, which will in turn increase transportation safety. This has become a common reason for accidents in recent times. Several facial and body gestures are considered signs of drowsiness and fatigue in drivers, including tiredness in the eyes and yawning. These features are an indication that the driver’s condition is improper. EAR (Eye Aspect Ratio) computes the ratio of distances between the horizontal and vertical eye landmarks, which is required for the detection of drowsiness. For the purpose of yawn detection, a YAWN value is calculated using the distance between the lower lip and the upper lip, and the distance will be compared against a threshold value. We have deployed an eSpeak module (text-to-speech synthesiser), which is used for giving appropriate voice alerts when the driver is feeling drowsy or is yawning. The proposed system is designed to decrease the rate of accidents and contribute to technology with the goal of preventing fatalities caused by road accidents. Over the past ten years, advances in artificial intelligence and computing technologies have improved driver monitoring systems. Several experimental studies have gathered data on actual driver fatigue using different artificial intelligence systems. In order to dramatically improve these systems' real-time performance, feature combinations are used. An updated evaluation of the driver sleepiness detection technologies put in place during the previous ten years is presented in this research. The paper discusses and displays current systems that track and identify drowsiness using various metrics. Based on the information used, each system can be categorised into one of four groups. Each system in this paper comes with a thorough discussion of the features, classification rules, and datasets it employs. 
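The EAR and yawn checks described in this abstract can be sketched as below. The six-landmark EAR formula is a widely used formulation and is assumed here, since the abstract does not give the exact expression; the threshold values, landmark coordinates, and function names are hypothetical, and the eSpeak voice alert is replaced by a returned message.

```python
import math

# Hypothetical thresholds; real values would need tuning per camera and driver.
EAR_THRESHOLD = 0.25
YAWN_THRESHOLD = 20.0

def eye_aspect_ratio(eye):
    """EAR from six (x, y) eye landmarks ordered p1..p6.

    p1 and p4 are the horizontal eye corners; (p2, p6) and (p3, p5)
    are the two vertical landmark pairs.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)

def lip_distance(upper_lip, lower_lip):
    """YAWN value: distance between upper-lip and lower-lip points."""
    return math.dist(upper_lip, lower_lip)

def alerts(eye, upper_lip, lower_lip):
    """Return the alert messages that a voice module would speak."""
    messages = []
    if eye_aspect_ratio(eye) < EAR_THRESHOLD:
        messages.append("drowsiness alert")
    if lip_distance(upper_lip, lower_lip) > YAWN_THRESHOLD:
        messages.append("yawn alert")
    return messages

# Usage with made-up pixel coordinates for an open and a nearly closed eye.
open_eye = [(0, 0), (2, 1), (4, 1), (6, 0), (4, -1), (2, -1)]
closed_eye = [(0, 0), (2, 0.2), (4, 0.2), (6, 0), (4, -0.2), (2, -0.2)]
print(alerts(open_eye, (10, 50), (10, 55)))
print(alerts(closed_eye, (10, 50), (10, 80)))
```

A practical system would typically require the EAR to stay below the threshold for several consecutive frames before alerting, so that ordinary blinks are not flagged.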
FishEye8K: A Benchmark and Dataset for Fisheye Camera Object Detection
With the advance of AI, road object detection has been a prominent topic in
computer vision, mostly using perspective cameras. Fisheye lenses provide
omnidirectional wide coverage, so fewer cameras are needed to monitor road
intersections, albeit with view distortions. To our knowledge, there is no
existing open dataset prepared for traffic surveillance on fisheye cameras.
This paper introduces an open FishEye8K benchmark dataset for road object
detection tasks, which comprises 157K bounding boxes across five classes
(Pedestrian, Bike, Car, Bus, and Truck). In addition, we present benchmark
results of State-of-The-Art (SoTA) models, including variations of YOLOv5,
YOLOR, YOLOv7, and YOLOv8. The dataset comprises 8,000 images recorded in 22
videos using 18 fisheye cameras for traffic monitoring in Hsinchu, Taiwan, at
resolutions of 1080×1080 and 1280×1280. The data annotation and
validation processes were arduous and time-consuming, due to the ultra-wide
panoramic and hemispherical fisheye camera images with large distortion and
numerous road participants, particularly people riding scooters. To avoid bias,
frames from a particular camera were assigned to either the training or test
sets, maintaining a ratio of about 70:30 for both the number of images and
bounding boxes in each class. Experimental results show that YOLOv8 and YOLOR
outperform the other models at input sizes 640×640 and 1280×1280, respectively.
The dataset will be available on GitHub with PASCAL VOC, MS COCO, and YOLO
annotation formats. The FishEye8K benchmark will provide significant
contributions to the fisheye video analytics and smart city applications. Comment: CVPR Workshops 202
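The camera-wise split described in the abstract, where all frames from one camera go to either the training or the test set, can be illustrated with a simple greedy assignment. This is a sketch under stated assumptions, not the authors' procedure: the function name and the per-camera image counts are hypothetical, and unlike the paper's split it balances only image counts, not per-class bounding-box counts.

```python
def split_by_camera(images_per_camera, train_fraction=0.7):
    """Assign whole cameras to train or test, aiming at train_fraction of images.

    Returns (train_cameras, test_cameras, achieved_train_fraction).
    """
    total = sum(images_per_camera.values())
    target = train_fraction * total
    train, test = [], []
    train_count = 0
    # Visit the largest cameras first; ties are broken by camera name.
    ordered = sorted(images_per_camera.items(), key=lambda kv: (-kv[1], kv[0]))
    for camera, count in ordered:
        # Add the camera to train only if that moves the total closer to target.
        if abs(train_count + count - target) <= abs(train_count - target):
            train.append(camera)
            train_count += count
        else:
            test.append(camera)
    return train, test, train_count / total

# Hypothetical image counts per camera.
counts = {"cam1": 500, "cam2": 400, "cam3": 300,
          "cam4": 300, "cam5": 250, "cam6": 250}
train_cams, test_cams, fraction = split_by_camera(counts)
print(train_cams, test_cams, fraction)
```

Because no camera appears in both sets, a detector cannot score well on the test set merely by memorising a camera's fixed background, which is the bias the abstract refers to.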
WATT-EffNet: A Lightweight and Accurate Model for Classifying Aerial Disaster Images
Incorporating deep learning (DL) classification models into unmanned aerial
vehicles (UAVs) can significantly augment search-and-rescue operations and
disaster management efforts. In such critical situations, the UAV's ability to
promptly comprehend the crisis and optimally utilize its limited power and
processing resources to narrow down search areas is crucial. Therefore,
developing an efficient and lightweight method for scene classification is of
utmost importance. However, current approaches tend to prioritize accuracy on
benchmark datasets at the expense of computational efficiency. To address this
shortcoming, we introduce the Wider ATTENTION EfficientNet (WATT-EffNet), a
novel method that achieves higher accuracy with a more lightweight architecture
compared to the baseline EfficientNet. The WATT-EffNet leverages width-wise
incremental feature modules and attention mechanisms over width-wise features
to ensure the network structure remains lightweight. We evaluate our method on
a UAV-based aerial disaster image classification dataset and demonstrate that
it outperforms the baseline by up to 15 times in terms of classification
accuracy and in terms of computing efficiency as measured by Floating
Point Operations per second (FLOPs). Additionally, we conduct an ablation study
to investigate the effect of varying the width of WATT-EffNet on accuracy and
computational efficiency. Our code is available at
https://github.com/TanmDL/WATT-EffNet. Comment: This paper is accepted in IEEE Trans. GRS