Detecting IoT Attacks Using an Ensemble Machine Learning Model
Malicious attacks are becoming more prevalent due to the growing use of Internet of Things (IoT) devices in homes, offices, transportation, healthcare, and other locations. By incorporating fog computing into IoT, attacks can be detected in a short amount of time, as the distance between IoT devices and fog devices is smaller than the distance between IoT devices and the cloud. Machine learning is frequently used for the detection of attacks due to the huge amount of data available from IoT devices. However, fog devices may not have enough resources, such as processing power and memory, to detect attacks in a timely manner. This paper proposes an approach that offloads the machine learning model selection task to the cloud and the real-time prediction task to the fog nodes. Using the proposed method, an ensemble machine learning model is built in the cloud from historical data, followed by the real-time detection of attacks on fog nodes. The proposed approach is tested using the NSL-KDD dataset. The results show the effectiveness of the proposed approach in terms of several performance measures, such as execution time, precision, recall, accuracy, and the ROC (receiver operating characteristic) curve.
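The abstract does not state how the ensemble combines its member models. As a minimal sketch, assuming hard majority voting, the cloud/fog split can be pictured as the cloud choosing a set of base models and the fog node only running the cheap voting step. The three rule-based classifiers, their thresholds, and the feature layout below are hypothetical stand-ins, not the models or NSL-KDD features used in the paper.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical feature layout (an assumption for illustration only):
# x[0] = connection duration, x[1] = error rate, x[2] = failed logins.
Classifier = Callable[[Sequence[float]], str]


def by_duration(x: Sequence[float]) -> str:
    return "attack" if x[0] > 10.0 else "normal"


def by_error_rate(x: Sequence[float]) -> str:
    return "attack" if x[1] > 0.5 else "normal"


def by_failed_logins(x: Sequence[float]) -> str:
    return "attack" if x[2] >= 3 else "normal"


# Stand-in for the model set selected in the cloud from historical data.
MODELS: list[Classifier] = [by_duration, by_error_rate, by_failed_logins]


def majority_vote(models: Sequence[Classifier], x: Sequence[float]) -> str:
    """Return the label predicted by most models (hard voting)."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Prediction step as it might run on a fog node.
print(majority_vote(MODELS, (20.0, 0.9, 0)))  # two of three vote "attack"
print(majority_vote(MODELS, (1.0, 0.1, 5)))   # two of three vote "normal"
```

An odd number of models with two labels avoids ties; a real ensemble might instead weight votes or average predicted probabilities.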
Introducing a New Alert Data Set for Multi-Step Attack Analysis
Intrusion detection systems (IDS) reinforce cyber defense by autonomously
monitoring various data sources for traces of attacks. However, IDSs are also
infamous for frequently raising false positives and alerts that are difficult
to interpret without context. This results in high workloads on security
operators who need to manually verify all reported alerts, often leading to
fatigue and incorrect decisions. To generate more meaningful alerts and
alleviate these issues, the research domain focused on multi-step attack
analysis proposes approaches for filtering, clustering, and correlating IDS
alerts, as well as generation of attack graphs. Unfortunately, existing data
sets are outdated, unreliable, narrowly focused, or only suitable for IDS
evaluation. Since hardly any suitable benchmark data sets are publicly
available, researchers often resort to private data sets that prevent
reproducibility of evaluations. We therefore generate a new alert data set that
we publish alongside this paper. The data set contains alerts from three
distinct IDSs monitoring eight executions of a multi-step attack as well as
simulations of normal user behavior. To illustrate the potential of our data
set, we experiment with alert prioritization as well as two open-source tools
for meta-alert generation and attack graph extraction.
A Multi-Stage Classification Approach for IoT Intrusion Detection Based on Clustering with Oversampling
This research received no external funding. The authors acknowledge the support of Prince Sultan University for paying the Article Processing Charges (APC) of this publication.
Intrusion detection on IoT-based data is a hot topic and has received considerable interest from researchers and practitioners, since the security of IoT networks is crucial. Both supervised and unsupervised learning methods are used for intrusion detection in IoT networks. This paper proposes a three-stage approach: a clustering with reduction stage, an oversampling stage, and a classification stage using a Single Hidden Layer Feed-Forward Neural Network (SLFN). The novelty of the paper resides in the technique of data reduction and data oversampling for generating useful and balanced training data, and in the hybrid use of unsupervised and supervised methods for detecting intrusion activities. The experiments were evaluated in terms of accuracy, precision, recall, and G-mean and divided into four steps: measuring the effect of the data reduction with clustering, the evaluation of the framework with basic classifiers, the effect of the oversampling technique, and a comparison with basic classifiers. The results show that the SLFN classification technique, combined with Support Vector Machine and Synthetic Minority Oversampling Technique (SVM-SMOTE) at a ratio of 0.9 and a k value of 3 for the k-means++ clustering technique, gives better results than other values and other classification techniques.
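G-mean is listed among the evaluation measures because accuracy alone can hide poor detection of a rare class. The sketch below assumes the common binary definition, the geometric mean of sensitivity and specificity; the abstract does not say which variant the paper uses, and the labels here are made-up illustration data.

```python
import math
from typing import Sequence


def g_mean(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Geometric mean of sensitivity and specificity for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Imbalanced toy data: 2 intrusions (1) among 10 records.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_normal = [0] * 10                            # never flags an intrusion
one_caught = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]         # catches one of the two

accuracy = sum(t == p for t, p in zip(y_true, always_normal)) / len(y_true)
print(accuracy)                       # 0.8, despite missing every intrusion
print(g_mean(y_true, always_normal))  # 0.0
print(g_mean(y_true, one_caught))     # sqrt(0.5 * 1.0)
```

A classifier that ignores the minority class scores zero on G-mean, which is why the measure pairs naturally with an oversampling stage.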
Instruct and Extract: Instruction Tuning for On-Demand Information Extraction
Large language models with instruction-following capabilities open the door
to a wider group of users. However, when it comes to information extraction - a
classic task in natural language processing - most task-specific systems cannot
align well with long-tail ad hoc extraction use cases for non-expert users. To
address this, we propose a novel paradigm, termed On-Demand Information
Extraction, to fulfill the personalized demands of real-world users. Our task
aims to follow the instructions to extract the desired content from the
associated text and present it in a structured tabular format. The table
headers can either be user-specified or inferred contextually by the model. To
facilitate research in this emerging area, we present a benchmark named
InstructIE, inclusive of both automatically generated training data, as well as
the human-annotated test set. Building on InstructIE, we further develop an
On-Demand Information Extractor, ODIE. Comprehensive evaluations on our
benchmark reveal that ODIE substantially outperforms the existing open-source
models of similar size. Our code and dataset are released on
https://github.com/yzjiao/On-Demand-IE.
Comment: EMNLP 202
Malicious URL Website Detection using Selective Hyper Feature Link Stability based on Soft-Max Deep Featured Convolution Neural Network
The web contains many domains that serve different users through Uniform Resource Locators (URLs). As the amount of information on the Internet grows, attackers carry out malicious activities by planting malicious websites in URL sub-links, and the resulting information theft affects data sources at large scale. To analyze web features and find malicious webpages with a deep learning approach, we propose a Selective Hyper Feature Link Stability Rate (SHFLSR) based on a Soft-max Deep Featured Convolution Neural Network (SmDFCNN), which identifies malicious websites from the actions they perform and their feature responses. Initially, the URL Signature Frame Rate (USFR) is estimated to verify the domain-specific hosting. Then the link stability is confirmed from the post-response rate using the HyperLink Stability Post-Response State (LSPRS). Features are selected according to the Spectral Successive Domain Propagation Rate (S2DPR) and used to train a Deep Featured Convolution Neural Network (DFCNN) classifier with a logically defined Softmax-Logical Activator (SmLA). The proposed system detects malicious URLs based on the behavioral response of the domain and increases the detection rate, prediction rate, and classifier performance.
Network-based APT profiler
Constant innovation in attack methods presents a significant problem for the security community, which struggles to remain current in attack prevention, detection, and response. The practice of threat hunting provides a proactive approach to identifying and mitigating attacks in real time, before the attackers complete their objective. In this research, I present a matrix of adversary techniques inspired by MITRE’s ATT&CK matrix. This study allows threat hunters to classify the actions of advanced persistent threats (APTs) according to network-based behaviors.
Predicting US Elections with Social Media and Neural Networks
Increasingly, politicians and political parties are engaging their electors using social media. In the US Federal Election of 2016, candidates from both parties made heavy use of social media, particularly Twitter. It is therefore reasonable to look for a correlation between popularity on Twitter and the eventual popular vote in the election. In this thesis, we will focus on using the ‘location’ field in the profiles of each candidate's subscribers to estimate support in each state.
A major challenge is that the Twitter location field in a user profile is not constrained, requiring the application of machine learning techniques to cluster users according to state.
In this thesis, we will train a Deep Convolutional Neural Network (CNN) to classify place names by state. Then we will apply the model to the ‘location’ field of the Twitter subscribers collected for each of the two candidates, Hillary Clinton (D) and Donald Trump (R). Finally, we will compare the predicted popular vote in each state to the actual results from the 2016 Presidential Election.
The hypothesis is that a city name has a strong correlation to the people who founded it and then incorporated it. Further, it’s hypothesized that the original settlers were mostly homogeneous, relative to the country of origin and shared a common language, thus resulting in place names using the language of their origin.
In addition to learning the pattern related to the State Names, this additional information may help a machine learning model learn to classify locations by state.
The results from our experiments are very promising. Using a dataset containing 695,389 cities, correctly labelled with their state, we partitioned the cities into a training dataset containing 556,311 cities, a validation dataset containing 111,262, and a test dataset containing 27,816. When the trained model was applied to the test dataset, we achieved a correct prediction rate of 84.4365%, a false negative rate of 1.6106%, and a false positive rate of 1.0697%.
Applying the trained model to the Twitter location data of subscribers of the two candidates, the model achieved an accuracy of 90%: it correctly picked the winner, by popular vote, in 45 out of the 50 states. With Canadian and US elections coming up in 2019 and 2020, respectively, it would be interesting to test the model on those as well.
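The figures quoted above can be cross-checked with a few lines of arithmetic. This is only a consistency check on the numbers in the abstract; the variable names are illustrative, and the percentage split is derived here rather than stated by the author.

```python
# Dataset sizes as quoted in the abstract.
total_cities = 695_389
splits = {"train": 556_311, "validation": 111_262, "test": 27_816}

# The three partitions should account for every city exactly once.
assert sum(splits.values()) == total_cities

# Share of each partition, in percent, rounded to two decimals.
shares = {name: round(100 * size / total_cities, 2) for name, size in splits.items()}
print(shares)

# State-level accuracy: winner picked correctly in 45 of 50 states.
state_accuracy = 45 / 50
print(f"{state_accuracy:.0%}")
```

The partitions sum to the stated total and correspond to roughly an 80/16/4 split, and 45 of 50 states matches the reported 90% accuracy.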
Interactive IoT Cloud Deep Learning Model for Research Development in Universities for the Educational Think Tank
The construction of university education think tanks using an interactive service platform enables the sharing of research resources, encourages cross-disciplinary research collaboration, and fosters innovation in education. It also helps to build a stronger relationship between academia and industry by enabling practitioners to participate in research activities. The Internet of Things (IoT) can be used to collect and analyze data from various sources, including sensors and other connected devices, to provide insights into education-related issues. The integration of these technologies in university education think tanks can help to enhance the efficiency and effectiveness of research, decision-making, and collaboration processes. Hence, this paper constructs an Interactive IoT Cloud Computing Platform (IIoTCC). With the IIoTCC model, innovative research ideas and other ideas are collected and stored in a cloud environment. Within this environment, the collected information is stored in a stacked architecture model with a voting-based model. Through the stacked model, information is processed and evaluated for academic activities. The IoT environment is implemented through IIoTCC to process information on academic-related issues in a deep learning environment. Simulation analysis shows that the IIoTCC model achieves an accuracy of 99.34%, which is significantly higher than that of conventional classifiers.
Cyber Security
This open access book constitutes the refereed proceedings of the 18th China Annual Conference on Cyber Security, CNCERT 2022, held in Beijing, China, in August 2022. The 17 papers presented were carefully reviewed and selected from 64 submissions. The papers are organized according to the following topical sections: data security; anomaly detection; cryptocurrency; information security; vulnerabilities; mobile internet; threat intelligence; text recognition.