Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices
The field of Natural Language Processing (NLP) is currently undergoing a
revolutionary transformation driven by the power of pre-trained Large Language
Models (LLMs) based on groundbreaking Transformer architectures. As the
frequency and diversity of cybersecurity attacks continue to rise, the
importance of incident detection has significantly increased. IoT devices are
expanding rapidly, resulting in a growing need for efficient techniques to
autonomously identify network-based attacks in IoT networks with both high
precision and minimal computational requirements. This paper presents
SecurityBERT, a novel architecture that leverages the Bidirectional Encoder
Representations from Transformers (BERT) model for cyber threat detection in
IoT networks. During the training of SecurityBERT, we incorporated a novel
privacy-preserving encoding technique called Privacy-Preserving Fixed-Length
Encoding (PPFLE). We effectively represented network traffic data in a
structured format by combining PPFLE with the Byte-level Byte-Pair Encoder
(BBPE) Tokenizer. Our research demonstrates that SecurityBERT outperforms
traditional Machine Learning (ML) and Deep Learning (DL) methods, such as
Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), in
cyber threat detection. Employing the Edge-IIoTset cybersecurity dataset, our
experimental analysis shows that SecurityBERT achieved an impressive 98.2%
overall accuracy in identifying fourteen distinct attack types, surpassing
previous records set by hybrid solutions such as GAN-Transformer-based
architectures and CNN-LSTM models. With an inference time of less than 0.15
seconds on an average CPU and a compact model size of just 16.7MB, SecurityBERT
is ideally suited for real-life traffic analysis and a suitable choice for
deployment on resource-constrained IoT devices.Comment: This paper has been accepted for publication in IEEE Access:
http://dx.doi.org/10.1109/ACCESS.2024.336346
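The abstract does not specify how PPFLE works internally, but the general idea of turning heterogeneous network-traffic features into uniform, privacy-preserving text that a byte-level BPE tokenizer can consume might be sketched as follows. This is a hypothetical illustration only; the function name, hashing scheme, and token width are assumptions, not the paper's actual algorithm.

```python
import hashlib

def fixed_length_encode(features, width=8):
    """Hypothetical sketch of fixed-length privacy-preserving encoding:
    each (name, value) pair is hashed so the raw value never appears in
    the encoded text, and truncated to a fixed width so every record
    becomes a uniform string suitable for a BBPE tokenizer."""
    tokens = []
    for name, value in features.items():
        digest = hashlib.sha256(f"{name}={value}".encode()).hexdigest()
        tokens.append(digest[:width])  # fixed-width token per feature
    return " ".join(tokens)

# Example traffic record (hypothetical feature names)
record = {"proto": "tcp", "dst_port": 80, "pkt_len": 1514}
encoded = fixed_length_encode(record)
print(encoded)  # three 8-character tokens, one per feature
```

Hashing gives determinism (identical records encode identically, so the model can learn patterns) while hiding raw field values; the fixed width keeps every record the same shape regardless of the original feature types.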
Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications
We present Chameleon, a novel hybrid (mixed-protocol) framework for secure
function evaluation (SFE) which enables two parties to jointly compute a
function without disclosing their private inputs. Chameleon combines the best
aspects of generic SFE protocols with the ones that are based upon additive
secret sharing. In particular, the framework performs linear operations in a
ring using additively secret-shared values and nonlinear
operations using Yao's Garbled Circuits or the Goldreich-Micali-Wigderson
protocol. Chameleon departs from the common assumption of additive or linear
secret sharing models where three or more parties need to communicate in the
online phase: the framework allows two parties with private inputs to
communicate in the online phase under the assumption of a third node generating
correlated randomness in an offline phase. Almost all of the heavy
cryptographic operations are precomputed in an offline phase which
substantially reduces the communication overhead. Chameleon is both scalable
and significantly more efficient than the ABY framework (NDSS'15) it is based
on. Our framework supports signed fixed-point numbers. In particular,
Chameleon's vector dot product of signed fixed-point numbers improves the
efficiency of mining and classification of encrypted data for algorithms based
upon heavy matrix multiplications. Our evaluation of Chameleon on a 5-layer
convolutional deep neural network shows 133x and 4.2x faster executions than
Microsoft CryptoNets (ICML'16) and MiniONN (CCS'17), respectively.
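The offline/online split described above, where a third node generates correlated randomness so that only two parties interact online, is the classic Beaver-triple construction for multiplying secret-shared values. A minimal sketch over the ring Z_{2^32} (the concrete modulus here is an assumption for illustration):

```python
import random

P = 2**32  # ring Z_{2^32}; all arithmetic is modular

def share(x):
    """Additively secret-share x between two parties."""
    r = random.randrange(P)
    return r, (x - r) % P

# Offline phase: the helper node samples a multiplication triple
# a*b = c and distributes additive shares of it to the two parties.
a, b = random.randrange(P), random.randrange(P)
c = (a * b) % P
a0, a1 = share(a); b0, b1 = share(b); c0, c1 = share(c)

# Online phase: the two parties hold shares of private inputs x and y.
x, y = 1234, 5678
x0, x1 = share(x); y0, y1 = share(y)

# Each party locally masks its shares; the masked differences are
# opened (they reveal nothing, since a and b are uniformly random).
e = (x0 - a0 + x1 - a1) % P  # e = x - a, public
f = (y0 - b0 + y1 - b1) % P  # f = y - b, public

# Each party computes its share of x*y; only one adds the public e*f.
z0 = (c0 + e * b0 + f * a0 + e * f) % P
z1 = (c1 + e * b1 + f * a1) % P
assert (z0 + z1) % P == (x * y) % P
print((z0 + z1) % P)  # 1234 * 5678 = 7006652
```

Because the triple is consumed in the offline phase, the online phase needs only one round of cheap ring arithmetic per multiplication, which is exactly why precomputation "substantially reduces the communication overhead."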
Protecting privacy of users in brain-computer interface applications
Machine learning (ML) is revolutionizing research and industry. Many ML applications rely on the use of large amounts of personal data for training and inference. Among the most intimate data sources exploited is electroencephalogram (EEG) data, a kind of data so rich with information that application developers can easily gain knowledge beyond the professed scope from unprotected EEG signals, including passwords, ATM PINs, and other intimate data. The challenge we address is how to engage in meaningful ML with EEG data while protecting the privacy of users. Hence, we propose cryptographic protocols based on secure multiparty computation (SMC) to perform linear regression over EEG signals from many users in a fully privacy-preserving (PP) fashion, i.e., such that each individual's EEG signals are not revealed to anyone else. To illustrate the potential of our secure framework, we show how it allows estimating the drowsiness of drivers from their EEG signals as would be possible in the unencrypted case, and at a very reasonable computational cost. Our solution is the first application of commodity-based SMC to EEG data, as well as the largest documented experiment of secret-sharing-based SMC in general, namely, with 15 players involved in all the computations.
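A minimal sketch of why secret-sharing-based SMC lets many users contribute EEG data to a linear regression without revealing it: each user additively shares the local sufficient statistics, the computing parties only ever add shares, and nothing but the aggregates needed for least squares is reconstructed. The party count, modulus, fixed-point scale, and toy data below are assumptions for illustration, not the paper's protocol.

```python
import random

P = 2**61 - 1  # large prime modulus for additive sharing
SCALE = 1000   # fixed-point scaling for real-valued signals

def share(v, n=3):
    """Split integer v into n additive shares mod P."""
    s = [random.randrange(P) for _ in range(n - 1)]
    s.append((v - sum(s)) % P)
    return s

# Each user holds one (EEG feature, drowsiness label) pair; only the
# aggregate sums needed for least squares are ever reconstructed.
users = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
n = 3
sum_xx = [0] * n  # per-party shares of sum of x*x
sum_xy = [0] * n  # per-party shares of sum of x*y

for x, y in users:
    xi, yi = int(x * SCALE), int(y * SCALE)
    for i, s in enumerate(share(xi * xi)):
        sum_xx[i] = (sum_xx[i] + s) % P
    for i, s in enumerate(share(xi * yi)):
        sum_xy[i] = (sum_xy[i] + s) % P

# Reconstruct only the aggregates and solve y = w*x in the clear.
agg_xx = sum(sum_xx) % P
agg_xy = sum(sum_xy) % P
w = agg_xy / agg_xx
print(round(w, 3))  # fitted slope: 2.0
```

No party ever sees an individual user's values: each party holds one uniformly random share per contribution, and only the pooled statistics are opened at the end.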