4 research outputs found
Effectiveness of Transformer Models on IoT Security Detection in StackOverflow Discussions
The Internet of Things (IoT) is an emerging concept that refers to
the billions of physical items, or "things", that are connected to the Internet
and that gather and exchange information between devices and systems.
However, IoT devices were not built with security in mind, which might lead to
security vulnerabilities in a multi-device system. Traditionally, we
investigated IoT issues by polling IoT developers and specialists. This
technique, however, is not scalable since surveying all IoT developers is not
feasible. Another way to look into IoT issues is to look at IoT developer
discussions on major online development forums like Stack Overflow (SO).
However, finding discussions that are relevant to IoT issues is challenging
since they are frequently not categorized with IoT-related terms. In this
paper, we present the "IoT Security Dataset", a domain-specific dataset of 7147
samples focused solely on IoT security discussions. As there are no automated
tools to label these samples, we manually labeled them. We further employed
multiple transformer models to automatically detect security discussions.
Through rigorous investigations, we found that IoT security discussions are
different and more complex than traditional security discussions. We
demonstrated a considerable performance loss (up to 44%) of transformer models
on cross-domain datasets when we transferred knowledge from a general-purpose
dataset "Opiner", supporting our claim. Thus, we built a domain-specific IoT
security detector with an F1-Score of 0.69. We have made the dataset public in
the hope that developers would learn more about the security discussion and
vendors would pay closer attention to product security.
An Interpretable Systematic Review of Machine Learning Models for Predictive Maintenance of Aircraft Engine
This paper presents an interpretable review of various machine learning and
deep learning models for predicting the maintenance needs of aircraft engines in
order to avoid disasters. One of the advantages of the strategy is that it can work
with modest datasets. In this study, sensor data is utilized to predict
aircraft engine failure within a predetermined number of cycles using LSTM,
Bi-LSTM, RNN, Bi-RNN, GRU, Random Forest, KNN, Naive Bayes, and Gradient
Boosting. We explain how deep learning and machine learning can be used to
generate predictions in predictive maintenance using a straightforward scenario
with just one data source. We applied LIME to the models to help us understand
why the machine learning models did not perform as well as the deep learning
models. An extensive analysis of the models' behavior is presented for several
test samples to shed light on their black-box nature. Accuracies of
97.8%, 97.14%, and 96.42% are achieved by GRU, Bi-LSTM, and LSTM, respectively,
which demonstrates the capability of the models to predict maintenance at an early
stage.
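The prediction target described above (failure within a predetermined number of cycles) can be framed as a binary label per cycle. The following is a minimal sketch, assuming run-to-failure data in which each engine's last recorded cycle is its failure point; the function name `label_failure_within` and the `window` parameter are illustrative and not taken from the paper.

```python
from typing import List


def label_failure_within(cycles: List[int], window: int) -> List[int]:
    """Label each cycle 1 if the engine fails within `window` cycles, else 0.

    Assumption: `cycles` holds the cycle indices recorded for ONE engine and
    the largest index is the cycle at which the engine failed.
    """
    last_cycle = max(cycles)
    # Remaining useful life (RUL) is the distance to the final recorded cycle.
    return [1 if (last_cycle - c) <= window else 0 for c in cycles]


# Hypothetical engine observed for 10 cycles, with a 3-cycle warning window.
engine_cycles = list(range(1, 11))
labels = label_failure_within(engine_cycles, window=3)
print(labels)  # cycles 7..10 are labeled 1, the earlier ones 0
```

Labels of this kind could then serve as the target for any of the sequence or tabular classifiers listed in the abstract.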
Bengali Fake Review Detection using Semi-supervised Generative Adversarial Networks
This paper investigates the potential of semi-supervised Generative
Adversarial Networks (GANs) to fine-tune pretrained language models in order to
distinguish Bengali fake reviews from real reviews using only a few annotated samples. With
the rise of social media and e-commerce, the ability to detect fake or
deceptive reviews is becoming increasingly important in order to protect
consumers from being misled by false information. Any machine learning model
will have trouble identifying a fake review, especially for a low-resource
language like Bengali. We have demonstrated that the proposed semi-supervised
GAN-LM architecture (generative adversarial network on top of a pretrained
language model) is a viable solution in classifying Bengali fake reviews as the
experimental results suggest that even with only 1024 annotated samples,
BanglaBERT with semi-supervised GAN (SSGAN) achieved an accuracy of 83.59% and
an F1-score of 84.89%, outperforming other pretrained language models
(BanglaBERT generator, Bangla BERT Base, and Bangla-Electra) by almost 3%, 4%, and
10%, respectively, in terms of accuracy. The experiments were conducted on a
manually labeled food review dataset consisting of a total of 6014 real and fake
reviews collected from various social media groups. Researchers who
struggle not only with fake review detection but also with other
classification problems owing to a lack of labeled data may find a solution in
our proposed methodology.
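To make the semi-supervised GAN idea more concrete, the sketch below shows the per-example discriminator loss in the commonly used K+1 class formulation, where the discriminator predicts one of K real classes plus an extra "generated" class. This is an assumption about the general technique, not the paper's exact implementation; here K = 2 (real review, fake review), and the "generated" class refers to generator output, which is distinct from a human-written fake review. The function names are illustrative.

```python
import math
from typing import List, Optional

EPS = 1e-12  # guards against log(0)


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(logits: List[float],
                       label: Optional[int],
                       is_generated: bool) -> float:
    """Loss for one example; the LAST logit is the extra 'generated' class.

    label: index of the true class among the K real classes, or None if the
           example is unlabeled.
    """
    p_generated = softmax(logits)[-1]
    if is_generated:
        # Generator output should be assigned to the 'generated' class.
        return -math.log(p_generated + EPS)
    # Unsupervised term: a real example should NOT look generated.
    loss = -math.log(1.0 - p_generated + EPS)
    if label is not None:
        # Supervised term: cross-entropy over the K real classes only.
        real_probs = softmax(logits[:-1])
        loss += -math.log(real_probs[label] + EPS)
    return loss


# Hypothetical logits: [real review, fake review, generated]
print(discriminator_loss([2.0, 0.5, -1.0], label=0, is_generated=False))
print(discriminator_loss([2.0, 0.5, -1.0], label=None, is_generated=False))
print(discriminator_loss([0.1, 0.2, 1.5], label=None, is_generated=True))
```

Because unlabeled examples still contribute through the unsupervised term, a setup of this kind can make use of data beyond the small annotated set.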
Rank Your Summaries: Enhancing Bengali Text Summarization via Ranking-based Approach
With the increasing need for text summarization techniques that are both
efficient and accurate, it becomes crucial to explore avenues that enhance the
quality and precision of pre-trained models specifically tailored for
summarizing Bengali texts. When it comes to text summarization tasks, there are
numerous pre-trained transformer models at one's disposal. Consequently, it
becomes quite a challenge to discern the most informative and relevant summary
for a given text among the various options generated by these pre-trained
summarization models. This paper aims to identify the most accurate and
informative summary for a given text by utilizing a simple but effective
ranking-based approach that compares the output of four different pre-trained
Bengali text summarization models. The process begins with
preprocessing the input text, which involves eliminating unnecessary elements
such as special characters and punctuation marks. Next, we utilize four
pre-trained summarization models to generate summaries, followed by applying a
text ranking algorithm to identify the most suitable summary. Ultimately, the
summary with the highest ranking score is chosen as the final one. To evaluate
the effectiveness of this approach, the generated summaries are compared
against human-annotated summaries using standard NLG metrics such as BLEU,
ROUGE, BERTScore, WIL, WER, and METEOR. Experimental results suggest that by
leveraging the strengths of each pre-trained transformer model and combining
them using a ranking-based approach, our methodology significantly improves the
accuracy and effectiveness of Bengali text summarization.
Comment: Accepted in International Conference on Big Data, IoT and Machine
Learning 2023 (BIM 2023)
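The pipeline described in the abstract above (preprocess the input, collect candidate summaries from several models, rank them, keep the top one) can be sketched as follows. The abstract does not name the text ranking algorithm, so this sketch assumes a simple stand-in: cosine similarity between bag-of-words vectors of the source and each candidate. The model names and example strings are hypothetical, and English text is used only for readability.

```python
import math
import unicodedata
from collections import Counter
from typing import Dict, List, Tuple


def preprocess(text: str) -> str:
    """Replace punctuation and symbols with spaces, then collapse whitespace.

    Unicode categories are used instead of a \\w regex so that combining
    marks (such as Bengali vowel signs) are preserved.
    """
    cleaned = "".join(
        " " if unicodedata.category(ch)[0] in ("P", "S") else ch
        for ch in text
    )
    return " ".join(cleaned.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_summaries(source: str,
                   candidates: Dict[str, str]) -> List[Tuple[str, float]]:
    """Score each candidate against the source and sort best-first."""
    src_vec = Counter(preprocess(source).split())
    scored = [
        (name, cosine(src_vec, Counter(preprocess(summary).split())))
        for name, summary in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical outputs from four summarization models.
source = "the cat sat on the mat. the dog barked."
candidates = {
    "model_a": "the cat sat on the mat",
    "model_b": "a bird flew away",
    "model_c": "dog",
    "model_d": "",
}
ranking = rank_summaries(source, candidates)
best_name, best_score = ranking[0]
print(best_name, round(best_score, 3))
```

Any other scoring function could be swapped into `rank_summaries` without changing the selection step, which always keeps the highest-scoring candidate.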