Search CORE

948 research outputs found

BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology

Author: Banos Vangelis
Kasioumis Nikolaos
Kim Yunhyong
Kopidaki Stella
Ross Seamus
Rynning Morten
Stepanyan Karen
Publication venue: BlogForever
Publication date: 25/10/2013
Field of study

This report is written as a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to their detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForver software, the discussion has been extended to include observations related to the historical, social and practical value of spam, and proposals of other ways of dealing with spam within the repository without necessarily removing them. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research related to spam focusing on those that appear in the weblog context, concluding in a proposal for a spam detection workflow that might form the basis for the spam detection component of the BlogForever software

ZENODO

Enlighten

Spam-T5: Benchmarking Large Language Models for Few-Shot Email Spam Detection

Author: Labonne Maxime
Moran Sean
Publication venue
Publication date: 05/04/2023
Field of study

This paper investigates the effectiveness of large language models (LLMs) in email spam detection by comparing prominent models from three distinct families: BERT-like, Sentence Transformers, and Seq2Seq. Additionally, we examine well-established machine learning techniques for spam detection, such as Na\"ive Bayes and LightGBM, as baseline methods. We assess the performance of these models across four public datasets, utilizing different numbers of training samples (full training set and few-shot settings). Our findings reveal that, in the majority of cases, LLMs surpass the performance of the popular baseline techniques, particularly in few-shot scenarios. This adaptability renders LLMs uniquely suited to spam detection tasks, where labeled samples are limited in number and models require frequent updates. Additionally, we introduce Spam-T5, a Flan-T5 model that has been specifically adapted and fine-tuned for the purpose of detecting email spam. Our results demonstrate that Spam-T5 surpasses baseline models and other LLMs in the majority of scenarios, particularly when there are a limited number of training samples available. Our code is publicly available at https://github.com/jpmorganchase/emailspamdetection

arXiv.org e-Print Archive

Deep learning to filter SMS spam

Author: Banerjee Snehasish
Roy Pradeep Kumar
Singh Jyoti Prakash
Publication venue
Publication date: 01/01/2020
Field of study

The popularity of short message service (SMS) has been growing over the last decade. For businesses, these text messages are more effective than even emails. This is because while 98% of mobile users read their SMS by the end of the day, about 80% of the emails remain unopened. The popularity of SMS has also given rise to SMS Spam, which refers to any irrelevant text messages delivered using mobile networks. They are severely annoying to users. Most existing research that has attempted to filter SMS Spam has relied on manually identified features. Extending the current literature, this paper uses deep learning to classify Spam and Not-Spam text messages. Specifically, Convolutional Neural Network and Long Short-term memory models were employed. The proposed models were based on text data only, and self-extracted the feature set. On a benchmark dataset consisting of 747 Spam and 4,827 Not-Spam text messages, a remarkable accuracy of 99.44% was achieved

White Rose Research Online

A Review on mobile SMS Spam filtering techniques

Author: Abd Latiff Muhammad Shafie
Abdul-Salaam Gaddafi
Abdulhamid Shafi'i Muhammad
Abubakar Adamu
Chiroma Haruna
Herawan Tutut
Osho Oluwafemi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Under short messaging service (SMS) spam is understood the unsolicited or undesired messages received on mobile phones. These SMS spams constitute a veritable nuisance to the mobile subscribers. This marketing practice also worries service providers in view of the fact that it upsets their clients or even causes them lose subscribers. By way of mitigating this practice, researchers have proposed several solutions for the detection and filtering of SMS spams. In this paper, we present a review of the currently available methods, challenges, and future research directions on spam detection techniques, filtering, and mitigation of mobile SMS spams. The existing research literature is critically reviewed and analyzed. The most popular techniques for SMS spam detection, filtering, and mitigation are compared, including the used data sets, their findings, and limitations, and the future research directions are discussed. This review is designed to assist expert researchers to identify open areas that need further improvement

The International Islamic University Malaysia Repository

Artificial intelligence in the cyber domain: Offense and defense

Author: Diep Quoc Bao
Truong Thanh Cong
Zelinka Ivan
Publication venue: 'MDPI AG'
Publication date: 01/01/2020
Field of study

Artificial intelligence techniques have grown rapidly in recent years, and their applications in practice can be seen in many fields, ranging from facial recognition to image analysis. In the cybersecurity domain, AI-based techniques can provide better cyber defense tools and help adversaries improve methods of attack. However, malicious actors are aware of the new prospects too and will probably attempt to use them for nefarious purposes. This survey paper aims at providing an overview of how artificial intelligence can be used in the context of cybersecurity in both offense and defense.Web of Science123art. no. 41

Multidisciplinary Digital Publishing Institute

DSpace at VSB Technical University of Ostrava

Investigating and Validating Scam Triggers: A Case Study of a Craigslist Website

Author: Hirei Hassan Mohamed
Publication venue: The Repository at St. Cloud State
Publication date: 01/08/2020
Field of study

The internet and digital infrastructure play an important role in our day-to-day live, and it has also a huge impact on the organizations and how we do business transactions every day. Online business is booming in this 21st century, and there are many online platforms that enable sellers and buyers to do online transactions collectively. People can sell and purchase products that include vehicles, clothes, and shoes from anywhere and anytime. Thus, the purpose of this study is to identify and validate scam triggers using Craigslist as a case study. Craigslist is one of the websites where people can post advertising to sell and buy personal belongings online. However, with the growing number of people buying and selling, new threats and scams are created daily. Private cars are among the most significant items sold and purchased over the craigslist website. In this regard, several scammers have been drawn by the large number of vehicles being traded over craigslist. Scammers also use this forum to cheat others and exploit the vulnerable. The study identified online scam triggers including Bad key words, dealers’ posts as owners, personal email, multiple location, rogue picture and voice over IP to detect online scams that exists in craigslist. The study also found over 360 ads from craigslist based on our scam trigger. Finally, the study validated each and every one of the scam triggers and found 53.31% of our data is likelihood to be considered as a scam

St. Cloud State University

Leveraging Sociological Models for Predictive Analytics

Author: Kristin Glass
Richard Colbaugh
Publication venue
Publication date
Field of study

Abstract—There is considerable interest in developing techniques for predicting human behavior, for instance to enable emerging contentious situations to be forecast or the nature of ongoing but “hidden ” activities to be inferred. A promising approach to this problem is to identify and collect appropriate empirical data and then apply machine learning methods to these data to generate the predictions. This paper shows the performance of such learning algorithms often can be improved substantially by leveraging sociological models in their development and implementation. In particular, we demonstrate that sociologically-grounded learning algorithms outperform gold-standard methods in three important and challenging tasks: 1.) inferring the (unobserved) nature of relationships in adversarial social networks, 2.) predicting whether nascent social diffusion events will “go viral”, and 3.) anticipating and defending future actions of opponents in adversarial settings. Significantly, the new algorithms perform well even when there is limited data available for their training and execution. Keywords—predictive analysis, sociological models, social networks, empirical analysis, machine learning. I

CiteSeerX