1,431 research outputs found

A practical study on shape space and its occupancy in negative selection

Authors: Ma Wanli; Sharma Dharmendra; Tran Dat
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

Crossref

University of Canberra Research Repository

Image Spam Classification using Deep Learning

Author: Singh Ajay Pal
Publication venue: SJSU ScholarWorks
Publication date: 01/04/2018

Image classification is a fundamental problem in computer vision and pattern recognition. Spam is unwanted bulk content, and image spam is unwanted content embedded inside images. Image spam poses a threat to email-based communication systems, and a large volume of unsolicited content now circulates over the internet. While many machine learning techniques successfully detect text-based spam, this is not the case for image spam, which can easily evade text-based spam detection systems. In this project, we explore and evaluate four deep learning techniques for detecting image spam. First, we study neural networks and deep neural networks, which we train on various image features. We explore their robustness on an improved dataset that was built specifically to outsmart current image spam detection techniques. Finally, we design two convolutional neural network architectures and provide experimental results for these alongside the existing VGG19 transfer learning model for detecting image spam. Our work offers a new tool for detecting image spam and is compared against recent related tools.

SJSU ScholarWorks

Kinesthetics eXtreme: An External Infrastructure for Monitoring Distributed Legacy Systems

Authors: Gross Philip N.; Kaiser Gail E.; Parekh Janak J.; Valetto Giuseppe
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2003

Autonomic computing - self-configuring, self-healing, self-optimizing applications, systems and networks - is widely believed to be a promising solution to ever-increasing system complexity and the spiraling costs of human system management as systems scale to global proportions. Most results to date, however, suggest ways to architect new software constructed from the ground up as autonomic systems, whereas in the real world, organizations continue to use stovepipe legacy systems and/or build 'systems of systems' that draw from a gamut of new and legacy components involving disparate technologies from numerous vendors. Our goal is to retrofit autonomic computing onto such systems, externally, without any need to understand or modify the code, and in many cases even when it is impossible to recompile. We present a meta-architecture implemented as active middleware infrastructure to explicitly add autonomic services via an attached feedback loop that provides continual monitoring and, as needed, reconfiguration and/or repair. Our lightweight design and separation of concerns enable easy adoption of individual components, as well as the full infrastructure, for use with a large variety of legacy systems, new systems, and systems of systems. We summarize several experiments spanning multiple domains.

Columbia University Academic Commons

Campus Safety Data Gathering, Classification, and Ranking Based on Clery-Act Reports

Author: Abo Elenin Walaa F
Publication venue: Digital Commons@Georgia Southern
Publication date: 01/01/2023

Most existing campus safety rankings are based on criminal incident history, with minimal or no consideration of campus security conditions and standard safety measures. Campus safety information published by universities and colleges is usually conceptual and qualitative rather than quantitative, and is based on the criminal records of these campuses. Thus, no explicit and trusted ranking method for these campuses considers the level of compliance with standard safety measures. A quantitative safety measure is important for comparing different campuses easily and for learning about specific campus safety conditions. In this thesis, we utilize the Clery-Act reports of campuses to automatically analyze their safety conditions and generate a safety rank based on these reports. We first provide a survey of campus safety and security measures. We utilize our survey results to provide an automated data-gathering method for capturing standard campus safety data from Clery-Act reports. We then utilize the collected information to classify existing campuses based on their safety conditions. Our research model is also capable of predicting the safety rank of a campus from its Clery-Act report by comparing it to existing Clery-Act reports of other campuses and to ranks reported in public resources. Our research in this thesis uses a number of languages, tools, and technologies, such as Python, shell scripts, text conversion, data mining, spreadsheets, and others. We provide a detailed description of our research work on this topic, explain our research methodology, and finally describe our findings and results. This research contributes to automated campus safety data generation, classification, and ranking.

Georgia Southern University: Digital Commons@Georgia Southern

A review of spam email detection: analysis of spammer strategies and the dataset shift problem

Authors: Alaiz Rodríguez Rocío; Alegre Gutiérrez Enrique; González Castro Víctor; Jáñez-Martino Francisco; López Fidalgo Eduardo
Publication venue: Springer Science and Business Media LLC
Publication date: 15/06/2022

Spam emails have traditionally been seen as just annoying and unsolicited emails containing advertisements, but they increasingly include scams, malware or phishing. In order to ensure security and integrity for users, organisations and researchers aim to develop robust filters for spam email detection. Recently, most spam filters based on machine learning algorithms published in academic journals report very high performance, but users are still reporting a rising number of frauds and attacks via spam emails. Two main challenges can be found in this field: (a) it is a very dynamic environment prone to the dataset shift problem and (b) it suffers from the presence of an adversarial figure, i.e. the spammer. Unlike classical spam email reviews, this one focuses particularly on the problems that this constantly changing environment poses. Moreover, we analyse the different spammer strategies used for contaminating the emails, and we review the state-of-the-art techniques to develop filters based on machine learning. Finally, we empirically evaluate and present the consequences of ignoring the matter of dataset shift in this practical field. Experimental results show that this shift may lead to severe degradation in the estimated generalisation performance, with error rates reaching values up to 48.81%. Open access publication funded by the Consorcio de Bibliotecas Universitarias de Castilla y León (BUCLE), charged to Operational Programme 2014ES16RFOP009 FEDER 2014-2020 DE CASTILLA Y LEÓN, Action: 20007-CL - Apoyo Consorcio BUCL

Leon University (Spain)

Image Spam Analysis

Author: Annadatha Annapurna Sowmya
Publication venue: SJSU ScholarWorks
Publication date: 08/06/2016

Image spam is unsolicited bulk email, where the message is embedded in an image. This technique is used to evade text-based spam filters. In this research, we analyze and compare two novel approaches for detecting spam images. Our first approach focuses on the extraction of a broad set of image features and selection of an optimal subset using a Support Vector Machine (SVM). Our second approach is based on Principal Component Analysis (PCA), where we determine eigenvectors for a set of spam images and compute scores by projecting images onto the resulting eigenspace. Both approaches provide high accuracy with low computational complexity. Further, we develop a new spam image dataset that should prove valuable for improving image spam detection capabilities.

SJSU ScholarWorks
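
The PCA-based approach summarized in the Image Spam Analysis abstract (eigenvectors from a set of spam images, then scores from projection onto the eigenspace) might look roughly like the sketch below. This is an illustration only, not the author's implementation: the function names `fit_eigenspace` and `score`, the use of reconstruction error as the score, the synthetic 16-dimensional vectors standing in for flattened images, and the choice of `k=2` are all assumptions made here.

```python
import numpy as np

def fit_eigenspace(train, k):
    """Return (mean, components) for the top-k principal directions of the rows of train."""
    mean = train.mean(axis=0)
    centered = train - mean
    # The rows of vt are eigenvectors of the covariance matrix of the centered data,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def score(image, mean, components):
    """Assumed score: distance between an image and its projection onto the eigenspace."""
    centered = image - mean
    weights = components @ centered          # coordinates in the eigenspace
    reconstruction = components.T @ weights  # back-projection into image space
    return float(np.linalg.norm(centered - reconstruction))

# Synthetic stand-in data (hypothetical): "spam" vectors lie near a 2-D subspace.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 16))
spam = rng.normal(size=(50, 2)) @ basis + 0.01 * rng.normal(size=(50, 16))

mean, comps = fit_eigenspace(spam, k=2)

spam_like = rng.normal(size=2) @ basis   # lies in the same subspace as the training set
other = 3.0 * rng.normal(size=16)        # unrelated vector

print("spam-like score:", score(spam_like, mean, comps))
print("other score:    ", score(other, mean, comps))
```

Under this assumed scoring rule, a low score means the image is well explained by the spam eigenspace; a threshold on the score would then separate spam-like images from others.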

Clustering and classification methods for spam analysis

Author: Smirnov Maksim
Publication date: 08/10/2018

Spam emails are a major tool for criminals to distribute malware, conduct fraudulent activity, sell counterfeit products, etc. Thus, security companies are interested in researching spam. Unfortunately, due to the spammers' detection-avoidance techniques, most of the existing tools for spam analysis are not able to provide accurate information about spam campaigns. Moreover, they are not able to link together campaigns initiated by the same sender. F-Secure, a cybersecurity company, collects vast amounts of spam for analysis. The threat intelligence collection from these messages currently involves a lot of manual work. In this thesis, we apply state-of-the-art data-analysis techniques to increase the level of automation in the analysis process, thus enabling the human experts to focus on high-level information such as campaigns and actors. The thesis discusses a novel method of spam analysis in which email messages are clustered by different characteristics and the clusters are presented as a graph. The graph representation allows the analyst to see evolving campaigns and even connections between related messages which themselves have no features in common. This makes our analysis tool more powerful than previous methods that simply cluster emails into sets. We implemented a proof-of-concept version of the analysis tool to evaluate the usefulness of the approach. Experiments show that the graph representation and clustering by different features make it possible to link together large and complex spam campaigns that were previously not detected. The tool also found evidence that different campaigns were likely to be organized by the same spammer. The results indicate that the graph-based approach is able to extract new, useful information about spam campaigns.

Aaltodoc Publication Archive

Sharing Computer Network Logs for Security and Privacy: A Motivation for New Methodologies of Anonymization

Authors: Slagell Adam J.; Yurcik William
Publication date: 03/09/2004

Logs are one of the most fundamental resources for any security professional. It is widely recognized by government and industry that it is both beneficial and desirable to share logs for the purpose of security research. However, the sharing is not happening, or not to the degree or magnitude that is desired. Organizations are reluctant to share logs because of the risk of exposing sensitive information to potential attackers. We believe this reluctance remains high because current anonymization techniques are weak and one-size-fits-all; or, better put, one size tries to fit all. We must develop standards and make anonymization available at varying levels, striking a balance between privacy and utility. Organizations have different needs and trust other organizations to different degrees. They must be able to map multiple anonymization levels with defined risks to the trust levels they share with (would-be) receivers. It is not until there are industry standards for multiple levels of anonymization that we will be able to move forward and achieve the goal of widespread sharing of logs for security researchers. Comment: 17 pages, 1 figur

arXiv.org e-Print Archive

CiteSeerX