
18 research outputs found

Stacking classifiers for anti-spam filtering of e-mail

Authors: Androutsopoulos I.; Karkaletsis V.; Paliouras G.; Sakkis G.; Spyropoulos C. D.; Stamatopoulos P.
Publication date: 01/01/2001

We empirically evaluate a scheme for combining classifiers, known as stacked generalization, in the context of anti-spam filtering, a novel cost-sensitive application of text categorization. Unsolicited commercial e-mail, or "spam", floods mailboxes, causing frustration, wasting bandwidth, and exposing minors to unsuitable content. Using a public corpus, we show that stacking can improve the efficiency of automatically induced anti-spam filters, and that such filters can be used in real-life applications.

arXiv.org e-Print Archive

CiteSeerX
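
As a rough illustration of the stacked generalization idea named in the abstract above, here is a minimal sketch in plain Python. It is not the authors' setup: the toy corpus, the choice of level-0 learners (a naive Bayes scorer and a 1-nearest-neighbour rule), the logistic meta-learner, and all function names (`train_nb`, `train_knn`, `stack_fit`) are assumptions made for the example. The key step it tries to show is that the meta-learner is trained on out-of-fold predictions of the base learners.

```python
import math
from collections import Counter

# Made-up toy corpus: (tokens, label), label 1 = spam, 0 = legitimate.
DATA = [
    ("win cash prize now".split(), 1),
    ("cheap pills win big".split(), 1),
    ("free cash offer now".split(), 1),
    ("claim free prize today".split(), 1),
    ("meeting agenda for monday".split(), 0),
    ("project report attached".split(), 0),
    ("lunch on monday".split(), 0),
    ("please review the report".split(), 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_nb(rows):
    """Level-0 learner A: naive Bayes with add-one smoothing, returns P(spam)."""
    counts, n = {0: Counter(), 1: Counter()}, Counter()
    for toks, y in rows:
        counts[y].update(toks)
        n[y] += 1
    vocab = set(counts[0]) | set(counts[1])
    def predict(toks):
        logp = {}
        for y in (0, 1):
            total = sum(counts[y].values()) + len(vocab) + 1
            logp[y] = math.log((n[y] + 1) / (len(rows) + 2))
            logp[y] += sum(math.log((counts[y][t] + 1) / total) for t in toks)
        return sigmoid(logp[1] - logp[0])
    return predict

def train_knn(rows):
    """Level-0 learner B: label of the nearest stored message (Jaccard similarity)."""
    def predict(toks):
        s = set(toks)
        best = max(rows, key=lambda r: len(s & set(r[0])) / len(s | set(r[0])))
        return float(best[1])
    return predict

def fit_logistic(xs, ys, lr=0.5, epochs=500):
    """Level-1 (meta) learner: logistic regression fitted by plain SGD."""
    w = [0.0] * (len(xs[0]) + 1)  # w[0] is the bias
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            g = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) - y
            w[0] -= lr * g
            for j, xi in enumerate(x):
                w[j + 1] -= lr * g * xi
    return w

def stack_fit(rows, trainers, k=4):
    """Train the meta-learner on out-of-fold level-0 predictions."""
    meta_x = [None] * len(rows)
    for fold in range(k):
        train = [r for i, r in enumerate(rows) if i % k != fold]
        models = [t(train) for t in trainers]
        for i, (toks, _) in enumerate(rows):
            if i % k == fold:
                meta_x[i] = [m(toks) for m in models]
    w = fit_logistic(meta_x, [y for _, y in rows])
    final = [t(rows) for t in trainers]  # refit level-0 learners on all data
    def predict(toks):
        z = [m(toks) for m in final]
        return sigmoid(w[0] + sum(wi * zi for wi, zi in zip(w[1:], z)))
    return predict

spam_filter = stack_fit(DATA, [train_nb, train_knn])
print(spam_filter("free cash prize".split()))
print(spam_filter("monday project report".split()))
```

The stacked filter returns a spam probability, so a cost-sensitive deployment could move the decision threshold above 0.5 to make blocking legitimate mail less likely; that threshold choice is left out of the sketch.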

"May I borrow Your Filter?" Exchanging Filters to Combat Spam in a Community

Authors: Battiti Roberto; Cascella Roberto G.; Garg Anurag
Publication date: 01/11/2005

Leveraging social networks in computer systems can be effective in dealing with a number of trust and security issues. Spam is one such issue where the "wisdom of crowds" can be harnessed by mining the collective knowledge of ordinary individuals. In this paper, we present a mechanism through which members of a virtual community can exchange information to combat spam. Previous attempts at collaborative spam filtering have concentrated on digest-based indexing techniques to share digests or fingerprints of emails that are known to be spam. We take a different approach and allow users to share their spam filters instead, thus dramatically reducing the amount of traffic generated in the network. The resultant diversity in the filters and cooperation in a community allow it to respond to spam in an autonomic fashion. As a test case for exchanging filters we use the popular SpamAssassin spam filtering software and show that exchanging spam filters provides an alternative method to improve spam filtering performance.

Unitn-eprints Research

Determining the Drivers and Barriers to the Adoption of Smart Vending Machine

Author: Mnyakin Maxim
Publication venue: ResearchBerg Review of Science and Technology
Publication date: 28/12/2020

The Internet of Things (IoT) is transforming numerous industries, including the vending machine industry. Smart vending machines are one example of how the IoT is altering the way vending machines operate. By combining modern technologies such as internet connectivity and touch screens, smart vending machines can perform functions beyond merely dispensing products in exchange for payment. They can make purchasing more convenient for customers, track inventories in real time, and even take mobile payments through a smartphone app. This study surveyed 412 business owners and employed a stacking classifier to analyze the determinants of smart vending machine adoption. The findings indicate that improved security and safety, as well as decreased operational costs, are the primary drivers of adoption among firms that have adopted smart vending machines. Smart vending machines can be equipped with security cameras and alarms to deter theft and vandalism, as well as to prevent contamination or tampering. This can help to improve the general security and safety of the vending machine and of the products it dispenses. By automating processes such as inventory management and refilling, smart vending machines can also help to minimize operating expenses, saving the operator time and money. The findings also revealed that the primary barriers to adoption are upfront costs and technological challenges. The initial cost of purchasing and installing a smart vending machine might be too high, particularly for SMEs. Operators may also need to spend on technical help and training in order to use and maintain this equipment successfully. The future of vending machines is expected to see a steady move toward smart vending machines. As sophisticated technology becomes more widely accessible and inexpensive, more operators are likely to recognize its benefits and make the switch.

ResearchBerg

Multi-classifier classification of spam email on an ubiquitous multi-core architecture

Authors: Chonka Ashley; Islam Md. Rafiqul; Singh Jaipal; Zhou Wanlei
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2008

This paper presents an innovative fusion-based multi-classifier email classification system on a ubiquitous multi-core architecture. Many approaches use text-based single classifiers or multiple weakly trained classifiers to identify spam messages in a large email corpus. We build upon our previous work on multi-core by applying our ubiquitous multi-core framework to run our fusion-based multi-classifier architecture. By running each classifier process in parallel on its own dedicated core, we greatly improve the performance of our proposed multi-classifier-based filtering system. Our proposed architecture also safeguards user mailboxes against different malicious attacks. Our experimental results show that we achieved an average speedup of 30% at an average cost of 1.4 ms. We also reduced the incidence of false positives, which is one of the key challenges in spam filtering systems, and increased email classification accuracy substantially compared with single-classifier techniques.

Deakin Research Online

espace@Curtin
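
The abstract above describes running several classifiers in parallel and fusing their outputs. A minimal sketch of that pattern follows, under stated assumptions: the three rule-style classifiers, the `SPAM_WORDS` set, and majority voting as the fusion step are all hypothetical stand-ins, not the paper's method. A thread pool is used so the example stays short; in CPython, spreading CPU-bound classifiers across cores would need a process pool instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical keyword list, for illustration only.
SPAM_WORDS = {"prize", "winner", "free", "cash"}

def keyword_rule(msg):
    """Flags a message that contains any listed keyword as a whole token."""
    return any(tok in SPAM_WORDS for tok in msg.lower().split())

def shouting_rule(msg):
    """Flags a message whose letters are mostly upper case."""
    letters = [c for c in msg if c.isalpha()]
    return bool(letters) and sum(c.isupper() for c in letters) / len(letters) > 0.6

def link_rule(msg):
    """Flags a message that contains two or more links."""
    return msg.lower().count("http") >= 2

CLASSIFIERS = [keyword_rule, shouting_rule, link_rule]

def classify(msg):
    """Run every classifier concurrently, then fuse the votes by strict majority."""
    with ThreadPoolExecutor(max_workers=len(CLASSIFIERS)) as pool:
        votes = list(pool.map(lambda clf: clf(msg), CLASSIFIERS))
    return sum(votes) * 2 > len(votes)

print(classify("Free cash prize http://a.example http://b.example"))
print(classify("See you at lunch tomorrow"))
```

Requiring a majority rather than a single positive vote is one simple way to trade recall for fewer false positives, which the abstract names as a key concern.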

Hybrid GA-SVM for Efficient Feature Selection in E-mail Classification

Authors: Abimbola Adigun; Stephen Olabiyisi; Temitayo Fagbola
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 28/02/2012

Feature selection is a global combinatorial optimization problem in machine learning in which subsets of relevant features are selected to build robust learning models. The inclusion of irrelevant and redundant features in the dataset can result in poor predictions and high computational overhead. Thus, selecting relevant feature subsets can help reduce the computational cost of feature measurement, speed up the learning process and improve model interpretability. The SVM classifier has proven inefficient on large e-mail datasets: it fails to produce accurate classification results and consumes a lot of computational resources. In this study, a Genetic Algorithm-Support Vector Machine (GA-SVM) feature selection technique is developed to optimize the SVM classification parameters, the prediction accuracy and the computation time. The SpamAssassin dataset was used to validate the performance of the proposed system. The hybrid GA-SVM showed remarkable improvements over SVM in terms of classification accuracy and computation time. Keywords: E-mail Classification, Feature Selection, Genetic Algorithm, Support Vector Machine

International Institute for Science, Technology and Education (IISTE): E-Journals
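
To make the genetic-algorithm feature selection described above more concrete, here is a small sketch in plain Python. Several things are assumptions for illustration: the synthetic data, the GA settings, the size penalty in the fitness function, and above all the classifier, where a nearest-centroid rule is swapped in for the SVM so the example needs no external library. It covers only the feature-subset search, not the tuning of SVM parameters that the abstract also mentions.

```python
import random

def make_data(rng, n=60):
    """Synthetic data: features 0 and 1 carry the label, features 2..5 are noise."""
    rows = []
    for i in range(n):
        y = i % 2  # alternate labels so both classes are always present
        x = [y + rng.gauss(0, 0.3), y + rng.gauss(0, 0.3)]
        x += [rng.gauss(0, 1) for _ in range(4)]
        rows.append((x, y))
    return rows

def accuracy(mask, train, valid):
    """Nearest-centroid accuracy using only the features switched on in mask."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    cent = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        cent[label] = [sum(p[i] for p in pts) / len(pts) for i in idx]
    def predict(x):
        return min((0, 1), key=lambda c: sum((x[i] - m) ** 2
                                             for i, m in zip(idx, cent[c])))
    return sum(predict(x) == y for x, y in valid) / len(valid)

def select_features(train, valid, rng, n_feat=6, pop_size=12,
                    generations=15, p_mut=0.1):
    """GA over bit masks: keep the fitter half, then crossover and mutate."""
    def fitness(mask):
        # Validation accuracy minus a small penalty per selected feature.
        return accuracy(mask, train, valid) - 0.01 * sum(mask)
    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_feat)       # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = elite + children
    return max(pop, key=fitness)

rng = random.Random(0)
data = make_data(rng)
train, valid = data[:40], data[40:]
best_mask = select_features(train, valid, rng)
print(best_mask, accuracy(best_mask, train, valid))
```

Each bit of the mask switches one feature on or off, so the search space has 2^n subsets; the GA samples it instead of enumerating it, which is why the abstract frames feature selection as combinatorial optimization.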

Emerging Technologies in Healthcare: Analysis of UNOS Data Through Machine Learning

Author: Merekar Reyhan
Publication venue: CUNY Academic Works
Publication date: 18/05/2020

The healthcare industry is primed for a massive transformation in the coming decades due to emerging technologies such as Artificial Intelligence (AI) and Machine Learning. With a practical application to the UNOS (United Network for Organ Sharing) database, this thesis seeks to investigate how Machine Learning and analytic methods may be used to predict one-year heart transplantation outcomes. This study also sought to improve on predictive performances from prior studies by analyzing both Donor and Recipient data. Models built with algorithms such as Stacking and Tree Boosting gave the highest performance, with AUCs of 0.6810 and 0.6804, respectively. In this work, a roadmap was created that justifies the need for these technologies in healthcare. In application, the data was prepared, models were built using advanced algorithms, and important variables were selected. These steps were carried out with continual validation from experienced clinicians. To yield greater insights in this study, the dataset was split row-wise by factors such as LVAD Support, Donor/Recipient Gender Combinations, and Time Period; this rendered 8 new datasets for analysis. This work explores the trade-off between interpretability and performance in applying analytic methods to a real-world problem in this domain. Finally, forward-looking industry implications are discussed.

City University of New York
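
The AUC figures quoted in the abstract above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise definition follows; the labels and scores are made up for illustration and have no connection to the UNOS data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: 3 positives and 3 negatives, one positive ranked too low.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc(labels, scores))  # 8 of the 9 positive/negative pairs are ordered correctly
```

On this scale 0.5 corresponds to random ranking and 1.0 to a perfect one, which gives some context for values near 0.68.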