Symbiotic data mining for personalized spam filtering
Unsolicited e-mail (spam) is a severe problem due to intrusion of privacy, online fraud, viruses and time spent reading unwanted messages. To solve this issue, Collaborative Filtering (CF) and Content-Based Filtering (CBF) solutions have been adopted. We propose a new CBF-CF hybrid approach called Symbiotic Data Mining (SDM), which aims at aggregating distinct local filters in order to improve filtering at a personalized level using collaboration while preserving privacy. We apply SDM to spam e-mail detection and compare it with a local CBF filter (i.e. Naive Bayes). Several experiments were conducted by using a novel corpus based on the well-known Enron datasets mixed with recent spam. The results show that the symbiotic strategy is competitive in performance when compared to CBF and also more robust to contamination attacks. Fundação para a Ciência e a Tecnologia (FCT) - PTDC/EIA/64541/2006
Symbiotic filtering for spam email detection
This paper presents a novel spam filtering technique called Symbiotic Filtering (SF) that aggregates distinct local filters from several users to improve the overall performance of spam detection. SF is a hybrid approach combining some features from both Collaborative (CF) and Content-Based Filtering (CBF). It allows for the use of social networks to personalize and tailor the set of filters that serve as input to the filtering. A comparison is performed against the commonly used Naive Bayes CBF algorithm. Several experiments were held with the well-known Enron data, under both fixed and incremental symbiotic groups. We show that our system is competitive in performance and is robust against both dictionary and focused contamination attacks. Moreover, it can be implemented and deployed with little effort and low communication costs, while assuring privacy. Fundação para a Ciência e a Tecnologia (FCT) - bolsa PTDC/EIA/64541/200
On Recommendation of Learning Objects using Felder-Silverman Learning Style Model
The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. The e-learning recommender system in learning institutions is increasingly becoming the preferred mode of delivery, as it enables learning anytime, anywhere. However, delivering personalised course learning objects based on learner preferences is still a challenge. Current mainstream recommendation algorithms, such as Collaborative Filtering (CF) and Content-Based Filtering (CBF), deal with only two types of entities, namely users and items with their ratings. However, these methods do not pay attention to student preferences, such as learning styles, which are especially important for the accuracy of course learning object prediction or recommendation. Moreover, several recommendation techniques experience cold-start and rating sparsity problems. To address the challenge of improving the quality of recommender systems, in this paper a novel recommender algorithm for machine learning is proposed, which combines students' actual ratings with their learning styles to recommend Top-N course learning objects (LOs). Various recommendation techniques are considered in an experimental study investigating the best technique to use in predicting student ratings for e-learning recommender systems. We use the Felder-Silverman Learning Styles Model (FSLSM) to represent both the student learning styles and the learning object profiles. The predicted rating has been compared with the actual student rating. This approach was tested with 80 students on an online course created in the MOODLE Learning Management System, while the evaluation of the experiments has been performed with the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). The results of the experiment verify that the proposed approach provides a higher prediction rating and significantly increases the accuracy of the recommendation.
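The two evaluation metrics named in the abstract above, MAE and RMSE, can be sketched as follows. This is a minimal illustration of the standard definitions, not the authors' evaluation code; the function names and the sample ratings are hypothetical and are not data from the study.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: the average of |actual - predicted|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: the square root of the mean squared error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical ratings on a 1-5 scale (illustrative only).
actual_ratings = [4, 3, 5, 2]
predicted_ratings = [3.5, 3.0, 4.0, 3.0]

print("MAE:", mae(actual_ratings, predicted_ratings))
print("RMSE:", rmse(actual_ratings, predicted_ratings))
```

Lower values mean predicted ratings are closer to the actual ones. RMSE penalises large individual errors more heavily than MAE, which is one reason the two are often reported together.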
LRMM: Learning to Recommend with Missing Modalities
Multimodal learning has shown promising performance in content-based
recommendation due to the auxiliary user and item information of multiple
modalities such as text and images. However, the problem of incomplete and
missing modality is rarely explored and most existing methods fail in learning
a recommendation model with missing or corrupted modalities. In this paper, we
propose LRMM, a novel framework that mitigates not only the problem of missing
modalities but also more generally the cold-start problem of recommender
systems. We propose modality dropout (m-drop) and a multimodal sequential
autoencoder (m-auto) to learn multimodal representations for complementing and
imputing missing modalities. Extensive experiments on real-world Amazon data
show that LRMM achieves state-of-the-art performance on rating prediction
tasks. More importantly, LRMM is more robust than previous methods in alleviating
data-sparsity and the cold-start problem. Comment: 11 pages, EMNLP 201
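The idea behind modality dropout mentioned in the abstract above can be sketched roughly as follows: during training, whole modalities are randomly blanked out so the model learns to cope when one is missing. This is only an illustration under stated assumptions, not the LRMM implementation. The function name, the dict-of-vectors representation, zero-filling as the "missing" marker, and the rule of always keeping at least one modality are all hypothetical choices made for this example.

```python
import random


def modality_dropout(features, drop_prob, rng):
    """Return a copy of `features` with whole modalities zeroed at random.

    features:  dict mapping a modality name to a list of floats
    drop_prob: probability of dropping each modality independently
    rng:       a random.Random instance, passed in for reproducibility

    Assumption for this sketch: at least one modality is always kept,
    so the model never receives an all-zero input.
    """
    names = list(features)
    kept = [name for name in names if rng.random() >= drop_prob]
    if not kept:
        kept = [rng.choice(names)]
    return {
        name: list(vec) if name in kept else [0.0] * len(vec)
        for name, vec in features.items()
    }


# Hypothetical item with a text embedding and an image embedding.
item = {"text": [0.2, 0.7, 0.1], "image": [0.9, 0.4]}
print(modality_dropout(item, drop_prob=0.5, rng=random.Random(0)))
```

Dropping a whole modality at once, rather than individual features, mimics the situation at inference time where, for example, an item has a description but no image.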
Hybrid Collaborative Filtering with Autoencoders
Collaborative Filtering aims at exploiting the feedback of users to provide
personalised recommendations. Such algorithms look for latent variables in a
large sparse matrix of ratings. They can be enhanced by adding side information
to tackle the well-known cold-start problem. While Neural Networks have had
tremendous success in image and speech recognition, they have received less
attention in Collaborative Filtering. This is all the more surprising given that
Neural Networks are able to discover latent variables in large and
heterogeneous datasets. In this paper, we introduce a Collaborative Filtering
Neural network architecture, CFN, which computes a non-linear Matrix
Factorization from sparse rating inputs and side information. We show
experimentally on the MovieLens and Douban datasets that CFN outperforms the
state of the art and benefits from side information. We provide an
implementation of the algorithm as a reusable plugin for Torch, a popular
Neural Network framework.
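Training an autoencoder on sparse rating inputs, as the abstract above describes, usually means the reconstruction error has to ignore ratings that were never given. The sketch below shows one common way to do that, a mean squared error restricted to observed entries. It is an assumption made for illustration: the abstract does not state CFN's exact loss, and the function name and the use of `None` to mark a missing rating are hypothetical.

```python
def masked_mse(ratings, reconstruction):
    """Mean squared error over observed ratings only.

    ratings:        list of floats, with None marking a missing rating
    reconstruction: list of floats produced by the model, same length
    """
    if len(ratings) != len(reconstruction):
        raise ValueError("inputs must have equal length")
    observed = [(r, x) for r, x in zip(ratings, reconstruction) if r is not None]
    if not observed:
        return 0.0  # nothing observed, so there is no error to measure
    return sum((r - x) ** 2 for r, x in observed) / len(observed)


# Hypothetical user row: two known ratings and one missing rating.
user_ratings = [5.0, None, 3.0]
model_output = [4.0, 2.0, 3.0]
print(masked_mse(user_ratings, model_output))
```

The prediction for the missing entry (2.0 here) does not affect the loss, so the model is free to fill it in, which is what makes the output usable as a recommendation score.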
