2,194 research outputs found

BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology

Author: Banos Vangelis
Kasioumis Nikolaos
Kim Yunhyong
Kopidaki Stella
Ross Seamus
Rynning Morten
Stepanyan Karen
Publication venue: BlogForever
Publication date: 25/10/2013

This report is a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to its detection. Although the report was written to help identify possible approaches to spam detection as a component of the BlogForever software, the discussion has been extended to include observations on the historical, social and practical value of spam, and proposals for other ways of dealing with spam within the repository without necessarily removing it. The report contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research on spam that focuses on the weblog context. It concludes with a proposal for a spam detection workflow that might form the basis for the spam detection component of the BlogForever software.

ZENODO

Enlighten

BlogForever D2.4: Weblog spider prototype and associated methodology

Author: Banos V.
Gulliksen M.
Joy M.
Manolopoulos I.
Rynning M.
Stepanyan K.
Tselepidis I.
Publication date: 25/10/2013

This document presents the evaluation of different solutions for capturing blogs, sets out the established methodology, and describes the developed blog spider prototype.

ZENODO

BlogForever D5.2: Implementation of Case Studies

Author: Arampatzis S.
Arango-Docio S.
Banos E.
Gkotsis G.
Kopidaki S.
Manolopoulos I.
Pinsent E.
Rynning M.
Sleeman P.
Stepanyan K.
Trochidis I.
Publication date: 25/10/2013

This document presents the internal and external testing results for the BlogForever case studies. The evaluation of the BlogForever implementation process is tabulated under the most relevant themes and aspects identified during testing. The case studies provide feedback on the sustainability of the platform in terms of potential users’ needs, and information on its possible long-term impact.

ZENODO

Proliferation and detection of blog spam

Author: Abu-Nimeh S.
Chen T.M.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

The ease of posting comments and links in blogs has attracted spammers as an alternative venue to conventional email. An experimental study investigates the nature and prevalence of blog spam. Using Defensio logs, the authors collected and analyzed more than one million blog comments during the last two weeks of June 2009. They used a support vector machine (SVM) classifier combined with heuristics to identify spam posters' IP addresses, autonomous system numbers (ASN), and IP blocks. Experimental results show that more than 75 percent of blog comments during the reporting period are spam. In addition, the results show that blog spammers likely operate from a few colocation facilities.

City Research Online

Crossref

Cronfa at Swansea University

BlogForever D2.6: Data Extraction Methodology

Author: Banos V.
Davis R.
Gkotsis G.
Pinsent E.
Stepanyan K.
Publication date: 25/10/2013

This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report then proposes a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY