
    BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology

    This report is a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to its detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForever software, the discussion has been extended to include observations on the historical, social and practical value of spam, and proposals for other ways of dealing with spam within the repository without necessarily removing it. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research related to spam, focusing on work set in the weblog context. It concludes with a proposal for a spam detection workflow that might form the basis for the spam detection component of the BlogForever software.

    A Longitudinal Study of Factors that Affect User Interactions with Social Media and Email Spam

    Given the rapid growth of social media and the increasing prevalence of spam, it is crucial to understand users’ interactions with unsolicited content in order to develop effective countermeasures against spam. This thesis explores the factors that influence users’ decisions to interact with spam on social media and email. It builds upon prior work, which serves as a foundation for further research and for conducting a longitudinal analysis. Our results are based on the analysis of 221 responses collected through an online survey. The survey gathered not only demographic information such as age, gender, and race but also data on education, spam training, interaction with spam, and experiences of being a victim of spam. About 87% of respondents stated that they sometimes, often, or always encounter spam on social media, but only 23% sometimes, often, or always interact with it before knowing it is spam, and 10% sometimes, often, or always interact with social media spam after knowing it is spam. Of the 75% of respondents who stated that they sometimes, often, or always encounter email spam, approximately 13% stated that they sometimes, often, or always interact with email spam before knowing it is spam, and 6% stated that they sometimes, often, or always interact with email spam after knowing it is spam. Only 38% of the users stated that they may have been victims of social media spam, and 21% stated that they may have been victims of email spam. Among the factors analyzed, only age had an effect on reporting email spam, but not social media spam. A STEM education was found to reduce the likelihood of being a victim of both social media and email spam, as well as the likelihood of interacting with both email and social media spam, but only before users knew they were interacting with spam. Interestingly, formal spam training did not show any statistical significance in determining how users interact with, report, or become victims of social media spam, although there was an effect on the identification of email spam. To quantify the effect of different factors on individuals falling victim to spam on social media and email, a logistic regression analysis was performed. The findings suggest that individuals with a higher attained degree and a STEM background are the least likely to be victims of spam.
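The abstract above says a logistic regression was used to quantify how factors such as STEM background and attained degree relate to being a spam victim. The following is only a minimal sketch of that kind of analysis, not the thesis's actual model: the feature names, the tiny data set, and the helper functions `fit_logistic` and `predict_proba` are all hypothetical and invented for illustration, and none of the numbers come from the survey.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Estimated probability of the positive class (here: 'victim')."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # derivative of log-loss w.r.t. score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Made-up toy rows (NOT survey data): [has_stem_background, degree_level 0-3]
X = [[1, 3], [1, 2], [1, 1], [1, 3], [1, 2],
     [0, 1], [0, 0], [0, 2], [0, 0], [0, 1]]
# Made-up labels: 1 = reported being a victim, 0 = did not
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

w, b = fit_logistic(X, y)
# exp(coefficient) is the odds ratio associated with a one-unit increase in a factor
odds_ratios = [math.exp(wj) for wj in w]
```

In a sketch like this, an odds ratio below 1 for a factor means that factor is associated with lower odds of the outcome in the toy data; a real analysis would also report confidence intervals and significance tests, which are omitted here.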

    Blogs as a Means of Preservation Selection for the World Wide Web

    Currently, there is no strong system of selection in place for preserving content on the Web. This study examines whether the decentralized and distributed link selection that takes place within the blogging community can serve as a means of preservation selection. The purpose of this study is to compare the blog aggregators Daypop, Blogdex, and BlogPulse for their ability to collect content of archival quality. The study analyzes the content selected by these aggregators to determine whether any of the content linked to most frequently on a given day is of archival quality. Archival quality is determined by comparing the content from the aggregator lists to criteria assembled for the study from a variety of archival policies and principles.

    CLAD: A Complex and Long Activities Dataset with Rich Crowdsourced Annotations

    This paper introduces a novel activity dataset which exhibits real-life and diverse scenarios of complex, temporally-extended human activities and actions. The dataset presents a set of videos of actors performing everyday activities in a natural and unscripted manner. The dataset was recorded using a static Kinect 2 sensor, which is commonly used on many robotic platforms. The dataset comprises RGB-D images, point cloud data, and automatically generated skeleton tracks, in addition to crowdsourced annotations. Furthermore, we describe the methodology used to acquire annotations through crowdsourcing. Finally, some activity recognition benchmarks are presented using current state-of-the-art techniques. We believe that this dataset is particularly suitable as a testbed for activity recognition research, but it is also applicable to other common tasks in robotics/computer vision research such as object detection and human skeleton tracking.

    Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning

    Learning-based pattern classifiers, including deep networks, have shown impressive performance in several application domains, ranging from computer vision to cybersecurity. However, it has also been shown that adversarial input perturbations carefully crafted either at training or at test time can easily subvert their predictions. The vulnerability of machine learning to such wild patterns (also referred to as adversarial examples), along with the design of suitable countermeasures, has been investigated in the research field of adversarial machine learning. In this work, we provide a thorough overview of the evolution of this research area over the last ten years and beyond, starting from pioneering, earlier work on the security of non-deep learning algorithms up to more recent work aimed at understanding the security properties of deep learning algorithms, in the context of computer vision and cybersecurity tasks. We report interesting connections between these apparently different lines of work, highlighting common misconceptions related to the security evaluation of machine-learning algorithms. We review the main threat models and attacks defined to this end, and discuss the main limitations of current work, along with the corresponding future challenges towards the design of more secure learning algorithms. Comment: Accepted for publication in Pattern Recognition, 201
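The abstract above describes test-time input perturbations that subvert a classifier's prediction. As a rough illustration only, the sketch below applies a sign-based perturbation to a linear classifier, where the gradient of the score with respect to the input is simply the weight vector. The weights, the input, the budget `eps`, and the function names are all hypothetical and are not taken from the surveyed paper.

```python
def sign(v: float) -> int:
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def score(w, b, x) -> float:
    """Linear decision score w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x) -> int:
    """Class 1 if the score is positive, otherwise class 0."""
    return 1 if score(w, b, x) > 0 else 0

def evade(w, b, x, eps):
    """Shift every feature by at most eps in the direction that pushes
    the score away from the currently predicted class."""
    direction = -1 if predict(w, b, x) == 1 else 1
    return [xi + direction * eps * sign(wi) for xi, wi in zip(x, w)]

# Hypothetical classifier and input, chosen only for illustration
w, b = [2.0, -1.0, 0.5], -0.5
x = [1.0, 0.5, 1.0]

x_adv = evade(w, b, x, eps=0.5)
original_label = predict(w, b, x)
adversarial_label = predict(w, b, x_adv)
```

For a linear model, a perturbation of this form changes the score by `eps` times the sum of the absolute weights, so whether the label flips depends on how far the original input sits from the decision boundary relative to that budget.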

    BlogForever D2.4: Weblog spider prototype and associated methodology

    The purpose of this document is to present the evaluation of different solutions for capturing blogs and the established methodology, and to describe the developed blog spider prototype.