243 research outputs found

    BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology

    This report is a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to its detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForever software, the discussion has been extended to include observations on the historical, social and practical value of spam, and proposals for other ways of dealing with spam within the repository without necessarily removing it. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research on spam, focusing on work in the weblog context. It concludes with a proposal for a spam detection workflow that might form the basis for the spam detection component of the BlogForever software.

    Link ecosystem of the Portuguese blogosphere

    Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201

    The Influence of Image Quality on User Interactions

    The purpose of this paper is to evaluate the effect of image size on the number of user interactions across different blogging platforms. Earlier studies (Kornejeva, 2012; Liao et al., 2013) indicate that visual components of blogs can impact readers' experiences with blog content. This study aims to further explore the ways in which visual components impact readers. The number of images, the sizes of the images, the number of comments and the number of non-comment interactions per post were collected from 400 WordPress posts and 400 Blogger posts. While no effect of image size was detected, an effect of image presence was detected. Opportunities for further research are discussed, as are the implications of the findings of this study. Master of Science in Information Science

    Hierarchical Characterization and Generation of Blogosphere Workloads

    We present a thorough characterization of the access patterns in blogspace, which comprises a rich interconnected web of blog postings and comments by an increasingly prominent user community that collectively define what has become known as the blogosphere. Our characterization of over 35 million read, write, and management requests spanning a 28-day period is done at three different levels. The user view characterizes how individual users interact with blogosphere objects (blogs); the object view characterizes how individual blogs are accessed; the server view characterizes the aggregate access patterns of all users to all blogs. The more-interactive nature of the blogosphere leads to interesting traffic and communication patterns, which are different from those observed for traditional web content. We identify and characterize novel features of the blogosphere workload, and we show the similarities and differences between typical web server workloads and blogosphere server workloads. Finally, based on our main characterization results, we build a new synthetic blogosphere workload generator called GBLOT, which aims at mimicking closely a stream of requests originating from a population of blog users. Given the increasing share of blogspace traffic, realistic workload models and tools are important for capacity planning and traffic engineering purposes. UOL (Bolsa Pesquisa 20060520221328a); National Science Foundation (072064, 0735974, 0524477, 0520166, 0205294