    A Semi-automatic Method for Efficient Detection of Stories on Social Media

    Twitter has become one of the main sources of news for many people. As real-world events and emergencies unfold, Twitter is abuzz with hundreds of thousands of stories about the events. Some of these stories are harmless, while others could potentially be life-saving or sources of malicious rumors. Thus, it is critically important to be able to efficiently track stories that spread on Twitter during these events. In this paper, we present a novel semi-automatic tool that enables users to efficiently identify and track stories about real-world events on Twitter. We ran a user study with 25 participants, demonstrating that compared to more conventional methods, our tool can increase the speed and the accuracy with which users can track stories about real-world events. Comment: ICWSM'16, May 17-20, Cologne, Germany. In Proceedings of the 10th International AAAI Conference on Weblogs and Social Media (ICWSM 2016). Cologne, Germany.

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

    User evaluation outside the lab: the trial of Físchlár-News

    A user study of the Físchlár-News system was conducted in Spring 2004 with 16 users, each using the system for a 1-month period. Físchlár-News is an experimental online news archive that incorporates various automatic content-based video indexing techniques and a news story recommender algorithm to process and index the daily 9 o’clock broadcast news from TV, and allows its users to browse, search, receive recommendations for, and play news stories in a conventional web browser. Pre- and post-trial questionnaires, interaction logging and incident diary methods collected both qualitative and quantitative usage data during the trial period. While the details of the findings from this evaluation are reported elsewhere, in this paper we report the details of the methodology used and our experience of conducting this evaluation.

    Tweet Acts: A Speech Act Classifier for Twitter

    Speech acts are a way to conceptualize speech as action. This holds true for communication on any platform, including social media platforms such as Twitter. In this paper, we explored speech act recognition on Twitter by treating it as a multi-class classification problem. We created a taxonomy of six speech acts for Twitter and proposed a set of semantic and syntactic features. We trained and tested a logistic regression classifier using a data set of manually labelled tweets. Our method achieved state-of-the-art performance with an average F1 score of more than 0.70. We also explored classifiers with three different granularities (Twitter-wide, type-specific and topic-specific) in order to find the right balance between generalization and overfitting for our task. Comment: ICWSM'16, May 17-20, Cologne, Germany. In Proceedings of the 10th AAAI Conference on Weblogs and Social Media (ICWSM 2016). Cologne, Germany.