5,561 research outputs found

    i-JEN: Visual interactive Malaysia crime news retrieval system

    Supporting crime news investigation requires a mechanism for monitoring the current and past status of criminal events. We believe this can be well facilitated by focusing on the user interface and the crime event model. In this paper we discuss the development of the Visual Interactive Malaysia Crime News Retrieval System (i-JEN) and describe the approach, the user studies conducted and planned, the system architecture and the future plan. Our main objectives are to construct crime-based events; investigate the use of crime-based events in improving classification and clustering; develop an interactive crime news retrieval system; visualize crime news in an effective and interactive way; integrate these components into a usable and robust system; and evaluate its usability and performance. The system will serve as a news monitoring system that automatically organizes, retrieves and presents crime news so as to support effective monitoring, searching and browsing for its target user groups: the general public, news analysts, and policemen or crime investigators. The study will contribute to a better understanding of crime data consumption in the Malaysian context, as well as a system whose visualisation features address crime data, with the eventual goal of combating crime.

    Time Aware Knowledge Extraction for Microblog Summarization on Twitter

    Microblogging services like Twitter and Facebook collect millions of pieces of user-generated content every moment about trending news, ongoing events, and so on. Nevertheless, finding information of interest among the huge number of available posts, which are often noisy and redundant, is very difficult. In general, social media analytics services have attracted increasing attention from both research and industry. Specifically, the dynamic context of microblogging requires managing not only the meaning of information but also the evolution of knowledge over the timeline. This work defines the Time Aware Knowledge Extraction (TAKE) methodology, which relies on a temporal extension of Fuzzy Formal Concept Analysis. In particular, a microblog summarization algorithm has been defined that filters the concepts organized by TAKE in a time-dependent hierarchy. The algorithm addresses topic-based summarization on Twitter. Besides considering the timing of the concepts, another distinguishing feature of the proposed microblog summarization framework is the possibility of producing a more or less detailed summary, according to the user's needs, with good levels of quality and completeness, as highlighted in the experimental results. Comment: 33 pages, 10 figures

    EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets

    This article introduces a new language-independent approach for creating a large-scale, high-quality test collection of tweets that supports multiple information retrieval (IR) tasks without running a shared-task campaign. The adopted approach (demonstrated over Arabic tweets) designs the collection around significant (i.e., popular) events, which enables the development of topics that represent frequent information needs of Twitter users for which rich content exists. That inherently facilitates the support of multiple tasks that generally revolve around events, namely event detection, ad-hoc search, timeline generation, and real-time summarization. The key highlights of the approach include diversifying the judgment pool via interactive search and multiple manually crafted queries per topic, collecting high-quality annotations via crowd-workers for relevancy and in-house annotators for novelty, filtering out low-agreement topics and inaccessible tweets, and providing multiple subsets of the collection for better availability. Applying our methodology to Arabic tweets resulted in EveTAR, the first freely available tweet test collection for multiple IR tasks. EveTAR includes a crawl of 355M Arabic tweets and covers 50 significant events for which about 62K tweets were judged with substantial average inter-annotator agreement (Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating existing algorithms in the respective tasks. Results indicate that the new collection can support reliable ranking of IR systems that is comparable to similar TREC collections, while providing strong baseline results for future studies over Arabic tweets.
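
    The abstract above reports inter-annotator agreement as a Kappa value but does not say which variant was used. As a rough illustration only, the sketch below assumes Cohen's kappa for two annotators labelling the same items; the function name `cohen_kappa` and the sample labels are hypothetical and are not taken from EveTAR.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items.

    Assumption: two annotators, nominal labels. The source abstract does not
    state which kappa variant it used, so this is only an illustration.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same label at random,
    # given each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    all_labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[l] * counts_b[l] for l in all_labels) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for everything.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical relevance judgments (1 = relevant, 0 = not relevant).
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
annotator_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohen_kappa(annotator_1, annotator_2)
```

    In this toy example the annotators agree on 8 of 10 items and chance agreement is 0.5, so kappa is 0.6. Values between 0.61 and 0.80 are conventionally read as "substantial" agreement, which matches the wording used for the 0.71 reported above.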

    Storia: Summarizing Social Media Content based on Narrative Theory using Crowdsourcing

    People from all over the world use social media to share thoughts and opinions about events, and understanding what people say through these channels has been of increasing interest to researchers, journalists, and marketers alike. However, while automatically generated summaries enable people to consume large amounts of data efficiently, they do not provide the context needed for a viewer to fully understand an event. Narrative structure can provide templates for the order and manner in which this data is presented to create stories that are oriented around narrative elements rather than summaries made up of facts. In this paper, we use narrative theory as a framework for identifying the links between social media content. To do this, we designed crowdsourcing tasks to generate summaries of events based on commonly used narrative templates. In a controlled study, for certain types of events, people were more emotionally engaged with stories created with narrative structure and were also more likely to recommend them to others, compared to summaries created without narrative structure.

    An Event-Ontology-Based Approach to Constructing Episodic Knowledge from Unstructured Text Documents

    Document summarization is an important function for knowledge management as a digital library of text documents grows. It allows documents to be presented in a concise manner for easy reading and understanding. Traditionally, document summarization adopts sentence-based mechanisms that identify and extract key sentences from long documents and assemble them together. Although that approach is useful in providing an abstract of documents, it cannot extract the relationship or sequence of a set of related events (also called episodes). This paper proposes an event-oriented ontology approach to constructing episodic knowledge to facilitate the understanding of documents. We also empirically evaluated the proposed approach using instruments developed based on Bloom’s Taxonomy. The results reveal that the approach based on the proposed event-oriented ontology outperformed the traditional text summarization approach in capturing conceptual and procedural knowledge, but the latter was still better in delivering factual knowledge.
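
    The sentence-based mechanism that the abstract above contrasts with its own approach can be sketched roughly as follows. This is a minimal word-frequency extractive summarizer written for illustration, not the baseline evaluated in the paper; the function name `summarize`, the stopword list, and the sample text are all assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"a", "an", "and", "in", "is", "it", "of", "the", "to"}


def _tokens(sentence):
    """Lowercased content words of a sentence, with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Pick the highest-scoring sentences and return them in document order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)
                 if s.strip()]

    # Word frequencies over the whole document.
    freq = Counter(w for s in sentences for w in _tokens(s))

    def score(sentence):
        words = _tokens(sentence)
        # Average frequency, so long sentences are not favoured by length.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original order
    return [sentences[i] for i in keep]


sample = ("The library stores documents. "
          "Documents in the library grow every year. "
          "Cats sleep.")
summary = summarize(sample, max_sentences=1)
```

    The sketch also shows the limitation the abstract points out: sentences are scored and extracted independently, so nothing in the output records how the events they describe relate to each other or in what sequence they occurred.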