    Adaptive Representations for Tracking Breaking News on Twitter

    Twitter is often the most up-to-date source for finding and tracking breaking news stories. Therefore, there is considerable interest in developing filters for tweet streams in order to track and summarize stories. This is a non-trivial text analytics task because tweets are short and standard retrieval methods often fail as stories evolve over time. In this paper we examine the effectiveness of adaptive mechanisms for tracking and summarizing breaking news stories. We evaluate these mechanisms on a number of recent news events for which manually curated timelines are available. Assessments based on ROUGE metrics indicate that adaptive approaches are best suited for tracking evolving stories on Twitter.

    A survey of temporal knowledge discovery paradigms and methods

    With the increase in the size of data sets, data mining has recently become an important research topic and is receiving substantial interest from both academia and industry. At the same time, interest in temporal databases has been increasing, and a growing number of both prototype and implemented systems are using an enhanced temporal understanding to explain aspects of behavior associated with the implicit time-varying nature of the universe. This paper investigates the confluence of these two areas, surveys the work to date, and explores the issues involved and the outstanding problems in temporal data mining.

    When Things Matter: A Data-Centric View of the Internet of Things

    With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, which is typically produced in dynamic and volatile environments and is not only extremely large in scale and volume, but also noisy and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed.

    Time Warp Edit Distance with Stiffness Adjustment for Time Series Matching

    In a way similar to the string-to-string correction problem, we address time series similarity in the light of a time-series-to-time-series correction problem in which the similarity between two time series is measured as the minimum-cost sequence of "edit operations" needed to transform one time series into another. To define the "edit operations" we use the paradigm of a graphical editing process and end up with a dynamic programming algorithm that we call Time Warp Edit Distance (TWED). TWED is slightly different in form from the Dynamic Time Warping, Longest Common Subsequence, and Edit Distance with Real Penalty algorithms. In particular, it highlights a parameter that controls the stiffness of the elastic measure along the time axis. We show that the similarity provided by TWED is a metric potentially useful in time series retrieval applications, since it could benefit from the triangle inequality to speed up the retrieval process while tuning the parameters of the elastic measure. In that context, a lower bound is derived to relate the matching of time series in downsampled representation spaces to the matching in the original space. The empirical quality of the TWED distance is evaluated on a simple classification task. Compared to Edit Distance, Dynamic Time Warping, Longest Common Subsequence, and Edit Distance with Real Penalty, TWED proves quite effective on the considered experimental task.
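The dynamic program described in this abstract can be sketched roughly as follows. The abstract itself does not state the recurrence, so this is a minimal sketch that assumes the commonly published TWED formulation: a delete operation in either series costs the local value change plus `nu` times the elapsed time plus a constant penalty `lam`, and a match costs the value differences of the current and previous sample pairs plus `nu` times their timestamp differences. The function name `twed`, the parameter names `nu` (stiffness) and `lam` (deletion penalty), their default values, the absolute difference as local cost, and the zero padding at time 0 are all assumptions made for illustration.

```python
import math


def twed(a, ta, b, tb, nu=0.001, lam=1.0):
    """Sketch of Time Warp Edit Distance for two univariate time series.

    a, b   : sample values
    ta, tb : increasing timestamps, one per sample
    nu     : assumed name for the stiffness parameter (>= 0)
    lam    : assumed name for the constant deletion penalty (>= 0)
    """
    if len(a) != len(ta) or len(b) != len(tb):
        raise ValueError("each series needs exactly one timestamp per sample")

    # Assumption: pad both series with a dummy sample (value 0 at time 0)
    # so that index i - 1 is always defined inside the recurrence.
    a = [0.0] + list(a)
    ta = [0.0] + list(ta)
    b = [0.0] + list(b)
    tb = [0.0] + list(tb)
    n, m = len(a), len(b)

    # D[i][j] holds the cheapest edit cost between the prefixes a[1..i], b[1..j].
    D = [[math.inf] * m for _ in range(n)]
    D[0][0] = 0.0

    for i in range(1, n):
        for j in range(1, m):
            # Delete a sample from the first series.
            delete_a = (D[i - 1][j] + abs(a[i] - a[i - 1])
                        + nu * (ta[i] - ta[i - 1]) + lam)
            # Delete a sample from the second series.
            delete_b = (D[i][j - 1] + abs(b[j] - b[j - 1])
                        + nu * (tb[j] - tb[j - 1]) + lam)
            # Match the current samples and the previous samples pairwise.
            match = (D[i - 1][j - 1]
                     + abs(a[i] - b[j]) + abs(a[i - 1] - b[j - 1])
                     + nu * (abs(ta[i] - tb[j]) + abs(ta[i - 1] - tb[j - 1])))
            D[i][j] = min(delete_a, delete_b, match)

    return D[n - 1][m - 1]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    y = [1.0, 2.5, 3.0, 3.5]
    print(twed(x, [1, 2, 3], y, [1, 2, 3, 4]))
```

Under these assumptions, a larger `nu` penalizes matching samples that are far apart in time, which is the stiffness behavior the abstract refers to. The sketch runs in time and memory proportional to the product of the two series lengths.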

    A Suite of Techniques for Describing Activity in Terms of Events

    This report presents a set of software techniques that support the tasks of event recognition, summarization of event sequences, explanation of recognized events, explanation of non-recognized events, prediction of event completions, and question answering by leveraging language-encoded human knowledge of what typically happens during various types of events. The techniques operate on sequences of timestamped, three-dimensional positions and contacts for humans, body parts, and objects, provided by a Microsoft Kinect sensor plus associated software. Appendices describe 64 activity sequences used for development and testing of the techniques and 102 event models created as part of the effort.

    When things matter: A survey on data-centric Internet of Things

    With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, but several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, which is typically produced in dynamic and volatile environments and is not only extremely large in scale and volume, but also noisy and continuous. This paper reviews the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed.

    Durable Queries over Historical Time Series Data
