260 research outputs found

    Combining Visual Layout and Lexical Cohesion Features for Text Segmentation

    Get PDF
    We propose integrating features from lexical cohesion with elements from layout recognition to build a composite framework. We use supervised machine learning on this composite feature set to derive discourse structure on the topic level. We demonstrate a system based on this principle and use both an intrinsic evaluation as well as the task of genre classification to assess its performance

    A PDTB-Styled End-to-End Discourse Parser

    Full text link
    We have developed a full discourse parser in the Penn Discourse Treebank (PDTB) style. Our trained parser first identifies all discourse and non-discourse relations, locates and labels their arguments, and then classifies their relation types. When appropriate, the attribution spans to these relations are also determined. We present a comprehensive evaluation from both component-wise and error-cascading perspectives.Comment: 15 pages, 5 figures, 7 table

    Using the Annotated Bibliography as a Resource for Indicative Summarization

    Get PDF
    We report on a language resource consisting of 2000 annotated bibliography entries, which is being analyzed as part of our research on indicative document summarization. We show how annotated bibliographies cover certain aspects of summarization that have not been well-covered by other summary corpora, and motivate why they constitute an important form to study for information retrieval. We detail our methodology for collecting the corpus, and overview our document feature markup that we introduced to facilitate summary analysis. We present the characteristics of the corpus, methods of collection, and show its use in finding the distribution of types of information included in indicative summaries and their relative ordering within the summaries.Comment: 8 pages, 3 figure

    #mytweet via Instagram: Exploring User Behaviour across Multiple Social Networks

    Full text link
    We study how users of multiple online social networks (OSNs) employ and share information by studying a common user pool that use six OSNs - Flickr, Google+, Instagram, Tumblr, Twitter, and YouTube. We analyze the temporal and topical signature of users' sharing behaviour, showing how they exhibit distinct behaviorial patterns on different networks. We also examine cross-sharing (i.e., the act of user broadcasting their activity to multiple OSNs near-simultaneously), a previously-unstudied behaviour and demonstrate how certain OSNs play the roles of originating source and destination sinks.Comment: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2015. This is the pre-peer reviewed version and the final version is available at http://wing.comp.nus.edu.sg/publications/2015/lim-et-al-15.pd
    • …
    corecore