Combining Visual Layout and Lexical Cohesion Features for Text Segmentation
We propose integrating features from lexical cohesion with elements from layout recognition to build a composite framework. We apply supervised machine learning to this composite feature set to derive topic-level discourse structure. We demonstrate a system based on this principle and assess its performance using both an intrinsic evaluation and the task of genre classification.
A PDTB-Styled End-to-End Discourse Parser
We have developed a full discourse parser in the Penn Discourse Treebank
(PDTB) style. Our trained parser first identifies all discourse and
non-discourse relations, locates and labels their arguments, and then
classifies their relation types. When appropriate, the attribution spans to
these relations are also determined. We present a comprehensive evaluation from
both component-wise and error-cascading perspectives.
Comment: 15 pages, 5 figures, 7 tables
Using the Annotated Bibliography as a Resource for Indicative Summarization
We report on a language resource consisting of 2000 annotated bibliography
entries, which is being analyzed as part of our research on indicative document
summarization. We show how annotated bibliographies cover certain aspects of
summarization that have not been well-covered by other summary corpora, and
motivate why they constitute an important form to study for information
retrieval. We detail our methodology for collecting the corpus and give an overview
of the document feature markup that we introduced to facilitate summary analysis.
We present the characteristics of the corpus and its methods of collection, and show
its use in finding the distribution of types of information included in
indicative summaries and their relative ordering within the summaries.
Comment: 8 pages, 3 figures
#mytweet via Instagram: Exploring User Behaviour across Multiple Social Networks
We study how users of multiple online social networks (OSNs) employ and share
information by studying a common user pool that uses six OSNs: Flickr, Google+,
Instagram, Tumblr, Twitter, and YouTube. We analyze the temporal and topical
signature of users' sharing behaviour, showing how they exhibit distinct
behavioural patterns on different networks. We also examine cross-sharing
(i.e., the act of users broadcasting their activity to multiple OSNs
near-simultaneously), a previously unstudied behaviour, and demonstrate how
certain OSNs play the roles of originating sources and destination sinks.
Comment: IEEE/ACM International Conference on Advances in Social Networks
Analysis and Mining, 2015. This is the pre-peer-reviewed version, and the
final version is available at
http://wing.comp.nus.edu.sg/publications/2015/lim-et-al-15.pd