377 research outputs found

Adaptive Representations for Tracking Breaking News on Twitter

Authors: Brigadir Igor, Cunningham Pádraig, Greene Derek
Publication date: 28/11/2014

Twitter is often the most up-to-date source for finding and tracking breaking news stories. Therefore, there is considerable interest in developing filters for tweet streams in order to track and summarize stories. This is a non-trivial text analytics task as tweets are short, and standard retrieval methods often fail as stories evolve over time. In this paper we examine the effectiveness of adaptive mechanisms for tracking and summarizing breaking news stories. We evaluate the effectiveness of these mechanisms on a number of recent news events for which manually curated timelines are available. Assessments based on ROUGE metrics indicate that adaptive approaches are best suited for tracking evolving stories on Twitter. Comment: 8 Pag

arXiv.org e-Print Archive

Research Repository UCD

Scalable Multi-document Summarization Using Natural Language Processing

Author: Prabhala Bhargav
Publication venue: RIT Scholar Works
Publication date: 01/06/2014

In this age of the Internet, Natural Language Processing (NLP) techniques are the key sources for providing information required by users. However, with the extensive usage of available data, a secondary level of wrappers that interact with NLP tools has become necessary. These tools must extract a concise summary from the primary data set retrieved. The main reason for using text summarization techniques is to obtain this secondary level of information. Text summarization using NLP techniques is an interesting area of research with various implications for information retrieval. This report deals with the use of Latent Semantic Analysis (LSA) for generic text summarization and compares it with other models available. It proposes text summarization using LSA in conjunction with open-source NLP frameworks such as Mahout and Lucene. The LSA algorithm can be scaled to multiple large-sized documents using these frameworks. The performance of this algorithm is then compared with other models commonly used for summarization using Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores. This project implements a text summarization framework, which uses available open-source tools and cloud resources to summarize documents from many languages such as, in the case of this study, English and Hindi.

RIT Scholar Works

Search-based model summarization

Author: Ravichandran Lokesh Krishna
Publication venue: Scholars' Mine
Publication date: 01/01/2013

Large systems are complex and consist of numerous components and interactions between the components. Hence managing such large systems is a cumbersome and time-consuming task. Large systems are usually described at the model level, but the large number of components in such models makes them difficult to modify. As a consequence, developers need a solution to rapidly detect which model components to revise. An effective solution is to generate a model summary. Although existing techniques are powerful enough to provide good summaries based on lexical information (relevant terms), they do not make good use of structural information (component structure). In this thesis, model summarization is considered as an optimization problem that combines structural and lexical information to evaluate possible solutions. A summary solution is defined as a combination of model elements (e.g., classes, methods, comments, etc.) that should maximize, as much as possible, the coverage of both automatically generated structural rules and lexical information. The results of the experiments are reported on 6 open source projects where the majority of generated summaries are approved by developers. --Abstract, page iii

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Automatic text summarization using pathfinder network scaling

Author: Patil Kaustubh Raosaheb
Publication date: 01/01/2007

Contains an erratum. Master's thesis. Artificial Intelligence and Intelligent Systems. Faculdade de Engenharia, Universidade do Porto; Faculdade de Economia, Universidade do Porto. 200

Repositório Aberto da Universidade do Porto

Investigating the Extractive Summarization of Literary Novels

Author: Ceylan Hakan
Publication venue: University of North Texas Libraries
Publication date: 01/12/2011

Due to the vast amount of information we are faced with, summarization has become a critical necessity of everyday human life. Given that a large fraction of the electronic documents available online and elsewhere consist of short texts such as Web pages, news articles, scientific reports, and others, the focus of natural language processing techniques to date has been on the automation of methods targeting short documents. We are witnessing a change, however: an increasingly large number of books are becoming available in electronic format. This means that the need for language processing techniques able to handle very large documents such as books is becoming increasingly important. This thesis addresses the problem of summarization of novels, which are long and complex literary narratives. While there is a significant body of research that has been carried out on the task of automatic text summarization, most of this work has been concerned with the summarization of short documents, with a particular focus on news stories. However, novels are different in both length and genre, and consequently different summarization techniques are required. This thesis attempts to close this gap by analyzing a new domain for summarization, and by building unsupervised and supervised systems that effectively take into account the properties of long documents, and outperform the traditional extractive summarization systems typically addressing the news genre.

UNT Digital Library