377 research outputs found

    Adaptive Representations for Tracking Breaking News on Twitter

    Twitter is often the most up-to-date source for finding and tracking breaking news stories. There is therefore considerable interest in developing filters for tweet streams in order to track and summarize stories. This is a non-trivial text analytics task because tweets are short, and standard retrieval methods often fail as stories evolve over time. In this paper we examine the effectiveness of adaptive mechanisms for tracking and summarizing breaking news stories. We evaluate these mechanisms on a number of recent news events for which manually curated timelines are available. Assessments based on ROUGE metrics indicate that adaptive approaches are best suited for tracking evolving stories on Twitter. Comment: 8 Pag

    Scalable Multi-document Summarization Using Natural Language Processing

    In the age of the Internet, Natural Language Processing (NLP) techniques are a key means of providing the information users require. However, with the extensive usage of available data, a secondary level of wrappers that interact with NLP tools has become necessary. These tools must extract a concise summary from the primary data set retrieved. The main reason for using text summarization techniques is to obtain this secondary level of information. Text summarization using NLP techniques is an interesting area of research with various implications for information retrieval. This report deals with the use of Latent Semantic Analysis (LSA) for generic text summarization and compares it with other available models. It proposes text summarization using LSA in conjunction with open-source NLP frameworks such as Mahout and Lucene. The LSA algorithm can be scaled to multiple large documents using these frameworks. The performance of this algorithm is then compared with that of other models commonly used for summarization, using Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores. This project implements a text summarization framework that uses available open-source tools and cloud resources to summarize documents in multiple languages, in the case of this study English and Hindi.

    Search-based model summarization

    Large systems are complex and consist of numerous components and the interactions between them. Managing such systems is therefore a cumbersome and time-consuming task. Large systems are usually described at the model level, but the large number of components in such models makes them difficult to modify. As a consequence, developers need a way to rapidly detect which model components to revise. An effective solution is to generate a model summary. Although existing techniques are powerful enough to provide good summaries based on lexical information (relevant terms), they do not make good use of structural information (component structure). In this thesis, model summarization is treated as an optimization problem that combines structural and lexical information to evaluate possible solutions. A summary solution is defined as a combination of model elements (e.g., classes, methods, comments) that should maximize, as far as possible, the coverage of both automatically generated structural rules and lexical information. Results of the experiments are reported on six open-source projects, where the majority of generated summaries were approved by developers. --Abstract, page iii

    Automatic text summarization using pathfinder network scaling

    Contains an erratum. Master's thesis. Artificial Intelligence and Intelligent Systems. Faculdade de Engenharia, Universidade do Porto; Faculdade de Economia, Universidade do Porto. 200