Recommended from our members

MEAD - A Platform for Multidocument Multilingual Text Summarization

Author: Allison Timothy
Blair-Goldensohn Sasha
Blitzer John
Celebi Arda
Dimitrov Stanko
Drabek Elliott
Hakim Ali
Lam Wai
Liu Danyu
Otterbacher Jahna
Qi Hong
Radev Dragomir R.
Saggion Horacio
Teufel Simone
Topper Michael
Winkel Adam
Zhang Zhu
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2004

This paper describes the functionality of MEAD, a comprehensive, public-domain, open-source, multidocument multilingual summarization environment that has thus far been downloaded by more than 500 organizations. MEAD has been used in a variety of summarization applications, ranging from summarization for mobile devices to Web page summarization within a search engine and to novelty detection.

Columbia University Academic Commons

Deliverable 6.1 Infrastructure for Extractive Summarization

Author: Fuentes Fort Maria
Padró Lluís
Saggion Horacio
Publication date: 01/01/2013

SKATER Internal Report: software infrastructure for extractive summarization (work carried out until December 2013). Preprint

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

COMPENDIUM: a text summarisation tool for generating summaries of multiple purposes, domains, and genres

Author: Lloret Elena
Palomar Manuel
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2012

In this paper, we present a text summarisation tool, COMPENDIUM, capable of generating the most common types of summaries. Regarding the input, single- and multi-document summaries can be produced; as for the output, the summaries can be extractive or abstractive-oriented; and finally, concerning their purpose, the summaries can be generic, query-focused, or sentiment-based. The proposed architecture for COMPENDIUM is divided into several stages, distinguishing between core and additional stages. The former constitute the backbone of the tool and are common to the generation of any type of summary, whereas the latter are used to enhance the capabilities of the tool. The main contributions of COMPENDIUM with respect to state-of-the-art summarisation systems are that (i) it specifically deals with the problem of redundancy, by means of textual entailment; (ii) it combines statistical and cognitive-based techniques for determining relevant content; and (iii) it proposes an abstractive-oriented approach to address the challenge of abstractive summarisation. The evaluation, performed in different domains and textual genres, comprising traditional texts as well as texts extracted from the Web 2.0, shows that COMPENDIUM is very competitive and appropriate for use as a tool for generating summaries. This research has been supported by the project “Desarrollo de Técnicas Inteligentes e Interactivas de Minería de Textos” (PROMETEO/2009/119) and the project reference ACOMP/2011/001 from the Valencian Government, as well as by the Spanish Government (grant no. TIN2009-13391-C04-01).

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Automatic Multiple Document Text Summarization using Wordnet and Agility Tool

Author: Naresh Kumar
Publication venue: Global Journals Inc. (US)
Publication date: 29/03/2014

The number of web pages on the World Wide Web is increasing very rapidly. Consequently, search engines such as Google, AltaVista, and Bing return a long list of URLs to the end user, and it becomes very difficult to review and analyze each web page manually. That's why automatic text summarization is used to condense the source text into a shorter version while preserving its information content and overall meaning. This paper proposes an automatic multiple-document text summarization technique called AMDTSWA, which allows the end user to select multiple URLs and generate their summarized results in parallel. AMDTSWA makes use of concept-based segmentation, the HTML DOM tree, and concept block formation. Content similarity is determined by calculating sentence scores, and useful information is extracted to generate a comparative summary. The proposed approach is implemented using ASP.NET and gives good results.

Global Journal of Computer Science and Technology (GJCST)

Une Approche Mixte -statistique et structurelle- pour le Résumé Automatique

Author: Bossard Aurélien
Publication venue: HAL CCSD
Publication date: 24/06/2009

Automatic multi-document summarization techniques have recently evolved into statistical methods for selecting the sentences that will be used to generate the summary. In this paper, we present CBSEAS, a system in line with the state of the art that we developed for the "Opinion Task" (automatic summaries of opinions from blogs) and the "Update Task" (automatic summaries of newswire articles and information updates) of the TAC 2008 evaluation campaign, and show the value of structural and linguistic analysis of the documents to be summarized. We also present our study of news structure and the impact of integrating it into CBSEAS.

HAL-Paris 13

Identifying similarities and differences across English and Arabic news

Author: Evans David Kirk
McKeown Kathleen
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2005

We present a new approach for summarizing topically clustered documents from two sources, English and machine-translated Arabic texts, that presents users with an overview of the differences in content between the two sources, along with information that is supported by both. Our approach to multilingual multi-document summarization clusters all input document sentences and identifies sentence clusters that contain information exclusive to the Arabic documents, information exclusive to the English documents, and information that is similar between the two. The result is a three-part summary describing information about the event that comes exclusively from Arabic sources, information that comes exclusively from English sources, and information that both sources consider important, enabling analysts to understand differences between incoming documents from different sources more quickly. We report on a user evaluation of the summaries.

Columbia University Academic Commons