
5 research outputs found

Performance Evaluation of Nature-Inspired Metaheuristic Approaches for Single Document Text Summarization

Author: Pravesh Patel et al.
Publication venue: Auricle Global Society of Education and Research
Publication date: 01/01/2024

Today, an enormous and growing amount of data is collected on the internet. Reading text documents and retrieving important information from them is time-consuming, so effective text summarization techniques are needed. Text summarization, the process of retrieving key information from a lengthy document, plays an essential role in information retrieval and content extraction. This paper presents a comprehensive examination of nature-inspired metaheuristic algorithms, such as Firefly, Cuckoo Search (CS), and Particle Swarm Optimization (PSO), for improving text summarization, with an emphasis on single-document datasets such as DUC-2001 and DUC-2002. To measure quality, the generated summaries are compared with the existing gold summaries and evaluated using ROUGE scores. Our results show that nature-inspired metaheuristic approaches have potential for enhancing the summarization of individual documents: they improve summarization effectiveness while offering a fresh viewpoint on how to handle the process within the confines of a single-document dataset.
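The ROUGE comparison against gold summaries mentioned in this abstract can be illustrated with a minimal sketch of ROUGE-N recall. This is not the paper's evaluation code: the function name `rouge_n_recall`, the whitespace tokenization, and the lowercasing are all assumptions made for illustration, and the official ROUGE toolkit applies further options (stemming, stopword handling) that are omitted here.

```python
from collections import Counter


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """ROUGE-N recall: fraction of reference n-grams also found in the candidate.

    Assumption: tokens are lowercase whitespace-separated words.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clipped overlap: each reference n-gram is matched at most as often
    # as it occurs in the candidate summary.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    generated = "the cat sat"
    gold = "the cat sat on the mat"
    print(rouge_n_recall(generated, gold, n=1))
    print(rouge_n_recall(generated, gold, n=2))
```

In this toy case three of the six reference unigrams and two of the five reference bigrams are covered by the generated summary.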

International Journal on Recent and Innovation Trends in Computing and Communication

Use of Genetic Algorithm for Cohesive Summary Extraction to Assist Reading Difficulties

Author: K. Nandhini
S. R. Balasundaram
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2013

Learners with reading difficulties normally face significant challenges in understanding text-based learning materials. In this regard, there is a need for an assistive summary that helps such learners approach the learning documents with minimal difficulty. An important issue in extractive summarization is extracting a cohesive summary from the text. Existing summarization approaches focus mostly on informative sentences rather than cohesive sentences. We considered several existing features, including sentence location, cardinality, title similarity, and keywords, to extract important sentences. Moreover, learner-dependent readability-related features such as average sentence length, percentage of trigger words, percentage of polysyllabic words, and percentage of noun entity occurrences are considered for summarization. The objective of this work is to extract the optimal combination of sentences that increases readability through sentence cohesion, using a genetic algorithm. The results show that summary extraction using our proposed approach performs better in F-measure, readability, and cohesion than the baseline approach (lead) and the corpus-based approach. The task-based evaluation shows the effect of summary-assisted reading in enhancing readability for learners with reading difficulties.

Crossref

Directory of Open Access Journals

Automatic text summarization with Maximal Frequent Sequences

Author: GARCIA HERNANDEZ RENE ARNULFO
LEDENEVA YULIA NIKOLAEVNA
Publication venue: 'Universidad Autonoma del Estado de Mexico'
Publication date: 01/12/2013

In the last two decades, an exponential increase in available electronic information has created a great need to quickly understand large volumes of information. This raises the importance of developing automatic methods for detecting the most relevant content of a document in order to produce a shorter text. A summary is a short text that conveys the most important information of a document or a collection of documents. Automatic Text Summarization (ATS) is an active research area dedicated to generating abstractive and extractive summaries, not only for a single document but also for a collection of documents. A further need is to find ATS methods that are language- and domain-independent. In this book we consider extractive text summarization, in which the summary is a selection of the most important sentences of the text, for the single-document task. We have identified that a typical extractive summarization method consists of four steps. The first step is term selection, where one decides what units count as individual terms. The process of estimating the usefulness of the individual terms is called the term weighting step. The next step is sentence weighting, where every sentence receives a numerical measure according to the usefulness of its terms. Finally, the process of selecting the most relevant sentences is called sentence selection. Different extractive summarization methods can be characterized by how they perform these steps. In this book, for the term selection step, we describe how to detect multiword descriptions using Maximal Frequent Sequences (MFSs), which bear important meaning, while non-maximal frequent sequences (FSs), those that are parts of another FS, should not be considered. Our additional motivation was a cost vs. benefit consideration: there are too many non-maximal FSs, while their probability of bearing important meaning is lower. In any case, MFSs represent all FSs in a compact way: all FSs can be obtained from all MFSs by bursting each MFS into the set of all its subsequences.
New methods based on graph algorithms, genetic algorithms, and clustering algorithms, which facilitate the text summarization task, are presented. We have tested different combinations of term selection, term weighting, sentence weighting, and sentence selection options for language- and domain-independent extractive single-document text summarization on a news report collection. We analyzed several options based on MFSs, considering them with graph, genetic, and clustering algorithms. We obtained results superior to the existing state-of-the-art methods. This book is addressed to students and scientists in the area of Computational Linguistics, and also to anyone who wants to know about recent developments in automatic text summarization.
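The four steps named in this abstract (term selection, term weighting, sentence weighting, sentence selection) can be sketched as a small pipeline. This is only an illustration under simplifying assumptions, not the book's method: single lowercase words stand in for the Maximal Frequent Sequences used there, raw frequency stands in for the term weighting, and the function name `summarize` and its parameter `k` are hypothetical.

```python
import re
from collections import Counter


def summarize(text: str, k: int = 2) -> list[str]:
    """Toy four-step extractive summarizer returning k sentences in document order."""
    # Split into sentences on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Step 1, term selection (assumption: single lowercase words are the terms).
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]

    # Step 2, term weighting (assumption: raw frequency over the whole document).
    weights = Counter(term for tokens in tokenized for term in tokens)

    # Step 3, sentence weighting: sum the weights of a sentence's distinct terms.
    scores = [sum(weights[t] for t in set(tokens)) for tokens in tokenized]

    # Step 4, sentence selection: take the k best-scoring sentences
    # (ties go to the earlier sentence) and restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    document = "Cats chase mice. Dogs chase cats. Birds sing."
    print(summarize(document, k=2))
```

Swapping a different technique into any one step (for example, multiword terms in step 1) leaves the other three steps unchanged, which is the sense in which summarization methods can be characterized by how they perform each step.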

Red Mexicana de Repositorios Institucionales

Repositorio Institucional de la Universidad Autónoma del Estado de México

Improving the Performance of Text Summarization

Author: Mohammadreza Vali Zadeh
Publication venue
Publication date: 19/12/2014

Repositório Aberto da Universidade do Porto

Genetic Algorithm Based Multi-document Summarization

Author: D. Radev
D.E. Goldberg
K. Knight
MAÑA-LÓPEZ
R. Barzilay
Y.R. Baeza
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Crossref