Search CORE

3 research outputs found

A Review On Automatic Text Summarization Approaches

Author: Basiron Halizah
C Suppiah Puspalata
Jaya Kumar Yogan
Ngo Hea Choon
Ong Sing Goh
Publication venue: 'Science Publications'
Publication date: 01/01/2016
Field of study

It has been more than 50 years since the initial investigation on automatic text summarization was started.Various techniques have been successfully used to extract the important contents from text document to represent document summary.In this study,we review some of the studies that have been conducted in this still-developing research area.It covers the basics of text summarization,the types of summarization,the methods that have been used and some areas in which text summarization has been applied.Furthermore,this paper also reviews the significant efforts which have been put in studies concerning sentence extraction,domain specific summarization and multi document summarization and provides the theoretical explanation and the fundamental concepts related to it.In addition,the advantages and limitations concerning the approaches commonly used for text summarization are also highlighted in this study

Universiti Teknikal Malaysia Melaka (UTeM) Repository

Caracterização da complementaridade temporal: subsídios para sumarização automática multidocumento

Author: Felippo Ariani Di
Souza Jackson Wilke da Cruz
Publication venue: 'Departamento de Educacao FCT/Unesp'
Publication date: 01/03/2018
Field of study

Complementarity is a usual multi-document phenomenon that commonly occurs among news texts about the same event. From a set of sentence pairs (in Portuguese) manually annotated with CST (Cross-Document Structure Theory) relations (Historical background and Follow-up) that make explicit the temporal complementary among the sentences, we identified a potential set of linguistic attributes of such complementary. Using Machine Learning algorithms, we evaluate the capacity of the attributes to discriminate between Historical background and Follow-up. JRip learned a small set of rules with high accuracy. Based on a set of 5 rules, the classifier discriminates the CST relations with 80% of accuracy. According to the rules, the occurrence of temporal expression in sentence 2 is the most discriminative feature in the task. As a contribution, the JRip classifier can improve the performance of the CST-discourse parsers for Portuguese.A complementaridade é um fenômeno multidocumento comumente observado entre notícias que versam sobre um mesmo evento. A partir de um corpus em português composto por um conjunto de pares de sentenças manualmente anotadas com as relações da Cross-Document Structure Theory (CST) que explicitam a complementaridade temporal (Historical background e Follow-up), identificou-se um conjunto potencial de atributos linguísticos desse tipo de complementaridade. Por meio de algoritmos de Aprendizado de Máquina, testou-se o potencial dos atributos em distinguir as referidas relações. O classificador simbólico gerado pelo algoritmo JRip obteve o melhor desempenho ao se considerar a precisão e o tamanho reduzido do conjunto de regras. Somente com base em 5 regras, tal classificador identificou Follow-up e Historical background com precisão aproximada de 80%. Ademais, as regras do classificador indicam que o atributo ocorrência de expressão temporal na sentença 2 é o mais relevante para a tarefa. Como contribuição, salienta-se que o classificador JRip aqui gerado pode ser utilizado nos analisadores discursivos multidocumento para o português do Brasil que são baseados na CST

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidade Estadual Paulista São Paulo (UNESP), Faculdade de Ciências e Letras (FCLAr): Portal de Periódicos

Pertanika Journal of Science & Technology

Author: Universiti Putra Malaysia Press
Publication venue: Universiti Putra Malaysia Press
Publication date: 01/01/2013
Field of study

Universiti Putra Malaysia Institutional Repository