742 research outputs found

Extractive text summarisation using graph triangle counting approach: proposed method

Authors: Al-Khassawneh Yazan Alaya; Isiaka Obasa Adekunle; Salim Naomie
Publication date: 01/01/2014

With the growing quantity of automated text data, building summarisation systems has become vital. Summarisation systems capture and condense the most important ideas of a document and help the user find and understand the main facts of the text more quickly and easily. An important class of such systems is those that create summaries from extracts. This type of summary, called extractive summarisation, is created by selecting large, significant fragments of the text without making any amendment to the original. One methodology for generating this type of summary uses graph theory. Graph theory includes a field called graph pruning or reduction, which aims to find the best representation of the main graph with a smaller number of nodes and edges. In this paper, a graph reduction technique called the triangle counting approach is presented to choose the most vital sentences of the text. The first phase is to represent the text as a graph, where nodes are the sentences and edges are the similarities between the sentences. The second phase is to construct the triangles, followed by a bit vector representation, and the final phase is to retrieve the sentences based on the values of the bit vector.

Universiti Teknologi Malaysia Institutional Repository
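
The three phases named in the abstract above (sentence graph, triangle construction, bit vector retrieval) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the edge threshold, or how the bit vector is defined, so everything below is an assumption made for illustration: Jaccard word overlap as the similarity, a hypothetical `threshold` parameter, and a bit set to 1 for every sentence that belongs to at least one triangle. It is not the authors' implementation.

```python
import itertools
import re


def tokenize(sentence):
    """Lowercase word set of a sentence (assumed preprocessing)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard overlap of two word sets (assumed similarity measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_graph(sentences, threshold=0.2):
    """Phase 1: nodes are sentence indices; an edge joins two sentences
    whose similarity reaches the (hypothetical) threshold."""
    tokens = [tokenize(s) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i, j in itertools.combinations(range(len(sentences)), 2):
        if jaccard(tokens[i], tokens[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def find_triangles(adj):
    """Phase 2: list every triangle (i, j, k) with i < j < k."""
    triangles = []
    for i in sorted(adj):
        for j in sorted(adj[i]):
            if j <= i:
                continue
            for k in sorted(adj[i] & adj[j]):
                if k > j:
                    triangles.append((i, j, k))
    return triangles


def bit_vector(n, triangles):
    """Assumed bit vector: 1 if the sentence is in at least one triangle."""
    bits = [0] * n
    for triangle in triangles:
        for node in triangle:
            bits[node] = 1
    return bits


def summarise(sentences, threshold=0.2):
    """Phase 3: keep, in original order, the sentences whose bit is 1."""
    adj = build_graph(sentences, threshold)
    bits = bit_vector(len(sentences), find_triangles(adj))
    return [s for s, b in zip(sentences, bits) if b]
```

Calling `summarise(list_of_sentences)` returns the subset of sentences that take part in at least one triangle of mutually similar sentences. Sentences unrelated to the rest of the text form no triangle and are dropped.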

Towards Personalized and Human-in-the-Loop Document Summarization

Author: Ghodratnama Samira
Publication date: 30/09/2021

The ubiquitous availability of computing devices and the widespread use of the internet continuously generate a large amount of data. As a result, the amount of available information on any given topic is far beyond humans' capacity to process properly, causing what is known as information overload. To cope efficiently with large amounts of information and generate content with significant value to users, we need to identify, merge and summarise information. Data summaries can help gather related information and collect it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. We propose novel approaches to tackle these challenges by (i) enabling automatic intelligent feature engineering, (ii) enabling flexible and interactive summarisation, and (iii) utilising intelligent and personalised summarisation approaches. The experimental results demonstrate the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data. Comment: PhD thesis

arXiv.org e-Print Archive

Semantification of text through summarisation

Author: Joshi Monika
Publication date: 01/03/2019

Ulster University's Research Portal

POLIS: a probabilistic summarisation logic for structured documents

Author: Forst Jan Frederik
Publication date: 01/01/2009

PhD. As the availability of structured documents, formatted in markup languages such as SGML, RDF, or XML, increases, retrieval systems increasingly focus on the retrieval of document elements rather than entire documents. Additionally, abstraction layers in the form of formalised retrieval logics have allowed developers to include search facilities in numerous applications without needing detailed knowledge of retrieval models. Although automatic document summarisation has been recognised as a useful tool for reducing the workload of information system users, very few such abstraction layers have been developed for the task of automatic document summarisation. This thesis describes the development of an abstraction logic for summarisation, called POLIS, which provides users (such as developers or knowledge engineers) with high-level access to summarisation facilities. Furthermore, POLIS allows users to exploit the hierarchical information provided by structured documents. The development of POLIS is carried out step by step. We start by defining a series of probabilistic summarisation models, which provide weights to document elements at a user-selected level. These summarisation models are those accessible through POLIS. The formal definition of POLIS is performed in three steps. We start by providing a syntax for POLIS, through which users and knowledge engineers interact with the logic. This is followed by a definition of the logic's semantics. Finally, we provide details of an implementation of POLIS. The final chapters of this dissertation are concerned with the evaluation of POLIS, which is conducted in two stages. Firstly, we evaluate the performance of the summarisation models by applying POLIS to two test collections, the DUC AQUAINT corpus and the INEX IEEE corpus. This is followed by application scenarios for POLIS, in which we discuss how POLIS can be used in specific IR tasks.

CiteSeerX

Queen Mary Research Online

A literature survey of methods for analysis of subjective language

Author: Täckström Oscar
Publication venue: Swedish Institute of Computer Science
Publication date: 01/01/2009

Subjective language is used to express attitudes and opinions towards things, ideas and people. While content- and topic-centred natural language processing is now part of everyday life, analysis of the subjective aspects of natural language has until recently been largely neglected by the research community. The explosive growth of personal blogs, consumer opinion sites and social network applications in recent years has, however, created increased interest in subjective language analysis. This paper provides an overview of recent research conducted in the area.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Speech to text conversion and summarization for effective understanding and documentation

Authors: A Vinnarasu; Jose Deepa V
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/10/2019

Speech is the most powerful means of communication, through which human beings express their thoughts and feelings in different languages. The features of speech differ with each language. However, even when communicating in the same language, pace and dialect vary from person to person. This makes it difficult for some people to understand the conveyed message. Lengthy speeches can also be difficult to follow for reasons such as differing pronunciation and pace. Speech recognition, an interdisciplinary field of computational linguistics, aids in developing technologies that enable the recognition and translation of speech into text. Text summarization extracts the most important information from a source text and provides an adequate summary of it. The research work presented in this paper describes an easy and effective method for speech recognition. The speech is converted to the corresponding text, from which a summarized text is produced. This has various applications, such as creating lecture notes and summarizing catalogues for lengthy documents. Extensive experimentation is performed to validate the efficiency of the proposed method.

ZENODO

Institute of Advanced Engineering and Science

Text and data mining for information extraction for scientific documents

Author: Muhammad Bello Aliyu
Publication date: 01/02/2021

Coventry University Pure Portal