570 research outputs found

Towards Personalized and Human-in-the-Loop Document Summarization

Author: Ghodratnama Samira
Publication date: 30/09/2021

The ubiquitous availability of computing devices and the widespread use of the internet continuously generate large amounts of data. As a result, the information available on any given topic is far beyond humans' capacity to process it properly, causing what is known as information overload. To cope efficiently with large amounts of information and generate content of significant value to users, we need to identify, merge and summarise information. Data summaries can help gather related information and collect it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. We propose novel approaches to tackle these challenges by (i) enabling automatic intelligent feature engineering, (ii) enabling flexible and interactive summarisation, and (iii) utilising intelligent and personalised summarisation approaches. The experimental results demonstrate the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data. Comment: PhD thesis

arXiv.org e-Print Archive

Sentence classification experiments for legal text summarisation

Author: Grover Claire
Hachey Ben
Publication date: 01/01/2004

Abstract. We describe experiments in building a classifier which determines the rhetorica

CiteSeerX

Edinburgh Research Explorer

Macquarie University ResearchOnline

Document-level sentiment analysis of email data

Author: Liu Sisi
Publication date: 01/01/2020

Sisi Liu investigated machine learning methods for email document sentiment analysis. She developed a systematic framework that has been shown, both qualitatively and quantitatively, to be effective and efficient in identifying sentiment from massive amounts of email data. Analytical results obtained from the document-level email sentiment analysis framework are beneficial for better decision making in various business settings.

ResearchOnline at James Cook University

Learning Sentence-internal Temporal Relations

Author: Lapata M.
Lascarides A.
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2006

In this paper we propose a data-intensive approach for inferring sentence-internal temporal relations. Temporal inference is relevant for practical NLP applications which either extract or synthesize temporal information (e.g., summarisation, question answering). Our method bypasses the need for manual coding by exploiting the presence of markers like "after", which overtly signal a temporal relation. We first show that models trained on main and subordinate clauses connected with a temporal marker achieve good performance on a pseudo-disambiguation task simulating temporal inference (during testing the temporal marker is treated as unseen and the models must select the right marker from a set of possible candidates). Secondly, we assess whether the proposed approach holds promise for the semi-automatic creation of temporal annotations. Specifically, we use a model trained on noisy and approximate data (i.e., main and subordinate clauses) to predict intra-sentential relations present in TimeBank, a corpus annotated with rich temporal information. Our experiments compare and contrast several probabilistic models differing in their feature space, linguistic assumptions and data requirements. We evaluate performance against gold standard corpora and also against human subjects.
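
The pseudo-disambiguation task described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy clause pairs, the bag-of-words features, the function names `train` and `predict`, and the choice of a naive Bayes scorer. The abstract only says that several probabilistic models were compared, so this is not the authors' model. The marker is hidden at test time and the model picks one from a candidate set.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (main clause, temporal marker, subordinate clause).
TRAIN = [
    ("she left the room", "after", "he had finished speaking"),
    ("they had eaten dinner", "before", "the guests arrived"),
    ("he had locked the door", "before", "he went to bed"),
    ("we went home", "after", "the concert had ended"),
    ("the phone rang", "while", "she was cooking"),
    ("he read a book", "while", "the children were sleeping"),
]


def features(main, sub):
    """Bag-of-words features, tagged by the clause they came from."""
    return [f"M:{w}" for w in main.split()] + [f"S:{w}" for w in sub.split()]


def train(examples):
    """Count markers and per-marker feature occurrences."""
    marker_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for main, marker, sub in examples:
        marker_counts[marker] += 1
        for feat in features(main, sub):
            feature_counts[marker][feat] += 1
            vocab.add(feat)
    return marker_counts, feature_counts, vocab


def predict(model, main, sub, candidates):
    """Pick the candidate marker with the highest naive Bayes log score."""
    marker_counts, feature_counts, vocab = model
    total = sum(marker_counts.values())

    def score(marker):
        # Log prior plus add-one-smoothed log likelihood of each feature.
        log_score = math.log(marker_counts[marker] / total)
        denom = sum(feature_counts[marker].values()) + len(vocab)
        for feat in features(main, sub):
            log_score += math.log((feature_counts[marker][feat] + 1) / denom)
        return log_score

    return max(candidates, key=score)


model = train(TRAIN)
# The marker is treated as unseen: only the two clauses are given.
guess = predict(model, "the phone rang", "she was cooking",
                ["after", "before", "while"])
```

Because the training signal comes from the marker that already appears in each sentence, no manual annotation is needed, which is the point the abstract makes about bypassing manual coding.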

arXiv.org e-Print Archive

CiteSeerX

Crossref

Edinburgh Research Explorer

Text and data mining for information extraction for scientific documents

Author: Muhammad Bello Aliyu
Publication date: 01/02/2021

Coventry University Pure Portal

Thirty years of Artificial Intelligence and Law: the second decade

Author: Araszkiewicz M.
Atkinson K.D.
Bench-Capon T.J.M.
Bex Floris
Francesconi E.
Prakken Henry
Sartor Giovanni
Schilder Frank
Sileno Giovanni
van Engers Tom
Wyner A.Z.
Publication date: 01/01/2022

The first issue of the Artificial Intelligence and Law journal was published in 1992. This paper provides commentaries on nine significant papers drawn from the Journal's second decade. Four of the papers relate to reasoning with legal cases: introducing contextual considerations, predicting outcomes on the basis of natural language descriptions of the cases, comparing different ways of representing cases, and formalising precedential reasoning. One introduces a method of analysing arguments that was to become very widely used in AI and Law, namely argumentation schemes. Two relate to ontologies for the representation of legal concepts, and two take advantage of the increasing availability of legal corpora in this decade to automate document summarisation and to mine arguments.

University of Liverpool Repository

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Utrecht University Repository

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Dissertations of the University of Groningen

Tilburg University Repository

The HOLJ corpus: supporting summarisation of legal texts

Author: Grover Claire
Hachey Ben
Hughson Ian
Publication date: 01/01/2004

We describe an XML-encoded corpus of texts in the legal domain which was gathered for an automatic summarisation project. We describe two distinct layers of annotation: manual annotation of the rhetorical status of sentences and an entirely automatic annotation process incorporating a host of individual linguistic processors. The manual rhetorical status annotation has been developed as training and testing material for a summarisation system based on the work of Teufel and Moens, while the automatic layer of annotation encodes linguistic information as features for a machine learning approach to rhetorical status classification.

CiteSeerX

Edinburgh Research Explorer

A rhetorical status classifier for legal text summarisation

Author: Grover Claire
Hachey Ben
Publication date: 01/01/2004

We describe a classifier which determines the rhetorical status of sentences in texts from a corpus of judgments of the UK House of Lords. Our summarisation system is based on the work of Teufel and Moens where sentences are classified for rhetorical status to aid sentence selection. We experiment with a variety of linguistic features with results comparable to Teufel and Moens, thereby demonstrating the feasibility of porting this kind of system to a new domain.

CiteSeerX

Edinburgh Research Explorer

Macquarie University ResearchOnline

Long Document Text Summarisation

Author: Bishop Jennifer
Publication date: 31/12/2023

The University of Manchester - Institutional Repository

Summarising Legal Texts: Sentential Tense and Argumentative Roles

Author: Grover Claire
Hachey Ben
Korycinski Chris
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2003

We report on the SUM project, which applies automatic summarisation techniques to the legal domain. We pursue a methodology based on Teufel and Moens (2002) where sentences are classified according to their argumentative role. We describe some experiments with judgments of the House of Lords where we have performed automatic linguistic annotation of a small sample set in order to explore correlations between linguistic features and argumentative roles. We use state-of-the-art NLP techniques to perform the linguistic annotation using XML-based tools and a combination of rule-based and statistical methods. We focus here on the predictive capacity of tense and aspect features for a classifier.

CiteSeerX

Crossref

Edinburgh Research Explorer

Macquarie University ResearchOnline