88 research outputs found
Enumeration of Extractive Oracle Summaries
To analyze the limitations and the future directions of the extractive
summarization paradigm, this paper proposes an Integer Linear Programming (ILP)
formulation to obtain extractive oracle summaries in terms of ROUGE-N. We also
propose an algorithm that enumerates all of the oracle summaries for a set of
reference summaries to exploit F-measures that evaluate which system summaries
contain how many sentences that are extracted as an oracle summary. Our
experimental results obtained from Document Understanding Conference (DUC)
corpora demonstrated the following: (1) room still exists to improve the
performance of extractive summarization; (2) the F-measures derived from the
enumerated oracle summaries have significantly stronger correlations with human
judgment than those derived from single oracle summaries.
Comment: 12 page
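To make the idea of an oracle summary concrete, the following is a minimal sketch and not the paper's method: it replaces the ILP formulation with a brute-force search, uses a single reference instead of a set of references, limits summary length by sentence count rather than by words, and scores with ROUGE-N recall. The names `ngrams`, `rouge_n_recall` and `enumerate_oracles`, and the toy sentences, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(summary_sents, reference_sents, n=1):
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-gram count."""
    ref = Counter()
    for sent in reference_sents:
        ref += ngrams(sent.split(), n)
    summ = Counter()
    for sent in summary_sents:
        summ += ngrams(sent.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, summ[gram]) for gram, count in ref.items())
    return overlap / total


def enumerate_oracles(doc_sents, reference_sents, max_sents, n=1):
    """Brute-force search (assumption: NOT the ILP from the paper).

    Returns the best ROUGE-N recall and every sentence-index subset of size
    1..max_sents that attains it, so all tied oracle summaries are kept.
    """
    best, oracles = -1.0, []
    for k in range(1, max_sents + 1):
        for idx in combinations(range(len(doc_sents)), k):
            score = rouge_n_recall([doc_sents[i] for i in idx], reference_sents, n)
            if score > best + 1e-12:
                best, oracles = score, [idx]
            elif abs(score - best) <= 1e-12:
                oracles.append(idx)
    return best, oracles


# Toy data (hypothetical): three different sentences tie as one-sentence oracles.
doc = ["the cat sat", "the dog ran", "a cat ran", "birds fly"]
reference = ["the cat ran"]
best, oracles = enumerate_oracles(doc, reference, max_sents=1, n=1)
print(best, oracles)
```

In this toy case several distinct summaries reach the same best score, which is the situation that motivates enumerating all oracle summaries rather than keeping a single one. The brute-force search is exponential in the number of sentences and is only meant for illustration.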
Recommended from our members
Quantifying the Limits and Success of Extractive Summarization Systems Across Domains
This paper analyzes the topic identification stage of single-document automatic text summarization across four different domains: newswire, literary, scientific and legal documents.
Multi-view representation learning for natural language processing applications
The pervasion of machine learning in a vast number of applications has given rise to an increasing demand for the effective processing of complex, diverse and variable datasets.
One representative case of data diversity can be found in multi-view datasets, which contain input originating from more than one source or having multiple aspects or facets.
Examples include, but are not restricted to, multimodal datasets, where data may consist of audio, image and/or text.
The nature of multi-view datasets calls for special treatment in terms of representation.
A subsequent fundamental problem is that of combining information from potentially incoherent sources; a problem commonly referred to as view fusion.
Quite often, the heuristic solution of early fusion is applied to this problem: aggregating representations from different views using a simple function (concatenation, summation or mean pooling).
However, early fusion can cause overfitting when training samples are small, and the specific statistical properties of each view may be lost in the learning process.
Representation learning, the set of ideas and algorithms devised to learn meaningful representations for machine learning problems, has recently grown into a vibrant research field that encompasses multiple view setups.
A plethora of multi-view representation learning methods has been proposed in the literature, with a large portion of them being based on the idea of maximising the correlation between available views.
Commonly, such techniques are evaluated on synthetic datasets or strictly defined benchmark setups; a role that, within Natural Language Processing, is often assumed by the multimodal sentiment analysis problem.
This thesis argues that more complex downstream applications could benefit from such representations and describes a multi-view contemplation of a range of tasks, from static, two-view, unimodal to dynamic, three-view, trimodal applications, setting out to explore the limits of the seeming applicability of multi-view representation learning.
More specifically, we experiment with document summarisation, framing it as a multi-view problem where documents and summaries are considered two separate, textual views.
Moreover, we present a multi-view inference algorithm for the bimodal problem of image captioning.
Delving more into multimodal setups, we develop a set of multi-view models for applications pertaining to videos, including tagging and text generation tasks.
Finally, we introduce narration generation, a new text generation task from movie videos, that requires inference on the storyline level and temporal context-based reasoning.
The main argument of the thesis is that, due to their performance, multi-view representation learning tools warrant serious consideration by the researchers and practitioners of the Natural Language Processing community.
Exploring the limits of multi-view representations, we investigate their fitness for Natural Language Processing tasks and show that they are able to hold the information required for complex problems, while being a good alternative to the early fusion paradigm.
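As a point of reference for the early fusion baseline described in the abstract above, here is a minimal sketch of the three simple aggregation functions it names (concatenation, summation and mean pooling). The function `early_fusion` and the two toy view vectors are hypothetical and are not taken from the thesis.

```python
def early_fusion(views, mode="concat"):
    """Combine per-view feature vectors with a simple, parameter-free function.

    views: list of equal-purpose feature vectors (lists of floats), one per view.
    mode:  "concat", "sum" or "mean".
    """
    if not views:
        raise ValueError("at least one view is required")
    if mode == "concat":
        # Concatenation keeps every feature; views may differ in length.
        return [x for view in views for x in view]
    if len({len(view) for view in views}) != 1:
        raise ValueError("sum and mean fusion need views of equal length")
    summed = [sum(column) for column in zip(*views)]
    if mode == "sum":
        return summed
    if mode == "mean":
        return [s / len(views) for s in summed]
    raise ValueError(f"unknown fusion mode: {mode}")


# Toy example (hypothetical values): a text view and an audio view.
text_view = [1.0, 2.0]
audio_view = [3.0, 4.0]
fused_concat = early_fusion([text_view, audio_view], "concat")
fused_sum = early_fusion([text_view, audio_view], "sum")
fused_mean = early_fusion([text_view, audio_view], "mean")
print(fused_concat, fused_sum, fused_mean)
```

None of these functions has learnable parameters or models how the views relate to each other, which is the gap that correlation-based multi-view representation learning aims to fill.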
Deep latent-variable models for neural text generation
Text generation aims to produce human-like natural language output for downstream tasks. It covers a wide range of applications such as machine translation, document summarization and dialogue generation. Recent deep neural network-based end-to-end architectures are known to be data-hungry, and text generated from them usually suffers from low diversity, interpretability and controllability. As a result, it is difficult to trust their output in real-life applications. Deep latent-variable models, by specifying the probabilistic distribution over an intermediate latent process, provide a potential way of addressing these problems while maintaining the expressive power of deep neural networks. This presentation will explain how deep latent-variable models can improve over the standard encoder-decoder model for text generation. We will start from an introduction to encoder-decoder and deep latent-variable models, then go over popular optimization strategies, and finally elaborate on how latent-variable models can help improve diversity, interpretability and data efficiency in different applications of text generation.

Text generation aims to produce human-like natural language text output for applications. It covers a wide range of applications, such as machine translation, document summarization and dialogue generation. Recently, end-to-end architectures based on deep neural networks have mainly been used for this purpose. The end-to-end approach merges all submodules, which were previously designed according to complex handcrafted rules, into one holistic encoder-decoder architecture. Given sufficient training data, state-of-the-art performance can be achieved without requiring language- or domain-dependent knowledge.

However, deep learning models are known to be extremely data-hungry, and text generated from them usually suffers from low diversity, interpretability and controllability. As a result, it is difficult to trust their output in real-life applications. Deep latent-variable models, by specifying the probability distribution over an intermediate latent process, offer a potential way of solving these problems while preserving the expressive power of deep neural networks. This dissertation shows how deep latent-variable models improve text generation over the standard encoder-decoder model. We begin with an introduction to encoder-decoder and deep latent-variable models and then cover popular optimization strategies such as variational inference, dynamic programming, soft relaxation and reinforcement learning. We then present the following:
1. How latent variables can improve the diversity of text generation by learning holistic, sentence-level latent representations. In this way, a latent representation can first be selected, from which diverse texts can then be generated. We present effective algorithms for jointly training representation learning and text generation through variational inference. To address the limitations of variational inference regarding uni-modality and inconsistency, we propose a wake-sleep variant and a training objective based on mutual information. Experiments show that they outperform both standard variational inference and non-latent-variable models in dialogue generation.
2. How latent variables can improve the controllability and interpretability of text generation by adding finer-grained latent specifications to the intermediate generation process. We illustrate the use of latent variables for word alignment, content selection, text segmentation and field-segment correspondence. We derive efficient training algorithms for them so that text generation can be explicitly controlled by manipulating the latent variable, which is human-interpretable by definition.
3. Overcoming the scarcity of training samples by treating non-parallel text as latent variables. Training can be performed as in the standard EM algorithm, which converges stably. We show that it can be successfully applied to dialogue generation and that it considerably enriches the generation space through the use of non-conversational text.
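For readers unfamiliar with variational inference, one of the optimization strategies the abstract above names, the standard evidence lower bound for a conditional latent-variable text generator can be written as follows. This is the generic textbook objective and not necessarily the exact one used in the thesis; the symbols are assumptions introduced here: $x$ is the input, $y$ the generated text, $z$ the latent variable, $p_\theta$ the generative model and $q_\phi$ the approximate posterior.

```latex
\log p_\theta(y \mid x)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[ \log p_\theta(y \mid x, z) \right]
  \;-\;
  \mathrm{KL}\!\left( q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x) \right)
```

The first term rewards reconstructing the target text from the latent variable, and the second keeps the approximate posterior close to the prior.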
Investigating the Extractive Summarization of Literary Novels
Abstract
Due to the vast amount of information we are faced with, summarization has become a critical necessity of everyday human life. Given that a large fraction of the electronic documents available online and elsewhere consist of short texts such as Web pages, news articles and scientific reports, the focus of natural language processing techniques to date has been on the automation of methods targeting short documents. We are, however, witnessing a change: an increasingly large number of books are becoming available in electronic format. This means that the need for language processing techniques able to handle very large documents such as books is becoming increasingly important. This thesis addresses the problem of summarization of novels, which are long and complex literary narratives. While a significant body of research has been carried out on the task of automatic text summarization, most of this work has been concerned with the summarization of short documents, with a particular focus on news stories. However, novels differ in both length and genre, and consequently different summarization techniques are required. This thesis attempts to close this gap by analyzing a new domain for summarization, and by building unsupervised and supervised systems that effectively take into account the properties of long documents and outperform the traditional extractive summarization systems that typically address the news genre.