    Long and Diverse Text Generation with Planning-based Hierarchical Variational Model

    Existing neural methods for data-to-text generation still struggle to produce long and diverse texts: they fail to model input data dynamically during generation, to capture inter-sentence coherence, or to generate diversified expressions. To address these issues, we propose a Planning-based Hierarchical Variational Model (PHVM). Our model first plans a sequence of groups (each group is a subset of input items to be covered by a sentence) and then realizes each sentence conditioned on the planning result and the previously generated context, thereby decomposing long text generation into dependent sentence-generation sub-tasks. To capture expression diversity, we devise a hierarchical latent structure in which a global planning latent variable models the diversity of reasonable plans and a sequence of local latent variables controls sentence realization. Experiments show that our model outperforms state-of-the-art baselines in long and diverse text generation. Comment: To appear in EMNLP 2019.
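
    Below is a minimal structural sketch, in Python, of the plan-then-realize decomposition this abstract describes. The planner and realizer are trivial stand-ins rather than the PHVM networks, and the attribute items are made-up examples.

        from itertools import islice

        def plan_groups(items, group_size=2):
            # Stand-in planner: split the input items into sentence-level groups.
            it = iter(items)
            while True:
                group = list(islice(it, group_size))
                if not group:
                    return
                yield group

        def realize_sentence(group, context):
            # Stand-in realizer: verbalize one group, conditioned (trivially here)
            # on whether any previous context exists.
            clause = ", ".join(f"the {k} is {v}" for k, v in group)
            return ("Also, " + clause + ".") if context else (clause[0].upper() + clause[1:] + ".")

        def generate(items):
            sentences = []
            for group in plan_groups(items):
                sentences.append(realize_sentence(group, sentences))
            return " ".join(sentences)

        print(generate([("material", "cotton"), ("fit", "loose"), ("color", "navy")]))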

    CatGAN: Category-aware Generative Adversarial Networks with Hierarchical Evolutionary Learning for Category Text Generation

    Generating multiple categories of text is a challenging task that is drawing increasing attention. Since generative adversarial nets (GANs) have shown competitive results on general text generation, previous works have extended them to category text generation. However, their complicated model structures and learning strategies limit performance and exacerbate training instability. This paper proposes a category-aware GAN (CatGAN), which consists of an efficient category-aware model for category text generation and a hierarchical evolutionary learning algorithm for training it. The category-aware model directly measures the gap between real and generated samples in each category, and reducing this gap guides the model to generate high-quality samples for that category. The Gumbel-Softmax relaxation further frees the model from complicated learning strategies for updating CatGAN on discrete data. Moreover, focusing only on sample quality typically leads to mode collapse, so a hierarchical evolutionary learning algorithm is introduced to stabilize training and balance quality and diversity while training CatGAN. Experimental results demonstrate that CatGAN outperforms most existing state-of-the-art methods. Comment: 15 pages, 4 figures. Accepted by AAAI 2020.
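
    The Gumbel-Softmax relaxation mentioned above can be sketched in a few lines of NumPy; this is a generic illustration of the trick, not CatGAN's implementation, and the logits are made-up values.

        import numpy as np

        def gumbel_softmax(logits, tau=1.0, rng=np.random.default_rng(0)):
            # Add Gumbel(0, 1) noise to the logits, then apply a softmax with
            # temperature tau; smaller tau pushes the sample closer to one-hot.
            g = -np.log(-np.log(rng.uniform(size=logits.shape)))
            y = (logits + g) / tau
            y = np.exp(y - y.max(axis=-1, keepdims=True))
            return y / y.sum(axis=-1, keepdims=True)

        token_logits = np.array([2.0, 0.5, -1.0])     # scores over a 3-token vocabulary
        print(gumbel_softmax(token_logits, tau=0.5))  # soft, nearly one-hot sample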

    Unsupervised Melody-to-Lyric Generation

    Automatic melody-to-lyric generation is the task of generating song lyrics to go with a given melody. It is of significant practical interest and more challenging than unconstrained lyric generation, as the music imposes additional constraints on the lyrics. Training data is limited because most songs are copyrighted, resulting in models that underfit the complicated cross-modal relationship between melody and lyrics. In this work, we propose a method for generating high-quality lyrics without training on any aligned melody-lyric data. Specifically, we design a hierarchical lyric generation framework that first generates a song outline and then the complete lyrics. The framework disentangles training (based purely on text) from inference (melody-guided text generation) to circumvent the shortage of parallel data. We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints that guide inference. The two-step hierarchical design also enables content control via the lyric outline, a much-desired feature for democratizing collaborative song creation. Experimental results show that our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines such as SongMASS, a state-of-the-art model trained on a parallel dataset, with a 24% relative improvement in overall quality based on human ratings. Comment: Accepted to ACL 23. arXiv admin note: substantial text overlap with arXiv:2305.0776
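
    A small sketch of how a melody might be compiled into decoding constraints, as the abstract describes; the one-note-per-syllable assumption and the data below are illustrative, not the paper's actual interface.

        def melody_to_budgets(melody_phrases):
            # Assume each note carries one sung syllable, so every melodic phrase
            # yields a syllable budget for the corresponding lyric line.
            return [len(phrase) for phrase in melody_phrases]

        def line_fits(word_syllable_counts, budget):
            # A candidate line satisfies the constraint if its syllables fill the budget.
            return sum(word_syllable_counts) == budget

        budgets = melody_to_budgets([[0.5, 0.5, 1.0, 1.0], [1.0, 0.5, 0.5]])
        print(budgets)                    # [4, 3] syllables per line
        print(line_fits([1, 1, 2], 4))    # True, e.g. "sun" + "light" + "rising"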

    An Online Word Vector Generation Method Based on Incremental Huffman Tree Merging

    To meet the high real-time processing requirements of large amounts of online text data in natural language processing applications, an online word vector model generation method based on incremental Huffman tree merging is proposed. The Huffman tree inherited from the existing word vector model is kept unchanged, and a new Huffman tree is constructed for the incoming words such that it shares no leaf node with the inherited tree. The Huffman tree is then updated by node merging, so that, on the basis of the existing word vector model, every word still has a unique code for the hierarchical softmax computation. Finally, the incremental word vector model is generated with a neural network on top of the hierarchical softmax model. Experimental results show that the method can generate the word vector model online through incremental learning, in less time and with better performance.
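
    The sketch below illustrates ordinary Huffman coding and one simple way to combine an inherited tree with a tree built over new words; it is an illustration of the idea, not the paper's implementation, and the frequencies are invented.

        import heapq
        from itertools import count

        def huffman_codes(freqs):
            # Build prefix codes from word frequencies with a min-heap.
            tiebreak = count()
            heap = [(f, next(tiebreak), {w: ""}) for w, f in freqs.items()]
            heapq.heapify(heap)
            while len(heap) > 1:
                f1, _, left = heapq.heappop(heap)
                f2, _, right = heapq.heappop(heap)
                codes = {w: "0" + c for w, c in left.items()}
                codes.update({w: "1" + c for w, c in right.items()})
                heapq.heappush(heap, (f1 + f2, next(tiebreak), codes))
            return heap[0][2]

        inherited = huffman_codes({"data": 9, "text": 5, "model": 3})
        incoming = huffman_codes({"stream": 4, "online": 2})
        # Joining the two roots keeps every inherited code intact behind a "0"
        # prefix, while the new words receive fresh codes behind a "1" prefix.
        merged = {w: "0" + c for w, c in inherited.items()} | {w: "1" + c for w, c in incoming.items()}
        print(merged)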

    Graphical Analysis on Text Mining Unstructured Data Using D-Matrix

    Fault dependency (D-matrix) is used as a diagnostic model that identifies system fault data and its causal relationships at the hierarchical system level. It consists of the dependencies and relationships between identified failure modes and the symptoms related to a system. Constructing such a D-matrix fault detection model is a time-consuming task. A system is proposed that applies ontology-based text mining to unstructured data in order to construct the D-matrix automatically by mining large volumes of repair verbatim (typically written as unstructured text) collected throughout the identification process, and to generate a graphical model for each constructed D-matrix. A fault diagnosis ontology is constructed first, and text mining techniques are then applied to spot dependencies among failure modes and identified symptoms. The D-matrix is represented as a graph, so analysis becomes easier and faulty parts are easily detectable. The proposed methodology is implemented as a prototype tool and validated using real-life information collected from the automobile domain.
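
    As a toy illustration of the idea (not the proposed tool or its ontology), a D-matrix can be approximated by counting co-occurrences of known failure-mode and symptom terms in repair verbatim text; the term lists and verbatims below are invented.

        failure_modes = ["battery dead", "alternator failure"]
        symptoms = ["no start", "dim lights", "warning light"]

        verbatims = [
            "customer reports no start, battery dead after jump test",
            "dim lights and warning light on, alternator failure confirmed",
        ]

        # d_matrix[failure_mode][symptom] counts how often the two terms co-occur.
        d_matrix = {fm: {s: 0 for s in symptoms} for fm in failure_modes}
        for text in verbatims:
            for fm in failure_modes:
                if fm not in text:
                    continue
                for s in symptoms:
                    if s in text:
                        d_matrix[fm][s] += 1

        for fm, row in d_matrix.items():
            print(fm, row)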

    Multihierarchical Documents and Fine-Grained Access Control

    This work presents new models and algorithms for creating, modifying, and controlling access to complex text. The digitization of texts opens new opportunities for preservation, access, and analysis, but at the same time raises questions regarding how to represent and collaboratively edit such texts. Two issues of particular interest are modelling the relationships of markup (annotations) in complex texts, and controlling the creation and modification of those texts. This work addresses and connects these issues, with emphasis on data modelling, algorithms, and computational complexity; and contributes new results in these areas of research. Although hierarchical models of text and markup are common, complex texts often exhibit layers of overlapping structure that are best described by multihierarchical markup. We develop a new model of multihierarchical markup, the globally ordered GODDAG, that combines features of both graph- and range-based models of markup, allowing documents to be unambiguously serialized. We describe extensions to the XPath query language to support globally ordered GODDAGs, provide semantics for a set of update operations on this structure, and provide algorithms for converting between two different representations of the globally ordered GODDAG. Managing the collaborative editing of documents can require restricting the types of changes different editors may make, while not altogether restricting their access to the document. Fine-grained access control allows precisely these kinds of restrictions on the operations that a user is or is not permitted to perform on a document. We describe a rule-based model of fine-grained access control for updates of hierarchical documents, and in this context analyze the document generation problem: determining whether a document could have been created without violating a particular access control policy. We show that this problem is undecidable in the general case and provide computational complexity bounds for a number of restricted variants of the problem. Finally, we extend our fine-grained access control model from hierarchical to multihierarchical documents. We provide semantics for fine-grained access control policies that control splice-in, splice-out, and rename operations on globally ordered GODDAGs, and show that the multihierarchical version of the document generation problem remains undecidable.
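
    The following toy sketch illustrates two of the abstract's ingredients in heavily simplified form: markup from different hierarchies expressed as ranges over one base text, and a rule-based check of whether a user may perform a rename operation. It is not the dissertation's GODDAG model or policy language, and all names and rules are invented.

        text = "Sing, O goddess, the anger of Achilles"

        # Markup from two layers over the same base text.
        markup = [
            {"name": "line",   "start": 0,  "end": 38, "layer": "metrical"},
            {"name": "clause", "start": 0,  "end": 16, "layer": "syntactic"},
            {"name": "clause", "start": 17, "end": 38, "layer": "syntactic"},
        ]

        # Rule-based policy: (role, operation, element name) -> permitted?
        policy = {
            ("annotator", "rename", "clause"): True,
            ("annotator", "rename", "line"): False,
        }

        def may(role, operation, element):
            return policy.get((role, operation, element["name"]), False)

        print(may("annotator", "rename", markup[1]))  # True
        print(may("annotator", "rename", markup[0]))  # False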

    Automatic generation of natural language descriptions of visual data: describing images and videos using recurrent and self-attentive models

    Humans are faced with a constant flow of visual stimuli, e.g., from the environment or when looking at social media. In contrast, visually impaired people are often unable to perceive and process this beneficial information that could help them maneuver through everyday situations and activities. However, audible feedback such as natural language can make them better aware of their surroundings, enabling them to master everyday challenges autonomously. One possibility for creating audible feedback is to produce natural language descriptions for visual data such as still images and then read this text to the person. Moreover, textual descriptions of images can be further utilized for text analysis (e.g., sentiment analysis) and information aggregation. In this work, we investigate different approaches and techniques for the automatic generation of natural language descriptions of visual data such as still images and video clips. In particular, we look at language models that generate textual descriptions with recurrent neural networks: First, we present a model that generates image captions for scenes depicting interactions between humans and branded products. We focus on the correct identification of the brand name in a multi-task training setting and present two new metrics that allow us to evaluate this requirement. Second, we explore the automatic answering of questions posed about an image. We propose a model that generates answers from scratch instead of predicting an answer from a limited set of possible answers. In comparison to related work, we are therefore able to generate rare answers that are not contained in the pool of frequent answers. Third, we address the automatic generation of doctors' reports for chest X-ray images. We introduce a model that can cope with the bias of medical datasets (abnormal cases are very rare) and generates reports with a hierarchical recurrent model. We also investigate the correlation between the distinctiveness of a report and its score on traditional metrics, and find a discrepancy between good scores and accurate reports. Then, we examine self-attentive language models that improve computational efficiency and performance over the recurrent models. Specifically, we utilize the Transformer architecture. First, we extend automatic description generation to the domain of videos and present a video-to-text (VTT) model that can easily synchronize audio-visual features. With an extensive experimental exploration, we verify the effectiveness of our video-to-text translation pipeline. Finally, we revisit our recurrent models with this self-attentive approach.
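
    Since the abstract builds on self-attentive (Transformer) models, here is a minimal scaled dot-product self-attention layer in NumPy as a generic illustration of that building block; it is not one of the thesis's models, and the frame features are random stand-ins.

        import numpy as np

        def self_attention(x, wq, wk, wv):
            # x: (seq_len, d_model); wq/wk/wv: (d_model, d_k) projection matrices.
            q, k, v = x @ wq, x @ wk, x @ wv
            scores = q @ k.T / np.sqrt(k.shape[-1])          # (seq_len, seq_len)
            weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
            weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
            return weights @ v                               # context-mixed features

        rng = np.random.default_rng(0)
        frames = rng.normal(size=(5, 16))                    # e.g. features for 5 video frames
        wq, wk, wv = (rng.normal(size=(16, 8)) for _ in range(3))
        print(self_attention(frames, wq, wk, wv).shape)      # (5, 8)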