22 research outputs found

    MixPoet: Diverse Poetry Generation via Learning Controllable Mixed Latent Space

    Full text link
    As an essential step towards computer creativity, automatic poetry generation has gained increasing attention these years. Though recent neural models make prominent progress in some criteria of poetry quality, generated poems still suffer from the problem of poor diversity. Related literature researches show that different factors, such as life experience, historical background, etc., would influence composition styles of poets, which considerably contributes to the high diversity of human-authored poetry. Inspired by this, we propose MixPoet, a novel model that absorbs multiple factors to create various styles and promote diversity. Based on a semi-supervised variational autoencoder, our model disentangles the latent space into some subspaces, with each conditioned on one influence factor by adversarial training. In this way, the model learns a controllable latent variable to capture and mix generalized factor-related properties. Different factor mixtures lead to diverse styles and hence further differentiate generated poems from each other. Experiment results on Chinese poetry demonstrate that MixPoet improves both diversity and quality against three state-of-the-art models.Comment: 8 pages, 5 figures, published in AAAI 202

    Computational Understanding, Generation and Evaluation of Creative Expressions

    Get PDF
    Computational creativity has received a good amount of research interest in generating creative artefacts programmatically. At the same time, research has been conducted in computational aesthetics, which essentially tries to analyse creativity exhibited in art. This thesis aims to unite these two distinct lines of research in the context of natural language generation by building, from models for interpretation and generation, a cohesive whole that can assess its own generations. I present a novel method for interpreting one of the most difficult rhetoric devices in the figurative use of language: metaphors. The method does not rely on hand-annotated data and it is purely data-driven. It obtains the state of the art results and is comparable to the interpretations given by humans. We show how a metaphor interpretation model can be used in generating metaphors and metaphorical expressions. Furthermore, as a creative natural language generation task, we demonstrate assigning creative names to colours using an algorithmic approach that leverages a knowledge base of stereotypical associations for colours. Colour names produced by the approach were favoured by human judges to names given by humans 70% of the time. A genetic algorithm-based method is elaborated for slogan generation. The use of a genetic algorithm makes it possible to model the generation of text while optimising multiple fitness functions, as part of the evolutionary process, to assess the aesthetic quality of the output. Our evaluation indicates that having multiple balanced aesthetics outperforms a single maximised aesthetic. From an interplay of neural networks and the traditional AI approach of genetic algorithms, we present a symbiotic framework. This is called the master-apprentice framework. This makes it possible for the system to produce more diverse output as the neural network can learn from both the genetic algorithm and real people. The master-apprentice framework emphasises a strong theoretical foundation for the creative problem one seeks to solve. From this theoretical foundation, a reasoned evaluation method can be derived. This thesis presents two different evaluation practices based on two different theories on computational creativity. This research is conducted in two distinct practical tasks: pun generation in English and poetry generation in Finnish.Laskennallista luovuutta on tutkittu paljon puhtaan tuottamisen näkökulmasta ja saman aikaan tutkimusta on tehty laskennallisen estetiikan saralla. Väitöskirjani yhdistää näitä kahta eri koulukuntaa, sillä kehittämäni laskennallisesti luovat järjestelmät käyttävät tuottamisessa apuna estetiikkaa; järjestelmät siis tulkitsevat teoksiaan samaan aikaan, kun ne niitä tuottavat. Käsittelen väitöskirjassani metaforien automaattista tulkintaa, värien nimien tuottamista, sloganien tuottamista sekä suomenkielisen runouden tuottamista. Metodeina käytän perinteistä koneoppimisalgoritmia, eli niin kutsuttua geneettistä algoritmia, sekä neuroverkkoja. Niiden yhdistelmää nimitän mestari ja oppipoika -malliksi, jossa geneettinen algoritmi opettaa neuroverkkoja

    A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception

    Get PDF
    In recent years there has been substantial growth in the capabilities of systems designed to generate text that mimics the fluency and coherence of human language. From this, there has been considerable research aimed at examining the potential uses of these natural language generators (NLG) towards a wide number of tasks. The increasing capabilities of powerful text generators to mimic human writing convincingly raises the potential for deception and other forms of dangerous misuse. As these systems improve, and it becomes ever harder to distinguish between human-written and machine-generated text, malicious actors could leverage these powerful NLG systems to a wide variety of ends, including the creation of fake news and misinformation, the generation of fake online product reviews, or via chatbots as means of convincing users to divulge private information. In this paper, we provide an overview of the NLG field via the identification and examination of 119 survey-like papers focused on NLG research. From these identified papers, we outline a proposed high-level taxonomy of the central concepts that constitute NLG, including the methods used to develop generalised NLG systems, the means by which these systems are evaluated, and the popular NLG tasks and subtasks that exist. In turn, we provide an overview and discussion of each of these items with respect to current research and offer an examination of the potential roles of NLG in deception and detection systems to counteract these threats. Moreover, we discuss the broader challenges of NLG, including the risks of bias that are often exhibited by existing text generation systems. This work offers a broad overview of the field of NLG with respect to its potential for misuse, aiming to provide a high-level understanding of this rapidly developing area of research

    Discriminative and Generative Learning with Style Information

    Get PDF
    Conventional machine learning approaches usually assume that the patterns follow the identical and independent distribution (i.i.d.). However, in many empirical cases, such condition might be violated when data are equipped with diverse and inconsistent style information. The effectiveness of those traditional predictors may be limited due to the violation of the i.i.d. assumption brought by the existence of the style inconsistency. In this thesis, we investigate how the style information can be appropriately utilized for further lifting up the performance of machine learning models. It is fulfilled by not only introducing the style information into some state-of-the-art models, some new architectures, frameworks are also designed and implemented with specific purposes to make proper use of the style information. The main work is listed as the following summaries: First, the idea of the style averaging is initially introduced by an example of an image process based sunglasses recovery algorithm to perform robust one-shot facial expression recognition task. It is named as Style Elimination Transformation (SET). By recovering the pixels corrupted by the dark colors of the sunglasses brought by the proposed algorithm, the classification performance is promoted on several state-of-the-art machine learning classifiers even in a one-shot training setting. Then the investigation of the style normalization and style neutralization is investigated with both discriminative and generative machine learning approaches respectively. In discriminative learning models with style information, the style normalization transformation (SNT) is integrated into the support vector machines (SVM) for both classification and regression, named as the field support vector classification (F-SVC) and field support vector regression (F-SVR) respectively. The SNT can be represented with the nonlinearity by mapping the sufficiently complicated style information to the high-dimensional reproducing kernel Hilbert space. The learned SNT would normalize the inconsistent style information, producing i.i.d. examples, on which the SVM will be applied. Furthermore, a self-training based transductive framework will be introduced to incorporate with the unseen styles during training. The transductive SNT (T-SNT) is learned by transferring the trained styles to the unknown ones. Besides, in generative learning with style information, the style neutralization generative adversarial classifier (SN-GAC) is investigated to incorporate with the style information when performing the classification. As a neural network based framework, the SN-GAC enables the nonlinear mapping due to the nature of the nonlinearity of the neural network transformation with the generative manner. As a generalized and novel classification framework, it is capable of synthesizing style-neutralized high-quality humanunderstandable patterns given any style-inconsistent ones. Being learned with the adversarial training strategy in the first step, the final classification performance will be further promoted by fine-tuning the classifier when those style-neutralized examples can be well generated. Finally, the reversed task of the upon-mentioned style neutralization in the SN-GAC model, namely, the generation of arbitrary-style patterns, is also investigated in this thesis. By introducing the W-Net, a deep architecture upgraded from the famous U-Net model for image-to-image translation tasks, the few-shot (even the one-shot) arbitrary-style Chinese character generation task will be fulfilled. Same as the SN-GAC model, the W-Net is also trained with the adversarial training strategy proposed by the generative adversarial network. Such W-Net architecture is capable of generating any Chinese characters with the similar style as those given a few, or even one single, stylized examples. For all the proposed algorithms, frameworks, and models mentioned above for both the prediction and generation tasks, the inconsistent style information is taken into appropriate consideration. Inconsistent sunglasses information is eliminated by an image processing based sunglasses recovery algorithm in the SET, producing style-consistent patterns. The facial expression recognition is performed based on those transformed i.i.d. examples. The SNT is integrated into the SVM model, normalizing the inconsistent style information nonlinearly with the kernelized mapping. The T-SNT further enables the field prediction on those unseen styles during training. In the SN-GAC model, the style neutralization is performed by the neural network based upgraded U-Net architecture. Trained with separated steps with the adversarial optimization strategy included, it produces the high-quality style-neutralized i.i.d. patterns. The following classification is learned to produce superior performance with no additional computation involved. The W-Net architecture enables the free manipulation of the style data generation task with only a few, or even one single, style reference(s) available. It makes the Few-shot, or even the One-shot, Chinese Character Generation with the Arbitrary-style information task to be realized. Such appealing property is hardly seen in the literature
    corecore