3,432 research outputs found

    A REST Service for Poetry Generation

    Get PDF
    This paper describes a REST API developed on top of PoeTryMe, a poetry generation platform. The API exposes several functionalities, from the production of full poems to narrower tasks useful for poetry composition, including the acquisition of well-formed lines or semantically related words, possibly constrained by number of syllables, rhyme, or polarity. Examples that illustrate the endpoints and what they can be used for are also presented.
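    As a rough illustration of how such an API might be consumed, the sketch below issues HTTP requests for a full poem and for a single constrained line. The base URL, endpoint paths, and parameter names are hypothetical placeholders, not the documented PoeTryMe routes.

```python
import requests

# Hypothetical base URL and endpoint names; the real PoeTryMe API may differ.
BASE_URL = "http://example.org/poetryme/api"

def get_poem(form="sonnet", seed_words=None):
    """Request a full poem, optionally seeded with semantically related words."""
    params = {"form": form}
    if seed_words:
        params["seeds"] = ",".join(seed_words)
    resp = requests.get(f"{BASE_URL}/poem", params=params, timeout=10)
    resp.raise_for_status()
    return resp.json()

def get_line(syllables=10, rhyme_with=None):
    """Request a single well-formed line constrained by syllable count and rhyme."""
    params = {"syllables": syllables}
    if rhyme_with:
        params["rhyme"] = rhyme_with
    resp = requests.get(f"{BASE_URL}/line", params=params, timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(get_poem(seed_words=["moon", "sea"]))
    print(get_line(syllables=8, rhyme_with="night"))
```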

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    Get PDF
    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp. 75-170. 118 pages, 8 figures, 1 table.

    Looking behind the text-to-be-seen: Analysing Twitter bots as electronic literature

    Get PDF
    This thesis focuses on showing how Twitter bots can be analysed from the viewpoint of electronic literature (e-lit) and how the analysis differs from evaluating other works of e-lit. Although formal research on electronic literature goes back some decades, there is still not much research discussing bots in particular. By examining historical and contemporary textual generators, seminal theories on reading and writing e-lit and botmakers’ practical notes about their craft, this study attempts to build an understanding of the process of creating a bot and the essential characteristics related to different kinds of bots. What makes the analysis of bots different from other textual generators is that the source code, which many theorists consider key in understanding works of e-lit, is rarely available for reading. This thesis proposes an alternative method for analysing bots, a framework for reverse-engineering the bot’s text generation procedures. By comparing the bot’s updates with one another, it is possible to notice the formulas and words repeated by the bot in order to better understand the authorial choices made in its design. The framework takes into account the special characteristics of different kinds of bots, focusing on grammar-based bots, which utilise fill-in-the-blank-type sentence structures to generate texts, and list-based bots, which methodically progress through large databases. From a survey of contemporary bots and earlier works of electronic and procedural literature, it becomes evident that understanding programming code is not essential for either analysing or creating bots: it is more important to understand the mechanisms of combinatory text generation and the author’s role in writing and curating the materials used. Bots and text generators also often raise questions of authorship. However, a review of their creation process makes it clear that human creativity is essential for the production of computer-generated texts. With bots, the writing of texts turns into a second-order creation, the writing of word lists, templates and rules, to generate the text-to-be-seen, the output for the reader to encounter.
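    To illustrate the grammar-based, fill-in-the-blank approach described above, here is a minimal sketch of combinatory text generation from templates and word lists; the templates and vocabulary are invented for the example and do not come from any particular bot.

```python
import random

# Hypothetical templates and word lists, in the spirit of grammar-based bots.
TEMPLATES = [
    "The {adjective} {noun} {verb} at dawn.",
    "I dreamt of a {adjective} {noun} that could {verb}.",
]
WORDS = {
    "adjective": ["silent", "electric", "forgotten"],
    "noun": ["lighthouse", "archive", "orchard"],
    "verb": ["sing", "dissolve", "wander"],
}

def generate_update():
    """Fill one randomly chosen template with randomly chosen words."""
    template = random.choice(TEMPLATES)
    choices = {slot: random.choice(options) for slot, options in WORDS.items()}
    return template.format(**choices)

if __name__ == "__main__":
    for _ in range(3):
        print(generate_update())
```

    Comparing many outputs of such a generator quickly exposes the fixed templates and the word lists behind them, which is the intuition behind the reverse-engineering framework proposed in the thesis.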

    Echoes of Persuasion: The Effect of Euphony in Persuasive Communication

    Full text link
    While the effects of various lexical, syntactic, semantic and stylistic features have been addressed in persuasive language from a computational point of view, the persuasive effect of phonetics has received little attention. By modeling a notion of euphony and analyzing four datasets comprising persuasive and non-persuasive sentences in different domains (political speeches, movie quotes, slogans and tweets), we explore the impact of sounds on different forms of persuasiveness. We conduct a series of analyses and prediction experiments within and across datasets. Our results highlight the positive role of phonetic devices in persuasion.
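    As a sketch of the kind of surface phonetic cues a euphony model might draw on, the snippet below computes crude alliteration and rhyme proxies for a sentence; these feature definitions are simplifying assumptions, not the paper's actual model.

```python
import re
from collections import Counter

def alliteration_score(sentence):
    """Fraction of words sharing their initial letter with another word (crude alliteration proxy)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if len(words) < 2:
        return 0.0
    initials = Counter(w[0] for w in words)
    shared = sum(c for c in initials.values() if c > 1)
    return shared / len(words)

def rhyme_score(sentence):
    """Fraction of words ending with the same two letters as another word (crude rhyme proxy)."""
    words = [w for w in re.findall(r"[a-z']+", sentence.lower()) if len(w) >= 2]
    if len(words) < 2:
        return 0.0
    endings = Counter(w[-2:] for w in words)
    shared = sum(c for c in endings.values() if c > 1)
    return shared / len(words)

if __name__ == "__main__":
    s = "Peter Piper picked a peck of pickled peppers"
    print(alliteration_score(s), rhyme_score(s))
```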

    Poetry at the first steps of Artificial Intelligence

    Get PDF
    This paper is about Artificial Intelligence (AI) attempts at writing poetry, usually referred to with the term “poetry generation”. Poetry generation started out from Digital Humanities, which developed out of humanities computing; nowadays, however, it is part of Computational Creativity, a field that tackles several areas of art and science. The paper examines, first, why poetry was chosen among other literary genres as a field for experimentation. Mention is made of the characteristics of poetry (namely arbitrariness and absurdity) that make it fertile ground for such endeavors, and also of various text- and reader-centered literary approaches that favored experimentation even by human poets. Then, a rough historical look at poetry generation is attempted, followed by a review of the methods employed, either for fun or as academic projects, following Lamb et al.’s (2017) taxonomy, which distinguishes between mere poetry generation and result enhancement. Another taxonomy, by Gonçalo Oliveira (2017), which separates form and content issues in poetry generation, is also briefly presented. The results of poetry generators are evaluated as generally poor, and the reasons for this failure are examined: the inability of computers to understand any word as a sign with a signified, lack of general intelligence, process- (rather than output-) driven attempts, etc. Then, computer-like results from a number of human poetic movements are presented as a juxtaposition: DADA, stream of consciousness, OuLiPo, LangPo, Flarf, blackout/erasure poetry. The equivalence between (i) human poets who are concerned more with experimentation than with good results and (ii) computer scientists who are process-driven leads to a discussion of the characteristics of humanness, of the possibility of granting future AI personhood, and of the need to see our world in terms of a new, more refined ontology.

    An adaptive clustering and classification algorithm for Twitter data streaming in Apache Spark

    Get PDF
    Ongoing big data from social network sites such as Twitter or Facebook has been an attractive source for investigation by researchers in recent decades because of aspects including timeliness, accessibility and popularity; however, there may be a trade-off in accuracy. Moreover, clustering of Twitter data has caught the attention of researchers. As such, an algorithm that can cluster data within a shorter computational time, especially for data streaming, is needed. The presented adaptive clustering and classification algorithm for data streaming in Apache Spark, which addresses these problems, proceeds in two phases. In the first phase, the pre-processed Twitter data is clustered using an Improved Fuzzy C-means clustering, further refined by an Adaptive Particle Swarm Optimization (PSO) algorithm, and the clustered data stream is evaluated in the Spark engine. In the second phase, the pre-processed Higgs data is classified using a modified support vector machine (MSVM) classifier with grid-search optimization. Finally, the optimized output is evaluated in the Spark engine and the result is used to derive the confusion matrix. The proposed work uses a Twitter dataset and a Higgs dataset for data streaming in Apache Spark. The computational experiments demonstrate the superiority of the presented approach over existing methods in terms of precision, recall, F-score, convergence, ROC curve and accuracy.
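    For illustration, a minimal NumPy implementation of standard fuzzy c-means is sketched below; the paper's improved variant, the adaptive PSO refinement, and the Spark streaming integration are not reproduced here.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=3, m=2.0, n_iter=100, seed=0):
    """Standard fuzzy c-means: returns cluster centers and the membership matrix U."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, rows normalised to sum to 1.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Weighted cluster centers.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Distances from every point to every center (epsilon avoids divide-by-zero).
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-10
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1)).
        inv = dist ** (-2.0 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

if __name__ == "__main__":
    X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
    centers, U = fuzzy_c_means(X, n_clusters=2)
    print(centers)
```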

    Perceptual fail: Female power, mobile technologies and images of self

    Get PDF
    Like a biological species, images of self have descended and been modified throughout their journey down the ages, interweaving and recharging their viability with the necessary interjections from culture, tools and technology. Part of this journey has seen images of self also become an intrinsic function within narratives about female power; consider Helen of Troy, “a face that launched a thousand ships” (Marlowe, 1604), or Kim Kardashian (KUWTK), who ushered in the mass-mediated ‘selfie’ as a social practice. The interweaving process itself sees the image oscillate between a naturalized ‘icon’ and an idealized ‘symbol’ of what the person looked like and/or aspired to become. These public images can confirm or constitute beauty ideals as well as influence (via imitation) behaviour and mannerisms, and as such the viewer’s belief in the veracity of the representative image also becomes intrinsically political, manipulating the associated narratives and fostering prejudice (Dobson 2015, Korsmeyer 2004, Pollock 2003). The selfie is arguably sui generis: whilst it is a mediated photographic image of self, it contains its own codes of communication and decorum that have fostered the formation of numerous new digital communities and influenced new media aesthetics. For example, the selfie is both of nature (it is still a time-based piece of documentation) and known to be perceptually untrue (filtered, modified and full of artifice). The paper will seek to demonstrate how selfie culture is infused by considerable levels of perceptual failings that are now central to contemporary celebrity culture and its notion of glamour, which in turn is intrinsically linked to (but not solely defined by) the province of feminine desire for reinvention, transformation or “self-sexualisation” (Hall, West and McIntyre, 2012). The subject, like the Kardashians or selfies, is divisive. In conclusion, this paper will explore the paradox of the perceptual failings at play within selfie culture more broadly: like ‘Reality TV’, selfies are infamously fake yet seem to provide Debord’s (1967) illusory cultural opiate whilst fulfilling a cultural longing. Questions then emerge when considering the narrative impact of these trends on gendered power structures and the traditional status of illusion and narrative fiction.

    Topic-Guided Self-Introduction Generation for Social Media Users

    Full text link
    Millions of users are active on social media. To allow users to better showcase themselves and network with others, we explore the auto-generation of social media self-introductions: short sentences outlining a user's personal interests. While most prior work profiles users with tags (e.g., ages), we investigate sentence-level self-introductions to provide a more natural and engaging way for users to know each other. Here we exploit a user's tweeting history to generate their self-introduction. The task is non-trivial because the history content may be lengthy, noisy, and exhibit varied personal interests. To address this challenge, we propose a novel unified topic-guided encoder-decoder (UTGED) framework; it models latent topics to reflect salient user interests, whose topic mixture then guides the encoding of a user's history while topic words control the decoding of their self-introduction. For experiments, we collect a large-scale Twitter dataset, and extensive results show the superiority of our UTGED over advanced encoder-decoder models without topic modeling.
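    A highly simplified sketch of the general idea of topic-guided encoding and decoding follows; the module names, dimensions, and wiring are assumptions for illustration and do not reproduce the actual UTGED architecture.

```python
import torch
import torch.nn as nn

class TopicGuidedSeq2Seq(nn.Module):
    """Toy encoder-decoder where a topic-mixture vector conditions both encoding and decoding."""

    def __init__(self, vocab_size=1000, n_topics=20, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.topic_proj = nn.Linear(n_topics, d_model)   # map topic mixture into model space
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, history_ids, intro_ids, topic_mixture):
        topic_vec = self.topic_proj(topic_mixture).unsqueeze(1)   # (B, 1, d)
        # Encode the tweeting history with the topic vector added to every token embedding.
        enc_in = self.embed(history_ids) + topic_vec
        _, h = self.encoder(enc_in)
        # Decode the self-introduction, again biased by the topic vector.
        dec_in = self.embed(intro_ids) + topic_vec
        dec_out, _ = self.decoder(dec_in, h)
        return self.out(dec_out)                                  # (B, T, vocab)

if __name__ == "__main__":
    model = TopicGuidedSeq2Seq()
    history = torch.randint(0, 1000, (2, 50))
    intro = torch.randint(0, 1000, (2, 12))
    topics = torch.softmax(torch.randn(2, 20), dim=-1)
    print(model(history, intro, topics).shape)  # torch.Size([2, 12, 1000])
```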

    Detecting AI generated text using neural networks

    Get PDF
    For humans, distinguishing machine-generated text from human-written text is mentally taxing and slow. NLP models have been created to do this more effectively and faster. But what if adversarial changes have been added to the machine-generated text? This thesis discusses this issue and text detectors in general. The primary goal of this thesis is to describe the current state of text detectors in research and to discuss a key adversarial issue in modern NLP transformers. To describe the current state of text detectors, a Systematic Literature Review of 50 papers relevant to machine-centric detection was conducted in chapter 2. As for the key adversarial issue, chapter 3 describes an experiment where RoBERTa was used to test transformers against simple mutations which cause mislabelling. The state of the literature is described at length in chapter 2, showing how viable text detection has become as a research subject. Lastly, RoBERTa was shown to be vulnerable to mutation attacks. The solution was found to be fine-tuning it on such heuristics: as long as the mutations can be predicted, the model can be fine-tuned to detect them.
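    As a rough sketch of this kind of experimental setup, the snippet below fine-tunes RoBERTa as a binary human-vs-machine text classifier with the Hugging Face transformers library; the toy data, label convention, and hyperparameters are placeholders, and the mutation heuristics discussed in the thesis are not shown.

```python
import torch
from torch.utils.data import Dataset
from transformers import (RobertaTokenizerFast, RobertaForSequenceClassification,
                          Trainer, TrainingArguments)

class DetectionDataset(Dataset):
    """Wraps (text, label) pairs: label 0 = human-written, 1 = machine-generated (assumed convention)."""
    def __init__(self, texts, labels, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: v[idx] for k, v in self.enc.items()}
        item["labels"] = self.labels[idx]
        return item

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

# Placeholder data; a real experiment would use a corpus of human and generated text.
train_ds = DetectionDataset(["an example human sentence", "an example generated sentence"],
                            [0, 1], tokenizer)

args = TrainingArguments(output_dir="detector", num_train_epochs=1,
                         per_device_train_batch_size=2, logging_steps=1)
Trainer(model=model, args=args, train_dataset=train_ds).train()
```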