3 research outputs found
Unsupervised Neural Stylistic Text Generation using Transfer learning and Adapters
Research has shown that personality is a key driver of engagement and
user experience in conversational systems. Conversational agents should also
maintain a consistent persona to have an engaging conversation with a user.
However, text generation datasets are often crowdsourced and thereby have an
averaging effect: the style of the generation model is an average of the
styles of all the crowd workers who contributed to the dataset. While one
could collect persona-specific datasets for each task, it would be an
expensive and time-consuming annotation effort. In this work, we propose a
novel transfer learning framework which updates only a small subset of model
parameters to learn style-specific attributes for response generation. For the
purpose of this study, we tackle the problem of stylistic story ending
generation using the ROCStories corpus. We learn style-specific attributes from the
PERSONALITY-CAPTIONS dataset. Through extensive experiments and evaluation
metrics, we show that our novel training procedure can improve style
generation by 200% over encoder-decoder baselines while maintaining on-par
content relevance metrics with the baselines.
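The abstract does not spell out the adapter configuration, but the general bottleneck-adapter idea it relies on — freezing the pretrained encoder-decoder and training only small inserted layers — can be sketched as follows. Dimensions, initialization, and the base-model size are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def adapter_forward(h, W_down, W_up):
    """Bottleneck adapter with a residual connection: h + up(relu(down(h)))."""
    z = np.maximum(h @ W_down, 0.0)   # project down to the bottleneck, ReLU
    return h + z @ W_up               # project back up, add residual

d_model, d_bottleneck = 512, 32       # illustrative sizes
W_down = rng.normal(scale=0.02, size=(d_model, d_bottleneck))
W_up = np.zeros((d_bottleneck, d_model))  # zero-init: adapter starts as identity

h = rng.normal(size=(4, d_model))         # a batch of hidden states
out = adapter_forward(h, W_down, W_up)
assert np.allclose(out, h)                # zero-init up-projection => no-op at start

# Only the adapter weights are trained; the base model stays frozen.
trainable = W_down.size + W_up.size
frozen = 60_000_000                       # illustrative base-model parameter count
print(f"trainable fraction: {trainable / (trainable + frozen):.4%}")
```

Because the up-projection is zero-initialized, training begins from the pretrained model's behavior and only gradually acquires the target style, which is what makes updating so few parameters viable.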
WriterForcing: Generating more interesting story endings
We study the problem of generating interesting endings for stories. Neural
generative models have shown promising results for various text generation
problems. Sequence to Sequence (Seq2Seq) models are typically trained to
generate a single output sequence for a given input sequence. However, in the
context of a story, multiple endings are possible. Seq2Seq models tend to
ignore the context and generate generic and dull responses. Very few works have
studied generating diverse and interesting story endings for a given story
context. In this paper, we propose models which generate more diverse and
interesting outputs by 1) training models to focus attention on important
keyphrases of the story, and 2) promoting generation of non-generic words. We
show that the combination of the two leads to more diverse and interesting
endings.
Comment: Accepted at the ACL Workshop on Storytelling, 2019
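The two ideas above — biasing attention toward keyphrase positions and penalizing generic words — can be illustrated with a minimal sketch. The bias and penalty constants, the masks, and the toy vocabulary are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Raw attention scores over 6 story-context tokens; positions 2 and 5
# are (hypothetically) marked as keyphrases.
scores = np.array([0.1, 0.3, 0.2, 0.0, 0.4, 0.1])
keyphrase_mask = np.array([0, 0, 1, 0, 0, 1], dtype=float)

plain = softmax(scores)
biased = softmax(scores + 2.0 * keyphrase_mask)   # additive keyphrase bonus
assert biased[2] > plain[2] and biased[5] > plain[5]

# Discourage generic words by subtracting a penalty from their logits
# before the output softmax: toy vocab ["nice", "ran", "whispered"],
# where "nice" is flagged as generic.
vocab_logits = np.array([2.0, 1.5, 0.5])
generic_mask = np.array([1, 0, 0], dtype=float)
penalized = softmax(vocab_logits - 3.0 * generic_mask)
assert penalized[0] < softmax(vocab_logits)[0]
```

Both mechanisms reshape distributions the model already produces, which is why they compose naturally: one steers what the decoder looks at, the other steers what it says.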
A Tale of Two Regulatory Regimes: Creation and Analysis of a Bilingual Privacy Policy Corpus
Over the past decade, researchers have started to explore the use of NLP to develop tools aimed at helping the public, vendors, and regulators analyze disclosures made in privacy policies. With the introduction of new privacy regulations, the language of privacy policies is also evolving, and disclosures made by the same organization are not always the same in different languages, especially when used to communicate with users who fall under different jurisdictions. This work explores the use of language technologies to capture and analyze these differences at scale. We introduce an annotation scheme designed to capture the nuances of two new landmark privacy regulations, namely the EU's GDPR and California's CCPA/CPRA. We then introduce the first bilingual corpus of mobile app privacy policies, consisting of 64 privacy policies in English (292K words) and 91 privacy policies in German (478K words), with manual annotations for 8K and 19K fine-grained data practices, respectively. The annotations are used to develop computational methods that can automatically extract “disclosures” from privacy policies. Analysis of a subset of 59 “semi-parallel” policies reveals differences that can be attributed to different regulatory regimes, suggesting that systematic analysis of policies using automated language technologies is indeed a worthwhile endeavor. © European Language Resources Association (ELRA), licensed under CC-BY-NC-4.0
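As a toy illustration of the kind of automated disclosure extraction the abstract describes — not the paper's method, which uses trained classifiers over fine-grained annotations — a pattern-based tagger can flag sentences that mention regulated data practices. The practice labels and keyword patterns below are invented for the example:

```python
import re

# Hypothetical practice categories with illustrative trigger patterns.
PRACTICE_PATTERNS = {
    "collection": r"\b(collect|gather|obtain)\b",
    "sharing": r"\b(share|disclose|sell)\b",
    "deletion": r"\b(delete|erase|remove)\b",
}

def tag_disclosures(policy_text):
    """Return (sentence, [matched practices]) pairs for disclosure-like sentences."""
    tagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", policy_text):
        matched = [practice for practice, pattern in PRACTICE_PATTERNS.items()
                   if re.search(pattern, sentence, re.IGNORECASE)]
        if matched:
            tagged.append((sentence, matched))
    return tagged

sample = ("We collect your email address. "
          "We do not sell personal information. "
          "You may ask us to delete your data.")
result = tag_disclosures(sample)
assert [tags for _, tags in result] == [["collection"], ["sharing"], ["deletion"]]
```

The second sentence shows why keyword matching alone is insufficient for real policy analysis: "do not sell" is a negated disclosure, which is one reason the corpus's manual annotations and learned models matter.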