Search CORE

156,179 research outputs found

How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?

Author: Bansal Hritik
Chang Kai-Wei
Monajatipoor Masoud
Yin Da
Publication venue
Publication date: 27/10/2022
Field of study

Text-to-image generative models have achieved unprecedented success in generating high-quality images based on natural language descriptions. However, it is shown that these models tend to favor specific social groups when prompted with neutral text descriptions (e.g., 'a photo of a lawyer'). Following Zhao et al. (2021), we study the effect on the diversity of the generated images when adding ethical intervention that supports equitable judgment (e.g., 'if all individuals can be a lawyer irrespective of their gender') in the input prompts. To this end, we introduce an Ethical NaTural Language Interventions in Text-to-Image GENeration (ENTIGEN) benchmark dataset to evaluate the change in image generations conditional on ethical interventions across three social axes -- gender, skin color, and culture. Through ENTIGEN framework, we find that the generations from minDALL.E, DALL.E-mini and Stable Diffusion cover diverse social groups while preserving the image quality. Preliminary studies indicate that a large change in the model predictions is triggered by certain phrases such as 'irrespective of gender' in the context of gender bias in the ethical interventions. We release code and annotated data at https://github.com/Hritikbansal/entigen_emnlp.Comment: 13 pages, 8 figures, 6 tables. Accepted as Oral Presentation at EMNLP 202

arXiv.org e-Print Archive

Table-to-text Generation by Structure-aware Seq2seq Learning

Author: Chang Baobao
Liu Tianyu
Sha Lei
Sui Zhifang
Wang Kexiang
Publication venue
Publication date: 27/11/2017
Field of study

Table-to-text generation aims to generate a description for a factual table which can be viewed as a set of field-value records. To encode both the content and the structure of a table, we propose a novel structure-aware seq2seq architecture which consists of field-gating encoder and description generator with dual attention. In the encoding phase, we update the cell memory of the LSTM unit by a field gate and its corresponding field value in order to incorporate field information into table representation. In the decoding phase, dual attention mechanism which contains word level attention and field level attention is proposed to model the semantic relevance between the generated description and the table. We conduct experiments on the \texttt{WIKIBIO} dataset which contains over 700k biographies and corresponding infoboxes from Wikipedia. The attention visualizations and case studies show that our model is capable of generating coherent and informative descriptions based on the comprehensive understanding of both the content and the structure of a table. Automatic evaluations also show our model outperforms the baselines by a great margin. Code for this work is available on https://github.com/tyliupku/wiki2bio.Comment: Accepted by AAAI201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Textual Economy through Close Coupling of Syntax and Semantics

Author: Stone Matthew
Webber Bonnie
Publication venue
Publication date: 01/01/1998
Field of study

We focus on the production of efficient descriptions of objects, actions and events. We define a type of efficiency, textual economy, that exploits the hearer's recognition of inferential links to material elsewhere within a sentence. Textual economy leads to efficient descriptions because the material that supports such inferences has been included to satisfy independent communicative goals, and is therefore overloaded in Pollack's sense. We argue that achieving textual economy imposes strong requirements on the representation and reasoning used in generating sentences. The representation must support the generator's simultaneous consideration of syntax and semantics. Reasoning must enable the generator to assess quickly and reliably at any stage how the hearer will interpret the current sentence, with its (incomplete) syntax and semantics. We show that these representational and reasoning requirements are met in the SPUD system for sentence planning and realization.Comment: 10 pages, uses QobiTree.te

arXiv.org e-Print Archive

CiteSeerX

Edinburgh Research Explorer

Atlas.txt : Linking Geo-referenced Data to Text for NLG

Author: Sripada Gowri Somayajulu
Thomas Kavita
Publication venue
Publication date: 01/01/2007
Field of study

Peer reviewedPreprin

Aberdeen University Research

Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints

Author: An Bang
Chen Changyou
Wang Xiaoyang
Wang Zhenyi
Yu Dong
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020
Field of study

Text generation from a knowledge base aims to translate knowledge triples to natural language descriptions. Most existing methods ignore the faithfulness between a generated text description and the original table, leading to generated information that goes beyond the content of the table. In this paper, for the first time, we propose a novel Transformer-based generation framework to achieve the goal. The core techniques in our method to enforce faithfulness include a new table-text optimal-transport matching loss and a table-text embedding similarity loss based on the Transformer model. Furthermore, to evaluate faithfulness, we propose a new automatic metric specialized to the table-to-text generation problem. We also provide detailed analysis on each component of our model in our experiments. Automatic and human evaluations show that our framework can significantly outperform state-of-the-art by a large margin.Comment: Accepted at ACL202

arXiv.org e-Print Archive

Crossref

Text to 3D Scene Generation with Rich Lexical Grounding

Author: Chang Angel
Manning Christopher D.
Monroe Will
Potts Christopher
Savva Manolis
Publication venue
Publication date: 01/01/2015
Field of study

The ability to map descriptions of scenes to 3D geometric representations has many applications in areas such as art, education, and robotics. However, prior work on the text to 3D scene generation task has used manually specified object categories and language that identifies them. We introduce a dataset of 3D scenes annotated with natural language descriptions and learn from this data how to ground textual descriptions to physical objects. Our method successfully grounds a variety of lexical terms to concrete referents, and we show quantitatively that our method improves 3D scene generation over previous work using purely rule-based methods. We evaluate the fidelity and plausibility of 3D scenes generated with our grounding approach through human judgments. To ease evaluation on this task, we also introduce an automated metric that strongly correlates with human judgments.Comment: 10 pages, 7 figures, 3 tables. To appear in ACL-IJCNLP 201

arXiv.org e-Print Archive

CiteSeerX

Crossref