TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation
  with Question Answering

Hu, Yushi; Kasai, Jungo; Krishna, Ranjay; Liu, Benlin; Ostendorf, Mari; Smith, Noah A.; Wang, Yizhong

Publication date: 28 March 2023

Abstract

Despite thousands of researchers, engineers, and artists actively working on improving text-to-image generation models, systems often fail to produce images that accurately align with the text inputs. We introduce TIFA (Text-to-Image Faithfulness evaluation with question Answering), an automatic evaluation metric that measures the faithfulness of a generated image to its text input via visual question answering (VQA). Specifically, given a text input, we automatically generate several question-answer pairs using a language model. We calculate image faithfulness by checking whether existing VQA models can answer these questions using the generated image. TIFA is a reference-free metric that allows for fine-grained and interpretable evaluations of generated images. TIFA also correlates better with human judgments than existing metrics. Based on this approach, we introduce TIFA v1.0, a benchmark consisting of 4K diverse text inputs and 25K questions across 12 categories (object, counting, etc.). We present a comprehensive evaluation of existing text-to-image models using TIFA v1.0 and highlight the limitations and challenges of current models. For instance, we find that current text-to-image models, despite doing well on color and material, still struggle with counting, spatial relations, and composing multiple objects. We hope our benchmark will help carefully measure research progress in text-to-image synthesis and provide valuable insights for further research.
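The scoring step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `QAPair`, `faithfulness_score`, `score_by_category`, and `toy_vqa` are hypothetical, the question-answer pairs are written by hand rather than generated by a language model, and the VQA model is replaced by a lookup table so the example runs on its own. The sketch assumes the score is the fraction of questions the VQA model answers correctly, with a per-category breakdown giving the fine-grained view.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass(frozen=True)
class QAPair:
    """One question about the text input, with its expected answer (hypothetical type)."""
    question: str
    answer: str
    category: str  # e.g. "object", "counting", "material"


# A VQA function takes an image and a question and returns an answer string.
VQAFn = Callable[[Any, str], str]


def _matches(predicted: str, expected: str) -> bool:
    # Assumption: answers are compared case-insensitively after trimming whitespace.
    return predicted.strip().lower() == expected.strip().lower()


def faithfulness_score(image: Any, qa_pairs: Sequence[QAPair], vqa: VQAFn) -> float:
    """Fraction of questions the VQA function answers correctly on the image."""
    if not qa_pairs:
        raise ValueError("at least one question-answer pair is required")
    correct = sum(_matches(vqa(image, qa.question), qa.answer) for qa in qa_pairs)
    return correct / len(qa_pairs)


def score_by_category(image: Any, qa_pairs: Sequence[QAPair], vqa: VQAFn) -> dict[str, float]:
    """Per-category accuracy, showing which kinds of question the image fails."""
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    for qa in qa_pairs:
        totals[qa.category] += 1
        hits[qa.category] += _matches(vqa(image, qa.question), qa.answer)
    return {category: hits[category] / totals[category] for category in totals}


# Hand-written pairs for the text input "two red apples on a wooden table".
qa_pairs = [
    QAPair("What fruit is shown?", "apple", "object"),
    QAPair("How many apples are there?", "two", "counting"),
    QAPair("What is the table made of?", "wood", "material"),
]

# Stand-in for a real VQA model: canned answers for an image that shows three apples.
_canned_answers = {
    "What fruit is shown?": "apple",
    "How many apples are there?": "three",
    "What is the table made of?": "wood",
}


def toy_vqa(image: Any, question: str) -> str:
    return _canned_answers[question]


overall = faithfulness_score(None, qa_pairs, toy_vqa)
per_category = score_by_category(None, qa_pairs, toy_vqa)
print(overall, per_category)
```

In this toy case the image fails only the counting question, so the overall score is 2/3 and the per-category breakdown isolates the failure, which is the kind of interpretability the abstract attributes to the metric.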

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2303.11897

Last time updated on 02/04/2023