Late Fusion with Triplet Margin Objective for Multimodal Ideology
  Prediction and Analysis

Qiu, Changyuan; Wang, Lu; Wu, Winston; Zhang, Xinliang Frederick

Authors: Changyuan Qiu, Lu Wang, Winston Wu, Xinliang Frederick Zhang
Publication date: 4 November 2022

Abstract

Prior work on ideology prediction has largely focused on single modalities, i.e., text or images. In this work, we introduce the task of multimodal ideology prediction, where a model predicts binary or five-point scale ideological leanings, given a text-image pair with political content. We first collect five new large-scale datasets with English documents and images along with their ideological leanings, covering news articles from a wide range of US mainstream media and social media posts from Reddit and Twitter. We conduct in-depth analyses of news articles and reveal differences in image content and usage across the political spectrum. Furthermore, we perform extensive experiments and ablation studies, demonstrating the effectiveness of targeted pretraining objectives on different model components. Our best-performing model, a late-fusion architecture pretrained with a triplet objective over multimodal content, outperforms the state-of-the-art text-only model by almost 4% and a strong multimodal baseline with no pretraining by over 3%.

Comment: EMNLP 202
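
The abstract names two ingredients, late fusion and a triplet margin objective, without giving their details. As a rough illustration only, the sketch below shows one common form of each: text and image embeddings are encoded separately and joined by concatenation, and the loss pushes an anchor closer to a positive example than to a negative one by at least a margin. The function names, the concatenation-based fusion, the Euclidean distance, and the toy vectors are all assumptions made for this example; they are not taken from the paper and may differ from the authors' actual architecture and triplet construction.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def late_fusion(text_emb, image_emb):
    """Hypothetical late fusion: normalize each modality's embedding
    separately, then concatenate them into one joint vector."""
    return l2_normalize(text_emb) + l2_normalize(image_emb)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy example (made-up 2-d embeddings): anchor and positive stand for two
# items with the same leaning, negative for an item with a different leaning.
anchor = late_fusion([1.0, 0.0], [0.0, 1.0])
positive = late_fusion([1.0, 0.0], [0.0, 1.0])
negative = late_fusion([0.0, 1.0], [1.0, 0.0])

loss_easy = triplet_margin_loss(anchor, positive, negative, margin=1.0)
loss_hard = triplet_margin_loss(anchor, negative, positive, margin=1.0)
```

In this toy setup the first triplet is already separated by more than the margin, so its loss is zero, while swapping the positive and negative yields a positive loss that training would reduce.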

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2211.02269

Last time updated on 12/12/2022