
    Multimodal Grounding for Language Processing

    This survey discusses how recent developments in multimodal processing facilitate conceptual grounding of language. We categorize the information flow in multimodal processing with respect to cognitive models of human information processing and analyze different methods for combining multimodal representations. Based on this methodological inventory, we discuss the benefits of multimodal grounding for a variety of language processing tasks and the challenges that arise. We particularly focus on the multimodal grounding of verbs, which play a crucial role in the compositional power of language.

    Comment: The paper has been published in the Proceedings of the 27th International Conference on Computational Linguistics. Please refer to that version for citations: https://www.aclweb.org/anthology/papers/C/C18/C18-1197
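    As a concrete illustration of the combination strategies such surveys catalogue, the following is a minimal sketch contrasting early fusion (concatenation) with fusion in a learned shared space. All dimensions and weights are illustrative assumptions, not methods from the paper.

    ```python
    # Minimal sketch of two common ways to combine multimodal
    # representations: early fusion by concatenation vs. fusion in a
    # learned shared space. Dimensions and weights are illustrative
    # assumptions, not taken from the survey.
    import numpy as np

    rng = np.random.default_rng(0)
    text_emb = rng.normal(size=300)    # e.g., a word embedding
    image_emb = rng.normal(size=2048)  # e.g., a CNN image feature

    # Early fusion: concatenate the unimodal vectors.
    fused_concat = np.concatenate([text_emb, image_emb])  # shape (2348,)

    # Shared-space fusion: project each modality into a common space.
    W_t = rng.normal(size=(128, 300)) * 0.01   # learned in practice
    W_i = rng.normal(size=(128, 2048)) * 0.01  # learned in practice
    fused_shared = np.tanh(W_t @ text_emb) + np.tanh(W_i @ image_emb)

    print(fused_concat.shape, fused_shared.shape)  # (2348,) (128,)
    ```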

    UR-FUNNY: A Multimodal Language Dataset for Understanding Humor

    Humor is a unique and creative communicative behavior displayed during social interactions. It is produced in a multimodal manner, through the use of words (text), gestures (vision), and prosodic cues (acoustic). Understanding humor from these three modalities falls within the boundaries of multimodal language, a recent research trend in natural language processing that models natural language as it occurs in face-to-face communication. Although humor detection is an established research area in NLP, it remains understudied in a multimodal context. This paper presents a diverse multimodal dataset, called UR-FUNNY, to open the door to understanding the multimodal language used in expressing humor. The dataset and accompanying studies present a framework for multimodal humor detection for the natural language processing community. UR-FUNNY is publicly available for research.
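    To make the task concrete, here is a minimal sketch of a trimodal humor classifier of the kind UR-FUNNY enables. The feature dimensions and the simple fusion-by-concatenation design are illustrative assumptions, not the dataset's reference model.

    ```python
    # A toy trimodal (text + vision + acoustic) humor classifier.
    # Feature dimensions are illustrative assumptions.
    import torch
    import torch.nn as nn

    class TrimodalHumorClassifier(nn.Module):
        def __init__(self, d_text=300, d_vision=35, d_acoustic=81, d_hidden=128):
            super().__init__()
            self.fuse = nn.Sequential(
                nn.Linear(d_text + d_vision + d_acoustic, d_hidden),
                nn.ReLU(),
                nn.Linear(d_hidden, 1),  # logit: humorous vs. not
            )

        def forward(self, text, vision, acoustic):
            # Each input is an utterance-level feature vector per modality.
            return self.fuse(torch.cat([text, vision, acoustic], dim=-1))

    model = TrimodalHumorClassifier()
    logit = model(torch.randn(4, 300), torch.randn(4, 35), torch.randn(4, 81))
    print(logit.shape)  # torch.Size([4, 1])
    ```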

    The Multimodal Experience of Art

    The aim of this paper is to argue that our experience of artworks is normally multimodal: it is the result of perceptual processing in more than one sense modality. In other words, the multimodal experience of art is not the exception; it is the rule. I use the example of music to demonstrate the various ways in which the visual sense modality influences the auditory processing of music, and conclude that this should make us look more closely at our practices of engaging with artworks.

    Multi-modal Image Processing based on Coupled Dictionary Learning

    In real-world scenarios, many data processing problems involve heterogeneous images associated with different imaging modalities. Since these multimodal images originate from the same phenomenon, it is realistic to assume that they share common attributes or characteristics. In this paper, we propose a multi-modal image processing framework based on coupled dictionary learning to capture similarities and disparities between different image modalities. In particular, our framework captures structural similarities shared across image modalities, such as edges, corners, and other elementary primitives, in a learned sparse transform domain rather than the original pixel domain; these shared structures can then be exploited to improve a number of image processing tasks such as denoising, inpainting, or super-resolution. Practical experiments demonstrate that incorporating multimodal information using our framework brings notable benefits.

    Comment: SPAWC 2018, 19th IEEE International Workshop on Signal Processing Advances in Wireless Communications
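    A minimal sketch of the coupled-dictionary idea follows, assuming the common trick of stacking corresponding patches from two modalities so that a single shared sparse code must reconstruct both. scikit-learn's generic dictionary learner stands in for the paper's algorithm, and all sizes are illustrative.

    ```python
    # Coupled dictionary learning via feature stacking: one shared sparse
    # code reconstructs patches from both modalities, yielding a pair of
    # coupled dictionaries. Sizes and solver settings are illustrative.
    import numpy as np
    from sklearn.decomposition import DictionaryLearning

    rng = np.random.default_rng(0)
    n_patches, d = 500, 64                  # e.g., vectorized 8x8 patches
    X_a = rng.normal(size=(n_patches, d))   # modality A patches
    X_b = X_a + 0.1 * rng.normal(size=(n_patches, d))  # correlated modality B

    X = np.hstack([X_a, X_b])               # stack features, share the code
    learner = DictionaryLearning(n_components=32, alpha=1.0, max_iter=20,
                                 transform_algorithm="lasso_lars")
    codes = learner.fit_transform(X)        # sparse codes shared by A and B
    D_a, D_b = learner.components_[:, :d], learner.components_[:, d:]
    print(codes.shape, D_a.shape, D_b.shape)  # (500, 32) (32, 64) (32, 64)
    ```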

    Conceptual Frameworks for Multimodal Social Signal Processing

    This special issue is about a research area that is developing rapidly. Pentland gave it a name that has become widely used, ‘Social Signal Processing’ (SSP for short), and his phrase provides the title of a European project, SSPnet, whose brief is to consolidate the area. The challenge that Pentland highlighted was understanding the nonlinguistic signals that serve as the basis for “subconscious discussions between humans about relationships, resources, risks, and rewards”. He identified it as an area where computational research had made interesting progress and could usefully make more.

    Eyes and ears together: new task for multimodal spoken content analysis

    Human speech processing is often a multimodal process combining audio and visual information. Eyes and Ears Together proposes two benchmark multimodal speech processing tasks: (1) multimodal automatic speech recognition (ASR) and (2) multimodal co-reference resolution on spoken multimedia. These tasks are motivated by our desire to address the difficulties of ASR for multimedia spoken content. We review prior work on the integration of multimodal signals into speech processing for multimedia data, introduce a multimedia dataset for our proposed tasks, and outline the tasks themselves.
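    As an illustration of how visual context can help disambiguate speech, here is a toy late-fusion rescorer over ASR hypotheses. The hypotheses, scores, and interpolation weight are invented for illustration and are not the proposed benchmark's method.

    ```python
    # Toy late fusion: interpolate acoustic log-probs with scores from a
    # (hypothetical) visual-relevance model to pick an ASR hypothesis.
    def rescore(hypotheses, visual_scores, lam=0.3):
        """Return the hypothesis maximizing the interpolated score."""
        return max(
            hypotheses,
            key=lambda h: (1 - lam) * h["acoustic_logp"]
                          + lam * visual_scores.get(h["text"], -10.0),
        )

    hyps = [
        {"text": "recognize speech",   "acoustic_logp": -4.1},
        {"text": "wreck a nice beach", "acoustic_logp": -4.0},
    ]
    # e.g., a beach scene in the video raises the second reading
    visual = {"wreck a nice beach": -1.0, "recognize speech": -6.0}
    print(rescore(hyps, visual)["text"])  # -> wreck a nice beach
    ```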