4,846 research outputs found
Transient Analysis for Music and Moving Images: Consideration for Television Advertising
In audiovisual composition, coupling montage moving images with music is common practice. Interpretation of the effect on an audioviewer's consequent interpretation of the composition is discursive and unquantified. Meth-odology for evaluating the audiovisual multimodal inter-activity is proposed, developing an analysis procedure via the study of modality interdependent transient structures, explained as forming the foundation of perception via the concept of Basic Exposure response to the stimulus. The research has implications for analysis of all audiovisual media, with practical implications in television advertis-ing as a discrete typology of target driven audiovisual presentation. Examples from contemporary advertising are used to explore typical transient interaction patterns and the consequences of which are discussed from the practical viewpoint of the audiovisual composer
Hierachical Delta-Attention Method for Multimodal Fusion
In vision and linguistics; the main input modalities are facial expressions,
speech patterns, and the words uttered. The issue with analysis of any one mode
of expression (Visual, Verbal or Vocal) is that lot of contextual information
can get lost. This asks researchers to inspect multiple modalities to get a
thorough understanding of the cross-modal dependencies and temporal context of
the situation to analyze the expression. This work attempts at preserving the
long-range dependencies within and across different modalities, which would be
bottle-necked by the use of recurrent networks and adds the concept of
delta-attention to focus on local differences per modality to capture the
idiosyncrasy of different people. We explore a cross-attention fusion technique
to get the global view of the emotion expressed through these
delta-self-attended modalities, in order to fuse all the local nuances and
global context together. The addition of attention is new to the multi-modal
fusion field and currently being scrutinized for on what stage the attention
mechanism should be used, this work achieves competitive accuracy for overall
and per-class classification which is close to the current state-of-the-art
with almost half number of parameters
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118
pages, 8 figures, 1 tabl
- …