Multimodal human behavior analysis: Learning correlation and interaction across modalities
Multimodal human behavior analysis is a challenging task due to the presence of complex nonlinear correlations and interactions across modalities. We present a novel approach to this problem based on Kernel Canonical Correlation Analysis (KCCA) and Multi-view Hidden Conditional Random Fields (MV-HCRF). Our approach uses a nonlinear kernel to map multimodal data to a high-dimensional feature space and finds a new projection of the data that maximizes the correlation across modalities. We use a multi-chain structured graphical model with disjoint sets of latent variables, one set per modality, to jointly learn both view-shared and view-specific sub-structures of the projected data, capturing interaction across modalities explicitly. We evaluate our approach on a task of agreement and disagreement recognition from nonverbal audio-visual cues using the Canal 9 dataset. Experimental results show that KCCA makes capturing nonlinear hidden dynamics easier and that MV-HCRF helps learn interactions across modalities.
Funding: United States. Office of Naval Research (Grant N000140910625); National Science Foundation (U.S.) (Grant IIS-1118018); National Science Foundation (U.S.) (Grant IIS-1018055); United States. Army Research, Development, and Engineering Command
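The kernelized projection step described in this abstract can be sketched with a minimal regularized kernel CCA in NumPy. The RBF kernel, the regularization constant, and the eigen-solver formulation below are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Pairwise squared distances -> Gaussian (RBF) kernel matrix.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def center_kernel(K):
    # Center the kernel matrix in feature space.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca(X, Y, gamma=1.0, reg=0.1, n_components=1):
    """Regularized kernel CCA sketch: find dual coefficients (alpha, beta)
    whose RKHS projections of the two views are maximally correlated."""
    n = X.shape[0]
    Kx = center_kernel(rbf_kernel(X, gamma))
    Ky = center_kernel(rbf_kernel(Y, gamma))
    I = np.eye(n)
    # (Kx + reg I)^-1 Ky (Ky + reg I)^-1 Kx alpha = rho^2 alpha
    M = np.linalg.solve(Kx + reg * I, Ky) @ np.linalg.solve(Ky + reg * I, Kx)
    vals, vecs = np.linalg.eig(M)
    order = np.argsort(-vals.real)[:n_components]
    alpha = vecs[:, order].real
    # beta follows from alpha up to scale.
    beta = np.linalg.solve(Ky + reg * I, Kx @ alpha)
    rho = np.sqrt(np.clip(vals.real[order], 0.0, 1.0))
    return alpha, beta, rho
```

With small regularization KCCA is prone to overfitting (correlations near 1 even for unrelated views), which is why `reg` is kept relatively large here.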
Multi-Zone Unit for Recurrent Neural Networks
Recurrent neural networks (RNNs) have been widely used to deal with sequence
learning problems. The input-dependent transition function, which folds new
observations into hidden states to sequentially construct fixed-length
representations of arbitrary-length sequences, plays a critical role in RNNs.
Because they compose information within a single space, transition functions in existing RNNs often have difficulty capturing complicated long-range dependencies. In this paper, we introduce a new Multi-zone Unit (MZU) for RNNs. The key idea is to design a transition function capable of modeling composition across multiple spaces. The MZU consists of three components: zone generation, zone composition, and zone aggregation. Experimental results on multiple datasets for the character-level language modeling task and the aspect-based sentiment analysis task demonstrate the superiority of the MZU.
Comment: Accepted at AAAI 2020
Generating Diverse Translation by Manipulating Multi-Head Attention
The Transformer model has been widely used for machine translation tasks and
has obtained state-of-the-art results. In this paper, we report an interesting
phenomenon in its encoder-decoder multi-head attention: different attention
heads of the final decoder layer align to different word translation
candidates. We empirically verify this discovery and propose a method to
generate diverse translations by manipulating heads. Furthermore, we make use
of these diverse translations with the back-translation technique for better
data augmentation. Experimental results show that our method generates diverse translations without a severe drop in translation quality. Experiments also show that back-translation with these diverse translations brings significant performance improvements on translation tasks. An auxiliary experiment on a conversation response generation task confirms the effect of diversity as well.
Comment: Accepted by AAAI 2020
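The head-manipulation idea can be illustrated with a toy sketch, assuming per-head attention distributions over source positions and a hypothetical per-source candidate score table; this is an illustration of the observation, not the paper's exact procedure:

```python
import numpy as np

def head_candidates(attn, cand_scores):
    """For each attention head, align to the source position it weights
    most and return that position's best-scoring target candidate.
    attn: (n_heads, src_len) attention distributions of one decoder step.
    cand_scores: (src_len, vocab) hypothetical per-source candidate table.
    Different heads pointing at different positions yield different
    translation candidates."""
    picks = []
    for head_weights in attn:
        src_pos = int(np.argmax(head_weights))
        picks.append(int(np.argmax(cand_scores[src_pos])))
    return picks

def force_head(attn, head):
    """Manipulate multi-head attention by broadcasting one head's
    distribution to all heads, biasing decoding toward that head's
    alignment (one simple way to realize the manipulation)."""
    return np.repeat(attn[head][None, :], attn.shape[0], axis=0)
```

Decoding once per forced head then yields a set of diverse candidate translations that can feed a back-translation pipeline.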