379,497 research outputs found
Multimodal news article analysis
The intersection of Computer Vision and Natural Language Processing has been a hot topic of research in recent years, with results that were unthinkable only a few years ago. In view of this progress, we want to highlight online news articles as a potential next step for this area of research. The rich interrelations of text, tags, images or videos, as well as a vast corpus of general knowledge are an exciting benchmark for high-capacity models such as the deep neural networks. In this paper we present a series of tasks and baseline approaches to leverage corpus such as the BreakingNews dataset.Peer ReviewedPostprint (author's final draft
Using Visual Journals as a Reflective Worldview Window into Educator Identity
This ethnographic case study research and content analysis presents the conclusion of a three-year study involving 37 teacher candidate participants across a three-year study within a two year (2 semester program) Bachelor of Education program at a university in Ontario, Canada. Each academic year participants were intentionally given time over two semesters of literacy courses to engage in literacy practices and knowledge of self through the use of multimodal visual journals. Candidates reflect on their conceptions of literacy, teaching, identity and worldview within an institution grounded in the Christian faith. Findings, philosophical ponderings and content analysis suggest that the identity of the teacher candidate filters learning through visual and multimodal ways. The findings raise questions about the place of multimodal learning, self-reflection, faith and worldview in the learning process, and in identity formation of educators. We suggest that this study may inform current multimodal and visual literacy research while generating enriching discussions on how multimodal forms of literacy instruction may assist in acknowledgement of worldview recognition and self-identity awareness.
Keywords: Multiliteracies, visual journals, self-knowledge, worldview, identity, visual literacy, multimodal literacy, teacher educatio
Video Highlight Prediction Using Audience Chat Reactions
Sports channel video portals offer an exciting domain for research on
multimodal, multilingual analysis. We present methods addressing the problem of
automatic video highlight prediction based on joint visual features and textual
analysis of the real-world audience discourse with complex slang, in both
English and traditional Chinese. We present a novel dataset based on League of
Legends championships recorded from North American and Taiwanese Twitch.tv
channels (will be released for further research), and demonstrate strong
results on these using multimodal, character-level CNN-RNN model architectures.Comment: EMNLP 201
Multimodal Polynomial Fusion for Detecting Driver Distraction
Distracted driving is deadly, claiming 3,477 lives in the U.S. in 2015 alone.
Although there has been a considerable amount of research on modeling the
distracted behavior of drivers under various conditions, accurate automatic
detection using multiple modalities and especially the contribution of using
the speech modality to improve accuracy has received little attention. This
paper introduces a new multimodal dataset for distracted driving behavior and
discusses automatic distraction detection using features from three modalities:
facial expression, speech and car signals. Detailed multimodal feature analysis
shows that adding more modalities monotonically increases the predictive
accuracy of the model. Finally, a simple and effective multimodal fusion
technique using a polynomial fusion layer shows superior distraction detection
results compared to the baseline SVM and neural network models.Comment: INTERSPEECH 201
- …
