    Multimodal news article analysis

    The intersection of Computer Vision and Natural Language Processing has been a hot topic of research in recent years, with results that were unthinkable only a few years ago. In view of this progress, we want to highlight online news articles as a potential next step for this area of research. The rich interrelations of text, tags, images or videos, as well as a vast corpus of general knowledge, are an exciting benchmark for high-capacity models such as deep neural networks. In this paper we present a series of tasks and baseline approaches to leverage corpora such as the BreakingNews dataset. Peer Reviewed. Postprint (author's final draft).

    Using Visual Journals as a Reflective Worldview Window into Educator Identity

    This ethnographic case study and content analysis presents the conclusions of a three-year study involving 37 teacher candidates in a two-year (two semesters per year) Bachelor of Education program at a university in Ontario, Canada. Each academic year, participants were intentionally given time over two semesters of literacy courses to engage in literacy practices and knowledge of self through the use of multimodal visual journals. Candidates reflected on their conceptions of literacy, teaching, identity and worldview within an institution grounded in the Christian faith. The findings, philosophical ponderings and content analysis suggest that the identity of the teacher candidate filters learning through visual and multimodal ways. The findings raise questions about the place of multimodal learning, self-reflection, faith and worldview in the learning process and in the identity formation of educators. We suggest that this study may inform current multimodal and visual literacy research while generating enriching discussions on how multimodal forms of literacy instruction may assist in the recognition of worldview and awareness of self-identity. Keywords: multiliteracies, visual journals, self-knowledge, worldview, identity, visual literacy, multimodal literacy, teacher education

    Video Highlight Prediction Using Audience Chat Reactions

    Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis. We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (to be released for further research), and demonstrate strong results on it using multimodal, character-level CNN-RNN model architectures. Comment: EMNLP 2017
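    The character-level CNN-RNN idea mentioned in the abstract can be sketched in miniature: characters are embedded, a 1-D convolution extracts local n-gram features, and a recurrent layer summarizes the sequence into a highlight score. The sketch below uses untrained random weights and assumed dimensions purely to show the data flow; the paper's actual architecture, vocabulary, and training details may differ.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    VOCAB = 128          # byte-level "characters" (an assumption for this sketch)
    EMB, CONV, HID = 16, 32, 24
    KERNEL = 3           # width of the character convolution window

    # Randomly initialized parameters (illustrative only, untrained)
    E = rng.normal(0, 0.1, (VOCAB, EMB))           # character embeddings
    Wc = rng.normal(0, 0.1, (CONV, KERNEL * EMB))  # 1-D conv filters
    Wx = rng.normal(0, 0.1, (HID, CONV))           # RNN input weights
    Wh = rng.normal(0, 0.1, (HID, HID))            # RNN recurrent weights
    wo = rng.normal(0, 0.1, HID)                   # output weights

    def highlight_score(text: str) -> float:
        """Score a chat message: char embedding -> 1-D CNN -> simple RNN -> sigmoid."""
        ids = [min(ord(c), VOCAB - 1) for c in text]
        x = E[ids]                                          # (T, EMB)
        # 1-D convolution over character windows, ReLU activation
        windows = [x[t:t + KERNEL].ravel() for t in range(len(ids) - KERNEL + 1)]
        feats = np.maximum(0, np.array(windows) @ Wc.T)     # (T', CONV)
        # Simple (Elman) RNN over the convolutional features
        h = np.zeros(HID)
        for f in feats:
            h = np.tanh(Wx @ f + Wh @ h)
        return float(1 / (1 + np.exp(-wo @ h)))             # probability-like score

    score = highlight_score("gg wp Kappa")
    ```

    In the full model this text branch would be trained jointly with a visual branch and fused before the final prediction.
    
    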

    Multimodal Polynomial Fusion for Detecting Driver Distraction

    Distracted driving is deadly, claiming 3,477 lives in the U.S. in 2015 alone. Although there has been a considerable amount of research on modeling the distracted behavior of drivers under various conditions, accurate automatic detection using multiple modalities, and especially the contribution of the speech modality to improved accuracy, has received little attention. This paper introduces a new multimodal dataset for distracted driving behavior and discusses automatic distraction detection using features from three modalities: facial expression, speech and car signals. Detailed multimodal feature analysis shows that adding more modalities monotonically increases the predictive accuracy of the model. Finally, a simple and effective multimodal fusion technique using a polynomial fusion layer shows superior distraction detection results compared to the baseline SVM and neural network models. Comment: INTERSPEECH 2018
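    A polynomial fusion layer, loosely speaking, combines modality representations not just additively but through low-degree product terms, so the model can capture interactions between, say, facial and speech features. The sketch below is one simplified reading of that idea (degree ≤ 2 terms, element-wise products, assumed dimensions, untrained random weights); the paper's exact formulation may differ.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    D, OUT = 4, 8                                   # per-modality and fused dims (illustrative)

    def polynomial_fusion(face, speech, car, W):
        """Fuse three modality vectors with degree-<=2 polynomial terms
        (a simplified sketch, not the paper's exact layer)."""
        mods = [face, speech, car]
        terms = [np.ones(1)] + mods                 # bias (degree 0) and linear (degree 1)
        for i in range(len(mods)):                  # degree-2 terms: element-wise
            for j in range(i, len(mods)):           # products of modality pairs
                terms.append(mods[i] * mods[j])
        z = np.concatenate(terms)                   # length 1 + 3*D + 6*D
        return np.tanh(W @ z)                       # fused representation

    # Untrained random weights, just to show the shapes involved
    W = rng.normal(0, 0.1, (OUT, 1 + 3 * D + 6 * D))
    fused = polynomial_fusion(rng.normal(size=D), rng.normal(size=D),
                              rng.normal(size=D), W)
    ```

    The fused vector would then feed a small classifier head; the monotonic gain from adding modalities reported in the abstract corresponds to appending more linear and cross-product terms to `z`.
    
    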