    Multimodal news article analysis

    The intersection of Computer Vision and Natural Language Processing has been a hot topic of research in recent years, with results that were unthinkable only a few years ago. In view of this progress, we want to highlight online news articles as a potential next step for this area of research. The rich interrelations of text, tags, images or videos, as well as a vast corpus of general knowledge, are an exciting benchmark for high-capacity models such as deep neural networks. In this paper we present a series of tasks and baseline approaches to leverage corpora such as the BreakingNews dataset. Peer Reviewed. Postprint (author's final draft).

    Using Visual Journals as a Reflective Worldview Window into Educator Identity

    This ethnographic case study and content analysis presents the conclusions of a three-year study involving 37 teacher candidates in a two-year (two semesters per year) Bachelor of Education program at a university in Ontario, Canada. Each academic year, participants were intentionally given time over two semesters of literacy courses to engage in literacy practices and knowledge of self through the use of multimodal visual journals. Candidates reflected on their conceptions of literacy, teaching, identity and worldview within an institution grounded in the Christian faith. The findings, philosophical ponderings and content analysis suggest that the identity of the teacher candidate filters learning through visual and multimodal ways. The findings raise questions about the place of multimodal learning, self-reflection, faith and worldview in the learning process and in the identity formation of educators. We suggest that this study may inform current multimodal and visual literacy research while generating enriching discussions on how multimodal forms of literacy instruction may assist in the recognition of worldview and awareness of self-identity. Keywords: multiliteracies, visual journals, self-knowledge, worldview, identity, visual literacy, multimodal literacy, teacher education

    Video Highlight Prediction Using Audience Chat Reactions

    Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis. We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (to be released for further research), and demonstrate strong results on it using multimodal, character-level CNN-RNN model architectures. Comment: EMNLP 2017
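    The character-level CNN-RNN idea mentioned in the abstract can be sketched in miniature: characters are embedded, a 1-D convolution extracts local n-gram features, and a recurrent layer summarizes the sequence into a highlight score. The sketch below uses untrained random weights and assumed dimensions purely to show the data flow; the paper's actual architecture, vocabulary, and training details may differ.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    VOCAB = 128          # byte-level "characters" (an assumption for this sketch)
    EMB, CONV, HID = 16, 32, 24
    KERNEL = 3           # width of the character convolution window

    # Randomly initialized parameters (illustrative only, untrained)
    E = rng.normal(0, 0.1, (VOCAB, EMB))           # character embeddings
    Wc = rng.normal(0, 0.1, (CONV, KERNEL * EMB))  # 1-D conv filters
    Wx = rng.normal(0, 0.1, (HID, CONV))           # RNN input weights
    Wh = rng.normal(0, 0.1, (HID, HID))            # RNN recurrent weights
    wo = rng.normal(0, 0.1, HID)                   # output weights

    def highlight_score(text: str) -> float:
        """Score a chat message: char embedding -> 1-D CNN -> simple RNN -> sigmoid."""
        ids = [min(ord(c), VOCAB - 1) for c in text]
        x = E[ids]                                          # (T, EMB)
        # 1-D convolution over character windows, ReLU activation
        windows = [x[t:t + KERNEL].ravel() for t in range(len(ids) - KERNEL + 1)]
        feats = np.maximum(0, np.array(windows) @ Wc.T)     # (T', CONV)
        # Simple (Elman) RNN over the convolutional features
        h = np.zeros(HID)
        for f in feats:
            h = np.tanh(Wx @ f + Wh @ h)
        return float(1 / (1 + np.exp(-wo @ h)))             # probability-like score

    score = highlight_score("gg wp Kappa")
    ```

    In the full model this text branch would be trained jointly with a visual branch and fused before the final prediction.
    
    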

    Multimodal Polynomial Fusion for Detecting Driver Distraction

    Distracted driving is deadly, claiming 3,477 lives in the U.S. in 2015 alone. Although there has been a considerable amount of research on modeling the distracted behavior of drivers under various conditions, accurate automatic detection using multiple modalities, and especially the contribution of the speech modality to improved accuracy, has received little attention. This paper introduces a new multimodal dataset for distracted driving behavior and discusses automatic distraction detection using features from three modalities: facial expression, speech and car signals. Detailed multimodal feature analysis shows that adding more modalities monotonically increases the predictive accuracy of the model. Finally, a simple and effective multimodal fusion technique using a polynomial fusion layer shows superior distraction detection results compared to the baseline SVM and neural network models. Comment: INTERSPEECH 2018
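    A polynomial fusion layer, loosely speaking, combines modality representations not just additively but through low-degree product terms, so the model can capture interactions between, say, facial and speech features. The sketch below is one simplified reading of that idea (degree ≤ 2 terms, element-wise products, assumed dimensions, untrained random weights); the paper's exact formulation may differ.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    D, OUT = 4, 8                                   # per-modality and fused dims (illustrative)

    def polynomial_fusion(face, speech, car, W):
        """Fuse three modality vectors with degree-<=2 polynomial terms
        (a simplified sketch, not the paper's exact layer)."""
        mods = [face, speech, car]
        terms = [np.ones(1)] + mods                 # bias (degree 0) and linear (degree 1)
        for i in range(len(mods)):                  # degree-2 terms: element-wise
            for j in range(i, len(mods)):           # products of modality pairs
                terms.append(mods[i] * mods[j])
        z = np.concatenate(terms)                   # length 1 + 3*D + 6*D
        return np.tanh(W @ z)                       # fused representation

    # Untrained random weights, just to show the shapes involved
    W = rng.normal(0, 0.1, (OUT, 1 + 3 * D + 6 * D))
    fused = polynomial_fusion(rng.normal(size=D), rng.normal(size=D),
                              rng.normal(size=D), W)
    ```

    The fused vector would then feed a small classifier head; the monotonic gain from adding modalities reported in the abstract corresponds to appending more linear and cross-product terms to `z`.
    
    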