
    Video Highlight Prediction Using Audience Chat Reactions

    Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis. We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of the real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (will be released for further research), and demonstrate strong results on these using multimodal, character-level CNN-RNN model architectures. Comment: EMNLP 201

    Revealing Possible Truths Behind “Coolest Monkey in the Jungle”: Ideational Making Analysis Approach

    Multimodal Critical Discourse Analysis (MCDA), commonly defined as a discourse analysis approach that attends to both linguistic and non-linguistic resources, has become increasingly popular in the research area. It has been argued, on the one hand, that the greater practicality the approach offers compared with its predecessor, Critical Discourse Analysis, is the main reason researchers are gradually turning to this method for analysing discourse (Han, 2015). On the other hand, the growing trend towards multimodal communication, which no longer relies exclusively on speech or writing, provides a logical ground for it (Kress, 2011). This paper takes the H&M ‘monkey’ hoodie advertisement (refer to appendix) as a form of multimodal communication and employs MCDA to explore the possible advantages one can gain through the process. It begins with a brief account of the key factors in the development of MCDA. This is followed by a discussion of the analysis approach employed in the paper and the rationale for choosing it over others. The advertisement is analysed by drawing on related language and social theories to scrutinize the ideologies the company embedded in its advertisement. The paper closes with a discussion of what is understood through the analysis process and of how the practicality of multimodal analysis can be related to another social area such as pedagogy.

    Multimodal discourse strategies of factuality and subjectivity in educational digital storytelling

    As new technologies continue to emerge, students and lecturers are provided with new educational tools. One such tool, which is increasingly used in higher education, is digital storytelling, i.e. multi-media digital narratives. Despite the increasing attention that education and media scholars have paid to digital storytelling, there is scant research examining digital narratives from a discourse-analytic perspective. This paper addresses this gap in the literature and, in line with the belief that individuals make meaning through a range of semiotic devices, including, among others, language, sound, graphics and text, it aims to examine discourse strategies of factuality and subjectivity in historical-cultural digital narratives and their multimodal realisations (Kress & Van Leeuwen 2001; Patrona 2005). To carry out this study, a corpus of 16 digital stories was compiled and analysed within a multidisciplinary framework that draws on studies of digital storytelling, computer-mediated communication, media studies, and multimodal discourse analysis. Results show that students, as digital storytellers, resort to a number of varied multimodal discursive strategies which are constitutive of their identity as capable students in an educational setting.

    Using Visual Journals as a Reflective Worldview Window into Educator Identity

    This ethnographic case study and content analysis presents the conclusion of a three-year study involving 37 teacher candidate participants within a two-year (2 semester program) Bachelor of Education program at a university in Ontario, Canada. Each academic year, participants were intentionally given time over two semesters of literacy courses to engage in literacy practices and knowledge of self through the use of multimodal visual journals. Candidates reflect on their conceptions of literacy, teaching, identity and worldview within an institution grounded in the Christian faith. Findings, philosophical ponderings and content analysis suggest that the identity of the teacher candidate filters learning through visual and multimodal ways. The findings raise questions about the place of multimodal learning, self-reflection, faith and worldview in the learning process and in the identity formation of educators. We suggest that this study may inform current multimodal and visual literacy research while generating enriching discussions on how multimodal forms of literacy instruction may assist in worldview recognition and self-identity awareness. Keywords: Multiliteracies, visual journals, self-knowledge, worldview, identity, visual literacy, multimodal literacy, teacher education

    Follow-up question handling in the IMIX and Ritel systems: A comparative study

    One of the basic topics of question answering (QA) dialogue systems is how follow-up questions should be interpreted by a QA system. In this paper, we discuss our experience with the IMIX and Ritel systems, for both of which a follow-up question handling scheme has been developed and corpora have been collected. These two systems are each other's opposites in many respects: IMIX is multimodal, non-factoid, black-box QA, while Ritel is speech, factoid, keyword-based QA. Nevertheless, we show that they are quite comparable, and that it is fruitful to examine the similarities and differences. We look at how the systems are composed, and how real, non-expert users interact with the systems. We also provide comparisons with systems from the literature where possible, and indicate where open issues lie and in what areas existing systems may be improved. We conclude that most systems have a common architecture with a set of common subtasks, in particular detecting follow-up questions and finding referents for them. We characterise these tasks using the typical techniques used for performing them, and data from our corpora. We also identify a special type of follow-up question, the discourse question, which is asked when the user is trying to understand an answer, and propose some basic methods for handling it.

    Fusing Audio, Textual and Visual Features for Sentiment Analysis of News Videos

    This paper presents a novel approach to sentiment analysis of news videos, based on the fusion of audio, textual and visual clues extracted from their contents. The proposed approach aims to contribute to the semiodiscoursive study of the construction of the ethos (identity) of this media universe, which has become a central part of the modern-day lives of millions of people. To achieve this goal, we apply state-of-the-art computational methods for (1) automatic emotion recognition from facial expressions, (2) extraction of modulations in the participants' speech and (3) sentiment analysis of the closed captions associated with the videos of interest. More specifically, we compute features such as visual intensities of recognized emotions, field sizes of participants, voicing probability, sound loudness, speech fundamental frequencies and the sentiment scores (polarities) of text sentences in the closed captions. Experimental results on a dataset containing 520 annotated news videos from three Brazilian and one American popular TV newscasts show that our approach achieves an accuracy of up to 84% in the sentiment (tension level) classification task, demonstrating its high potential for use by media analysts in several applications, especially in the journalistic domain. Comment: 5 pages, 1 figure, International AAAI Conference on Web and Social Media

    Multimodal signs in (non)heteronormative discourse of transnational Hindi cinema: the case study of Hindi film Dostana

    This article conducts a detailed analysis of multimodal signifiers in the popular Hindi film Dostana (meaning friendship), with particular focus on the film’s (non)heteronormative and sexist system of signification. The signifiers that construct the film’s stereotypical worldview of gender and sexuality are analyzed following Lazar’s (2007) conception of feminist critical discourse analysis and Wodak’s (2001) framework of the Discourse Historical Approach, which proposes three simultaneously functioning aspects of discourse, i.e. immanent, diagnostic and prognostic. The multimodal signifiers in the film are analyzed within the Indo-Pakistani discursive context, where patriarchal discourse does not seem to allow any cognitive pattern or mental model other than heteronormativity and heterosexual love and romance. In such a discursive set-up, so-called deviant sexualities and gender roles struggle for voice, signifiers and representation. The prognostic critique of this article can be thought of as Positive Discourse Analysis (Martin, 2004), because the film’s text eventually offers some examples of how certain multimodal signs can be used to resist hegemonic patriarchal and heteronormative discourses that are considered common sense and natural by the mainstream Hindi film audience.

    Query-Based Summarization using Rhetorical Structure Theory

    Research on Question Answering is focused mainly on classifying the question type and finding the answer. Presenting the answer in a way that suits the user’s needs has received little attention. This paper shows how existing question answering systems, which aim at finding precise answers to questions, can be improved by exploiting summarization techniques to extract more than just the answer from the document in which the answer resides. This is done using a graph search algorithm that searches for relevant sentences in the discourse structure, which is represented as a graph. Rhetorical Structure Theory (RST) is used to create a graph representation of a text document. The output is an extensive answer, which not only answers the question but also gives the user an opportunity to assess the accuracy of the answer (is this what I am looking for?) and to find additional information that is related to the question and may satisfy an information need. This has been implemented in a working multimodal question answering system, where it operates with two independently developed question answering modules.