Language as the Medium: Multimodal Video Classification through text only

Authors: Hanu Laura, Thewlis James, Verő Anita L.
Publication date: 19/09/2023

Despite an exciting new wave of multimodal machine learning models, current approaches still struggle to interpret the complex contextual relationships between the different modalities present in videos. Going beyond existing methods that emphasize simple activities or objects, we propose a new model-agnostic approach for generating detailed textual descriptions that capture multimodal video information. Our method leverages the extensive knowledge learnt by large language models, such as GPT-3.5 or Llama2, to reason about textual descriptions of the visual and aural modalities, obtained from BLIP-2, Whisper and ImageBind. Without needing additional finetuning of video-text models or datasets, we demonstrate that available LLMs have the ability to use these multimodal textual descriptions as proxies for "sight" or "hearing" and perform zero-shot multimodal classification of videos in-context. Our evaluations on popular action recognition benchmarks, such as UCF-101 or Kinetics, show these context-rich descriptions can be successfully used in video understanding tasks. This method points towards a promising new research direction in multimodal classification, demonstrating how an interplay between textual, visual and auditory machine learning models can enable more holistic video understanding. Comment: Accepted at "What is Next in Multimodal Foundation Models?" (MMFM) workshop at ICCV 202

arXiv.org e-Print Archive

The virtual path to academic transition: enabling international students to begin their transition to university study before they arrive

Author: Watson Julie

Institutions receiving international students for postgraduate study are now committing time and energy to the development of online transition resources to enable students to prepare for the demands of a different academic culture before they arrive. Important questions underlying such initiatives are identifying what kind of digital resources will both engage international students and be of most use to them in preparing for this transition, and how to effectively reach students. Current institutional initiatives are taking several forms. A popular model is to offer browsable advice/tips or FAQs about life and study at a particular institution together with, for example, video clips of other international students describing their experiences there. These may be open and web-hosted or accessible through a password protected area on an institutional website or VLE. Less commonly found are video and other media embedded in learning resources developed in the form of ‘learning objects’ which have been designed to offer key information through structured interactive learning activities supported with answers and feedback. Importantly, these also offer opportunities for language improvement at the same time since they are supported by help, feedback and transcripts. This case study focuses on a project to develop and deliver a pre-arrival online course of interactive learning resources for all incoming international students to one UK institution. Building on five years of experience in delivering pre-arrival, tutored online courses to pre-sessional course international students, the project team developed institution-specific learning objects and incorporated open resources from the website, ‘Prepare for Success’, developed by the same institution. The project seeks to deliver a self-access online course with three strands to it to address students’ concerns and needs. 
These are to prepare international students for the location in which they will be living and studying (the city of Southampton - its key features and amenities); to introduce them to practical aspects of British life and culture (e.g. setting up a bank account, shopping in a UK supermarket); and to familiarise them with key study skills and other aspects of UK academic culture which may present challenges for them (e.g. academic writing conventions; dealing with course reading lists). This paper will be of value to institutions embarking on similar ventures. It will describe the rationale for the online course; refer to the pedagogic approach taken; showcase course content, and report on the first phase of its delivery which begins in late spring 2011.

Southampton (e-Prints Soton)

JuxtaLearn D3.2 Performance Framework

Authors: Adams Anne, Goldsmith Rick, Hartnett Elizabeth, José Rui, Malzahn Nils
Publication venue: The Open University
Publication date: 01/01/2013

This deliverable, D3.2 for Work Package 3, incorporating the pedagogy from WP2 and the orchestration factors mapped in D3.1, reviews aspects of performance in the context of participative video making. It reviews literature on curiosity and engagement characteristics of interaction mechanisms for public displays and anticipates requirements for social network analysis of relevant public videos from WP6 task 6.3. Thus, to support JuxtaLearn performance, it proposes a reflective performance framework that encompasses the material environment and objects required, the participants, and the knowledge needed.

Open Research Online (The Open University)

Language Learning and Interactive TV

Author: Underwood Joshua
Publication date: 01/01/2002

The integration of engaging TV-style content with the individualization and ‘intelligent’ content management offered by techniques from AI has the potential to provide learning environments that are both highly motivating and educationally sound. This paper describes why language learning would be a particularly appropriate domain for interactive educational television to focus on. It also indicates some of the criteria to be fulfilled in order to provide optimal language learning conditions and how these might be satisfied using TV/Film content and techniques from AIED.

UCL Discovery