Analysis and Visualization of Index Words from Audio Transcripts of Instructional Videos
We introduce new techniques for extracting, analyzing, and visualizing
textual contents from instructional videos of low production quality. Using
Automatic Speech Recognition, approximate transcripts (≈75% Word Error Rate)
are obtained from the originally highly compressed videos of university
courses, each comprising between 10 and 30 lectures. Text material in the form
of books or papers that accompanies the course is then used to filter meaningful
phrases from the seemingly incoherent transcripts. The resulting index into the
transcripts is tied together and visualized in 3 experimental graphs that help
in understanding the overall course structure and provide a tool for localizing
certain topics for indexing. We specifically discuss a Transcript Index Map,
which graphically lays out key phrases for a course, a Textbook Chapter to
Transcript Match, and finally a Lecture Transcript Similarity graph, which
clusters semantically similar lectures. We test our methods and tools on 7 full
courses with 230 hours of video and 273 transcripts. We are able to extract up
to 98 unique key terms for a given transcript and up to 347 unique key terms
for an entire course. The accuracy of the Textbook Chapter to Transcript Match
exceeds 70% on average. The methods used can be applied to genres of video in
which there are recurrent thematic words (news, sports, meetings, ...).
Comment: 2004 IEEE International Workshop on Multimedia Content-based Analysis
and Retrieval; 20 pages, 8 figures, 7 tables
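The filtering step described in the abstract above (using course text material to pick meaningful phrases out of a noisy transcript) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tiny `STOPWORDS` set, the n-gram length, and the `min_count` threshold are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "is", "are", "so", "uh", "um"}

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def ngrams(tokens, n):
    """Return all n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def textbook_phrases(textbook_text, max_n=2, min_count=2):
    """Collect phrases that recur in the textbook and are not pure stopwords."""
    tokens = tokenize(textbook_text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return {
        phrase for phrase, count in counts.items()
        if count >= min_count
        and not all(word in STOPWORDS for word in phrase.split())
    }

def index_transcript(transcript_text, phrases, max_n=2):
    """Map each textbook phrase found in the transcript to its token positions."""
    tokens = tokenize(transcript_text)
    index = {}
    for n in range(1, max_n + 1):
        for i, phrase in enumerate(ngrams(tokens, n)):
            if phrase in phrases:
                index.setdefault(phrase, []).append(i)
    return index

textbook = ("Binary search trees store keys. Binary search trees support lookup. "
            "A hash table stores keys.")
transcript = "so the binary search tree uh binary search trees are um hash cable"
index = index_transcript(transcript, textbook_phrases(textbook))
```

Under these assumptions, words that the recognizer garbled or that never recur in the textbook simply fail to match, which is why even a high-error transcript can still yield a usable index.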
Teachers’ Understanding and Usage of Scientific Data Visualizations for Teaching Topics in Earth and Space Science
Scientific data visualizations are the products, and increasingly a core practice, of modern computational science across all domains. With recent science education standards emphasizing student engagement in scientific practices, these visualizations will only become more available and more widely used in K-12 science instruction. But teacher practice is key to successful learning outcomes with this, and any, educational technology. This study follows eleven science teachers from initial exposure in a professional development (PD) program through classroom use of scientific data visualizations that address topics in Earth and Space science. The framework of technological pedagogical content knowledge (TPCK) is used to examine key dimensions of teacher knowledge that are activated as teachers seek to understand the data visualizations and the conceptual models they represent, select and integrate them into their curriculum, and ultimately use them for instruction. Baseline levels of select dimensions of TPCK are measured for all teachers. Two representative case studies allow for a deep analysis of TPCK in action throughout the teachers’ professional and instructional experience. Finally, the impact of the experience on teachers’ knowledge is examined, with implications for educative curricular material and PD program design.
Video Question Answering on Screencast Tutorials
This paper presents a new video question answering task on screencast
tutorials. We introduce a dataset of question, answer, and context
triples from the tutorial videos for a software product. Unlike other video question
answering works, all the answers in our dataset are grounded to the domain
knowledge base. A one-shot recognition algorithm is designed to extract the
visual cues, which helps enhance the performance of video question answering.
We also propose several baseline neural network architectures based on various
aspects of video contexts from the dataset. The experimental results
demonstrate that our proposed models significantly improve question
answering performance by incorporating multi-modal contexts and domain
knowledge.
Generative Disco: Text-to-Video Generation for Music Visualization
Visuals are a core part of our experience of music, owing to the way they can
amplify the emotions and messages conveyed through the music. However, creating
music visualization is a complex, time-consuming, and resource-intensive
process. We introduce Generative Disco, a generative AI system that helps
generate music visualizations with large language models and text-to-image
models. Users select intervals of music to visualize and then parameterize that
visualization by defining start and end prompts. The system interpolates
between these prompts and generates frames in time with the beat of the music,
producing audioreactive video. We introduce design patterns for improving generated videos:
"transitions", which express shifts in color, time, subject, or style, and
"holds", which encourage visual emphasis and consistency. A study with
professionals showed that the system was enjoyable, easy to explore, and highly
expressive. We conclude with use cases of Generative Disco for professionals and
a discussion of how AI-generated content is changing the landscape of creative work.
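The idea of moving from a start prompt to an end prompt in time with the beat can be sketched as follows. This is an illustrative assumption about how such a schedule might be built, not the Generative Disco implementation: `beat_weights`, `lerp`, the `pulse` factor, and the use of plain lists as stand-ins for prompt embeddings are all hypothetical.

```python
def beat_weights(start, end, beats, fps=24, pulse=5.0):
    """Return one interpolation weight in [0, 1] per frame of the interval.

    Frames that contain a beat receive `pulse` times as much of the
    transition as other frames, so visual change concentrates on beats.
    """
    n_frames = max(2, round((end - start) * fps))
    energy = [1.0] * n_frames
    for beat in beats:
        if start <= beat < end:
            frame = min(n_frames - 1, int((beat - start) * fps))
            energy[frame] = pulse
    # The first frame shows the start prompt exactly, so it gets weight 0.
    steps = energy[1:]
    total = sum(steps)
    weights, accumulated = [0.0], 0.0
    for step in steps:
        accumulated += step
        weights.append(accumulated / total)
    return weights

def lerp(a, b, t):
    """Linearly blend two equal-length vectors (stand-ins for prompt embeddings)."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

# One-second interval at 4 frames per second with a single beat at 0.5 s.
weights = beat_weights(0.0, 1.0, beats=[0.5], fps=4)
frames = [lerp([0.0, 0.0], [2.0, 4.0], w) for w in weights]
```

In this sketch each frame's blended vector would be handed to an image generator; the weights rise monotonically from 0 to 1, with the largest jump landing on the frame that contains the beat.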
CONTENT BASED RETRIEVAL OF LECTURE VIDEO REPOSITORY: LITERATURE REVIEW
Multimedia plays a significant role in communicating information, and a large number of multimedia repositories support the browsing, retrieval, and delivery of video content. In higher education, video holds considerable promise as a tool for learning and teaching through multimedia applications. Many universities adopt educational systems in which the teacher's lecture is video recorded and made available to students with minimal post-processing effort. Since each video may cover many subjects, it is critical for an e-Learning environment to have content-based video searching capabilities to meet diverse individual learning needs. The present paper reviews 120+ core research articles on the content-based retrieval of lecture video repositories hosted in the cloud by government academic and research organizations of India.
A Closer Look into Recent Video-based Learning Research: A Comprehensive Review of Video Characteristics, Tools, Technologies, and Learning Effectiveness
People increasingly use videos on the Web as a source for learning. To
support this way of learning, researchers and developers are continuously
developing tools, proposing guidelines, analyzing data, and conducting
experiments. However, it is still not clear what characteristics a video should
have to be an effective learning medium. In this paper, we present a
comprehensive review of 257 articles on video-based learning for the period
from 2016 to 2021. One of the aims of the review is to identify the video
characteristics that have been explored by previous work. Based on our
analysis, we suggest a taxonomy which organizes the video characteristics and
contextual aspects into eight categories: (1) audio features, (2) visual
features, (3) textual features, (4) instructor behavior, (5) learner
activities, (6) interactive features (quizzes, etc.), (7) production style, and
(8) instructional design. Also, we identify four representative research
directions: (1) proposals of tools to support video-based learning, (2) studies
with controlled experiments, (3) data analysis studies, and (4) proposals of
design guidelines for learning videos. We find that the most explored
characteristics are textual features followed by visual features, learner
activities, and interactive features. Text of transcripts, video frames, and
images (figures and illustrations) are most frequently used by tools that
support learning through videos. The learner activity is heavily explored
through log files in data analysis studies, and interactive features have been
frequently scrutinized in controlled experiments. We complement our review by
contrasting research findings on the impact of video characteristics on
learning effectiveness, reporting on the tasks and technologies used to
develop tools that support learning, and summarizing trends in design
guidelines for producing learning videos.