6 research outputs found
A Tool for Creating, Editing and Tracking Virtual SMIL Presentations
The ability to easily find, edit and re-use content adds significant value to that content. When the content consists of complex multimedia objects, both the difficulty of implementing such capabilities and the added value are multiplied. The work described here is based on an archive of SMIL presentations built and indexed using tools developed by the authors, which automatically synchronized digitized videos of lectures with their corresponding PowerPoint slides and recorded metadata about the lecture context, the origin of the video and slide files, and the temporal alignment information. As the archive grew, it became clear that users needed the ability to edit, update or customize existing presentations by deleting, adding or replacing specific slides without re-filming the entire lecture. This paper describes an application that provides such functionality, enabling the easy editing, repurposing and tracking of presentations for web-based distance learning.
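The slide-replacement operation described above can be sketched in a few lines; note that the SMIL structure (a `<par>` of one video and a `<seq>` of slide images) and all file names here are illustrative assumptions, not the authors' actual schema or tool:

```python
# Sketch: swapping one slide in a SMIL presentation without touching the
# video track or the timing information. The document layout below is a
# simplified assumption for illustration only.
import xml.etree.ElementTree as ET

SMIL = """<smil><body><par>
  <video src="lecture.mp4"/>
  <seq>
    <img src="slide1.png" dur="30s"/>
    <img src="slide2.png" dur="45s"/>
  </seq>
</par></body></smil>"""

def replace_slide(smil_text, old_src, new_src):
    """Return SMIL text with one slide image swapped for another."""
    root = ET.fromstring(smil_text)
    for img in root.iter("img"):
        if img.get("src") == old_src:
            img.set("src", new_src)  # the dur attribute is left untouched
    return ET.tostring(root, encoding="unicode")

updated = replace_slide(SMIL, "slide2.png", "slide2_revised.png")
```

Because only the `src` attribute changes, the temporal alignment recorded in the archive's metadata remains valid for the untouched slides.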
Content-based indexing of low resolution documents
In multimedia presentations, it has become increasingly popular for attendees to photograph slides that interest them using capture devices such as cameras or mobile phones. To make these captured images more useful, they can be linked to an image or video database, which can then serve file archiving, teaching and learning, research and knowledge management, all of which involve image search. However, images from such devices typically have low resolution, degraded further by poor lighting and noise. Content-Based Image Retrieval (CBIR) is considered among the most interesting and promising fields for image search, the task of finding images in a given database that are similar to a known query image. This thesis concerns methods for identifying documents captured with such devices, together with a technique for retrieving images from an indexed image database; both rely on digital image processing techniques. To build an index structure for fast, high-quality content-based retrieval, existing representative signatures and key indexing schemes have been reviewed, since retrieval performance depends heavily on how the indexing is done. Existing retrieval approaches make use of shape, colour and texture features; considered relative to individual databases, the majority of them perform poorly on low resolution documents, consume a lot of time, and in some cases return irrelevant images for a given query.

The identification and indexing method proposed in this thesis uses a Visual Signature (VS), which consists of graphical information about the captured slide's textual layout, shape moments, and the spatial distribution of colour. This signature-based approach supports fast, efficient matching to fulfil the needs of real-time applications, and it can overcome the problems of low resolution documents such as noisy images, varying lighting conditions and complex backgrounds. Hierarchical indexing techniques founded on trees and clustering are presented: K-means clustering is used for visual features such as colour, since their spatial distribution gives good global information about an image, while the extracted layout and shape features are structured hierarchically in a tree index, with Euclidean distance used to measure image similarity for CBIR. The proposed indexing scheme is assessed using recall and precision, the standard CBIR retrieval performance measures. A CBIR system was developed and various retrieval experiments were conducted with the fundamental aim of comparing retrieval accuracy. A new algorithm for use with integrated visual signatures, especially in late fusion queries, is introduced; it reduces the shortcomings associated with normalisation in early fusion techniques. Slides from conference, lecture and meeting presentations are used to compare the proposed technique's performance with that of existing approaches on real data. The findings of the thesis present exciting possibilities, as the CBIR system produces high quality results even for low resolution query documents. For future work, multimodal signatures, relevance feedback and artificial intelligence techniques are recommended for use in CBIR systems to further enhance performance.
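The clustering-plus-distance retrieval scheme in the abstract above can be illustrated with a minimal sketch, assuming toy 4-bin colour histograms in place of the thesis's actual Visual Signature; the file names, bin count and cluster count are invented for illustration:

```python
# Sketch of colour-signature indexing: images are reduced to coarse colour
# histograms, clustered with K-means so a query searches only its nearest
# cluster, and the candidates are ranked by Euclidean distance.
import math

def euclidean(a, b):
    """Euclidean distance between two signature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Plain K-means with deterministic initialisation (first k points)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            [sum(dim) / len(cl) for dim in zip(*cl)] if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

# Toy 4-bin colour signatures: two "dark" slides, two "bright" ones.
index = {
    "dark1.png":   [0.9, 0.1, 0.0, 0.0],
    "dark2.png":   [0.8, 0.2, 0.0, 0.0],
    "bright1.png": [0.0, 0.0, 0.2, 0.8],
    "bright2.png": [0.0, 0.1, 0.3, 0.6],
}
centroids = kmeans(list(index.values()), k=2)

def retrieve(query_sig, top=2):
    """Search only the cluster nearest the query, then rank by distance."""
    target = min(centroids, key=lambda c: euclidean(query_sig, c))
    candidates = [
        name for name, sig in index.items()
        if min(centroids, key=lambda c: euclidean(sig, c)) == target
    ]
    candidates.sort(key=lambda name: euclidean(index[name], query_sig))
    return candidates[:top]

results = retrieve([0.85, 0.15, 0.0, 0.0])  # a dark query signature
```

Restricting the distance computation to one cluster is what makes the signature-based search fast enough for the real-time use the thesis targets; recall and precision would then be measured over such ranked result lists.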
Methods and Technologies for Digital Preservation of Analog Audio Collections
This paper considers some of the challenges libraries and other institutions face in digitizing analog audio collections for preservation and access. The intent of this study is to demonstrate that an existing analog audio collection can be digitized, described, stored, and preserved using currently available and emerging technologies. Consideration was given to a broad range of topics related to the creation of enduring digital audio collections. It is hoped that this effort will illuminate possible paths to successful preservation of audio collections while also vastly improving access to them. Furthermore, the study demonstrates that libraries and other institutions possessing similar multimedia assets may have access to effective and economical means of transforming isolated and moribund materials into persistent, valuable, and widely available multimedia documents.
Annotation of multimedia learning materials for semantic search
Multimedia is the main source of online learning materials, such as videos, slides and textbooks, and its volume is growing with the popularity of online programs offered by universities and Massive Open Online Courses (MOOCs). The increasing amount of multimedia learning resources available online makes it very challenging to browse through the materials or find where a specific concept of interest is covered. To enable semantic search of lecture materials, their content must be annotated and indexed. Manual annotation of learning materials such as videos is tedious and cannot be envisioned for the growing quantity of online materials. One of the most commonly used methods for annotating learning videos is to index the video based on the transcript obtained by translating its audio track into text; however, existing speech-to-text translators require extensive training, especially for non-native English speakers, and are known to have low accuracy.
This dissertation proposes instead to index the slides based on keywords. The keywords extracted from the textbook index and the presentation slides are the basis of the indexing scheme. Two types of lecture videos are generally used (classroom recordings made with a regular camera, or slide-presentation screen captures made with specific software), and their quality varies widely. Screen-capture videos generally have good quality and sometimes come with metadata, but the metadata is often unreliable, so image processing techniques are used to segment the videos. Since lecture videos have a static slide background, detecting shot boundaries is challenging. A comparative analysis of state-of-the-art techniques is presented to determine the feature descriptors best suited to detecting transitions in a lecture video. The videos are indexed with keywords obtained from the slides, and a correspondence is established by segmenting the video temporally, using feature descriptors to match and align the video segments with the presentation slides converted into images. Classroom recordings made with regular video cameras often have poor illumination, with objects partially or totally occluded; for such videos, slide localization techniques based on segmentation and heuristics are presented to improve the accuracy of transition detection.
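The transition-detection step above can be sketched with a simple histogram-difference comparison between consecutive frames; the 4-bin histograms and the threshold are illustrative stand-ins for the richer feature descriptors the dissertation actually compares:

```python
# Minimal sketch of slide-transition detection in a lecture video:
# consecutive frames are compared with a histogram-difference score, and a
# slide change is declared when the score exceeds a threshold.
def hist_diff(h1, h2):
    """L1 distance between two normalised histograms (0 = identical)."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 2

def detect_transitions(frame_hists, threshold=0.3):
    """Return indices where frame i differs sharply from frame i-1."""
    return [
        i for i in range(1, len(frame_hists))
        if hist_diff(frame_hists[i - 1], frame_hists[i]) > threshold
    ]

# Three near-identical "slide 1" frames, then a cut to "slide 2".
frames = [
    [0.70, 0.20, 0.10, 0.00],
    [0.70, 0.20, 0.10, 0.00],
    [0.69, 0.21, 0.10, 0.00],   # minor noise, below threshold
    [0.10, 0.10, 0.30, 0.50],   # slide change here
    [0.10, 0.10, 0.30, 0.50],
]
cuts = detect_transitions(frames)
```

Because the slide background is static, the score stays near zero within a slide and spikes only at a transition, which is exactly why thresholding works better here than in general-purpose shot detection.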
A region-prioritized ranking mechanism is proposed that integrates the location of a keyword within the presentation into the ranking of slides when searching for a slide that covers that keyword; this helps return the most relevant results first. With the increasing amount of course material gathered online, a user looking to understand a given concept can become overwhelmed, and the standard "one size fits all" approach is no longer the best way for millennials to learn. Personalized concept recommendation based on the user's background knowledge is therefore presented.
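The region-prioritized idea can be sketched as a weighted scoring function, where a keyword hit counts more in a prominent slide region than in the body or footer; the region names and weights below are illustrative assumptions, not the dissertation's actual parameters:

```python
# Sketch of region-prioritized ranking: each slide is scored by summing
# weights over the regions in which the query keyword appears, so a hit in
# the title outranks a hit buried in the body text.
REGION_WEIGHT = {"title": 3.0, "body": 1.0, "footer": 0.2}  # assumed weights

def score_slide(slide, keyword):
    """Sum region weights over every region whose text contains the keyword."""
    return sum(
        REGION_WEIGHT[region]
        for region, text in slide["regions"].items()
        if keyword in text.lower()
    )

def rank_slides(slides, keyword):
    """Return slide ids ordered so the most relevant hits come first."""
    hits = [(score_slide(s, keyword), s["id"]) for s in slides]
    return [sid for score, sid in sorted(hits, reverse=True) if score > 0]

slides = [
    {"id": 3, "regions": {"title": "Summary",
                          "body": "recap of b-trees",
                          "footer": "cs101"}},
    {"id": 1, "regions": {"title": "B-Trees",
                          "body": "insertion and split",
                          "footer": "cs101"}},
]
ranking = rank_slides(slides, "b-tree")
```

Here the slide whose title covers the keyword is ranked above the slide that merely mentions it in passing, which is the behaviour the ranking mechanism is designed to produce.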
Finally, the contributions of this dissertation have been integrated into the Ultimate Course Search (UCS), a tool for effective search of course materials. UCS integrates presentations, lecture videos and textbook content into a single platform with topic-based search capabilities and easy navigation of lecture materials.
ADVISE: advanced digital video information segmentation engine.
by Chung-Wing Ng. Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. Includes bibliographical references (leaves 100-107). Abstracts in English and Chinese.

Table of contents:
- Chapter 1: Introduction (Image-based Video Description; Video Summary; Video Matching; Contributions; Outline of Thesis)
- Chapter 2: Literature Review (Video Retrieval in Digital Video Libraries: the VISION Project, the INFORMEDIA Project, Discussion; Video Structuring: Video Segmentation, Color Histogram Extraction, Further Structuring; XML Technologies: XML Syntax, Document Type Definition (DTD), Extensible Stylesheet Language (XSL); SMIL Technology: SMIL Syntax, Model of SMIL Applications)
- Chapter 3: Overview of ADVISE (Objectives; System Architecture: Video Preprocessing Module, Web-based Video Retrieval Module, Video Streaming Server; Summary)
- Chapter 4: Construction of Video Table-of-Contents (V-ToC) (Video Structuring: Terms and Definitions, Regional Color Histograms, Video Shot Boundary Detection, Video Group Formation, Video Scene Formation; Storage and Presentation: Definition of XML Video Structure, V-ToC Presentation Using XSL; Evaluation of Video Structure)
- Chapter 5: Video Summarization (Terms and Definitions; Video Features Used for Summarization; Video Summarization Algorithm: Combining, Scoring and Selecting Extracted Video Segments, Refining the Selection Result; Video Summary in SMIL; Evaluations: Percentages of Features Extracted, Evaluation of the Refinement Process)
- Chapter 6: Video Matching Using V-ToC (Terms and Definitions; Video Features Used for Matching; Non-ordered Tree Matching Algorithm; Ordered Tree Matching Algorithms; Evaluation of Video Matching: Applying Non-ordered Tree Matching, Applying Ordered Tree Matching)
- Chapter 7: Conclusion
- Bibliography