A framework for automated landmark recognition in community contributed image corpora
Any large library of information requires efficient ways to organise it and methods that allow people to access information efficiently, and collections of digital images are no exception. Automatically creating high-level semantic tags based on image content is difficult, if not impossible, to achieve accurately. In this thesis, a framework is presented that allows for the automatic creation of rich and accurate tags for images with landmarks as the main object. This framework uses state-of-the-art computer vision techniques fused with the wide range of contextual information that is available with community contributed imagery.
Images are organised into clusters based on image content and the spatial data associated with each image. Based on these clusters, different types of classifiers are trained to recognise the landmarks contained within the images in each cluster. A novel hybrid approach is proposed that combines these classifiers with a hierarchical matching approach to allow near real-time classification and captioning of images containing landmarks.
Matching Slides to Presentation Videos
Video streaming is becoming a major channel for distance learning (or e-learning). A tremendous number of videos for educational purposes are captured and archived in various e-learning systems today throughout schools, corporations and over the Internet. However, making information searchable and browsable, and presenting results optimally for a wide range of users and systems, remains a challenge. In this work, two core algorithms have been developed to support effective browsing and searching of educational videos. The first is a fully automatic approach that recognizes slides in the video with high accuracy. Built upon SIFT (scale-invariant feature transform) keypoint matching using RANSAC (random sample consensus), the approach is independent of capture systems and can handle a variety of videos with different styles and plentiful ambiguities. In particular, we propose a multi-phase matching pipeline that incrementally identifies slides, from the easy ones to the difficult ones. We achieve further robustness by using the matching confidence as part of a dynamic Hidden Markov model (HMM) that integrates temporal information, taking camera operations into account as well. The second algorithm locates slides in the video. We develop a non-linear optimization method (bundle adjustment) to accurately estimate the projective transformations (homographies) between slides and video frames. Unlike estimating a homography from a single image, our method jointly solves a set of homographies in a frame sequence that is related to a single slide. These two algorithms open up a series of possibilities for making the video content more searchable, browsable and understandable, thus greatly enriching the user's learning experience. Their usefulness has been demonstrated in the SLIC (Semantically Linking Instructional Content) system, which aims to turn simple video content into a fully interactive learning experience for students and scholars.
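The idea of using per-frame matching confidence inside an HMM to integrate temporal information can be illustrated with a small Viterbi decoder. This is only a minimal sketch under stated assumptions, not the method from the abstract: it uses a plain HMM with a fixed transition model (a hypothetical `stay_prob` for remaining on the same slide) rather than the dynamic model that accounts for camera operations, and the function name and the layout of the `confidences` input are invented for illustration.

```python
import math


def viterbi_slide_sequence(confidences, stay_prob=0.9):
    """Return the most likely slide index for every frame.

    confidences[t][s] is an assumed per-frame matching confidence in (0, 1]
    for slide s at frame t (for example, derived from keypoint inlier counts).
    stay_prob is a hypothetical probability that consecutive frames show the
    same slide; the remaining mass is spread evenly over the other slides.
    """
    n_slides = len(confidences[0])
    if n_slides == 1:
        return [0] * len(confidences)

    eps = 1e-9  # avoids log(0) for zero confidences
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_slides - 1))

    # Uniform prior over slides, so the first frame is scored by emission only.
    score = [math.log(c + eps) for c in confidences[0]]
    backpointers = []

    for frame in confidences[1:]:
        new_score, pointers = [], []
        for s in range(n_slides):
            # Best predecessor state for slide s at this frame.
            best_prev = max(
                range(n_slides),
                key=lambda p: score[p] + (log_stay if p == s else log_switch),
            )
            trans = log_stay if best_prev == s else log_switch
            new_score.append(score[best_prev] + trans + math.log(frame[s] + eps))
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Backtrack from the best final state.
    state = max(range(n_slides), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With confidences such as `[[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.3, 0.4, 0.3], [0.85, 0.1, 0.05], [0.9, 0.05, 0.05]]`, a frame-by-frame argmax would assign the third frame to slide 1, whereas the decoder keeps all five frames on slide 0 because a brief switch away and back is penalised twice by the transition model.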
Matching slides to presentation videos using SIFT and scene background matching
We present a general approach for automatically matching electronic slides to videos of the corresponding presentations, for use in distance learning and video proceedings of conferences. We deal with a large variety of videos, various frame compositions and color balances, arbitrary slide sequences, and dynamic camera switching, pan, tilt and zoom. To achieve high accuracy, we develop a two-phase process with unsupervised scene background modelling. In the first phase, scale-invariant feature transform (SIFT) keypoints are applied to frame-to-slide matching, under a constrained projective transformation (constrained homography), using random sample consensus (RANSAC). Successful first-phase matches are then used to automatically build a scene background model. In the second phase, the background model is applied to the remaining unmatched frames to boost the matching performance for difficult cases, such as wide field-of-view camera shots where the slide shows as a small portion of the frame. We also show that color correction is helpful when color-related similarity measures are used for identifying slides. We provide detailed quantitative experimental results characterizing the effect of each part of our approach. The results show that our approach is robust and achieves high performance on matching slides to a number of videos with different styles.
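The first-phase step, fitting a constrained transformation to keypoint matches with RANSAC, can be sketched in a few lines. The abstract does not say which constraints are placed on the homography, so this sketch assumes, purely for illustration, a uniform scale plus translation model (no rotation or perspective). The function names, the `threshold` and `iterations` parameters, and the `((x, y), (u, v))` match format are all hypothetical, and the keypoint matches are taken as given rather than computed with SIFT.

```python
import random


def fit_scale_translation(p, q):
    """Fit u = s*x + tx, v = s*y + ty from two matches ((x, y), (u, v))."""
    (x1, y1), (u1, v1) = p
    (x2, y2), (u2, v2) = q
    dx, dy = x2 - x1, y2 - y1
    denom = dx * dx + dy * dy
    if denom == 0:
        return None  # degenerate sample: both slide points coincide
    # Least-squares scale relating the two displacement vectors.
    s = ((u2 - u1) * dx + (v2 - v1) * dy) / denom
    return s, u1 - s * x1, v1 - s * y1


def ransac_scale_translation(matches, threshold=2.0, iterations=200, seed=0):
    """Return (model, inliers) for the model with the most inliers.

    matches is a list of ((x, y), (u, v)) slide-to-frame point pairs.
    threshold is the maximum reprojection error, in pixels, for an inlier.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(matches, 2)
        model = fit_scale_translation(p, q)
        if model is None or model[0] <= 0:
            continue  # reject degenerate or mirrored hypotheses
        s, tx, ty = model
        inliers = [
            m for m in matches
            if (s * m[0][0] + tx - m[1][0]) ** 2
            + (s * m[0][1] + ty - m[1][1]) ** 2 <= threshold ** 2
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

Restricting the model in this way means each hypothesis needs only two correspondences instead of the four required by a full homography, so fewer random samples are needed to find an outlier-free set. The number of inliers returned could then serve as a matching confidence for deciding which frame-to-slide matches count as successful.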