DC-image for real time compressed video matching
This chapter presents a framework for video matching based on local features extracted from the DC-image of MPEG compressed videos, without full decompression, and discusses the relevant arguments and supporting evidence. Several local feature detectors are examined to select the best one for matching on the DC-image. Two experiments were carried out to support the approach. The first compares the DC-image with the I-frame in terms of matching performance and computational complexity. The second compares local and global features for compressed video matching on the DC-image. The results confirm that the DC-image, despite its highly reduced size, is promising: it produces higher matching precision than the full I-frame. In addition, SIFT, as a local feature, outperforms most of the standard global features. Its computational complexity is relatively higher, but it remains within the real-time margin, and further optimizations could reduce it.
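To make the notion of a DC-image concrete, the sketch below builds one from an already decoded luminance frame by averaging each 8x8 block. This is an illustration only: the chapter obtains the DC-image from the compressed stream without full decompression, whereas this sketch assumes decoded pixels are available and relies on the fact that the DC coefficient of a block's DCT is proportional to the block mean. The function name `dc_image` and the list-of-lists frame layout are hypothetical.

```python
def dc_image(frame, block=8):
    """Return the block-mean image of a 2D list of luminance values.

    Assumption: `frame` is a decoded frame given as a list of rows.
    Rows and columns that do not fill a whole block are ignored.
    """
    height, width = len(frame), len(frame[0])
    rows, cols = height // block, width // block
    result = []
    for by in range(rows):
        out_row = []
        for bx in range(cols):
            total = 0
            # Sum every pixel inside the current block.
            for y in range(by * block, (by + 1) * block):
                for x in range(bx * block, (bx + 1) * block):
                    total += frame[y][x]
            # The block mean stands in for the (scaled) DC coefficient.
            out_row.append(total / (block * block))
        result.append(out_row)
    return result
```

With 8x8 blocks, a 352x288 frame reduces to a 44x36 DC-image, which is the "highly reduced size" the abstract refers to.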
Enriching spatial keyframe animations with captured motion (Enriquecendo animações em quadros-chaves espaciais com movimento capturado)
While motion capture (mocap) achieves realistic character animation at great cost, keyframing is capable of producing less realistic but more controllable animations. In this work we show how to combine the Spatial Keyframing (SK) Framework of IGARASHI et al. [1] and multidimensional projection techniques to reuse mocap data in several ways. Additionally, we show that multidimensional projection can also be used for visualization and motion analysis. We also propose a method for mocap compaction with the help of SK's pose reconstruction (backprojection) algorithm. Finally, we present a novel multidimensional projection optimization technique that significantly enhances SK-based reconstruction and can also be applied to other contexts where a backprojection algorithm is available.
Video matching using DC-image and local features
This paper presents a framework for video matching based on local features extracted from the DC-image of MPEG compressed videos, without decompression. The relevant arguments and supporting evidence are discussed for developing video similarity techniques that work directly on compressed videos, without decompression, and especially on small images. Two experiments were carried out to support the approach. The first compares the DC-image with the I-frame in terms of matching performance and the corresponding computational complexity. The second compares local and global features for video matching, especially in the compressed domain and with small images. The results confirm that the DC-image, despite its highly reduced size, is promising: it produces at least similar (if not better) matching precision compared to the full I-frame. In addition, SIFT, as a local feature, outperforms most of the standard global features in precision. Its computational complexity is relatively higher, but it remains within the real-time margin, and various optimisations could reduce it.
Keyframe-based monocular SLAM: design, survey, and future directions
Extensive research in the field of monocular SLAM for the past fifteen years
has yielded workable systems that found their way into various applications in
robotics and augmented reality. Although filter-based monocular SLAM systems
were common at some time, the more efficient keyframe-based solutions are
becoming the de facto methodology for building a monocular SLAM system. The
objective of this paper is threefold: first, the paper serves as a guideline
for people seeking to design their own monocular SLAM according to specific
environmental constraints. Second, it presents a survey that covers the various
keyframe-based monocular SLAM systems in the literature, detailing the
components of their implementation, and critically assessing the specific
strategies made in each proposed solution. Third, the paper provides insight
into the direction of future research in this field, to address the major
limitations still facing monocular SLAM; namely, in the issues of illumination
changes, initialization, highly dynamic motion, poorly textured scenes,
repetitive textures, map maintenance, and failure recovery
Automatic summarization of rushes video using bipartite graphs
In this paper we present a new approach for automatic summarization of rushes, or unstructured video. Our approach is composed of three major steps. First, based on shot and sub-shot segmentations, we filter sub-shots with low information content not likely to be useful in a summary. Second, a method using maximal matching in a bipartite graph is adapted to measure similarity between the remaining shots and to minimize inter-shot redundancy by removing repetitive retake shots common in rushes video. Finally, the presence of faces and motion intensity are characterised in each sub-shot. A measure of how representative the sub-shot is in the context of the overall video is then proposed. Video summaries composed of keyframe slideshows are then generated. In order to evaluate the effectiveness of this approach we re-run the evaluation carried out by TRECVid, using the same dataset and evaluation metrics used in the TRECVid video summarization task in 2007 but with our own assessors. Results show that our approach leads to a significant improvement on our own work in terms of the fraction of the TRECVid summary ground truth included and is competitive with the best of other approaches in TRECVid 2007
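The second step above, measuring inter-shot similarity through matching in a bipartite graph, can be sketched roughly as follows. This is not the authors' method: it assumes each shot is a list of keyframe descriptors, links two keyframes when a caller-supplied `frame_sim` function reaches a hypothetical `threshold`, and scores the pair of shots by the size of a maximum-cardinality matching (found with simple augmenting paths) divided by the length of the shorter shot. All names and the normalisation are illustrative.

```python
def max_bipartite_matching(adjacency, n_right):
    """Size of a maximum matching; adjacency[u] lists right nodes linked to u."""
    match_right = [-1] * n_right  # left partner of each right node, -1 if free

    def try_augment(u, visited):
        # Look for an augmenting path starting at left node u.
        for v in adjacency[u]:
            if v in visited:
                continue
            visited.add(v)
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in range(len(adjacency)))


def shot_similarity(shot_a, shot_b, frame_sim, threshold=0.8):
    """Fraction of the shorter shot's keyframes that can be paired off.

    Assumption: frame_sim(a, b) returns a similarity in [0, 1].
    """
    if not shot_a or not shot_b:
        return 0.0
    # Edge between keyframe i of shot_a and j of shot_b if similar enough.
    adjacency = [
        [j for j, fb in enumerate(shot_b) if frame_sim(fa, fb) >= threshold]
        for fa in shot_a
    ]
    matched = max_bipartite_matching(adjacency, len(shot_b))
    return matched / min(len(shot_a), len(shot_b))
```

Under these assumptions, two shots whose score is close to 1 would be treated as repeated takes, and one of them could be dropped to reduce redundancy in the summary.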