69,949 research outputs found
Semantically Guided Depth Upsampling
We present a novel method for accurate and efficient up- sampling of sparse
depth data, guided by high-resolution imagery. Our approach goes beyond the use
of intensity cues only and additionally exploits object boundary cues through
structured edge detection and semantic scene labeling for guidance. Both cues
are combined within a geodesic distance measure that allows for
boundary-preserving depth in- terpolation while utilizing local context. We
model the observed scene structure by locally planar elements and formulate the
upsampling task as a global energy minimization problem. Our method determines
glob- ally consistent solutions and preserves fine details and sharp depth
bound- aries. In our experiments on several public datasets at different levels
of application, we demonstrate superior performance of our approach over the
state-of-the-art, even for very sparse measurements.Comment: German Conference on Pattern Recognition 2016 (Oral
Shot boundary detection in MPEG videos using local and global indicators
Shot boundary detection (SBD) plays important roles in many video applications. In this letter, we describe a novel method on SBD operating directly in the compressed domain. First, several local indicators are extracted from MPEG macroblocks, and AdaBoost is employed for feature selection and fusion. The selected features are then used in classifying candidate cuts into five sub-spaces via pre-filtering and rule-based decision making. Following that, global indicators of frame similarity between boundary frames of cut candidates are examined using phase correlation of dc images. Gradual transitions like fade, dissolve, and combined shot cuts are also identified. Experimental results on the test data from TRECVID'07 have demonstrated the effectiveness and robustness of our proposed methodology. * INSPEC o Controlled Indexing decision making , image segmentation , knowledge based systems , video coding o Non Controlled Indexing AdaBoost , MPEG videos , feature selection , global indicator , local indicator , rule-based decision making , shot boundary detection , video segmentation * Author Keywords Decision making , TRECVID , shot boundary detection (SBD) , video segmentation , video signal processing References 1. J. Yuan , H. Wang , L. Xiao , W. Zheng , J. L. F. Lin and B. Zhang "A formal study of shot boundary detection", IEEE Trans. Circuits Syst. Video Technol., vol. 17, pp. 168 2007. Abstract |Full Text: PDF (2789KB) 2. C. Grana and R. Cucchiara "Linear transition detection as a unified shot detection approach", IEEE Trans. Circuits Syst. Video Technol., vol. 17, pp. 483 2007. Abstract |Full Text: PDF (505KB) 3. Q. Urhan , M. K. Gullu and S. Erturk "Modified phase-correlation based robust hard-cut detection with application to archive film", IEEE Trans. Circuits Syst. Video Technol., vol. 16, pp. 753 2006. Abstract |Full Text: PDF (3808KB) 4. C. Cotsaces , N. Nikolaidis and I. Pitas "Video shot detection and condensed representation: A review", Proc. IEEE Signal Mag., vol. 23, pp. 28 2006. 5. National Institute of Standards and Technology (NIST), pp. [online] Available: http://www-nlpir.nist.gov/projects/trecvid/ 6. J. Bescos "Real-time shot change detection over online MPEG-2 video", IEEE Trans. Circuits Syst. Video Technol., vol. 14, pp. 475 2004. Abstract |Full Text: PDF (1056KB) 7. H. Lu and Y. P. Tan "An effective post-refinement method for shot boundary detection", IEEE Trans. Circuits Syst. Video Technol., vol. 15, pp. 1407 2005. Abstract |Full Text: PDF (3128KB) 8. G. Boccignone , A. Chianese , V. Moscato and A. Picariello "Foveated shot detection for video segmentation", IEEE Trans. Circuits Syst. Video Technol., vol. 15, pp. 365 2005. Abstract |Full Text: PDF (2152KB) 9. Z. Cernekova , I. Pitas and C. Nikou "Information theory-based shot cut/fade detection and video summarization", IEEE Trans. Circuits Syst. Video Technol., vol. 16, pp. 82 2006. Abstract |Full Text: PDF (1184KB) 10. L.-Y. Duan , M. Xu , Q. Tian , C.-S. Xu and J. S. Jin "A unified framework for semantic shot classification in sports video", IEEE Trans. Multimedia, vol. 7, pp. 1066 2005. Abstract |Full Text: PDF (2872KB) 11. H. Fang , J. M. Jiang and Y. Feng "A fuzzy logic approach for detection of video shot boundaries", Pattern Recogn., vol. 39, pp. 2092 2006. [CrossRef] 12. R. A. Joyce and B. Liu "Temporal segmentation of video using frame and histogram space", IEEE Trans. Multimedia, vol. 8, pp. 130 2006. Abstract |Full Text: PDF (864KB) 13. A. Hanjalic "Shot boundary detection: Unraveled and resolved", IEEE Trans. Circuits Syst. Video Technol., vol. 12, pp. 90 2002. Abstract |Full Text: PDF (289KB) 14. S.-C. Pei and Y.-Z. Chou "Efficient MPEG compressed video analysis using macroblock type information", IEEE Trans. Multimedia, vol. 1, pp. 321 1999. Abstract |Full Text: PDF (612KB) 15. C.-L. Huang and B.-Y. Liao "A robust scene-change detection method for video segmentation", IEEE Trans. Circuits Syst. Video Technol., vol. 11, pp. 1281 2001. Abstract |Full Text: PDF (241KB) 16. Y. Freund and R. E. Schapire "A decision-theoretic generalization of online learning and an application to boosting", J. Comput. Syst. Sci., vol. 55, pp. 119 1997. [CrossRef] On this page * Abstract * Index Terms * References Brought to you by STRATHCLYDE UNIVERSITY LIBRARY * Your institute subscribes to: * IEEE-Wiley eBooks Library , IEEE/IET Electronic Library (IEL) * What can I access? Terms of Us
All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting
Recently, end-to-end text spotting that aims to detect and recognize text
from cluttered images simultaneously has received particularly growing interest
in computer vision. Different from the existing approaches that formulate text
detection as bounding box extraction or instance segmentation, we localize a
set of points on the boundary of each text instance. With the representation of
such boundary points, we establish a simple yet effective scheme for end-to-end
text spotting, which can read the text of arbitrary shapes. Experiments on
three challenging datasets, including ICDAR2015, TotalText and COCO-Text
demonstrate that the proposed method consistently surpasses the
state-of-the-art in both scene text detection and end-to-end text recognition
tasks.Comment: Accepted to AAAI202
Goal Detection in Soccer Video: Role-Based Events Detection Approach
Soccer video processing and analysis to find critical events such as occurrences of goal event have been one of the important issues and topics of active researches in recent years. In this paper, a new role-based framework is proposed for goal event detection in which the semantic structure of soccer game is used. Usually after a goal scene, the audiences’ and reporters’ sound intensity is increased, ball is sent back to the center and the camera may: zoom on Player, show audiences’ delighting, repeat the goal scene or display a combination of them. Thus, the occurrence of goal event will be detectable by analysis of sequences of above roles. The proposed framework in this paper consists of four main procedures: 1- detection of game’s critical events by using audio channel, 2- detection of shot boundary and shots classification, 3- selection of candidate events according to the type of shot and existence of goalmouth in the shot, 4- detection of restarting the game from the center of the field. A new method for shot classification is also presented in this framework. Finally, by applying the proposed method it was shown that the goal events detection has a good accuracy and the percentage of detection failure is also very low.DOI:http://dx.doi.org/10.11591/ijece.v4i6.637
Improved fade and dissolve detection for reliable video segmentation
We present improved algorithms for automatic fade and dissolve detection in digital video analysis. We devise new two-step algorithms for fade and dissolve detection and introduce a method for eliminating false positives from a list of detected candidate transitions. In our detailed study of these gradual shot transitions, our objective has been to accurately classify the type of transitions (fade-in, fade-out, and dissolve) and to precisely locate the boundary of the transitions. This distinguishes our work from early work in scene change detection which focuses on identifying the existence of a transition rather than its precise temporal extent. We evaluate our algorithms against two other commonly used methods on a comprehensive data set, and demonstrate the improved performance due to our enhancements
Finding Temporally Consistent Occlusion Boundaries in Videos using Geometric Context
We present an algorithm for finding temporally consistent occlusion
boundaries in videos to support segmentation of dynamic scenes. We learn
occlusion boundaries in a pairwise Markov random field (MRF) framework. We
first estimate the probability of an spatio-temporal edge being an occlusion
boundary by using appearance, flow, and geometric features. Next, we enforce
occlusion boundary continuity in a MRF model by learning pairwise occlusion
probabilities using a random forest. Then, we temporally smooth boundaries to
remove temporal inconsistencies in occlusion boundary estimation. Our proposed
framework provides an efficient approach for finding temporally consistent
occlusion boundaries in video by utilizing causality, redundancy in videos, and
semantic layout of the scene. We have developed a dataset with fully annotated
ground-truth occlusion boundaries of over 30 videos ($5000 frames). This
dataset is used to evaluate temporal occlusion boundaries and provides a much
needed baseline for future studies. We perform experiments to demonstrate the
role of scene layout, and temporal information for occlusion reasoning in
dynamic scenes.Comment: Applications of Computer Vision (WACV), 2015 IEEE Winter Conference
o
Local features for visual object matching and video scene detection
Local features are important building blocks for many computer vision algorithms such as visual object alignment, object recognition, and content-based image retrieval. Local features are extracted from an image by a local feature detector and then the detected features are encoded using a local feature descriptor. The resulting features based on the descriptors, such as histograms or binary strings, are used in matching to find similar features between objects in images.
In this thesis, we deal with two research problem in the context of local features for object detection: we extend the original local feature detector and descriptor performance benchmarks from the wide baseline setting to the intra-class matching; and propose local features for consumer video scene boundary detection.
In the intra-class matching, the visual appearance of objects semantic class can be very different (e.g., Harley Davidson and Scooter in the same motorbike class) and making the task more difficult than wide baseline matching. The performance of different local feature detectors and descriptors are evaluated over three different image databases and results for more advance analysis are reported.
In the second part of the thesis, we study the use of Bag-of-Words (BoW) in the video scene boundary detection. In literature there have been several approaches to the task exploiting the local features, but based on the author’s knowledge, none of them are practical in an online processing of user videos. We introduce an online BoW based scene boundary detector using a dynamic codebook, study the optimal parameters for the detector and compare our method to the existing methods. Precision and recall curves are used as a performance metric.
The goal of this thesis is to find the best local feature detector and descriptor for intra-class matching and develop a novel scene boundary detection method for online applications
- …