Classification of Occluded Objects using Fast Recurrent Processing
Recurrent neural networks are powerful tools for handling incomplete data
problems in computer vision, thanks to their significant generative
capabilities. However, the computational demand of these algorithms is too
high for real-time use without specialized hardware or software solutions.
In this paper, we propose a framework for adding recurrent processing
capabilities to a feedforward network without sacrificing much
computational efficiency. We assume a mixture model and generate samples of the
last hidden layer according to the class decisions of the output layer, modify
the hidden layer activity using the samples, and propagate to lower layers. For
the visual occlusion problem, the iterative procedure emulates a feedforward-feedback
loop, filling in the missing hidden layer activity with meaningful
representations. The proposed algorithm is tested on a widely used dataset, and
shown to achieve 2 improvement in classification accuracy for occluded
objects. When compared to Restricted Boltzmann Machines, our algorithm shows
superior performance for occluded object classification.
Comment: arXiv admin note: text overlap with arXiv:1409.8576 by other author
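The feedforward-feedback loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear-softmax output layer, represents the mixture model by one mean hidden vector per class, and blends in the mixture expectation rather than drawing samples as the abstract describes. The names `recurrent_fill_in`, `class_means`, `mix`, and `n_iters` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_probs(h, weights, biases):
    """Output-layer class probabilities for hidden activity h (assumed linear + softmax)."""
    logits = [sum(w * x for w, x in zip(row, h)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


def recurrent_fill_in(h, weights, biases, class_means, n_iters=3, mix=0.5):
    """Iteratively blend hidden activity with a class-conditional expectation.

    h           : last-hidden-layer activity, possibly corrupted by occlusion
    weights     : one row of output weights per class (hypothetical)
    class_means : one mean hidden vector per class (the assumed mixture components)
    mix         : blending weight, an assumed hyperparameter
    """
    h = list(h)
    for _ in range(n_iters):
        p = class_probs(h, weights, biases)  # feedforward: class decision
        # Feedback: expected hidden activity under the mixture (used here in place of sampling).
        expected = [sum(p_k * mean[j] for p_k, mean in zip(p, class_means))
                    for j in range(len(h))]
        h = [(1.0 - mix) * x + mix * e for x, e in zip(h, expected)]
    return h, class_probs(h, weights, biases)
```

In this toy setting, a hidden vector with some units zeroed out is pulled toward the mean of the class the output layer currently favors, which is the filling-in behavior the abstract refers to.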
Traffic monitoring using image processing : a thesis presented in partial fulfillment of the requirements for the degree of Master of Engineering in Information and Telecommunications Engineering at Massey University, Palmerston North, New Zealand
Traffic monitoring involves the collection of data describing the characteristics of vehicles and their movements. Such data may be used for automatic tolls, congestion and incident detection, law enforcement, and road capacity planning. With recent advances in computer vision technology, videos can be analysed automatically and relevant information extracted for particular applications. Automatic surveillance using video cameras with image processing techniques is becoming a powerful and useful technology for traffic monitoring. In this research project, a video image processing system with the potential for real-time application is developed for traffic monitoring, including vehicle tracking, counting, and classification. A heuristic approach is applied in developing this system. The system is divided into several parts, and several different functional components have been built and tested using traffic video sequences. Evaluations are carried out to show that the system is robust and can be developed towards real-time applications.
Context-Aware Single-Shot Detector
SSD is one of the state-of-the-art object detection algorithms, and it
combines high detection accuracy with real-time speed. However, it is widely
recognized that SSD is less accurate in detecting small objects compared to
large objects, because it ignores the context from outside the proposal boxes.
In this paper, we present CSSD, shorthand for context-aware single-shot
multibox object detector. CSSD is built on top of SSD, with additional layers
modeling multi-scale contexts. We describe two variants of CSSD, which differ
in their context layers, using dilated convolution layers (DiCSSD) and
deconvolution layers (DeCSSD) respectively. The experimental results show that
the multi-scale context modeling significantly improves the detection accuracy.
In addition, we study the relationship between effective receptive fields
(ERFs) and the theoretical receptive fields (TRFs), particularly on a VGGNet.
The empirical results further strengthen our conclusion that SSD coupled with
context layers achieves better detection results especially for small objects
( on MS-COCO compared to the newest SSD), while
maintaining comparable runtime performance.
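The theoretical receptive field (TRF) mentioned in this abstract follows from standard convolution arithmetic, and a short sketch shows why dilated layers widen context cheaply. The layer configurations below are illustrative only and are not the CSSD or VGGNet architectures; `theoretical_receptive_field` is a hypothetical helper name.

```python
def theoretical_receptive_field(layers):
    """Return the TRF (in input pixels, along one axis) of a stack of conv layers.

    layers: list of (kernel_size, stride, dilation) tuples, first layer first.
    """
    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance in input pixels between adjacent units of the current layer
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


# Three plain 3x3 convolutions versus three 3x3 convolutions with dilation 2.
plain = theoretical_receptive_field([(3, 1, 1)] * 3)
dilated = theoretical_receptive_field([(3, 1, 2)] * 3)
print(plain, dilated)  # prints "7 13"
```

With the same number of weights per layer, the dilated stack covers almost twice the input extent. The effective receptive field studied in the abstract is an empirical quantity, and this calculation does not estimate it.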
Finding Temporally Consistent Occlusion Boundaries in Videos using Geometric Context
We present an algorithm for finding temporally consistent occlusion
boundaries in videos to support segmentation of dynamic scenes. We learn
occlusion boundaries in a pairwise Markov random field (MRF) framework. We
first estimate the probability of a spatio-temporal edge being an occlusion
boundary by using appearance, flow, and geometric features. Next, we enforce
occlusion boundary continuity in an MRF model by learning pairwise occlusion
probabilities using a random forest. Then, we temporally smooth boundaries to
remove temporal inconsistencies in occlusion boundary estimation. Our proposed
framework provides an efficient approach for finding temporally consistent
occlusion boundaries in video by utilizing causality, redundancy in videos, and
semantic layout of the scene. We have developed a dataset with fully annotated
ground-truth occlusion boundaries of over 30 videos (5000 frames). This
dataset is used to evaluate temporal occlusion boundaries and provides a much
needed baseline for future studies. We perform experiments to demonstrate the
role of scene layout and temporal information for occlusion reasoning in
dynamic scenes.
Comment: Applications of Computer Vision (WACV), 2015 IEEE Winter Conference
o
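The temporal smoothing step in this abstract is not specified in detail. As one possible illustration, the sketch below applies causal exponential smoothing to the per-frame occlusion probability of a single edge. This is a stand-in technique and not necessarily the authors' method; `smooth_causal` and `alpha` are hypothetical names.

```python
def smooth_causal(probs, alpha=0.6):
    """Exponentially smooth per-frame boundary probabilities using only past frames.

    probs : occlusion-boundary probabilities for one edge, one value per frame
    alpha : weight on the current frame (assumed hyperparameter, 0 < alpha <= 1)
    """
    smoothed = []
    prev = None
    for p in probs:
        # The first frame has no history, so it passes through unchanged.
        prev = p if prev is None else alpha * p + (1.0 - alpha) * prev
        smoothed.append(prev)
    return smoothed
```

A one-frame dropout such as `[0.9, 0.1, 0.9]` is damped instead of flipping the boundary decision. Because each output depends only on the current and earlier frames, the filter is causal.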
Selective sampling importance resampling particle filter tracking with multibag subspace restoration
Real-time detection and tracking of multiple objects with partial decoding in H.264/AVC bitstream domain
In this paper, we show that probabilistic spatiotemporal
macroblock filtering (PSMF) and partial decoding can effectively
detect and track multiple objects in real time in H.264/AVC bitstreams with
a stationary background. Our contribution is that our method not only achieves
fast processing time but also handles multiple moving objects that are
articulated, change in size, or have internally monotonous color, even though
they contain a chaotic set of non-homogeneous motion vectors inside. In
addition, our partial decoding process for H.264/AVC bitstreams
improves the accuracy of object trajectories and overcomes long occlusions by
using extracted color information.
Comment: SPIE Real-Time Image and Video Processing Conference 200