Performing edge detection by difference of Gaussians using q-Gaussian kernels
In image processing, edge detection is a valuable tool for extracting features
from an image. It reduces the amount of information to be processed, since
redundant information (considered less relevant) can be disregarded. Edge
detection consists of determining the points of a digital image at which the
intensity changes sharply. These changes are due, for example, to
discontinuities in the orientation of a surface. A well-known method of edge
detection is the Difference of Gaussians (DoG). The method consists of
subtracting two Gaussian kernels, one of which has a smaller standard deviation
than the other. Convolving the resulting difference kernel with the input image
yields the edges of that image. This paper introduces a method of extracting
edges using DoG with kernels based on the q-Gaussian probability distribution,
derived from the q-statistics proposed by Constantino Tsallis. To demonstrate
the method's potential, we compare it with the traditional DoG using Gaussian
kernels. The results showed that the proposed method can extract edges with
more accurate details.
Comment: 5 pages, 5 figures, IC-MSQUARE 201
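The DoG procedure described in this abstract can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names, the kernel size, and the width parameterization `beta = 1 / (2 * sigma**2)` are assumptions chosen so that `q = 1` recovers the ordinary Gaussian. The q-Gaussian is taken in its common form, proportional to `[1 - (1 - q) * beta * r^2]^(1 / (1 - q))`, clipped at zero.

```python
import numpy as np

def q_gaussian_kernel(size, sigma, q=1.0):
    """Normalized 2-D q-Gaussian kernel of odd side length `size`.

    Assumed parameterization: beta = 1 / (2 * sigma**2), so q = 1 gives
    the ordinary Gaussian with standard deviation sigma.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = (x ** 2 + y ** 2).astype(float)
    beta = 1.0 / (2.0 * sigma ** 2)
    if abs(q - 1.0) < 1e-12:
        k = np.exp(-beta * r2)
    else:
        # Clip at zero so that q < 1 yields a compactly supported kernel.
        base = np.maximum(1.0 - (1.0 - q) * beta * r2, 0.0)
        k = base ** (1.0 / (1.0 - q))
    return k / k.sum()

def convolve2d_same(image, kernel):
    """Direct 2-D convolution with edge padding; output has the input shape."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
    flipped = kernel[::-1, ::-1]
    out = np.zeros(image.shape, dtype=float)
    h, w = image.shape
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * padded[i:i + h, j:j + w]
    return out

def dog_response(image, sigma_small, sigma_large, q=1.0, size=9):
    """Convolve the image with the difference of two (q-)Gaussian kernels."""
    dog = (q_gaussian_kernel(size, sigma_small, q)
           - q_gaussian_kernel(size, sigma_large, q))
    return convolve2d_same(np.asarray(image, dtype=float), dog)

# Usage: a vertical step edge; the response is concentrated around the step.
step = np.zeros((32, 32))
step[:, 16:] = 1.0
response = dog_response(step, sigma_small=1.0, sigma_large=2.0, q=1.5)
```

Because both kernels are normalized, the difference kernel sums to zero, so flat regions produce no response and only intensity changes survive. Thresholding `np.abs(response)` or locating its zero crossings would then give an edge map; which of these the paper uses is not stated in the abstract.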
Thermo-visual feature fusion for object tracking using multiple spatiogram trackers
In this paper, we propose a framework that can efficiently combine features for robust tracking based on fusing the outputs of multiple spatiogram trackers. This is achieved without the exponential increase in storage and processing that other multimodal tracking approaches suffer from. The framework allows the features to be split arbitrarily between the trackers, and provides the flexibility to add, remove or dynamically weight features. We derive a mean-shift type algorithm for the framework that allows efficient object tracking with very low computational overhead. We especially target the fusion of thermal infrared and visible spectrum features as the most useful features for automated surveillance applications. Results are shown on multimodal video sequences, clearly illustrating the benefits of combining multiple features using our framework.
SAR Image Edge Detection: Review and Benchmark Experiments
Edges are distinct geometric features crucial to higher-level object detection and recognition in remote-sensing processing, which is key to surveillance and to gathering up-to-date geospatial intelligence. Synthetic aperture radar (SAR) is a powerful form of remote sensing. However, edge detectors designed for optical images tend to perform poorly on SAR images because strong speckle noise causes false positives (type I errors). Therefore, many researchers have proposed edge detectors tailored specifically to the characteristics of SAR images. Although these edge detectors might achieve effective results in their own evaluations, the comparisons tend to include a very limited number of (simulated) SAR images. As a result, the generalized performance of the proposed methods is not truly reflected, as real-world patterns are much more complex and diverse. From this emerges another problem: a quantitative benchmark is missing in the field. Hence, it is not currently possible to fairly evaluate any edge detection method for SAR images. In this paper, we aim to close these gaps by providing an extensive experimental evaluation of edge detection on SAR images. To that end, we propose the first benchmark on SAR image edge detection methods, established by evaluating various freely available methods, including methods that are considered to be the state of the art.
Evaluation of MoG Video Segmentation on GPU-based HPC System
Automated and intelligent video surveillance systems play an important role in the modern world. As the number of video streams that must be analyzed concurrently grows, such systems can assist humans in performing tiresome tasks. In order to be effective, video surveillance systems have to meet several requirements: they must be accurate and able to process the received video stream in real time. A robust system should not depend on lighting conditions, illumination changes and other sources of scene variation. A common component of surveillance systems is a module that performs background estimation and foreground segmentation. The MoG (Mixture of Gaussians) algorithm is a widely used statistical technique for video segmentation. The estimation process is time-consuming, especially for complex mixture models containing many components. The work presented here focuses on the performance evaluation of the MoG algorithm, aiming to assess the feasibility of OpenCL-based processing of high-resolution video on GPU-accelerated platforms.
A Projected Gradient Descent Method for CRF Inference allowing End-To-End Training of Arbitrary Pairwise Potentials
Are we using the right potential functions in the Conditional Random Field
models that are popular in the Vision community? Semantic segmentation and
other pixel-level labelling tasks have made significant progress recently due
to the deep learning paradigm. However, most state-of-the-art structured
prediction methods also include a random field model with a hand-crafted
Gaussian potential to model spatial priors, label consistencies and
feature-based image conditioning.
In this paper, we challenge this view by developing a new inference and
learning framework which can learn pairwise CRF potentials restricted only by
their dependence on the image pixel values and the size of the support. Both
standard spatial and high-dimensional bilateral kernels are considered. Our
framework is based on the observation that CRF inference can be achieved via
projected gradient descent and consequently, can easily be integrated in deep
neural networks to allow for end-to-end training. It is empirically
demonstrated that such learned potentials can improve segmentation accuracy and
that certain label class interactions are indeed better modelled by a
non-Gaussian potential. In addition, we compare our inference method to the
commonly used mean-field algorithm. Our framework is evaluated on several
public benchmarks for semantic segmentation with improved performance compared
to previous state-of-the-art CNN+CRF models.
Comment: Presented at EMMCVPR 2017 conference
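The idea that CRF inference can be carried out by projected gradient descent can be illustrated with a small NumPy sketch. It is not the paper's formulation: the relaxed energy `E(q) = <U, q> + 0.5 * trace(q^T W q M)`, the dense pixel affinity matrix `W`, the label compatibility matrix `M`, the step size, and all function names are assumptions made for illustration. Each row of `q` is a relaxed label distribution for one pixel, and after every gradient step the rows are projected back onto the probability simplex with the standard sort-based projection.

```python
import numpy as np

def project_rows_to_simplex(v):
    """Euclidean projection of each row of v onto the probability simplex."""
    n, num_labels = v.shape
    u = np.sort(v, axis=1)[:, ::-1]            # each row sorted descending
    css = np.cumsum(u, axis=1) - 1.0
    ks = np.arange(1, num_labels + 1)
    # The condition holds for a prefix of each row; count its length.
    rho = (u - css / ks > 0).sum(axis=1)
    theta = css[np.arange(n), rho - 1] / rho
    return np.maximum(v - theta[:, None], 0.0)

def crf_pgd(unary, W, M, step=0.1, iters=50):
    """Minimize the assumed relaxed energy by projected gradient descent.

    unary: (n, L) unary costs; W: (n, n) symmetric pixel affinities;
    M: (L, L) symmetric label compatibility costs.
    """
    n, num_labels = unary.shape
    q = np.full((n, num_labels), 1.0 / num_labels)
    for _ in range(iters):
        grad = unary + W @ q @ M               # gradient for symmetric W and M
        q = project_rows_to_simplex(q - step * grad)
    return q

# Usage: a 4-pixel chain with 2 labels and a Potts-style compatibility.
unary = np.array([[0.0, 2.0], [0.0, 2.0], [2.0, 0.0], [2.0, 0.0]])
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
M = np.array([[0.0, 1.0], [1.0, 0.0]])         # cost 1 when neighbours disagree
labels = crf_pgd(unary, W, M).argmax(axis=1)
```

Every operation in the loop is differentiable almost everywhere, which is what makes this kind of update easy to unroll inside a neural network. In a learned setting the pairwise terms would come from trainable filters rather than the fixed `W` and `M` used here.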
Generalized Max Pooling
State-of-the-art patch-based image representations involve a pooling
operation that aggregates statistics computed from local descriptors. Standard
pooling operations include sum- and max-pooling. Sum-pooling lacks
discriminability because the resulting representation is strongly influenced by
frequent yet often uninformative descriptors, but only weakly influenced by
rare yet potentially highly-informative ones. Max-pooling equalizes the
influence of frequent and rare descriptors but is only applicable to
representations that rely on count statistics, such as the bag-of-visual-words
(BOV) and its soft- and sparse-coding extensions. We propose a novel pooling
mechanism that achieves the same effect as max-pooling but is applicable beyond
the BOV and especially to the state-of-the-art Fisher Vector -- hence the name
Generalized Max Pooling (GMP). It involves equalizing the similarity between
each patch and the pooled representation, which is shown to be equivalent to
re-weighting the per-patch statistics. We show on five public image
classification benchmarks that the proposed GMP can lead to significant
performance gains with respect to heuristic alternatives.
Comment: (to appear) CVPR 2014 - IEEE Conference on Computer Vision & Pattern
Recognition (2014)
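The equalization idea in this abstract can be sketched as a small regularized least-squares problem. This is an illustrative sketch rather than the authors' code: it assumes the pooled vector `xi` is chosen so that its dot product with every patch embedding is close to 1, with a ridge penalty `lam` added for stability. The function name and the value of `lam` are hypothetical. Solving in the dual gives one weight per patch, which matches the "re-weighting the per-patch statistics" reading.

```python
import numpy as np

def generalized_max_pool(phi, lam=1.0):
    """Pool patch embeddings so each patch has similarity close to 1.

    phi: (N, D) array with one embedding per patch.
    Returns (xi, alpha): the pooled (D,) vector and the (N,) patch weights,
    with xi = phi.T @ alpha and alpha = (phi @ phi.T + lam * I)^-1 @ 1.
    """
    n = phi.shape[0]
    gram = phi @ phi.T                          # patch-to-patch similarities
    alpha = np.linalg.solve(gram + lam * np.eye(n), np.ones(n))
    return phi.T @ alpha, alpha

# Usage: nine copies of a frequent descriptor and one rare descriptor.
frequent = np.array([1.0, 0.0])
rare = np.array([0.0, 1.0])
phi = np.vstack([np.tile(frequent, (9, 1)), rare])

sum_pooled = phi.sum(axis=0)                    # dominated by the frequent one
gmp_pooled, weights = generalized_max_pool(phi, lam=1e-3)
```

In this toy case sum-pooling gives `[9, 1]`, while the pooled vector from the sketch is close to `[1, 1]`, so the rare descriptor contributes as much as the frequent one. Each of the nine duplicated patches receives a weight of roughly one ninth of the rare patch's weight.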