Three-dimensional block matching using orthonormal tree-structured haar transform for multichannel images
Multichannel images, i.e., images of the same object or scene taken in different spectral bands or with different imaging modalities/settings, are common in many applications. For example, multispectral images contain several wavelength bands and hence carry richer information than color images. Multichannel magnetic resonance imaging and multichannel computed tomography images are common in medical imaging diagnostics, and multimodal images are also routinely used in art investigation. Any method for grayscale images can be applied to multichannel images by processing each channel/band separately. However, this is computationally expensive, especially when searching for overlapping patches similar to a given query patch. To address this problem, we propose a three-dimensional orthonormal tree-structured Haar transform (3D-OTSHT) targeting fast full-search-equivalent three-dimensional block matching in multichannel images. The use of a three-dimensional integral image significantly reduces the time needed to obtain the 3D-OTSHT coefficients. We demonstrate superior performance of the proposed block matching
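The role of the three-dimensional integral image mentioned above can be illustrated with a minimal sketch. It assumes the multichannel image is stored as a nested list `vol[z][y][x]`; the function names `integral_image_3d` and `box_sum` are hypothetical and not taken from the paper, and the 3D-OTSHT itself is not implemented here. The sketch only shows why the sum over any axis-aligned box, the basic quantity behind Haar-like coefficients, costs eight lookups once the table is built.

```python
def integral_image_3d(vol):
    """Build a zero-padded summed-volume table for vol[z][y][x]."""
    d, h, w = len(vol), len(vol[0]), len(vol[0][0])
    # The extra leading plane, row and column of zeros avoid boundary checks.
    ii = [[[0] * (w + 1) for _ in range(h + 1)] for _ in range(d + 1)]
    for z in range(d):
        for y in range(h):
            for x in range(w):
                # Inclusion-exclusion over the seven already-computed neighbours.
                ii[z + 1][y + 1][x + 1] = (
                    vol[z][y][x]
                    + ii[z][y + 1][x + 1] + ii[z + 1][y][x + 1] + ii[z + 1][y + 1][x]
                    - ii[z][y][x + 1] - ii[z][y + 1][x] - ii[z + 1][y][x]
                    + ii[z][y][x]
                )
    return ii


def box_sum(ii, z0, y0, x0, z1, y1, x1):
    """Sum of vol over the half-open box [z0, z1) x [y0, y1) x [x0, x1)."""
    return (
        ii[z1][y1][x1]
        - ii[z0][y1][x1] - ii[z1][y0][x1] - ii[z1][y1][x0]
        + ii[z0][y0][x1] + ii[z0][y1][x0] + ii[z1][y0][x0]
        - ii[z0][y0][x0]
    )
```

Building the table takes one pass over the volume; after that, each box sum is independent of the box size, which is what makes exhaustive patch search affordable.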
Particular object retrieval with integral max-pooling of CNN activations
Recently, image representation built upon Convolutional Neural Network (CNN)
has been shown to provide effective descriptors for image search, outperforming
pre-CNN features as short-vector representations. Yet such models are not
compatible with geometry-aware re-ranking methods and are still outperformed, on
some particular object retrieval benchmarks, by traditional image search
systems relying on precise descriptor matching, geometric re-ranking, or query
expansion. This work revisits both retrieval stages, namely initial search and
re-ranking, by employing the same primitive information derived from the CNN.
We build compact feature vectors that encode several image regions without the
need to feed multiple inputs to the network. Furthermore, we extend integral
images to handle max-pooling on convolutional layer activations, allowing us to
efficiently localize matching objects. The resulting bounding box is finally
used for image re-ranking. As a result, this paper significantly improves
existing CNN-based recognition pipelines: we report, for the first time, results
competing with traditional methods on the challenging Oxford5k and Paris6k
datasets.
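The abstract above does not say how max-pooling is made compatible with integral images. One common workaround, assumed here purely for illustration, is to approximate the maximum over a region by a generalized mean: sum the activations raised to a power `alpha` with an ordinary integral image, then take the `alpha`-th root. The sketch below handles a single non-negative 2D activation map; the names `power_integral_image` and `approx_region_max` and the default `alpha=10.0` are hypothetical choices, not values confirmed by the source.

```python
def power_integral_image(act, alpha=10.0):
    """Integral image of act[y][x] ** alpha for non-negative activations."""
    h, w = len(act), len(act[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row = 0.0  # running sum of the current row
        for x in range(w):
            row += act[y][x] ** alpha
            ii[y + 1][x + 1] = ii[y][x + 1] + row
    return ii


def approx_region_max(ii, y0, x0, y1, x1, alpha=10.0):
    """Approximate max of act over the half-open region [y0, y1) x [x0, x1)."""
    s = ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]
    # Clamp tiny negative values caused by floating-point cancellation.
    return max(s, 0.0) ** (1.0 / alpha)
```

For a region of `n` cells the estimate lies between the true maximum and the true maximum times `n ** (1 / alpha)`, so a larger `alpha` gives a tighter approximation at the cost of a wider numeric range.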
Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve a
higher accuracy rate and lower computational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are mentioned and compared.
Comment: Published 201
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers on object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
Automated Mobile System for Accurate Outdoor Tree Crop Enumeration Using an Uncalibrated Camera.
This paper demonstrates an automated computer vision system for outdoor tree crop enumeration in a seedling nursery. The complete system incorporates both hardware components (including an embedded microcontroller, an odometry encoder, and an uncalibrated digital color camera) and software algorithms (including microcontroller algorithms and the proposed algorithm for tree crop enumeration) required to obtain robust performance in a natural outdoor environment. The enumeration system uses a three-step image analysis process based upon: (1) an orthographic plant projection method integrating a perspective transform with automatic parameter estimation; (2) a plant counting method based on projection histograms; and (3) a double-counting avoidance method based on a homography transform. Experimental results demonstrate the ability to count large numbers of plants automatically with no human effort. Results show that, for tree seedlings having a height up to 40 cm and a within-row tree spacing of approximately 10 cm, the algorithms successfully estimated the number of plants with an average accuracy of 95.2% for trees within a single image and 98% for counting of the whole plant population in a large sequence of images
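Step (2) of the process above, counting plants from projection histograms, can be sketched in a few lines. This is a minimal illustration and not the paper's algorithm. It assumes the image has already been rectified and segmented into a binary mask `mask[row][col]` with 1 for plant pixels, and that neighbouring plants are separated by at least one empty column. The names `column_projection` and `count_plants` and the thresholds `min_height` and `min_width` are hypothetical.

```python
def column_projection(mask):
    """Number of plant pixels in each image column."""
    return [sum(col) for col in zip(*mask)]


def count_plants(mask, min_height=1, min_width=1):
    """Count runs of adjacent columns whose projection reaches min_height.

    min_height must be at least 1 so that the trailing sentinel ends a run.
    """
    hist = column_projection(mask)
    count, run = 0, 0
    for value in hist + [0]:  # the sentinel closes a run touching the right edge
        if value >= min_height:
            run += 1
        else:
            if run >= min_width:
                count += 1  # a sufficiently wide run counts as one plant
            run = 0
    return count
```

Raising `min_width` suppresses narrow runs caused by segmentation noise; touching canopies would merge into a single run, which this sketch does not attempt to split.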
Exploring Human Vision Driven Features for Pedestrian Detection
Motivated by the center-surround mechanism in the human visual attention
system, we propose to use average contrast maps for the challenge of pedestrian
detection in street scenes due to the observation that pedestrians indeed
exhibit discriminative contrast texture. Our main contributions are first to
design a local, statistical multi-channel descriptor in order to incorporate
both color and gradient information. Second, we introduce a multi-direction and
multi-scale contrast scheme based on grid-cells in order to integrate
expressive local variations. Contributing to the issue of selecting the most
discriminative features for assessment and classification, we perform extensive
comparisons w.r.t. statistical descriptors, contrast measurements, and scale
structures. This way, we obtain reasonable results under various
configurations. Empirical findings from applying our optimized detector on the
INRIA and Caltech pedestrian datasets show that our features yield
state-of-the-art performance in pedestrian detection.
Comment: Accepted for publication in IEEE Transactions on Circuits and Systems
for Video Technology (TCSVT
Numerical simulations of time resolved quantum electronics
This paper discusses the technical aspects - mathematical and numerical -
associated with the numerical simulations of a mesoscopic system in the time
domain (i.e. beyond the single frequency AC limit). After a short review of the
state of the art, we develop a theoretical framework for the calculation of
time resolved observables in a general multiterminal system subject to an
arbitrary time dependent perturbation (oscillating electrostatic gates, voltage
pulses, time-varying magnetic fields). The approach is mathematically equivalent
to (i) the time dependent scattering formalism, (ii) the time resolved Non
Equilibrium Green Function (NEGF) formalism and (iii) the partition-free
approach. The central object of our theory is a wave function that obeys a
simple Schrödinger equation with an additional source term that accounts for
the electrons injected from the electrodes. The time resolved observables
(current, density, ...) and the (inelastic) scattering matrix are simply
expressed in terms of this wave function. We use our approach to develop a
numerical technique for simulating time resolved quantum transport. We find
that the use of this wave function is advantageous for numerical simulations
resulting in a speed up of many orders of magnitude with respect to the direct
integration of NEGF equations. Our technique allows one to simulate realistic
situations beyond simple models, a subject that was until now beyond the
simulation capabilities of available approaches.
Comment: Typographic mistakes in appendix C were correcte