Video Registration in Egocentric Vision under Day and Night Illumination Changes

Authors: Alletto Stefano, Cucchiara Rita, Serra Giuseppe
Publication date: 28/07/2016

With the spread of wearable devices and head-mounted cameras, a wide range of applications requiring precise user localization is now possible. In this paper we propose to treat the problem of obtaining the user position with respect to a known environment as a video registration problem. Video registration, i.e. the task of aligning an input video sequence to a pre-built 3D model, relies on matching local keypoints extracted from the query sequence to a 3D point cloud. The overall registration performance is strictly tied to the quality of this 2D-3D matching, and can degrade under environmental changes such as the steep lighting differences between day and night. To effectively register an egocentric video sequence under these conditions, we propose to tackle the source of the problem: the matching process. To overcome the shortcomings of standard matching techniques, we introduce a novel embedding space that allows us to obtain robust matches by jointly taking into account local descriptors, their spatial arrangement and their temporal robustness. The proposal is evaluated on unconstrained egocentric video sequences, both in terms of matching quality and resulting registration performance, using different 3D models of historical landmarks. The results show that the proposed method can outperform state-of-the-art registration algorithms, in particular when dealing with the challenges of night and day sequences.

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Udine

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Generalized Max Pooling

Authors: Murray Naila, Perronnin Florent
Publication date: 01/01/2014

State-of-the-art patch-based image representations involve a pooling operation that aggregates statistics computed from local descriptors. Standard pooling operations include sum- and max-pooling. Sum-pooling lacks discriminability because the resulting representation is strongly influenced by frequent yet often uninformative descriptors, but only weakly influenced by rare yet potentially highly-informative ones. Max-pooling equalizes the influence of frequent and rare descriptors but is only applicable to representations that rely on count statistics, such as the bag-of-visual-words (BOV) and its soft- and sparse-coding extensions. We propose a novel pooling mechanism that achieves the same effect as max-pooling but is applicable beyond the BOV and especially to the state-of-the-art Fisher Vector -- hence the name Generalized Max Pooling (GMP). It involves equalizing the similarity between each patch and the pooled representation, which is shown to be equivalent to re-weighting the per-patch statistics. We show on five public image classification benchmarks that the proposed GMP can lead to significant performance gains with respect to heuristic alternatives.

Comment: (to appear) CVPR 2014 - IEEE Conference on Computer Vision & Pattern Recognition (2014)
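
The idea of equalizing each patch's similarity to the pooled representation can be illustrated with a small sketch. This is not the authors' code. It assumes a ridge-regularized least-squares formulation (find a pooled vector xi such that the dot product of every patch with xi is close to 1), solved through per-patch weights alpha, which matches the "re-weighting" reading in the abstract. The function names, the toy 2-D patches and the regularizer `lam` are hypothetical choices made for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def sum_pool(patches):
    """Plain sum-pooling: frequent descriptors dominate the result."""
    return [sum(col) for col in zip(*patches)]


def gmp_pool(patches, lam=1e-3):
    """Assumed GMP sketch: weights alpha solve (K + lam*I) alpha = 1,
    where K is the patch-to-patch dot-product matrix; the pooled
    vector is the alpha-weighted sum of the patches."""
    n = len(patches)
    K = [[dot(p, q) + (lam if i == j else 0.0)
          for j, q in enumerate(patches)]
         for i, p in enumerate(patches)]
    alpha = solve(K, [1.0] * n)
    dim = len(patches[0])
    return [sum(alpha[i] * patches[i][d] for i in range(n))
            for d in range(dim)]


# Toy data: one descriptor occurs nine times, another only once.
patches = [[1.0, 0.0]] * 9 + [[0.0, 1.0]]
pooled_sum = sum_pool(patches)
pooled_gmp = gmp_pool(patches)
similarities = [dot(p, pooled_gmp) for p in patches]
```

In this toy case sum-pooling weights the frequent descriptor nine times more heavily than the rare one, while the pooled vector from `gmp_pool` has a similarity close to 1 with every patch, frequent or rare.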

arXiv.org e-Print Archive

CiteSeerX

Crossref

OBJECT RECOGNITION USING SIFT ON DM3730 PROCESSOR

Authors: Mohana Lakshmi K., Sindhu S., Vishwanath N.G.
Publication venue: International Journal of Innovative Technology and Research
Publication date: 05/07/2016

Stable local feature detection and representation is a fundamental component of many image registration and object recognition algorithms. This paper examines the local image descriptor used by SIFT. The SIFT (Scale Invariant Feature Transform) algorithm is a method for extracting distinctive invariant features from images. It has been successfully applied to a number of computer vision problems based on feature matching, including object recognition, pose estimation, image retrieval and many more. Like SIFT, our descriptors encode the salient aspects of the image gradient in the feature point’s neighborhood. Optical object recognition and pose estimation are very challenging tasks in automobiles because they suffer from problems such as different views of the object, varying light conditions, surface glare, and noise caused by image sensors. Currently available algorithms such as SIFT can to some degree solve these problems because they compute so-called point features that are invariant to scaling and rotation. However, these algorithms are computationally complex and require powerful hardware in order to operate in real time. In automotive applications, and generally in the field of mobile devices, limited processing power and the demand for low battery consumption play an important role. Hence, adapting those sophisticated point feature algorithms to mobile hardware is an ambitious but also necessary computer engineering task. However, in real-world applications there is still a need to improve the algorithm’s robustness with respect to the correct matching of SIFT features. In this work, we propose to use the original SIFT algorithm to provide more reliable feature matching for object recognition.

International Journal of Innovative Technology and Research (IJITR)