
    Mapping, Localization and Path Planning for Image-based Navigation using Visual Features and Map

    Building on progress in feature representations for image retrieval, image-based localization has seen a surge of research interest. Image-based localization has the advantage of being inexpensive and efficient, often avoiding the use of 3D metric maps altogether. That said, the need to maintain a large number of reference images to effectively support localization in a scene nonetheless calls for them to be organized in a map structure of some kind. The problem of localization often arises as part of a navigation process. We are therefore interested in summarizing the reference images as a set of landmarks that meet the requirements for image-based navigation. A contribution of this paper is to formulate such a set of requirements for the two sub-tasks involved: map construction and self-localization. These requirements are then exploited for compact map representation and accurate self-localization, using the framework of a network flow problem. In this process, we formulate the map construction and self-localization problems as convex quadratic and second-order cone programs, respectively. We evaluate our methods on publicly available indoor and outdoor datasets, where they significantly outperform existing methods. Comment: CVPR 2019; for implementation see https://github.com/janinethom
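
    The abstract names two convex solver classes rather than giving the formulations, so the following is a minimal sketch of how a convex quadratic program and a second-order cone program are posed and solved with the cvxpy modeling library. The data and objectives below are placeholders, not the paper's network-flow formulation.

```python
# Hedged sketch: the two solver classes named in the abstract, posed in
# cvxpy. Objectives and constraints are illustrative placeholders only.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 8))   # placeholder data matrix
b = rng.standard_normal(20)

# Convex quadratic program (the class used for map construction):
x = cp.Variable(8)
qp = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)),
                [x >= 0, cp.sum(x) == 1])
qp.solve()

# Second-order cone program (the class used for self-localization):
y = cp.Variable(8)
t = cp.Variable()
socp = cp.Problem(cp.Minimize(t), [cp.norm(A @ y - b, 2) <= t, y >= 0])
socp.solve()
print(qp.value, socp.value)
```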

    Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization

    Many robotics applications require precise pose estimates despite operating in large and changing environments. This can be addressed by visual localization, using a pre-computed 3D model of the surroundings. Pose estimation then amounts to finding correspondences between 2D keypoints in a query image and 3D points in the model using local descriptors. However, computational power is often limited on robotic platforms, making this task challenging in large-scale environments. Binary feature descriptors significantly speed up this 2D-3D matching and have become popular in the robotics community, but they also strongly impair robustness to perceptual aliasing and to changes in viewpoint, illumination, and scene structure. In this work, we propose to leverage recent advances in deep learning to perform efficient hierarchical localization. We first localize at the map level using learned image-wide global descriptors, and subsequently estimate a precise pose from 2D-3D matches computed in the candidate places only. This restricts the local search and thus allows us to efficiently exploit powerful non-binary descriptors usually dismissed on resource-constrained devices. Our approach achieves state-of-the-art localization performance while running in real time on a popular mobile platform, enabling new prospects for robotics research. Comment: CoRL 2018 camera-ready (fix typos and update citations)
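
    As a rough illustration of the coarse-to-fine pipeline described above, here is a minimal sketch: retrieve candidate places by global-descriptor similarity, match local descriptors only against the 3D points visible from those candidates, and estimate the pose with PnP + RANSAC. The helpers extract_global, extract_local, and covisible_points are hypothetical stand-ins, not the authors' API.

```python
# Minimal coarse-to-fine localization sketch; helper names are hypothetical.
import numpy as np
import cv2

def localize(query_img, db_global, db_images, K, num_candidates=10):
    """db_global: (N, D) global descriptors of N reference images;
    K: 3x3 camera intrinsics."""
    q_global = extract_global(query_img)          # e.g. a NetVLAD-style vector
    # Coarse step: retrieve candidate places by global-descriptor similarity.
    sims = db_global @ q_global
    candidates = np.argsort(-sims)[:num_candidates]

    # Fine step: match local descriptors only against 3D points seen from
    # the candidate places, then estimate a pose with PnP + RANSAC.
    kps, desc = extract_local(query_img)          # (M,2) keypoints, (M,D') descriptors
    pts3d, desc3d = covisible_points(db_images, candidates)
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = matcher.match(desc.astype(np.float32), desc3d.astype(np.float32))
    obj = np.float32([pts3d[m.trainIdx] for m in matches])
    img = np.float32([kps[m.queryIdx] for m in matches])
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K, None)
    return (rvec, tvec) if ok else None
```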

    Learning View-Model Joint Relevance for 3D Object Retrieval

    3D object retrieval has attracted extensive research effort and has become an important task in recent years. However, measuring the relevance between 3D objects remains a difficult issue. Most existing methods employ only model-based or view-based approaches, which may provide incomplete information for 3D object representation. In this paper, we propose to jointly learn view-model relevance among 3D objects for retrieval, in which the 3D objects are formulated in different graph structures. On the view side, the multiple views of the 3D objects are employed to formulate the relationships among objects in an object hypergraph. On the model side, model-based features are extracted to construct an object graph describing the relationships among the 3D objects. Learning is conducted on the two graphs to estimate the relevance among the 3D objects, and the view/model graph weights can also be optimized during learning. This is the first work to jointly explore view-based and model-based relevance among 3D objects in a graph-based framework. The proposed method has been evaluated on three datasets. The experimental results and comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed 3D object retrieval method in terms of retrieval accuracy.
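
    The abstract does not spell out the learning objective; as background, the following is a sketch of the standard transductive hypergraph relevance propagation (in the style of Zhou et al.) that object-hypergraph retrieval methods commonly build on. The paper's joint view/model formulation with learned graph weights is more involved than this.

```python
# Standard hypergraph relevance propagation sketch (not the paper's exact
# joint view/model objective).
import numpy as np

def hypergraph_relevance(H, w, y, alpha=0.9):
    """H: (n_objects, n_edges) incidence matrix; w: (n_edges,) edge weights;
    y: (n_objects,) query indicator; returns relevance scores f."""
    De = H.sum(axis=0)                     # hyperedge degrees
    Dv = H @ w                             # vertex degrees
    Dv_isqrt = np.diag(1.0 / np.sqrt(Dv))
    # Normalized propagation operator: Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}
    Theta = Dv_isqrt @ H @ np.diag(w / De) @ H.T @ Dv_isqrt
    n = H.shape[0]
    # Closed-form minimizer of the regularized propagation objective.
    return np.linalg.solve(np.eye(n) - alpha * Theta, (1 - alpha) * y)
```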

    Video Registration in Egocentric Vision under Day and Night Illumination Changes

    With the spread of wearable devices and head-mounted cameras, a wide range of applications requiring precise user localization is now possible. In this paper, we propose to treat the problem of obtaining the user's position with respect to a known environment as a video registration problem. Video registration, i.e., the task of aligning an input video sequence to a pre-built 3D model, relies on matching local keypoints extracted from the query sequence to a 3D point cloud. The overall registration performance is strictly tied to the quality of this 2D-3D matching and can degrade under environmental conditions such as steep changes in lighting, like those between day and night. To effectively register an egocentric video sequence under these conditions, we propose to tackle the source of the problem: the matching process. To overcome the shortcomings of standard matching techniques, we introduce a novel embedding space that allows us to obtain robust matches by jointly taking into account local descriptors, their spatial arrangement, and their temporal robustness. The proposal is evaluated on unconstrained egocentric video sequences, both in terms of matching quality and resulting registration performance, using different 3D models of historical landmarks. The results show that the proposed method can outperform state-of-the-art registration algorithms, in particular when dealing with the challenges of night and day sequences.
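
    For contrast, a minimal version of the "standard matching techniques" the abstract refers to is appearance-only nearest-neighbor matching with Lowe's ratio test, sketched below. It uses descriptors alone and ignores the spatial and temporal cues the proposed embedding additionally exploits.

```python
# Appearance-only baseline: nearest-neighbor 2D-3D matching with a ratio test.
import numpy as np

def ratio_test_matches(desc_2d, desc_3d, ratio=0.8):
    """desc_2d: (M, D) query descriptors; desc_3d: (P, D) model descriptors
    (P >= 2); returns (query_idx, model_idx) pairs passing the test."""
    dists = np.linalg.norm(desc_2d[:, None, :] - desc_3d[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(desc_2d))
    keep = dists[rows, best] < ratio * dists[rows, second]
    return [(i, int(best[i])) for i in np.flatnonzero(keep)]
```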

    DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image

    3D reconstruction from a single image is a key problem in multiple applications ranging from robotic manipulation to augmented reality. Prior methods have tackled this problem through generative models which predict 3D reconstructions as voxels or point clouds. However, these methods can be computationally expensive and miss fine details. We introduce a new differentiable layer for 3D data deformation and use it in DeformNet to learn a model for 3D reconstruction-through-deformation. DeformNet takes an image as input, searches for the nearest shape template in a database, and deforms the template to match the query image. We evaluate our approach on the ShapeNet dataset and show that (a) the Free-Form Deformation layer is a powerful new building block for deep learning models that manipulate 3D data; (b) DeformNet uses this FFD layer combined with shape retrieval for smooth and detail-preserving 3D reconstruction of qualitatively plausible point clouds with respect to a single query image; and (c) compared to other state-of-the-art 3D reconstruction methods, DeformNet quantitatively matches or outperforms their benchmarks by significant margins. For more information, visit: https://deformnet-site.github.io/DeformNet-website/. Comment: 11 pages, 9 figures, NIPS
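
    To make the FFD idea concrete, below is a small sketch of classical free-form deformation: points in a unit cube are displaced by a lattice of control-point offsets weighted by Bernstein polynomials, so the output is differentiable in the offsets. The lattice size and layout here are illustrative assumptions, not the paper's exact layer.

```python
# Classical free-form deformation over a Bernstein control lattice
# (illustrative; lattice size l is an assumption, not the paper's choice).
import numpy as np
from math import comb

def ffd(points, delta, l=4):
    """points: (N, 3) in [0,1]^3; delta: (l+1, l+1, l+1, 3) control offsets."""
    def bernstein(k, t):
        return comb(l, k) * (t ** k) * ((1 - t) ** (l - k))
    out = points.copy()
    for i in range(l + 1):
        Bi = bernstein(i, points[:, 0])
        for j in range(l + 1):
            Bj = bernstein(j, points[:, 1])
            for k in range(l + 1):
                Bk = bernstein(k, points[:, 2])
                w = (Bi * Bj * Bk)[:, None]   # (N, 1) trivariate weight
                out += w * delta[i, j, k]     # displace by weighted offset
    return out
```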

    On Rearrangement of Items Stored in Stacks

    There are $n \ge 2$ stacks, each filled with $d$ items, and one empty stack. Every stack has capacity $d > 0$. A robot arm, in one stack operation (step), may pop one item from the top of a non-empty stack and subsequently push it onto a stack not at capacity. In a "labeled" problem, all $nd$ items are distinguishable and are initially randomly scattered in the $n$ stacks. The items must be rearranged using pop-and-pushes so that, in the end, the $k^{\rm th}$ stack holds items $(k-1)d+1, \ldots, kd$, in that order, from top to bottom, for all $1 \le k \le n$. In an "unlabeled" problem, the $nd$ items are of $n$ types, $d$ of each. The goal is to rearrange the items so that items of type $k$ are located in the $k^{\rm th}$ stack for all $1 \le k \le n$. In carrying out the rearrangement, a natural question is to find the least number of required pop-and-pushes. Our main contributions are: (1) an algorithm for restoring the order of $n^2$ items stored in an $n \times n$ table using only $2n$ column and row permutations, and its generalization, and (2) an algorithm with a guaranteed upper bound of $O(nd)$ steps for solving both versions of the stack rearrangement problem when $d \le \lceil cn \rceil$ for an arbitrary fixed positive number $c$. In terms of the required number of steps, the labeled and unlabeled versions have lower bounds of $\Omega(nd + nd\frac{\log d}{\log n})$ and $\Omega(nd)$, respectively.
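
    To make the operation model and the labeled goal condition concrete, here is a tiny simulator of the setup above (illustrative only; it is not the paper's $O(nd)$ algorithm):

```python
# Minimal simulator of the pop-and-push model and the labeled goal test.
def pop_and_push(stacks, src, dst, capacity):
    """One step: pop the top of a non-empty stack and push it onto a stack
    not at capacity. Stacks are lists with the top at the end."""
    assert stacks[src] and len(stacks[dst]) < capacity
    stacks[dst].append(stacks[src].pop())

def labeled_goal(stacks, n, d):
    # Stack k (1-indexed) must hold (k-1)d+1, ..., kd from top to bottom,
    # i.e. read bottom-to-top it is kd, kd-1, ..., (k-1)d+1.
    return all(stacks[k - 1] == list(range(k * d, (k - 1) * d, -1))
               for k in range(1, n + 1))
```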

    Survey of Object Detection Methods in Camouflaged Image

    Camouflage is an attempt to conceal the signature of a target object within a background image. Camouflage detection (decamouflaging) methods are used to detect foreground objects hidden in a background image. In this paper, the authors present a survey of camouflage detection methods for different applications and areas.