96 research outputs found

    Class Representative Visual Words for Category-Level Object Recognition

    Recent works in object recognition often use visual words, i.e. vector-quantized local descriptors extracted from the images. In this paper we present a novel method to build such a codebook with class-representative vectors. This method, coined Cluster Precision Maximization (CPM), is based on a new measure of cluster precision and on an optimization procedure that leads any clustering algorithm towards class-representative visual words. We compare our procedure with other measures of cluster precision and present the integration of a Reciprocal Nearest Neighbor (RNN) clustering algorithm in the CPM method. In the experiments, on a subset of the Caltech101 database, we analyze several vocabularies obtained with different local descriptors and different clustering algorithms, and we show that the vocabularies obtained with the CPM process perform best in a category-level object recognition system using a Support Vector Machine (SVM). © 2009 Springer Berlin Heidelberg. López Sastre R.J., Tuytelaars T., Maldonado Bascón S., "Class representative visual words for category-level object recognition", Lecture Notes in Computer Science, vol. 5524, 2009 (4th Iberian Conference on Pattern Recognition and Image Analysis - IbPRIA 2009, June 10-12, 2009, Póvoa de Varzim, Portugal).
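    To make the idea of a cluster-precision measure concrete, the sketch below scores a codebook by how class-pure each visual-word cluster is: for every cluster it counts the descriptors agreeing with the dominant object class and normalises by the total number of descriptors. This is an illustrative Python interpretation of "cluster precision", not the exact CPM criterion from the paper; cluster_labels and class_labels are hypothetical per-descriptor assignment arrays.

    import numpy as np
    from collections import Counter

    def cluster_precision(cluster_labels, class_labels):
        # Illustrative cluster-precision score (hypothetical form, not the
        # paper's exact CPM measure): for each visual-word cluster, count the
        # descriptors that share the cluster's dominant object class, then
        # normalise by the total number of descriptors.
        score = 0.0
        for c in np.unique(cluster_labels):
            members = class_labels[cluster_labels == c]
            score += Counter(members.tolist()).most_common(1)[0][1]
        return score / len(class_labels)

    # Toy usage: 6 descriptors, 2 clusters, 2 object classes
    cluster_labels = np.array([0, 0, 0, 1, 1, 1])
    class_labels = np.array([0, 0, 1, 1, 1, 1])
    print(cluster_precision(cluster_labels, class_labels))  # 0.833...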

    Siam R-CNN: Visual Tracking by Re-Detection

    We present Siam R-CNN, a Siamese re-detection architecture which unleashes the full power of two-stage object detection approaches for visual object tracking. We combine this with a novel tracklet-based dynamic programming algorithm, which takes advantage of re-detections of both the first-frame template and previous-frame predictions, to model the full history of both the object to be tracked and potential distractor objects. This enables our approach to make better tracking decisions, as well as to re-detect tracked objects after long occlusion. Finally, we propose a novel hard example mining strategy to improve Siam R-CNN's robustness to similar-looking objects. Siam R-CNN achieves the current best performance on ten tracking benchmarks, with especially strong results for long-term tracking. We make our code and models available at www.vision.rwth-aachen.de/page/siamrcnn. Comment: CVPR 2020 camera-ready version.
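    The tracklet-based dynamic programming idea can be illustrated with a minimal Viterbi-style selection of one detection per frame, shown below. It assumes two hypothetical inputs: det_scores, the similarity of each detection in each frame to the first-frame template (as a Siamese re-detector would provide), and trans_scores, a consistency score between detections in consecutive frames. This is a strong simplification of the paper's algorithm, sketched only to show how re-detection scores and frame-to-frame consistency can be combined by dynamic programming.

    import numpy as np

    def track_by_dp(det_scores, trans_scores):
        # det_scores[t][i]: template similarity of detection i in frame t
        # trans_scores[t][i][j]: consistency of detection j in frame t+1
        #                        with detection i in frame t (e.g. box overlap)
        # Returns one detection index per frame along the best-scoring chain.
        best = [np.asarray(det_scores[0], dtype=float)]
        back = []
        for t in range(1, len(det_scores)):
            prev = best[-1][:, None] + np.asarray(trans_scores[t - 1], dtype=float)
            back.append(prev.argmax(axis=0))          # best predecessor per detection
            best.append(prev.max(axis=0) + np.asarray(det_scores[t], dtype=float))
        path = [int(best[-1].argmax())]
        for t in range(len(det_scores) - 2, -1, -1):  # backtrack the best chain
            path.append(int(back[t][path[-1]]))
        return path[::-1]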

    HoughNet: Integrating Near and Long-Range Evidence for Bottom-Up Object Detection

    © 2020, Springer Nature Switzerland AG. This paper presents HoughNet, a one-stage, anchor-free, voting-based, bottom-up object detection method. Inspired by the Generalized Hough Transform, HoughNet determines the presence of an object at a certain location by the sum of the votes cast on that location. Votes are collected from both near and long-distance locations based on a log-polar vote field. Thanks to this voting mechanism, HoughNet is able to integrate both near and long-range, class-conditional evidence for visual recognition, thereby generalizing and enhancing current object detection methodology, which typically relies on only local evidence. On the COCO dataset, HoughNet's best model achieves 46.4 AP (and 65.1 AP50), performing on par with the state-of-the-art in bottom-up object detection and outperforming most major one-stage and two-stage methods. We further validate the effectiveness of our proposal in another task, namely, "labels to photo" image generation, by integrating the voting module of HoughNet into two different GAN models and showing that the accuracy is significantly improved in both cases. Code is available at https://github.com/nerminsamet/houghnet
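    The voting mechanism can be sketched as follows: every piece of evidence scatters weighted votes onto an accumulator map at log-polar offsets around its own position, so both nearby and distant locations can support an object hypothesis. This is a scatter-style toy illustration of log-polar voting, not HoughNet's actual implementation; evidence and votes_per_bin (a hypothetical per-bin vote weight) are assumed inputs.

    import numpy as np

    def accumulate_votes(evidence, votes_per_bin, H, W,
                         r_max=64.0, n_rings=5, n_angles=8):
        # evidence: iterable of (y, x, score); votes_per_bin: (n_rings, n_angles)
        # weights. Each evidence point casts score * votes_per_bin[ring, angle]
        # onto the accumulator cell at the centre of that log-polar bin.
        acc = np.zeros((H, W), dtype=np.float32)
        ring_edges = np.logspace(0.0, np.log10(r_max), n_rings + 1)  # log-spaced radii
        for y, x, score in evidence:
            for r_idx in range(n_rings):
                r = 0.5 * (ring_edges[r_idx] + ring_edges[r_idx + 1])
                for a_idx in range(n_angles):
                    theta = 2.0 * np.pi * a_idx / n_angles
                    ty, tx = int(y + r * np.sin(theta)), int(x + r * np.cos(theta))
                    if 0 <= ty < H and 0 <= tx < W:
                        acc[ty, tx] += score * votes_per_bin[r_idx, a_idx]
        return acc  # peaks indicate likely object locations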

    Accurate Single Image Multi-Modal Camera Pose Estimation

    A well-known problem in photogrammetry and computer vision is the precise and robust determination of camera poses with respect to a given 3D model. In this work we propose a novel multi-modal method for single image camera pose estimation with respect to 3D models with intensity information (e.g., LiDAR data with reflectance information). We utilize a direct point based rendering approach to generate synthetic 2D views from 3D datasets in order to bridge the dimensionality gap. The proposed method then establishes 2D/2D point and local region correspondences based on a novel self-similarity distance measure. Correct correspondences are robustly identified by searching for small regions with a similar geometric relationship of local self-similarities using a Generalized Hough Transform. After backprojection of the generated features into 3D, a standard Perspective-n-Point (PnP) problem is solved to yield an initial camera pose. The pose is then accurately refined using an intensity based 2D/3D registration approach. An evaluation on Vis/IR 2D and airborne and terrestrial 3D datasets shows that the proposed method is applicable to a wide range of different sensor types. In addition, the approach outperforms standard global multi-modal 2D/3D registration approaches based on Mutual Information with respect to robustness and speed. Potential applications are widespread and include, for instance, multispectral texturing of 3D models, SLAM applications, sensor data fusion, and multi-spectral camera calibration and super-resolution applications.
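    The PnP step of this pipeline is a standard one and can be sketched with OpenCV, as below. The sketch assumes the 2D/3D correspondences have already been established and backprojected as described; the self-similarity matching, Generalized Hough verification, and intensity-based 2D/3D refinement from the paper are not reproduced here.

    import numpy as np
    import cv2  # OpenCV

    def initial_pose_from_correspondences(pts3d, pts2d, K, dist=None):
        # pts3d: Nx3 model points, pts2d: Nx2 image points, K: 3x3 intrinsics.
        # Robustly estimates an initial camera pose from the correspondences.
        dist = np.zeros(5) if dist is None else dist
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(
            np.asarray(pts3d, dtype=np.float32),
            np.asarray(pts2d, dtype=np.float32),
            K, dist, flags=cv2.SOLVEPNP_ITERATIVE)
        if not ok:
            raise RuntimeError("PnP failed")
        R, _ = cv2.Rodrigues(rvec)  # rotation matrix from the Rodrigues vector
        return R, tvec, inliers     # initial pose, to be refined by 2D/3D registration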

    Using Multi-view Recognition and Meta-data Annotation to Guide a Robot's Attention

    In the transition from industrial to service robotics, robots will have to deal with increasingly unpredictable and variable environments. We present a system that is able to recognize objects of a certain class in an image and to identify their parts for potential interactions. The method can recognize objects from arbitrary viewpoints and generalizes to instances that have never been observed during training, even if they are partially occluded and appear against cluttered backgrounds. Our approach builds on the implicit shape model of Leibe et al. We extend it to couple recognition to the provision of meta-data.
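    The underlying implicit-shape-model idea can be illustrated with a tiny voting routine: each matched codebook entry casts votes for the object centre at its feature location plus the offsets stored for that visual word, and peaks in the accumulator become object hypotheses. This is a toy sketch in the spirit of Leibe et al.'s ISM, not the extended multi-view, meta-data-annotated system described above; matches is a hypothetical list of (feature position, learned offsets, weight) triples.

    import numpy as np

    def ism_center_votes(matches, H, W):
        # matches: iterable of ((fy, fx), offsets, weight), where offsets are
        # the centre displacements learned for the matched visual word.
        acc = np.zeros((H, W), dtype=np.float32)
        for (fy, fx), offsets, weight in matches:
            for dy, dx in offsets:
                cy, cx = int(fy + dy), int(fx + dx)
                if 0 <= cy < H and 0 <= cx < W:
                    acc[cy, cx] += weight  # soft vote for an object centre
        return acc  # local maxima are object-centre hypotheses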

    Occlusion and Motion Reasoning for Long-Term Tracking

    Object tracking is a recurring problem in computer vision. Tracking-by-detection approaches, in particular Struck (Hare et al., 2011), have been shown to be competitive in recent evaluations. However, such approaches fail in the presence of long-term occlusions as well as severe viewpoint changes of the object. In this paper we propose a principled way to combine occlusion and motion reasoning with a tracking-by-detection approach. Occlusion and motion reasoning is based on state-of-the-art long-term trajectories which are labeled as object or background tracks with an energy-based formulation. The overlap between labeled tracks and detected regions makes it possible to identify occlusions. The motion changes of the object between consecutive frames can be estimated robustly from the geometric relation between object trajectories. If this geometric change is significant, an additional detector is trained. Experimental results show that our tracker obtains state-of-the-art results and handles occlusions and viewpoint changes better than competing tracking methods.
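    The overlap test between labelled trajectories and a detection can be illustrated as follows: compute the fraction of object-labelled trajectory points that fall inside the current detection box at frame t, where a low fraction suggests the target is occluded. This is a hypothetical helper, simplified from the energy-based track labelling and overlap reasoning described in the paper; object_tracks is assumed to map frame indices to (x, y) positions.

    def occlusion_score(object_tracks, box, t):
        # object_tracks: list of dicts {frame: (x, y)} for tracks labelled as object
        # box: (x0, y0, x1, y1) detection at frame t
        x0, y0, x1, y1 = box
        pts = [trk[t] for trk in object_tracks if t in trk]
        if not pts:
            return 0.0  # no object points visible at this frame
        inside = sum(1 for (x, y) in pts if x0 <= x <= x1 and y0 <= y <= y1)
        return inside / len(pts)  # low values indicate (partial) occlusion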