Object Detection Through Exploration With A Foveated Visual Field

Author: Emre Akbas
Miguel P. Eckstein
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/10/2017

We present a foveated object detector (FOD) as a biologically inspired alternative to the sliding window (SW) approach, which is the dominant search method in computer vision object detection. Like the human visual system, the FOD has higher resolution at the fovea and lower resolution at the visual periphery. Consequently, more computational resources are allocated at the fovea and relatively fewer at the periphery. The FOD processes the entire scene, uses retino-specific object detection classifiers to guide eye movements, aligns its fovea with regions of interest in the input image, and integrates observations across multiple fixations. Our approach combines modern object detectors from computer vision with a recent model of peripheral pooling regions found at the V1 layer of the human visual system. We assessed various eye movement strategies on the PASCAL VOC 2007 dataset and show that the FOD performs on par with the SW detector while bringing significant computational cost savings. Comment: An extended version of this manuscript was published in PLOS Computational Biology (October 2017) at https://doi.org/10.1371/journal.pcbi.100574

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

OpenMETU (Middle East Technical University)

Object Detection: Current and Future Directions

Author: Javier Ruiz-del-Solar
Rodrigo Verschae
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2015

Frontiers - Publisher Connector

Data Decomposition and Spatial Mixture Modeling for Part Based Model

Author: Junge Zhang
Kaiqi Huang
Tieniu Tan
Yongzhen Huang
Zifeng Wu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Abstract. This paper presents a system of data decomposition and spatial mixture modeling for part based models. Recently, many enhanced part based models (e.g., with multiple features, more components, or more parts) have been proposed. Nevertheless, those enhanced models bring high computation cost together with the risk of over-fitting. To tackle this problem, we propose a data decomposition method for part based models which not only accelerates the training and testing processes but also improves performance on average. In addition, the original part based model uses a strictly rigid structural model to describe the distribution of each part location. It is not "deformable" enough, especially for instances with different viewpoints or poses in the same aspect ratio. To address this problem, we present a novel spatial mixture modeling method. The spatial mixture embedded model is then integrated into the proposed data decomposition framework. We evaluate our system on the challenging PASCAL VOC2007 and PASCAL VOC2010 datasets, demonstrating state-of-the-art performance compared with other related methods in terms of accuracy and efficiency.

CiteSeerX

Crossref

A neural implementation of the Hough transform and the advantages of explaining away

Author: Spratling M.W.
Publication venue
Publication date: 01/08/2016

Crossref

King's Research Portal

Characterizing Objects in Images using Human Context

Author: Srikantha Abhilash
Publication venue: Universitäts- und Landesbibliothek Bonn
Publication date

Humans have an unmatched capability of interpreting detailed information about existent objects by just looking at an image. In particular, they can effortlessly perform the following tasks: 1) localizing various objects in the image and 2) assigning functionalities to the parts of localized objects. This dissertation addresses the problem of helping vision systems accomplish these two goals. The first part of the dissertation concerns object detection in a Hough-based framework. To this end, the independence assumption between features is addressed by grouping them in a local neighborhood. We study the complementary nature of individual and grouped features and combine them to achieve improved performance. Further, we consider the challenging case of detecting small- and medium-sized household objects under human-object interactions. We first evaluate appearance-based star and tree models. While the tree model is slightly better, appearance-based methods continue to suffer due to deficiencies caused by human interactions. To this end, we successfully incorporate automatically extracted human pose as a form of context for object detection. The second part of the dissertation addresses the tedious process of manually annotating objects to train fully supervised detectors. We observe that videos of human-object interactions with activity labels can serve as weakly annotated examples of household objects. Since such objects cannot be localized only through appearance or motion, we propose a framework that includes human-centric functionality to retrieve the common object. Designed to maximize data utility by detecting multiple instances of an object per video, the framework achieves performance comparable to its fully supervised counterpart. The final part of the dissertation concerns localizing functional regions or affordances within objects by casting the problem as that of semantic image segmentation.
To this end, we introduce a dataset involving human-object interactions with strong (i.e., pixel-level) and weak (i.e., click-point and image-level) affordance annotations. We propose a framework that utilizes both forms of weak labels and demonstrate that efforts for weak annotation can be further optimized using human context.

bonndoc – Der Publikationsserver der Universität Bonn

Object Detection Using Hough Transform

Author: Chroboczek Martin
Publication venue: Vysoké učení technické v Brně. Fakulta informačních technologií
Publication date: 01/01/2014

This diploma thesis deals with object detection using a mathematical technique called the Hough transform. The technique is treated in general terms, from its simplest use for detecting elementary, analytically describable shapes such as lines, ellipses, and circles, to its sophisticated use for detecting complex objects that are practically impossible to describe analytically. These include cars and pedestrians, which are detected on the basis of photographs of such objects. The document thus maps the definitions and uses of the individual Hough transform subtechniques, along with their basic classification into probabilistic and non-probabilistic methods.
The work culminates in a description of the general state-of-the-art method called Class-Specific Hough Forests for Object Detection, covering its definition, its training procedure on a provided dataset, and detection on test images. Finally, a generally trainable object detector using this technique is designed and implemented, and its success rate is evaluated experimentally.

Digital library of Brno University of Technology

National Repository of Grey Literature

Context-driven Object Detection and Segmentation with Auxiliary Information

Author: Wang Tao
Publication venue
Publication date

One fundamental problem in computer vision and robotics is to localize objects of interest in an image. The task can be formulated either as an object detection problem, if the objects are described by a set of pose parameters, or as an object segmentation problem, if we recover the object boundary precisely. A key issue in object detection and segmentation concerns exploiting the spatial context, as local evidence is often insufficient to determine object pose in the presence of heavy occlusions or large object appearance variations. This thesis addresses the object detection and segmentation problem in such adverse conditions with auxiliary depth data provided by RGBD cameras. We focus on four main issues in context-aware object detection and segmentation: 1) what are the effective context representations? 2) how can we work with limited and imperfect depth data? 3) how can we design depth-aware features and integrate depth cues into conventional visual inference tasks? 4) how can we make use of unlabeled data to relax the labeling requirements for training data? We discuss three object detection and segmentation scenarios based on varying amounts of available auxiliary information. In the first case, depth data are available for model training but not for testing. We propose a structured Hough voting method for detecting objects with heavy occlusion in indoor environments, in which we extend the Hough hypothesis space to include both the object's location and its visibility pattern. We design a new score function that accumulates votes for object detection and occlusion prediction. In addition, we explore the correlation between objects and their environment, building a depth-encoded object-context model based on RGBD data. In the second case, we address the problem of localizing glass objects with noisy and incomplete depth data.
Our method integrates the intensity and depth information from a single viewpoint, and builds a Markov Random Field that predicts glass boundary and region jointly. In addition, we propose a nonparametric, data-driven label transfer scheme for local glass boundary estimation. A weighted voting scheme based on a joint feature manifold is adopted to integrate depth and appearance cues, and we learn a distance metric on the depth-encoded feature manifold. In the third case, we make use of unlabeled data to relax the annotation requirements for object detection and segmentation, and propose a novel data-dependent margin distribution learning criterion for boosting, which utilizes the intrinsic geometric structure of datasets. One key aspect of this method is that it can seamlessly incorporate unlabeled data by including a graph Laplacian regularizer. We demonstrate the performance of our models and compare with baseline methods on several real-world object detection and segmentation tasks, including indoor object detection, glass object segmentation, and foreground segmentation in video.

The Australian National University