Learned Monocular Depth Priors in Visual-Inertial Initialization
Visual-inertial odometry (VIO) is the pose estimation backbone for most AR/VR
and autonomous robotic systems today, in both academia and industry. However,
these systems are highly sensitive to the initialization of key parameters such
as sensor biases, gravity direction, and metric scale. In practical scenarios
where high-parallax or variable acceleration assumptions are rarely met (e.g.
hovering aerial robot, smartphone AR user not gesticulating with phone),
classical visual-inertial initialization formulations often become
ill-conditioned and/or fail to meaningfully converge. In this paper we target
visual-inertial initialization specifically for these low-excitation scenarios
critical to in-the-wild usage. We propose to circumvent the limitations of
classical visual-inertial structure-from-motion (SfM) initialization by
incorporating a new learning-based measurement as a higher-level input. We
leverage learned monocular depth images (mono-depth) to constrain the relative
depth of features, and upgrade the mono-depth to metric scale by jointly
optimizing for its scale and shift. Our experiments show a significant
improvement in problem conditioning compared to a classical formulation for
visual-inertial initialization, and demonstrate significant accuracy and
robustness improvements relative to the state-of-the-art on public benchmarks,
particularly under motion-restricted scenarios. We further integrate our
improved initialization method into an existing odometry system to
illustrate its impact on the resulting tracking trajectories.
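The scale-and-shift upgrade described above can be illustrated with a minimal sketch. The paper jointly optimizes scale and shift inside the initialization problem; here, purely for illustration, the affine relation d_metric ≈ s·d_mono + t is fit in isolation by least squares over a set of features with known metric depth. The function name and inputs are assumptions, not the authors' implementation.

```python
import numpy as np

def fit_scale_shift(mono_depth, metric_depth):
    """Least-squares affine fit: metric_depth ~ s * mono_depth + t.

    Illustrative only -- sketches the scale/shift upgrade of learned
    mono-depth to metric scale, outside any joint optimization.
    """
    # Design matrix [d_mono, 1] so the solution is (s, t).
    A = np.stack([mono_depth, np.ones_like(mono_depth)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, metric_depth, rcond=None)
    return s, t
```

Once s and t are estimated, every mono-depth value can be mapped to metric scale, which is what lets the learned depth constrain the relative depth of features during initialization.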
Inductive learning spatial attention
This paper investigates the automatic induction of spatial attention
from the visual observation of objects manipulated
on a table top. In this work, space is represented in terms of
a novel observer-object relative reference system, named Local
Cardinal System, defined upon the local neighbourhood
of objects on the table. We present results of applying the
proposed methodology to five distinct scenarios involving
the construction of spatial patterns of coloured blocks.
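The abstract only names the Local Cardinal System without defining it. As a loose illustration of an observer-object relative reference frame (the four-sector binning, the function name, and all parameters here are assumptions, not the paper's definition), one can bin a neighbouring object's direction, measured relative to the observer-to-reference axis, into cardinal sectors:

```python
import math

def local_cardinal(ref_xy, obj_xy, observer_xy):
    """Bin the direction from a reference object to a neighbour into one of
    four sectors, relative to the observer->reference axis.

    Hypothetical sketch of an observer-object relative frame, not the
    paper's Local Cardinal System.
    """
    # Orientation of the observer->reference axis.
    axis = math.atan2(ref_xy[1] - observer_xy[1], ref_xy[0] - observer_xy[0])
    # Direction reference->object, expressed in that axis frame.
    ang = math.atan2(obj_xy[1] - ref_xy[1], obj_xy[0] - ref_xy[0]) - axis
    ang = (ang + math.pi) % (2 * math.pi) - math.pi  # wrap to (-pi, pi]
    sectors = ["front", "left", "back", "right"]
    idx = int(((ang + math.pi / 4) % (2 * math.pi)) // (math.pi / 2)) % 4
    return sectors[idx]
```

A relation vocabulary of this kind is the sort of representation over which spatial attention patterns could be induced from observed block manipulations.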
Linear vs. Nonlinear Feature Combination for Saliency Computation: A Comparison with Human Vision
Modelling Visual Search with the Selective Attention for Identification Model (VS-SAIM): A Novel Explanation for Visual Search Asymmetries
In earlier work, we developed the Selective Attention for Identification Model (SAIM [16]). SAIM models the human ability to perform translation-invariant object identification in multiple-object scenes. SAIM suggests that central to this ability is an interaction between parallel competitive processes in a selection stage and an object identification stage. In this paper, we applied the model to visual search experiments involving simple lines and letters. We present successful simulation results for asymmetric and symmetric searches and for the influence of background line orientations. Search asymmetry refers to changes in search performance when the roles of the target item and non-target item (distractor) are swapped. In line with other models of visual search, the results suggest that a large part of the empirical evidence can be explained by competitive processes in the brain, which are modulated by the similarity between target and distractor. The simulations also suggest that another important factor is the feature properties of the distractors. Finally, the simulations indicate that search asymmetries can be the outcome of interactions between top-down (knowledge about search items) and bottom-up (features of search items) processing. This interaction in VS-SAIM is dominated by a novel mechanism, the knowledge-based on-centre-off-surround receptive field. This receptive field is reminiscent of classical receptive fields, but its exact shape is modulated by both top-down and bottom-up processes. The paper discusses supporting evidence for the existence of this novel concept.
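VS-SAIM's receptive field is knowledge-modulated, which is not reproduced here; but the plain on-centre-off-surround profile it is reminiscent of can be sketched as a difference of Gaussians, purely to illustrate the shape the abstract refers to (sizes and sigmas are arbitrary choices, not model parameters):

```python
import numpy as np

def dog_kernel(size=9, sigma_c=1.0, sigma_s=2.5):
    """Difference-of-Gaussians kernel: excitatory centre, inhibitory surround.

    Illustrative classical profile only; in VS-SAIM the shape is further
    modulated by top-down knowledge, which this sketch omits.
    """
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    # Normalized Gaussians: narrow centre minus broad surround.
    centre = np.exp(-r2 / (2 * sigma_c**2)) / (2 * np.pi * sigma_c**2)
    surround = np.exp(-r2 / (2 * sigma_s**2)) / (2 * np.pi * sigma_s**2)
    return centre - surround
```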
The time course of exogenous and endogenous control of covert attention
Studies of eye movements and manual response have established that rapid overt selection is largely exogenously driven toward salient stimuli, whereas slower selection is largely endogenously driven to relevant objects. We use the N2pc, an event-related potential index of covert attention, to demonstrate that this time course reflects an underlying pattern in the deployment of covert attention. We find that shifts of attention that occur soon after the onset of a visual search array are directed toward salient, task-irrelevant visual stimuli and are associated with slow responses to the target. In contrast, slower shifts are target-directed and are associated with fast responses. The time course of exogenous and endogenous control provides a framework in which some inconsistent results in the capture literature might be reconciled; capture may occur when attention is rapidly deployed.
Investigation of the nanostructure and wear properties of physical vapor deposited CrCuN nanocomposite coatings
We get the algorithms of our ground truths: Designing referential databases in digital image processing.
This article documents the practical efforts of a group of scientists designing an image-processing algorithm for saliency detection. By following the actors of this computer science project, the article shows that the problems often considered to be the starting points of computational models are in fact provisional results of time-consuming, collective and highly material processes that engage habits, desires, skills and values. In the project being studied, problematization processes lead to the constitution of referential databases called 'ground truths' that enable both the effective shaping of algorithms and the evaluation of their performances. Working as important common touchstones for research communities in image processing, the ground truths are inherited from prior problematization processes and may be imparted to subsequent ones. The ethnographic results of this study suggest two complementary analytical perspectives on algorithms: (1) an 'axiomatic' perspective that understands algorithms as sets of instructions designed to solve given problems computationally in the best possible way, and (2) a 'problem-oriented' perspective that understands algorithms as sets of instructions designed to computationally retrieve outputs designed and designated during specific problematization processes. If the axiomatic perspective on algorithms puts the emphasis on the numerical transformations of inputs into outputs, the problem-oriented perspective puts the emphasis on the definition of both inputs and outputs.
Mechanical and high pressure tribological properties of nanocrystalline Ti(N,C) and amorphous C:H nanocomposite coatings
Influence of Low-Level Stimulus Features, Task Dependent Factors, and Spatial Biases on Overt Visual Attention
Visual attention is thought to be driven by the interplay between low-level visual features and task dependent information content of local image regions, as well as by spatial viewing biases. Though dependent on experimental paradigms and model assumptions, this idea has given rise to varying claims that either bottom-up or top-down mechanisms dominate visual attention. To contribute toward a resolution of this discussion, here we quantify the influence of these factors and their relative importance in a set of classification tasks. Our stimuli consist of individual image patches (bubbles). For each bubble we derive three measures: a measure of salience based on low-level stimulus features, a measure of salience based on the task dependent information content derived from our subjects' classification responses and a measure of salience based on spatial viewing biases. Furthermore, we measure the empirical salience of each bubble based on our subjects' measured eye gazes thus characterizing the overt visual attention each bubble receives. A multivariate linear model relates the three salience measures to overt visual attention. It reveals that all three salience measures contribute significantly. The effect of spatial viewing biases is highest and rather constant in different tasks. The contribution of task dependent information is a close runner-up. Specifically, in a standardized task of judging facial expressions it scores highly. The contribution of low-level features is, on average, somewhat lower. However, in a prototypical search task, without an available template, it makes a strong contribution on par with the two other measures. Finally, the contributions of the three factors are only slightly redundant, and the semi-partial correlation coefficients are only slightly lower than the coefficients for full correlations. 
These data provide evidence that all three measures make significant and independent contributions and that none can be neglected in a model of human overt visual attention.
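The semi-partial correlations mentioned above can be computed by correlating the criterion (empirical salience) with the residual of one predictor after regressing it on the other predictors. A minimal sketch follows; the variable names and the plain least-squares machinery are assumptions, not the authors' analysis code.

```python
import numpy as np

def semi_partial_corr(y, X, i):
    """Semi-partial correlation of y with column i of X.

    Correlates y with the part of X[:, i] not explained (linearly)
    by the remaining predictors -- the quantity used to judge how
    redundant the three salience measures are.
    """
    others = np.delete(X, i, axis=1)
    # Regress predictor i on the other predictors (plus intercept).
    A = np.column_stack([others, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, X[:, i], rcond=None)
    resid = X[:, i] - A @ coef
    return np.corrcoef(y, resid)[0, 1]
```

With three weakly correlated predictors, these values stay close to the full correlations, which is exactly the "only slightly redundant" pattern the abstract reports.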