Gaussian Processes with Context-Supported Priors for Active Object Localization
We devise an algorithm using a Bayesian optimization framework in conjunction
with contextual visual data for the efficient localization of objects in still
images. Recent research has demonstrated substantial progress in object
localization and related tasks for computer vision. However, many current
state-of-the-art object localization procedures still suffer from inaccuracy
and inefficiency, in addition to failing to provide a principled and
interpretable system amenable to high-level vision tasks. We address these
issues with the current research.
Our method encompasses an active search procedure that uses contextual data
to generate initial bounding-box proposals for a target object. We train a
convolutional neural network to approximate an offset distance from the target
object. Next, we use a Gaussian Process to model this offset response signal
over the search space of the target. We then employ a Bayesian active search
for accurate localization of the target.
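The search loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the trained CNN is replaced by a hypothetical `simulated_offset` function that returns a normalized distance to a known target, the contextual prior is reduced to a single hypothetical `prior_centre`, and the acquisition rule (a lower confidence bound with weight `kappa`) is an assumption, since the abstract does not specify one. All function names, the grid spacing, and the kernel settings are chosen purely for illustration.

```python
import numpy as np

SIGNAL_VAR = 1.0  # prior variance of the GP (assumed value)


def simulated_offset(points, target):
    """Stand-in for the CNN: normalized distance from each point to the target."""
    return np.linalg.norm(points - target, axis=1) / 100.0


def rbf_kernel(a, b, length_scale=15.0):
    """Squared-exponential kernel between two sets of 2-D points."""
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return SIGNAL_VAR * np.exp(-0.5 * sq_dists / length_scale ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-3):
    """Posterior mean and standard deviation of a GP with a constant mean."""
    y_mean = y_train.mean()
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = y_mean + k_tq.T @ alpha
    v = np.linalg.solve(chol, k_tq)
    var = np.clip(SIGNAL_VAR - np.sum(v ** 2, axis=0), 0.0, None)
    return mean, np.sqrt(var)


def active_search(target, prior_centre, n_init=5, n_iter=15, kappa=2.0, seed=0):
    """Query the offset signal where the GP's lower confidence bound is smallest."""
    rng = np.random.default_rng(seed)
    axis = np.arange(0.0, 100.0, 5.0)
    grid = np.array([(x, y) for x in axis for y in axis])

    # Initial proposals: samples scattered around the context-derived centre.
    x_obs = np.clip(prior_centre + rng.normal(scale=20.0, size=(n_init, 2)), 0.0, 95.0)
    y_obs = simulated_offset(x_obs, target)

    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        next_point = grid[np.argmin(mean - kappa * std)]
        x_obs = np.vstack([x_obs, next_point])
        y_obs = np.append(y_obs, simulated_offset(next_point[None, :], target))

    best = np.argmin(y_obs)
    return x_obs[best], y_obs[best]


if __name__ == "__main__":
    location, offset = active_search(
        target=np.array([60.0, 35.0]), prior_centre=np.array([40.0, 50.0])
    )
    print("best location:", location, "normalized offset:", offset)
```

Each iteration refits the Gaussian Process to all offsets observed so far and picks the next candidate by trading off a low predicted offset against high uncertainty. In a real system, `simulated_offset` would be replaced by the network's prediction for the image window at each queried location.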
In experiments, we compare our approach to a state-of-the-art bounding-box
regression method for a challenging pedestrian localization task. Our method
exhibits a substantial improvement over this baseline regression method.
Comment: 10 pages, 4 figures
The cybernetic Bayesian brain: from interoceptive inference to sensorimotor contingencies
Is there a single principle by which neural operations can account for perception, cognition, action, and even consciousness? A strong candidate is now taking shape in the form of "predictive processing". On this theory, brains engage in predictive inference on the causes of sensory inputs by continuous minimization of prediction errors or informational "free energy". Predictive processing can account, supposedly, not only for perception, but also for action and for the essential contribution of the body and environment in structuring sensorimotor interactions. In this paper I draw together some recent developments within predictive processing that involve predictive modelling of internal physiological states (interoceptive inference), and integration with "enactive" and "embodied" approaches to cognitive science (predictive perception of sensorimotor contingencies). The upshot is a development of predictive processing that originates, not in Helmholtzian perception-as-inference, but rather in 20th-century cybernetic principles that emphasized homeostasis and predictive control. This way of thinking leads to (i) a new view of emotion as active interoceptive inference; (ii) a common predictive framework linking experiences of body ownership, emotion, and exteroceptive perception; (iii) distinct interpretations of active inference as involving disruptive and disambiguatory (not just confirmatory) actions to test perceptual hypotheses; (iv) a neurocognitive operationalization of the "mastery of sensorimotor contingencies" (where sensorimotor contingencies reflect the rules governing sensory changes produced by various actions); and (v) an account of the sense of subjective reality of perceptual contents ("perceptual presence") in terms of the extent to which predictive models encode potential sensorimotor relations (this being "counterfactual richness").
This is rich and varied territory, and surveying its landmarks emphasizes the need for experimental tests of its key contributions.
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM) consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM? and
Is SLAM solved?
Predictive Coding as a Model of Biased Competition in Visual Attention
Attention acts, through cortical feedback pathways, to enhance the response of cells encoding expected or predicted information. Such observations are inconsistent with the predictive coding theory of cortical function, which proposes that feedback acts to suppress information predicted by higher-level cortical regions. Despite this discrepancy, this article demonstrates that the predictive coding model can be used to simulate a number of the effects of attention. This is achieved via a simple mathematical rearrangement of the predictive coding model, which allows it to be interpreted as a form of biased competition model. Nonlinear extensions to the model are proposed that enable it to explain a wider range of data.
Negative Results in Computer Vision: A Perspective
A negative result is when the outcome of an experiment or a model is not what
is expected or when a hypothesis does not hold. Despite being often overlooked
in the scientific community, negative results are results and they carry value.
While this topic has been extensively discussed in other fields such as social
sciences and biosciences, less attention has been paid to it in the computer
vision community. The unique characteristics of computer vision, particularly
its experimental aspect, call for a special treatment of this matter. In this
paper, I will address what makes negative results important, how they should be
disseminated and incentivized, and what lessons can be learned from cognitive
vision research in this regard. Further, I will discuss issues such as computer
vision and human vision interaction, experimental design and statistical
hypothesis testing, explanatory versus predictive modeling, performance
evaluation, model comparison, as well as computer vision research culture.