Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
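As a rough illustration of the event stream described above (time, location and sign of a brightness change), the following minimal sketch emulates events from ordinary frames. It assumes the commonly used model in which a pixel fires when its log brightness has changed by at least a contrast threshold since its last event. The function name `events_from_frames`, the threshold value and the `eps` offset are hypothetical choices for this example; a real sensor fires asynchronously rather than at frame times.

```python
import numpy as np

def events_from_frames(frames, timestamps, threshold=0.2, eps=1e-3):
    """Emulate an event stream from a sequence of intensity frames.

    Returns a list of (t, x, y, polarity) tuples. A pixel emits an event
    when its log brightness differs from the value stored at its last
    event by at least `threshold` (an assumed contrast threshold).
    """
    # Per-pixel reference log brightness, initialised from the first frame.
    log_ref = np.log(np.asarray(frames[0], dtype=np.float64) + eps)
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_now = np.log(np.asarray(frame, dtype=np.float64) + eps)
        diff = log_now - log_ref
        ys, xs = np.nonzero(np.abs(diff) >= threshold)
        for y, x in zip(ys, xs):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((t, int(x), int(y), polarity))
            # Only pixels that fired update their reference level.
            log_ref[y, x] = log_now[y, x]
    return events
```

Pixels whose brightness does not change produce no output at all, which is where the sparsity of the event stream comes from.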
Dynamic Estimation of Rigid Motion from Perspective Views via Recursive Identification of Exterior Differential Systems with Parameters on a Topological Manifold
We formulate the problem of estimating the motion of a rigid object viewed under perspective projection as the identification of a dynamic model in Exterior Differential form with parameters on a topological manifold.
We first describe a general method for recursive identification of nonlinear implicit systems using prediction error criteria. The parameters are allowed to move slowly on some topological (not necessarily smooth) manifold. The basic recursion is solved in two different ways: one is based on a simple extension of the traditional Kalman Filter to nonlinear and implicit measurement constraints, the other may be regarded as a generalized "Gauss-Newton" iteration, akin to traditional Recursive Prediction Error Method techniques in linear identification. A derivation of the "Implicit Extended Kalman Filter" (IEKF) is reported in the appendix.
The ID framework is then applied to solving the visual motion problem: it is indeed possible to characterize it in terms of identification of an Exterior Differential System with parameters living on a C0 topological manifold, called the "essential manifold". We consider two alternative estimation paradigms. The first is in the local coordinates of the essential manifold: we estimate the state of a nonlinear implicit model on a linear space. The second is obtained by a linear update on the (linear) embedding space followed by a projection onto the essential manifold. These schemes proved successful in performing the motion estimation task, as we show in experiments on real and noisy synthetic image sequences.
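The projection step of the second paradigm can be sketched as follows. This is an assumption about one standard way to do it, not necessarily the authors' procedure: a 3x3 matrix from the linear embedding space is mapped to the nearest normalized essential matrix by forcing its singular values to (1, 1, 0). The function name `project_to_essential` is hypothetical.

```python
import numpy as np

def project_to_essential(M):
    """Project a 3x3 matrix onto the set of normalized essential matrices.

    Keeps the singular vectors of M and replaces the singular values
    with (1, 1, 0), the signature of a normalized essential matrix.
    """
    U, _, Vt = np.linalg.svd(np.asarray(M, dtype=np.float64))
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt
```

After a linear filter update moves the estimate off the manifold, applying this projection returns it to a valid essential matrix before the next prediction step.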
Interaction between high-level and low-level image analysis for semantic video object extraction
Authors of articles published in EURASIP Journal on Advances in Signal Processing are the copyright holders of their articles and have granted to any third party, in advance and in perpetuity, the right to use, reproduce or disseminate the article, according to the SpringerOpen copyright and license agreement (http://www.springeropen.com/authors/license)
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
Deep Functional Maps: Structured Prediction for Dense Shape Correspondence
We introduce a new framework for learning dense correspondence between
deformable 3D shapes. Existing learning-based approaches model shape
correspondence as a labelling problem, where each point of a query shape
receives a label identifying a point on some reference domain; the
correspondence is then constructed a posteriori by composing the label
predictions of two input shapes. We propose a paradigm shift and design a
structured prediction model in the space of functional maps, linear operators
that provide a compact representation of the correspondence. We model the
learning process via a deep residual network which takes dense descriptor
fields defined on two shapes as input, and outputs a soft map between the two
given objects. The resulting correspondence is shown to be accurate on several
challenging benchmarks comprising multiple categories, synthetic models, real
scans with acquisition artifacts, topological noise, and partiality. Comment: Accepted for publication at ICCV 201
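To make the notion of a functional map as a linear operator concrete, the sketch below fits a matrix C that maps descriptor coefficients on one shape to those on another in the least-squares sense. This is the classical functional-map estimation step, shown here as an assumption about the setting; it is not the deep residual network of the paper. The names `fit_functional_map`, `A` and `B` are hypothetical: `A` and `B` hold descriptor coefficients in each shape's basis, one descriptor per column.

```python
import numpy as np

def fit_functional_map(A, B):
    """Least-squares functional map C such that C @ A approximates B.

    A: (k, q) descriptor coefficients on the source shape.
    B: (k, q) descriptor coefficients on the target shape.
    Returns C with shape (k, k).
    """
    # C @ A = B is equivalent to A.T @ C.T = B.T, which lstsq solves.
    C_T, *_ = np.linalg.lstsq(A.T, B.T, rcond=None)
    return C_T.T
```

Because C is only k x k for a basis of size k, the correspondence is represented far more compactly than a point-to-point labelling.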
Digital twins that learn and correct themselves
Digital twins can be defined as digital representations of physical entities that employ real-time data to enable understanding of the operating conditions of these entities. Here we present a particular type of digital twin that involves a combination of computer vision, scientific machine learning, and augmented reality. This novel digital twin is therefore able to see, to interpret what it sees (and, if necessary, to correct the model it is equipped with), and to present the resulting information in the form of augmented reality. The computer vision capabilities allow the twin to receive data continuously. Like any other digital twin, it is equipped with one or more models so as to assimilate data. However, if persistent deviations from the predicted values are found, the proposed methodology is able to correct the existing models on the fly, so as to accommodate them to the measured reality. Finally, the suggested methodology is completed with augmented reality capabilities so as to render a completely new type of digital twin. These concepts are tested against a proof-of-concept model consisting of a nonlinear, hyperelastic beam subjected to moving loads whose exact position is to be determined.
A Flexible Framework for Designing Trainable Priors with Adaptive Smoothing and Game Encoding
We introduce a general framework for designing and training neural network
layers whose forward passes can be interpreted as solving non-smooth convex
optimization problems, and whose architectures are derived from an optimization
algorithm. We focus on convex games, solved by local agents represented by the
nodes of a graph and interacting through regularization functions. This
approach is appealing for solving imaging problems, as it allows the use of
classical image priors within deep models that are trainable end to end. The
priors used in this presentation include variants of total variation, Laplacian
regularization, bilateral filtering, sparse coding on learned dictionaries, and
non-local self similarities. Our models are fully interpretable as well as
parameter and data efficient. Our experiments demonstrate their effectiveness
on a large diversity of tasks ranging from image denoising and compressed
sensing for fMRI to dense stereo matching. Comment: NeurIPS 202
Simultaneous localisation and mapping: A stereo vision based approach
With limited dynamic range and poor noise performance, cameras still pose considerable challenges in the application of range sensors in the context of robotic navigation, especially in the implementation of Simultaneous Localisation and Mapping (SLAM) with sparse features. This paper presents a combination of methods for solving the SLAM problem in a constricted indoor environment using small-baseline stereo vision. The main contributions include a feature selection and tracking algorithm, a stereo noise filter, a robust feature validation algorithm and a multiple-hypotheses adaptive window positioning method for 'closing the loop'. These methods take a novel approach in that information from the image processing and robotic navigation domains is used in tandem so that each augments the other. Experimental results, including a real-time implementation in an office-like environment, are also presented. © 2006 IEEE
Region of Interest Generation for Pedestrian Detection using Stereo Vision
Pedestrian detection is an active research area in computer vision. The sliding window paradigm is usually followed to extract all possible detector windows, but it is very time consuming. Stereo vision using a pair of cameras is therefore preferred, because the depth information it provides reduces the search space. Disparity map generation using feature correspondence is an integral part of, and a prerequisite to, depth estimation. In our work, we apply ORB features to speed up the feature correspondence process. Once the ROI generation phase is over, the extracted detector window is represented by low-level histogram of oriented gradients (HOG) features. A linear Support Vector Machine (SVM) is then applied to classify each window as either pedestrian or non-pedestrian. The experimental results reveal that ORB-driven depth estimation is at least seven times faster than with the SURF descriptor and ten times faster than with the SIFT descriptor.
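The step from disparity to depth mentioned above can be sketched with the standard rectified-stereo relation Z = f * B / d. This is a minimal illustration under assumed conditions (rectified image pair, focal length in pixels, baseline in metres); the function name `depth_from_disparity` and the example parameter values are hypothetical and not taken from the paper.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Convert a disparity map (pixels) to depth (metres) via Z = f * B / d.

    Pixels with non-positive disparity have no valid match and are
    assigned infinite depth.
    """
    disparity = np.asarray(disparity_px, dtype=np.float64)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth
```

Candidate regions can then be kept only where the depth falls in a plausible range for pedestrians, which shrinks the set of windows passed to the classifier.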