Object Referring in Videos with Language and Human Gaze
We investigate the problem of object referring (OR), i.e., localizing a target
object in a visual scene given a language description. Humans perceive
the world more as continuous video snippets than as static images, and describe
objects not only by their appearance, but also by their spatio-temporal context
and motion features. Humans also gaze at the object when they issue a referring
expression. Existing works for OR mostly focus on static images only, which
fall short in providing many such cues. This paper addresses OR in videos with
language and human gaze. To that end, we present a new video dataset for OR,
with 30,000 objects over 5,000 stereo video sequences annotated for their
descriptions and gaze. We further propose a novel network model for OR in
videos, by integrating appearance, motion, gaze, and spatio-temporal context
into one network. Experimental results show that our method effectively
utilizes motion cues, human gaze, and spatio-temporal context. Our method
outperforms previous OR methods. For the dataset and code, please refer to
https://people.ee.ethz.ch/~arunv/ORGaze.html. Comment: Accepted to CVPR 2018, 10 pages, 6 figures
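The abstract does not specify how the network integrates the four cues; as a minimal, generic illustration of late fusion for a referring task, per-candidate appearance, motion, gaze, and spatio-temporal context features can be concatenated and scored with one linear layer (a pure-NumPy sketch — the feature dimensions, weights, and fusion scheme below are assumptions, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def score_candidates(appearance, motion, gaze, context, w):
    """Late fusion: concatenate the per-candidate cue features and
    score every candidate object with a single linear layer."""
    feats = np.concatenate([appearance, motion, gaze, context], axis=1)
    return feats @ w

# Five candidate objects, each with illustrative 4-d appearance, 2-d
# motion, 1-d gaze-proximity, and 3-d spatio-temporal context features.
appearance = rng.normal(size=(5, 4))
motion = rng.normal(size=(5, 2))
gaze = rng.normal(size=(5, 1))
context = rng.normal(size=(5, 3))
w = rng.normal(size=10)  # hypothetical learned fusion weights

scores = score_candidates(appearance, motion, gaze, context, w)
best = int(np.argmax(scores))  # index of the predicted referred object
```

In the actual model each cue would come from a learned sub-network rather than random features; the sketch only shows the fusion step.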
Computational intelligence approaches to robotics, automation, and control [Volume guest editors]
No abstract available
Bio-inspired retinal optic flow perception in robotic navigation
This thesis concerns the bio-inspired visual perception of motion, with emphasis on locomotion, targeting robotic systems. By continuously registering moving visual features on the human retina, a sensation of a visual flow cue is created. The interpretation of these visual flow cues forms a low-level motion percept better known as retinal optic flow. Retinal optic flow is often mentioned and credited in human locomotor research, but so far only in theory and simulated environments. Reconstructing the retinal optic flow fields, using existing methods of estimating optic flow together with experimental data from naive test subjects, provides further insight into how it interacts with intermittent control behavior and dynamic gazing. The retinal optic flow is successfully demonstrated in a vehicular steering task scenario and further supports the idea that humans may use such perception to aid their ability to correct their steering during navigation.

To achieve the reconstruction and estimation of the retinal optic flow, a set of optic flow estimators was fairly and systematically evaluated on the criteria of run-time predictability, reliability, and performance accuracy. A formalized methodology using containerization technology for performing the benchmarking was developed to generate the results.

Furthermore, the readiness of road vehicles for the adoption of modern robotic software and related software processes was investigated, with special emphasis on real-time computing and on introducing containerization and the microservice design paradigm. By doing so, continuous integration, continuous deployment, and continuous experimentation were enabled in order to aid further development and research.

With the method of estimating retinal optic flow and its interaction with intermittent control, a more complete vision-based bionic steering control model is to be proposed and tested in a live robotic system.
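The specific optic-flow estimators benchmarked in the thesis are not named in the abstract; as a minimal illustration of what such an estimator computes, the classic Lucas-Kanade least-squares formulation can recover a global motion vector between two frames (a pure-NumPy sketch with synthetic frames, not the thesis's method):

```python
import numpy as np

def lucas_kanade_flow(prev, curr):
    """Estimate a single global flow vector (u, v) between two frames
    via the Lucas-Kanade least-squares formulation."""
    # Spatial gradients (central differences) and temporal gradient.
    Ix = np.gradient(prev, axis=1)
    Iy = np.gradient(prev, axis=0)
    It = curr - prev
    # Solve [Ix Iy] [u v]^T = -It in the least-squares sense.
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic test: a smooth pattern shifted one pixel to the right,
# so the recovered flow should be close to (1, 0).
x = np.linspace(0, 4 * np.pi, 64)
frame0 = np.sin(x)[None, :] * np.cos(x)[:, None]
frame1 = np.roll(frame0, 1, axis=1)
u, v = lucas_kanade_flow(frame0, frame1)
```

Estimating retinal (rather than camera-frame) optic flow additionally requires compensating for the recorded gaze direction, which is the part the thesis reconstructs from experimental data.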
Estimation of Driver's Gaze Region from Head Position and Orientation using Probabilistic Confidence Regions
A smart vehicle should be able to understand human behavior and predict their
actions to avoid hazardous situations. Specific traits in human behavior can be
automatically predicted, which can help the vehicle make decisions, increasing
safety. One of the most important aspects pertaining to the driving task is the
driver's visual attention. Predicting the driver's visual attention can help a
vehicle understand the awareness state of the driver, providing important
contextual information. While estimating the exact gaze direction is difficult
in the car environment, a coarse estimation of the visual attention can be
obtained by tracking the position and orientation of the head. Since the
relation between head pose and gaze direction is not one-to-one, this paper
proposes a formulation based on probabilistic models to create salient regions
describing the visual attention of the driver. The area of the predicted region
is small when the model has high confidence on the prediction, which is
directly learned from the data. We use Gaussian process regression (GPR) to
implement the framework, comparing the performance with different regression
formulations such as linear regression and neural network based methods. We
evaluate these frameworks by studying the tradeoff between spatial resolution
and accuracy of the probability map using naturalistic recordings collected
with the UTDrive platform. We observe that the GPR method produces the best
result, creating accurate predictions with localized salient regions. For
example, the 95% confidence region is defined by an area that covers 3.77% of
a sphere surrounding the driver. Comment: 13 pages, 12 figures, 2 tables
Comparative analysis of Kinect-based and Oculus-based gaze region estimation methods in a driving simulator
Driver's gaze information can be crucial in driving research because of its relation to driver attention. In particular, the inclusion of gaze data in driving simulators broadens the scope of research studies, as drivers' gaze patterns can be related to their features and performance. In this paper, we present two gaze region estimation modules integrated in a driving simulator: one uses the 3D Kinect device and the other uses the virtual reality Oculus Rift device. The modules detect, in every processed frame of the route, which of the seven regions into which the driving scene was divided the driver is gazing at. Four methods that learn the relation between gaze displacement and head movement were implemented and compared for gaze estimation: two simpler, points-based methods that try to capture this relation, and two based on classifiers such as MLP and SVM. Experiments were carried out with 12 users who drove the same scenario twice, each time with a different visualization display: first with a big screen and later with Oculus Rift. On the whole, Oculus Rift outperformed Kinect as the best hardware for gaze estimation. The Oculus-based gaze region estimation method with the highest performance achieved an accuracy of 97.94%. The information provided by the Oculus Rift module enriches the driving simulator data and makes a multimodal driving-performance analysis possible, in addition to the immersion and realism obtained with the virtual reality experience provided by Oculus. Dirección General de Tráfico y Ministerio del Interior (Proyecto SPIP2015-01801)
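The two simpler, points-based methods are not specified in the abstract; a nearest-centroid baseline over head pose illustrates the general idea of mapping head movement to one of seven gaze regions. The region names and centroid coordinates below are invented for illustration, not taken from the paper:

```python
import numpy as np

# Hypothetical region centroids in (head yaw, head pitch) degrees.
REGIONS = {
    "windshield":   (0.0,   0.0),
    "left-mirror":  (-45.0, -5.0),
    "right-mirror": (45.0,  -5.0),
    "rear-mirror":  (20.0,  10.0),
    "dashboard":    (0.0,  -20.0),
    "left-window":  (-70.0,  0.0),
    "right-window": (70.0,   0.0),
}

def classify_gaze_region(yaw, pitch):
    """Assign the gaze region whose centroid is nearest to the
    observed head pose (a points-based baseline, not the MLP/SVM)."""
    names = list(REGIONS)
    centroids = np.array([REGIONS[n] for n in names])
    dists = np.linalg.norm(centroids - np.array([yaw, pitch]), axis=1)
    return names[int(np.argmin(dists))]

region = classify_gaze_region(-42.0, -3.0)  # nearest to "left-mirror"
```

The MLP and SVM variants would instead learn the region boundaries from labeled head-pose samples rather than from fixed centroids.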
Developing Predictive Models of Driver Behaviour for the Design of Advanced Driving Assistance Systems
World-wide injuries in vehicle accidents have been on the rise in recent
years, mainly due to driver error. The main objective of this research is to
develop a predictive system for driving maneuvers by analyzing the cognitive
behavior (cephalo-ocular) and the driving behavior of the driver (how the vehicle
is being driven). Advanced Driving Assistance Systems (ADAS) include
different driving functions, such as vehicle parking, lane departure warning,
blind spot detection, and so on. While much research has been performed on
developing automated co-driver systems, little attention has been paid to the
fact that the driver plays an important role in driving events. Therefore, it
is crucial to monitor events and factors that directly concern the driver. As
a goal, we perform a quantitative and qualitative analysis of driver behavior
to find its relationship with driver intentionality and driving-related actions.
We have designed and developed an instrumented vehicle (RoadLAB) that is
able to record several synchronized streams of data, including the surrounding
environment of the driver, vehicle functions and driver cephalo-ocular behavior,
such as gaze/head information. We subsequently analyze and study the
behavior of several drivers to find out if there is a meaningful relation between
driver behavior and the next driving maneuver.
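The abstract does not describe the predictive model itself; as one hedged illustration of relating cephalo-ocular behavior to the next maneuver, a logistic regression on a single synthetic feature (mirror-glance count per driving window) can be fit with plain NumPy. The feature, labels, and data are invented for the sketch:

```python
import numpy as np

# Hypothetical data: mirror-glance count per window; label 1 means a
# lane change followed the window, label 0 means lane keeping.
rng = np.random.default_rng(1)
n = 200
glances = np.concatenate([rng.poisson(1, n), rng.poisson(4, n)])
labels = np.concatenate([np.zeros(n), np.ones(n)])
X = np.stack([np.ones(2 * n), glances], axis=1)  # bias + feature

# Fit logistic regression by batch gradient descent.
w = np.zeros(2)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - labels) / len(labels)

def prob_lane_change(n_glances):
    """Predicted probability that a lane change follows the window."""
    return 1.0 / (1.0 + np.exp(-(w[0] + w[1] * n_glances)))
```

A real model of this kind would combine many synchronized streams (gaze, head pose, vehicle signals) rather than one count, but the fitting step is the same.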