15,923 research outputs found
GazeDPM: Early Integration of Gaze Information in Deformable Part Models
An increasing number of works explore collaborative human-computer systems in
which human gaze is used to enhance computer vision systems. For object
detection these efforts were so far restricted to late integration approaches
that have inherent limitations, such as increased precision without increase in
recall. We propose an early integration approach in a deformable part model,
which constitutes a joint formulation over gaze and visual data. We show that
our GazeDPM method improves over the state-of-the-art DPM baseline by 4% and a
recent method for gaze-supported object detection by 3% on the public POET
dataset. Our approach additionally provides introspection of the learnt models,
can reveal salient image structures, and allows us to investigate the interplay
between gaze attracting and repelling areas, the importance of view-specific
models, as well as viewers' personal biases in gaze patterns. We finally study
important practical aspects of our approach, such as the impact of using
saliency maps instead of real fixations, the impact of the number of fixations,
as well as robustness to gaze estimation error
Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds
Accurate detection of 3D objects is a fundamental problem in computer vision
and has an enormous impact on autonomous cars, augmented/virtual reality and
many applications in robotics. In this work we present a novel fusion of neural
network based state-of-the-art 3D detector and visual semantic segmentation in
the context of autonomous driving. Additionally, we introduce
Scale-Rotation-Translation score (SRTs), a fast and highly parameterizable
evaluation metric for comparison of object detections, which speeds up our
inference time up to 20\% and halves training time. On top, we apply
state-of-the-art online multi target feature tracking on the object
measurements to further increase accuracy and robustness utilizing temporal
information. Our experiments on KITTI show that we achieve same results as
state-of-the-art in all related categories, while maintaining the performance
and accuracy trade-off and still run in real-time. Furthermore, our model is
the first one that fuses visual semantic with 3D object detection
Multiple Object Tracking in Urban Traffic Scenes with a Multiclass Object Detector
Multiple object tracking (MOT) in urban traffic aims to produce the
trajectories of the different road users that move across the field of view
with different directions and speeds and that can have varying appearances and
sizes. Occlusions and interactions among the different objects are expected and
common due to the nature of urban road traffic. In this work, a tracking
framework employing classification label information from a deep learning
detection approach is used for associating the different objects, in addition
to object position and appearances. We want to investigate the performance of a
modern multiclass object detector for the MOT task in traffic scenes. Results
show that the object labels improve tracking performance, but that the output
of object detectors are not always reliable.Comment: 13th International Symposium on Visual Computing (ISVC
- …