The Conditional Lucas & Kanade Algorithm
The Lucas & Kanade (LK) algorithm is the method of choice for efficient dense
image and object alignment. The approach is efficient as it attempts to model
the connection between appearance and geometric displacement through a linear
relationship that assumes independence across pixel coordinates. A drawback of
the approach, however, is its generative nature. Specifically, its performance
is tightly coupled with how well the linear model can synthesize appearance
from geometric displacement, even though the alignment task itself is
associated with the inverse problem. In this paper, we present a new approach,
referred to as the Conditional LK algorithm, which: (i) directly learns linear
models that predict geometric displacement as a function of appearance, and
(ii) employs a novel strategy for ensuring that the generative pixel
independence assumption can still be taken advantage of. We demonstrate that
our approach exhibits superior performance to classical generative forms of the
LK algorithm. Furthermore, we demonstrate its comparable performance to
state-of-the-art methods such as the Supervised Descent Method with
substantially fewer training examples, as well as the unique ability to "swap"
geometric warp functions without having to retrain from scratch. Finally, from
a theoretical perspective, our approach hints at possible redundancies that
exist in current state-of-the-art methods for alignment that could be leveraged
in vision systems of the future.
Comment: 17 pages, 11 figure
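The contrast the abstract draws between a generative linear model (appearance as a function of displacement) and a conditional one (displacement as a function of appearance) can be illustrated with a toy 1D translation problem. The following is only a minimal sketch under stated assumptions, not the paper's method: the signal, the function names, the ridge regression, and the training range are all hypothetical choices made for illustration, and the pixel-independence strategy described in point (ii) is not reproduced here.

```python
import numpy as np

# Toy 1D "image": a smooth periodic signal sampled at 64 pixel coordinates.
# Every name below is hypothetical and chosen only for this illustration.
xs = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)


def signal(x):
    return np.sin(x) + 0.5 * np.sin(2.0 * x + 1.0)


def sample(p):
    """Appearance observed under a translation warp with parameter p."""
    return signal(xs + p)


template = sample(0.0)


def generative_step(observed):
    """One classical LK-style step: model appearance as template + J * p
    and solve the resulting least-squares problem for p."""
    J = np.gradient(template, xs)  # d(appearance)/dp at p = 0
    return float(J @ (observed - template) / (J @ J))


def fit_conditional_regressor(max_shift=0.3, n_train=200, ridge=1e-3, seed=0):
    """Learn a linear map from appearance difference to displacement,
    using synthetically shifted examples and ridge regression (an assumption)."""
    rng = np.random.default_rng(seed)
    shifts = rng.uniform(-max_shift, max_shift, size=n_train)
    D = np.stack([sample(p) - template for p in shifts])  # (n_train, n_pixels)
    A = D.T @ D + ridge * np.eye(D.shape[1])
    return np.linalg.solve(A, D.T @ shifts)  # (n_pixels,)


def conditional_step(observed, R):
    """Predict displacement directly from the appearance difference."""
    return float((observed - template) @ R)


R = fit_conditional_regressor()
observed = sample(0.05)
print("generative estimate: ", generative_step(observed))
print("conditional estimate:", conditional_step(observed, R))
```

In this sketch the generative step derives its update from image gradients, while the conditional step fits the displacement predictor from data. Both are single linear updates; in practice such steps would typically be iterated.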
Deep Decision Trees for Discriminative Dictionary Learning with Adversarial Multi-Agent Trajectories
With the explosion in the availability of spatio-temporal tracking data in
modern sports, there is an enormous opportunity to better analyse, learn and
predict important events in adversarial group environments. In this paper, we
propose a deep decision tree architecture for discriminative dictionary
learning from adversarial multi-agent trajectories. We first build up a
hierarchy for the tree structure by adding each layer and performing
feature-weight-based clustering in the forward pass. We then fine-tune the
player role weights using backpropagation. The hierarchical architecture ensures the
interpretability and the integrity of the group representation. The resulting
architecture is a decision tree, with leaf-nodes capturing a dictionary of
multi-agent group interactions. Due to the ample volume of data available, we
focus on soccer tracking data, although our approach can be used in any
adversarial multi-agent domain. We present applications of the proposed method for
simulating soccer games as well as evaluating and quantifying team strategies.
Comment: To appear in 4th International Workshop on Computer Vision in Sports
(CVsports) at CVPR 201
Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds
Accurate detection of 3D objects is a fundamental problem in computer vision
and has an enormous impact on autonomous cars, augmented/virtual reality and
many applications in robotics. In this work we present a novel fusion of a
state-of-the-art neural-network-based 3D detector and visual semantic segmentation in
the context of autonomous driving. Additionally, we introduce
Scale-Rotation-Translation score (SRTs), a fast and highly parameterizable
evaluation metric for comparison of object detections, which speeds up our
inference time by up to 20% and halves training time. In addition, we apply
state-of-the-art online multi-target feature tracking on the object
measurements to further increase accuracy and robustness utilizing temporal
information. Our experiments on KITTI show that we achieve the same results as
the state of the art in all related categories, while maintaining the performance
and accuracy trade-off and still running in real time. Furthermore, our model is
the first one that fuses visual semantics with 3D object detection
- …