1,729 research outputs found
The Structure Transfer Machine Theory and Applications
Representation learning is a fundamental but challenging problem, especially
when the distribution of data is unknown. We propose a new representation
learning method, termed Structure Transfer Machine (STM), which enables feature
learning process to converge at the representation expectation in a
probabilistic way. We theoretically show that such an expected value of the
representation (mean) is achievable if the manifold structure can be
transferred from the data space to the feature space. The resulting structure
regularization term, named manifold loss, is incorporated into the loss
function of the typical deep learning pipeline. The STM architecture is
constructed to enforce the learned deep representation to satisfy the intrinsic
manifold structure from the data, which results in robust features that suit
various application scenarios, such as digit recognition, image classification
and object tracking. Compared to state-of-the-art CNN architectures, we achieve
the better results on several commonly used benchmarks\footnote{The source code
is available. https://github.com/stmstmstm/stm }
Tracking by 3D Model Estimation of Unknown Objects in Videos
Most model-free visual object tracking methods formulate the tracking task as
object location estimation given by a 2D segmentation or a bounding box in each
video frame. We argue that this representation is limited and instead propose
to guide and improve 2D tracking with an explicit object representation, namely
the textured 3D shape and 6DoF pose in each video frame. Our representation
tackles a complex long-term dense correspondence problem between all 3D points
on the object for all video frames, including frames where some points are
invisible. To achieve that, the estimation is driven by re-rendering the input
video frames as well as possible through differentiable rendering, which has
not been used for tracking before. The proposed optimization minimizes a novel
loss function to estimate the best 3D shape, texture, and 6DoF pose. We improve
the state-of-the-art in 2D segmentation tracking on three different datasets
with mostly rigid objects
- …