Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
Every moment counts in action recognition. A comprehensive understanding of
human activity in video requires labeling every frame according to the actions
occurring, placing multiple labels densely over a video sequence. To study this
problem we extend the existing THUMOS dataset and introduce MultiTHUMOS, a new
dataset of dense labels over unconstrained internet videos. Modeling multiple,
dense labels benefits from temporal relations within and across classes. We
define a novel variant of long short-term memory (LSTM) deep networks for
modeling these temporal relations via multiple input and output connections. We
show that this model improves action labeling accuracy and further enables
deeper understanding tasks ranging from structured retrieval to action
prediction.
Comment: To appear in IJC
Generalizable Neural Fields as Partially Observed Neural Processes
Neural fields, which represent signals as a function parameterized by a
neural network, are a promising alternative to traditional discrete vector or
grid-based representations. Compared to discrete representations, neural
representations scale well with increasing resolution, are continuous, and
can be many-times differentiable. However, given a dataset of signals that we
would like to represent, optimizing a separate neural field for each
signal is inefficient and cannot capitalize on shared information or
structures among signals. Existing generalization methods view this as a
meta-learning problem and employ gradient-based meta-learning to learn an
initialization which is then fine-tuned with test-time optimization, or learn
hypernetworks to produce the weights of a neural field. We instead propose a
new paradigm that views the large-scale training of neural representations as a
part of a partially-observed neural process framework, and leverage neural
process algorithms to solve this task. We demonstrate that this approach
outperforms both state-of-the-art gradient-based meta-learning approaches and
hypernetwork approaches.
Comment: To appear ICCV 202