Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis has evolved from early schemes, often limited to controlled
environments, to today's advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications from video surveillance to human-computer interaction, scientific
milestones in action recognition are being reached ever more rapidly, so that
methods once considered strong are soon superseded. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations and then move into the realm of
deep-learning-based approaches. We aim to remain objective throughout this
survey, touching upon encouraging improvements as well as inevitable setbacks,
in the hope of raising fresh questions and motivating new research directions
for the reader.
Diffusion Model as Representation Learner
Diffusion Probabilistic Models (DPMs) have recently demonstrated impressive
results on various generative tasks. Despite this promise, the learned
representations of pre-trained DPMs have not been fully understood.
In this paper, we conduct an in-depth investigation of the representation power
of DPMs, and propose a novel knowledge transfer method that leverages the
knowledge acquired by generative DPMs for recognition tasks. Our study begins
by examining the feature space of DPMs, revealing that DPMs are inherently
denoising autoencoders that balance the representation learning with
regularizing model capacity. To this end, we introduce a novel knowledge
transfer paradigm named RepFusion. Our paradigm extracts representations at
different time steps from off-the-shelf DPMs and dynamically employs them as
supervision for student networks, in which the optimal time is determined
through reinforcement learning. We evaluate our approach on several image
classification, semantic segmentation, and landmark detection benchmarks, and
demonstrate that it outperforms state-of-the-art methods. Our results uncover
the potential of DPMs as a powerful tool for representation learning and
provide insights into the usefulness of generative models beyond sample
generation. The code is available at
https://github.com/Adamdad/Repfusion.
Comment: Accepted by ICCV 202
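The abstract above says the time step used for supervision is chosen through reinforcement learning. The following is only a toy sketch of that general idea, not the RepFusion implementation: it uses a generic REINFORCE-style softmax policy over a handful of candidate time steps. The reward table, learning rate, and all function names are hypothetical stand-ins (in practice the reward would come from how well the student learns from the features at that time step).

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_time_step(logits, rng):
    """Sample an index (a candidate diffusion time step) from the policy."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


def reinforce_update(logits, chosen, reward, baseline, lr=0.5):
    """One REINFORCE step: d log pi(chosen) / d logit_i = 1[i == chosen] - p_i."""
    probs = softmax(logits)
    advantage = reward - baseline
    return [
        logit + lr * advantage * ((1.0 if i == chosen else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


def train_policy(rewards, steps=500, seed=0):
    """Hypothetical loop: rewards[t] stands in for the student's gain at step t."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    baseline = 0.0
    for _ in range(steps):
        t = select_time_step(logits, rng)
        r = rewards[t]
        logits = reinforce_update(logits, t, r, baseline)
        # Running-average baseline to reduce the variance of the update.
        baseline = 0.9 * baseline + 0.1 * r
    return softmax(logits)


if __name__ == "__main__":
    # Assumed toy rewards for three candidate time steps.
    print(train_policy([0.1, 0.9, 0.3]))
```

With a fixed reward table like this, the policy is expected to shift probability toward the time step with the highest reward; a real setup would replace the table with a signal measured during student training.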
Continuous-variable quantum neural networks
We introduce a general method for building neural networks on quantum
computers. The quantum neural network is a variational quantum circuit built in
the continuous-variable (CV) architecture, which encodes quantum information in
continuous degrees of freedom such as the amplitudes of the electromagnetic
field. This circuit contains a layered structure of continuously parameterized
gates which is universal for CV quantum computation. Affine transformations and
nonlinear activation functions, two key elements in neural networks, are
enacted in the quantum network using Gaussian and non-Gaussian gates,
respectively. The non-Gaussian gates provide both the nonlinearity and the
universality of the model. Due to the structure of the CV model, the CV quantum
neural network can encode highly nonlinear transformations while remaining
completely unitary. We show how a classical network can be embedded into the
quantum formalism and propose quantum versions of various specialized models
such as convolutional, recurrent, and residual networks. Finally, we present
numerous modeling experiments built with the Strawberry Fields software
library. These experiments, including a classifier for fraud detection, a
network which generates Tetris images, and a hybrid classical-quantum
autoencoder, demonstrate the capability and adaptability of CV quantum neural
networks.
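To illustrate the statement that Gaussian gates enact affine transformations while a non-Gaussian gate supplies the nonlinearity, here is a minimal classical sketch for a single mode. It tracks one phase-space point (x, p) and ignores quantum fluctuations, so it is not a quantum simulation and does not use Strawberry Fields. The gate ordering (rotation, squeezing, rotation, displacement, non-Gaussian gate), the choice of a cubic phase gate, and the sign and scaling conventions are assumptions made for illustration; for one mode the interferometers reduce to phase rotations.

```python
import math


def rotate(state, phi):
    """Phase rotation: a linear (orthogonal) map on (x, p)."""
    x, p = state
    return (math.cos(phi) * x - math.sin(phi) * p,
            math.sin(phi) * x + math.cos(phi) * p)


def squeeze(state, r):
    """Squeezing: rescales x by exp(-r) and p by exp(r)."""
    x, p = state
    return (math.exp(-r) * x, math.exp(r) * p)


def displace(state, dx, dp):
    """Displacement: adds a bias vector, making the overall map affine."""
    x, p = state
    return (x + dx, p + dp)


def cubic_phase(state, gamma):
    """Non-Gaussian step (assumed convention): p -> p + gamma * x**2."""
    x, p = state
    return (x, p + gamma * x * x)


def layer(state, phi1, r, phi2, dx, dp, gamma):
    """One layer: affine part from Gaussian gates, then a nonlinearity."""
    state = rotate(state, phi1)
    state = squeeze(state, r)
    state = rotate(state, phi2)
    state = displace(state, dx, dp)
    return cubic_phase(state, gamma)


if __name__ == "__main__":
    print(layer((1.0, 0.0), phi1=0.3, r=0.2, phi2=-0.1, dx=0.5, dp=0.0, gamma=0.1))
```

Rotation and squeezing together play the role of a weight matrix, displacement plays the role of a bias, and the cubic phase step plays the role of the activation function, mirroring the structure of a classical layer.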