IDEA: Interactive DoublE Attentions from Label Embedding for Text Classification
Current text classification methods typically encode the text into an embedding
fed to a simple or complex classifier, ignoring the suggestive information
contained in the label text. As a matter of fact, humans
classify documents primarily based on the semantic meaning of the
subcategories. We propose a novel model structure via siamese BERT and
interactive double attentions named IDEA ( Interactive DoublE Attentions) to
capture the information exchange of text and label names. Interactive double
attentions enable the model to exploit the inter-class and intra-class
information from coarse to fine, which involves distinguishing among all labels
and matching the semantic subclasses of the ground-truth labels. Our proposed
method significantly outperforms state-of-the-art methods that use label texts,
with more stable results. Comment: Accepted by ICTAI202
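The "interactive double attentions" idea can be illustrated with a minimal numpy sketch: one attention lets each label name pool the text tokens most relevant to it, and a second lets text tokens vote over the label set. This is an illustrative toy, not the paper's actual architecture; the function name, the scoring rule, and the way the two attentions are combined are all assumptions, and the real model uses siamese BERT encoders rather than raw embedding matrices.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interactive_double_attention(text_emb, label_emb):
    """Toy sketch of text<->label double attention (illustrative, not the paper's model).

    text_emb:  (T, d) token embeddings, e.g. from one siamese-BERT branch
    label_emb: (C, d) label-name embeddings from the other branch
    Returns a (C,) score vector over the C classes.
    """
    d = text_emb.shape[1]
    # Attention 1: each label attends to the text tokens most relevant to it
    a1 = softmax(label_emb @ text_emb.T / np.sqrt(d), axis=-1)   # (C, T)
    label_view = a1 @ text_emb                                   # (C, d) label-aware text summary
    # Attention 2: each text token distributes attention over the label set
    a2 = softmax(text_emb @ label_emb.T / np.sqrt(d), axis=-1)   # (T, C)
    token_votes = a2.mean(axis=0)                                # (C,) coarse per-class vote
    # Combine fine-grained matching (intra-class) with coarse voting (inter-class)
    match = np.einsum("cd,cd->c", label_view, label_emb) / np.sqrt(d)
    return match + np.log(token_votes + 1e-9)
```

The two attentions here mirror the coarse-to-fine intuition in the abstract: the token-vote term distinguishes among all labels, while the matching term scores how well the label-conditioned text summary aligns with each label embedding.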
Physically Plausible Animation of Human Upper Body from a Single Image
We present a new method for generating controllable, dynamically responsive,
and photorealistic human animations. Given an image of a person, our system
allows the user to generate Physically plausible Upper Body Animation (PUBA)
using interaction in the image space, such as dragging their hand to various
locations. We formulate a reinforcement learning problem to train a dynamic
model that predicts the person's next 2D state (i.e., keypoints on the image)
conditioned on a 3D action (i.e., joint torque), and a policy that outputs
optimal actions to control the person to achieve desired goals. The dynamic
model leverages the expressiveness of 3D simulation and the visual realism of
2D videos. PUBA generates 2D keypoint sequences that achieve task goals while
being responsive to forceful perturbation. The sequences of keypoints are then
translated by a pose-to-image generator to produce the final photorealistic
video. Comment: WACV 202
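The control formulation (a learned dynamics model mapping a 2D keypoint state plus a 3D action to the next 2D state, and a policy choosing actions toward a goal) can be sketched in a few lines. Everything below is a toy stand-in under stated assumptions: a linear dynamics model and a greedy candidate-sampling policy replace the paper's trained networks and RL policy, and all function names are hypothetical.

```python
import numpy as np

def dynamics_step(state_2d, action_3d, W):
    """Toy learned dynamics: predict next 2D keypoints from current keypoints + a 3D action.

    state_2d: (K, 2) image-space keypoints; action_3d: (3,) torque-like action.
    W is a stand-in for the trained dynamics model's parameters.
    """
    inp = np.concatenate([state_2d.ravel(), action_3d])
    return state_2d + (W @ inp).reshape(state_2d.shape)

def greedy_policy(state_2d, goal_2d, W, candidates):
    """Pick the candidate action whose predicted next state is closest to the goal."""
    errs = [np.linalg.norm(dynamics_step(state_2d, a, W) - goal_2d) for a in candidates]
    return candidates[int(np.argmin(errs))]
```

Rolling `greedy_policy` and `dynamics_step` forward produces a 2D keypoint sequence driven toward the user-specified goal, which a pose-to-image generator would then render, matching the pipeline the abstract describes.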
Towards Real-World Visual Tracking with Temporal Contexts
Visual tracking has made significant improvements in the past few decades.
Most existing state-of-the-art trackers 1) aim only for performance under ideal
conditions while overlooking real-world conditions; 2) adopt the
tracking-by-detection paradigm, neglecting rich temporal contexts; 3) only
integrate the temporal information into the template, where temporal contexts
among consecutive frames are far from being fully utilized. To handle those
problems, we propose a two-level framework (TCTrack) that can exploit temporal
contexts efficiently. Building on it, we propose a stronger version for
real-world visual tracking, TCTrack++. Both boil down to two levels: features and
similarity maps. Specifically, for feature extraction, we propose an
attention-based temporally adaptive convolution to enhance the spatial features
using temporal information, which is achieved by dynamically calibrating the
convolution weights. For similarity map refinement, we introduce an adaptive
temporal transformer to encode the temporal knowledge efficiently and decode it
for the accurate refinement of the similarity map. To further improve the
performance, we additionally introduce a curriculum learning strategy. Also, we
adopt online evaluation to measure performance in real-world conditions.
Exhaustive experiments on 8 well-known benchmarks demonstrate the superiority of
TCTrack++. Real-world tests directly verify that TCTrack++ can be readily used
in real-world applications. Comment: Accepted by IEEE TPAMI, Code:
https://github.com/vision4robotics/TCTrac
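The feature-level mechanism, a convolution whose weights are dynamically calibrated using temporal information, can be sketched as below. This is a simplified 1-D numpy illustration under assumptions: a scalar gain derived from an exponential moving average of past frames stands in for the paper's attention-based calibration network, and the function name is hypothetical.

```python
import numpy as np

def temporally_adaptive_conv(frames, base_kernel):
    """Sketch of temporally adaptive convolution (toy 1-D version, not TCTrack++'s layer).

    frames:      (T, L) 1-D feature rows for T consecutive frames
    base_kernel: (k,) base convolution weights shared across time
    Returns (T, L - k + 1) temporally calibrated responses.
    """
    T, L = frames.shape
    k = base_kernel.shape[0]
    out = np.zeros((T, L - k + 1))
    context = np.zeros(L)
    for t in range(T):
        # Running temporal context: exponential moving average of past frames
        context = 0.8 * context + 0.2 * frames[t]
        # Calibration factor from the temporal context modulates the kernel
        gain = 1.0 + np.tanh(context.mean())
        kernel = gain * base_kernel
        # Cross-correlation of the current frame with the calibrated kernel
        out[t] = np.convolve(frames[t], kernel[::-1], mode="valid")
    return out
```

The key point the sketch preserves is that the convolution weights themselves change over time as a function of accumulated temporal context, rather than being fixed as in a standard tracking-by-detection backbone.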
Progressive Learning without Forgetting
Learning from changing tasks and sequential experience without forgetting the
obtained knowledge is a challenging problem for artificial neural networks. In
this work, we focus on two challenging problems in the paradigm of Continual
Learning (CL) without involving any old data: (i) the accumulation of
catastrophic forgetting caused by the gradually fading knowledge space from
which the model learns previous knowledge; (ii) the uncontrolled tug-of-war
dynamics that arise when balancing stability and plasticity while learning new
tasks. To tackle these problems, we present Progressive Learning
without Forgetting (PLwF) and a credit assignment regime in the optimizer. PLwF
densely introduces model functions from previous tasks to construct a knowledge
space such that it contains the most reliable knowledge on each task and the
distribution information of different tasks, while credit assignment controls
the tug-of-war dynamics by removing gradient conflict through projection.
Extensive ablative experiments demonstrate the effectiveness of PLwF and credit
assignment. In comparison with other CL methods, we report notably better
results even without relying on any raw data.
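The credit-assignment idea of "removing gradient conflict through projection" can be sketched with the standard conflicting-gradient projection: when the new-task gradient points against an old-task gradient, subtract its conflicting component. The abstract does not give the exact rule, so this follows the common PCGrad-style formulation as an assumption; the function name is hypothetical.

```python
import numpy as np

def remove_conflict(g_new, g_old):
    """Project the new-task gradient off the old-task gradient when they conflict.

    If g_new . g_old < 0 (a tug-of-war between plasticity and stability),
    remove the component of g_new along g_old so the update no longer
    increases the old-task loss to first order. Otherwise leave g_new as-is.
    """
    dot = g_new @ g_old
    if dot < 0:
        g_new = g_new - (dot / (g_old @ g_old)) * g_old
    return g_new
```

After projection the returned gradient is orthogonal to the old-task gradient, so the stability/plasticity tug-of-war described in the abstract is controlled rather than left to chance.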
For One Child
The entirety of this project was completed on the foundation of three focus areas identified by our client as areas of high need. The client chose to prioritize these areas because they believed them to be the most integral to the successful achievement of their mission, as well as to the overall health and longevity of the organization.
- …