Dynamic Patch-aware Enrichment Transformer for Occluded Person Re-Identification
Person re-identification (re-ID) continues to pose a significant challenge,
particularly in scenarios involving occlusions. Prior approaches aimed at
tackling occlusions have predominantly focused on aligning physical body
features through the utilization of external semantic cues. However, these
methods tend to be intricate and susceptible to noise. To address the
aforementioned challenges, we present an innovative end-to-end solution known
as the Dynamic Patch-aware Enrichment Transformer (DPEFormer). This model
effectively distinguishes human body information from occlusions automatically
and dynamically, eliminating the need for external detectors or precise image
alignment. Specifically, we introduce a dynamic patch token selection module
(DPSM). DPSM utilizes a label-guided proxy token as an intermediary to identify
informative occlusion-free tokens. These tokens are then selected for deriving
subsequent local part features. To facilitate the seamless integration of
global classification features with the finely detailed local features selected
by DPSM, we introduce a novel feature blending module (FBM). FBM enhances
feature representation through the complementary nature of information and the
exploitation of part diversity. Furthermore, to ensure that DPSM and the entire
DPEFormer can effectively learn with only identity labels, we also propose a
Realistic Occlusion Augmentation (ROA) strategy. This strategy leverages the
recent advances in the Segment Anything Model (SAM). As a result, it generates
occlusion images that closely resemble real-world occlusions, greatly enhancing
the subsequent contrastive learning process. Experiments on occluded and
holistic re-ID benchmarks show that DPEFormer substantially outperforms
existing state-of-the-art approaches. The code will be made publicly available.
Comment: 12 pages, 6 figures
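The token selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' DPSM: it assumes that selection amounts to scoring each patch token by cosine similarity to a single proxy token and keeping the top k. The function name `select_tokens`, the parameter `k`, and the toy data are all hypothetical.

```python
import numpy as np

def select_tokens(patch_tokens, proxy_token, k):
    """Return indices and features of the k patch tokens most similar to the proxy.

    Assumption: cosine similarity to one proxy token stands in for the
    label-guided scoring mentioned in the abstract.
    """
    # Normalise so the dot product below is a cosine similarity.
    patches = patch_tokens / np.linalg.norm(patch_tokens, axis=1, keepdims=True)
    proxy = proxy_token / np.linalg.norm(proxy_token)
    scores = patches @ proxy
    # Indices of the k highest-scoring tokens, best first.
    top = np.argsort(-scores)[:k]
    return top, patch_tokens[top]

# Toy example: four 2-D "patch tokens" and a proxy pointing along the first axis.
tokens = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]])
proxy = np.array([1.0, 0.0])
idx, selected = select_tokens(tokens, proxy, k=2)
```

In this toy case the two tokens closest in direction to the proxy are kept, and the orthogonal and opposite tokens, which play the role of occluded patches, are discarded.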
Integrating Multiple 3D Views through Frame-of-reference Interaction
Frame-of-reference interaction consists of a unified set of 3D interaction techniques for exploratory navigation of large virtual spaces in nonimmersive environments. It is based on a conceptual framework that considers navigation from a cognitive perspective, as a way of facilitating changes in user attention from one reference frame to another, rather than from the mechanical perspective of moving a camera between different points of interest. All of our techniques link multiple frames of reference in some meaningful way. Some techniques link multiple windows within a zooming environment while others allow seamless changes of user focus between static objects, moving objects, and groups of moving objects. We present our techniques as they are implemented in GeoZui3D, a geographic visualization system for ocean data.
A Fusion Approach for Multi-Frame Optical Flow Estimation
To date, top-performing optical flow estimation methods only take pairs of
consecutive frames into account. While elegant and appealing, the idea of using
more than two frames has not yet produced state-of-the-art results. We present
a simple, yet effective fusion approach for multi-frame optical flow that
benefits from longer-term temporal cues. Our method first warps the optical
flow from previous frames to the current, thereby yielding multiple plausible
estimates. It then fuses the complementary information carried by these
estimates into a new optical flow field. At the time of writing, our method
ranks first among published results in the MPI Sintel and KITTI 2015
benchmarks. Our models will be available on https://github.com/NVlabs/PWC-Net.
Comment: Work accepted at IEEE Winter Conference on Applications of Computer
Vision (WACV 2019)
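The warp-then-fuse idea in this abstract can be sketched as follows. Several things here are assumptions rather than details from the abstract: the previous flow is brought into the current frame by sampling it along a backward flow (current frame to previous frame), sampling is nearest-neighbour, and a plain per-pixel average stands in for the fusion step. The names `warp_previous_flow` and `fuse_by_mean` are hypothetical.

```python
import numpy as np

def warp_previous_flow(prev_flow, backward_flow):
    """Sample prev_flow (frame t-1 -> t) at x + backward_flow(x).

    backward_flow maps frame t -> t-1. Both arrays have shape (H, W, 2)
    with channels (dx, dy). Sampling is nearest-neighbour, clamped at borders.
    """
    h, w, _ = prev_flow.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + backward_flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + backward_flow[..., 1]).astype(int), 0, h - 1)
    return prev_flow[src_y, src_x]

def fuse_by_mean(candidates):
    """Placeholder fusion: per-pixel average of the candidate flow fields."""
    return np.mean(np.stack(candidates), axis=0)

# Toy example on a 2x3 grid: the previous flow's dx equals the column index.
prev_flow = np.zeros((2, 3, 2))
prev_flow[..., 0] = np.arange(3)
backward_flow = np.zeros((2, 3, 2))
backward_flow[..., 0] = -1.0          # every pixel came from one column to the left
current_estimate = np.zeros((2, 3, 2))

warped = warp_previous_flow(prev_flow, backward_flow)
fused = fuse_by_mean([current_estimate, warped])
```

The warped field and the two-frame estimate are two candidates for the same flow; any fusion rule that combines them per pixel could replace the average used here.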
Dynamic worlds in miniature
The World in Miniature (WIM) metaphor allows users to interact and travel efficiently in virtual environments. In addition to the first-person perspective offered by typical VR applications, the WIM offers a second dynamic viewpoint through a hand-held miniature copy of the virtual environment. In the original WIM paper the miniature was a scaled down replica of the whole environment, thus limiting the technique to simple models being manipulated at a single level of scale. Several WIM extensions have been proposed where the replica shows only a part of the virtual environment. In this paper we present an improved visualization of WIM that supports arbitrarily-complex, densely-occluded scenes. In particular, we discuss algorithms for selecting the region of the virtual environment which will be covered by the miniature copy and efficient
algorithms for handling 3D occlusion from an exocentric viewpoint.
Peer Reviewed. Postprint (author’s final draft)