STA: Spatial-Temporal Attention for Large-Scale Video-based Person Re-Identification
In this work, we propose a novel Spatial-Temporal Attention (STA) approach to
tackle the large-scale person re-identification task in videos. Unlike most
existing methods, which simply compute representations of video clips
using frame-level aggregation (e.g. average pooling), the proposed STA adopts a
more effective way for producing robust clip-level feature representation.
Concretely, STA fully exploits the discriminative parts of a target
person in both the spatial and temporal dimensions, producing a 2-D
attention score matrix via inter-frame regularization that measures the
importance of spatial parts across different frames. Thus, a more robust
clip-level feature representation can be generated according to a weighted sum
operation guided by the mined 2-D attention score matrix. In this way, the
challenging cases for video-based person re-identification such as pose
variation and partial occlusion can be well tackled by the STA. We conduct
extensive experiments on two large-scale benchmarks, i.e. MARS and
DukeMTMC-VideoReID. In particular, the mAP reaches 87.7% on MARS, which
outperforms the state of the art by a large margin of more than
11.6%.
Comment: Accepted as a conference paper at AAAI 201
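The weighted-sum step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-frame, per-part attention scores are already given and non-negative, and it assumes that "inter-frame regularization" means normalizing each part's scores across frames so they sum to 1. The function names `normalize_across_frames` and `clip_feature` are hypothetical.

```python
def normalize_across_frames(scores):
    """Turn raw scores[t][k] (frame t, spatial part k) into weights.

    Assumption: each part's scores are divided by their sum over frames,
    so every column of the 2-D matrix sums to 1.
    """
    num_frames = len(scores)
    num_parts = len(scores[0])
    weights = [[0.0] * num_parts for _ in range(num_frames)]
    for k in range(num_parts):
        total = sum(scores[t][k] for t in range(num_frames))
        for t in range(num_frames):
            # Fall back to uniform weights if a part has no score mass.
            weights[t][k] = scores[t][k] / total if total > 0 else 1.0 / num_frames
    return weights


def clip_feature(part_features, scores):
    """Build a clip-level feature from part_features[t][k] (lists of floats).

    Each spatial part is pooled over frames with its attention weights,
    and the pooled part vectors are concatenated.
    """
    weights = normalize_across_frames(scores)
    num_frames = len(part_features)
    num_parts = len(part_features[0])
    dim = len(part_features[0][0])
    clip = []
    for k in range(num_parts):
        pooled = [0.0] * dim
        for t in range(num_frames):
            for d in range(dim):
                pooled[d] += weights[t][k] * part_features[t][k][d]
        clip.extend(pooled)
    return clip
```

With uniform scores this reduces to the per-part average pooling that the abstract contrasts against. A part that is occluded in one frame would receive a low score there and contribute little to the clip-level feature.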
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.
Learning Deep Context-aware Features over Body and Latent Parts for Person Re-identification
Person Re-identification (ReID) aims to identify the same person across
different cameras. It is a challenging task due to the large variations in
person pose, occlusion, background clutter, etc. How to extract powerful
features is a fundamental problem in ReID and remains an open problem today.
In this paper, we design a Multi-Scale Context-Aware Network (MSCAN) to learn
powerful features over full body and body parts, which can well capture the
local context knowledge by stacking multi-scale convolutions in each layer.
Moreover, instead of using predefined rigid parts, we propose to learn and
localize deformable pedestrian parts using Spatial Transformer Networks (STN)
with novel spatial constraints. The learned body parts can alleviate some
difficulties, e.g. pose variations and background clutter, in part-based
representation. Finally, we integrate the representation learning processes of
full body and body parts into a unified framework for person ReID through
multi-class person identification tasks. Extensive evaluations on current
challenging large-scale person ReID datasets, including the image-based
Market1501, CUHK03 and sequence-based MARS datasets, show that the proposed
method achieves state-of-the-art results.
Comment: Accepted by CVPR 201
Spatial and Temporal Mutual Promotion for Video-based Person Re-identification
Video-based person re-identification is a crucial task of matching video
sequences of a person across multiple camera views. Generally, features
directly extracted from a single frame suffer from occlusion, blur,
illumination and posture changes. This leads to false activation or missing
activation in some regions, which corrupts the appearance and motion
representation. How to exploit the abundant spatial-temporal information in
video sequences is the key to solving this problem. To this end, we propose a
Refining Recurrent Unit (RRU) that recovers the missing parts and suppresses
noisy parts of the current frame's features by referring to historical frames.
With RRU, the quality of each frame's appearance representation is improved.
Then we use the Spatial-Temporal clues Integration Module (STIM) to mine the
spatial-temporal information from those upgraded features. Meanwhile, the
multi-level training objective is used to enhance the capability of RRU and
STIM. Through the cooperation of those modules, the spatial and temporal
features mutually promote each other and the final spatial-temporal feature
representation is more discriminative and robust. Extensive experiments are
conducted on three challenging datasets, i.e., iLIDS-VID, PRID-2011 and MARS.
The experimental results demonstrate that our approach outperforms existing
state-of-the-art methods of video-based person re-identification on iLIDS-VID
and MARS and achieves favorable results on PRID-2011.
Comment: Accepted by AAAI19 as spotligh
The Cyborg Astrobiologist: Testing a Novelty-Detection Algorithm on Two Mobile Exploration Systems at Rivas Vaciamadrid in Spain and at the Mars Desert Research Station in Utah
(ABRIDGED) In previous work, two platforms have been developed for testing
computer-vision algorithms for robotic planetary exploration (McGuire et al.
2004b,2005; Bartolo et al. 2007). The wearable-computer platform has been
tested at geological and astrobiological field sites in Spain (Rivas
Vaciamadrid and Riba de Santiuste), and the phone-camera has been tested at a
geological field site in Malta. In this work, we (i) apply a Hopfield
neural-network algorithm for novelty detection based upon color, (ii) integrate
a field-capable digital microscope on the wearable computer platform, (iii)
test this novelty detection with the digital microscope at Rivas Vaciamadrid,
(iv) develop a Bluetooth communication mode for the phone-camera platform, in
order to allow access to a mobile processing computer at the field sites, and
(v) test the novelty detection on the Bluetooth-enabled phone-camera connected
to a netbook computer at the Mars Desert Research Station in Utah. Together, this
systems engineering and field testing have allowed us to develop a real-time
computer-vision system that is capable, for example, of identifying lichens as
novel within a series of images acquired in semi-arid desert environments. We
acquired sequences of images of geologic outcrops in Utah and Spain consisting
of various rock types and colors to test this algorithm. The algorithm robustly
recognized previously-observed units by their color, while requiring only a
single image or a few images to learn colors as familiar, demonstrating its
fast learning capability.
Comment: 28 pages, 12 figures, accepted for publication in the International
Journal of Astrobiolog
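The idea of Hopfield-based novelty detection with one-shot learning can be sketched as follows. This is a generic illustration rather than the algorithm used in the paper. The abstract does not say how colors are encoded, so the sketch assumes each observation has already been reduced to a bipolar (+1/-1) pattern. The class name `HopfieldNoveltyDetector` and the familiarity test (a pattern counts as familiar when every unit's local field agrees with it) are assumptions.

```python
class HopfieldNoveltyDetector:
    """Toy Hopfield network that flags unseen bipolar patterns as novel."""

    def __init__(self, size):
        self.size = size
        # Symmetric weight matrix with a zero diagonal.
        self.weights = [[0] * size for _ in range(size)]

    def learn(self, pattern):
        """Store one pattern with the Hebbian outer-product rule."""
        for i in range(self.size):
            for j in range(self.size):
                if i != j:
                    self.weights[i][j] += pattern[i] * pattern[j]

    def is_familiar(self, pattern):
        """Familiar means every unit's local field has the unit's own sign."""
        for i in range(self.size):
            field = sum(self.weights[i][j] * pattern[j] for j in range(self.size))
            if field * pattern[i] <= 0:
                return False
        return True

    def observe(self, pattern):
        """Return True if the pattern is novel, learning it in that case."""
        novel = not self.is_familiar(pattern)
        if novel:
            self.learn(pattern)
        return novel
```

A single call to `observe` is enough to make a pattern familiar, which mirrors the one-image learning mentioned in the abstract. One known limitation of this kind of network is that the sign-inverted copy of a stored pattern is also stable, so it would be reported as familiar as well.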
Characterization and mapping of surface physical properties of Mars from CRISM multi-angular data: application to Gusev Crater and Meridiani Planum
The analysis of the surface texture from the particle (grain size, shape and
internal structure) to its organization (surface roughness) provides
information on the geological processes. CRISM multi-angular observations
(varied emission angles) make it possible to characterize the surface scattering
behavior, which depends not only on the composition but also on the physical
properties of the material (e.g., grain size, shape, internal structure, surface roughness). After an
atmospheric correction by the Multi-angle Approach for Retrieval of the Surface
Reflectance from CRISM Observations, the surface reflectances at different
geometries are analyzed by inverting the Hapke photometric model depending on
the single scattering albedo, the 2-term phase function, the macroscopic
roughness and the 2-term opposition effects. Surface photometric maps are
created to observe the spatial variations of surface scattering properties as a
function of geological units at the CRISM spatial resolution (200 m/pixel). We
present an application to the Mars Exploration Rover (MER) landing sites at Gusev
Crater and Meridiani Planum, where both orbital and in situ observations are
available. Complementary orbital observations (e.g. CRISM
spectra, THermal EMission Imaging System, High Resolution Imaging Science
Experiment images) are used for interpreting the estimated Hapke photometric
parameters in terms of physical properties. The in situ observations are used
as ground truth to validate the interpretations. Varied scattering properties
are observed within a single CRISM observation (5 × 10 km), suggesting that the surfaces
are controlled by local geological processes (e.g. volcanic resurfacing,
aeolian and impact processes) rather than regional or global ones. The results
are consistent with the in situ observations, validating the approach
and the use of photometry for characterizing the physical properties of the
Martian surface.