254,721 research outputs found
Continuous Cross-resolution Remote Sensing Image Change Detection
Most contemporary supervised Remote Sensing (RS) image Change Detection (CD)
approaches are customized for equal-resolution bitemporal images. Real-world
applications raise the need for cross-resolution change detection, aka, CD
based on bitemporal images with different spatial resolutions. Given training
samples of a fixed bitemporal resolution difference (ratio) between the
high-resolution (HR) image and the low-resolution (LR) one, current
cross-resolution methods may fit a certain ratio but lack adaptation to other
resolution differences. Toward continuous cross-resolution CD, we propose
scale-invariant learning to enforce the model consistently predicting HR
results given synthesized samples of varying resolution differences.
Concretely, we synthesize blurred versions of the HR image by random
downsampled reconstructions to reduce the gap between HR and LR images. We
introduce coordinate-based representations to decode per-pixel predictions by
feeding the coordinate query and corresponding multi-level embedding features
into an MLP that implicitly learns the shape of land cover changes, therefore
benefiting recognizing blurred objects in the LR image. Moreover, considering
that spatial resolution mainly affects the local textures, we apply
local-window self-attention to align bitemporal features during the early
stages of the encoder. Extensive experiments on two synthesized and one
real-world different-resolution CD datasets verify the effectiveness of the
proposed method. Our method significantly outperforms several vanilla CD
methods and two cross-resolution CD methods on the three datasets both in
in-distribution and out-of-distribution settings. The empirical results suggest
that our method could yield relatively consistent HR change predictions
regardless of varying bitemporal resolution ratios. Our code is available at
\url{https://github.com/justchenhao/SILI_CD}.Comment: 21 pages, 11 figures. Accepted article by IEEE TGR
Perceptual Video Coding for Machines via Satisfied Machine Ratio Modeling
Video Coding for Machines (VCM) aims to compress visual signals for machine
analysis. However, existing methods only consider a few machines, neglecting
the majority. Moreover, the machine perceptual characteristics are not
effectively leveraged, leading to suboptimal compression efficiency. In this
paper, we introduce Satisfied Machine Ratio (SMR) to address these issues. SMR
statistically measures the quality of compressed images and videos for machines
by aggregating satisfaction scores from them. Each score is calculated based on
the difference in machine perceptions between original and compressed images.
Targeting image classification and object detection tasks, we build two
representative machine libraries for SMR annotation and construct a large-scale
SMR dataset to facilitate SMR studies. We then propose an SMR prediction model
based on the correlation between deep features differences and SMR.
Furthermore, we introduce an auxiliary task to increase the prediction accuracy
by predicting the SMR difference between two images in different quality
levels. Extensive experiments demonstrate that using the SMR models
significantly improves compression performance for VCM, and the SMR models
generalize well to unseen machines, traditional and neural codecs, and
datasets. In summary, SMR enables perceptual coding for machines and advances
VCM from specificity to generality. Code is available at
\url{https://github.com/ywwynm/SMR}
Gaze Distribution Analysis and Saliency Prediction Across Age Groups
Knowledge of the human visual system helps to develop better computational
models of visual attention. State-of-the-art models have been developed to
mimic the visual attention system of young adults that, however, largely ignore
the variations that occur with age. In this paper, we investigated how visual
scene processing changes with age and we propose an age-adapted framework that
helps to develop a computational model that can predict saliency across
different age groups. Our analysis uncovers how the explorativeness of an
observer varies with age, how well saliency maps of an age group agree with
fixation points of observers from the same or different age groups, and how age
influences the center bias. We analyzed the eye movement behavior of 82
observers belonging to four age groups while they explored visual scenes.
Explorativeness was quantified in terms of the entropy of a saliency map, and
area under the curve (AUC) metrics was used to quantify the agreement analysis
and the center bias. These results were used to develop age adapted saliency
models. Our results suggest that the proposed age-adapted saliency model
outperforms existing saliency models in predicting the regions of interest
across age groups
Quicksilver: Fast Predictive Image Registration - a Deep Learning Approach
This paper introduces Quicksilver, a fast deformable image registration
method. Quicksilver registration for image-pairs works by patch-wise prediction
of a deformation model based directly on image appearance. A deep
encoder-decoder network is used as the prediction model. While the prediction
strategy is general, we focus on predictions for the Large Deformation
Diffeomorphic Metric Mapping (LDDMM) model. Specifically, we predict the
momentum-parameterization of LDDMM, which facilitates a patch-wise prediction
strategy while maintaining the theoretical properties of LDDMM, such as
guaranteed diffeomorphic mappings for sufficiently strong regularization. We
also provide a probabilistic version of our prediction network which can be
sampled during the testing time to calculate uncertainties in the predicted
deformations. Finally, we introduce a new correction network which greatly
increases the prediction accuracy of an already existing prediction network. We
show experimental results for uni-modal atlas-to-image as well as uni- / multi-
modal image-to-image registrations. These experiments demonstrate that our
method accurately predicts registrations obtained by numerical optimization, is
very fast, achieves state-of-the-art registration results on four standard
validation datasets, and can jointly learn an image similarity measure.
Quicksilver is freely available as an open-source software.Comment: Add new discussion
- …