ShadowSense: Unsupervised Domain Adaptation and Feature Fusion for Shadow-Agnostic Tree Crown Detection from RGB-Thermal Drone Imagery
Accurate detection of individual tree crowns from remote sensing data poses a
significant challenge due to the dense nature of forest canopy and the presence
of diverse environmental variations, e.g., overlapping canopies, occlusions,
and varying lighting conditions. Additionally, the lack of data for training
robust models adds another limitation in effectively studying complex forest
conditions. This paper presents a novel method for detecting shadowed tree
crowns and provides a challenging dataset comprising roughly 50k paired
RGB-thermal images to facilitate future research for illumination-invariant
detection. The proposed method (ShadowSense) is entirely self-supervised: it
leverages domain adversarial training without source domain annotations for the
feature extractor, and foreground feature alignment for the feature pyramid
network, adapting domain-invariant representations by focusing on visible
foreground regions. It then fuses complementary information from both
modalities to improve upon the predictions of an RGB-trained detector and boost
overall accuracy. Extensive experiments demonstrate the
superiority of the proposed method over both the baseline RGB-trained detector
and state-of-the-art techniques that rely on unsupervised domain adaptation or
early image fusion. Our code and data are available at:
https://github.com/rudrakshkapil/ShadowSense
Comment: Accepted to the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024 main conference! 8 pages (11 with bibliography), 5 figures, 3 tables
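Domain adversarial training of the kind described here is typically built around a gradient reversal: a small domain classifier learns to tell the two domains apart, while the feature extractor receives the negated gradient so that its features become domain-invariant. A minimal NumPy sketch of that sign flip (the toy logistic domain classifier and all names below are illustrative assumptions, not the paper's actual architecture):

```python
import numpy as np

def domain_grads(features, domain_labels, w):
    """Gradients of a toy logistic domain classifier (0 = RGB, 1 = thermal)."""
    logits = features @ w                      # (N,)
    probs = 1.0 / (1.0 + np.exp(-logits))      # P(domain = thermal)
    err = (probs - domain_labels) / len(domain_labels)
    grad_w = features.T @ err                  # gradient for the classifier
    grad_f = np.outer(err, w)                  # gradient flowing into features
    return grad_w, grad_f

def adversarial_step(features, domain_labels, w, lr=0.1):
    """The classifier descends the domain loss; the features ascend it
    (gradient reversal), becoming harder to tell apart by domain."""
    grad_w, grad_f = domain_grads(features, domain_labels, w)
    new_w = w - lr * grad_w                    # minimize domain loss
    new_features = features + lr * grad_f      # reversed sign: maximize it
    return new_w, new_features
```

In a full pipeline the reversed gradient would flow back through the feature extractor's weights rather than the feature values directly; the sign flip is the essential ingredient.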
A Chronological Survey of Theoretical Advancements in Generative Adversarial Networks for Computer Vision
Generative Adversarial Networks (GANs) have been the workhorse generative
models for many years, especially in the field of computer vision.
Accordingly, there have been many significant advancements in the theory and
application of GAN models, which are notoriously hard to train but produce
good results when trained well. There have been many surveys on GANs,
organizing the vast GAN literature from various focuses and perspectives.
However, none of these surveys brings out an important chronological aspect:
how the multiple challenges of employing GAN models were solved one by one
over time, across multiple landmark research works. This survey intends to
bridge that gap and presents some of the landmark research works on the
theory and application of GANs, in chronological order.
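The min-max objective those landmark works iterate on can be stated compactly: the discriminator maximizes log D(x) + log(1 − D(G(z))), while the generator, in the widely used non-saturating variant, maximizes log D(G(z)). A small NumPy sketch of the two losses (illustrative only; in practice these are computed from network outputs inside an autograd framework):

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-12):
    """d_real/d_fake: discriminator outputs in (0, 1) on real/generated
    samples. Returns (discriminator loss, generator loss) to be minimized."""
    # discriminator: binary cross-entropy with targets real=1, fake=0
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    # non-saturating generator loss: push D's score on fakes toward 1
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss
```

At the equilibrium where the discriminator outputs 0.5 everywhere, the discriminator loss equals 2 log 2 ≈ 1.386, a common sanity check during training.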
Adaptive Face Recognition Using Adversarial Information Network
In many real-world applications, face recognition models often degenerate
when training data (referred to as source domain) are different from testing
data (referred to as target domain). To alleviate this mismatch, caused by
factors such as pose and skin tone, pseudo-labels generated by clustering
algorithms are an effective tool in unsupervised domain adaptation. However,
such algorithms always miss some hard positive samples. Supervision on
pseudo-labeled samples attracts them towards their prototypes and would cause
an intra-domain gap between pseudo-labeled samples and the remaining unlabeled
samples within target domain, which results in the lack of discrimination in
face recognition. In this paper, considering the particularity of face
recognition, we propose a novel adversarial information network (AIN) to
address it. First, a novel adversarial mutual information (MI) loss is proposed
to alternately minimize MI with respect to the target classifier and maximize
MI with respect to the feature extractor. Through this min-max game, the
positions of target prototypes are adaptively adjusted so that unlabeled
images are clustered more easily and the intra-domain gap is mitigated.
Second, to
assist adversarial MI loss, we utilize a graph convolution network to predict
linkage likelihoods between target data and generate pseudo-labels. It
leverages valuable information in the context of nodes and can achieve more
reliable results. The proposed method is evaluated under two scenarios, i.e.,
domain adaptation across poses and image conditions, and domain adaptation
across faces with different skin tones. Extensive experiments show that AIN
successfully improves cross-domain generalization and offers a new
state-of-the-art on the RFW dataset.
Comment: Accepted by TI
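For a batch of softmax outputs, the mutual information between samples and predicted classes is commonly estimated as the entropy of the marginal prediction minus the mean per-sample entropy; the adversarial scheme then minimizes it w.r.t. the classifier and maximizes it w.r.t. the feature extractor. A hedged NumPy sketch of that estimator (a standard formulation, not necessarily the paper's exact loss):

```python
import numpy as np

def batch_mutual_information(probs, eps=1e-12):
    """probs: (N, K) rows of softmax class probabilities.
    Returns I(Z; Y) = H(mean prediction) - mean per-sample entropy."""
    marginal = probs.mean(axis=0)                                  # (K,)
    h_marginal = -np.sum(marginal * np.log(marginal + eps))
    h_cond = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    return h_marginal - h_cond
```

Uniform predictions give 0, while confident, balanced predictions reach the maximum of log K, which is why raising this quantity through the feature extractor encourages well-separated clusters.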
Bridging the Domain Gap for Multi-Agent Perception
Existing multi-agent perception algorithms usually share deep neural features
extracted from raw sensing data between agents, achieving a trade-off between
accuracy and the communication bandwidth limit. However, these
methods assume all agents have identical neural networks, which might not be
practical in the real world. The transmitted features can have a large domain
gap when the models differ, leading to a dramatic performance drop in
multi-agent perception. In this paper, we propose the first lightweight
framework to bridge such domain gaps for multi-agent perception, which can be a
plug-in module for most existing systems while maintaining confidentiality. Our
framework consists of a learnable feature resizer to align features in multiple
dimensions and a sparse cross-domain transformer for domain adaption. Extensive
experiments on the public multi-agent perception dataset V2XSet have
demonstrated that our method can effectively bridge the gap for features from
different domains and outperform other baseline methods significantly by at
least 8% for point-cloud-based 3D object detection.
Comment: Accepted by ICRA 2023. Code: https://github.com/DerrickXuNu/MPD
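The core job of such a feature resizer is mapping one agent's feature map onto the spatial size and channel count the receiving agent expects. A minimal NumPy sketch (nearest-neighbor spatial resize plus a learnable channel projection; shapes and names are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def resize_feature_map(feat, out_h, out_w, channel_proj):
    """feat: (C_in, H, W) feature map from the sender.
    channel_proj: (C_out, C_in) learnable projection matrix.
    Returns a (C_out, out_h, out_w) map aligned to the receiver's shape."""
    c_in, h, w = feat.shape
    # nearest-neighbor index maps for spatial alignment
    ys = np.arange(out_h) * h // out_h
    xs = np.arange(out_w) * w // out_w
    spatial = feat[:, ys][:, :, xs]            # (C_in, out_h, out_w)
    # project channels: contract over the input-channel dimension
    return np.einsum('oc,chw->ohw', channel_proj, spatial)
```

In the actual framework this alignment is learned end to end and followed by a sparse cross-domain transformer; the sketch only shows the dimension-matching step.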
Low-light Pedestrian Detection in Visible and Infrared Image Feeds: Issues and Challenges
Pedestrian detection has become a cornerstone for several high-level tasks,
including autonomous driving, intelligent transportation, and traffic
surveillance. Several works have focused on pedestrian detection using visible
images, mainly in the daytime. However, the task becomes far more challenging
when environmental conditions change to poor lighting or nighttime.
Recently, new ideas have been spurred to use alternative sources, such as Far
InfraRed (FIR) temperature sensor feeds for detecting pedestrians in low-light
conditions. This study comprehensively reviews recent developments in low-light
pedestrian detection approaches. It systematically categorizes and analyses
algorithms ranging from region-based to non-region-based and graph-based
learning approaches, highlighting their methodologies, implementation issues,
and challenges. It also outlines the key benchmark datasets that can be
used for research and development of advanced pedestrian detection algorithms,
particularly in low-light situations.
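Why FIR feeds help at night is easy to illustrate: pedestrians are warm, so even a naive intensity threshold on a thermal frame yields candidate regions that no visible-light detector could recover in darkness. A toy sketch (purely illustrative; the surveyed detectors learn such regions rather than thresholding):

```python
import numpy as np

def warm_region_bbox(thermal, threshold):
    """thermal: (H, W) array of thermal intensities. Returns the bounding
    box (x_min, y_min, x_max, y_max) of pixels warmer than `threshold`,
    or None if no pixel exceeds it."""
    ys, xs = np.nonzero(thermal > threshold)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```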