Expansion and Shrinkage of Localization for Weakly-Supervised Semantic Segmentation
Generating precise class-aware pseudo ground-truths, a.k.a, class activation
maps (CAMs), is essential for weakly-supervised semantic segmentation. The
original CAM method usually produces incomplete and inaccurate localization
maps. To tackle this issue, this paper proposes an Expansion and Shrinkage
scheme based on the offset learning in the deformable convolution, to
sequentially improve the recall and precision of the located object in the two
respective stages. In the Expansion stage, an offset learning branch in a
deformable convolution layer, referred to as the "expansion sampler", seeks to
sample increasingly less discriminative object regions, driven by an inverse
supervision signal that maximizes the image-level classification loss. The more
complete object located in the Expansion stage is then gradually narrowed down to
the final object region during the Shrinkage stage. In the Shrinkage stage, the
offset learning branch of another deformable convolution layer, referred to as
the "shrinkage sampler", is introduced to exclude the false positive background
regions attended in the Expansion stage to improve the precision of the
localization maps. Experiments on PASCAL VOC 2012 and MS COCO 2014 demonstrate
the superiority of our method over other state-of-the-art methods for
weakly-supervised semantic segmentation. Code will be made publicly available
at https://github.com/TyroneLi/ESOL_WSSS. Comment: Accepted at NeurIPS 2022.
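The class activation maps that this abstract starts from can be illustrated with a minimal sketch. It shows only the original CAM formulation (a class-weighted sum of the final feature maps, then thresholding into a pseudo mask), not the Expansion and Shrinkage samplers themselves. The function names, the toy inputs, and the 0.5 threshold are illustrative assumptions, not taken from the paper or its repository.

```python
def class_activation_map(features, class_weights):
    """Weighted sum of C feature maps (each H x W) for a single class."""
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(features, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


def normalize(cam):
    """Clip negatives to zero, then scale so the peak activation is 1."""
    clipped = [[max(v, 0.0) for v in row] for row in cam]
    peak = max(max(row) for row in clipped)
    if peak == 0.0:
        return clipped
    return [[v / peak for v in row] for row in clipped]


def pseudo_mask(cam, threshold=0.5):
    """Binary pseudo ground truth; the threshold is an assumed value."""
    return [[1 if v >= threshold else 0 for v in row] for row in cam]


# Toy example: two 2x2 feature maps and assumed classifier weights.
features = [[[4.0, 2.0], [0.0, 0.0]],
            [[0.0, 2.0], [2.0, 0.0]]]
weights = [1.0, 0.5]
mask = pseudo_mask(normalize(class_activation_map(features, weights)))
```

In this toy case only the strongest responses survive the threshold, which mirrors the incomplete localization that the abstract attributes to the original CAM method.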
Unmanned aerial vehicle-based computer vision for structural vibration measurement and condition assessment: A concise survey
With rapid advances in camera sensor technology, acquiring high-resolution images and videos has become convenient and cost-effective. Computer vision, which extracts semantic knowledge directly from digital images or videos, offers a promising solution for non-contact, full-field structural vibration measurement and condition assessment. Unmanned aerial vehicles (UAVs), also known as flying robots or drones, are being actively developed for a wide range of applications. Thanks to their excellent mobility and flexibility, camera-equipped UAV systems can facilitate the use of computer vision and thus enhance the capacity for structural condition assessment. This article provides a concise survey of recent progress and applications of UAV-based computer vision in the field of structural dynamics. The aspects discussed include UAV system design and algorithmic development in computer vision. The main challenges, future trends, and opportunities to advance the technology and close the gap between research and practice are also discussed.
Weakly Supervised Semantic Segmentation via Progressive Patch Learning
Most existing semantic segmentation approaches that use image-level class
labels as supervision rely heavily on the initial class activation map (CAM)
generated by a standard classification network. In this paper, a novel
"Progressive Patch Learning" approach is proposed to improve the local detail
extraction of the classification network, producing CAMs that better cover the
whole object rather than only the most discriminative regions, as in CAMs
obtained from conventional classification models. "Patch Learning" breaks the
feature maps into patches and independently processes each local patch in
parallel before the final aggregation. Such a mechanism forces the network to
find weak information from scattered discriminative local parts, improving its
sensitivity to local details. "Progressive Patch Learning" further extends the
feature destruction and patch learning to multi-level granularities in a
progressive manner. Combined with a multi-stage optimization strategy, such
a "Progressive Patch Learning" mechanism implicitly provides the model with the
feature extraction ability across different locality-granularities. As an
alternative to the implicit multi-granularity progressive fusion approach, we
additionally propose an explicit method to simultaneously fuse features from
different granularities in a single model, further enhancing the CAM quality in
terms of full object coverage. Our proposed method achieves outstanding
performance on the PASCAL VOC 2012 dataset (e.g., 69.6% mIoU on the test set),
which surpasses most existing weakly supervised semantic segmentation methods.
Code will be made publicly available at https://github.com/TyroneLi/PPL_WSSS.
Comment: Accepted at TMM 2022.
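The patch mechanism described in this abstract can be sketched in a few lines. The sketch below splits a single feature map into a grid of patches, processes each patch independently, and repeats this at several granularities. Per-patch max pooling stands in for the network layers that would process each patch, and the function names and grid sizes are illustrative assumptions rather than the paper's implementation.

```python
def split_into_patches(fmap, grid):
    """Split an H x W map into grid x grid equally sized patches (row-major)."""
    height, width = len(fmap), len(fmap[0])
    if height % grid or width % grid:
        raise ValueError("map size must be divisible by the grid size")
    ph, pw = height // grid, width // grid
    patches = []
    for gi in range(grid):
        for gj in range(grid):
            rows = fmap[gi * ph:(gi + 1) * ph]
            patches.append([row[gj * pw:(gj + 1) * pw] for row in rows])
    return patches


def patch_peaks(fmap, grid):
    """Process each patch on its own; max pooling is an assumed stand-in."""
    return [max(max(row) for row in patch)
            for patch in split_into_patches(fmap, grid)]


def progressive_peaks(fmap, grids=(1, 2, 4)):
    """Repeat the patch step at several (assumed) granularities."""
    return {grid: patch_peaks(fmap, grid) for grid in grids}


# Toy 4x4 map with one dominant response and several weaker ones.
fmap = [[9, 1, 0, 0],
        [1, 1, 0, 3],
        [0, 0, 0, 0],
        [2, 0, 0, 1]]
peaks = progressive_peaks(fmap, grids=(1, 2))
```

With a single patch, only the dominant response is visible; with a 2x2 grid, the weaker responses in the other quadrants each surface in their own patch, which is the intuition behind finding weak information in scattered local parts.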
Lifelong Embedding Learning and Transfer for Growing Knowledge Graphs
Existing knowledge graph (KG) embedding models have primarily focused on
static KGs. However, real-world KGs do not remain static, but rather evolve and
grow in tandem with the development of KG applications. Consequently, new facts
and previously unseen entities and relations continually emerge, necessitating
an embedding model that can quickly learn and transfer new knowledge through
growth. Motivated by this, we delve into an expanding field of KG embedding in
this paper, i.e., lifelong KG embedding. We consider knowledge transfer and
retention of the learning on growing snapshots of a KG without having to learn
embeddings from scratch. The proposed model includes a masked KG autoencoder
for embedding learning and update, with an embedding transfer strategy to
inject the learned knowledge into the new entity and relation embeddings, and
an embedding regularization method to avoid catastrophic forgetting. To
investigate the impacts of different aspects of KG growth, we construct four
datasets to evaluate the performance of lifelong KG embedding. Experimental
results show that the proposed model outperforms the state-of-the-art inductive
and lifelong embedding baselines. Comment: Accepted at the 37th AAAI
Conference on Artificial Intelligence (AAAI 2023).
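Two of the ingredients named in this abstract, embedding transfer and embedding regularization, can be illustrated with a generic sketch. It assumes that a new entity is initialized from the mean of its already-embedded neighbors and that forgetting is discouraged by a squared L2 penalty toward the previous snapshot's embeddings. Both choices, and all names below, are illustrative assumptions and may differ from the paper's exact formulation.

```python
def transfer_init(neighbor_ids, old_embeddings, dim):
    """Initialize a new entity from the mean of its known neighbors (assumed rule)."""
    known = [old_embeddings[n] for n in neighbor_ids if n in old_embeddings]
    if not known:
        return [0.0] * dim  # no known neighbors: fall back to a zero vector
    return [sum(vec[k] for vec in known) / len(known) for k in range(dim)]


def retention_penalty(new_embeddings, old_embeddings, weight=1.0):
    """Squared L2 drift of previously learned entities (assumed regularizer)."""
    total = 0.0
    for entity, old_vec in old_embeddings.items():
        if entity not in new_embeddings:
            continue
        new_vec = new_embeddings[entity]
        total += sum((a - b) ** 2 for a, b in zip(new_vec, old_vec))
    return weight * total


# Toy snapshot: two old entities, one new entity "c" linked to both.
old = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
new = {"a": [1.0, 1.0], "b": [0.0, 1.0]}
new["c"] = transfer_init(["a", "b"], old, dim=2)
penalty = retention_penalty(new, old, weight=0.5)
```

The penalty would be added to the training loss on the new snapshot, so that only entities from earlier snapshots are anchored while new entities remain free to move.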