Dual Pyramid Generative Adversarial Networks for Semantic Image Synthesis
The goal of semantic image synthesis is to generate photo-realistic images
from semantic label maps. It is highly relevant for tasks like content
generation and image editing. Current state-of-the-art approaches, however,
still struggle to generate realistic objects in images at various scales. In
particular, small objects tend to fade away and large objects are often
generated as collages of patches. In order to address this issue, we propose a
Dual Pyramid Generative Adversarial Network (DP-GAN) that learns the
conditioning of spatially-adaptive normalization blocks at all scales jointly,
such that scale information is bi-directionally used, and it unifies
supervision at different scales. Our qualitative and quantitative results show
that the proposed approach generates images where small and large objects look
more realistic compared to images generated by state-of-the-art methods.
Comment: BMVC202
MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution
Although Vision Transformers (ViTs) have recently advanced computer vision
tasks significantly, an important real-world problem was overlooked: adapting
to variable input resolutions. Typically, images are resized to a fixed
resolution, such as 224x224, for efficiency during training and inference.
However, uniform input size conflicts with real-world scenarios where images
naturally vary in resolution. Modifying the preset resolution of a model may
severely degrade the performance. In this work, we propose to enhance the model
adaptability to resolution variation by optimizing the patch embedding. The
proposed method, called Multi-Scale Patch Embedding (MSPE), substitutes the
standard patch embedding with multiple variable-sized patch kernels and selects
the best parameters for different resolutions, eliminating the need to resize
the original image. Our method does not require high-cost training or
modifications to other parts, making it easy to apply to most ViT models.
Experiments in image classification, segmentation, and detection tasks
demonstrate the effectiveness of MSPE, yielding superior performance on
low-resolution inputs and performing comparably to existing methods on
high-resolution inputs.
Towards Trustworthy Dataset Distillation
Efficiency and trustworthiness are two eternal pursuits when applying deep
learning in real-world applications. With regard to efficiency, dataset
distillation (DD) endeavors to reduce training costs by distilling the large
dataset into a tiny synthetic dataset. However, existing methods merely
concentrate on in-distribution (InD) classification in a closed-world setting,
disregarding out-of-distribution (OOD) samples. On the other hand, OOD
detection aims to enhance models' trustworthiness, which is always
inefficiently achieved in full-data settings. For the first time, we
simultaneously consider both issues and propose a novel paradigm called
Trustworthy Dataset Distillation (TrustDD). By distilling both InD samples and
outliers, the condensed datasets can train models competent in both
InD classification and OOD detection. To alleviate the requirement of real
outlier data and make OOD detection more practical, we further propose to
corrupt InD samples to generate pseudo-outliers and introduce Pseudo-Outlier
Exposure (POE). Comprehensive experiments on various settings demonstrate the
effectiveness of TrustDD, and the proposed POE surpasses the state-of-the-art
method Outlier Exposure (OE). Compared with preceding DD methods, TrustDD is more
trustworthy and applicable to real open-world scenarios. Our code will be
publicly available.
Comment: 20 pages, 20 figure
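The idea of corrupting InD samples to obtain pseudo-outliers can be illustrated with a minimal sketch. The abstract does not say which corruptions POE applies, so the patch shuffling below is an assumption chosen purely for illustration, and `shuffle_patches` is a hypothetical helper rather than part of the TrustDD code.

```python
import random


def shuffle_patches(image, patch, seed=0):
    """Return a copy of `image` (a list of rows) with its patch x patch tiles permuted.

    Assumption: patch shuffling is only one possible corruption; the source
    does not state which corruption POE actually uses.
    """
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    # Top-left corners of all tiles, in row-major order.
    corners = [(r, c) for r in range(0, h, patch) for c in range(0, w, patch)]
    shuffled = corners[:]
    random.Random(seed).shuffle(shuffled)
    out = [[None] * w for _ in range(h)]
    # Copy each source tile into its new destination tile.
    for (dr, dc), (sr, sc) in zip(corners, shuffled):
        for i in range(patch):
            for j in range(patch):
                out[dr + i][dc + j] = image[sr + i][sc + j]
    return out


# A toy 4x4 "InD image" whose pixel values are just their flat indices.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
pseudo_outlier = shuffle_patches(image, patch=2)
```

The corrupted sample keeps the pixel statistics of the original but breaks its global structure, which is one plausible way to obtain an outlier-like sample without real outlier data.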
Wide-Area Damping Controller of FACTS Devices for Inter-Area Oscillations Considering Communication Time Delays
The use of remote signals obtained from a wide-area measurement system (WAMS) introduces time delays into a wide-area damping controller (WADC), which can degrade system damping and even cause instability. The time-delay margin is defined as the maximum time delay under which a closed-loop system remains stable. In this paper, the delay margin is introduced as an additional performance index for the synthesis of classical WADCs for flexible AC transmission systems (FACTS) devices to damp inter-area oscillations. The proposed approach has three parts: a geometric measure approach for selecting feedback remote signals, a residue method for designing phase-compensation parameters, and a Lyapunov stability criterion with linear matrix inequalities (LMI) for calculating the delay margin and determining the gain of the WADC based on a tradeoff between damping performance and delay margin. Three case studies are undertaken: a four-machine two-area power system demonstrates the design principle of the proposed approach, while a New England ten-machine 39-bus power system and a 16-machine 68-bus power system verify its feasibility on larger and more complex power systems. The simulation results verify the effectiveness of the proposed approach in balancing delay margin and damping performance.
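The notion of a delay margin can be made concrete with a small worked example. The sketch below does not use the Lyapunov/LMI procedure described in the abstract; it substitutes the closed-form frequency-domain result for the scalar test system dx/dt = -a x(t) - b x(t - tau) with b > |a|, where the margin is tau* = arccos(-a/b) / sqrt(b^2 - a^2). The system, the function name `delay_margin`, and the parameter values are all illustrative assumptions.

```python
import math


def delay_margin(a, b):
    """Delay margin of dx/dt = -a*x(t) - b*x(t - tau), assuming b > |a|.

    Setting s = j*omega in s + a + b*exp(-s*tau) = 0 gives
    omega = sqrt(b^2 - a^2) and omega*tau = arccos(-a/b).
    """
    if b <= abs(a):
        raise ValueError("this closed form only covers the case b > |a|")
    omega = math.sqrt(b * b - a * a)  # crossing frequency on the imaginary axis
    return math.acos(-a / b) / omega  # largest delay that keeps the loop stable


# Hypothetical gains: extra undelayed damping `a` enlarges the margin.
tau_no_local_damping = delay_margin(0.0, 2.0)
tau_with_local_damping = delay_margin(1.0, 2.0)
```

Even in this toy setting the gain trades off against the tolerable delay: with a = 0 the margin is pi / (2b), so a larger feedback gain b shrinks the delay the loop can tolerate.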
Dynamic Network Characteristics of Power-electronics-based Power Systems
Power flow studies in traditional power systems aim to uncover the stationary relationship between voltage amplitude and phase and active and reactive powers; they are important for both stationary and dynamic power system analysis. With the increasing penetration of large-scale power electronics devices, including renewable generation interfaced through converters, power systems are gradually becoming power-electronics-dominant and their dynamical behavior changes substantially as a result. Due to the fast dynamics of converters, such as those of the AC current controller, the quasi-stationary state approximation, which has been widely used in power systems, is no longer appropriate and should be reexamined. In this paper, for a better description of network characteristics, we develop a novel concept of dynamic power flow and uncover an explicit dynamic relation between the instantaneous powers and the voltage vectors. This mathematical relation has been verified by simulations of the transient behavior of a small power-electronics-based power system, and by a small-signal frequency-domain stability analysis of a voltage source converter connected to an infinitely strong bus. These results demonstrate the applicability of the proposed method and shed light on power-electronics-dominant power systems, whose dynamical nature remains obscure.
Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving
Multi-view depth estimation has achieved impressive performance over various
benchmarks. However, almost all current multi-view systems rely on given ideal
camera poses, which are unavailable in many real-world scenarios, such as
autonomous driving. In this work, we propose a new robustness benchmark to
evaluate the depth estimation system under various noisy pose settings.
Surprisingly, we find current multi-view depth estimation methods or
single-view and multi-view fusion methods will fail when given noisy pose
settings. To address this challenge, we propose a single-view and multi-view
fused depth estimation system, which adaptively integrates high-confidence
multi-view and single-view results for both robust and accurate depth
estimations. The adaptive fusion module performs fusion by dynamically
selecting high-confidence regions between two branches based on a wrapping
confidence map. Thus, the system tends to choose the more reliable branch when
facing textureless scenes, inaccurate calibration, dynamic objects, and other
degradation or challenging conditions. Our method outperforms state-of-the-art
multi-view and fusion methods under robustness testing. Furthermore, we achieve
state-of-the-art performance on challenging benchmarks (KITTI and DDAD) when
given accurate pose estimations. Project website:
https://github.com/Junda24/AFNet/.
Comment: Accepted to CVPR 202
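The confidence-based selection between the two branches can be sketched as follows. In the paper the fusion module is part of a learned network; the hard per-pixel threshold, the function name `fuse_depth`, and the toy depth values below are simplifying assumptions made only to illustrate the selection idea.

```python
def fuse_depth(single_view, multi_view, confidence, threshold=0.5):
    """Per-pixel fusion: keep the multi-view depth where its confidence is
    high, otherwise fall back to the single-view depth.

    Assumption: a fixed threshold stands in for the learned adaptive fusion.
    """
    fused = []
    for sv_row, mv_row, conf_row in zip(single_view, multi_view, confidence):
        fused.append([
            mv if c >= threshold else sv
            for sv, mv, c in zip(sv_row, mv_row, conf_row)
        ])
    return fused


# Toy 2x2 depth maps (metres) and a multi-view confidence map in [0, 1].
single = [[1.0, 2.0], [3.0, 4.0]]
multi = [[10.0, 20.0], [30.0, 40.0]]
conf = [[0.9, 0.1], [0.5, 0.4]]
fused = fuse_depth(single, multi, conf)
```

Pixels with low multi-view confidence, as might occur with noisy poses or dynamic objects, fall back to the single-view estimate.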
Active Generalized Category Discovery
Generalized Category Discovery (GCD) is a pragmatic and challenging
open-world task, which endeavors to cluster unlabeled samples from both novel
and old classes, leveraging some labeled data of old classes. Given that
knowledge learned from old classes is not fully transferable to new classes,
and that novel categories are fully unlabeled, GCD inherently faces intractable
problems, including imbalanced classification performance and inconsistent
confidence between old and new classes, especially in the low-labeling regime.
Hence, some annotations of new classes are deemed necessary. However, labeling
new classes is extremely costly. To address this issue, we take the spirit of
active learning and propose a new setting called Active Generalized Category
Discovery (AGCD). The goal is to improve the performance of GCD by actively
selecting a limited number of valuable samples for labeling from the oracle. To
solve this problem, we devise an adaptive sampling strategy, which jointly
considers novelty, informativeness and diversity to adaptively select novel
samples with proper uncertainty. However, owing to the varied orderings of
label indices caused by the clustering of novel classes, the queried labels are
not directly applicable to subsequent training. To overcome this issue, we
further propose a stable label mapping algorithm that transforms ground truth
labels to the label space of the classifier, thereby ensuring consistent
training across different active selection stages. Our method achieves
state-of-the-art performance on both generic and fine-grained datasets. Our
code is available at https://github.com/mashijie1028/ActiveGCD
Comment: Accepted to CVPR 202
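The label-ordering problem can be illustrated with a small sketch: the classifier's indices for novel classes are arbitrary, so queried ground-truth labels must first be translated into the classifier's label space. The abstract does not describe the stable label mapping algorithm itself, so the greedy co-occurrence matching below is an assumed stand-in, and `build_label_mapping` is a hypothetical name.

```python
from collections import Counter


def build_label_mapping(true_labels, predicted_indices):
    """Map each ground-truth label to one classifier index, one-to-one.

    Assumption: a greedy match on co-occurrence counts; the paper's own
    algorithm may differ.
    """
    counts = Counter(zip(true_labels, predicted_indices))
    mapping, used = {}, set()
    # Visit (label, index) pairs from most to least frequent; ties are
    # broken by repr so the result is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], repr(kv[0])))
    for (label, index), _ in ordered:
        if label not in mapping and index not in used:
            mapping[label] = index
            used.add(index)
    return mapping


# Queried samples: oracle labels and the classifier's current predictions.
oracle = ["cat", "cat", "dog", "dog", "dog", "cat"]
predicted = [2, 2, 0, 0, 2, 2]
mapping = build_label_mapping(oracle, predicted)
training_targets = [mapping[label] for label in oracle]
```

Once such a mapping is fixed, reusing it in later selection rounds keeps the training targets consistent across stages.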