Latency-aware Unified Dynamic Networks for Efficient Image Recognition
Dynamic computation has emerged as a promising avenue to enhance the
inference efficiency of deep networks. It allows selective activation of
computational units, leading to a reduction in unnecessary computations for
each input sample. However, the actual efficiency of these dynamic models can
deviate from theoretical predictions. This mismatch arises from: 1) the lack of
a unified approach due to fragmented research; 2) the focus on algorithm design
over critical scheduling strategies, especially in CUDA-enabled GPU contexts;
and 3) challenges in measuring practical latency, given that most libraries
cater to static operations. Addressing these issues, we unveil the
Latency-Aware Unified Dynamic Networks (LAUDNet), a framework that integrates
three primary dynamic paradigms: spatially adaptive computation, dynamic layer
skipping, and dynamic channel skipping. To bridge the theoretical and practical
efficiency gap, LAUDNet merges algorithmic design with scheduling optimization,
guided by a latency predictor that accurately gauges dynamic operator latency.
We have evaluated LAUDNet across multiple vision tasks, demonstrating that it
reduces the latency of models such as ResNet-101 by over 50% on platforms
including V100, RTX3090, and TX2 GPUs, while maintaining a notably strong
balance between accuracy and efficiency. Code is available at:
https://www.github.com/LeapLabTHU/LAUDNet
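To make one of the three paradigms concrete, the sketch below shows a per-sample gated residual block for dynamic layer skipping in PyTorch. This is a minimal illustration, not the LAUDNet implementation: the gate architecture, the 0.5 threshold, and the straight-through trick are assumptions made for the example.

```python
# Minimal sketch (not the official LAUDNet code) of dynamic layer skipping:
# a lightweight gate looks at a pooled feature and decides, per sample,
# whether to execute the residual block. Names and gate design are illustrative.
import torch
import torch.nn as nn

class LayerSkippingBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Tiny gating head: global pooling -> linear -> execution probability.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p = self.gate(x)                 # (N, 1) execution probability
        hard = (p > 0.5).float()         # per-sample binary skip/execute decision
        # Straight-through estimator: hard decision in the forward pass,
        # soft gradient through the probability in the backward pass.
        mask = hard + p - p.detach()
        out = self.body(x) * mask.view(-1, 1, 1, 1)
        return torch.relu(x + out)

if __name__ == "__main__":
    block = LayerSkippingBlock(64)
    y = block(torch.randn(2, 64, 32, 32))
    print(y.shape)  # torch.Size([2, 64, 32, 32])
```

Realizing the predicted savings at inference time additionally requires actually skipping the computation for gated-off samples rather than masking it, which is precisely the operator scheduling and latency-measurement issue the abstract highlights.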
Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation
Recent breakthroughs in large language models (LLMs) have brought remarkable
success in the field of LLM-as-Agent. Nevertheless, a prevalent assumption is
that the information processed by LLMs is consistently honest, neglecting the
pervasive deceptive or misleading information in human society and AI-generated
content. This oversight makes LLMs susceptible to malicious manipulations,
potentially resulting in detrimental outcomes. This study utilizes the
intricate Avalon game as a testbed to explore LLMs' potential in deceptive
environments. Avalon, full of misinformation and requiring sophisticated logic,
manifests as a "Game-of-Thoughts". Inspired by the efficacy of humans'
recursive thinking and perspective-taking in the Avalon game, we introduce a
novel framework, Recursive Contemplation (ReCon), to enhance LLMs' ability to
identify and counteract deceptive information. ReCon combines formulation and
refinement contemplation processes; formulation contemplation produces initial
thoughts and speech, while refinement contemplation further polishes them.
Additionally, we incorporate first-order and second-order perspective
transitions into these two processes, respectively: the first-order transition
allows an LLM agent to infer the mental states of other players, while the
second-order transition involves reasoning about how others perceive the
agent's own mental state. Extensive experiments integrating ReCon with
different LLMs in the Avalon game indicate its efficacy in helping LLMs discern
and maneuver around deceptive information without extra fine-tuning or
data. Finally, we offer a
possible explanation for the efficacy of ReCon and explore the current
limitations of LLMs in terms of safety, reasoning, speaking style, and format,
potentially furnishing insights for subsequent research.
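As a rough illustration of the two contemplation stages described above, the sketch below wires formulation (first-order perspective taking plus a draft speech) and refinement (second-order perspective taking plus a revision) around a generic chat-completion call. The `chat` callable and all prompt wording are placeholders, not the authors' prompts.

```python
# Minimal sketch of a formulation-then-refinement contemplation turn.
# `chat` stands in for any chat-completion API; prompts are illustrative only.
from typing import Callable

def recon_turn(chat: Callable[[str], str], role: str, observation: str) -> str:
    # Formulation contemplation: first-order perspective taking
    # (what do the other players likely believe and intend?), then a draft reply.
    first_order = chat(
        f"You are playing Avalon as {role}. Given the game log below, "
        f"infer what each other player likely believes and intends.\n{observation}"
    )
    draft = chat(
        f"Using your analysis:\n{first_order}\n"
        f"Draft what you would say this turn as {role}."
    )
    # Refinement contemplation: second-order perspective taking
    # (how will others interpret this speech, and what will they infer about me?),
    # then polish the speech accordingly.
    second_order = chat(
        f"Other players will hear this speech:\n{draft}\n"
        f"Predict how they will interpret it and what they will infer about your role."
    )
    final = chat(
        f"Revise the speech so it avoids revealing your role and is robust to "
        f"deception, given this prediction:\n{second_order}\n"
        f"Return only the revised speech."
    )
    return final

if __name__ == "__main__":
    # Stub LLM for demonstration; replace with a real chat-completion call.
    echo = lambda prompt: f"[model response to: {prompt[:40]}...]"
    print(recon_turn(echo, role="Merlin", observation="Round 1: quest proposed..."))
```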
Learning to Weight Samples for Dynamic Early-exiting Networks
Early exiting is an effective paradigm for improving the inference efficiency
of deep networks. By constructing classifiers with varying resource demands
(the exits), such networks allow easy samples to be output at early exits,
removing the need for executing deeper layers. While existing works mainly
focus on the architectural design of multi-exit networks, the training
strategies for such models are largely left unexplored. Current
state-of-the-art models treat all samples identically during training,
ignoring the early-exiting behavior at test time and thus creating a gap
between training and testing. In this paper, we propose to bridge this gap by
sample weighting. Intuitively, easy samples, which generally exit early in the
network during inference, should contribute more to the training of the early
classifiers, while hard samples, which mostly exit from deeper layers, should
be emphasized in the training of the later classifiers. Our work adopts a weight
prediction network to weight the loss of different training samples at each
exit. This weight prediction network and the backbone model are jointly
optimized under a meta-learning framework with a novel optimization objective.
By bringing the adaptive behavior during inference into the training phase, we
show that the proposed weighting mechanism consistently improves the trade-off
between classification accuracy and inference efficiency. Code is available at
https://github.com/LeapLabTHU/L2W-DEN.
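A minimal sketch of a sample-weighted multi-exit loss in the spirit of the abstract is given below. It assumes a small MLP that maps each sample's per-exit losses to per-exit weights; the paper jointly optimizes such a weight-prediction network with the backbone under a meta-learning objective, which this sketch omits, and all shapes and names here are illustrative.

```python
# Sketch of a per-sample, per-exit weighted loss for a multi-exit network.
# The weight-prediction network here is a stand-in, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightPredictionNet(nn.Module):
    def __init__(self, num_exits: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_exits, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, num_exits),
            nn.Softmax(dim=-1),  # one weight per exit, per sample
        )

    def forward(self, per_exit_losses: torch.Tensor) -> torch.Tensor:
        # per_exit_losses: (N, num_exits); detached so weights depend on loss values,
        # not on gradients flowing back into the backbone through this path.
        return self.mlp(per_exit_losses.detach())

def weighted_multi_exit_loss(exit_logits, targets, weight_net):
    # exit_logits: list of (N, num_classes) tensors, one per exit.
    losses = torch.stack(
        [F.cross_entropy(logits, targets, reduction="none") for logits in exit_logits],
        dim=1,
    )  # (N, num_exits)
    weights = weight_net(losses)  # (N, num_exits)
    return (weights * losses).sum(dim=1).mean()

if __name__ == "__main__":
    num_exits, num_classes, batch = 3, 10, 8
    logits = [torch.randn(batch, num_classes, requires_grad=True) for _ in range(num_exits)]
    targets = torch.randint(0, num_classes, (batch,))
    wnet = WeightPredictionNet(num_exits)
    loss = weighted_multi_exit_loss(logits, targets, wnet)
    loss.backward()
    print(float(loss))
```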
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
Recently, diffusion models have made remarkable progress in text-to-image
(T2I) generation, synthesizing images with high fidelity and diverse contents.
Despite this advancement, latent space smoothness within diffusion models
remains largely unexplored. Smooth latent spaces ensure that a perturbation on
an input latent corresponds to a steady change in the output image. This
property proves beneficial in downstream tasks, including image interpolation,
inversion, and editing. In this work, we expose the non-smoothness of diffusion
latent spaces by observing noticeable visual fluctuations resulting from minor
latent variations. To tackle this issue, we propose Smooth Diffusion, a new
category of diffusion models that can be simultaneously high-performing and
smooth. Specifically, we introduce Step-wise Variation Regularization to
enforce that the ratio between the variation of an arbitrary input latent and
the corresponding variation of the output image remains constant at any
diffusion training step. In
addition, we devise an interpolation standard deviation (ISTD) metric to
effectively assess the latent space smoothness of a diffusion model. Extensive
quantitative and qualitative experiments demonstrate that Smooth Diffusion
stands out as a more desirable solution not only in T2I generation but also
across various downstream tasks. Smooth Diffusion is implemented as a
plug-and-play Smooth-LoRA to work with various community models. Code is
available at https://github.com/SHI-Labs/Smooth-Diffusion.
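As a rough sketch of the regularization idea stated in the abstract, the snippet below penalizes deviation from a constant ratio between the norm of a random latent perturbation and the norm of the resulting change in the model's output at a given training step. The `denoise` callable, the constant `c`, and the perturbation scale are assumptions for illustration, not the paper's exact formulation.

```python
# Sketch of a step-wise variation penalty: a small latent perturbation should
# produce a proportional change in the predicted output at training step t.
import torch

def stepwise_variation_reg(denoise, latent, t, c: float = 1.0, eps: float = 1e-2):
    """Penalize deviation from ||f(z + dz) - f(z)|| ~= c * ||dz|| at step t."""
    dz = eps * torch.randn_like(latent)          # random latent perturbation
    out = denoise(latent, t)
    out_perturbed = denoise(latent + dz, t)
    out_var = (out_perturbed - out).flatten(1).norm(dim=1)   # output variation
    in_var = dz.flatten(1).norm(dim=1)                       # latent variation
    return ((out_var - c * in_var) ** 2).mean()

if __name__ == "__main__":
    # Toy stand-in for a denoiser; in practice this is the diffusion model's
    # prediction (e.g., of the clean image) at training step t.
    toy_denoise = lambda z, t: torch.tanh(z)
    z = torch.randn(4, 4, 32, 32, requires_grad=True)
    loss = stepwise_variation_reg(toy_denoise, z, t=100)
    loss.backward()
    print(float(loss))
```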