Latency-aware Unified Dynamic Networks for Efficient Image Recognition
Dynamic computation has emerged as a promising avenue to enhance the
inference efficiency of deep networks. It allows selective activation of
computational units, leading to a reduction in unnecessary computations for
each input sample. However, the actual efficiency of these dynamic models can
deviate from theoretical predictions. This mismatch arises from: 1) the lack of
a unified approach due to fragmented research; 2) the focus on algorithm design
over critical scheduling strategies, especially in CUDA-enabled GPU contexts;
and 3) challenges in measuring practical latency, given that most libraries
cater to static operations. Addressing these issues, we unveil the
Latency-Aware Unified Dynamic Networks (LAUDNet), a framework that integrates
three primary dynamic paradigms: spatially adaptive computation, dynamic layer
skipping, and dynamic channel skipping. To bridge the theoretical and practical
efficiency gap, LAUDNet merges algorithmic design with scheduling optimization,
guided by a latency predictor that accurately gauges dynamic operator latency.
We have evaluated LAUDNet across multiple vision tasks, demonstrating its capacity
to notably reduce the latency of models like ResNet-101 by over 50% on
platforms such as V100, RTX3090, and TX2 GPUs. Notably, LAUDNet stands out in
balancing accuracy and efficiency. Code is available at:
https://www.github.com/LeapLabTHU/LAUDNet
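As a rough illustration of the layer-skipping paradigm that LAUDNet unifies, the sketch below gates a residual block per sample: a soft, differentiable gate during training and a hard skip at inference. The module name, gating design, and threshold are assumptions for exposition, not LAUDNet's actual implementation (see the repository above for that).

```python
# Minimal sketch of per-sample dynamic layer skipping (illustrative only).
import torch
import torch.nn as nn

class SkippableBlock(nn.Module):
    """A residual block guarded by a lightweight per-sample gate."""
    def __init__(self, channels: int, gate_threshold: float = 0.5):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Gate: global pooling + linear layer -> execution probability.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, 1), nn.Sigmoid(),
        )
        self.gate_threshold = gate_threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p = self.gate(x)  # (N, 1) execution probability per sample
        if self.training:
            # Soft gating keeps the skip decision differentiable.
            return x + p.view(-1, 1, 1, 1) * self.body(x)
        # At inference, run the block only for samples whose gate fires.
        # Scheduling such ragged batches efficiently on GPUs is exactly
        # the practical-latency problem the paper addresses.
        keep = p.squeeze(1) > self.gate_threshold
        out = x.clone()
        if keep.any():
            out[keep] = x[keep] + self.body(x[keep])
        return out
```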
AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations
Multi-task learning (MTL) aims at enhancing the performance and efficiency of
machine learning models by training them on multiple tasks simultaneously.
However, MTL research faces two challenges: 1) modeling the relationships
between tasks to effectively share knowledge between them, and 2) jointly
learning task-specific and shared knowledge. In this paper, we present a novel
model Adaptive Task-to-Task Fusion Network (AdaTT) to address both challenges.
AdaTT is a deep fusion network built with task-specific and optional shared
fusion units at multiple levels. By leveraging a residual mechanism and a gating
mechanism for task-to-task fusion, these units adaptively learn both shared
knowledge and task-specific knowledge. To evaluate the performance of AdaTT, we
conduct experiments on a public benchmark and an industrial recommendation
dataset using various task groups. Results demonstrate AdaTT can significantly
outperform existing state-of-the-art baselines.
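To make the fusion-unit idea concrete, here is a minimal sketch of one fusion level: per-task experts plus one shared expert, a per-task gate that mixes expert outputs, and a residual connection. The dimensions, the single-expert-per-task simplification, and all names are illustrative assumptions rather than the published AdaTT architecture.

```python
# Sketch of one gated, residual task-to-task fusion level (assumed design).
import torch
import torch.nn as nn

class FusionLevel(nn.Module):
    def __init__(self, num_tasks: int, dim: int):
        super().__init__()
        self.num_tasks = num_tasks
        # One task-specific expert per task, plus one shared expert.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
             for _ in range(num_tasks + 1)]
        )
        # Per-task gates over all experts (task-specific + shared).
        self.gates = nn.ModuleList(
            [nn.Linear(dim, num_tasks + 1) for _ in range(num_tasks)]
        )

    def forward(self, task_inputs: list[torch.Tensor]) -> list[torch.Tensor]:
        # Shared expert consumes the mean of the task representations.
        shared_in = torch.stack(task_inputs, dim=0).mean(dim=0)
        expert_outs = [self.experts[t](task_inputs[t])
                       for t in range(self.num_tasks)]
        expert_outs.append(self.experts[-1](shared_in))
        stacked = torch.stack(expert_outs, dim=1)  # (N, T+1, D)
        outputs = []
        for t in range(self.num_tasks):
            w = torch.softmax(self.gates[t](task_inputs[t]), dim=-1)  # (N, T+1)
            fused = (w.unsqueeze(-1) * stacked).sum(dim=1)            # (N, D)
            # Residual: each task keeps its own path alongside fused knowledge.
            outputs.append(task_inputs[t] + fused)
        return outputs
```

Stacking several such levels lets each task adaptively draw on the others' representations while the residual path preserves task-specific signal.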
Learning to Weight Samples for Dynamic Early-exiting Networks
Early exiting is an effective paradigm for improving the inference efficiency
of deep networks. By constructing classifiers with varying resource demands
(the exits), such networks allow easy samples to be output at early exits,
removing the need for executing deeper layers. While existing works mainly
focus on the architectural design of multi-exit networks, the training
strategies for such models are largely left unexplored. Current
state-of-the-art models treat all samples equally during training, ignoring the
early-exiting behavior at test time and thus creating a gap between training
and testing. In this paper, we propose to bridge this gap by
sample weighting. Intuitively, easy samples, which generally exit early in the
network during inference, should contribute more to training early classifiers.
Hard samples, which mostly exit from deeper layers, should instead be
emphasized when training the late classifiers. Our work proposes to adopt a weight
prediction network to weight the loss of different training samples at each
exit. This weight prediction network and the backbone model are jointly
optimized under a meta-learning framework with a novel optimization objective.
By bringing the adaptive behavior during inference into the training phase, we
show that the proposed weighting mechanism consistently improves the trade-off
between classification accuracy and inference efficiency. Code is available at
https://github.com/LeapLabTHU/L2W-DEN.
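The shape of the weighted loss can be sketched as follows: a small network maps per-sample, per-exit losses to per-exit weights, and the weighted losses are averaged. The bi-level meta-learning optimization from the paper is deliberately omitted, and using the loss values themselves as the weight network's input is a simplifying assumption for illustration.

```python
# Simplified sketch of per-sample, per-exit loss weighting (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExitWeightNet(nn.Module):
    """Predicts one weight per exit from a sample's per-exit losses."""
    def __init__(self, num_exits: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_exits, hidden), nn.ReLU(),
            nn.Linear(hidden, num_exits), nn.Softmax(dim=-1),
        )

    def forward(self, per_exit_losses: torch.Tensor) -> torch.Tensor:
        # Detach so weights depend on, but do not backprop through,
        # the loss values used as input features.
        return self.net(per_exit_losses.detach())

def weighted_multi_exit_loss(logits_per_exit, targets, weight_net):
    # logits_per_exit: list of (N, C) tensors, one per exit classifier.
    losses = torch.stack(
        [F.cross_entropy(lg, targets, reduction="none")
         for lg in logits_per_exit],
        dim=1,
    )                             # (N, num_exits)
    weights = weight_net(losses)  # (N, num_exits), rows sum to 1
    return (weights * losses).sum(dim=1).mean()
```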
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
Recently, diffusion models have made remarkable progress in text-to-image
(T2I) generation, synthesizing images with high fidelity and diverse contents.
Despite this advancement, latent space smoothness within diffusion models
remains largely unexplored. Smooth latent spaces ensure that a perturbation on
an input latent corresponds to a steady change in the output image. This
property proves beneficial in downstream tasks, including image interpolation,
inversion, and editing. In this work, we expose the non-smoothness of diffusion
latent spaces by observing noticeable visual fluctuations resulting from minor
latent variations. To tackle this issue, we propose Smooth Diffusion, a new
category of diffusion models that can be simultaneously high-performing and
smooth. Specifically, we introduce Step-wise Variation Regularization to
enforce that the ratio between the variation of an arbitrary input latent and
that of the output image remains constant at any diffusion training step. In
addition, we devise an interpolation standard deviation (ISTD) metric to
effectively assess the latent space smoothness of a diffusion model. Extensive
quantitative and qualitative experiments demonstrate that Smooth Diffusion
stands out as a more desirable solution not only in T2I generation but also
across various downstream tasks. Smooth Diffusion is implemented as a
plug-and-play Smooth-LoRA to work with various community models. Code is
available at https://github.com/SHI-Labs/Smooth-Diffusion.
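A minimal sketch of a variation regularizer of this flavor, assuming a generic `model` that maps latents to outputs at a given step: perturb the latent, measure how much the output moves, and penalize deviation of the variation ratio from a constant. The constant, perturbation scale, and norm choice here are assumptions, not the paper's exact formulation.

```python
# Sketch of a step-wise variation regularizer; returns a scalar loss term
# to add to the usual diffusion training objective.
import torch

def variation_regularizer(model, z: torch.Tensor, c: float = 1.0,
                          eps: float = 1e-2) -> torch.Tensor:
    delta = eps * torch.randn_like(z)
    out, out_pert = model(z), model(z + delta)
    in_var = delta.flatten(1).norm(dim=1)              # per-sample ||delta||
    out_var = (out_pert - out).flatten(1).norm(dim=1)  # output movement
    # Encourage output variation to stay proportional to latent variation.
    return ((out_var / in_var.clamp_min(1e-8)) - c).pow(2).mean()
```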
Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation
Recent breakthroughs in large language models (LLMs) have brought remarkable
success in the field of LLM-as-Agent. Nevertheless, a prevalent assumption is
that the information processed by LLMs is consistently honest, neglecting the
pervasive deceptive or misleading information in human society and AI-generated
content. This oversight makes LLMs susceptible to malicious manipulations,
potentially resulting in detrimental outcomes. This study utilizes the
intricate Avalon game as a testbed to explore LLMs' potential in deceptive
environments. Avalon, full of misinformation and requiring sophisticated logic,
manifests as a "Game-of-Thoughts". Inspired by the efficacy of humans'
recursive thinking and perspective-taking in the Avalon game, we introduce a
novel framework, Recursive Contemplation (ReCon), to enhance LLMs' ability to
identify and counteract deceptive information. ReCon combines formulation and
refinement contemplation processes; formulation contemplation produces initial
thoughts and speech, while refinement contemplation further polishes them.
Additionally, we incorporate first-order and second-order perspective
transitions into these processes, respectively: the first-order transition
allows an LLM agent to infer others' mental states, while the second-order
transition involves understanding how others perceive the agent's mental state. After
integrating ReCon with different LLMs, extensive experiment results from the
Avalon game indicate its efficacy in aiding LLMs to discern and maneuver around
deceptive information without extra fine-tuning and data. Finally, we offer a
possible explanation for the efficacy of ReCon and explore the current
limitations of LLMs in terms of safety, reasoning, speaking style, and format,
potentially furnishing insights for subsequent research.
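A rough sketch of the two-stage contemplation flow, with `llm` standing in for any chat-completion callable: a formulation pass drafts thought and speech, then a refinement pass revises the draft using first-order (what do others believe?) and second-order (how will others read my statement?) perspective prompts. All prompt wording is invented for illustration and is not the paper's actual prompts.

```python
# Illustrative two-stage ReCon-style turn (prompts are assumptions).
def recon_turn(llm, game_state: str, role: str) -> str:
    # Formulation contemplation: initial private thought and draft speech.
    draft = llm(
        f"You are playing Avalon as {role}. Game state:\n{game_state}\n"
        "Write your private reasoning, then a draft public statement."
    )
    # First-order perspective transition: infer others' mental states.
    first_order = llm(
        f"Game state:\n{game_state}\n"
        "Infer what each other player currently believes and intends."
    )
    # Refinement contemplation with a second-order perspective transition.
    refined = llm(
        f"Your draft:\n{draft}\nOthers' likely beliefs:\n{first_order}\n"
        "Consider how each player would interpret this statement if you said "
        "it, then revise it to avoid revealing information or falling for "
        "deception. Output only the final statement."
    )
    return refined
```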
Study on the Relationship between Electrical Tree Development and Partial Discharge of XLPE Cables
Based on sliced samples of 35 kV and 110 kV XLPE cables, an experimental platform was built to study the relationship between electrical trees and partial discharges (PDs) in XLPE at different voltage levels. The PDs exhibit three significant statistical characteristics during the growth of the electrical trees, and the analysis shows that each growth stage has distinct characteristics. The relationship between electrical tree growth and PD differed between the insulation of the 35 kV and 110 kV cables. In the equivalent time-frequency diagram, vigorous tree growth at every stage showed evident characteristics such as large spans in time and frequency. These results could provide criteria for identifying deterioration when PD is used to monitor cables in service at rated voltage, and they are important for identifying defects in cable insulation in order to provide an early warning of insulation breakdown.
Broadband Equivalent Modeling and Common-Mode Voltage Conduction Analysis of Electrochemical Energy Storage System
Electrochemical energy storage systems play an important role in the reform of the national energy system and the construction of the energy Internet. In battery storage converters of both small and large capacity, the switching behavior of the power electronics can generate high-frequency common-mode voltage that is potentially harmful to the battery storage system. This paper systematically investigates the common-mode interference (CMI) of electrochemical energy storage systems. The mechanism of common-mode interference is revealed, a broadband equivalent circuit model of the common-mode voltage in an electrochemical energy storage system is established, the effect of the battery's parasitic capacitance on the common-mode voltage is simulated and analyzed, and the broadband equivalent circuit model is verified against laboratory data.
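To illustrate the kind of frequency-domain analysis such an equivalent circuit enables, the sketch below models a converter common-mode source behind a series cable impedance driving the battery's parasitic capacitance to ground, and sweeps the resulting voltage-transfer ratio. Every component value is an illustrative assumption, not a parameter from the paper's broadband model.

```python
# Lumped common-mode voltage divider swept over frequency (values assumed).
import numpy as np

R, L = 0.1, 2e-6   # series cable resistance (ohm) and inductance (H)
C_par = 5e-9       # battery parasitic capacitance to ground (F)

f = np.logspace(3, 8, 500)  # 1 kHz .. 100 MHz
w = 2 * np.pi * f
z_series = R + 1j * w * L
z_cap = 1 / (1j * w * C_par)
# Fraction of the converter CM voltage appearing across the battery.
h = np.abs(z_cap / (z_series + z_cap))

f_res = f[np.argmax(h)]
print(f"peak CM transfer {h.max():.1f}x near {f_res / 1e6:.2f} MHz")
```

Even this toy divider shows the series resonance through which parasitic capacitance can amplify, rather than merely pass, high-frequency common-mode voltage at the battery terminals.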