Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts
Computer vision researchers are embracing two promising paradigms: Vision
Transformers (ViTs) and Multi-task Learning (MTL), which both show great
performance but are computation-intensive, given the quadratic complexity of
self-attention in ViT and the need to activate an entire large MTL model for
one task. MViT is the latest multi-task ViT model that introduces
mixture-of-experts (MoE), where only a small portion of subnetworks ("experts")
are sparsely and dynamically activated based on the current task. MViT
achieves better accuracy and over 80% computation reduction but leaves
challenges for efficient deployment on FPGA.
Our work, dubbed Edge-MoE, addresses these challenges by introducing the first
end-to-end FPGA accelerator for multi-task ViT with a collection of
architectural innovations, including (1) a novel reordering mechanism for
self-attention, which requires only constant bandwidth regardless of the target
parallelism; (2) a fast single-pass softmax approximation; (3) an accurate and
low-cost GELU approximation; (4) a unified and flexible computing unit that is
shared by almost all computational layers to maximally reduce resource usage;
and (5) uniquely for MViT, a novel patch reordering method to eliminate
memory access overhead. Edge-MoE achieves 2.24x and 4.90x better energy
efficiency compared with GPU and CPU, respectively. A real-time video
demonstration is available online, along with our code written using High-Level
Synthesis, which will be open-sourced.
Comment: 11 pages, 12 figures. Submitted to ICCAD 202
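
To make items (2) and (3) of the abstract more concrete, the sketch below shows two well-known techniques of the same kind. They are not taken from the Edge-MoE code, and the abstract does not say which approximations the accelerator uses. The first is the "online" softmax, which accumulates the normalizer in a single pass by rescaling a running sum whenever the running maximum changes. The second is the common tanh-based approximation of GELU. The function names `online_softmax` and `gelu_tanh` are illustrative, and finite floating-point inputs are assumed.

```python
import math


def online_softmax(xs):
    """Softmax whose normalizer is accumulated in one pass over xs.

    Assumes finite inputs. This is the generic online-softmax recurrence,
    not necessarily the approximation used by Edge-MoE.
    """
    running_max = float("-inf")
    running_sum = 0.0
    for x in xs:
        new_max = max(running_max, x)
        # Rescale the sum accumulated so far to the new maximum, then add x.
        running_sum = (running_sum * math.exp(running_max - new_max)
                       + math.exp(x - new_max))
        running_max = new_max
    # A second sweep is still needed to emit the normalized outputs.
    return [math.exp(x - running_max) / running_sum for x in xs]


def gelu_exact(x):
    """Reference GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    """Widely used tanh-based GELU approximation (an assumption here,
    not the hardware-oriented approximation proposed in the paper)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


probs = online_softmax([1.0, 2.0, 3.0])
```

Because the running maximum is subtracted before every exponential, the recurrence stays numerically stable for large inputs such as `[1000.0, 1001.0]`, which a naive `exp(x) / sum(exp(x))` would overflow on.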
Control theoretically explainable application of autoencoder methods to fault detection in nonlinear dynamic systems
This paper is dedicated to the control theoretically explainable application of
autoencoders to optimal fault detection in nonlinear dynamic systems.
Autoencoder-based learning is a standard machine learning technique that is
widely applied to fault (anomaly) detection and classification. In the
context of representation learning, the so-called latent (hidden) variable
plays an important role in optimal fault detection. In the ideal case, the
latent variable should be a minimal sufficient statistic. The existing
autoencoder-based fault detection schemes are mainly application-oriented, and
few efforts have been devoted to optimal autoencoder-based fault detection and
explainable applications. The main objective of our work is to establish a
framework for learning autoencoder-based optimal fault detection in nonlinear
dynamic systems. To this end, a process model form for dynamic systems is
first introduced with the aid of control and system theory, which also leads
to a clear system interpretation of the latent variable. The major efforts are
devoted to the development of a control theoretical solution to the optimal
fault detection problem, in which a concept analogous to the minimal sufficient
statistic, the so-called lossless information compression, is introduced for
dynamic systems and fault detection specifications. In particular, the
existence conditions for such a latent variable are derived, based on which a
loss function and, in turn, a learning algorithm are developed. This learning
algorithm enables optimal training of autoencoders to achieve optimal
fault detection in nonlinear dynamic systems. A case study on a three-tank system
is given at the end of this paper to illustrate the capability of the proposed
autoencoder-based fault detection and to explain the essential role of the
latent variable in the proposed fault detection system.
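
The general pattern behind autoencoder-based fault detection can be sketched as follows: encode a measurement into a latent variable, decode it, and compare the reconstruction error with a threshold fitted on fault-free data. This is a generic illustration and not the loss function or learning algorithm derived in the paper. The tied-weight linear map `W` is a hand-set stand-in for a trained autoencoder. The threshold rule (a margin times the largest fault-free residual) and the sample data are assumptions made for the example.

```python
import math

# Hypothetical stand-in for a trained autoencoder: a tied-weight linear map
# with a one-dimensional latent variable z = W . x and reconstruction z * W.
W = (1 / math.sqrt(2), 1 / math.sqrt(2))


def encode(x):
    """Project the measurement onto the latent direction."""
    return sum(wi * xi for wi, xi in zip(W, x))


def decode(z):
    """Map the latent variable back to measurement space."""
    return tuple(z * wi for wi in W)


def residual(x):
    """Squared reconstruction error, used here as the detection statistic."""
    x_hat = decode(encode(x))
    return sum((xi - hi) ** 2 for xi, hi in zip(x, x_hat))


def fit_threshold(fault_free, margin=1.5):
    """Assumed rule: margin times the largest residual on fault-free data."""
    return margin * max(residual(x) for x in fault_free)


def is_faulty(x, threshold):
    return residual(x) > threshold


# Illustrative fault-free samples lying close to the direction (1, 1).
fault_free = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.05), (4.0, 4.0)]
threshold = fit_threshold(fault_free)
```

With these samples, a measurement such as `(2.0, 3.0)` lies far from the fault-free direction and is flagged, while `(5.0, 5.05)` is not.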
Enhanced oxidation resistance of active nanostructures via dynamic size effect.
A major challenge limiting the practical applications of nanomaterials is that the activities of nanostructures (NSs) increase with reduced size, often sacrificing their stability in the chemical environment. Under oxidative conditions, NSs with smaller sizes and higher defect densities are commonly expected to oxidize more easily, since high-concentration defects can facilitate oxidation by enhancing the reactivity with O2 and providing a fast channel for oxygen incorporation. Here, using FeO NSs as an example, we show, to the contrary, that reducing the size of active NSs can drastically increase their oxidation resistance. A maximum oxidation resistance is found for FeO NSs with dimensions below 3.2 nm. Rather than being determined by the structure or electronic properties of active sites, the enhanced oxidation resistance originates from the size-dependent structural dynamics of FeO NSs in O2. We find this dynamic size effect to govern the chemical properties of active NSs.