
    Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts

    Computer vision researchers are embracing two promising paradigms: Vision Transformers (ViTs) and Multi-task Learning (MTL). Both show great performance but are computation-intensive, given the quadratic complexity of self-attention in ViT and the need to activate an entire large MTL model for one task. M³ViT is the latest multi-task ViT model that introduces mixture-of-experts (MoE), where only a small portion of subnetworks ("experts") are sparsely and dynamically activated based on the current task. M³ViT achieves better accuracy and over 80% computation reduction but leaves challenges for efficient deployment on FPGA. Our work, dubbed Edge-MoE, solves these challenges to introduce the first end-to-end FPGA accelerator for multi-task ViT with a collection of architectural innovations, including (1) a novel reordering mechanism for self-attention, which requires only constant bandwidth regardless of the target parallelism; (2) a fast single-pass softmax approximation; (3) an accurate and low-cost GELU approximation; (4) a unified and flexible computing unit that is shared by almost all computational layers to maximally reduce resource usage; and (5) uniquely for M³ViT, a novel patch reordering method to eliminate memory access overhead. Edge-MoE achieves 2.24x and 4.90x better energy efficiency compared with GPU and CPU, respectively. A real-time video demonstration is available online, along with our code written using High-Level Synthesis, which will be open-sourced. Comment: 11 pages, 12 figures. Submitted to ICCAD 202
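
    The abstract does not say how the single-pass softmax approximation works, so the following is only a loosely related illustration: the well-known exact "online softmax" recurrence, which gathers the running maximum and the normaliser in one sweep over the inputs instead of two. It is not Edge-MoE's approximation, the function name `online_softmax` is hypothetical, and finite inputs are assumed.

```python
import math


def online_softmax(xs):
    """Exact softmax whose max and normaliser are gathered in one sweep.

    Illustrative only: this is the generic online-softmax recurrence,
    not the hardware approximation described in the abstract.
    Assumes every input is a finite float.
    """
    running_max = float("-inf")
    running_sum = 0.0
    for x in xs:
        if x > running_max:
            # A new maximum appeared: rescale the sum accumulated so far
            # to the new reference point, then count x itself (exp(0) = 1).
            running_sum = running_sum * math.exp(running_max - x) + 1.0
            running_max = x
        else:
            running_sum += math.exp(x - running_max)
    # A final sweep is still needed to emit the normalised outputs.
    return [math.exp(x - running_max) / running_sum for x in xs]


if __name__ == "__main__":
    print(online_softmax([1.0, 2.0, 3.0]))
```

    Subtracting the running maximum keeps every exponent non-positive, so large inputs do not overflow.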

    Control theoretically explainable application of autoencoder methods to fault detection in nonlinear dynamic systems

    This paper is dedicated to the control-theoretically explainable application of autoencoders to optimal fault detection in nonlinear dynamic systems. Autoencoder-based learning is a standard machine learning technique, widely applied to fault (anomaly) detection and classification. In the context of representation learning, the so-called latent (hidden) variable plays an important role in achieving optimal fault detection. In the ideal case, the latent variable should be a minimal sufficient statistic. The existing autoencoder-based fault detection schemes are mainly application-oriented, and few efforts have been devoted to optimal autoencoder-based fault detection and explainable applications. The main objective of our work is to establish a framework for learning autoencoder-based optimal fault detection in nonlinear dynamic systems. To this end, a process model form for dynamic systems is first introduced with the aid of control and system theory, which also leads to a clear system interpretation of the latent variable. The major efforts are devoted to the development of a control-theoretical solution to the optimal fault detection problem, in which a concept analogous to the minimal sufficient statistic, the so-called lossless information compression, is introduced for dynamic systems and fault detection specifications. In particular, the existence conditions for such a latent variable are derived, based on which a loss function and, further, a learning algorithm are developed. This learning algorithm enables optimal training of autoencoders to achieve optimal fault detection in nonlinear dynamic systems. A case study on a three-tank system is given at the end of this paper to illustrate the capability of the proposed autoencoder-based fault detection and to explain the essential role of the latent variable in the proposed fault detection system.
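
    As a rough illustration of the generic decision stage in autoencoder-based fault detection, the sketch below thresholds reconstruction residuals. It is not the optimal scheme the abstract proposes: the autoencoder that produces the residuals is assumed to exist already, and the names `quantile_threshold` and `detect_faults`, as well as the empirical-quantile rule, are assumptions made for this example.

```python
def quantile_threshold(fault_free_residuals, false_alarm_rate=0.05):
    """Pick a threshold from residuals recorded under fault-free operation.

    At most `false_alarm_rate` of the fault-free samples lie strictly
    above the returned value (an empirical-quantile rule, assumed here).
    """
    ordered = sorted(fault_free_residuals)
    if not ordered:
        raise ValueError("need at least one fault-free residual")
    n = len(ordered)
    # Number of fault-free samples we tolerate above the threshold.
    allowed_above = int(false_alarm_rate * n)
    index = max(n - 1 - allowed_above, 0)
    return ordered[index]


def detect_faults(residuals, threshold):
    """Flag every sample whose residual exceeds the threshold."""
    return [r > threshold for r in residuals]


if __name__ == "__main__":
    # Hypothetical residual magnitudes, e.g. reconstruction errors.
    training = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    limit = quantile_threshold(training, false_alarm_rate=0.25)
    print(limit, detect_faults([5.0, 6.0, 7.0], limit))
```

    The false alarm rate is the design parameter here; a smaller value raises the threshold and makes the detector less sensitive to small faults.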

    Enhanced oxidation resistance of active nanostructures via dynamic size effect.

    A major challenge limiting the practical applications of nanomaterials is that the activities of nanostructures (NSs) increase with reduced size, often at the cost of their stability in the chemical environment. Under oxidative conditions, NSs with smaller sizes and higher defect densities are commonly expected to oxidize more easily, since high-concentration defects can facilitate oxidation by enhancing the reactivity with O2 and providing a fast channel for oxygen incorporation. Here, using FeO NSs as an example, we show, to the contrary, that reducing the size of active NSs can drastically increase their oxidation resistance. A maximum oxidation resistance is found for FeO NSs with dimensions below 3.2 nm. Rather than being determined by the structure or electronic properties of active sites, the enhanced oxidation resistance originates from the size-dependent structural dynamics of FeO NSs in O2. We find this dynamic size effect to govern the chemical properties of active NSs.