78 research outputs found
A compact butterfly-style silicon photonic-electronic neural chip for hardware-efficient deep learning
The optical neural network (ONN) is a promising hardware platform for
next-generation neurocomputing due to its high parallelism, low latency, and
low energy consumption. Previous ONN architectures are mainly designed for
general matrix multiplication (GEMM), leading to unnecessarily large area cost
and high control complexity. Here, we move beyond classical GEMM-based ONNs and
propose an optical subspace neural network (OSNN) architecture, which trades
the universality of weight representation for lower optical component usage,
area cost, and energy consumption. We devise a butterfly-style
photonic-electronic neural chip to implement our OSNN with up to 7x fewer
trainable optical components compared to GEMM-based ONNs. Additionally, a
hardware-aware training framework is provided to minimize the required device
programming precision, lessen the chip area, and boost the noise robustness. We
experimentally demonstrate the utility of our neural chip in practical image
recognition tasks, showing that a measured accuracy of 94.16% can be achieved
in hand-written digit recognition tasks with 3-bit weight programming
precision. Comment: 17 pages, 5 figures
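The butterfly structure named in the abstract above can be illustrated with a generic butterfly factorization. The following is a minimal sketch, not the chip's actual circuit: it assumes each stage consists of n/2 independent 2x2 real rotations (a stand-in for the tunable optical elements), and the function names `butterfly_stage` and `butterfly_matrix` are hypothetical. It shows why the trainable-parameter count grows as (n/2)·log2(n) rather than n² for a dense weight matrix. The 7x figure quoted in the abstract depends on the authors' own component accounting, which this sketch does not reproduce.

```python
import numpy as np

def butterfly_stage(n, stride, angles):
    """One butterfly stage: n/2 independent 2x2 rotations, pairing index i with i + stride."""
    m = np.zeros((n, n))
    k = 0
    for block in range(0, n, 2 * stride):
        for off in range(stride):
            i, j = block + off, block + off + stride
            c, s = np.cos(angles[k]), np.sin(angles[k])
            m[i, i], m[i, j] = c, -s
            m[j, i], m[j, j] = s, c
            k += 1
    return m

def butterfly_matrix(n, rng):
    """Compose log2(n) butterfly stages; return the matrix and its parameter count.

    Assumes n is a power of two. Angles are random placeholders for trained values.
    """
    stages = n.bit_length() - 1  # log2(n) for a power of two
    w = np.eye(n)
    n_params = 0
    for s in range(stages):
        angles = rng.uniform(0.0, 2.0 * np.pi, n // 2)
        w = butterfly_stage(n, 2 ** s, angles) @ w
        n_params += n // 2
    return w, n_params

if __name__ == "__main__":
    w, n_params = butterfly_matrix(8, np.random.default_rng(0))
    print("butterfly parameters:", n_params, "dense parameters:", 8 * 8)
```

Because every stage is a set of rotations on disjoint index pairs, the composed matrix is orthogonal but cannot represent an arbitrary weight matrix. That is the universality-for-efficiency trade the abstract describes.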
ADEPT: Automatic Differentiable DEsign of Photonic Tensor Cores
Photonic tensor cores (PTCs) are essential building blocks for optical
artificial intelligence (AI) accelerators based on programmable photonic
integrated circuits. PTCs can achieve ultra-fast and efficient tensor
operations for neural network (NN) acceleration. Current PTC designs are either
manually constructed or based on matrix decomposition theory, which lacks the
adaptability to meet various hardware constraints and device specifications. To
the best of our knowledge, automatic PTC design methodology is still unexplored. It
will be promising to move beyond the manual design paradigm and "nurture"
photonic neurocomputing with AI and design automation. Therefore, in this work,
for the first time, we propose a fully differentiable framework, dubbed ADEPT,
that can efficiently search PTC designs adaptive to various circuit footprint
constraints and foundry PDKs. Extensive experiments show superior flexibility
and effectiveness of the proposed ADEPT framework to explore a large PTC design
space. On various NN models and benchmarks, our searched PTC topology
outperforms prior manually-designed structures with competitive matrix
representability, 2-30x higher footprint compactness, and better noise
robustness, demonstrating a new paradigm in photonic neural chip design. The
code of ADEPT is available at https://github.com/JeremieMelo/ADEPT using the
https://github.com/JeremieMelo/pytorch-onn (TorchONN) library. Comment: Accepted to ACM/IEEE Design Automation Conference (DAC), 202
DOTA: A Dynamically-Operated Photonic Tensor Core for Energy-Efficient Transformer Accelerator
The wide adoption and significant computing resource consumption of
attention-based Transformers, e.g., Vision Transformer and large language
models, have driven the demands for efficient hardware accelerators. While
electronic accelerators have been commonly used, there is a growing interest in
exploring photonics as an alternative technology due to its high energy
efficiency and ultra-fast processing speed. Optical neural networks (ONNs) have
demonstrated promising results for convolutional neural network (CNN) workloads
that only require weight-static linear operations. However, they fail to
efficiently support Transformer architectures with attention operations due to
the lack of ability to process dynamic full-range tensor multiplication. In
this work, we propose a customized high-performance and energy-efficient
photonic Transformer accelerator, DOTA. To overcome the fundamental limitation
of existing ONNs, we introduce a novel photonic tensor core, consisting of a
crossbar array of interference-based optical vector dot-product engines, that
supports highly-parallel, dynamic, and full-range matrix-matrix multiplication.
Our comprehensive evaluation demonstrates that DOTA achieves a >4x energy and a
>10x latency reduction compared to prior photonic accelerators, and delivers
over 20x energy reduction and 2 to 3 orders of magnitude lower latency compared
to the electronic Transformer accelerator. Our work highlights the immense
potential of photonic computing for efficient hardware accelerators,
particularly for advanced machine learning workloads. Comment: The short version is accepted by Next-Gen AI System Workshop at MLSys 202
Association between FGA gene polymorphisms and coronary artery lesion in Kawasaki disease
Objective: To investigate the correlation between FGA gene polymorphisms and coronary artery lesion (CAL) in Kawasaki disease (KD).
Methods: Two hundred and thirty-four children with Kawasaki disease (KD group), 200 healthy children (normal group), and 208 children with non-KD fever (fever group) were enrolled. General clinical indicators, serum concentrations of MMPs, TIMP-1, and FG-α, fibrinogen level, molecular function (FMPV/ODmax), and the FGA Thr312Ala polymorphism were each measured in peripheral venous blood drawn after morning fasting.
Results: There was no significant difference in average age among the three groups (3.03 ± 1.22 years, 3.17 ± 1.30 years, and 3.21 ± 1.31 years, respectively). Compared with the fever group, white blood cell count (WBC), platelet count (PLT), procalcitonin (PCT), C-reactive protein (CRP), erythrocyte sedimentation rate (ESR), interleukin-6 (IL-6), monocyte chemoattractant protein-1 (MCP-1), and fibrinogen (Fg) were significantly increased in the KD group. Red blood cell count (RBC) and hemoglobin (Hb) were significantly decreased (p < 0.05). Serum concentrations of MMPs, TIMP-1, and FG-α in the KD and fever groups were significantly higher than those in the normal group (p < 0.05). Concentrations of MMP-2, MMP-3, MMP-9, MMP-13, TIMP-1, and FG-α in the KD group were significantly higher than those in the fever group (p < 0.05). The KD group was divided into two subgroups: 55 patients with CAL and 179 patients without CAL. The plasma fibrinogen concentration in the CAL subgroup was significantly higher than that in the non-CAL subgroup and the normal group (p < 0.01). There was no statistically significant difference in FMPV/ODmax among the three groups (p > 0.05). Compared with the normal group, the frequencies of the FGA GG, GA, and AA genotypes and of the G and A alleles in the KD group showed no significant difference (p > 0.05). In the KD group, the most common genotype in children with CAL was GA, while the most common genotype in children without CAL was GG.
Conclusion: MMPs and FG-α were significantly upregulated in KD patients. The proportion of FGA genotype GA in children with CAL was significantly higher than that in children without CAL, suggesting that FGA gene polymorphisms affect coronary artery lesion in children with KD.
Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing
Big data processing at production scale presents a highly complex environment for resource optimization (RO), a problem crucial for meeting the performance goals and budgetary constraints of analytical users. The RO problem is challenging because it involves a set of decisions (the partition count, placement of parallel instances on machines, and resource allocation to each instance), requires multi-objective optimization (MOO), and is compounded by the scale and complexity of big data systems while having to meet stringent time constraints for scheduling. This paper presents a MaxCompute-based integrated system to support multi-objective resource optimization via fine-grained instance-level modeling and optimization. We propose a new architecture that breaks RO into a series of simpler problems, new fine-grained predictive models, and novel optimization methods that exploit these models to make effective instance-level RO decisions well under a second. Evaluation using production workloads shows that our new RO system reduces latency by 37-72% and cost by 43-78% at the same time, compared to the current optimizer and scheduler, while running in 0.02-0.23s.
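The multi-objective aspect of the problem above (latency versus cost) can be illustrated with a generic Pareto-front filter. This is a minimal sketch of the general MOO concept, not the optimization method proposed in the paper. The function name `pareto_front` and the candidate (latency, cost) values are hypothetical.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Each point is a (latency, cost) tuple; lower is better on both axes.
    A point p is dominated if some other point q is no worse on both axes.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1]
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

if __name__ == "__main__":
    # Hypothetical (latency in s, cost in arbitrary units) for candidate configurations.
    candidates = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (4.0, 1.0), (2.5, 4.0)]
    print(pareto_front(candidates))
```

A scheduler would then choose one point from the front according to the user's preference between latency and cost. This quadratic scan is only suitable for small candidate sets.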
Robust estimation of bacterial cell count from optical density
Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres. This protocol produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses the instrument's effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell. This allows direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data.
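The idea of calibrating OD against a serial dilution of particles with a known count can be sketched as a simple proportional fit. This is a minimal illustration under stated assumptions, not the study's protocol: it assumes background-subtracted OD readings within the instrument's linear range. The stock particle count, the twofold dilution scheme, the synthetic readings, and the function names `fit_particles_per_od` and `od_to_count` are all hypothetical.

```python
import numpy as np

def fit_particles_per_od(counts, od):
    """Least-squares slope through the origin for the model counts ~ k * od."""
    counts = np.asarray(counts, dtype=float)
    od = np.asarray(od, dtype=float)
    return float(np.dot(od, counts) / np.dot(od, od))

def od_to_count(od_reading, k):
    """Convert a background-subtracted OD reading to an estimated particle count."""
    return k * od_reading

if __name__ == "__main__":
    # Hypothetical twofold serial dilution of a stock with 3e8 particles per well.
    counts = 3e8 / 2.0 ** np.arange(8)
    # Synthetic, noise-free readings assuming 1 OD unit corresponds to 1e9 particles.
    od = counts / 1e9
    k = fit_particles_per_od(counts, od)
    print("particles per OD unit:", k)
    print("estimated count at OD 0.05:", od_to_count(0.05, k))
```

With real data, the fold-difference between each measured point and the fitted line would be inspected to check fit quality and to find where the readings leave the linear range.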
High-speed, energy-efficient, and scalable optical computing and interconnects with CMOS-compatible silicon photonic-electronic integrated circuits
Integrated photonics is a promising technology for next-generation computing because of the essential characteristics of light, including low latency, high bandwidth, and low power consumption. Integrated photonics has evolved significantly over the past few decades, with abundant passive and active optical components offering ultrahigh bandwidth and ultralow power consumption. In addition, advancements in fabrication technologies have enabled the co-integration of silicon-based electronic and photonic circuits on a chip, allowing complex computing tasks to be realized with both electrons and photons. Previous work reveals that photonic-electronic computing circuits have the potential to outperform transistor-based electronic computing circuits by orders of magnitude in speed and energy efficiency. However, the scalability of photonic-electronic circuits still requires improvement, which is critical to the success of optical computing in the post-Moore’s law era, especially given the need for this technology to compete with other emerging computing technologies.
This dissertation proposes the development of scalable photonic-electronic integrated circuits that capitalize on the strengths of electrons and photons to facilitate high-speed, energy-efficient computing and intra-chip interconnects. We explore scaling technologies for photonic computing systems that optimize area and energy efficiency, such as wavelength division multiplexing (WDM), and experimentally demonstrate their effectiveness. Our investigation of photonic-electronic computing circuits spans from the device to the architecture level and includes both digital and analog computing.
We first introduce the building blocks of optical computing, including essential components like electro-optic modulators, and discuss general scaling technologies in silicon-based photonic-electronic computing circuit designs. We then present a WDM-based photonic-electronic digital comparator and experimentally demonstrate its practicality in performing high-performance arithmetic logic operations. Next, we investigate photonic-electronic circuits for intra-chip interconnects with a WDM-based photonic-electronic switching network. We experimentally demonstrate these photonic-electronic digital logic circuits operating at 20 Gb/s.
Additionally, we focus on optical analog computing and discuss scaling strategies for photonic-electronic analog computing circuits that can accelerate artificial intelligence (AI) tasks. We present a subspace optical neural network architecture that trades the universality of weight representation for better hardware usage, such as a smaller footprint and lower energy consumption. We experimentally demonstrate its utility using a butterfly-style photonic-electronic neural chip. Finally, we investigate device-level optimization of the optical neural network using a promising multi-operand optical neuron to further scale down the footprint of photonic neural chips. We conduct thorough performance discussions of these photonic-electronic computing circuits, demonstrating their potential to outperform transistor-based computing circuits in terms of computational speed and energy efficiency.
Electrical and Computer Engineering