53 research outputs found
DIAC: Design Exploration of Intermittent-Aware Computing Realizing Batteryless Systems
Battery-powered IoT devices face challenges like cost, maintenance, and
environmental sustainability, prompting the emergence of batteryless
energy-harvesting systems that harness ambient sources. However, their
intermittent behavior can disrupt program execution and cause data loss,
leading to unpredictable outcomes. Despite exhaustive studies employing
conventional checkpoint methods and intricate programming paradigms to address
these pitfalls, this paper proposes an innovative systematic methodology,
namely DIAC. The DIAC synthesis procedure enhances the performance and
efficiency of intermittent computing systems, with a focus on maximizing
forward progress and minimizing the energy overhead imposed by distinct memory
arrays for backup. A finite-state machine is then delineated, encapsulating
the core operations of an IoT node: the sense, compute, transmit, and sleep
states. We first validate the robustness and functionality of a DIAC-based
design in the presence of power disruptions. DIAC is then applied to a wide
range of benchmarks, including ISCAS-89, MCNC, and ITC-99. The simulation results
substantiate the power-delay-product (PDP) benefits. For example, results for
complex MCNC benchmarks indicate a PDP improvement of 61%, 56%, and 38% on
average compared to three alternative techniques, evaluated at 45 nm. Comment: 6 pages, to appear in the Design, Automation and Test in Europe
Conference 202
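The four-state node behavior named in the abstract can be illustrated with a minimal sketch. The abstract only names the sense, compute, transmit, and sleep states; the cyclic transition order, the `run` helper, and the idea of holding the current state in a non-volatile variable across outages are assumptions made for illustration, not the DIAC design itself.

```python
from enum import Enum


class State(Enum):
    SLEEP = "sleep"
    SENSE = "sense"
    COMPUTE = "compute"
    TRANSMIT = "transmit"


# Assumed cyclic order; the abstract names the states but not the transitions.
NEXT_STATE = {
    State.SLEEP: State.SENSE,
    State.SENSE: State.COMPUTE,
    State.COMPUTE: State.TRANSMIT,
    State.TRANSMIT: State.SLEEP,
}


def run(power_trace, start=State.SLEEP):
    """Advance one state per powered step and hold the state during outages.

    power_trace: iterable of booleans, True when harvested energy suffices.
    Returns the state recorded after each step.
    """
    nonvolatile = start  # hypothetical backup of the current state
    history = []
    for powered in power_trace:
        if powered:
            nonvolatile = NEXT_STATE[nonvolatile]
        # On a power failure the saved state is kept, so no progress is lost.
        history.append(nonvolatile)
    return history
```

Calling `run([True, False, True])` advances to the sense state, holds it through the outage, and then moves on to compute, which is the forward-progress property an intermittent-aware design aims for.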
Enabling Intelligent IoTs for Histopathology Image Analysis Using Convolutional Neural Networks
Medical imaging is an essential data source that has been leveraged worldwide in healthcare systems. In pathology, histopathology images are used for cancer diagnosis, but these images are very complex, and their analysis by pathologists requires large amounts of time and effort. Meanwhile, although convolutional neural networks (CNNs) have produced near-human results in image processing tasks, their processing time is growing and they need more computational power. In this paper, we implement a quantized ResNet model on two histopathology image datasets to optimize the inference power consumption. We analyze classification accuracy, energy estimation, and hardware utilization metrics to evaluate our method. First, the original RGB-colored images are used for the training phase, and then compression methods such as channel reduction and sparsity are applied. Our results show an accuracy increase of 6% from RGB on 32-bit (baseline) to the optimized representation of sparsity on RGB with a lower bit-width, i.e., <8:8>. For energy estimation on the CNN model used, we found that the energy used in RGB color mode with 32-bit is considerably higher than in the other lower bit-width and compressed color modes. Moreover, we show that lower bit-width implementations yield higher resource utilization and a lower memory bottleneck ratio. This work is suitable for inference on energy-limited devices, which are increasingly being used in the Internet of Things (IoT) systems that facilitate healthcare systems.
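To make the idea of reducing bit-width concrete, here is a minimal sketch of symmetric uniform quantization to 8 bits. The abstract does not describe the quantizer actually used, so the scheme, the function names `quantize` and `dequantize`, and the per-tensor scale are all assumptions for illustration.

```python
def quantize(values, bits=8):
    """Symmetric uniform quantization of floats to signed integer codes.

    Assumed scheme: one scale per list, chosen so the largest magnitude
    maps to the largest representable code.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    peak = max((abs(v) for v in values), default=0.0)
    scale = peak / qmax if peak > 0 else 1.0
    codes = [round(v / scale) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]
```

Each reconstructed value differs from the original by at most half a quantization step, so fewer bits trade a bounded accuracy loss for smaller storage and cheaper arithmetic.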
Semi-decentralized Inference in Heterogeneous Graph Neural Networks for Traffic Demand Forecasting: An Edge-Computing Approach
Prediction of taxi service demand and supply is essential for improving
customer experience and provider profit. Recently, graph neural networks
(GNNs) have shown promise for this application. This approach models
city regions as nodes in a transportation graph and their relations as edges.
GNNs utilize local node features and the graph structure in the prediction.
However, more efficient forecasting can still be achieved by following two main
routes: enlarging the scale of the transportation graph, and simultaneously
exploiting different types of nodes and edges in the graphs. However, both
approaches are challenged by the scalability of GNNs. An immediate remedy to
the scalability challenge is to decentralize the GNN operation. However, this
creates excessive node-to-node communication. In this paper, we first
characterize the excessive communication needs for the decentralized GNN
approach. Then, we propose a semi-decentralized approach utilizing multiple
cloudlets, moderately sized storage and computation devices, that can be
integrated with the cellular base stations. This approach minimizes
inter-cloudlet communication thereby alleviating the communication overhead of
the decentralized approach while promoting scalability due to cloudlet-level
decentralization. Also, we propose a heterogeneous GNN-LSTM algorithm for
improved taxi-level demand and supply forecasting for handling dynamic taxi
graphs where nodes are taxis. Extensive experiments over real data show the
advantage of the semi-decentralized approach as tested over our heterogeneous
GNN-LSTM algorithm. Also, the proposed semi-decentralized GNN approach is shown
to reduce the overall inference time by about an order of magnitude compared to
centralized and decentralized inference schemes. Comment: 13 pages, 10 figures, LaTeX; typos corrected, references added,
mathematical analysis added
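The communication trade-off described in this abstract can be illustrated by counting graph edges that cross cloudlet boundaries. This is only a rough proxy under stated assumptions: the function name `cross_cloudlet_edges`, the edge-list representation, and the node-to-cloudlet mapping are hypothetical and are not taken from the paper.

```python
def cross_cloudlet_edges(edges, assignment):
    """Count edges whose endpoints are hosted on different cloudlets.

    edges: iterable of (u, v) node pairs.
    assignment: dict mapping each node to the cloudlet that hosts it.
    Each crossing edge stands for one message exchanged between cloudlets.
    """
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


# A 4-node ring split across two cloudlets versus one device per node.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
semi_decentralized = {0: "A", 1: "A", 2: "B", 3: "B"}
fully_decentralized = {0: 0, 1: 1, 2: 2, 3: 3}
```

In this toy graph the two-cloudlet grouping leaves two crossing edges, while placing every node on its own device makes all four edges cross, which mirrors the overhead the semi-decentralized approach is meant to reduce.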
Threshold Breaker: Can Counter-Based RowHammer Prevention Mechanisms Truly Safeguard DRAM?
This paper challenges the existing victim-focused counter-based RowHammer
detection mechanisms by experimentally demonstrating a novel multi-sided fault
injection attack technique called Threshold Breaker. This mechanism can
effectively bypass the most advanced counter-based defense mechanisms by
soft-attacking the rows at a farther physical distance from the target rows.
While no prior work has demonstrated the effect of such an attack, our work
closes this gap by systematically testing 128 real commercial DDR4 DRAM
products and reveals that the Threshold Breaker affects various chips from
major DRAM manufacturers. As a case study, we compare the performance
efficiency between our mechanism and a well-known double-sided attack by
performing adversarial weight attacks on a modern Deep Neural Network (DNN).
The results demonstrate that the Threshold Breaker can deliberately deplete the
intelligence of the targeted DNN system while DRAM is fully protected. Comment: 7 pages, 6 figures
OISA: Architecting an Optical In-Sensor Accelerator for Efficient Visual Computing
Targeting vision applications at the edge, in this work, we systematically
explore and propose a high-performance and energy-efficient Optical In-Sensor
Accelerator architecture called OISA for the first time. Taking advantage of
the promising efficiency of photonic devices, the OISA intrinsically implements
a coarse-grained convolution operation on the input frames in an innovative
minimum-conversion fashion in low-bit-width neural networks. Such a design
remarkably reduces the power consumption of data conversion, transmission, and
processing in the conventional cloud-centric architecture as well as
recently-presented edge accelerators. Our device-to-architecture simulation
results on various image data-sets demonstrate acceptable accuracy while OISA
achieves 6.68 TOp/s/W efficiency. OISA reduces power consumption by a factor of
7.9 and 18.4 on average compared with existing electronic in-/near-sensor and
ASIC accelerators. Comment: 7 pages
Energy Efficient In-Memory Binary Deep Neural Network Accelerator With Dual-Mode SOT-MRAM
In this paper, we explore the potential of leveraging a spin-based in-memory computing platform as an accelerator for Binary Convolutional Neural Networks (BCNN). Such a platform can implement the dominant convolution computation based on the presented Spin Orbit Torque Magnetic Random Access Memory (SOT-MRAM) array. The proposed array architecture can simultaneously work as non-volatile memory and as reconfigurable in-memory logic (AND, OR) without adding logic circuits to the memory chip, as conventional logic-in-memory designs do. The computed logic output can also be read out like a normal MRAM bit-cell using the shared memory peripheral circuits. We employ this intrinsic in-memory computing architecture to process data efficiently within memory, greatly reducing the power-hungry, long-distance data communication of state-of-the-art BCNN hardware.
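A short software sketch shows why a bitwise AND is enough to carry the dominant convolution work in a binary network. It assumes activations and weights are encoded as 0/1 bits, so a dot product reduces to an AND followed by a bit count; the encoding and the helper names `pack_bits` and `binary_dot` are assumptions for illustration, not the paper's circuit.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into a single integer word."""
    word = 0
    for b in bits:
        word = (word << 1) | (1 if b else 0)
    return word


def binary_dot(activations, weights):
    """Dot product of two 0/1 vectors using bitwise AND plus a bit count."""
    if len(activations) != len(weights):
        raise ValueError("vectors must have the same length")
    # AND keeps only positions where both bits are 1; counting the
    # surviving ones gives the sum of element-wise products.
    return bin(pack_bits(activations) & pack_bits(weights)).count("1")
```

For example, `binary_dot([1, 0, 1, 1], [1, 1, 0, 1])` returns 2, matching the ordinary sum of products 1·1 + 0·1 + 1·0 + 1·1.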