15 research outputs found

    HOLLOWS: A Power-Aware Task Scheduler for Energy Harvesting Sensor Nodes

    Energy-harvesting sensor nodes (EHSNs) have stringent low-energy-consumption requirements, yet they must concurrently execute several types of tasks (processing, sensing, actuation, etc.). Furthermore, no accurate models exist to predict the energy-harvesting income, which would allow the executing set of prioritized tasks to be adapted at run time. In this paper, we propose a novel power-aware task scheduler for EHSNs, namely HOLLOWS: Head-of-Line Low-Overhead Wide-priority Service. HOLLOWS uses an energy-constrained prioritized queue model to describe the residence time of tasks entering the system and dynamically selects the set of tasks to execute according to system accuracy requirements and expected energy. Moreover, HOLLOWS includes a new energy-harvesting prediction algorithm, the Weather-Conditioned Moving Average (WCMA), which we have developed to estimate the solar-panel energy income. We have tested HOLLOWS under the real-life working conditions of Shimmer, a sensor node for structural health monitoring. Our results indicate that HOLLOWS accurately predicts the energy available in Shimmer to guarantee a given damage-monitoring quality in long-term autonomous scenarios. HOLLOWS is also able to adjust its use of the incoming harvested energy to achieve high accuracy for rapid damage assessment after events such as earthquakes and fires.
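    The abstract describes WCMA as a solar-income predictor that conditions a moving average on current weather. The sketch below is a minimal illustration of that idea; the function names, the `alpha` blend weight, and the exact form of the weather-conditioning factor are assumptions for illustration, not the authors' implementation.

    ```python
    # Hedged sketch of a WCMA-style (Weather-Conditioned Moving Average)
    # solar-energy predictor. Parameter names and the exact weighting
    # scheme are illustrative assumptions, not the paper's code.

    def wcma_predict(past_days, today, slot, alpha=0.5):
        """Predict the solar energy income for slot `slot + 1` of today.

        past_days : list of per-day lists of harvested energy per time slot
        today     : energy observed so far today (slots 0..slot)
        """
        # Mean energy at the next slot over the previous days.
        mean_next = sum(day[slot + 1] for day in past_days) / len(past_days)
        # Weather-conditioning factor: today's conditions so far relative
        # to the same slots on past days (the "GAP"-style scaling).
        gap_num = sum(today[k] for k in range(slot + 1))
        gap_den = sum(day[k] for day in past_days
                      for k in range(slot + 1)) / len(past_days)
        gap = gap_num / gap_den if gap_den else 1.0
        # Blend today's last observation with the weather-scaled history.
        return alpha * today[slot] + (1 - alpha) * gap * mean_next
    ```

    On a clear day the `gap` factor is near 1 and the prediction tracks the historical mean; on an overcast day it scales the historical mean down before blending.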

    HD-Bind: Encoding of Molecular Structure with Low Precision, Hyperdimensional Binary Representations

    Publicly available collections of drug-like molecules have recently grown to comprise tens of billions of possibilities due to advances in chemical synthesis. Traditional methods for identifying "hit" molecules from a large collection of potential drug-like candidates have relied on biophysical theory to compute approximations to the Gibbs free energy of the binding interaction between the drug and its protein target. A major drawback of these approaches is that they require exceptional computing capabilities to consider even relatively small collections of molecules. Hyperdimensional Computing (HDC) is a recently proposed learning paradigm that leverages low-precision binary vector arithmetic to build efficient representations of the data, obtained without the gradient-based optimization required in many conventional machine learning and deep learning approaches. This algorithmic simplicity allows for acceleration in hardware, as previously demonstrated for a range of application areas. We consider existing HDC approaches for molecular property classification and introduce two novel encoding algorithms that leverage the extended-connectivity fingerprint (ECFP) algorithm. We show that HDC-based inference methods are as much as 90 times more efficient than more complex representative machine learning methods, and achieve an acceleration of nearly 9 orders of magnitude compared to inference with molecular docking. We demonstrate multiple approaches for the encoding of molecular data for HDC and examine their relative performance on a range of challenging molecular property prediction and drug-protein binding classification tasks. Our work thus motivates further investigation into molecular representation learning to develop ultra-efficient pre-screening tools.
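    The abstract's core mechanism is mapping a binary fingerprint (such as ECFP) into a high-dimensional binary hypervector using only low-precision arithmetic. The sketch below shows a generic HDC item-memory encoding with majority-vote bundling and Hamming similarity; it is a standard HDC construction given as an assumption, not the paper's two novel encoders.

    ```python
    import numpy as np

    # Illustrative sketch of an HDC-style encoding of a binary fingerprint
    # into a high-dimensional binary vector. The item memory and bundling
    # scheme are generic HDC constructions, not the paper's algorithms.

    rng = np.random.default_rng(0)
    D = 10_000          # hypervector dimensionality
    FP_BITS = 2048      # fingerprint length (ECFP-style)

    # Item memory: one random binary hypervector per fingerprint position.
    item_memory = rng.integers(0, 2, size=(FP_BITS, D), dtype=np.uint8)

    def encode(fingerprint):
        """Bundle the hypervectors of all set bits by majority vote."""
        active = item_memory[np.flatnonzero(fingerprint)]
        return (active.sum(axis=0) * 2 >= len(active)).astype(np.uint8)

    def hamming_sim(a, b):
        """Similarity in [0, 1]: fraction of matching components."""
        return float(np.mean(a == b))
    ```

    Classification then reduces to comparing a query hypervector against per-class prototype hypervectors with `hamming_sim`, which needs only XOR/popcount-style operations and is why HDC maps well onto simple hardware.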

    On Potential Design Impacts of Electromigration Awareness

    Reliability issues significantly limit performance improvements from Moore's-Law scaling. At 45nm and below, electromigration (EM) is a serious reliability issue which affects global and local interconnects in a chip and limits performance scaling. Traditional IC implementation flows meet a 10-year lifetime requirement by overdesigning and sacrificing performance. At the same time, it is well-known among circuit designers that Black's Equation [2] suggests that lifetime can be traded for performance. In our work, we carefully study the impacts of EM-awareness on IC implementation outcomes, and show that circuit performance does not trade off as smoothly with mean time to failure (MTTF) as Black's Equation suggests. We conduct two basic studies: EM lifetime versus performance with a fixed resource budget, and EM lifetime versus resource usage with fixed performance. Using design examples implemented in two process nodes, we show that the performance scaling achieved by reducing the EM lifetime requirement depends on the EM slack in the circuit, which in turn depends on factors such as timing constraints, length of critical paths, and the mix of cell sizes. Depending on these factors, the performance gain can range from 10% to 80% when the lifetime requirement is reduced from 10 years to one year. We show that at a fixed performance requirement, power and area resources are affected by the timing slack and can either decrease by 3% or increase by 7.8% when the MTTF requirement is reduced. We also study how conventional EM fixes using per-net Non-Default Rule (NDR) routing, downsizing of drivers, and fanout reduction affect performance at reduced lifetime requirements. Our study indicates, e.g., that NDR routing can increase performance by up to 5%, but at the cost of a 2% increase in area, at a reduced 7-year lifetime requirement.
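    The lifetime-for-performance trade-off the abstract refers to follows from Black's Equation, MTTF = A·J^(-n)·exp(Ea/(kT)). The worked example below shows the implied scaling; the activation energy, exponent `n = 2`, and temperature are textbook-style assumptions, not values from the paper.

    ```python
    import math

    # Worked example of the MTTF-vs-current-density trade-off implied by
    # Black's Equation: MTTF = A * J**(-n) * exp(Ea / (k*T)).
    # The constants below are illustrative assumptions, not the paper's.

    K_BOLTZMANN = 8.617e-5   # Boltzmann constant, eV/K

    def mttf(j, a=1.0, n=2.0, ea=0.9, temp=378.0):
        """Black's Equation: mean time to failure vs. current density j."""
        return a * j**(-n) * math.exp(ea / (K_BOLTZMANN * temp))

    def allowed_current_scaling(lifetime_ratio, n=2.0):
        """Relaxing MTTF by `lifetime_ratio` lets j grow by this factor."""
        return lifetime_ratio ** (1.0 / n)

    # Reducing the lifetime requirement from 10 years to 1 year (ratio 10)
    # permits roughly a 3.16x higher current density when n = 2.
    ```

    The paper's point is that real designs do not realize this idealized scaling: the achievable gain is bounded by the circuit's EM slack, not by the equation alone.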

    Massively Parallel Big Data Classification on a Programmable Processing In-Memory Architecture

    With the emergence of the Internet of Things, the massive data created in the world pose huge technical challenges for efficient processing. Processing in-memory (PIM) technology has been widely investigated to overcome expensive data movement between processors and memory blocks. However, existing PIM designs incur large area overhead to enable computing capability via additional near-data processing cores and analog/mixed-signal circuits. In this paper, we propose a new massively parallel processing in-memory (PIM) architecture, called CHOIR, based on emerging nonvolatile memory technology for big data classification. Unlike existing PIM designs, which demand large analog/mixed-signal circuits, we support parallel PIM instructions for conditional and arithmetic operations in an area-efficient way. As a result, the classification solution performs both training and testing on the PIM architecture by fully utilizing the massive parallelism. Our design significantly improves the performance and energy efficiency of classification tasks by 123x and 52x, respectively, as compared to the state-of-the-art tree boosting library running on GPU. ©2021 IEEE

    Optimal Performance-Aware Cooling on Enterprise Servers


    MAPG: Memory Access Power Gating

    In mobile systems, the problems of short battery life and increased temperature are exacerbated by wasted leakage power. Leakage power waste can be reduced by power-gating a core while it is stalled waiting for a resource. In this work, we propose and model memory access power gating (MAPG), a low-overhead technique to enable power gating of an active core when it stalls during a long memory access. We describe a programmable two-stage power gating switch design that can vary a core's wake-up delay while maintaining voltage noise limits and leakage power savings. We also model the processor power distribution network and the effect of memory access power gating on neighboring cores. Last, we apply our power gating technique to actual benchmarks, and examine energy savings and overheads from power gating stalled cores during long memory accesses. Our analyses show the potential for over 38% energy savings given "perfect" power gating on memory accesses; we achieve energy savings exceeding 20% for a practical, counter-based implementation.
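    The gap between the "perfect" and counter-based results above comes down to a break-even condition: gating only pays off when the leakage energy saved over the stall exceeds the wake-up overhead. The back-of-the-envelope model below illustrates that condition; all quantities and units are illustrative assumptions, not measurements from the paper.

    ```python
    # Break-even model for power-gating a core stalled on a long memory
    # access, in the spirit of a counter-based MAPG-style policy.
    # Numbers and units are illustrative assumptions, not the paper's data.

    def gating_saves_energy(stall_ns, leak_mw, wake_overhead_nj):
        """Gate only if leakage saved over the stall exceeds wake-up cost."""
        saved_nj = leak_mw * stall_ns / 1000.0   # mW x ns = pJ; /1000 -> nJ
        return saved_nj > wake_overhead_nj

    def break_even_ns(leak_mw, wake_overhead_nj):
        """Minimum stall length for which gating is worthwhile."""
        return wake_overhead_nj * 1000.0 / leak_mw
    ```

    A counter-based policy approximates this by gating only after a stall has already lasted some threshold number of cycles, trading a little missed savings for protection against short stalls that would not break even.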

    Scalable-Application Design for the IoT


    Accelerators for Classical Molecular Dynamics Simulations of Biomolecules.

    Atomistic Molecular Dynamics (MD) simulations provide researchers the ability to model biomolecular structures such as proteins and their interactions with drug-like small molecules with greater spatiotemporal resolution than is otherwise possible using experimental methods. MD simulations are notoriously expensive computational endeavors that have traditionally required massive investment in specialized hardware to access biologically relevant spatiotemporal scales. Our goal is to summarize the fundamental algorithms that are employed in the literature to then highlight the challenges that have affected accelerator implementations in practice. We consider three broad categories of accelerators: Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), and Application Specific Integrated Circuits (ASICs). These categories are comparatively studied to facilitate discussion of their relative trade-offs and to gain context for the current state of the art. We conclude by providing insights into the potential of emerging hardware platforms and algorithms for MD

    Efficient Energy Management and Data Recovery in Sensor Networks using Latent Variables Based Tensor Factorization

    A key factor in a successful sensor network deployment is finding a good balance between maximizing the number of measurements taken (to maintain a good sampling rate) and minimizing the overall energy consumption (to extend the network lifetime). In this work, we present a data-driven statistical model to optimize this tradeoff. Our approach takes advantage of the multivariate nature of the data collected by a heterogeneous sensor network to learn spatio-temporal patterns. These patterns enable us to employ an aggressive duty-cycling policy on the individual sensor nodes, thereby reducing the overall energy consumption. Our experiments with the OMNeT++ network simulator using realistic wireless channel conditions, on data collected from two real-world sensor networks, show that we can sample just 20% of the data and reconstruct the remaining 80% with less than 9% mean error, outperforming similar techniques such as distributed compressive sampling. In addition, we achieve energy savings of up to 76%, depending on the sampling rate and the hardware configuration of the node.
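    The recovery step the abstract describes, reconstructing the ~80% of unsampled readings from the ~20% that were measured, can be illustrated with a low-rank factorization over the observed entries. The sketch below uses a simple masked alternating-least-squares matrix factorization as a stand-in; it is a simplified assumption, not the paper's latent-variable tensor factorization.

    ```python
    import numpy as np

    # Minimal sketch of recovering unsampled sensor readings from a sparse
    # sample via low-rank factorization (a simplified stand-in for the
    # latent-variable tensor factorization described in the abstract).

    def recover(samples, mask, rank=1, iters=20, lam=1e-3, seed=0):
        """Fill in the entries of `samples` where mask == 0.

        samples : (sensors x time) matrix, zeros where unobserved
        mask    : 1 where a reading was sampled, 0 where it was skipped
        """
        rng = np.random.default_rng(seed)
        m, n = samples.shape
        U = rng.standard_normal((m, rank))
        V = rng.standard_normal((n, rank))
        reg = lam * np.eye(rank)
        for _ in range(iters):
            # Alternating least squares over observed entries only.
            for i in range(m):
                obs = mask[i] > 0
                U[i] = np.linalg.solve(V[obs].T @ V[obs] + reg,
                                       V[obs].T @ samples[i, obs])
            for j in range(n):
                obs = mask[:, j] > 0
                V[j] = np.linalg.solve(U[obs].T @ U[obs] + reg,
                                       U[obs].T @ samples[obs, j])
        return U @ V.T
    ```

    Because readings from co-located sensors are correlated, the low-rank latent factors capture the shared spatio-temporal structure, which is what lets an aggressive duty cycle skip most samples without losing much reconstruction accuracy.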