86 research outputs found

    Investigating the Potential of Custom Instruction Set Extensions for SHA-3 Candidates on a 16-bit Microcontroller Architecture

    Get PDF
    In this paper, we investigate the benefit of instruction set extensions for software implementations of all five SHA-3 candidates. To this end, we start from optimized assembly code for a common 16-bit microcontroller instruction set architecture. By themselves, these implementations provide reference for complexity of the algorithms on 16-bit architectures, commonly used in embedded systems. For each algorithm, we then propose suitable instruction set extensions and implement the modified processor core. We assess the gains in throughput, memory consumption, and the area overhead. Our results show that with less than 10% additional area, it is possible to increase the execution speed on average by almost 40%, while reducing memory requirements on average by more than 40%. In particular, the Grostl algorithm, which was one of the slowest algorithms in previous reference implementations, ends up being the fastest implementation by some margin, once minor (but dedicated) instruction set extensions are taken into account

    Microarchitectural Low-Power Design Techniques for Embedded Microprocessors

    Get PDF
    With the omnipresence of embedded processing in all forms of electronics today, there is a strong trend towards wireless, battery-powered, portable embedded systems which have to operate under stringent energy constraints. Consequently, low power consumption and high energy efficiency have emerged as the two key criteria for embedded microprocessor design. In this thesis we present a range of microarchitectural low-power design techniques which enable the increase of performance for embedded microprocessors and/or the reduction of energy consumption, e.g., through voltage scaling. In the context of cryptographic applications, we explore the effectiveness of instruction set extensions (ISEs) for a range of different cryptographic hash functions (SHA-3 candidates) on a 16-bit microcontroller architecture (PIC24). Specifically, we demonstrate the effectiveness of light-weight ISEs based on lookup table integration and microcoded instructions using finite state machines for operand and address generation. On-node processing in autonomous wireless sensor node devices requires deeply embedded cores with extremely low power consumption. To address this need, we present TamaRISC, a custom-designed ISA with a corresponding ultra-low-power microarchitecture implementation. The TamaRISC architecture is employed in conjunction with an ISE and standard cell memories to design a sub-threshold capable processor system targeted at compressed sensing applications. We furthermore employ TamaRISC in a hybrid SIMD/MIMD multi-core architecture targeted at moderate to high processing requirements (> 1 MOPS). A range of different microarchitectural techniques for efficient memory organization are presented. Specifically, we introduce a configurable data memory mapping technique for private and shared access, as well as instruction broadcast together with synchronized code execution based on checkpointing. We then study an inherent suboptimality due to the worst-case design principle in synchronous circuits, and introduce the concept of dynamic timing margins. We show that dynamic timing margins exist in microprocessor circuits, and that these margins are to a large extent state-dependent and that they are correlated to the sequences of instruction types which are executed within the processor pipeline. To perform this analysis we propose a circuit/processor characterization flow and tool called dynamic timing analysis. Moreover, this flow is employed in order to devise a high-level instruction set simulation environment for impact-evaluation of timing errors on application performance. The presented approach improves the state of the art significantly in terms of simulation accuracy through the use of statistical fault injection. The dynamic timing margins in microprocessors are then systematically exploited for throughput improvements or energy reductions via our proposed instruction-based dynamic clock adjustment (DCA) technique. To this end, we introduce a 6-stage 32-bit microprocessor with cycle-by-cycle DCA. Besides a comprehensive design flow and simulation environment for evaluation of the DCA approach, we additionally present a silicon prototype of a DCA-enabled OpenRISC microarchitecture fabricated in 28 nm FD-SOI CMOS. The test chip includes a suitable clock generation unit which allows for cycle-by-cycle DCA over a wide range with fine granularity at frequencies exceeding 1 GHz. Measurement results of speedups and power reductions are provided

    An integrated soft- and hard-programmable multithreaded architecture

    Get PDF

    Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC\u2710 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)

    Get PDF
    ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow\u27s challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability

    Low power architectures for streaming applications

    Get PDF

    Flexi-WVSNP-DASH: A Wireless Video Sensor Network Platform for the Internet of Things

    Get PDF
    abstract: Video capture, storage, and distribution in wireless video sensor networks (WVSNs) critically depends on the resources of the nodes forming the sensor networks. In the era of big data, Internet of Things (IoT), and distributed demand and solutions, there is a need for multi-dimensional data to be part of the Sensor Network data that is easily accessible and consumable by humanity as well as machinery. Images and video are expected to become as ubiquitous as is the scalar data in traditional sensor networks. The inception of video-streaming over the Internet, heralded a relentless research for effective ways of distributing video in a scalable and cost effective way. There has been novel implementation attempts across several network layers. Due to the inherent complications of backward compatibility and need for standardization across network layers, there has been a refocused attention to address most of the video distribution over the application layer. As a result, a few video streaming solutions over the Hypertext Transfer Protocol (HTTP) have been proposed. Most notable are Apple’s HTTP Live Streaming (HLS) and the Motion Picture Experts Groups Dynamic Adaptive Streaming over HTTP (MPEG-DASH). These frameworks, do not address the typical and future WVSN use cases. A highly flexible Wireless Video Sensor Network Platform and compatible DASH (WVSNP-DASH) are introduced. The platform's goal is to usher video as a data element that can be integrated into traditional and non-Internet networks. A low cost, scalable node is built from the ground up to be fully compatible with the Internet of Things Machine to Machine (M2M) concept, as well as the ability to be easily re-targeted to new applications in a short time. Flexi-WVSNP design includes a multi-radio node, a middle-ware for sensor operation and communication, a cross platform client facing data retriever/player framework, scalable security as well as a cohesive but decoupled hardware and software design.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Proceedings Work-In-Progress Session of the 13th Real-Time and Embedded Technology and Applications Symposium

    Get PDF
    The Work-In-Progress session of the 13th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS\u2707) presents papers describing contributions both to state of the art and state of the practice in the broad field of real-time and embedded systems. The 17 accepted papers were selected from 19 submissions. This proceedings is also available as Washington University in St. Louis Technical Report WUCSE-2007-17, at http://www.cse.seas.wustl.edu/Research/FileDownload.asp?733. Special thanks go to the General Chairs – Steve Goddard and Steve Liu and Program Chairs - Scott Brandt and Frank Mueller for their support and guidance
    • …
    corecore