1,945 research outputs found

What is the Path to Fast Fault Simulation?

Author: Abramovici Miron
Krishnamurthy Balaji
Mathews Rob
Rogers Bill
Schulz Michael
Seth Sharad C.
Waicukauski John
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/1988

Motivated by the recent advances in fast fault simulation techniques for large combinational circuits, a panel discussion has been organized for the 1988 International Test Conference. This paper is a collective account of the position statements offered by the panelists

DigitalCommons@University of Nebraska

A Biofuel-Cell-Based Energy Harvester With 86% Peak Efficiency and 0.25-V Minimum Input Voltage Using Source-Adaptive MPPT

Author: Agarwal Abhinav
Chen Kuan-Chang
Emami Azita
Gao Wei
Hoskuldsdottir Gudrun
Kuo William Wei-Ting
Talkhooncheh Arian Hashemi
Wang Minwo
Yu You
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2021

This article presents an efficient cold-starting energy harvester system, fabricated in 65-nm CMOS. The proposed harvester uses no external electrical components and is compatible with biofuel-cell (BFC) voltage and power ranges. A power-efficient system architecture is proposed to keep the internal circuitry operating at 0.4 V while regulating the output voltage at 1 V using switched-capacitor dc–dc converters and a hysteretic controller. A startup enhancement block is presented to facilitate cold startup with any arbitrary input voltage. A real-time on-chip 2-D maximum power point tracking scheme with source degradation tracing is also implemented to keep power efficiency maximized over time. The system performs cold startup with a minimum input voltage of 0.39 V and continues its operation if the input voltage degrades to as low as 0.25 V. A peak power efficiency of 86% is achieved at an input voltage of 0.39 V and an output power of 1.34 μW, with an average chip power consumption of 220 nW. The end-to-end power efficiency is kept above 70% for a wide range of loading powers from 1 to 12 μW. The chip is integrated with a pair of lactate BFC electrodes, 2 mm in diameter, on a prototype printed circuit board (PCB). Integrated operation of the chip with the electrodes and a lactate solution is demonstrated.

Automatic synthesis of analog layout : a survey

Author: Rentmeesters Mark J.
Publication venue: eScholarship, University of California
Publication date: 02/08/1990

A review of recent research in the automatic synthesis of physical geometry for analog integrated circuits is presented. The introduction explains the difficulties involved in analog layout as opposed to digital layout. A review of the literature then follows. Emphasis is placed on the exposition of general methods for addressing problems specific to analog layout, with the details of specific systems being given only when they serve to illustrate these methods well. The conclusion discusses the problems that remain and offers a prediction as to how technology will evolve to solve them. It is argued that although progress has been and will continue to be made in the automation of analog IC layout, due to fundamental differences in the nature of analog IC design as opposed to digital design, it should not be expected that the level of automation of the former will reach that of the latter any time soon.

eScholarship - University of California

Optimized Surface Code Communication in Superconducting Quantum Computers

Author: Brown Kenneth R.
Chong Frederic T.
Franklin Diana
Gokhale Pranav
Holmes Adam
Javadi-Abhari Ali
Martonosi Margaret
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

Quantum computing (QC) is at the cusp of a revolution. Machines with 100 quantum bits (qubits) are anticipated to be operational by 2020 [googlemachine,gambetta2015building], and several-hundred-qubit machines are around the corner. Machines of this scale have the capacity to demonstrate quantum supremacy, the tipping point where QC is faster than the fastest classical alternative for a particular problem. Because error correction techniques will be central to QC and will be the most expensive component of quantum computation, choosing the lowest-overhead error correction scheme is critical to overall QC success. This paper evaluates two established quantum error correction codes, planar and double-defect surface codes, using a set of compilation, scheduling and network simulation tools. In considering scalable methods for optimizing both codes, we do so in the context of a full microarchitectural and compiler analysis. Contrary to previous predictions, we find that the simpler planar codes are sometimes more favorable for implementation on superconducting quantum computers, especially under conditions of high communication congestion. Comment: 14 pages, 9 figures, The 50th Annual IEEE/ACM International Symposium on Microarchitectur

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Turbo decoder VLSI implementations for multi-standards wireless communication systems

Author: Han Jong Hun
Publication venue: The University of Edinburgh
Publication date: 01/01/2006

Edinburgh Research Archive

Protein Alignment Systolic Array Throughput Optimization

Author: Causapruno Giovanni
Graziano Mariagrazia
Urgese Gianvito
Vacca Marco
Zamboni Maurizio
Publication venue: IEEE - INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication date: 01/01/2015

Protein comparison is gaining importance year after year since it has been demonstrated that biologists can find correlations between different species, or genetic mutations that can lead to cancer and genetic diseases. Protein sequence alignment is the most computationally intensive task when performing protein comparison. In order to speed up alignment, dedicated processors that can perform different computations in parallel have been designed. Among them, the best performance has been achieved using Systolic Arrays. However, when the Processing Elements of the Systolic Array have an internal loop, performance can be greatly reduced. In this work we present an architectural strategy to address this problem by applying pipeline interleaving; this strategy is applied to a Systolic Array for the Smith-Waterman algorithm that we designed. Results encourage the adoption of pipeline interleaving for parallel circuits with loop-based Processing Elements. We demonstrate that important benefits in terms of higher operating frequency can be obtained without significant costs in terms of increased complexity, area, and power.
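
For readers unfamiliar with the alignment algorithm named in this abstract, the following is a minimal software sketch of Smith-Waterman local alignment scoring. It is not the authors' hardware design: the function name `smith_waterman_score`, the match/mismatch scores, and the linear gap penalty are illustrative assumptions (protein alignment in practice typically uses a substitution matrix and may use affine gaps). It only shows the recurrence in which each cell depends on its neighbouring cells, which is the kind of dependency a systolic array distributes across its Processing Elements.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    # prev holds the previous row of the dynamic-programming matrix.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Each cell depends on its diagonal, upper, and left neighbours;
            # the floor of 0 is what makes the alignment local.
            curr[j] = max(0,
                          prev[j - 1] + sub,   # match or mismatch
                          prev[j] + gap,       # gap in b
                          curr[j - 1] + gap)   # gap in a
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

Only two rows are kept in memory because the score alone is computed; recovering the aligned substrings would require storing the full matrix and tracing back from the best cell.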

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Doctor of Philosophy

Author: Ramani Karthik
Publication venue: University of Utah
Publication date: 01/12/2012

The embedded system space is characterized by a rapid evolution in the complexity and functionality of applications. In addition, the short time-to-market nature of the business motivates the use of programmable devices capable of meeting the conflicting constraints of low energy, high performance, and short design times. The keys to meeting these conflicting constraints are specialization and maximally extracting available application parallelism. General purpose processors are flexible but are either too power hungry or lack the necessary performance. Application-specific integrated circuits (ASICs) efficiently meet the performance and power needs but are inflexible. Programmable domain-specific architectures (DSAs) are an attractive middle ground, but their design requires significant time, resources, and expertise in a variety of specialties, which range from application algorithms to architecture and, ultimately, circuit design. This dissertation presents CoGenE, a design framework that automates the design of energy-performance-optimal DSAs for embedded systems. For a given application domain and a user-chosen initial architectural specification, CoGenE consists of a Compiler to generate the execution binary, a simulator Generator to collect performance/energy statistics, and an Explorer that modifies the current architecture to improve energy-performance-area characteristics. The above process repeats automatically until the user-specified constraints are achieved. This removes or reduces the time needed to understand the application, manually design the DSA, and generate object code for the DSA. Thus, CoGenE is a new design methodology that represents a significant improvement in performance, energy dissipation, design time, and resources. This dissertation employs the face recognition domain to showcase a flexible architectural design methodology that creates "ASIC-like" DSAs. 
The DSAs are instruction set architecture (ISA)-independent and achieve good energy-performance characteristics by coscheduling the often conflicting constraints of data access, data movement, and computation through a flexible interconnect. This represents a significant increase in programming complexity and code generation time. To address this problem, the CoGenE compiler employs integer linear programming (ILP)-based 'interconnect-aware' scheduling techniques for automatic code generation. The CoGenE explorer employs an iterative technique to search the complete design space and select a set of energy-performance-optimal candidates. When compared to manual designs, results demonstrate that CoGenE produces superior designs for three application domains: face recognition, speech recognition and wireless telephony. While CoGenE is well suited to applications that exhibit a streaming behavior, multithreaded applications like ray tracing present a different but important challenge. To demonstrate its generality, CoGenE is evaluated in designing a novel multicore N-wide SIMD architecture, known as StreamRay, for the ray tracing domain. CoGenE is used to synthesize the SIMD execution cores, the compiler that generates the application binary, and the interconnection subsystem. Further, separating address and data computations in space reduces data movement and contention for resources, thereby significantly improving performance compared to existing ray tracing approaches

The University of Utah: J. Willard Marriott Digital Library

A hardware spinal decoder

Author: Balakrishnan Hari
Fleming Kermin Elliott
Iannucci Peter A.
Perry Jonathan
Shah Devavrat
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012

Spinal codes are a recently proposed capacity-achieving rateless code. While hardware encoding of spinal codes is straightforward, the design of an efficient, high-speed hardware decoder poses significant challenges. We present the first such decoder. By relaxing data dependencies inherent in the classic M-algorithm decoder, we obtain area and throughput competitive with 3GPP turbo codes as well as greatly reduced latency and complexity. The enabling architectural feature is a novel alpha-beta incremental approximate selection algorithm. We also present a method for obtaining hints which anticipate successful or failed decoding, permitting early termination and/or feedback-driven adaptation of the decoding parameters. We have validated our implementation in FPGA with on-air testing. Provisional hardware synthesis suggests that a near-capacity implementation of spinal codes can achieve a throughput of 12.5 Mbps in a 65 nm technology while using substantially less area than competitive 3GPP turbo code implementations. Irwin Mark Jacobs and Joan Klein Jacobs Presidential Fellowship; Intel Corporation (Fellowship); Claude E. Shannon Research Assistantshi

CiteSeerX

DSpace@MIT

Crossref