    AI/ML Algorithms and Applications in VLSI Design and Technology

    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual and thus time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort needed to understand and process the data within and across different abstraction levels via automated learning algorithms. This, in turn, improves IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced to date for VLSI design and manufacturing. Moreover, we discuss the scope of AI/ML applications at various abstraction levels in the future to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations.
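
    As a hedged illustration only, the sketch below shows the general shape of one kind of ML aid the survey covers: a cheap surrogate regression fit on a handful of early design features to predict a post-layout quantity, so that slow signoff runs can be skipped while exploring candidates. The feature names, data values, and the plain least-squares fit are all assumptions made for this example, not anything proposed in the paper.

        # A minimal, hypothetical sketch of an ML surrogate for VLSI design:
        # predict critical-path delay from early design features, then use the
        # cheap prediction to rank candidates before full signoff. All feature
        # names and numbers below are invented for illustration.
        import numpy as np

        # Each row: [fanout, estimated wirelength (um), logic levels]
        X = np.array([
            [4, 120.0, 6],
            [8, 340.0, 10],
            [2, 60.0, 4],
            [6, 210.0, 8],
            [10, 500.0, 12],
        ], dtype=float)
        y = np.array([0.42, 0.95, 0.25, 0.63, 1.30])  # measured delays in ns (made up)

        # Fit a linear surrogate: delay ~ w0 + w1*fanout + w2*wirelength + w3*levels
        A = np.hstack([np.ones((X.shape[0], 1)), X])
        w, *_ = np.linalg.lstsq(A, y, rcond=None)

        def predict_delay(fanout, wirelength_um, levels):
            """Cheap estimate used to rank candidate netlists before signoff."""
            return float(w @ np.array([1.0, fanout, wirelength_um, levels]))

        print(predict_delay(5, 180.0, 7))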

    An Ultra-Low Cost and Multicast-Enabled Asynchronous NoC for Neuromorphic Edge Computing

    Biological brains are increasingly taken as a guide toward more efficient forms of computing. The latest frontier considers the use of spiking neural-network-based neuromorphic processors for near-sensor data processing, in order to fit the tight power and resource budgets of edge computing devices. However, the prevailing focus on brain-inspired computing and storage primitives in the design of neuromorphic systems is currently bringing a fundamental bottleneck to the forefront: chip-scale communications. While communication architectures (typically, a network-on-chip) are generally inspired by, or even borrowed from, general-purpose computing, neuromorphic communications exhibit unique characteristics: they consist of the event-driven routing of small amounts of information to a large number of destinations within tight area and power budgets. This article aims to mark an inflection point in network-on-chip design for brain-inspired communications, revolving around the combination of cost-effective and robust asynchronous design, architecture specialization for short messaging, and lightweight hardware support for tree-based multicast. When validated with functional spiking neural network traffic, the proposed NoC delivers energy savings ranging from 42% to 71% over a state-of-the-art NoC used in a real multi-core neuromorphic processor for edge computing applications.
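
    As a rough illustration of the tree-based multicast idea the abstract highlights, the sketch below models a single router that copies an incoming spike event to several output ports according to a local multicast table, so fan-out happens inside the network rather than at the source. The port names, tags, and table contents are hypothetical and are not taken from the paper's design.

        # A minimal, hypothetical model of in-network multicast for spike events:
        # the event carries only a small source tag, and each router forks it to
        # the output ports listed in its local table. Table contents are invented.

        MULTICAST_TABLE = {
            # source-neuron tag -> output ports at this router
            0x01: {"north", "local"},
            0x02: {"east"},
            0x03: {"north", "east", "local"},
        }

        def route_spike(tag, default_port="local"):
            """Return the set of output ports this spike is copied to."""
            return MULTICAST_TABLE.get(tag, {default_port})

        # A single injected event is duplicated inside the router, which is what
        # keeps injection bandwidth low for neurons with large fan-out.
        for tag in (0x01, 0x03, 0x7F):
            print(f"spike {tag:#04x} -> {sorted(route_spike(tag))}")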

    Beyond the arithmetic constraint: depth-optimal mapping of logic chains in reconfigurable fabrics

    Look-up table based FPGAs have migrated from a niche technology for design prototyping to a valuable end-product component and, in some cases, a replacement for general purpose processors and ASICs alike. One way architects have bridged the performance gap between FPGAs and ASICs is through the inclusion of specialized components such as multipliers, RAM modules, and microcontrollers. Another dedicated structure that has become standard in reconfigurable fabrics is the arithmetic carry chain. Currently, it is only used to map arithmetic operations as identified by HDL macros. For non-arithmetic operations, it is an idle but potentially powerful resource. Obstacles to using the carry chain for generic logic operations include a lack of architectural and computer-aided design support. Current carry-select architectures facilitate carry chain reuse, although they do so only for (K-1)-input operations. Additionally, hardware description language (HDL) macros are the only recourse for a designer wishing to map generic logic chains in a carry-select architecture. A novel architecture that allows the full K-input operational capacity of the carry chain to be harnessed is presented as a solution to current architectural limitations. It is shown to have negligible impact on logic element area and delay. Using only two additional 2:1 pass-transistor multiplexers, it enables the transmission of a K-input operation to the carry chain and general routing simultaneously. To successfully identify logic chains in an arbitrary Boolean network, ChainMap is presented as a novel technology mapping algorithm. ChainMap creates delay-optimal generic logic chains in polynomial time without HDL macros. It maps both arithmetic and non-arithmetic logic chains whenever depth-increasing nodes, which increase logic depth but not routing depth, are encountered. Use of the chain is not reserved for arithmetic, but rather for any set of gates exhibiting similar characteristics. By using the carry chain as a generic, near-zero-delay adjacent-cell interconnection structure, a potential average optimal speedup of 1.4x is revealed. Post place-and-route experiments indicate that ChainMap solutions perform similarly to HDL chains when cluster resources are abundant and significantly better in cluster-constrained arrays.
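
    To make the notion of a depth-increasing node more concrete, the sketch below walks through a toy Boolean network in which edges placed on the carry chain cost approximately no routing delay, so nodes chained through them deepen the logic chain without deepening the routing depth. The tiny network, the chain-edge choice, and the depth metric are assumptions for illustration; ChainMap itself is considerably more involved.

        # A minimal, hypothetical illustration of depth-increasing nodes: edges
        # mapped onto the carry chain add logic depth but essentially no routing
        # depth. The network and the chosen chain edges are invented.
        from functools import lru_cache

        # Directed acyclic network: node -> fanin nodes (primary inputs have none).
        FANINS = {
            "a": [], "b": [], "c": [], "d": [],
            "n1": ["a", "b"],
            "n2": ["n1", "c"],   # chained after n1
            "n3": ["n2", "d"],   # chained after n2
        }
        # Edges mapped onto the carry chain (adjacent-cell, near-zero-delay links).
        CHAIN_EDGES = {("n1", "n2"), ("n2", "n3")}

        @lru_cache(maxsize=None)
        def routing_depth(node):
            """Depth counted in general-routing hops only; chain edges are free."""
            hops = [routing_depth(src) + (0 if (src, node) in CHAIN_EDGES else 1)
                    for src in FANINS[node]]
            return max(hops, default=0)

        for n in ("n1", "n2", "n3"):
            print(n, "routing depth =", routing_depth(n))
        # n2 and n3 deepen the logic chain but leave the routing depth at 1:
        # exactly the kind of node the abstract calls depth increasing.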

    Methodologies and Toolflows for the Predictable Design of Reliable and Low-Power NoCs

    There is today an unmistakable need to evolve design methodologies and tool flows for Network-on-Chip based embedded systems. In particular, the quest for low power is nowadays a more-than-ever urgent dilemma. Modern circuits feature billions of transistors, and neither power management techniques nor battery capacity are able to keep up with the increasingly higher integration capability of digital devices. Besides, power concerns come together with modern nanoscale silicon technology design issues. On one hand, system failure rates are expected to increase exponentially at every technology node if integrated circuit wear-out failure mechanisms are not compensated for. However, error detection and/or correction mechanisms have a non-negligible impact on network power. On the other hand, to meet stringent time-to-market deadlines, the design cycle of such a distributed and heterogeneous architecture must not be prolonged by unnecessary design iterations. Overall, there is a clear need to better discriminate among reliability strategies and interconnect topology solutions upfront, by ranking designs based on power metrics. In this thesis, we tackle this challenge by proposing power-aware design technologies. Finally, we take into account the most aggressive and disruptive methodology for embedded systems with ultra-low power constraints, by migrating NoC basic building blocks to an asynchronous (or clockless) design style. We address this challenge by delivering a standard-cell design methodology and mainstream CAD tool flows, in this way partially relaxing the requirement of using asynchronous blocks only as hard macros.
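
    Since the thesis moves NoC building blocks to a clockless design style, the sketch below gives a very small behavioural model of the four-phase (return-to-zero) request/acknowledge handshake that asynchronous links commonly rely on. It only illustrates that general protocol; the function names and channel fields are assumptions, not the circuits developed in the work.

        # A minimal, hypothetical model of a four-phase (return-to-zero) handshake
        # between a sender and a receiver sharing req/ack/data wires.

        def four_phase_transfer(data, channel):
            """Sender: drive data, raise req, wait for ack, then return to zero."""
            channel["data"] = data
            channel["req"] = True          # phase 1: request asserted with valid data
            while not channel["ack"]:      # phase 2: receiver acknowledges
                receiver_step(channel)
            channel["req"] = False         # phase 3: request released
            while channel["ack"]:          # phase 4: acknowledge released, link idle
                receiver_step(channel)

        def receiver_step(channel):
            """Receiver: latch data on a new request, then mirror req on ack."""
            if channel["req"] and not channel["ack"]:
                channel["received"].append(channel["data"])
            channel["ack"] = channel["req"]

        link = {"req": False, "ack": False, "data": None, "received": []}
        for flit in ("hdr", "payload0", "payload1", "tail"):
            four_phase_transfer(flit, link)
        print(link["received"])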

    A Low-Power DSP Architecture for a Fully Implantable Cochlear Implant System-on-a-Chip.

    The National Science Foundation Wireless Integrated Microsystems (WIMS) Engineering Research Center at the University of Michigan developed Systems-on-a-Chip to achieve biomedical implant and environmental monitoring functionality within low-milliwatt power consumption and 1-2 cm³ volume. The focus of this work is implantable electronics for cochlear implants (CIs), surgically implanted devices that utilize existing nerve connections between the brain and inner ear in cases where degradation of the sensory hair cells in the cochlea has occurred. In the absence of functioning hair cells, a CI processes sound information and stimulates the underlying nerve cells with currents from implanted electrodes, enabling the patient to understand speech. As the brain of the WIMS CI, the WIMS microcontroller unit (MCU) delivers the communication, signal processing, and storage capabilities required to satisfy the aggressive goals set forth. The 16-bit MCU implements a custom instruction set architecture focusing on power-efficient execution by providing separate data and address register windows, multi-word arithmetic, eight addressing modes, and interrupt and subroutine support. Along with 32 KB of on-chip SRAM, a low-power 512-byte scratchpad memory is utilized by the WIMS custom compiler to obtain an average of 18% energy savings across benchmarks. A synthesizable dynamic frequency scaling circuit allows the chip to select a precision on-chip LC or ring oscillator and perform clock scaling to minimize power dissipation; it provides glitch-free, software-controlled frequency shifting in 100 ns and dissipates only 480 μW. A highly flexible and expandable 16-channel Continuous Interleaved Sampling Digital Signal Processor (DSP) is included as an MCU peripheral component. Modes are included to process data, stimulate through electrodes, and allow experimental stimulation or processing. The entire WIMS MCU occupies 9.18 mm² and consumes only 1.79 mW from 1.2 V in DSP mode. This is the lowest reported consumption for a cochlear DSP. Design methodologies were analyzed and a new top-down design flow is presented that encourages hardware and software co-design as well as cross-domain verification early in the design process. An O(n) technique for energy-per-instruction estimation, both pre- and post-silicon, is presented that achieves less than 4% error across benchmarks. This dissertation advances low-power system design while providing an improvement in hearing-recovery devices. (Ph.D. dissertation, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies; http://deepblue.lib.umich.edu/bitstream/2027.42/91488/1/emarsman_1.pd)
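
    For readers unfamiliar with the continuous interleaved sampling (CIS) strategy that the 16-channel DSP implements in hardware, the sketch below shows the usual signal path in highly simplified form: split the audio into bands, extract and compress each band's envelope, and stimulate the electrodes one at a time so pulses never overlap. The filter construction, compression law, and four-channel setup are assumptions for illustration, not the WIMS implementation.

        # A simplified, hypothetical sketch of CIS processing: band-split the audio,
        # extract each band's envelope, compress it, and pulse electrodes in sequence.
        import numpy as np

        FS = 16_000  # sample rate in Hz (assumed)
        BANDS = [(200, 800), (800, 1800), (1800, 3500), (3500, 7000)]  # Hz, invented

        def band_envelope(x, lo, hi):
            """Crude band envelope: FFT bandpass, then full-wave rectify + smooth."""
            spec = np.fft.rfft(x)
            freqs = np.fft.rfftfreq(len(x), 1 / FS)
            spec[(freqs < lo) | (freqs > hi)] = 0
            band = np.fft.irfft(spec, len(x))
            env = np.abs(band)
            kernel = np.ones(64) / 64            # ~4 ms moving average
            return np.convolve(env, kernel, mode="same")

        def cis_frame(x):
            """One CIS frame: per-channel compressed levels, stimulated in sequence."""
            levels = []
            for ch, (lo, hi) in enumerate(BANDS):
                env = band_envelope(x, lo, hi).mean()
                levels.append((ch, np.log1p(env)))   # logarithmic loudness compression
            return levels                            # electrodes pulsed one after another

        tone = np.sin(2 * np.pi * 1000 * np.arange(1024) / FS)   # 1 kHz test tone
        for ch, level in cis_frame(tone):
            print(f"electrode {ch}: level {level:.3f}")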

    High level synthesis of memory architectures


    Partial VLSI implementation of the architecture for reusable components (ARC)

    This work describes a novel VLSI implementation of the Architecture for Reusable Components (ARC) processor, using a Hardware Description Language (HDL). The main goal here is to achieve efficient execution of reusable software through proper hardware support. This involves the hard-wired implementation of each instruction designed for the ARC processor. Instructions are broken down into their logical functions, then modeled and simulated through the hierarchical design methods that HDL offers. The structural model of the processor has been developed and simulated. The purpose here has been to begin work on the design and implementation of the ARC processor. The instructions were built using HDL modules and then simulated using a logic simulator. The effect of internal propagation delays on the execution of the logic modules has been investigated. Changes in delay parameters have been applied to obtain correct logic transfer operations. The redundancy in the logic transfer operations has also been investigated to expose parallelism at the instruction-execution level.
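
    As a toy analogue of the delay study described above, the sketch below models each logic function with a fixed propagation delay and computes when an output settles as the latest input arrival plus the module delay. The gates, delay values, and two-gate datapath are invented for illustration and do not correspond to the ARC instruction logic.

        # A minimal, hypothetical delay-annotated evaluation of a small logic module:
        # each gate's output settles at the latest input arrival plus its delay.

        GATE_DELAY_NS = {"and": 1.2, "or": 1.0, "xor": 1.8}

        def evaluate(gate, inputs):
            """Return (value, settle_time_ns) for a gate fed by (value, time) pairs."""
            values = [v for v, _ in inputs]
            arrival = max(t for _, t in inputs)
            if gate == "and":
                out = all(values)
            elif gate == "or":
                out = any(values)
            else:  # xor
                out = (sum(values) % 2) == 1
            return out, arrival + GATE_DELAY_NS[gate]

        # Primary inputs arrive at t = 0 ns.
        a, b, c = (1, 0.0), (0, 0.0), (1, 0.0)
        n1 = evaluate("xor", [a, b])       # settles at 1.8 ns
        n2 = evaluate("and", [n1, c])      # settles at 1.8 + 1.2 = 3.0 ns
        print("result:", n2[0], "valid after", n2[1], "ns")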

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MPSoC). MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With the number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how best to provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIP (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions:
    • The design of the NoC architecture needs to strike the best tradeoff among performance, features, and the tight area and power constraints of the on-chip domain.
    • Simulation and verification infrastructure must be put in place to explore, validate, and optimize the NoC performance.
    • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions.
    • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to assess their suitability for next-generation designs and their area and power costs.
    This dissertation performs a design space exploration of network-on-chip architectures, in order to point out the trade-offs associated with the design of each individual network building block and with the design of the network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with themselves and with early network-on-chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and the challenges that lie ahead in order to make this new interconnect technology come true. Among the latter, technology-related challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy; these include, in particular, leakage power dissipation and the containment of process variations and of their effects. The achievement of the above objectives was enabled by a NoC simulation environment for cycle-accurate modelling and simulation and by a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout.
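
    As a hedged, purely illustrative companion to the design-space exploration described above, the sketch below sweeps a few topology parameters, scores each point with a crude first-order power/latency model, and keeps the best-scoring candidates. Every coefficient in the model is an assumption, not a result from the dissertation, which relies on cycle-accurate simulation and physical back-end data instead.

        # A minimal, hypothetical design-space sweep: enumerate a few NoC parameter
        # combinations, score them with a toy analytical model, report the top three.
        from itertools import product

        def estimate(num_routers, flit_width_bits, pipeline_stages):
            """Toy first-order model: power grows with width and router count,
            zero-load latency grows with hop count and pipeline depth."""
            power_mw = 0.15 * num_routers * flit_width_bits / 32 * (1 + 0.2 * pipeline_stages)
            avg_hops = num_routers ** 0.5          # rough mesh-diameter proxy
            latency_cycles = avg_hops * (pipeline_stages + 1)
            return power_mw, latency_cycles

        candidates = []
        for routers, width, stages in product((4, 9, 16), (32, 64, 128), (1, 2, 3)):
            p, l = estimate(routers, width, stages)
            candidates.append((p * l, routers, width, stages, p, l))

        # Rank by a simple power*latency figure of merit and show the top three.
        for score, routers, width, stages, p, l in sorted(candidates)[:3]:
            print(f"{routers} routers, {width}-bit flits, {stages} stages: "
                  f"{p:.2f} mW, {l:.1f} cycles")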