
    Developments and experimental evaluation of partitioning algorithms for adaptive computing systems

    Multi-FPGA systems can deliver higher-performance solutions than traditional computers for some low-level computing tasks, but doing so requires a flexible hardware substrate and an automated mapping system. CHAMPION, under development at the University of Tennessee, is an automated mapping system for implementing image processing applications on multi-FPGA systems. CHAMPION maps applications from the Khoros Cantata graphical programming environment to hardware. The work described in this dissertation automates the CHAMPION backend design flow, which includes partitioning, netlist-to-structural-VHDL conversion, synthesis, placement and routing, and host code generation. The primary goal is to develop and evaluate three different k-way partitioning approaches. The first two approaches develop and implement existing algorithms: a hierarchical partitioning method based on topological ordering (HP), and a recursive algorithm based on the Fiduccia and Mattheyses bipartitioning heuristic (RP). We extend these algorithms to handle the multiple constraints imposed by adaptive computing systems. We also introduce a new recursive partitioning method based on topological ordering and levelization (RPL). In addition to handling the partitioning constraints, the new approach efficiently minimizes the number of FPGAs used and the amount of computation, thereby overcoming some of the weaknesses of the HP and RP algorithms.
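
    The levelization idea behind RPL can be illustrated with a small sketch. The following is a minimal, illustrative partitioner, not the dissertation's implementation: it walks a dataflow netlist in topological order and opens a new FPGA partition whenever a capacity constraint would be violated. The node names, area figures, and single capacity constraint are assumptions for illustration (the real algorithms also track I/O and other constraints).

        from graphlib import TopologicalSorter

        def levelized_partition(preds, area, fpga_capacity):
            """Greedy sketch of topological-order partitioning (RPL-style):
            visit nodes in topological order and close the current FPGA
            partition whenever adding the next node would exceed capacity."""
            order = TopologicalSorter(preds).static_order()
            partitions, current, used = [], [], 0
            for node in order:
                if current and used + area[node] > fpga_capacity:
                    partitions.append(current)   # this FPGA is full
                    current, used = [], 0
                current.append(node)
                used += area[node]
            if current:
                partitions.append(current)
            return partitions

        # Toy pipeline of image-processing operators (illustrative only).
        preds = {"src": set(), "filt": {"src"}, "thresh": {"filt"}, "sink": {"thresh"}}
        area = {"src": 30, "filt": 60, "thresh": 50, "sink": 20}
        print(levelized_partition(preds, area, fpga_capacity=100))
        # -> [['src', 'filt'], ['thresh', 'sink']]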

    Hardware design and CAD for processor-based logic emulation systems.


    Algorithms and Hardware for Efficient Processing of Logic-based Neural Networks

    Recent efforts to improve the performance of neural network (NN) accelerators to meet today's application requirements have given rise to a new trend of logic-based NN inference relying on fixed-function combinational logic (FFCL). This paper presents an optimization methodology for compiling and mapping NNs built from FFCL onto a logic processor. The method maps FFCL blocks to a set of Boolean functions in which each Boolean operation is mapped to a high-performance, low-latency, parallelized processing element. Graph partitioning and scheduling algorithms are presented to handle FFCL blocks that do not fit straightforwardly into the logic processor. Experimental evaluations across several datasets and NNs demonstrate the superior inference throughput of our framework compared to prior-art NN accelerators: we achieve 25x higher throughput than an XNOR-based accelerator for the VGG16 model, which can be amplified 5x by deploying the graph partitioning and merging algorithms.
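
    As a rough illustration of the scheduling step, the sketch below levelizes a toy Boolean DAG (an XOR built from AND/OR/NOT gates) and issues each level in groups bounded by the number of parallel processing elements. The PE count, gate set, and node names are illustrative assumptions, not the paper's actual compiler.

        from graphlib import TopologicalSorter

        def schedule_on_pes(preds, num_pes):
            """ASAP-levelize the Boolean DAG, then issue each level in
            groups of at most num_pes operations per cycle. Operations
            within one level are independent, so a group can run on the
            parallel PEs simultaneously."""
            level = {}
            for node in TopologicalSorter(preds).static_order():
                level[node] = 1 + max((level[p] for p in preds[node]), default=0)
            cycles = []
            for lvl in sorted(set(level.values())):
                ops = [n for n, l in level.items() if l == lvl]
                for i in range(0, len(ops), num_pes):   # one issue group per cycle
                    cycles.append(ops[i:i + num_pes])
            return cycles

        # Toy FFCL block: out = (a AND NOT b) OR (NOT a AND b), i.e. XOR.
        # Primary inputs a and b occupy level 1 in this toy model.
        preds = {"a": set(), "b": set(), "na": {"a"}, "nb": {"b"},
                 "t0": {"a", "nb"}, "t1": {"na", "b"}, "out": {"t0", "t1"}}
        print(schedule_on_pes(preds, num_pes=2))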

    Automatic mapping of graphical programming applications to microelectronic technologies

    Adaptive computing systems (ACSs) and application-specific integrated circuits (ASICs) can serve as flexible hardware accelerators for applications in domains such as image processing and digital signal processing. However, mapping applications onto ACSs and ASICs with traditional methods can take a hardware engineer months to develop and debug. In this dissertation, a new approach for automatically mapping software applications onto ACSs and ASICs has been developed, implemented, and validated. This dissertation presents the design flow of the software environment called CHAMPION, which is being developed at the University of Tennessee. The environment permits high-level design entry using the Cantata graphical programming software from KRI. Using Cantata as the design entry, CHAMPION hides from the user the low-level details of the hardware architecture and the finer issues of mapping the application onto the hardware. The CHAMPION environment was validated using multiple applications of moderate complexity. In one case, an application mapping that required six weeks to perform manually took only six minutes with CHAMPION, yet produced comparable results. Furthermore, the CHAMPION environment was constructed such that retargeting to a new adaptive computing system can be accomplished in just a few hours, as opposed to weeks using manual methods. Thus, CHAMPION permits both ACSs and ASICs to be utilized by a wider audience, with application development accomplished in less time.

    Harnessing Simulation Acceleration to Solve the Digital Design Verification Challenge.

    Today, design verification is by far the most resource- and time-consuming activity in the development of any new digital integrated circuit. Within this area, the vast majority of the verification effort in industry relies on simulation platforms, which are implemented either in hardware or in software. A "simulator" includes a model of each component of a design and can simulate the design's behavior under any input scenario provided by an engineer. Simulators are therefore deployed to evaluate the behavior of a design under as many input scenarios as possible and to identify and debug all incorrect functionality. Two features are critical for simulation-based validation to be effective: performance and checking/debugging capabilities. A wide range of simulation platforms is available today. At one end of the spectrum are software simulators, which provide a very rich software infrastructure for checking and debugging a design's functionality but execute at only 1-10 simulation cycles per second (while actual chips operate at GHz speeds). At the other end are hardware-based platforms, such as accelerators, emulators, and even prototype silicon chips, which deliver performance 4 to 9 orders of magnitude higher, at the cost of very limited or non-existent checking/debugging capabilities. As a result, simulation-based validation today is crippled: one can have either satisfactory performance on hardware-accelerated platforms or critical checking/debugging infrastructure on software simulators, but not both. This dissertation brings together these two ends of the spectrum by presenting solutions that offer high-performance simulation with effective checking and debugging capabilities. Specifically, it addresses the performance challenge of software simulators by leveraging inexpensive off-the-shelf graphics processors as massively parallel execution substrates and then exposing the parallelism inherent in the design model to that architecture. For hardware-based platforms, the dissertation provides solutions that offer enhanced checking and debugging capabilities by abstracting the relevant data to be logged during simulation so as to minimize the cost of collection, transfer, and processing. Altogether, the contributions of this dissertation have the potential to solve the challenge of digital design verification by enabling effective high-performance simulation-based validation.
    Ph.D. dissertation, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/99781/1/dchatt_1.pd
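
    The GPU-simulation idea can be sketched compactly: gates within one topological level of a design are independent, so each level can be evaluated across thousands of input scenarios in a single data-parallel step. In the sketch below, numpy bit vectors stand in for GPU threads; the netlist, gate set, and scenario count are illustrative assumptions, not the dissertation's simulator.

        import numpy as np

        def simulate(levels, gates, values):
            """Data-parallel logic simulation sketch.
            levels: gate names grouped by topological level.
            gates:  name -> (op, fanin names).
            values: name -> bool vector, one entry per input scenario."""
            ops = {"AND": np.logical_and, "OR": np.logical_or, "XOR": np.logical_xor}
            for level in levels:
                for g in level:            # gates in a level are independent
                    op, fanin = gates[g]
                    if op == "NOT":
                        values[g] = np.logical_not(values[fanin[0]])
                    else:
                        values[g] = ops[op](values[fanin[0]], values[fanin[1]])
            return values

        n = 4096                           # 4096 input scenarios per pass
        rng = np.random.default_rng(0)
        values = {"a": rng.integers(0, 2, n).astype(bool),
                  "b": rng.integers(0, 2, n).astype(bool)}
        gates = {"na": ("NOT", ["a"]), "y": ("AND", ["na", "b"])}
        out = simulate([["na"], ["y"]], gates, values)
        print(out["y"][:8])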

    Hardware Acceleration of Electronic Design Automation Algorithms

    With advances in very large scale integration (VLSI) technology, hardware is going parallel. Software, which was traditionally designed to execute on single-core microprocessors, now faces the tough challenge of taking advantage of this parallelism made available by the scaling of hardware. The work presented in this dissertation studies the acceleration of electronic design automation (EDA) software on several hardware platforms: custom integrated circuits (ICs), field programmable gate arrays (FPGAs), and graphics processors. The dissertation concentrates on a subset of EDA algorithms that are heavily used in the VLSI design flow and that exhibit varying degrees of inherent parallelism. In particular, Boolean satisfiability, Monte Carlo based statistical static timing analysis, circuit simulation, fault simulation, and fault table generation are explored. The architectural and performance tradeoffs of implementing these applications on the alternative platforms, in comparison to their implementation on a single-core microprocessor, are studied. In addition, the dissertation presents an automated approach to accelerating uniprocessor code using a graphics processing unit (GPU). The key idea is to partition the software application into kernels in an automated fashion, such that multiple instances of these kernels, when executed in parallel on the GPU, can maximally benefit from the GPU's hardware resources. The work presented in this dissertation demonstrates that several EDA algorithms can be successfully rearchitected to maximally harness their performance on alternative platforms such as custom-designed ICs, FPGAs, and graphics processors, obtaining speedups of up to 800X. The approaches in this dissertation collectively aim to enable the computer-aided design (CAD) community to accelerate EDA algorithms on arbitrary hardware platforms.
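
    Monte Carlo based statistical static timing analysis, one of the algorithms listed above, shows why such workloads map well to GPUs: every Monte Carlo sample is independent. The sketch below vectorizes the samples with numpy (standing in for one GPU thread per sample); the toy circuit, delay means, and sigma are illustrative assumptions, not the dissertation's implementation.

        import numpy as np

        def mc_ssta(topo_order, fanin, mean_delay, sigma, n_samples):
            """Monte Carlo SSTA sketch: sample every gate delay for all
            trials at once, then propagate arrival times through the DAG."""
            rng = np.random.default_rng(1)
            arrival = {}
            for g in topo_order:
                # One random delay per Monte Carlo trial, all drawn together.
                d = rng.normal(mean_delay[g], sigma, n_samples)
                ins = [arrival[f] for f in fanin[g]]
                arrival[g] = d + (np.maximum.reduce(ins) if ins else 0.0)
            return arrival

        # Toy circuit: two parallel paths reconverging at "out".
        topo = ["in", "g1", "g2", "out"]
        fanin = {"in": [], "g1": ["in"], "g2": ["in"], "out": ["g1", "g2"]}
        mean = {"in": 0.0, "g1": 2.0, "g2": 3.0, "out": 1.0}
        arr = mc_ssta(topo, fanin, mean, sigma=0.3, n_samples=100_000)
        print(f"mean={arr['out'].mean():.2f}ns  p99={np.percentile(arr['out'], 99):.2f}ns")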

    Design Space Re-Engineering for Power Minimization in Modern Embedded Systems

    Power minimization is a critical challenge for modern embedded system design. Recently, due to the rapid increase in system complexity and power density, there is a growing need for power control techniques at various design levels. Meanwhile, due to technology scaling, leakage power has become a significant part of power dissipation in CMOS circuits, and new techniques are needed to reduce it. As a result, many new power minimization techniques have been proposed, such as voltage islands, gate sizing, multiple supply and threshold voltages, power gating, and input vector control. These design options further enlarge the design space and make it prohibitively expensive to explore for the most energy-efficient design solution. Consequently, heuristic and randomized algorithms are frequently used to explore the design space, seeking sub-optimal solutions that meet time-to-market requirements. These algorithms are based on the idea of truncating the design space and restricting the search to a subset of the original design space. While this approach can effectively reduce the search runtime, it may also exclude high-quality design solutions and degrade design quality. When the solution to one problem is used as the base for another problem, such quality degradation accumulates; in modern electronic system design, when several such algorithms are used in series to solve problems at different design levels, the final solution can be far from optimal. In my Ph.D. work, I develop a "re-engineering" methodology to facilitate exploring the design space of power-efficient embedded system design. The direct goal is to enhance the performance of existing low-power techniques. The methodology is based on the idea that design quality can be improved by iteratively "re-shaping" the design space according to the "bad" structures in the obtained design solutions, while the search runtime can be reduced by guidance from previous explorations. The approach proceeds in three phases: (1) apply the existing technique to obtain a sub-optimal solution; (2) analyze the solution and expand the design space accordingly; and (3) re-apply the technique to re-explore the enlarged design space. We apply this methodology at different levels of embedded system design to minimize power: (i) switching power reduction in sequential logic synthesis; (ii) gate-level static leakage current reduction; (iii) dual threshold voltage CMOS circuit design; and (iv) a system-level energy-efficient detection scheme for wireless sensor networks. Extensive experiments show that this methodology can effectively enhance the power efficiency of existing embedded system design flows with very little overhead.
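
    The three-phase loop can be written down schematically. The sketch below is a toy, runnable rendering of the apply / analyze-and-expand / re-apply cycle on a simple minimization problem; the cost and expand functions stand in for the technique-specific steps (e.g., a leakage-aware gate-sizing heuristic) and are not the dissertation's actual code.

        def re_engineer(space, cost, expand, max_iters=10):
            """Iterative design-space re-shaping sketch."""
            best = min(space, key=cost)            # (1) apply existing heuristic
            for _ in range(max_iters):
                grown = expand(space, best)        # (2) analyze `best` and
                if grown <= space:                 #     re-shape the space
                    break                          # no new region to explore
                space = grown
                best = min(space, key=cost)        # (3) re-explore
            return best

        # Toy instantiation: a coarse initial space that misses the optimum,
        # with expansion adding candidates around the current best solution.
        cost = lambda x: (x - 37) ** 2             # illustrative objective
        expand = lambda s, b: s | {b - 1, b + 1}   # widen around current best
        print(re_engineer({0, 10, 20, 30}, cost, expand, max_iters=20))  # -> 37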