159 research outputs found
Evolving Quantum Circuits and an FPGA-based Quantum Computing Emulator
The goal of the PQLG group is to develop complete methodologies, software tools and circuits for quantum logic. Our interests are mainly in logic synthesis for quantum circuits and quantum system design [10]. Emulation of quantum circuits using standard reconfigurable FPGA technology and FPGA-based Evolvable Quantum Hardware, proposed here, are research areas not yet dealt with by other research groups. A parallel software simulator was presented in [13].
Parallel Searching-Based Sphere Detector for MIMO Downlink OFDM Systems
In this paper, implementation of a detector with parallel partial candidate-search algorithm is described. Two fully independent partial candidate search processes are simultaneously employed for two groups of transmit antennas based
on QR decomposition (QRD) and QL decomposition (QLD) of a multiple-input multiple-output (MIMO) channel matrix. By using separate simultaneous candidate searching processes, the proposed implementation of QRD-QLD searching-based sphere detector provides a smaller latency and a lower computational complexity
than the original QRD-M detector for similar error-rate performance in wireless communications systems employing four transmit and four receive antennas with 16-QAM or 64-QAM constellation size. It is shown that in coded MIMO orthogonal
frequency division multiplexing (MIMO OFDM) systems, the detection latency and computational complexity of a receiver can be substantially reduced by using the proposed QRD-QLD detector implementation. The QRD-QLD-based sphere detector is also implemented using Field Programmable Gate Array (FPGA) and application specific integrated circuit (ASIC), and its hardware design complexity is compared with that of other sphere detectors reported in the literature.
Sponsors: Nokia, Renesas Mobile, Texas Instruments, Xilinx, National Science Foundation
FPGA-based architectures for acoustic beamforming with microphone arrays : trends, challenges and research opportunities
Over the past decades, many systems composed of arrays of microphones have been developed to satisfy the quality demanded by acoustic applications. Such microphone arrays are sound acquisition systems composed of multiple microphones used to sample the sound field with spatial diversity. The relatively recent adoption of Field-Programmable Gate Arrays (FPGAs) to manage the audio data samples and to perform signal processing operations such as filtering or beamforming has led to customizable architectures able to satisfy the most demanding computational, power or performance requirements of acoustic applications. The presented work provides an overview of the current FPGA-based architectures and how FPGAs are exploited for different acoustic applications. Current trends in the use of this technology, pending challenges and open research opportunities in the use of FPGAs for acoustic applications using microphone arrays are presented and discussed.
TAG: Learning Circuit Spatial Embedding From Layouts
Analog and mixed-signal (AMS) circuit designs still rely on human design
expertise. Machine learning has been assisting circuit design automation by
replacing human experience with artificial intelligence. This paper presents
TAG, a new paradigm of learning the circuit representation from layouts
leveraging text, self-attention and graph. The embedding network model learns
spatial information without manual labeling. We introduce text embedding and a
self-attention mechanism to AMS circuit learning. Experimental results
demonstrate the ability to predict layout distances between instances with
industrial FinFET technology benchmarks. The effectiveness of the circuit
representation is verified by showing the transferability to three other
learning tasks with limited data in the case studies: layout matching
prediction, wirelength estimation, and net parasitic capacitance prediction.
Comment: Accepted by ICCAD 202
Detecting Tangled Logic Structures in VLSI Netlists
This thesis proposes a new problem of identifying large and tangled logic structures in a
synthesized netlist. Large groups of cells that are highly interconnected to each other can
often create potential routing hotspots that require special placement constraints. They can
also indicate problematic clumps of logic that either require resynthesis to reduce wiring
demand or specialized datapath placement. At a glance, this formulation appears similar
to conventional circuit clustering, but there are two important distinctions. First, we are
interested in finding large groups of cells that represent entire logic structures like adders
and decoders, as opposed to clusters with only a handful of cells. Second, we seek to pull
out only the structures of interest, instead of assigning every cell to a cluster to reduce
problem complexity. This work proposes new metrics for detecting structures based on
Rent’s rule that, unlike traditional cluster metrics, are able to fairly differentiate between
large and small groups of cells. Next, we demonstrate how these metrics can be applied to
identify structures in a netlist. Finally, our experiments demonstrate the ability to predict
and alleviate routing hotspots on a real industry design using our metrics and method.
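The abstract does not spell out the proposed metrics, so the following is only a minimal sketch of the underlying Rent's rule relationship, T = t * G^p, applied to one candidate group of cells. Here T is the number of nets leaving the group, G is the number of cells in the group, t is the average number of pins per cell, and p is the Rent exponent. The function name `group_rent_exponent` and the netlist representation (a list of sets of cell names) are assumptions made for illustration, not the thesis's method.

```python
import math


def group_rent_exponent(nets, group):
    """Estimate the Rent exponent p of a cell group from T = t * G**p.

    nets  -- list of sets; each set holds the cell names on one net
    group -- iterable of cell names forming the candidate structure
    (Hypothetical helper: the data layout is assumed for this sketch.)
    """
    group = set(group)
    num_cells = len(group)
    if num_cells < 2:
        raise ValueError("group needs at least two cells")

    # One pin for every (cell in group, net) incidence.
    pins = sum(len(net & group) for net in nets)
    # External terminals: nets touching both the group and the outside.
    terminals = sum(1 for net in nets if net & group and net - group)
    if pins == 0 or terminals == 0:
        raise ValueError("group has no pins or no external nets")

    avg_pins = pins / num_cells
    # Solve T = t * G**p for p.
    return math.log(terminals / avg_pins) / math.log(num_cells)


# Toy netlist: cells a-d are fully interconnected, x and y hang off the side.
example_nets = [
    {"a", "b"}, {"a", "c"}, {"a", "d"},
    {"b", "c"}, {"b", "d"}, {"c", "d"},
    {"d", "x"}, {"x", "y"},
]
tangled = group_rent_exponent(example_nets, {"a", "b", "c", "d"})
loose = group_rent_exponent(example_nets, {"a", "x"})
```

In this toy netlist the densely connected group {a, b, c, d} gets a lower exponent than the unrelated pair {a, x}, because almost all of its pins are absorbed by internal nets.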
Formal Analysis of Arithmetic Circuits using Computer Algebra - Verification, Abstraction and Reverse Engineering
Despite considerable progress in verification and abstraction of random and control logic, advances in formal verification of arithmetic designs have been lagging. This can be attributed mostly to the difficulty of efficiently modeling arithmetic circuits and datapaths without resorting to computationally expensive Boolean methods, such as Binary Decision Diagrams (BDDs) and Boolean Satisfiability (SAT), that require “bit blasting”, i.e., flattening the design to a bit-level netlist. Approaches that rely on computer algebra and Satisfiability Modulo Theories (SMT) methods are either too abstract to handle the bit-level nature of arithmetic designs or require solving computationally expensive decision or satisfiability problems. The work proposed in this thesis aims at overcoming the limitations of analyzing arithmetic circuits, specifically at the post-synthesis phase. It addresses the verification, abstraction and reverse engineering problems of arithmetic circuits at an algebraic level, treating an arithmetic circuit and its specification as a properly constructed algebraic system. The proposed technique solves these problems by function extraction, i.e., by deriving the arithmetic function computed by the circuit from its low-level circuit implementation using a computer algebraic rewriting technique. The proposed techniques work on large integer arithmetic circuits and finite field arithmetic circuits, up to 512 bits wide and containing millions of logic gates.
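As an illustration of what function extraction by algebraic rewriting can look like, consider a half adder with inputs a, b in {0, 1}. The gate polynomials below are the standard arithmetic models of AND and XOR over Boolean variables. Which weighted sum of outputs to start from, and the order of substitution, are assumptions of this worked example and not necessarily the thesis's exact procedure.

```latex
\begin{align*}
  C &= a \wedge b = ab, \\
  S &= a \oplus b = a + b - 2ab, \\
  2C + S &= 2ab + (a + b - 2ab) = a + b .
\end{align*}
```

Starting from the weighted sum of the outputs, 2C + S, and substituting each gate's polynomial rewrites the expression into a + b over the primary inputs, which is the arithmetic function the circuit computes.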
Analysis and hardware implementation of color map inversion algorithms
The purpose of this thesis is to investigate several algorithms that are used to compute the inverse of a forward printer map. The forward printer map models the printer by mapping points in the printer's input color space to points in the printer's output color space. The inverse of this forward map is required to convert input color specifications in a device-independent color space to a color in the printer's device-dependent color space before being presented to the print engine. The accuracy of the inverse printer map directly affects the accuracy of the reproduced colors. Therefore, any measured change in the forward printer map requires re-computation of the inverse map if accurate and consistent color reproduction is to be maintained. An efficient and accurate method of computing the inverse map could be used in an automatic color correction system. Three algorithms for computing the inverse of the forward printer map are studied in this thesis project. These are the Shepard's, Moving Matrix, and Iteratively Clustered Interpolation (ICI) algorithms. The algorithms are implemented in C and simulated in order to benchmark their relative accuracy, speed, and complexity. The simulations show the ICI algorithm to be the fastest and most accurate at computing the inverse map, and its complexity does not far exceed that of the other algorithms. The ICI algorithm was implemented in VHDL and synthesized to a Synopsys generic library in order to determine the approximate size and speed of an ASIC that could perform the inverse computation. The final implementation resulted in two modules: one that implements the ICI algorithm, and one that implements the trilinear interpolation function that is used by the ICI algorithm. The synthesized ICI module contained 112,683 cells, and the synthesized trilinear interpolation module contained 190,357 cells. The timing of the modules resulted in a 40 nanosecond clock period, which corresponds to a maximum operating frequency of 25 MHz. These synthesized results show that this algorithm is suitable for an ASIC that could be used in a real-time automatic color correction system.
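The abstract names trilinear interpolation as the building block used by the ICI module but gives no details. The sketch below shows textbook trilinear interpolation inside one cell of a 3-D lookup table. The names `lerp` and `trilinear`, and the `corners[i][j][k]` layout, are assumptions for illustration and are not taken from the thesis's C or VHDL code.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def trilinear(corners, x, y, z):
    """Interpolate inside a unit cube.

    corners[i][j][k] is the sample at grid corner (i, j, k), i, j, k in {0, 1};
    x, y, z are the fractional coordinates in [0, 1] along those axes.
    """
    # Four interpolations along x, one per cube edge parallel to the x axis.
    c00 = lerp(corners[0][0][0], corners[1][0][0], x)
    c10 = lerp(corners[0][1][0], corners[1][1][0], x)
    c01 = lerp(corners[0][0][1], corners[1][0][1], x)
    c11 = lerp(corners[0][1][1], corners[1][1][1], x)
    # Two interpolations along y, then a final one along z.
    c0 = lerp(c00, c10, y)
    c1 = lerp(c01, c11, y)
    return lerp(c0, c1, z)


# Sample cube whose corner values follow the linear function i + 2j + 4k.
cube = [[[i + 2 * j + 4 * k for k in (0, 1)] for j in (0, 1)] for i in (0, 1)]
value = trilinear(cube, 0.5, 0.25, 0.75)
```

For a color map, each output channel would be interpolated separately with the same fractional coordinates. Seven linear interpolations per channel is one reason the function maps naturally onto a fixed hardware datapath.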
Timing Aware Partitioning for Multi-FPGA based Logic Simulation using Top-down Selective Flattening
In order to accelerate logic simulation, it is highly beneficial to simulate the circuit design on FPGA hardware. However, limited hardware resources on FPGAs prevent large designs from being implemented on a single FPGA. Hence there is a need to partition the design and simulate it on a multi-FPGA platform. In contrast to existing FPGA-based post-synthesis partitioning approaches, which first completely flatten the circuit and then possibly perform bottom-up clustering, we perform a selective top-down flattening and thereby avoid the potential netlist blowup. This also allows us to preserve the design hierarchy to guide the partitioning and to make subsequent debugging easier. Our approach analyzes the hierarchical design and selectively flattens instances using two metrics based on slack. The resulting partially flattened netlist is converted to a hypergraph, partitioned using a public domain partitioner (hMetis), and converted back to a plurality of FPGA netlists, one for each FPGA of the FPGA-based accelerated logic simulation platform. We compare our approach with a partitioning approach that operates on a completely flattened netlist. Static timing analysis was performed for both approaches, and across 15 examples from the OpenCores project, our approach yields a 52% logic simulation speedup and about 0.74x runtime for the entire flow, compared to the completely flat approach. The entire tool chain of our approach is automated in an end-to-end flow covering hierarchy extraction, selective flattening, partitioning, and netlist reconstruction. Compared to an existing method which also performs slack-based partitioning of a hierarchical netlist, we obtain a 35% simulation speedup.
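The netlist-to-hypergraph step described above can be sketched as follows. This is a minimal illustration that assumes a netlist given as a mapping from instance names to the nets they connect to. It does not call hMetis and does not model the slack-based flattening metrics. The function names and the data layout are hypothetical.

```python
from collections import defaultdict


def netlist_to_hypergraph(instances):
    """Turn {instance: [net, ...]} into {net: {instance, ...}} hyperedges."""
    hyperedges = defaultdict(set)
    for inst, nets in instances.items():
        for net in nets:
            hyperedges[net].add(inst)
    # A net with a single pin can never be cut, so it is dropped.
    return {net: cells for net, cells in hyperedges.items() if len(cells) > 1}


def cut_nets(hyperedges, assignment):
    """Return the sorted names of nets that span more than one partition.

    assignment maps each instance name to a partition (FPGA) index.
    """
    return sorted(
        net
        for net, cells in hyperedges.items()
        if len({assignment[cell] for cell in cells}) > 1
    )


# Toy design with four instances split across two FPGAs.
design = {
    "u1": ["n1", "n2"],
    "u2": ["n1", "n3"],
    "u3": ["n2", "n3", "n4"],
    "u4": ["n4", "n5"],
}
graph = netlist_to_hypergraph(design)
crossing = cut_nets(graph, {"u1": 0, "u2": 0, "u3": 1, "u4": 1})
```

Each net in `crossing` has to travel between FPGAs, so the size of that list is the quantity a hypergraph partitioner typically tries to keep small while balancing instance counts.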