171,083 research outputs found
A Design Methodology for Space-Time Adapter
This paper presents a solution to efficiently explore the design space of
communication adapters. In most digital signal processing (DSP) applications,
the overall architecture of the system is significantly affected by
communication architecture, so the designers need specifically optimized
adapters. By explicitly modeling these communications within an effective
graph-theoretic model and analysis framework, we automatically generate an
optimized architecture, named Space-Time AdapteR (STAR). Our design flow inputs
a C description of Input/Output data scheduling, and user requirements
(throughput, latency, parallelism...), and formalizes communication constraints
through a Resource Constraints Graph (RCG). The RCG properties enable an
efficient architecture space exploration in order to synthesize a STAR
component. The proposed approach has been tested to design an industrial data
mixing block example: an Ultra-Wideband interleaver.Comment: ISBN : 978-1-59593-606-
A Methodology for Efficient Space-Time Adapter Design Space Exploration: A Case Study of an Ultra Wide Band Interleaver
This paper presents a solution to efficiently explore the design space of
communication adapters. In most digital signal processing (DSP) applications,
the overall architecture of the system is significantly affected by
communication architecture, so the designers need specifically optimized
adapters. By explicitly modeling these communications within an effective
graph-theoretic model and analysis framework, we automatically generate an
optimized architecture, named Space-Time AdapteR (STAR). Our design flow inputs
a C description of Input/Output data scheduling, and user requirements
(throughput, latency, parallelism...), and formalizes communication constraints
through a Resource Constraints Graph (RCG). The RCG properties enable an
efficient architecture space exploration in order to synthesize a STAR
component. The proposed approach has been tested to design an industrial data
mixing block example: an Ultra-Wideband interleaver.Comment: ISBN:1-4244-0921-
BIST hardware synthesis for RTL data paths based on test compatibility classes
New BIST methodology for RTL data paths is presented. The proposed BIST methodology takes advantage of the structural information of RTL data path and reduces the test application time by grouping same-type modules into test compatibility classes (TCCs). During testing, compatible modules share a small number of test pattern generators at the same test time leading to significant reductions in BIST area overhead, performance degradation and test application time. Module output responses from each TCC are checked by comparators leading to substantial reduction in fault-escape probability. Only a single signature analysis register is required to compress the responses of each TCC which leads to high reductions in volume of output data and overall test application time (the sum of test application time and shifting time required to shift out test responses). This paper shows how the proposed TCC grouping methodology is a general case of the traditional BIST embedding methodology for RTL data paths with both uniform and variable bit width. A new BIST hardware synthesis algorithm employs efficient tabu search-based testable design space exploration which combines the accuracy of incremental test scheduling algorithms and the exploration speed of test scheduling algorithms based on fixed test resource allocation. To illustrate TCC grouping methodology efficiency, various benchmark and complex hypothetical data paths have been evaluated and significant improvements over BIST embedding methodology are achieved
A Micro Power Hardware Fabric for Embedded Computing
Field Programmable Gate Arrays (FPGAs) mitigate many of the problemsencountered with the development of ASICs by offering flexibility, faster time-to-market, and amortized NRE costs, among other benefits. While FPGAs are increasingly being used for complex computational applications such as signal and image processing, networking, and cryptology, they are far from ideal for these tasks due to relatively high power consumption and silicon usage overheads compared to direct ASIC implementation. A reconfigurable device that exhibits ASIC-like power characteristics and FPGA-like costs and tool support is desirable to fill this void. In this research, a parameterized, reconfigurable fabric model named as domain specific fabric (DSF) is developed that exhibits ASIC-like power characteristics for Digital Signal Processing (DSP) style applications. Using this model, the impact of varying different design parameters on power and performance has been studied. Different optimization techniques like local search and simulated annealing are used to determine the appropriate interconnect for a specific set of applications. A design space exploration tool has been developed to automate and generate a tailored architectural instance of the fabric.The fabric has been synthesized on 160 nm cell-based ASIC fabrication process from OKI and 130 nm from IBM. A detailed power-performance analysis has been completed using signal and image processing benchmarks from the MediaBench benchmark suite and elsewhere with comparisons to other hardware and software implementations. The optimized fabric implemented using the 130 nm process yields energy within 3X of a direct ASIC implementation, 330X better than a Virtex-II Pro FPGA and 2016X better than an Intel XScale processor
FASTCUDA: Open Source FPGA Accelerator & Hardware-Software Codesign Toolset for CUDA Kernels
Using FPGAs as hardware accelerators that communicate with a central CPU is becoming a common practice in the embedded design world but there is no standard methodology and toolset to facilitate this path yet. On the other hand, languages such as CUDA and OpenCL provide standard development environments for Graphical Processing Unit (GPU) programming. FASTCUDA is a platform that provides the necessary software toolset, hardware architecture, and design methodology to efficiently adapt the CUDA approach into a new FPGA design flow. With FASTCUDA, the CUDA kernels of a CUDA-based application are partitioned into two groups with minimal user intervention: those that are compiled and executed in parallel software, and those that are synthesized and implemented in hardware. A modern low power FPGA can provide the processing power (via numerous embedded micro-CPUs) and the logic capacity for both the software and hardware implementations of the CUDA kernels. This paper describes the system requirements and the architectural decisions behind the FASTCUDA approach
Synthesis of all-digital delay lines
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other worksThe synthesis of delay lines (DLs) is a core task during the generation of matched delays, ring oscillator clocks or delay monitors. The main figure of merit of a DL is the fidelity to track variability. Unfortunately, complex systems have a great diversity of timing paths that exhibit different sensitivities to static and dynamic variations. Designing DLs that capture this diversity is an ardous task. This paper proposes an algorithmic approach for the synthesis of DLs that can be integrated in a conventional design flow. The algorithm uses heuristics to perform a combinatorial search in a vast space of solutions that combine different types of gates and wire lengths. The synthesized DLs are (1) all digital, i.e., built of conventional standard cells, (2) accurate in tracking variability and (3) configurable at runtime. Experimental results with a commercial standard cell library confirm the quality of the DLs that only exhibit delay mismatches of about 1% on average over all PVT corners.Peer ReviewedPostprint (author's final draft
A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
The present survey has been accepted for publication at ACM Computing
Surveys. If you are considering citing this survey, we would appreciate if you
could use the following BibTeX entry: http://goo.gl/Hf5FvcComment: This is the authors pre-print copy. If you are considering citing
this survey, we would appreciate if you could use the following BibTeX entry:
http://goo.gl/Hf5Fv
- …