585 research outputs found

    Self-timed field programmmable gate array architectures

    Get PDF

    Design of Timer for Application in ATM using FPGA and VHDL

    Get PDF
    A watchdog timer is a computer hardware timing device that triggers a system reset if the main program, due to some fault condition, such as a hang, neglects to regularly service the watchdog (writing a “service pulse” to it, also referred to as “petting the dog”). The intention is to bring the system back from the hung state into normal operation. Such a timer has got various important applications, one of them being in ATMs (Automated Teller Machine) which we have studied and designed in our project

    System-on-chip Computing and Interconnection Architectures for Telecommunications and Signal Processing

    Get PDF
    This dissertation proposes novel architectures and design techniques targeting SoC building blocks for telecommunications and signal processing applications. Hardware implementation of Low-Density Parity-Check decoders is approached at both the algorithmic and the architecture level. Low-Density Parity-Check codes are a promising coding scheme for future communication standards due to their outstanding error correction performance. This work proposes a methodology for analyzing effects of finite precision arithmetic on error correction performance and hardware complexity. The methodology is throughout employed for co-designing the decoder. First, a low-complexity check node based on the P-output decoding principle is designed and characterized on a CMOS standard-cells library. Results demonstrate implementation loss below 0.2 dB down to BER of 10^{-8} and a saving in complexity up to 59% with respect to other works in recent literature. High-throughput and low-latency issues are addressed with modified single-phase decoding schedules. A new "memory-aware" schedule is proposed requiring down to 20% of memory with respect to the traditional two-phase flooding decoding. Additionally, throughput is doubled and logic complexity reduced of 12%. These advantages are traded-off with error correction performance, thus making the solution attractive only for long codes, as those adopted in the DVB-S2 standard. The "layered decoding" principle is extended to those codes not specifically conceived for this technique. Proposed architectures exhibit complexity savings in the order of 40% for both area and power consumption figures, while implementation loss is smaller than 0.05 dB. Most modern communication standards employ Orthogonal Frequency Division Multiplexing as part of their physical layer. The core of OFDM is the Fast Fourier Transform and its inverse in charge of symbols (de)modulation. Requirements on throughput and energy efficiency call for FFT hardware implementation, while ubiquity of FFT suggests the design of parametric, re-configurable and re-usable IP hardware macrocells. In this context, this thesis describes an FFT/IFFT core compiler particularly suited for implementation of OFDM communication systems. The tool employs an accuracy-driven configuration engine which automatically profiles the internal arithmetic and generates a core with minimum operands bit-width and thus minimum circuit complexity. The engine performs a closed-loop optimization over three different internal arithmetic models (fixed-point, block floating-point and convergent block floating-point) using the numerical accuracy budget given by the user as a reference point. The flexibility and re-usability of the proposed macrocell are illustrated through several case studies which encompass all current state-of-the-art OFDM communications standards (WLAN, WMAN, xDSL, DVB-T/H, DAB and UWB). Implementations results are presented for two deep sub-micron standard-cells libraries (65 and 90 nm) and commercially available FPGA devices. Compared with other FFT core compilers, the proposed environment produces macrocells with lower circuit complexity and same system level performance (throughput, transform size and numerical accuracy). The final part of this dissertation focuses on the Network-on-Chip design paradigm whose goal is building scalable communication infrastructures connecting hundreds of core. A low-complexity link architecture for mesochronous on-chip communication is discussed. The link enables skew constraint looseness in the clock tree synthesis, frequency speed-up, power consumption reduction and faster back-end turnarounds. The proposed architecture reaches a maximum clock frequency of 1 GHz on 65 nm low-leakage CMOS standard-cells library. In a complex test case with a full-blown NoC infrastructure, the link overhead is only 3% of chip area and 0.5% of leakage power consumption. Finally, a new methodology, named metacoding, is proposed. Metacoding generates correct-by-construction technology independent RTL codebases for NoC building blocks. The RTL coding phase is abstracted and modeled with an Object Oriented framework, integrated within a commercial tool for IP packaging (Synopsys CoreTools suite). Compared with traditional coding styles based on pre-processor directives, metacoding produces 65% smaller codebases and reduces the configurations to verify up to three orders of magnitude

    Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC\u2710 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)

    Get PDF
    ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow\u27s challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability

    Practical advances in asynchronous design and in asynchronous/synchronous interfaces

    Get PDF
    Journal ArticleAsynchronous systems are being viewed as an increasingly viable alternative to purely synchronous systems. This paper gives an overview of the current state of the art in practical asynchronous circuit and system design in four areas: controllers, datapaths, processors, and the design of asynchronous/synchronous interfaces

    Analysis and Test of the Effects of Single Event Upsets Affecting the Configuration Memory of SRAM-based FPGAs

    Get PDF
    SRAM-based FPGAs are increasingly relevant in a growing number of safety-critical application fields, ranging from automotive to aerospace. These application fields are characterized by a harsh radiation environment that can cause the occurrence of Single Event Upsets (SEUs) in digital devices. These faults have particularly adverse effects on SRAM-based FPGA systems because not only can they temporarily affect the behaviour of the system by changing the contents of flip-flops or memories, but they can also permanently change the functionality implemented by the system itself, by changing the content of the configuration memory. Designing safety-critical applications requires accurate methodologies to evaluate the system’s sensitivity to SEUs as early as possible during the design process. Moreover it is necessary to detect the occurrence of SEUs during the system life-time. To this purpose test patterns should be generated during the design process, and then applied to the inputs of the system during its operation. In this thesis we propose a set of software tools that could be used by designers of SRAM-based FPGA safety-critical applications to assess the sensitivity to SEUs of the system and to generate test patterns for in-service testing. The main feature of these tools is that they implement a model of SEUs affecting the configuration bits controlling the logic and routing resources of an FPGA device that has been demonstrated to be much more accurate than the classical stuck-at and open/short models, that are commonly used in the analysis of faults in digital devices. By keeping this accurate fault model into account, the proposed tools are more accurate than similar academic and commercial tools today available for the analysis of faults in digital circuits, that do not take into account the features of the FPGA technology.. In particular three tools have been designed and developed: (i) ASSESS: Accurate Simulator of SEuS affecting the configuration memory of SRAM-based FPGAs, a simulator of SEUs affecting the configuration memory of an SRAM-based FPGA system for the early assessment of the sensitivity to SEUs; (ii) UA2TPG: Untestability Analyzer and Automatic Test Pattern Generator for SEUs Affecting the Configuration Memory of SRAM-based FPGAs, a static analysis tool for the identification of the untestable SEUs and for the automatic generation of test patterns for in-service testing of the 100% of the testable SEUs; and (iii) GABES: Genetic Algorithm Based Environment for SEU Testing in SRAM-FPGAs, a Genetic Algorithm-based Environment for the generation of an optimized set of test patterns for in-service testing of SEUs. The proposed tools have been applied to some circuits from the ITC’99 benchmark. The results obtained from these experiments have been compared with results obtained by similar experiments in which we considered the stuck-at fault model, instead of the more accurate model for SEUs. From the comparison of these experiments we have been able to verify that the proposed software tools are actually more accurate than similar tools today available. In particular the comparison between results obtained using ASSESS with those obtained by fault injection has shown that the proposed fault simulator has an average error of 0:1% and a maximum error of 0:5%, while using a stuck-at fault simulator the average error with respect of the fault injection experiment has been 15:1% with a maximum error of 56:2%. Similarly the comparison between the results obtained using UA2TPG for the accurate SEU model, with the results obtained for stuck-at faults has shown an average difference of untestability of 7:9% with a maximum of 37:4%. Finally the comparison between fault coverages obtained by test patterns generated for the accurate model of SEUs and the fault coverages obtained by test pattern designed for stuck-at faults, shows that the former detect the 100% of the testable faults, while the latter reach an average fault coverage of 78:9%, with a minimum of 54% and a maximum of 93:16%

    An integrated soft- and hard-programmable multithreaded architecture

    Get PDF

    Combining dynamic and static scheduling in high-level synthesis

    Get PDF
    Field Programmable Gate Arrays (FPGAs) are starting to become mainstream devices for custom computing, particularly deployed in data centres. However, using these FPGA devices requires familiarity with digital design at a low abstraction level. In order to enable software engineers without a hardware background to design custom hardware, high-level synthesis (HLS) tools automatically transform a high-level program, for example in C/C++, into a low-level hardware description. A central task in HLS is scheduling: the allocation of operations to clock cycles. The classic approach to scheduling is static, in which each operation is mapped to a clock cycle at compile time, but recent years have seen the emergence of dynamic scheduling, in which an operation’s clock cycle is only determined at run-time. Both approaches have their merits: static scheduling can lead to simpler circuitry and more resource sharing, while dynamic scheduling can lead to faster hardware when the computation has a non-trivial control flow. This thesis proposes a scheduling approach that combines the best of both worlds. My idea is to use existing program analysis techniques in software designs, such as probabilistic analysis and formal verification, to optimize the HLS hardware. First, this thesis proposes a tool named DASS that uses a heuristic-based approach to identify the code regions in the input program that are amenable to static scheduling and synthesises them into statically scheduled components, also known as static islands, leaving the top-level hardware dynamically scheduled. Second, this thesis addresses a problem of this approach: that the analysis of static islands and their dynamically scheduled surroundings are separate, where one treats the other as black boxes. We apply static analysis including dependence analysis between static islands and their dynamically scheduled surroundings to optimize the offsets of static islands for high performance. We also apply probabilistic analysis to estimate the performance of the dynamically scheduled part and use this information to optimize the static islands for high area efficiency. Finally, this thesis addresses the problem of conservatism in using sequential control flow designs which can limit the throughput of the hardware. We show this challenge can be solved by formally proving that certain control flows can be safely parallelised for high performance. This thesis demonstrates how to use automated formal verification to find out-of-order loop pipelining solutions and multi-threading solutions from a sequential program.Open Acces

    Desynchronization: Synthesis of asynchronous circuits from synchronous specifications

    Get PDF
    Asynchronous implementation techniques, which measure logic delays at run time and activate registers accordingly, are inherently more robust than their synchronous counterparts, which estimate worst-case delays at design time, and constrain the clock cycle accordingly. De-synchronization is a new paradigm to automate the design of asynchronous circuits from synchronous specifications, thus permitting widespread adoption of asynchronicity, without requiring special design skills or tools. In this paper, we first of all study different protocols for de-synchronization and formally prove their correctness, using techniques originally developed for distributed deployment of synchronous language specifications. We also provide a taxonomy of existing protocols for asynchronous latch controllers, covering in particular the four-phase handshake protocols devised in the literature for micro-pipelines. We then propose a new controller which exhibits provably maximal concurrency, and analyze the performance of desynchronized circuits with respect to the original synchronous optimized implementation. We finally prove the feasibility and effectiveness of our approach, by showing its application to a set of real designs, including a complete implementation of the DLX microprocessor architectur
    corecore