Search CORE

42 research outputs found

Globally asynchronous locally synchronous configurable array architecture for algorithm embeddings

Author: Gao Bo
Publication venue: The University of Edinburgh
Publication date: 01/01/1996
Field of study

Cryptographic extensions for custom and GPU-like architectures

Author: Argenziano Domenico
Publication venue
Publication date: 11/12/2017
Field of study

The PhD thesis work deals with the exploration of hardware architectures dedicated to cryptographic applications, in particular, solutions based on reconfigurable hardware, such as FPGA. The thesis presents the results achieved for the acceleration of operations essential to homomorphic cryptography, specifically, the integer multiplication of very long operands, based on the Schonhage-Strassen algorithm and implemented with an ad-hoc FPGA hardware. Then, the thesis reports the exploration of novelty approaches for cryptographic acceleration, based on vectorial dedicated architectures, software programmable, with the corresponding implementation of symmetric and public key operations (namely, AES encryption and Montgomery multiplication) with improved performances

Strategies towards high performance (high-resolution/linearity) time-to-digital converters on field-programmable gate arrays

Author: Xie Wujun
Publication venue
Publication date
Field of study

Time-correlated single-photon counting (TCSPC) technology has become popular in scientific research and industrial applications, such as high-energy physics, bio-sensing, non-invasion health monitoring, and 3D imaging. Because of the increasing demand for high-precision time measurements, time-to-digital converters (TDCs) have attracted attention since the 1970s. As a fully digital solution, TDCs are portable and have great potential for multichannel applications compared to bulky and expensive time-to-amplitude converters (TACs). A TDC can be implemented in ASIC and FPGA devices. Due to the low cost, flexibility, and short development cycle, FPGA-TDCs have become promising. Starting with a literature review, three original FPGA-TDCs with outstanding performance are introduced. The first design is the first efficient wave union (WU) based TDC implemented in Xilinx UltraScale (20 nm) FPGAs with a bubble-free sub-TDL structure. Combining with other existing methods, the resolution is further enhanced to 1.23 ps. The second TDC has been designed for LiDAR applications, especially in driver-less vehicles. Using the proposed new calibration method, the resolution is adjustable (50, 80, and 100 ps), and the linearity is exceptionally high (INL pk-pk and INL pk-pk are lower than 0.05 LSB). Meanwhile, a software tool has been open-sourced with a graphic user interface (GUI) to predict TDCs’ performance. In the third TDC, an onboard automatic calibration (AC) function has been realized by exploiting Xilinx ZYNQ SoC architectures. The test results show the robustness of the proposed method. Without the manual calibration, the AC function enables FPGA-TDCs to be applied in commercial products where mass production is required.Time-correlated single-photon counting (TCSPC) technology has become popular in scientific research and industrial applications, such as high-energy physics, bio-sensing, non-invasion health monitoring, and 3D imaging. Because of the increasing demand for high-precision time measurements, time-to-digital converters (TDCs) have attracted attention since the 1970s. As a fully digital solution, TDCs are portable and have great potential for multichannel applications compared to bulky and expensive time-to-amplitude converters (TACs). A TDC can be implemented in ASIC and FPGA devices. Due to the low cost, flexibility, and short development cycle, FPGA-TDCs have become promising. Starting with a literature review, three original FPGA-TDCs with outstanding performance are introduced. The first design is the first efficient wave union (WU) based TDC implemented in Xilinx UltraScale (20 nm) FPGAs with a bubble-free sub-TDL structure. Combining with other existing methods, the resolution is further enhanced to 1.23 ps. The second TDC has been designed for LiDAR applications, especially in driver-less vehicles. Using the proposed new calibration method, the resolution is adjustable (50, 80, and 100 ps), and the linearity is exceptionally high (INL pk-pk and INL pk-pk are lower than 0.05 LSB). Meanwhile, a software tool has been open-sourced with a graphic user interface (GUI) to predict TDCs’ performance. In the third TDC, an onboard automatic calibration (AC) function has been realized by exploiting Xilinx ZYNQ SoC architectures. The test results show the robustness of the proposed method. Without the manual calibration, the AC function enables FPGA-TDCs to be applied in commercial products where mass production is required

A custom computing framework for orientation and photogrammetry

Author: Fiore Paul D. (Paul David), 1963-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2000
Field of study

Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000.Includes bibliographical references (p. 211-223).This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.There is great demand today for real-time computer vision systems, with applications including image enhancement, target detection and surveillance, autonomous navigation, and scene reconstruction. These operations generally require extensive computing power; when multiple conventional processors and custom gate arrays are inappropriate, due to either excessive cost or risk, a class of devices known as Field-Programmable Gate Arrays (FPGAs) can be employed. FPGAs per the flexibility of a programmable solution and nearly the performance of a custom gate array. When implementing a custom algorithm in an FPGA, one must be more efficient than with a gate array technology. By tailoring the algorithms, architectures, and precisions, the gate count of an algorithm may be sufficiently reduced to t into an FPGA. The challenge is to perform this customization of the algorithm, while still maintaining the required performance. The techniques required to perform algorithmic optimization for FPGAs are scattered across many fields; what is currently lacking is a framework for utilizing all these well known and developing techniques. The purpose of this thesis is to develop this framework for orientation and photogrammetry systems.by Paul D. Fiore.Ph.D

High integrity hardware-software codesign

Author: Hilton Adrian J.
Publication venue
Publication date: 01/01/2004
Field of study

Programmable logic devices (PLDs) are increasing in complexity and speed, and are being used as important components in safety-critical systems. Methods for developing high-integrity software for these systems are well-known, but this is not true for programmable logic. We propose a process for developing a system incorporating software and PLDs, suitable for safety critical systems of the highest levels of integrity. This process incorporates the use of Synchronous Receptive Process Theory as a semantic basis for specifying and proving properties of programs executing on PLDs, and extends the use of SPARK Ada from a programming language for safety-critical systems software to cover the interface between software and programmable logic. We have validated this approach through the specification and development of a substantial safety-critical system incorporating both software and programmable logic components, and the development of tools to support this work. This enables us to claim that the methods demonstrated are not only feasible but also scale up to realistic system sizes, allowing development of such safety-critical software-hardware systems to the levels required by current system safety standards

A Solder-Defined Computer Architecture for Backdoor and Malware Resistance

Author: Abel Marc W.
Publication venue: CORE Scholar
Publication date: 01/01/2022
Field of study

This research is about securing control of those devices we most depend on for integrity and confidentiality. An emerging concern is that complex integrated circuits may be subject to exploitable defects or backdoors, and measures for inspection and audit of these chips are neither supported nor scalable. One approach for providing a “supply chain firewall” may be to forgo such components, and instead to build central processing units (CPUs) and other complex logic from simple, generic parts. This work investigates the capability and speed ceiling when open-source hardware methodologies are fused with maker-scale assembly tools and visible-scale final inspection. The author has designed, and demonstrated in simulation, a 36-bit CPU and protected memory subsystem that use only synchronous static random access memory (SRAM) and trivial glue logic integrated circuits as components. The design presently lacks preemptive multitasking, ability to load firmware into the SRAMs used as logic elements, and input/output. Strategies are presented for adding these missing subsystems, again using only SRAM and trivial glue logic. A load-store architecture is employed with four clock cycles per instruction. Simulations indicate that a clock speed of at least 64 MHz is probable, corresponding to 16 million instructions per second (16 MIPS), despite the architecture containing no microprocessors, field programmable gate arrays, programmable logic devices, application specific integrated circuits, or other purchased complex logic. The lower speed, larger size, higher power consumption, and higher cost of an “SRAM minicomputer,” compared to traditional microcontrollers, may be offset by the fully open architecture—hardware and firmware—along with more rigorous user control, reliability, transparency, and auditability of the system. SRAM logic is also particularly well suited for building arithmetic logic units, and can implement complex operations such as population count, a hash function for associative arrays, or a pseudorandom number generator with good statistical properties in as few as eight clock cycles per 36-bit word processed. 36-bit unsigned multiplication can be implemented in software in 47 instructions or fewer (188 clock cycles). A general theory is developed for fast SRAM parallel multipliers should they be needed

HySim: A Hybrid Software/Hardware Simulation Framework for Early Architectural Exploration of Chip Multiprocessors

Author
Publication venue
Publication date
Field of study