8,370 research outputs found
Empowering parallel computing with field programmable gate arrays
After more than 30 years, reconfigurable computing has grown from a concept to a mature field of science and technology. The cornerstone of this evolution is the field programmable gate array, a building block enabling the configuration of a custom hardware architecture. The departure from static von Neumannlike architectures opens the way to eliminate the instruction overhead and to optimize the execution speed and power consumption. FPGAs now live in a growing ecosystem of development tools, enabling software programmers to map algorithms directly onto hardware. Applications abound in many directions, including data centers, IoT, AI, image processing and space exploration. The increasing success of FPGAs is largely due to an improved toolchain with solid high-level synthesis support as well as a better integration with processor and memory systems. On the other hand, long compile times and complex design exploration remain areas for improvement. In this paper we address the evolution of FPGAs towards advanced multi-functional accelerators, discuss different programming models and their HLS language implementations, as well as high-performance tuning of FPGAs integrated into a heterogeneous platform. We pinpoint fallacies and pitfalls, and identify opportunities for language enhancements and architectural refinements
Array-based architecture for FET-based, nanoscale electronics
Advances in our basic scientific understanding at the molecular and atomic level place us on the verge of engineering designer structures with key features at the single nanometer scale. This offers us the opportunity to design computing systems at what may be the ultimate limits on device size. At this scale, we are faced with new challenges and a new cost structure which motivates different computing architectures than we found efficient and appropriate in conventional very large scale integration (VLSI). We sketch a basic architecture for nanoscale electronics based on carbon nanotubes, silicon nanowires, and nano-scale FETs. This architecture can provide universal logic functionality with all logic and signal restoration operating at the nanoscale. The key properties of this architecture are its minimalism, defect tolerance, and compatibility with emerging bottom-up nanoscale fabrication techniques. The architecture further supports micro-to-nanoscale interfacing for communication with conventional integrated circuits and bootstrap loading
Evaluation of Single-Chip, Real-Time Tomographic Data Processing on FPGA - SoC Devices
A novel approach to tomographic data processing has been developed and
evaluated using the Jagiellonian PET (J-PET) scanner as an example. We propose
a system in which there is no need for powerful, local to the scanner
processing facility, capable to reconstruct images on the fly. Instead we
introduce a Field Programmable Gate Array (FPGA) System-on-Chip (SoC) platform
connected directly to data streams coming from the scanner, which can perform
event building, filtering, coincidence search and Region-Of-Response (ROR)
reconstruction by the programmable logic and visualization by the integrated
processors. The platform significantly reduces data volume converting raw data
to a list-mode representation, while generating visualization on the fly.Comment: IEEE Transactions on Medical Imaging, 17 May 201
Hybrid FPGA: Architecture and Interface
Hybrid FPGAs (Field Programmable Gate Arrays) are composed of general-purpose logic resources
with different granularities, together with domain-specific coarse-grained units. This thesis proposes
a novel hybrid FPGA architecture with embedded coarse-grained Floating Point Units (FPUs) to
improve the floating point capability of FPGAs. Based on the proposed hybrid FPGA architecture,
we examine three aspects to optimise the speed and area for domain-specific applications.
First, we examine the interface between large coarse-grained embedded blocks (EBs) and fine-grained
elements in hybrid FPGAs. The interface includes parameters for varying: (1) aspect ratio of EBs,
(2) position of the EBs in the FPGA, (3) I/O pins arrangement of EBs, (4) interconnect flexibility of
EBs, and (5) location of additional embedded elements such as memory.
Second, we examine the interconnect structure for hybrid FPGAs. We investigate how large and highdensity
EBs affect the routing demand for hybrid FPGAs over a set of domain-specific applications.
We then propose three routing optimisation methods to meet the additional routing demand introduced
by large EBs: (1) identifying the best separation distance between EBs, (2) adding routing switches on
EBs to increase routing flexibility, and (3) introducing wider channel width near the edge of EBs. We
study and compare the trade-offs in delay, area and routability of these three optimisation methods.
Finally, we employ common subgraph extraction to determine the number of floating point adders/subtractors,
multipliers and wordblocks in the FPUs. The wordblocks include registers and can implement fixed
point operations. We study the area, speed and utilisation trade-offs of the selected FPU subgraphs
in a set of floating point benchmark circuits. We develop an optimised coarse-grained FPU, taking
into account both architectural and system-level issues. Furthermore, we investigate the trade-offs
between granularities and performance by composing small FPUs into a large FPU.
The results of this thesis would help design a domain-specific hybrid FPGA to meet user requirements,
by optimising for speed, area or a combination of speed and area
Development and implementation of a LabVIEW based SCADA system for a meshed multi-terminal VSC-HVDC grid scaled platform
This project is oriented to the development of a Supervisory, Control and Data Acquisition
(SCADA) software to control and supervise electrical variables from a scaled platform that
represents a meshed HVDC grid employing National Instruments hardware and LabVIEW logic
environment. The objective is to obtain real time visualization of DC and AC electrical variables
and a lossless data stream acquisition.
The acquisition system hardware elements have been configured, tested and installed on the
grid platform. The system is composed of three chassis, each inside of a VSC terminal cabinet,
with integrated Field-Programmable Gate Arrays (FPGAs), one of them connected via PCI bus
to a local processor and the rest too via Ethernet through a switch. Analogical acquisition
modules were A/D conversion takes place are inserted into the chassis. A personal computer is
used as host, screen terminal and storing space.
There are two main access modes to the FPGAs through the real time system. It has been
implemented a Scan mode VI to monitor all the grid DC signals and a faster FPGA access mode
VI to monitor one converter AC and DC values. The FPGA application consists of two tasks
running at different rates and a FIFO has been implemented to communicate between them
without data loss.
Multiple structures have been tested on the grid platform and evaluated, ensuring the
compliance of previously established specifications, such as sampling and scanning rate, screen
refreshment or possible data loss.
Additionally a turbine emulator was implemented and tested in Labview for further testing
PROGRAPE-1: A Programmable, Multi-Purpose Computer for Many-Body Simulations
We have developed PROGRAPE-1 (PROgrammable GRAPE-1), a programmable
multi-purpose computer for many-body simulations. The main difference between
PROGRAPE-1 and "traditional" GRAPE systems is that the former uses FPGA (Field
Programmable Gate Array) chips as the processing elements, while the latter
rely on the hardwired pipeline processor specialized to gravitational
interactions. Since the logic implemented in FPGA chips can be reconfigured, we
can use PROGRAPE-1 to calculate not only gravitational interactions but also
other forms of interactions such as van der Waals force, hydrodynamical
interactions in SPH calculation and so on. PROGRAPE-1 comprises two Altera
EPF10K100 FPGA chips, each of which contains nominally 100,000 gates. To
evaluate the programmability and performance of PROGRAPE-1, we implemented a
pipeline for gravitational interaction similar to that of GRAPE-3. One pipeline
fitted into a single FPGA chip, which operated at 16 MHz clock. Thus, for
gravitational interaction, PROGRAPE-1 provided the speed of 0.96
Gflops-equivalent. PROGRAPE will prove to be useful for wide-range of
particle-based simulations in which the calculation cost of interactions other
than gravity is high, such as the evaluation of SPH interactions.Comment: 20 pages with 9 figures; submitted to PAS
- …