3 research outputs found

A Scalable Unsegmented Multiport Memory for FPGA-Based Systems

Author: Attia Osama
Jones Phillip
Townsend Kevin
Zambreno Joseph
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2015

On-chip multiport memory cores are crucial primitives for many modern high-performance reconfigurable architectures and multicore systems. Previous approaches for scaling memory cores come at the cost of operating frequency, communication overhead, and logic resources without increasing the storage capacity of the memory. In this paper, we present two approaches for designing multiport memory cores that are suitable for reconfigurable accelerators with substantial on-chip memory or complex communication. Our design approaches tackle these challenges by banking RAM blocks and utilizing interconnect networks, which allows scaling without sacrificing logic resources. With banking, memory congestion is unavoidable, and we evaluate our multiport memory cores under different memory access patterns to gain insight into the design trade-offs. We demonstrate our implementation with up to 256 memory ports using a Xilinx Virtex-7 FPGA. Our experimental results report high-throughput memories with resource usage that scales with the number of ports.
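The congestion that banking introduces can be illustrated with a toy model. The sketch below is not the authors' design: it assumes low-order address interleaving (bank = address mod number of banks) and that each bank serves one request per cycle, and the function name `cycles_for_requests` is hypothetical. It only shows why the access pattern matters once memory is banked.

```python
from collections import Counter


def cycles_for_requests(addresses, num_banks):
    """Cycles needed to serve one batch of simultaneous port requests.

    Assumptions (illustrative only): the bank index is address % num_banks,
    and each bank can serve exactly one request per cycle.
    """
    if not addresses:
        return 0
    # Count how many requests land on each bank.
    per_bank = Counter(addr % num_banks for addr in addresses)
    # The most heavily loaded bank determines how long the batch takes.
    return max(per_bank.values())


num_banks = 8
sequential = list(range(8))                  # every port hits a different bank
strided = [i * num_banks for i in range(8)]  # every port hits bank 0

sequential_cycles = cycles_for_requests(sequential, num_banks)
strided_cycles = cycles_for_requests(strided, num_banks)
```

Under these assumptions the sequential pattern completes in one cycle, while the strided pattern serializes on a single bank and takes eight.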

Digital Repository @ Iowa State University (ISU)

Crossref

Directory of Open Access Journals

Computing SpMV on FPGAs

Author: Townsend Kevin Rice
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2016

There are hundreds of papers on accelerating sparse matrix-vector multiplication (SpMV); however, only a handful target FPGAs. Some claim that FPGAs inherently perform worse than CPUs and GPUs. FPGAs do perform worse for some applications, such as matrix-matrix multiplication and matrix-vector multiplication. CPUs and GPUs have too much memory bandwidth and too much floating-point computation power for FPGAs to compete. However, the low ratio of computation to memory operations and the irregular memory access of SpMV trip up both CPUs and GPUs. We see this as a leveling of the playing field for FPGAs. Our implementation focuses on three pillars: matrix traversal, multiply-accumulator design, and matrix compression. First, most SpMV implementations traverse the matrix in row-major order, but we mix column and row traversal. Second, to accommodate the new traversal, the multiply-accumulator stores many intermediate y values. Third, we compress the matrix to increase the transfer rate of the matrix from RAM to the FPGA. Together, these pillars enable our SpMV implementation to perform competitively with CPUs and GPUs.
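For reference, the conventional row-major traversal that the abstract contrasts against can be sketched as follows. This is a generic baseline using the standard compressed sparse row (CSR) layout, not the author's mixed row/column traversal or compression scheme. The function name `spmv_csr` and the example matrix are illustrative assumptions.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form, one row at a time."""
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        # Nonzeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # One multiply and one add per nonzero, plus an indirect read of x.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Example matrix (illustrative):
# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
```

Each nonzero costs only two floating-point operations but requires reading a value, a column index, and an indirectly addressed element of x. This is the low computation-to-memory ratio and irregular access the abstract refers to.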

Digital Repository @ Iowa State University (ISU)

Hardware-Accelerated RNA Secondary-Structure Alignment

Author: Brown M. P. S.
Griffiths-Jones S.
James Moscola
Ron K. Cytron
Schmidt B.
Searls D.
Washietl S.
Young H. Cho
Publication venue: Association for Computing Machinery (ACM)

Crossref