HyperLogLog Sketch Acceleration on FPGA
Data sketches are a set of widely used approximate data summarization
techniques. Their fundamental property is memory complexity that is sub-linear
in the input cardinality, an important aspect when processing streams or data sets
with a vast base domain (URLs, IP addresses, user IDs, etc.). Among the many
data sketches available, HyperLogLog has become the reference for cardinality
counting (how many distinct data items there are in a data set). Although it
does not count every data item (to reduce memory consumption), it provides
probabilistic guarantees on the result, and it is, thus, often used to analyze
data streams. In this paper, we explore how to implement HyperLogLog on an FPGA
to benefit from the parallelism available and the ability to process data
streams coming from high-speed networks. Our multi-pipelined high-cardinality
HyperLogLog implementation delivers 1.8x higher throughput than an optimized
HyperLogLog running on a dual-socket Intel Xeon E5-2630 v3 system with a total
of 16 cores and 32 hyper-threads.
Comment: This paper was accepted as a full paper to FPL 202
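As a rough software illustration of the cardinality counting described in the abstract above, the following is a minimal HyperLogLog sketch in Python. It is not the paper's FPGA design: the class name, the precision parameter `p`, the use of SHA-1 truncated to 64 bits as the hash, and the small-range linear-counting correction are all assumptions chosen for readability.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch (illustrative only, not the FPGA design)."""

    def __init__(self, p=12):
        # p index bits give m = 2**p registers; memory is fixed regardless of input size.
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Standard bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Assumption: a 64-bit hash taken from the first 8 bytes of SHA-1.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                  # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits (1-based).
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        # Harmonic mean of 2**register across all registers, scaled by alpha * m**2.
        z = sum(2.0 ** -r for r in self.registers)
        raw = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    hll = HyperLogLog(p=12)
    for i in range(100_000):
        hll.add(f"user-{i}")
    print(round(hll.estimate()))
```

With `p=12` the sketch keeps 4096 registers however many items are added, and the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%. Adding the same item again never changes a register, which is why duplicates do not inflate the estimate.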
Reconfigurable Hardware Accelerators: Opportunities, Trends, and Challenges
With the emergence in recent years of big data applications such as Machine
Learning, Speech Recognition, Artificial Intelligence, and DNA Sequencing, the
computer architecture research community is facing explosive growth in data
scale. To achieve high efficiency in data-intensive computing, heterogeneous
accelerators that target these applications have become a hot topic in the
computer architecture domain. At present, the
implementation of heterogeneous accelerators mainly relies on heterogeneous
computing units such as Application-specific Integrated Circuit (ASIC),
Graphics Processing Unit (GPU), and Field Programmable Gate Array (FPGA). Among
the typical heterogeneous architectures above, FPGA-based reconfigurable
accelerators have two merits. First, an FPGA contains a large number of
reconfigurable circuits, which satisfy the requirements of high performance
and low power consumption when specific applications are running. Second,
FPGA-based reconfigurable architectures allow prototype systems to be built
rapidly and feature excellent customizability and reconfigurability.
A batch of acceleration works based on FPGAs or other reconfigurable
architectures has recently emerged at top-tier computer architecture
conferences. To review this recent work on reconfigurable computing
accelerators, this survey takes the latest high-level research on
reconfigurable accelerator architectures and algorithm applications as its
basis. We compare hot research topics and domains of concern, and analyze the
advantages, disadvantages, and challenges of reconfigurable accelerators.
Finally, we discuss the likely development trends of accelerator
architectures, hoping to provide a reference for computer architecture
researchers.
Farview: Disaggregated Memory with Operator Off-loading for Database Engines
Cloud deployments disaggregate storage from compute, providing more
flexibility to both the storage and compute layers. In this paper, we explore
disaggregation by taking it one step further and applying it to memory (DRAM).
Disaggregated memory uses network attached DRAM as a way to decouple memory
from CPU. In the context of databases, such a design offers significant
advantages in terms of making a larger memory capacity available as a central
pool to a collection of smaller processing nodes. To explore these
possibilities, we have implemented Farview, a disaggregated memory solution for
databases, operating as a remote buffer cache with operator offloading
capabilities. Farview is implemented as an FPGA-based smart NIC making DRAM
available as a disaggregated, network attached memory module capable of
performing data processing at line rate over data streams to/from disaggregated
memory. Farview supports query offloading using operators such as selection,
projection, aggregation, regular expression matching and encryption. In this
paper we focus on analytical queries and demonstrate the viability of the idea
through an extensive experimental evaluation of Farview under different
workloads. Farview is competitive with a local buffer cache solution for all
the workloads and outperforms it in a number of cases, proving that a smart
disaggregated memory can be a viable alternative for databases deployed in
cloud environments.
Comment: 12 page
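The operator offloading idea in the abstract above can be illustrated with a small Python sketch in which selection and projection run on the memory side, so only the reduced stream is shipped to the compute node. This is a conceptual model only: `remote_read`, `select`, `project`, and `aggregate_sum` are hypothetical names and do not correspond to Farview's actual interface or hardware pipeline.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def select(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Selection operator: keep only the rows that satisfy the predicate."""
    return (row for row in rows if predicate(row))


def project(rows: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    """Projection operator: keep only the requested columns of each row."""
    return ({col: row[col] for col in columns} for row in rows)


def aggregate_sum(rows: Iterable[Row], column: str) -> float:
    """Aggregation operator: sum one numeric column over the stream."""
    return sum(row[column] for row in rows)


def remote_read(
    stored_rows: Iterable[Row],
    predicate: Callable[[Row], bool],
    columns: List[str],
) -> Iterator[Row]:
    """Hypothetical memory-side read: filter and project while streaming,
    so rows and columns the query does not need never cross the network."""
    return project(select(stored_rows, predicate), columns)


if __name__ == "__main__":
    # Toy table standing in for pages held in disaggregated memory.
    table = [
        {"id": 1, "region": "eu", "amount": 10},
        {"id": 2, "region": "us", "amount": 25},
        {"id": 3, "region": "eu", "amount": 5},
    ]
    shipped = list(remote_read(table, lambda r: r["region"] == "eu", ["id", "amount"]))
    print(shipped)
    print(aggregate_sum(shipped, "amount"))
```

The generators process one row at a time, which loosely mirrors the streaming style the abstract describes; in this toy example only two of the three rows, and two of the three columns, reach the consumer.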