Honeycomb: ordered key-value store acceleration on an FPGA-based SmartNIC
In-memory ordered key-value stores are an important building block in modern
distributed applications. We present Honeycomb, a hybrid software-hardware
system for accelerating read-dominated workloads on ordered key-value stores
that provides linearizability for all operations including scans. Honeycomb
stores a B-Tree in host memory, and executes SCAN and GET on an FPGA-based
SmartNIC, and PUT, UPDATE and DELETE on the CPU. This approach enables large
stores and simplifies the FPGA implementation but raises the challenge of data
access and synchronization across the slow PCIe bus. We describe how Honeycomb
overcomes this challenge with careful data structure design, caching, request
parallelism with out-of-order request execution, wait-free read operations, and
batching synchronization between the CPU and the FPGA. For read-heavy YCSB
workloads, Honeycomb improves the throughput of a state-of-the-art ordered
key-value store by at least 1.8x. For scan-heavy workloads inspired by cloud
storage, Honeycomb improves throughput by more than 2x. The cost-performance,
which is more important for large-scale deployments, is improved by at least
1.5x on these workloads.
AxleDB: A novel programmable query processing platform on FPGA
With the rise of Big Data, high-performance query processing through the acceleration of database analytics has gained significant attention. Leveraging Field Programmable Gate Array (FPGA) technology, this approach can lead to clear benefits. In this work, we present the design and implementation of AxleDB: an FPGA-based platform that enables fast query processing for database systems by melding novel database-specific accelerators with commercial-off-the-shelf (COTS) storage using modern interfaces in a novel, unified, and programmable environment. AxleDB can perform a large subset of SQL queries through its set of instructions, which map compute-intensive database operations, such as filter, arithmetic, aggregate, group by, table join, or sort, onto specialized high-throughput accelerators. To minimize the number of SSD I/O operations required, AxleDB also supports hardware MinMax indexing for databases. We evaluated AxleDB with five decision support queries from the TPC-H benchmark suite and achieved speedups from 1.8X to 34.2X and energy-efficiency improvements from 2.8X to 62.1X in comparison to state-of-the-art DBMSs, i.e., PostgreSQL and MonetDB.
The research leading to these results has received funding from the European Union Seventh Framework Program (FP7) (under the AXLE project GA number 318633), the Ministry of Economy and Competitiveness of Spain (under contract number TIN2015-65316-p), Turkish Ministry of Development TAM Project (number 2007K120610), and Bogazici University Scientific Projects (number 7060).
Peer Reviewed. Postprint (author's final draft)
FPGA-based Query Acceleration for Non-relational Databases
Database management systems are an integral part of everyday life. Trends like smart applications, the internet of things, and business and social networks require applications to deal efficiently with data in various data models close to the underlying domain. Therefore, non-relational database systems provide a wide variety of database models, like graphs and documents. However, current non-relational database systems face performance challenges due to the end of Dennard scaling and, with it, of CPU performance scaling. Meanwhile, FPGAs have gained traction as accelerators for data management.
Our goal is to tackle the performance challenges of non-relational database systems with FPGA acceleration and, at the same time, address design challenges of FPGA acceleration itself. Therefore, we split this thesis up into two main lines of work: graph processing and flexible data processing.
Because benchmark practices for graph processing accelerators are lacking, we propose GraphSim. GraphSim is able to reproduce the runtimes of these accelerators based on a memory access model of each approach. Through this simulation environment, we extract three performance-critical accelerator properties: asynchronous graph processing, a compressed graph data structure, and multi-channel memory. Since these accelerator properties have not been combined in one system, we propose GraphScale. GraphScale is the first scalable, asynchronous graph processing accelerator working on a compressed graph and outperforms all state-of-the-art graph processing accelerators.
Focusing on accelerator flexibility, we propose PipeJSON as the first FPGA-based JSON parser for arbitrary JSON documents. PipeJSON is able to achieve parsing at line-speed, outperforming the fastest vectorized parsers for CPUs. Lastly, we propose the subgraph query processing accelerator GraphMatch, which outperforms state-of-the-art CPU systems for subgraph query processing and is able to flexibly switch queries during runtime in a matter of clock cycles.