5 research outputs found

    FPGA as a RESTful Service: Release Your Accelerator from the PCIe Cage!

    No full text
    FPGAs have proven to be efficient execution nodes for accelerating compute-intensive tasks in distributed systems. However, keeping them primarily as PCIe-attached devices remains a limiting factor for their broader adoption: (i) the tight coupling of technology stacks increases development and maintenance costs; (ii) the PCIe communication pattern reduces the potential for low-latency and high-bandwidth acceleration; and (iii) cloud offerings are restricted to a few settings. In this work, we present Strega, an HTTP server-side stack for FPGA kernels, making them available as a RESTful micro-service directly over the network. The enhanced hardware abstraction facilitates software integration, while offering deterministic, orders-of-magnitude higher performance.
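
    As a purely illustrative sketch of the interaction model described in this abstract, the snippet below shows a client invoking a network-attached FPGA kernel over plain HTTP. The host name, route, and JSON payload are hypothetical assumptions, not Strega's actual interface.

        # Minimal sketch of calling an FPGA kernel exposed as a RESTful endpoint.
        # The host, route, and JSON payload are hypothetical; the abstract above
        # does not specify Strega's wire format.
        import json
        import urllib.request

        def invoke_fpga_kernel(host: str, payload: dict) -> dict:
            """POST a request directly to the network-attached FPGA and return its reply."""
            req = urllib.request.Request(
                url=f"http://{host}/kernel/invoke",      # hypothetical route
                data=json.dumps(payload).encode("utf-8"),
                headers={"Content-Type": "application/json"},
                method="POST",
            )
            with urllib.request.urlopen(req, timeout=5) as resp:
                return json.loads(resp.read().decode("utf-8"))

        # The FPGA is addressed like any other micro-service, with no PCIe host in the path:
        # result = invoke_fpga_kernel("fpga-node-01:8080", {"input": [1, 2, 3]})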

    The Difficult Balance Between Modern Hardware and Conventional CPUs

    No full text
    Research has demonstrated the potential of accelerators in a wide range of use cases. However, there is a growing imbalance between modern hardware and the CPUs that submit the workload. Recent studies of GPUs on real systems have shown that several servers are often needed per accelerator to generate a load high enough for its computing power to be fully leveraged. This fact is often ignored in research, although it often determines the actual feasibility and overall efficiency of a deployment. In this paper, we conduct a detailed study of the possible configurations and overall cost efficiency of deploying an FPGA-based accelerator on a commercial search engine. First, we show that there are many possible configurations balancing the upstream system and the way the accelerator is configured. Not all of these configurations are suitable in practice, even if they provide some of the highest throughput. Second, we analyse the cost of a deployment capable of sustaining the required workload of the commercial search engine. We examine deployments both on-premises and in the cloud, with and without FPGAs, and with different board models. The results show that, while FPGAs have the potential to significantly improve overall performance, the performance imbalance between their host CPUs and the FPGAs can make the deployments economically unattractive. These findings are intended to inform the development and deployment of accelerators by showing what is needed on the CPU side to make them effective, and also to provide important insights into their end-to-end integration within existing systems.
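
    A back-of-the-envelope model makes the imbalance concrete. The sketch below estimates how many CPU servers are needed to saturate one FPGA and the resulting cost per million queries; every throughput and price figure is a hypothetical placeholder, not a measurement from the paper.

        # Back-of-the-envelope cost model; every number below is a hypothetical
        # placeholder, not a result from the paper.
        import math

        def servers_to_saturate(fpga_qps: float, cpu_feed_qps: float) -> int:
            """How many CPU servers are needed to keep one FPGA busy."""
            return math.ceil(fpga_qps / cpu_feed_qps)

        def cost_per_million_queries(qps: float, hourly_cost: float) -> float:
            """Deployment cost normalised to one million queries."""
            return hourly_cost / (qps * 3600) * 1_000_000

        # Hypothetical figures: one FPGA sustains 200k QPS, but a host CPU can only feed 25k QPS.
        n_hosts = servers_to_saturate(fpga_qps=200_000, cpu_feed_qps=25_000)   # -> 8 servers
        fpga_deployment = cost_per_million_queries(qps=200_000, hourly_cost=n_hosts * 3.0 + 5.0)
        cpu_only = cost_per_million_queries(qps=25_000, hourly_cost=3.0)
        # With these placeholder prices, the FPGA deployment ends up costlier per query
        # than the CPU-only one, illustrating how the imbalance can erase the advantage.
        print(n_hosts, round(fpga_deployment, 4), round(cpu_only, 4))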

    Serverless FPGA: Work-In-Progress

    No full text
    In this short paper we investigate the combination of two emerging technologies: the tight provisioning requirements of Serverless computing and the acceleration potential of FPGAs. Serverless platforms suffer from container overheads, notably cold-start latency, while having to adapt to Function-as-a-Service (FaaS) workloads. By exploiting the reconfigurability of FPGAs and their acceleration power, we propose an innovative lightweight Serverless platform for FPGA-based FaaS applications that aims to reduce these overheads. In this study, we explore the feasibility of the idea by implementing key elements of such a platform on the FPGA. Our initial results show potential for acceleration in all aspects of function invocation.
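
    Purely as a conceptual illustration of the idea, the sketch below models FaaS dispatch to FPGA slots that are already programmed, so an invocation skips the container cold start. The registry, slot abstraction, and invoke stand-in are hypothetical and are not the interface proposed in the paper.

        # Conceptual sketch of FaaS-style dispatch to pre-configured FPGA slots.
        # The registry, slot abstraction, and invoke() stand-in are hypothetical.
        from dataclasses import dataclass
        from typing import Callable, Dict

        @dataclass
        class FpgaSlot:
            bitstream: str                        # reconfigurable region already holding this kernel
            invoke: Callable[[bytes], bytes]      # stand-in for a driver/DMA call into the slot

        class FaasDispatcher:
            def __init__(self) -> None:
                self.registry: Dict[str, FpgaSlot] = {}

            def register(self, function_name: str, slot: FpgaSlot) -> None:
                """Associate a function name with a slot that is already programmed (no cold start)."""
                self.registry[function_name] = slot

            def call(self, function_name: str, payload: bytes) -> bytes:
                """Route an invocation straight to hardware instead of spinning up a container."""
                return self.registry[function_name].invoke(payload)

        # Example with a software stand-in for the hardware kernel:
        dispatcher = FaasDispatcher()
        dispatcher.register("to_upper", FpgaSlot("to_upper.bit", lambda data: data.upper()))
        print(dispatcher.call("to_upper", b"hello fpga"))   # b'HELLO FPGA'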

    Hardware Acceleration of Compression and Encryption in SAP HANA

    No full text
    With the advent of cloud computing, where computational resources are expensive and data movement needs to be secured and minimized, database management systems need to reconsider their architecture to accommodate such requirements. In this paper, we present our analysis, design and evaluation of an FPGA-based hardware accelerator for offloading compression and encryption for SAP HANA, SAP's Software-as-a-Service (SaaS) in-memory database. First, we identify expensive data-transformation operations in the I/O path. Then we present the design details of a system consisting of compression followed by different types of encryption to accommodate different security levels, and identify which combinations maximize performance. We also analyze the performance benefits of offloading decryption to the FPGA followed by decompression on the CPU. The experimental evaluation using SAP HANA traces shows that analytical engines can benefit from FPGA hardware offloading. The results identify a number of important trade-offs (e.g., the system can accommodate use cases ranging from low-latency secured transactions to high-performance scenarios, or offer lower storage cost by also compressing payloads for less critical use cases), and provide valuable information to researchers and practitioners exploring the nascent space of hardware accelerators for database engines.
    ISSN: 2150-809
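
    As a functional reference for the data transformation being offloaded, the sketch below reproduces a compress-then-encrypt path in software using zlib and AES-GCM from the third-party cryptography package; the key and nonce handling are simplified, hypothetical stand-ins rather than SAP HANA's actual scheme.

        # Software reference for the compress-then-encrypt path offloaded to the FPGA.
        # Uses zlib plus AES-GCM from the third-party 'cryptography' package; key and
        # nonce handling are simplified, hypothetical stand-ins.
        import os
        import zlib
        from cryptography.hazmat.primitives.ciphers.aead import AESGCM

        def compress_then_encrypt(page: bytes, key: bytes) -> tuple[bytes, bytes]:
            """Compress a data page, then encrypt it; returns (nonce, ciphertext)."""
            compressed = zlib.compress(page, level=6)
            nonce = os.urandom(12)                        # 96-bit nonce, unique per page
            ciphertext = AESGCM(key).encrypt(nonce, compressed, None)
            return nonce, ciphertext

        def decrypt_then_decompress(nonce: bytes, ciphertext: bytes, key: bytes) -> bytes:
            """Inverse path: decrypt (the step considered for offload), then decompress on the CPU."""
            compressed = AESGCM(key).decrypt(nonce, ciphertext, None)
            return zlib.decompress(compressed)

        key = AESGCM.generate_key(bit_length=256)
        nonce, ct = compress_then_encrypt(b"column store page contents" * 100, key)
        assert decrypt_then_decompress(nonce, ct, key) == b"column store page contents" * 100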

    From Research to Proof-of-Concept: Analysis of a Deployment of FPGAs on a Commercial Search Engine

    No full text
    FPGAs are quickly becoming available in data centres and in the cloud as one more heterogeneous processing element complementing CPUs and GPUs. There are many reports in the research literature showing the potential for FPGAs to accelerate a wide variety of algorithms, which, combined with their growing availability, would seem to indicate widespread use in many applications. Unfortunately, there is not much published research exploring what it takes to integrate an FPGA into an existing application in a cost-effective way while keeping the algorithmic performance advantages. Building on recent results exploring how to employ FPGAs to improve the search engines used in the travel industry, this paper analyses the end-to-end performance of the search engine when using FPGAs, as well as the necessary changes to the software and the cost of such deployments. The results provide important insights on current FPGA deployments and what needs to be done to make FPGAs more widely used. For instance, the large potential performance gains provided by an FPGA are greatly diminished in practice if the application cannot submit requests in the way that is optimal for the FPGA, something that is not always possible and might require significant changes to the application. Similarly, some existing cloud deployments turn out to use a very imbalanced architecture: a powerful FPGA connected to a not-so-powerful CPU. The result is that the CPU cannot generate enough load for the FPGA, which potentially eliminates all performance gains and might even result in a more expensive system. In this paper, we report on an extensive study and development effort to incorporate FPGAs into a search engine and analyse the issues encountered and their practical impact. We expect that these results will inform the development and deployment of FPGAs in the future by providing important insights on the end-to-end integration of FPGAs within existing systems.
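
    One practical issue raised above is that the gains vanish when the application cannot submit requests in a way that suits the FPGA; batching requests before dispatch is one common mitigation. The sketch below illustrates that idea with a hypothetical batch size, flush timeout, and send_to_fpga stub, none of which come from the deployment studied in the paper.

        # Hypothetical request batcher: accumulate queries so the FPGA receives
        # full batches instead of one request at a time. Batch size, timeout and
        # the send_to_fpga() stub are illustrative, not from the studied deployment.
        import time
        from typing import Callable, List

        class FpgaBatcher:
            def __init__(self, send_to_fpga: Callable[[List[bytes]], None],
                         batch_size: int = 32, max_wait_s: float = 0.002) -> None:
                self.send_to_fpga = send_to_fpga
                self.batch_size = batch_size
                self.max_wait_s = max_wait_s
                self.pending: List[bytes] = []
                self.oldest = 0.0

            def submit(self, query: bytes) -> None:
                """Queue a query; flush when the batch is full or the oldest entry is too old."""
                if not self.pending:
                    self.oldest = time.monotonic()
                self.pending.append(query)
                if (len(self.pending) >= self.batch_size
                        or time.monotonic() - self.oldest >= self.max_wait_s):
                    self.flush()

            def flush(self) -> None:
                if self.pending:
                    self.send_to_fpga(self.pending)
                    self.pending = []

        batcher = FpgaBatcher(send_to_fpga=lambda batch: print(f"dispatching {len(batch)} queries"))
        for i in range(70):
            batcher.submit(f"query-{i}".encode())
        batcher.flush()   # flush the remaining tail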