Exploring the Behavior of Coherent Accelerator Processor Interface (CAPI) on IBM Power8+ Architecture and FlashSystem 900

Authors: Halem Milton, Prathapan Smriti, Velusamy Kaushik
Publication date: 12/09/2019

The Coherent Accelerator Processor Interface (CAPI) is a general term for the infrastructure that provides a high-throughput, low-latency path to the flash storage connected to the IBM POWER8+ system. The CAPI accelerator card is attached coherently as a peer to the POWER8+ processor. This removes the overhead and complexity of the IO subsystem and allows the accelerator to operate as part of an application. In this paper, we present the results of experiments on the IBM FlashSystem 900 (FS900) with a CAPI accelerator card using the "CAPI-Flash IBM Data Engine for NoSQL Software" library. This library gives the application direct access to the underlying flash storage through user-space APIs for managing and accessing the data in flash. This offloads kernel IO driver functionality to dedicated CAPI FPGA accelerator hardware. We conducted experiments to analyze the performance of the FS900 with a CAPI accelerator card, using the Key Value Layer APIs and employing NASA's MODIS Land Surface Reflectance dataset as a large-dataset use case. We performed read and write operations on datasets ranging in size from 1 MB to 3 TB while varying the number of threads. We then compared this performance with other heterogeneous storage and memory devices such as NVM, SSD and RAM, without the CAPI accelerator, in synchronous and asynchronous file IO modes of operation. The results indicate that FS900 and CAPI, together with the metadata cache in RAM, deliver the highest IO/s and OP/s for read operations. This was higher than using RAM alone, while also using fewer CPU resources. Among FS900, SSD and NVM, FS900 had the highest write IO/s. Another important observation is that, when the size of the input dataset exceeds the capacity of RAM and the data access is non-uniform and sparse, FS900 with CAPI would be a cost-effective alternative.

Comment: 18 pages, 7 figures, 3 tables. Accepted for publication at the 2019 International Workshop on OpenPOWER for HPC (IWOPH19), International Supercomputing Conference HPC, Frankfurt, Germany.
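
The measurement described in the abstract (read and write operations at varying thread counts, reported as operations per second) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `InMemoryKV` is a hypothetical in-memory stand-in, not the CAPI-Flash key-value API, and `run_benchmark` is not the authors' harness.

```python
import threading
import time


class InMemoryKV:
    """Hypothetical stand-in for a key-value store (NOT the CAPI-Flash API)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def __len__(self):
        return len(self._data)


def run_benchmark(store, num_threads, ops_per_thread, value_size=1024):
    """Run a write phase, then a read phase; return ops/s for each phase."""
    payload = b"x" * value_size

    def worker(op, tid):
        for i in range(ops_per_thread):
            key = f"{tid}:{i}"  # each thread works on its own key range
            if op == "write":
                store.put(key, payload)
            else:
                store.get(key)

    results = {}
    for op in ("write", "read"):
        threads = [
            threading.Thread(target=worker, args=(op, tid))
            for tid in range(num_threads)
        ]
        start = time.perf_counter()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        # Guard against a zero interval on very coarse clocks.
        elapsed = max(time.perf_counter() - start, 1e-9)
        results[op] = (num_threads * ops_per_thread) / elapsed
    return results


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        r = run_benchmark(InMemoryKV(), n, 10_000)
        print(f"threads={n} write={r['write']:.0f} ops/s read={r['read']:.0f} ops/s")
```

Swapping `InMemoryKV` for a wrapper around a real device keeps the timing loop unchanged, which is the point of separating the store from the harness in this sketch.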

arXiv.org e-Print Archive

Crossref

Persistência Poliglota - Diferentes Necessidades de Armazenamento de Dados (Polyglot Persistence: Different Data Storage Needs)

Author: Gomes Tiago Alexandre Nunes
Publication date: 18/12/2017

The increasing need to store large amounts of data in order to provide scalable services has forced a search for new solutions over time. Over the course of history, several data systems have emerged: hierarchical, network, relational, object-oriented, object-relational and, more recently, NoSQL. Each of these systems tried to respond to different realities of data storage, addressing the problems of its era. Because of the need to take advantage of the benefits that each type of database offers, the concept of Polyglot Persistence emerged, which denotes the integration of several types of databases in a single system. This approach aims to get the best out of each database, presenting a reliable alternative to systems with only one type of database. Accordingly, this work analyzes the Polyglot Persistence approach by comparing systems composed of several database management systems with those that use a single database engine, in order to verify whether the approach is useful and advantageous. To that end, a proof of concept was developed, based on a proposed problem, with the objective of analyzing two systems, one single-database and one polyglot, at three different data volumes. This required analyzing and choosing the database management systems to be used and setting up the test environments for both systems. Using a number of individual queries (against each database) and global queries (against the set of databases that make up the polyglot system), the results were analyzed using query time as the performance metric. The work and the results obtained showed an increase in performance for the individual use of the databases. For the set of databases as a whole, despite a slight increase, the results are not clear and need further investigation. Finally, it is possible to affirm that the polyglot approach is mainly useful in complex systems, where the volume of data is high and where different types of data are to be stored.
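
To make the idea of a polyglot system and a timed "global" query concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the proof of concept from the work: the abstract does not name the database engines used, so this example pairs an in-memory SQLite database with a plain dictionary acting as a key-value store, and the `PolyglotStore` class, its schema and the `timed` helper are all hypothetical.

```python
import sqlite3
import time


class PolyglotStore:
    """Hypothetical polyglot system: relational data in SQLite,
    session-style data in a dict standing in for a key-value store."""

    def __init__(self):
        self.relational = sqlite3.connect(":memory:")
        self.relational.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
        )
        self.key_value = {}

    def add_order(self, order_id, customer, total):
        self.relational.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_id, customer, total)
        )

    def set_session(self, customer, data):
        self.key_value[customer] = data

    def customer_overview(self, customer):
        """A 'global' query: combines results from both stores."""
        count, total = self.relational.execute(
            "SELECT COUNT(*), COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
            (customer,),
        ).fetchone()
        return {
            "orders": count,
            "total": total,
            "session": self.key_value.get(customer),
        }


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    store = PolyglotStore()
    store.add_order(1, "ana", 10.0)
    store.add_order(2, "ana", 5.5)
    store.set_session("ana", {"cart": [3]})
    overview, seconds = timed(store.customer_overview, "ana")
    print(overview, f"{seconds * 1e6:.1f} microseconds")
```

A comparison in the spirit of the abstract would repeat the same timed queries against a single-engine variant holding all the data and at several data volumes.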

Repositório Científico do Instituto Politécnico de Viseu