984 research outputs found

    Towards Scalable Real-time Analytics: An Architecture for Scale-out of OLxP Workloads

    Get PDF
    We present an overview of our work on the SAP HANA Scale-out Extension, a novel distributed database architecture designed to support large-scale analytics over real-time data. This platform permits high-performance OLAP with massive scale-out capabilities while concurrently allowing OLTP workloads. This dual capability enables analytics over real-time changing data and allows fine-grained, user-specified service-level agreements (SLAs) on data freshness. We advocate the decoupling of core database components such as query processing, concurrency control, and persistence, a design choice made possible by advances in high-throughput, low-latency networks and storage devices. We provide full ACID guarantees and build on a logical timestamp mechanism to provide MVCC-based snapshot isolation without requiring synchronous updates of replicas. Instead, we use asynchronous update propagation, guaranteeing consistency through timestamp validation. We provide a view into the design and development of a large-scale data management platform for real-time analytics, driven by the needs of modern enterprise customers.
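    The abstract gives no implementation details, but the core idea of MVCC snapshot isolation over logical timestamps, combined with timestamp-validated reads on asynchronously updated replicas, can be sketched in a few lines. The Python sketch below is illustrative only; the class and method names are invented for this example and are not SAP HANA internals.

```python
# Minimal sketch of MVCC snapshot visibility over logical commit timestamps.
# Names are illustrative, not SAP HANA internals.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: str
    commit_ts: int  # logical timestamp assigned at commit time

class VersionChain:
    """All committed versions of one record, newest first."""
    def __init__(self):
        self.versions: list[Version] = []

    def install(self, value: str, commit_ts: int) -> None:
        self.versions.insert(0, Version(value, commit_ts))

    def visible(self, snapshot_ts: int) -> Optional[str]:
        # A reader sees the newest version committed at or before its snapshot.
        for v in self.versions:
            if v.commit_ts <= snapshot_ts:
                return v.value
        return None

class Replica:
    """Replica updated asynchronously; applied_ts tracks replay progress."""
    def __init__(self):
        self.applied_ts = 0
        self.chains: dict[str, VersionChain] = {}

    def apply(self, key: str, value: str, commit_ts: int) -> None:
        self.chains.setdefault(key, VersionChain()).install(value, commit_ts)
        self.applied_ts = max(self.applied_ts, commit_ts)

    def read(self, key: str, snapshot_ts: int) -> Optional[str]:
        # Timestamp validation: refuse reads the replica cannot yet serve
        # consistently, instead of blocking writers with synchronous updates.
        if snapshot_ts > self.applied_ts:
            raise RuntimeError("replica too stale for this snapshot")
        chain = self.chains.get(key)
        return chain.visible(snapshot_ts) if chain else None

replica = Replica()
replica.apply("acct:42", "balance=100", commit_ts=10)
replica.apply("acct:42", "balance=80", commit_ts=20)
print(replica.read("acct:42", snapshot_ts=15))  # -> balance=100
print(replica.read("acct:42", snapshot_ts=20))  # -> balance=80
```

    A freshness SLA then maps naturally onto a minimum acceptable `applied_ts` for a given query, which is the kind of fine-grained guarantee the abstract describes.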

    SAP HANA Data Volume Management

    Full text link
    Today's information technology landscape is data-driven. The role of data is to empower business leaders to make decisions based on facts, trends, and statistics, and SAP is no exception. Many companies today run business suites such as SAP S/4HANA, SAP ERP, or SAP Business Warehouse, together with non-SAP applications, on HANA databases for faster processing. While HANA is an extremely powerful in-memory database, growing business data has an impact on the organization's overall performance and budget. This paper presents best practices for reducing the overall data footprint of HANA databases across three use cases: SAP Business Suite on HANA, SAP Business Warehouse, and native HANA databases.
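    A typical first step in the kind of footprint analysis the paper describes is finding out where the memory actually goes. The following is a minimal sketch, assuming the hdbcli Python client, placeholder connection details, and HANA's M_CS_TABLES monitoring view; treat it as an illustration rather than a prescribed procedure.

```python
# Sketch: list the largest column-store tables in a HANA system, a common
# starting point for data volume management. Assumes the hdbcli client
# (pip install hdbcli); connection parameters below are placeholders.
from hdbcli import dbapi

conn = dbapi.connect(address="hana.example.com", port=30015,
                     user="MONITORING_USER", password="...")
cur = conn.cursor()
cur.execute("""
    SELECT schema_name, table_name, record_count,
           ROUND(memory_size_in_total / 1024 / 1024, 1) AS size_mb
    FROM m_cs_tables
    ORDER BY memory_size_in_total DESC
    LIMIT 20
""")
for schema, table, rows, size_mb in cur.fetchall():
    print(f"{schema}.{table}: {rows} rows, {size_mb} MB in memory")
cur.close()
conn.close()
```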

    O Paradigma "Code Push-Down"

    Get PDF
    SAP is one of the biggest and most well-established ERP system providers. Like most ERP systems, its ERP systems and the surrounding ecosystem have been in constant evolution. Since the introduction of its SAP High-Speed Analytical Appliance (HANA) database, SAP has been pushing its clients towards its adoption. In 2018, SAP announced the end of support for its SAP ECC ERP in favor of the new SAP S/4 HANA ERP, which only supports SAP HANA. This end of support was to take place in 2025 but, at the request of customers, has since been extended to 2027. It means that a significant portion of SAP's clients are migrating to SAP S/4 HANA (+ SAP HANA) and, as recommended by SAP, will most likely also adopt the "Code Push-Down" development paradigm, which is based around pushing application logic down to the database tier/layer. Although this shift in development paradigms can supposedly bring significant performance gains, it may also have consequences for other qualities of the developed software. This work aims to analyze the "Code Push-Down" development paradigm, discover possible downsides and tradeoffs, and provide general guidelines on how to apply it so as to mitigate them, and perhaps, in meeting these objectives, to incentivize further work on this topic.
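    To make the paradigm concrete: instead of fetching raw rows into the application layer and aggregating them in a loop, code push-down expresses the computation in SQL so the database executes it next to the data. The sketch below uses Python with SQLite purely for illustration (the work's actual context is ABAP on SAP HANA); the table and column names are invented.

```python
# Illustration of "Code Push-Down": compute an aggregate in the database
# instead of looping over raw rows in the application layer.
# SQLite stands in for SAP HANA here; table/column names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EMEA", 120.0), ("EMEA", 80.0), ("APJ", 200.0)])

# Classic style: pull every row to the application and aggregate there.
totals = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    totals[region] = totals.get(region, 0.0) + amount

# Pushed down: the database computes the aggregate close to the data,
# so only the (small) result set crosses the application boundary.
pushed = dict(conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"))

assert totals == pushed
print(pushed)  # e.g. {'APJ': 200.0, 'EMEA': 200.0}
```

    The trade-off the work investigates follows directly from this shape: the pushed-down version is faster on large tables but moves testable business logic into the database layer, which can affect maintainability and portability.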

    In-Memory Databases

    Get PDF
    This thesis deals with in-memory databases and with the concepts developed to build such systems, since data in these databases is stored in main memory, which can process data several times faster but is at the same time a volatile storage medium. To lay the groundwork for these concepts, the thesis summarizes the development of database systems from their beginnings to the present. The first database types were hierarchical and network databases, which were replaced as early as the 1970s by the first relational databases, whose development continues to this day and which are currently represented mainly by OLTP and OLAP systems. Object, object-relational, and NoSQL databases are also covered, as are the spread of Big Data and the options for processing it. To explain how data is stored in main memory, the memory hierarchy is introduced, from processor registers through caches and main memory down to hard disks, together with information about the latency and volatility of these storage media. Next, the possible in-memory data layouts are covered: the row and column layouts are explained together with how each can be exploited for maximum data-processing performance. This section also describes compression techniques that make the most economical use of main-memory space. The following section presents the techniques that keep changes in these databases persistent even though the database runs on a volatile storage medium. Alongside traditional durability techniques, the concept of a differential buffer is introduced, into which all changes are written, and the process of merging data from this buffer with the data in the main store is described. The next section surveys existing in-memory databases such as SAP HANA and Oracle TimesTen, as well as hybrid systems that work primarily on disk but can also operate in memory, one example being SQLite. This section compares the individual systems, evaluates how far they use the concepts introduced in the preceding chapters, and concludes with a table summarizing information about these systems. The remaining parts of the thesis concern the performance testing of these databases. First, the test data, which originates from the DBLP database, is described, along with how it was obtained and transformed into a form usable for testing. The testing methodology is then described; it is divided into two parts. The first part compares the performance of a disk-based database with an in-memory database; for this purpose, SQLite and its option to run the database in memory were used. The second part compares the performance of the row and column data layouts in an in-memory database; for this purpose, SAP HANA, which can store data in both layouts, was used. The thesis concludes with an analysis of the results obtained from these tests.
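    The first part of the benchmark methodology (disk-based vs. in-memory SQLite) can be reproduced in outline with nothing but the Python standard library. The sketch below is a simplified stand-in for the thesis's actual tests: the schema, row count, and query are invented for illustration.

```python
# Simplified outline of the thesis's first benchmark: the same workload
# against an on-disk and an in-memory SQLite database. Schema, data
# volume, and query are illustrative only.
import os
import sqlite3
import tempfile
import time

def run_workload(conn: sqlite3.Connection, n_rows: int = 100_000) -> float:
    conn.execute("CREATE TABLE pubs (id INTEGER PRIMARY KEY, year INTEGER)")
    start = time.perf_counter()
    conn.executemany("INSERT INTO pubs (year) VALUES (?)",
                     ((1970 + i % 50,) for i in range(n_rows)))
    conn.commit()
    conn.execute("SELECT year, COUNT(*) FROM pubs GROUP BY year").fetchall()
    return time.perf_counter() - start

with tempfile.TemporaryDirectory() as tmp:
    disk = sqlite3.connect(os.path.join(tmp, "bench.db"))
    print(f"on disk:   {run_workload(disk):.3f} s")
    disk.close()

mem = sqlite3.connect(":memory:")  # same engine, volatile medium
print(f"in memory: {run_workload(mem):.3f} s")
mem.close()
```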

    SAP HANA Platform

    Get PDF
    This thesis discusses the in-memory database SAP HANA. It describes in detail the architecture and the new technologies this database employs. The next part compares the speed of inserting records into and selecting records from the database against the established relational database MaxDB. For the purposes of this testing, I created a simple application in the ABAP language that lets the user run the tests and displays their results. These are summarized in the last chapter and show SAP HANA to be clearly faster when selecting data, but comparable or slower when inserting data into the database. I see the contribution of my work in summarizing the significant changes that storing data in main memory brings, and in providing a clear comparison of the execution speed of the basic query types.

    Generative AI-Driven SAP HANA Analytics

    Get PDF
    Over the course of a year, a large organization running a complex information technology system such as SAP ERP typically receives hundreds of thousands of help-desk requests. These requests can be made over the phone or online, through Service Manager (SM) or the Service Desk. Enterprise resource planning (ERP) software is a form of business process management software that automates procedures relating to technology, services, and human resources through a network of interconnected applications. This research study proposes an intelligent system that provides user assistance for SAP ERP. Users obtain automatic responses to their support requests, which both reduces the time spent investigating and resolving issues and increases responsiveness to end users. The system uses machine learning methods for multiclass text classification to interpret queries efficiently, and retrieves supporting evidence through a customized framework that enables the most effective response. Conversational artificial intelligence capabilities allow the framework to build chatbots that let different groups of people work together simultaneously.
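    The multiclass text classification step can be sketched with a standard pipeline. The following example, assuming scikit-learn and a handful of invented help-desk tickets and categories, shows the general shape (TF-IDF features feeding a linear classifier); the study does not describe its actual model or training data in enough detail to reproduce.

```python
# Sketch of multiclass ticket classification for help-desk routing.
# Assumes scikit-learn; tickets and categories are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tickets = [
    "cannot log in to SAP GUI after password reset",
    "purchase order stuck in approval workflow",
    "HANA report times out when filtering by plant",
    "VPN drops every few minutes from home office",
]
labels = ["access", "workflow", "reporting", "network"]

# TF-IDF features into a linear classifier: a common, strong baseline
# for short support tickets.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(tickets, labels)

# With so little data, treat the output purely as a shape demo.
print(model.predict(["approval workflow is not moving my order"]))
```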

    Efficient Transaction Processing in SAP HANA Database: The End of a Column Store Myth

    Get PDF
    The SAP HANA database is the core of SAP's new data management platform. Its overall goal is to provide a generic yet powerful system for different query scenarios, both transactional and analytical, on the same data representation within a highly scalable execution environment. In this paper, we highlight the main features that differentiate the SAP HANA database from classical relational database engines. To this end, we first outline the general architecture and design criteria of the SAP HANA database. In a second step, we challenge the common belief that column-store data structures are only superior for analytical workloads and not well suited for transactional workloads. We outline the concept of record life cycle management, which uses different storage formats for the different stages of a record. We not only discuss the general concept but also dive into some of the details of how to efficiently propagate records through their life cycle and move database entries from write-optimized to read-optimized storage formats. In summary, the paper illustrates how the SAP HANA database is able to work efficiently in analytical as well as transactional workload environments.
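    The record life cycle idea, keeping recent writes in a write-optimized structure and periodically moving them into a read-optimized, sorted store, can be sketched as follows. This is a toy illustration under that general concept, not SAP HANA's actual implementation; structure and method names are invented.

```python
# Toy sketch of the record life cycle: an append-only, write-optimized
# delta absorbs inserts cheaply; a merge periodically folds it into a
# sorted, read-optimized main store. Not SAP HANA's actual implementation.
import bisect

class LifecycleColumn:
    def __init__(self):
        self.main: list[str] = []   # sorted, read-optimized stage
        self.delta: list[str] = []  # append-only, write-optimized stage

    def insert(self, value: str) -> None:
        # OLTP path: appending is O(1); no sort order is maintained here.
        self.delta.append(value)

    def merge(self) -> None:
        # Move records to the next life-cycle stage: rebuild the main
        # store as one sorted run so scans and lookups stay fast.
        self.main = sorted(self.main + self.delta)
        self.delta.clear()

    def contains(self, value: str) -> bool:
        # Read path: binary search in main, linear scan of the small delta.
        i = bisect.bisect_left(self.main, value)
        in_main = i < len(self.main) and self.main[i] == value
        return in_main or value in self.delta

col = LifecycleColumn()
for v in ["berlin", "walldorf", "paris"]:
    col.insert(v)
col.merge()          # records move to the read-optimized stage
col.insert("tokyo")  # lands in the fresh delta
print(col.contains("walldorf"), col.contains("tokyo"))  # True True
```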

    Resource-efficient processing of large data volumes

    Get PDF
    The complex system environment of data processing applications makes it very challenging to achieve high resource efficiency. In this thesis, we develop solutions that improve resource efficiency at multiple system levels by focusing on three scenarios that are relevant to, but not limited to, database management systems. First, we address the challenge of understanding complex systems by analyzing memory access characteristics via efficient memory tracing. Second, we leverage information about memory access characteristics to optimize the cache usage of algorithms and to avoid cache pollution by applying hardware-based cache partitioning. Third, after optimizing resource usage within a multicore processor, we optimize resource usage across multiple computer systems by addressing the problem of resource contention for bulk loading, i.e., ingesting large volumes of data into the system. We develop a distributed bulk loading mechanism, which utilizes network bandwidth and compute power more efficiently and improves both bulk loading throughput and query processing performance.
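    As a rough illustration of the distributed bulk loading idea (the names and structure here are invented, not the thesis's actual mechanism): input data is partitioned across nodes up front, so that network transfer and per-node ingest proceed in parallel instead of contending on a single loader.

```python
# Rough illustration of distributed bulk loading: hash-partition the
# input so each node ingests its own shard in parallel, instead of all
# data funneling through one loader. Invented names; not the thesis's
# actual mechanism.
from concurrent.futures import ThreadPoolExecutor
from hashlib import blake2b

NODES = ["node-0", "node-1", "node-2"]

def owner(key: str) -> int:
    # Stable hash partitioning: every producer routes a key the same way.
    digest = blake2b(key.encode(), digest_size=2).digest()
    return int.from_bytes(digest, "big") % len(NODES)

def load_shard(node: str, rows: list[str]) -> str:
    # Stand-in for the per-node ingest path (parse, compress, write).
    return f"{node}: loaded {len(rows)} rows"

rows = [f"record-{i}" for i in range(10_000)]
shards: dict[int, list[str]] = {i: [] for i in range(len(NODES))}
for r in rows:
    shards[owner(r)].append(r)

# Each node's bandwidth and CPU are used concurrently.
with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
    futures = [pool.submit(load_shard, NODES[i], shard)
               for i, shard in shards.items()]
    for f in futures:
        print(f.result())
```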

    Analysing data mining methods in sports analytics: a case study in NHL player salary prediction

    Get PDF
    Dissertation presented as a partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. The deployment of the Internet of Things has become a systematic phenomenon around the world, leading to exponential growth in data and data analysis practices. This growth is being seen within the sporting industry as new hardware and software are continuously being developed for home and professional use. Though there are several cases of effective data usage within elite sports, the notion remains that professional sporting organizations should expand their resources to fully seize the possibility of competitive advantage through effective data mining techniques. This project conducts a comprehensive analysis of extensive open-source NHL data, utilizing SAS's established SEMMA process. Through the SEMMA process, this project yields a predictive data-mining model designed to predict future player salaries. With player salaries in the NHL steadily increasing, reaching upwards of $10 million per year, a predictive model with an overall average error of $150,000 and a mean absolute error of $870,000 can grant teams unique knowledge which, if used effectively within the NHL, can lead to superior decision making. Though limitations remain due to unquantifiable variables linked to a player's psychology, as a whole, concrete deductions show that if data is effectively analyzed, sporting organizations have the power to leverage it to develop a competitive advantage. Our research concludes that organizations pushing towards an established data science department are increasing their odds of winning.
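    To ground the modelling step: the reported figures correspond to the mean absolute error of a regression model over player features. A minimal sketch of that setup, with invented features and synthetic data, and with scikit-learn standing in for the SAS/SEMMA tooling actually used, might look like this.

```python
# Minimal sketch of the salary-prediction setup: a regression model over
# player features, evaluated by mean absolute error. Features and data
# are invented; scikit-learn stands in for the SAS tooling used in the
# dissertation.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Invented features: goals, assists, time-on-ice, age.
X = np.column_stack([
    rng.poisson(15, n), rng.poisson(25, n),
    rng.normal(16, 3, n), rng.integers(19, 38, n),
])
# Invented salary relation (USD) with noise, just to make the demo run.
y = 800_000 + 90_000 * X[:, 0] + 60_000 * X[:, 1] + rng.normal(0, 400_000, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
mae = mean_absolute_error(y_te, model.predict(X_te))
print(f"MAE: ${mae:,.0f}")  # the dissertation reports ~$870,000 on real data
```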