984 research outputs found

    Towards Scalable Real-time Analytics: An Architecture for Scale-out of OLxP Workloads

    Get PDF
    We present an overview of our work on the SAP HANA Scale-out Extension, a novel distributed database architecture designed to support large-scale analytics over real-time data. This platform permits high-performance OLAP with massive scale-out capabilities while concurrently allowing OLTP workloads. This dual capability enables analytics over real-time changing data and allows fine-grained, user-specified service-level agreements (SLAs) on data freshness. We advocate the decoupling of core database components such as query processing, concurrency control, and persistence, a design choice made possible by advances in high-throughput, low-latency networks and storage devices. We provide full ACID guarantees and build on a logical timestamp mechanism to provide MVCC-based snapshot isolation without requiring synchronous updates of replicas. Instead, we use asynchronous update propagation, guaranteeing consistency through timestamp validation. We provide a view into the design and development of a large-scale data management platform for real-time analytics, driven by the needs of modern enterprise customers.
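    The abstract gives no implementation details, but the core idea of MVCC snapshot isolation over logical timestamps, combined with timestamp-validated reads on asynchronously updated replicas, can be sketched in a few lines. The Python sketch below is illustrative only; the class and method names are invented for this example and are not SAP HANA internals.

```python
# Minimal sketch of MVCC snapshot visibility over logical commit timestamps.
# Names are illustrative, not SAP HANA internals.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: str
    commit_ts: int  # logical timestamp assigned at commit time

class VersionChain:
    """All committed versions of one record, newest first."""
    def __init__(self):
        self.versions: list[Version] = []

    def install(self, value: str, commit_ts: int) -> None:
        self.versions.insert(0, Version(value, commit_ts))

    def visible(self, snapshot_ts: int) -> Optional[str]:
        # A reader sees the newest version committed at or before its snapshot.
        for v in self.versions:
            if v.commit_ts <= snapshot_ts:
                return v.value
        return None

class Replica:
    """Replica updated asynchronously; applied_ts tracks replay progress."""
    def __init__(self):
        self.applied_ts = 0
        self.chains: dict[str, VersionChain] = {}

    def apply(self, key: str, value: str, commit_ts: int) -> None:
        self.chains.setdefault(key, VersionChain()).install(value, commit_ts)
        self.applied_ts = max(self.applied_ts, commit_ts)

    def read(self, key: str, snapshot_ts: int) -> Optional[str]:
        # Timestamp validation: refuse reads the replica cannot yet serve
        # consistently, instead of blocking writers with synchronous updates.
        if snapshot_ts > self.applied_ts:
            raise RuntimeError("replica too stale for this snapshot")
        chain = self.chains.get(key)
        return chain.visible(snapshot_ts) if chain else None

replica = Replica()
replica.apply("acct:42", "balance=100", commit_ts=10)
replica.apply("acct:42", "balance=80", commit_ts=20)
print(replica.read("acct:42", snapshot_ts=15))  # -> balance=100
print(replica.read("acct:42", snapshot_ts=20))  # -> balance=80
```

    A freshness SLA then maps naturally onto a minimum acceptable `applied_ts` for a given query, which is the kind of fine-grained guarantee the abstract describes.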

    SAP HANA Data Volume Management

    Full text link
    Today's information technology landscape is data-driven. The role of data is to empower business leaders to make decisions based on facts, trends, and statistics, and SAP is no exception. Many companies today run business suites such as SAP S/4HANA, SAP ERP, or SAP Business Warehouse, together with non-SAP applications, on HANA databases for faster processing. While HANA is an extremely powerful in-memory database, growing business data has an impact on the organization's overall performance and budget. This paper presents best practices for reducing the overall data footprint of HANA databases across three use cases: SAP Business Suite on HANA, SAP Business Warehouse, and native HANA databases.
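    A typical first step in the kind of footprint analysis the paper describes is finding out where the memory actually goes. The following is a minimal sketch, assuming the hdbcli Python client, placeholder connection details, and HANA's M_CS_TABLES monitoring view; treat it as an illustration rather than a prescribed procedure.

```python
# Sketch: list the largest column-store tables in a HANA system, a common
# starting point for data volume management. Assumes the hdbcli client
# (pip install hdbcli); connection parameters below are placeholders.
from hdbcli import dbapi

conn = dbapi.connect(address="hana.example.com", port=30015,
                     user="MONITORING_USER", password="...")
cur = conn.cursor()
cur.execute("""
    SELECT schema_name, table_name, record_count,
           ROUND(memory_size_in_total / 1024 / 1024, 1) AS size_mb
    FROM m_cs_tables
    ORDER BY memory_size_in_total DESC
    LIMIT 20
""")
for schema, table, rows, size_mb in cur.fetchall():
    print(f"{schema}.{table}: {rows} rows, {size_mb} MB in memory")
cur.close()
conn.close()
```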

    O Paradigma "Code Push-Down"

    Get PDF
    SAP is one of the biggest and most well-established ERP system providers. Like most ERP systems, its ERP systems and the surrounding ecosystem have been in constant evolution. Since the introduction of its SAP High-Speed Analytical Appliance (HANA) database, SAP has been pushing its clients towards its adoption. In 2018, SAP announced the end of support for its SAP ECC ERP in favor of the new SAP S/4 HANA ERP, which only supports SAP HANA. This end of support was to take place in 2025 but, at the request of customers, has since been extended to 2027. It means that a significant portion of SAP's clients are migrating to SAP S/4 HANA (+ SAP HANA) and, as recommended by SAP, will most likely also adopt the "Code Push-Down" development paradigm, which is based around pushing application logic down to the database tier/layer. Although this shift in development paradigms can supposedly bring significant performance gains, it may also have consequences for other qualities of the developed software. This work aims to analyze the "Code Push-Down" development paradigm, discover possible downsides and tradeoffs, and provide general guidelines on how to apply it so as to mitigate them, and perhaps, in meeting these objectives, to incentivize further work on this topic.
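    To make the paradigm concrete: instead of fetching raw rows into the application layer and aggregating them in a loop, code push-down expresses the computation in SQL so the database executes it next to the data. The sketch below uses Python with SQLite purely for illustration (the work's actual context is ABAP on SAP HANA); the table and column names are invented.

```python
# Illustration of "Code Push-Down": compute an aggregate in the database
# instead of looping over raw rows in the application layer.
# SQLite stands in for SAP HANA here; table/column names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EMEA", 120.0), ("EMEA", 80.0), ("APJ", 200.0)])

# Classic style: pull every row to the application and aggregate there.
totals = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    totals[region] = totals.get(region, 0.0) + amount

# Pushed down: the database computes the aggregate close to the data,
# so only the (small) result set crosses the application boundary.
pushed = dict(conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"))

assert totals == pushed
print(pushed)  # e.g. {'APJ': 200.0, 'EMEA': 200.0}
```

    The trade-off the work investigates follows directly from this shape: the pushed-down version is faster on large tables but moves testable business logic into the database layer, which can affect maintainability and portability.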

    In-Memory Databases

    Get PDF
    This thesis deals with in-memory databases and with the concepts developed to build such systems, since data in these databases is stored in main memory, which can process data several times faster but is at the same time a volatile storage medium. To lay the groundwork for these concepts, the thesis summarizes the development of database systems from their beginnings to the present. The first database types were hierarchical and network databases, which were replaced as early as the 1970s by the first relational databases, whose development continues to this day and which are currently represented mainly by OLTP and OLAP systems. Object, object-relational, and NoSQL databases are also covered, as are the spread of Big Data and the options for processing it. To explain how data is stored in main memory, the memory hierarchy is introduced, from processor registers through caches and main memory down to hard disks, together with information about the latency and volatility of these storage media. Next, the possible in-memory data layouts are covered: the row and column layouts are explained together with how each can be exploited for maximum data-processing performance. This section also describes compression techniques that make the most economical use of main-memory space. The following section presents the techniques that keep changes in these databases persistent even though the database runs on a volatile storage medium. Alongside traditional durability techniques, the concept of a differential buffer is introduced, into which all changes are written, and the process of merging data from this buffer with the data in the main store is described. The next section surveys existing in-memory databases such as SAP HANA and Oracle TimesTen, as well as hybrid systems that work primarily on disk but can also operate in memory, one example being SQLite. This section compares the individual systems, evaluates how far they use the concepts introduced in the preceding chapters, and concludes with a table summarizing information about these systems. The remaining parts of the thesis concern the performance testing of these databases. First, the test data, which originates from the DBLP database, is described, along with how it was obtained and transformed into a form usable for testing. The testing methodology is then described; it is divided into two parts. The first part compares the performance of a disk-based database with an in-memory database; for this purpose, SQLite and its option to run the database in memory were used. The second part compares the performance of the row and column data layouts in an in-memory database; for this purpose, SAP HANA, which can store data in both layouts, was used. The thesis concludes with an analysis of the results obtained from these tests.
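    The first part of the benchmark methodology (disk-based vs. in-memory SQLite) can be reproduced in outline with nothing but the Python standard library. The sketch below is a simplified stand-in for the thesis's actual tests: the schema, row count, and query are invented for illustration.

```python
# Simplified outline of the thesis's first benchmark: the same workload
# against an on-disk and an in-memory SQLite database. Schema, data
# volume, and query are illustrative only.
import os
import sqlite3
import tempfile
import time

def run_workload(conn: sqlite3.Connection, n_rows: int = 100_000) -> float:
    conn.execute("CREATE TABLE pubs (id INTEGER PRIMARY KEY, year INTEGER)")
    start = time.perf_counter()
    conn.executemany("INSERT INTO pubs (year) VALUES (?)",
                     ((1970 + i % 50,) for i in range(n_rows)))
    conn.commit()
    conn.execute("SELECT year, COUNT(*) FROM pubs GROUP BY year").fetchall()
    return time.perf_counter() - start

with tempfile.TemporaryDirectory() as tmp:
    disk = sqlite3.connect(os.path.join(tmp, "bench.db"))
    print(f"on disk:   {run_workload(disk):.3f} s")
    disk.close()

mem = sqlite3.connect(":memory:")  # same engine, volatile medium
print(f"in memory: {run_workload(mem):.3f} s")
mem.close()
```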

    SAP HANA Platform

    Get PDF
    This thesis discusses the in-memory database SAP HANA. It describes in detail the architecture and the new technologies this database employs. The next part compares the speed of inserting records into and selecting records from the database against the established relational database MaxDB. For the purposes of this testing, I created a simple application in the ABAP language that lets the user run the tests and displays their results. These are summarized in the last chapter and show SAP HANA to be clearly faster when selecting data, but comparable or slower when inserting data into the database. I see the contribution of my work in summarizing the significant changes that storing data in main memory brings, and in providing a clear comparison of the execution speed of the basic query types.

    Generative AI-Driven SAP HANA Analytics

    Get PDF
    Over the course of a year, a large organization running a complex information technology system such as SAP ERP typically receives hundreds of thousands of help-desk requests. These requests can be made over the phone or online, through Service Manager (SM) or the Service Desk. Enterprise resource planning (ERP) software is a form of business process management software that automates procedures relating to technology, services, and human resources through a network of interconnected applications. This research study proposes an intelligent system that provides user assistance for SAP ERP. Users obtain automatic responses to their support requests, which both reduces the time spent investigating and resolving issues and increases responsiveness to end users. The system uses machine learning methods for multiclass text classification to interpret queries efficiently, and retrieves supporting evidence through a customized framework that enables the most effective response. Conversational artificial intelligence capabilities allow the framework to build chatbots that let different groups of people work together simultaneously.
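    The multiclass text classification step can be sketched with a standard pipeline. The following example, assuming scikit-learn and a handful of invented help-desk tickets and categories, shows the general shape (TF-IDF features feeding a linear classifier); the study does not describe its actual model or training data in enough detail to reproduce.

```python
# Sketch of multiclass ticket classification for help-desk routing.
# Assumes scikit-learn; tickets and categories are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tickets = [
    "cannot log in to SAP GUI after password reset",
    "purchase order stuck in approval workflow",
    "HANA report times out when filtering by plant",
    "VPN drops every few minutes from home office",
]
labels = ["access", "workflow", "reporting", "network"]

# TF-IDF features into a linear classifier: a common, strong baseline
# for short support tickets.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(tickets, labels)

# With so little data, treat the output purely as a shape demo.
print(model.predict(["approval workflow is not moving my order"]))
```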

    Efficient Transaction Processing in SAP HANA Database: The End of a Column Store Myth

    Get PDF
    The SAP HANA database is the core of SAP's new data management platform. Its overall goal is to provide a generic yet powerful system for different query scenarios, both transactional and analytical, on the same data representation within a highly scalable execution environment. In this paper, we highlight the main features that differentiate the SAP HANA database from classical relational database engines. To this end, we first outline the general architecture and design criteria of the SAP HANA database. In a second step, we challenge the common belief that column-store data structures are only superior for analytical workloads and not well suited for transactional workloads. We outline the concept of record life cycle management, which uses different storage formats for the different stages of a record. We not only discuss the general concept but also dive into some of the details of how to efficiently propagate records through their life cycle and move database entries from write-optimized to read-optimized storage formats. In summary, the paper illustrates how the SAP HANA database is able to work efficiently in analytical as well as transactional workload environments.
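    The record life cycle idea, keeping recent writes in a write-optimized structure and periodically moving them into a read-optimized, sorted store, can be sketched as follows. This is a toy illustration under that general concept, not SAP HANA's actual implementation; structure and method names are invented.

```python
# Toy sketch of the record life cycle: an append-only, write-optimized
# delta absorbs inserts cheaply; a merge periodically folds it into a
# sorted, read-optimized main store. Not SAP HANA's actual implementation.
import bisect

class LifecycleColumn:
    def __init__(self):
        self.main: list[str] = []   # sorted, read-optimized stage
        self.delta: list[str] = []  # append-only, write-optimized stage

    def insert(self, value: str) -> None:
        # OLTP path: appending is O(1); no sort order is maintained here.
        self.delta.append(value)

    def merge(self) -> None:
        # Move records to the next life-cycle stage: rebuild the main
        # store as one sorted run so scans and lookups stay fast.
        self.main = sorted(self.main + self.delta)
        self.delta.clear()

    def contains(self, value: str) -> bool:
        # Read path: binary search in main, linear scan of the small delta.
        i = bisect.bisect_left(self.main, value)
        in_main = i < len(self.main) and self.main[i] == value
        return in_main or value in self.delta

col = LifecycleColumn()
for v in ["berlin", "walldorf", "paris"]:
    col.insert(v)
col.merge()          # records move to the read-optimized stage
col.insert("tokyo")  # lands in the fresh delta
print(col.contains("walldorf"), col.contains("tokyo"))  # True True
```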

    Resource-efficient processing of large data volumes

    Get PDF
    The complex system environment of data processing applications makes it very challenging to achieve high resource efficiency. In this thesis, we develop solutions that improve resource efficiency at multiple system levels by focusing on three scenarios that are relevant to, but not limited to, database management systems. First, we address the challenge of understanding complex systems by analyzing memory access characteristics via efficient memory tracing. Second, we leverage information about memory access characteristics to optimize the cache usage of algorithms and to avoid cache pollution by applying hardware-based cache partitioning. Third, after optimizing resource usage within a multicore processor, we optimize resource usage across multiple computer systems by addressing the problem of resource contention for bulk loading, i.e., ingesting large volumes of data into the system. We develop a distributed bulk loading mechanism, which utilizes network bandwidth and compute power more efficiently and improves both bulk loading throughput and query processing performance.
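    As a rough illustration of the distributed bulk loading idea (the names and structure here are invented, not the thesis's actual mechanism): input data is partitioned across nodes up front, so that network transfer and per-node ingest proceed in parallel instead of contending on a single loader.

```python
# Rough illustration of distributed bulk loading: hash-partition the
# input so each node ingests its own shard in parallel, instead of all
# data funneling through one loader. Invented names; not the thesis's
# actual mechanism.
from concurrent.futures import ThreadPoolExecutor
from hashlib import blake2b

NODES = ["node-0", "node-1", "node-2"]

def owner(key: str) -> int:
    # Stable hash partitioning: every producer routes a key the same way.
    digest = blake2b(key.encode(), digest_size=2).digest()
    return int.from_bytes(digest, "big") % len(NODES)

def load_shard(node: str, rows: list[str]) -> str:
    # Stand-in for the per-node ingest path (parse, compress, write).
    return f"{node}: loaded {len(rows)} rows"

rows = [f"record-{i}" for i in range(10_000)]
shards: dict[int, list[str]] = {i: [] for i in range(len(NODES))}
for r in rows:
    shards[owner(r)].append(r)

# Each node's bandwidth and CPU are used concurrently.
with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
    futures = [pool.submit(load_shard, NODES[i], shard)
               for i, shard in shards.items()]
    for f in futures:
        print(f.result())
```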

    Analysing data mining methods in sports analytics: a case study in NHL player salary prediction

    Get PDF
    Dissertation presented as a partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. The deployment of the Internet of Things has become a systematic phenomenon around the world, leading to exponential growth in data and data analysis practices. This growth is being seen within the sporting industry as new hardware and software are continuously being developed for home and professional use. Though there are several cases of effective data usage within elite sports, the notion remains that professional sporting organizations should expand their resources to fully seize the possibility of competitive advantage through effective data mining techniques. This project conducts a comprehensive analysis of extensive open-source NHL data, utilizing SAS's established SEMMA process. Through the SEMMA process, this project yields a predictive data-mining model designed to predict future player salaries. With player salaries in the NHL steadily increasing, reaching upwards of $10 million per year, a predictive model with an overall average error of $150,000 and a mean absolute error of $870,000 can grant teams unique knowledge which, if used effectively within the NHL, can lead to superior decision making. Though limitations remain due to unquantifiable variables linked to a player's psychology, as a whole, concrete deductions show that if data is effectively analyzed, sporting organizations have the power to leverage it to develop a competitive advantage. Our research concludes that organizations pushing towards an established data science department are increasing their odds of winning.
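    To ground the modelling step: the reported figures correspond to the mean absolute error of a regression model over player features. A minimal sketch of that setup, with invented features and synthetic data, and with scikit-learn standing in for the SAS/SEMMA tooling actually used, might look like this.

```python
# Minimal sketch of the salary-prediction setup: a regression model over
# player features, evaluated by mean absolute error. Features and data
# are invented; scikit-learn stands in for the SAS tooling used in the
# dissertation.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Invented features: goals, assists, time-on-ice, age.
X = np.column_stack([
    rng.poisson(15, n), rng.poisson(25, n),
    rng.normal(16, 3, n), rng.integers(19, 38, n),
])
# Invented salary relation (USD) with noise, just to make the demo run.
y = 800_000 + 90_000 * X[:, 0] + 60_000 * X[:, 1] + rng.normal(0, 400_000, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
mae = mean_absolute_error(y_te, model.predict(X_te))
print(f"MAE: ${mae:,.0f}")  # the dissertation reports ~$870,000 on real data
```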