11 research outputs found

    Effizienz in Cluster-Datenbanksystemen - Dynamische und Arbeitslastberücksichtigende Skalierung und Allokation (Efficiency in Cluster Database Systems: Dynamic and Workload-Aware Scaling and Allocation)

    Get PDF
    Database systems have long been vital to all forms of data processing. In recent years, the amount of processed data has been growing dramatically, even in small projects. Nevertheless, database management systems tend to be static in terms of size and performance, which makes scaling a difficult and expensive task. Because of performance and especially cost advantages, more and more installed systems have a shared-nothing cluster architecture. Due to the massive parallelism of the hardware, programming paradigms from high-performance computing are being carried over into data processing. Database research struggles to keep up with this trend. A key feature of traditional database systems is transparent access to the stored data. This introduces data dependencies and increases system complexity and inter-process communication. Therefore, many developers trade this feature for better scalability. However, explicitly managing the data distribution and data flow requires a deep understanding of the distributed system and reduces the possibilities for automatic and autonomic optimization. In this thesis we present an approach to database system scaling and allocation that achieves good scalability while keeping the data distribution transparent.

    The first part of this thesis analyzes the challenges and opportunities for self-scaling database management systems in cluster environments. Scalability is a major concern of Internet-based applications. Access peaks that overload the application are a financial risk. Therefore, systems are usually configured to be able to process peaks at any given moment. As a result, server systems often have a very low utilization. In distributed systems, efficiency can be increased by adapting the number of nodes to the current workload. We propose a processing model and an architecture that allow efficient self-scaling of cluster database systems.

    In the second part we consider different allocation approaches. To increase efficiency, we present a workload-aware, query-centric model. The approach is formalized; optimal and heuristic algorithms are presented. The algorithms optimize the data distribution for local query execution and balance the workload according to the query history. We present different query classification schemes for different forms of partitioning. The approach is evaluated for OLTP- and OLAP-style workloads. It is shown that variants of the approach scale well for both fields of application.

    The third part of the thesis considers benchmarks for large, adaptive systems. First, we present a data generator for cloud-sized applications. Due to its architecture, the data generator can easily be extended and configured. A key feature is its high degree of parallelism, which makes linear speedup possible for arbitrary numbers of nodes. To simulate systems with user interaction, we analyzed a production online e-learning management system. Based on our findings, we present a model for workload generation that considers the temporal dependency of user interaction.
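    The abstract names a workload-aware, query-centric allocation with optimal and heuristic variants. As a minimal sketch of what such a heuristic could look like, the following Python fragment greedily places the partitions touched by heavy queries on one node (to favor local execution) while balancing load derived from the query history. The data model (queries as weighted sets of partitions), the function names, and the greedy strategy are illustrative assumptions, not the algorithms of the thesis.

```python
def allocate(queries, num_nodes):
    """Greedy, workload-aware allocation sketch.

    queries: list of (partition_set, weight) pairs derived from the
    query history; weight approximates the query's share of the load.
    Returns a mapping partition -> node id.

    Strategy (an assumption, not the thesis's algorithm): process
    queries by descending weight and place all not-yet-assigned
    partitions of a query on the least-loaded candidate node, so that
    heavy queries tend to run locally on a single node.
    """
    load = [0.0] * num_nodes
    assignment = {}
    for partitions, weight in sorted(queries, key=lambda q: -q[1]):
        # Prefer a node that already holds some of this query's data.
        holders = {assignment[p] for p in partitions if p in assignment}
        if holders:
            target = min(holders, key=lambda n: load[n])
        else:
            target = min(range(num_nodes), key=lambda n: load[n])
        for p in partitions:
            assignment.setdefault(p, target)  # keep earlier placements
        load[target] += weight
    return assignment

if __name__ == "__main__":
    # Toy query history: partition sets with observed frequencies.
    history = [({"A", "B"}, 5.0), ({"C"}, 3.0), ({"A", "C"}, 2.0), ({"D"}, 1.0)]
    print(allocate(history, num_nodes=2))
```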
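    Linear speedup for arbitrary numbers of nodes, as claimed for the data generator, is commonly obtained by making each row computable independently of all others, so that workers can generate disjoint slices without any coordination. The sketch below illustrates that general technique under this assumption; the names and value domains are made up, and this is not the generator's actual design.

```python
import random

def gen_row(row_id, seed=42):
    """Derive row `row_id` from a PRNG seeded only by (seed, row_id),
    so any worker can produce any row without coordination or shared
    state. Columns and value domains are invented for the example."""
    rng = random.Random((seed << 64) | row_id)
    return (row_id, rng.choice(["gold", "silver", "bronze"]), rng.randint(18, 90))

def gen_range(first, last, seed=42):
    """The slice of the table that one worker would generate."""
    return [gen_row(i, seed) for i in range(first, last)]

if __name__ == "__main__":
    # Two 'workers' producing disjoint halves yield exactly the same
    # rows as one worker producing the whole range: repeatable and
    # embarrassingly parallel.
    assert gen_range(0, 4) + gen_range(4, 8) == gen_range(0, 8)
    print(gen_range(0, 3))
```

    Because gen_row depends only on (seed, row_id), adding workers merely splits the row range without changing the generated data, which is what permits linear scaling.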

    IDEAS-1997-2021-Final-Programs

    Get PDF
    This document records the final program of each of the 26 meetings of the International Database Engineering and Applications Symposium (IDEAS) held from 1997 through 2021. These meetings were organized in various locations on three continents. Most of the papers published during these years are in the digital libraries of the IEEE (1997-2007) or the ACM (2008-2021).

    A fragmented data-declustering strategy for high skew tolerance and efficient failure recovery

    No full text

    Safety and Reliability - Safe Societies in a Changing World

    Get PDF
    The contributions cover a wide range of methodologies and application areas for safety and reliability that contribute to safe societies in a changing world. These methodologies and applications include:
    - foundations of risk and reliability assessment and management
    - mathematical methods in reliability and safety
    - risk assessment
    - risk management
    - system reliability
    - uncertainty analysis
    - digitalization and big data
    - prognostics and system health management
    - occupational safety
    - accident and incident modeling
    - maintenance modeling and applications
    - simulation for safety and reliability analysis
    - dynamic risk and barrier management
    - organizational factors and safety culture
    - human factors and human reliability
    - resilience engineering
    - structural reliability
    - natural hazards
    - security
    - economic analysis in risk management

    Development and application of chiral analytical methods for metabolic profiling

    No full text
    Metabonomics utilises high-resolution analytical platforms to generate spectroscopic profiles that are rich in latent biological information. At present, the two principal analytical platforms used routinely in metabonomic studies are nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS). While these analytical methods, in their current state of development in the field of metabonomics, give broad coverage across multiple chemical classes, none report sufficiently well, if at all, on the absolute configuration of chiral molecules. As a consequence of the stereospecificity present in biological systems, a very large number of biologically active molecules exhibit chirality. Therefore, an important aspect of the metabolome remains largely unexplored by current metabonomic analytical platforms. Enantioselective methods exist, but they are largely unsuitable for multi-analyte measurements. The work in this thesis addresses the need for appropriate chiral analytical methods for metabolic profiling. The aim was to develop a practical, fit-for-purpose chiral analytical method that reports simultaneously on numerous endogenous metabolites by untargeted or widely targeted analysis and, importantly, can be aligned with existing analytical workflows. Initial evaluation of an NMR spectroscopic approach using chiral solvating agents concluded that additional spectral complexity and interaction-induced chemical shift instability of the internal reference compound would prohibit efficient chiral profiling in a metabonomics workflow. An existing targeted MS-based assay for the enantioselective separation of amino acids was selected for optimisation and further development. Focused analyses were performed to characterise its performance in differentiating individual pairs of enantiomers, and the assay was subsequently expanded, first to cover the panel of proteinogenic amino acids, and then to other detectable metabolites. Each stage of assay optimisation incorporated an evaluation on a representative set of samples and was able to provide highly relevant biological information that would not be accessible using current achiral metabonomic methods. Further work to expand the number of detectable metabolites and the quantitative nature of the assay is discussed.

    Ciguatoxins

    Get PDF
    Ciguatoxins (CTXs), which are responsible for Ciguatera fish poisoning (CFP), are liposoluble toxins produced by microalgae of the genera Gambierdiscus and Fukuyoa. This book presents 18 scientific papers that offer new information and scientific evidence on: (i) CTX occurrence in aquatic environments, with an emphasis on edible aquatic organisms; (ii) analysis methods for the determination of CTXs; (iii) advances in research on CTX-producing organisms; (iv) environmental factors involved in the presence of CTXs; and (v) the assessment of public health risks related to the presence of CTXs, as well as risk management and mitigation strategies.

    Critical Thinking Skills Profile of High School Students In Learning Science-Physics

    Get PDF
    This study aims to describe the critical thinking skills of high school students in the city of Makassar. To achieve this goal, the researchers analyzed the test results of 200 students across six schools in the city of Makassar. A quantitative descriptive analysis of the data found that the students' average scores for interpretation, analysis, and inference were 1.53, 1.15, and 1.52, respectively. These scores are very low compared with the maximum attainable score of 10.00, which shows that the critical thinking skills of the high school students are still very low. One of the competency standards of the science-physics subject is to demonstrate the ability to think logically, critically, and creatively with the guidance of teachers, and to demonstrate the ability to solve simple problems in daily life. Michael Scriven has argued that a main task of education is to train students to think critically, because the demands of work in the global economy, the survival of democracy, and personal decisions in an increasingly complex society require people who can think well and make good judgments. Therefore, teachers need to design their lesson scenarios around elements such as a driving question or problem and authentic investigation of science processes.