
    Database Workload Management (Dagstuhl Seminar 12282)

    This report documents the program and the outcomes of Dagstuhl Seminar 12282 "Database Workload Management". The seminar was designed to provide a venue where researchers can engage in dialogue with industrial participants for an in-depth exploration of challenging industrial workloads, and where industrial participants can challenge researchers to apply the lessons learned from their large-scale experiments to multiple real systems. It was also meant to facilitate the release of real workloads that can be used to drive future research, along with concrete measures to evaluate and compare workload management techniques in the context of these workloads.

    Speedup Your Analytics: Automatic Parameter Tuning for Databases and Big Data Systems

    Database and big data analytics systems such as Hadoop and Spark have a large number of configuration parameters that control memory distribution, I/O optimization, parallelism, and compression. Improper parameter settings can cause significant performance degradation and stability issues. However, regular users and even expert administrators struggle to understand and tune them to achieve good performance. In this tutorial, we review existing approaches to automatic parameter tuning for databases, Hadoop, and Spark, which we classify into six categories: rule-based, cost modeling, simulation-based, experiment-driven, machine learning, and adaptive tuning. We describe the foundations of different automatic parameter tuning algorithms and present the pros and cons of each approach. We also highlight real-world applications and systems, and identify research challenges for handling cloud services, resource heterogeneity, and real-time analytics.

    Toward Self-Healing Multitier Services

    Are self-healing database-centric multitier services utopia or just a hard puzzle? We argue for the latter and aim to identify the missing pieces of this puzzle. We advocate robust and scalable learning-based approaches to self-healing that we expect to work well for a large class of multitier services. We identify performance-availability problems (PAPs) as the most relevant target for self-healing, and argue that PAPs are best addressed macroscopically, outside the realm of individual tiers. Finally, we lay out a research agenda for learning-based approaches to self-healing, to enable wider deployment of self-healing multitier services.

    Cumulon: Cloud-based statistical analysis from users perspective

    Cumulon is a system aimed at simplifying the development ...

    Adaptive query processing in the looking glass

    A great deal of work on adaptive query processing has been done over the last few years: adaptive query processing has been used to detect and correct optimizer errors due to incorrect statistics or simplified cost metrics; it has been applied to long-running continuous queries over data streams whose characteristics vary over time; and routing-based adaptive query processing does away with the optimizer altogether. Despite this large body of interrelated work, no unifying comparison of adaptive query processing techniques or systems has been attempted; we tackle this problem. We identify three families of systems (plan-based, CQ-based, and routing-based), and compare them in detail with respect to the most important aspects of adaptive query processing: plan quality, statistics monitoring and re-optimization, plan migration, and scalability. We also suggest two new approaches to adaptive query processing that address some of the shortcomings revealed by our in-depth analysis: (1) proactive re-optimization, where the optimizer chooses query plans with the expectation of re-optimization; and (2) plan logging, where optimizer decisions under different conditions are logged over time, enabling plan reuse as well as analysis of relevant statistics and benefits of adaptivity.