158 research outputs found
LIPIcs, Volume 261, ICALP 2023, Complete Volume
Do EU and U.K. Antitrust “Bite”?: A Hard Look at “Soft” Enforcement and Negotiated Penalty Settlements
EU and U.K. antitrust are contingent upon rigorous enforcement and the imposition of sanctions. Hard enforcement is key; antitrust loses its effect when it does not “bite.” Soft instruments (non-adversarial, informal) and negotiated penalty settlements may be used, but authorities are expected to exercise self-restraint. This article reveals that despite the prevalence of hard-enforcement rhetoric, the vast majority of actions taken by the European Commission (1958–2021) and the German, Dutch, and U.K. antitrust authorities (2004–2021) were not fully adversarial. The hard-enforcement actions, moreover, were confined to limited practices and sectors. Despite the prominence of non-fully-adversarial instruments in Europe, and in striking contrast to the United States, only limited attention has been devoted to their existence and implications. Urging a hard look at soft enforcement and negotiated penalty settlements, the article systematically records the enforcement instruments and their particularities, questions their effectiveness, and calls for aligning enforcement theory with practice.
Online Schema Evolution is (Almost) Free for Snapshot Databases
Modern database applications often change their schemas to keep up with the changing requirements. However, support for online and transactional schema evolution remains challenging in existing database systems. Specifically, prior work often takes ad hoc approaches to schema evolution with 'patches' applied to existing systems, leading to many corner cases and often incomplete functionality. Applications therefore often have to carefully schedule downtimes for schema changes, sacrificing availability.
This paper presents Tesseract, a new approach to online and transactional schema evolution without the aforementioned drawbacks. We design Tesseract based on a key observation: in widely used multi-versioned database systems, schema evolution can be modeled as data modification operations that change the entire table, i.e., data-definition-as-modification (DDaM). This allows us to support schema evolution almost 'for free' by leveraging the concurrency control protocol. With simple tweaks to existing snapshot isolation protocols, we show on a 40-core server that, under a variety of workloads, Tesseract is able to provide online, transactional schema evolution without service downtime and retain high application performance while schema evolution is in progress.
Comment: To appear in the Proceedings of the 2023 International Conference on Very Large Data Bases (VLDB 2023).
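To make the data-definition-as-modification idea concrete, here is a minimal Python sketch of a multi-versioned table under snapshot isolation in which an add-column schema change is expressed as an ordinary whole-table write; the class and function names are illustrative assumptions, not Tesseract's actual implementation.

```python
import itertools

_ts = itertools.count(1)          # global commit-timestamp counter

class VersionedTable:
    def __init__(self):
        self.versions = {}        # key -> list of (commit_ts, row_dict)

    def write(self, key, row, commit_ts):
        self.versions.setdefault(key, []).append((commit_ts, row))

    def read(self, key, snapshot_ts):
        # Snapshot isolation read: newest version committed at or before the snapshot.
        chain = [(ts, r) for ts, r in self.versions.get(key, []) if ts <= snapshot_ts]
        return max(chain)[1] if chain else None

def add_column(table, name, default):
    # DDaM: the schema change is just one multi-versioned write over the whole table.
    commit_ts = next(_ts)
    for key, chain in table.versions.items():
        latest = max(chain)[1]
        table.write(key, {**latest, name: default}, commit_ts)
    return commit_ts

t = VersionedTable()
t.write("u1", {"id": 1, "name": "alice"}, next(_ts))   # commits at ts 1
old_snapshot = next(_ts)                               # reader starts at ts 2
new_snapshot = add_column(t, "email", None)            # schema change commits at ts 3
print(t.read("u1", old_snapshot))   # {'id': 1, 'name': 'alice'}
print(t.read("u1", new_snapshot))   # {'id': 1, 'name': 'alice', 'email': None}
```

Because the schema change is just another committed version, readers on older snapshots keep seeing the old table shape, which is why no downtime is needed.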
Adaptive Management of Multimodel Data and Heterogeneous Workloads
Data management systems are facing a growing demand for a tighter integration of heterogeneous data from different applications and sources for both operational and analytical purposes in real time. However, the vast diversification of the data management landscape has led to a situation where there is a trade-off between high operational performance and a tight integration of data. The difference between the growth of data volume and the growth of computational power demands a new approach for managing multimodel data and handling heterogeneous workloads.
With PolyDBMS we present a novel class of database management systems, bridging the gap between multimodel database and polystore systems. This new kind of database system combines the operational capabilities of traditional database systems with the flexibility of polystore systems. This includes support for data modifications, transactions, and schema changes at runtime. With native support for multiple data models and query languages, a PolyDBMS presents a holistic solution for the management of heterogeneous data. This not only enables a tight integration of data across different applications but also allows a more efficient usage of resources. By leveraging and combining highly optimized database systems as storage and execution engines, this novel class of database systems takes advantage of decades of database systems research and development.
In this thesis, we present the conceptual foundations and models for building a PolyDBMS. This includes a holistic model for maintaining and querying multiple data models in one logical schema that enables cross-model queries. With the PolyAlgebra, we present a solution for representing queries based on one or multiple data models while preserving their semantics. Furthermore, we introduce a concept for the adaptive planning and decomposition of queries across heterogeneous database systems with different capabilities and features.
The conceptual contributions presented in this thesis materialize in Polypheny-DB, the first implementation of a PolyDBMS. Supporting the relational, document, and labeled property graph data models, Polypheny-DB is a suitable solution for structured, semi-structured, and unstructured data. This is complemented by an extensive type system that includes support for binary large objects. With support for multiple query languages, industry standard query interfaces, and a rich set of domain-specific data stores and data sources, Polypheny-DB offers a flexibility unmatched by existing data management solutions.
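As an illustration of the kind of cross-model query decomposition described above, the following Python sketch registers a relational engine and a document engine under one logical schema and routes a trivial cross-model join to both. The names and interfaces are assumptions made for illustration and do not reflect Polypheny-DB's actual API.

```python
class RelationalEngine:
    def __init__(self, rows): self.rows = rows
    def query(self, predicate): return [r for r in self.rows if predicate(r)]

class DocumentEngine:
    def __init__(self, docs): self.docs = docs
    def find(self, match):
        return [d for d in self.docs if all(d.get(k) == v for k, v in match.items())]

class LogicalSchema:
    """Maps logical entity names to their underlying engines so one query can span models."""
    def __init__(self): self.entities = {}
    def register(self, name, engine): self.entities[name] = engine

def cross_model_join(schema, rel_entity, doc_entity, key):
    # Decompose the request: scan the relational side, then probe the document side per key.
    rows = schema.entities[rel_entity].query(lambda r: True)
    return [(r, schema.entities[doc_entity].find({key: r[key]})) for r in rows]

schema = LogicalSchema()
schema.register("customer", RelationalEngine([{"cid": 1, "name": "alice"}]))
schema.register("orders", DocumentEngine([{"cid": 1, "item": "book"}, {"cid": 2, "item": "pen"}]))
print(cross_model_join(schema, "customer", "orders", "cid"))
```

A real PolyDBMS would, of course, plan and optimize such decompositions rather than hard-code them, but the sketch shows why a single logical schema over heterogeneous engines makes cross-model queries expressible at all.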
Foundations of Software Science and Computation Structures
This open access book constitutes the proceedings of the 25th International Conference on Foundations of Software Science and Computation Structures, FOSSACS 2022, which was held during April 4-6, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 23 regular papers presented in this volume were carefully reviewed and selected from 77 submissions. They deal with research on theories and methods to support the analysis, integration, synthesis, transformation, and verification of programs and software systems.
Universal Database System Analysis for Insight and Adaptivity
Database systems are ubiquitous; they serve as the cornerstone of modern application infrastructure due to their efficient data access and storage. Database systems are commonly deployed in a wide range of environments, from transaction processing to analytics.
Unfortunately, this broad support comes with a trade-off in system complexity. Database systems contain many components and features that must work together to meet client demand. Administrators responsible for maintaining database systems face a daunting task: they must determine the access characteristics of the client workload they are serving and tailor the system to optimize for it. Complicating matters, client workloads are known to shift in access patterns and load. Thus, administrators continuously perform this optimization task, refining system design and configuration to meet ever-changing client request patterns.
Researchers have focused on creating next-generation, natively adaptive database systems to address this administrator burden. Natively adaptive database systems construct client-request models, determine workload characteristics, and tailor processing strategies to optimize accordingly. These systems continuously refine their models, ensuring they are responsive to workload shifts. While these new systems show promise in adapting system behaviour to their environment, existing, popularly used database systems lack these adaptive capabilities. Porting the ideas in these new adaptive systems to existing infrastructure requires monumental engineering effort, slowing their adoption and leaving users stranded with their existing, non-adaptive database systems.
In this thesis, I present Dendrite, a framework that easily "bolts on" to existing database systems to endow them with adaptive capabilities. Dendrite captures database system behaviour in a system-agnostic fashion, ensuring that its techniques are generalizable. It compares captured behaviour to determine how system behaviour changes over time and with respect to idealized system performance. These differences are matched against configurable adaptation rules, which deploy user-defined functions to remedy performance problems. As such, Dendrite can deploy whatever adaptations are necessary to address a behaviour shift and tailor the system to the workload at hand. Dendrite has low tracking overhead, making it practical for intensive database system deployments.
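The adaptation loop described above can be pictured with a short Python sketch: captured metrics are compared against an idealized baseline, and configurable rules invoke user-defined remediation functions when behaviour drifts. All names here are illustrative assumptions rather than Dendrite's real interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class AdaptationRule:
    name: str
    metric: str                                   # which captured metric the rule watches
    threshold: float                              # fire when drift from baseline exceeds this
    action: Callable[[Dict[str, float]], None]    # user-defined remediation function

class Monitor:
    def __init__(self, baseline: Dict[str, float], rules: List[AdaptationRule]):
        self.baseline, self.rules = baseline, rules

    def observe(self, metrics: Dict[str, float]) -> None:
        # Compare captured behaviour against the idealized baseline and fire matching rules.
        for rule in self.rules:
            drift = metrics.get(rule.metric, 0.0) - self.baseline.get(rule.metric, 0.0)
            if drift > rule.threshold:
                rule.action(metrics)

def add_index(metrics):   # hypothetical user-defined remediation
    print("p99 latency drifted; invoking index advisor:", metrics)

monitor = Monitor(baseline={"p99_ms": 20.0},
                  rules=[AdaptationRule("latency-drift", "p99_ms", 15.0, add_index)])
monitor.observe({"p99_ms": 42.0})   # drift of 22 ms exceeds the 15 ms threshold, so the rule fires
```

Because the rules only see generic metric names and call opaque user functions, the same loop can be bolted onto any database system that exposes its behaviour in this form.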
Yavaa: supporting data workflows from discovery to visualization
Recent years have witnessed an increasing number of data silos being opened up both within organizations and to the general public: Scientists publish their raw data as supplements to articles or even standalone artifacts to enable others to verify and extend their work. Governments pass laws to open up formerly protected data treasures to improve accountability and transparency as well as to enable new business ideas based on this public good. Even companies share structured information about their products and services to advertise their use and thus increase revenue. Exploiting this wealth of information holds many challenges for users, though. Oftentimes data is provided as tables whose seemingly endless rows of daunting numbers are barely accessible. Information visualization (InfoVis) can mitigate this gap. However, the visualization options offered are generally very limited, and next to no support is given in applying any of them. The same holds true for data wrangling: only very few options exist to adjust the data to the current needs, and barely any protection is in place to prevent even the most obvious mistakes. When it comes to data from multiple providers, the situation gets even bleaker. Only recently have tools emerged that reasonably search for datasets across institutional borders. Easy-to-use ways to combine these datasets are still missing, though. Finally, results generally lack proper documentation of their provenance. So even the most compelling visualizations can be called into question when how they came about remains unclear. The foundations for a vivid exchange and exploitation of open data are set, but the barrier of entry remains relatively high, especially for non-expert users. This thesis aims to lower that barrier by providing tools and assistance, reducing the amount of prior experience and skills required. It covers the whole workflow, ranging from identifying suitable datasets, through possible transformations, to exporting the result in the form of suitable visualizations.
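The discovery-to-visualization workflow with provenance tracking that the abstract calls for can be pictured with a small Python sketch; the step names and structure are illustrative assumptions only, not Yavaa's actual design.

```python
provenance = []   # record of how each result came about

def step(name, fn, *inputs):
    """Run one workflow step and log what produced the result,
    so the final chart's provenance stays documented."""
    result = fn(*inputs)
    provenance.append({"step": name, "inputs": [repr(i)[:40] for i in inputs]})
    return result

dataset = step("discover", lambda: [("2020", 3.1), ("2021", 3.4)])
cleaned = step("transform", lambda d: [(year, round(v)) for year, v in d], dataset)
chart   = step("visualize", lambda d: f"bar chart of {len(d)} rows", cleaned)
print(chart)
print(provenance)
```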
How digital data are used in the domain of health: A short review of current knowledge
In the era of digitalization, digital data is available about every aspect of our daily lives, including our physical and mental health. Digital data has been applied in the domain of healthcare for the detection of outbreaks of infectious diseases, clinical decision support, personalized care, and genomics. This paper will serve as a review of the rapidly evolving field of digital health. More specifically, we will discuss (1) big data and physical health, (2) big data and mental health, (3) digital contact tracing during the COVID-19 pandemic, and finally, (4) ethical issues with using digital data for health-related purposes. With this review, we aim to stimulate a public debate on the appropriate usage of digital data in the health sector.
A Design Framework for Efficient Distributed Analytics on Structured Big Data
Distributed analytics architectures often comprise two elements: a compute engine and a storage system. Conventional distributed storage systems usually store data in the form of files or key-value pairs. This abstraction simplifies how the data is accessed and reasoned about by an application developer. However, the separation of compute and storage systems makes it difficult to optimize costly disk and network operations. By design, the storage system is isolated from the workload and its performance requirements, such as block co-location and replication. Furthermore, optimizing fine-grained data access requests becomes difficult, as the storage layer is hidden away behind such abstractions.
Using a clean-slate approach, this thesis proposes a modular distributed analytics system design centered around a unified interface for distributed data objects, named the DDO. The interface couples key mechanisms that utilize storage, memory, and compute resources. This coupling makes it ideal for optimizing data access requests across all memory hierarchy levels with respect to the workload and its performance requirements. In addition to the DDO, a complementary DDO controller implementation manages the logical view of DDOs, their replication, and their distribution across the cluster. A proof-of-concept implementation shows a 3-6x improvement in mean query time on the TPC-H and TPC-DS benchmarks, and more than an order of magnitude improvement in many cases.
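A rough Python sketch of the two central ideas, a unified DDO interface plus a controller that decides replication and placement, is given below; every class and method name is an assumption made for illustration, not the thesis' actual implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

class DDO(ABC):
    """Single abstraction spanning storage, memory, and compute for one data object."""
    @abstractmethod
    def read(self, key): ...
    @abstractmethod
    def compute(self, fn): ...          # push computation to where the data lives

class InMemoryDDO(DDO):
    def __init__(self, data: Dict): self.data = data
    def read(self, key): return self.data.get(key)
    def compute(self, fn): return fn(self.data)

class DDOController:
    """Keeps the logical view of DDOs and places replicas across cluster nodes."""
    def __init__(self, nodes: List[str]):
        self.nodes, self.placement = nodes, {}

    def place(self, name: str, ddo: DDO, replicas: int = 2):
        # Naive placement policy: the first `replicas` nodes each hold a copy.
        self.placement[name] = (ddo, self.nodes[:replicas])
        return self.placement[name][1]

controller = DDOController(nodes=["n1", "n2", "n3"])
orders = InMemoryDDO({"o1": 99.5, "o2": 12.0})
print(controller.place("orders", orders))         # ['n1', 'n2']
print(orders.compute(lambda d: sum(d.values())))  # 111.5
```

The point of the coupling is visible even in this toy: because the object exposes both its data and a compute hook, the controller can reason about where to run work and where to keep replicas in one place, instead of behind a file or key-value abstraction.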