448 research outputs found

    Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)

    Database management systems (DBMSs) carefully optimize complex multi-join queries to avoid expensive disk I/O. As servers today feature tens or hundreds of gigabytes of RAM, a significant fraction of many analytic databases becomes memory-resident. Even after careful tuning for an in-memory environment, a linear disk I/O model such as the one implemented in PostgreSQL may make query response time predictions that are up to 2X slower than the optimal multi-join query plan over memory-resident data. This paper introduces a memory I/O cost model to identify good evaluation strategies for complex query plans with multiple hash-based equi-joins over memory-resident data. The proposed cost model is carefully validated for accuracy using three different systems, including an Amazon EC2 instance, to control for hardware-specific differences. Prior work in parallel query evaluation has advocated right-deep and bushy trees for multi-join queries due to their greater parallelization and pipelining potential. A surprising finding is that the conventional wisdom from shared-nothing disk-based systems does not directly apply to the modern shared-everything memory hierarchy. As corroborated by our model, the performance gap between the optimal left-deep and right-deep query plan can grow to about 10X as the number of joins in the query increases.
    Comment: 15 pages, 8 figures, extended version of the paper to appear in SoCC'1

    The End of Slow Networks: It's Time for a Redesign

    Next-generation high-performance RDMA-capable networks will require a fundamental rethinking of the design and architecture of modern distributed DBMSs. These systems are commonly designed and optimized under the assumption that the network is the bottleneck: the network is slow and "thin", and thus needs to be avoided as much as possible. Yet this assumption no longer holds true. With InfiniBand FDR 4x, the bandwidth available to transfer data across the network is in the same ballpark as the bandwidth of one memory channel, and it increases even further with the most recent EDR standard. Moreover, with the increasing advances of RDMA, the latency improves similarly fast. In this paper, we first argue that the "old" distributed database design is not capable of taking full advantage of the network. Second, we propose architectural redesigns for OLTP, OLAP and advanced analytical frameworks to take better advantage of the improved bandwidth, latency and RDMA capabilities. Finally, for each of the workload categories, we show that remarkable performance improvements can be achieved.

    Database architecture evolution: Mammals flourished long before dinosaurs became extinct

    The holy grail for database architecture research is to find a solution that is Scalable & Speedy, to run on anything from small ARM processors up to globally distributed compute clusters; Stable & Secure, to service a broad user community; Small & Simple, to be comprehensible to a small team of programmers; and Self-managing, to let it run out-of-the-box without hassle. In this paper, we provide a trip report on this quest, covering past experiences, ongoing research on hardware-conscious algorithms, and novel ways towards self-management, specifically focused on column store solutions.

    Modularis: Modular Relational Analytics over Heterogeneous Distributed Platforms

    The enormous quantity of data produced every day together with advances in data analytics has led to a proliferation of data management and analysis systems. Typically, these systems are built around highly specialized monolithic operators optimized for the underlying hardware. While effective in the short term, such an approach makes the operators cumbersome to port and adapt, which is increasingly required due to the speed at which algorithms and hardware evolve. To address this limitation, we present Modularis, an execution layer for data analytics based on sub-operators, i.e., composable building blocks resembling traditional database operators but at a finer granularity. To demonstrate the advantages of our approach, we use Modularis to build a distributed query processing system supporting relational queries running on an RDMA cluster, a serverless cloud platform, and a smart storage engine. Modularis requires minimal code changes to execute queries across these three diverse hardware platforms, showing that the sub-operator approach reduces the amount and complexity of the code. In fact, changes in the platform affect only sub-operators that depend on the underlying hardware. We show the end-to-end performance of Modularis by comparing it with a framework for SQL processing (Presto), a commercial cluster database (SingleStore), as well as Query-as-a-Service systems (Athena, BigQuery). Modularis outperforms all these systems, proving that the design and architectural advantages of a modular design can be achieved without degrading performance. We also compare Modularis with a hand-optimized implementation of a join for RDMA clusters. We show that Modularis has the advantage of being easily extensible to a wider range of join variants and group by queries, all of which are not supported in the hand-tuned join.
    Comment: Accepted at PVLDB vol. 1

    10381 Summary and Abstracts Collection -- Robust Query Processing

    Dagstuhl seminar 10381 on robust query processing (held 19.09.10 - 24.09.10) brought together a diverse set of researchers and practitioners with a broad range of expertise for the purpose of fostering discussion and collaboration regarding causes, opportunities, and solutions for achieving robust query processing. The seminar strove to build a unified view across the loosely-coupled system components responsible for the various stages of database query processing. Participants were chosen for their experience with database query processing and, where possible, their prior work in academic research or in product development towards robustness in database query processing. In order to pave the way to motivate, measure, and protect future advances in robust query processing, seminar 10381 focused on developing tests for measuring the robustness of query processing. In these proceedings, we first review the seminar topics, goals, and results, then present abstracts or notes of some of the seminar break-out sessions. We also include, as an appendix, the robust query processing reading list that was collected and distributed to participants before the seminar began, as well as summaries of a few of those papers that were contributed by some participants.

    Analysis, characterization and optimization of the energy efficiency on softwarized mobile platforms

    International Mention in the doctoral degree.
    The upcoming 5th Generation of mobile systems (5G) is about to revolutionize the industry, bringing a new architecture oriented to new vertical markets and services. Because of this, the 5G Infrastructure Public Private Partnership (5G-PPP) has specified a list of Key Performance Indicators (KPIs) that 5G systems need to support, e.g., 1000 times higher data volume, 10 to 100 times more connected devices, or 10 times lower power consumption. To meet these requirements, current deployments are expected to expand with more Points of Attachment (PoA), increasing their density and combining multiple wireless technologies. This massive deployment strategy has a side effect on energy efficiency, though, which conflicts with the "10 times lower power consumption" KPI. In this context, the research community has proposed novel paradigms to achieve the requirements imposed on 5G systems, materialized in technologies such as Software Defined Networking (SDN) and Network Function Virtualization (NFV). These new paradigms are the first step towards softwarizing mobile network deployments, enabling new degrees of flexibility and reconfigurability of the Radio Access Network (RAN). In this thesis, we first present a detailed analysis and characterization of softwarized mobile networking. We consider software as the basis for the next generation of cellular networks and hence analyze and characterize its impact on the energy efficiency of these systems. The first goal of this work is to characterize the available software platforms for Software Defined Radio (SDR), focusing on the two main open source solutions: OpenAirInterface (OAI) and srsLTE. As a result, we provide a methodology to analyze and characterize the performance of these solutions in terms of CPU usage, network performance, compatibility and extensibility of the software. Once we have understood the performance to expect from such platforms, we study an SDR prototype built with hardware acceleration that employs an FPGA-based platform. This prototype is designed to include energy-awareness capabilities, allowing the system to be reconfigured to minimize the energy footprint when possible. In order to validate our system design, we later present an energy characterization platform that we employ to experimentally measure the energy consumption of real devices. In our approach, we perform two kinds of analysis: at a short time scale and at a large time scale. Thus, to validate our short-time-scale approach and the energy framework, we characterize through numerical analysis the Rate Adaptation (RA) algorithms in IEEE 802.11, and then compare our theoretical results to those obtained through experimentation. Next, we extend our analysis to the hardware-accelerated SDR prototype mentioned previously. Our experimental results show that our system can indeed reduce the energy footprint by reconfiguring the system deployment. We then move the analysis to a larger time scale and present Resource-on-Demand (RoD) schemes for ultra-dense network deployments. This strategy is based on dynamically switching on and off the elements that form the network to reduce the overall energy consumption. Hence, we present an analytic model in two flavors: an exact model that accurately predicts the system behaviour but at a high computational cost, and a simplified one that is lighter in complexity while keeping the accuracy. Our results show that these schemes can effectively enhance the energy efficiency of the deployments while maintaining the Quality of Service (QoS). In order to prove the feasibility of RoD, we present a softwarized platform that follows the SDN paradigm, the OFTEN (OpenFlow framework for Traffic Engineering in mobile Networks with energy awareness) framework. Our design is based on OpenFlow with energy-awareness functionalities. Finally, a real prototype of this framework is presented, proving the feasibility of RoD in real deployments.
    Funding: FP7-CROWD (2013-2015), CROWD (Connectivity management for eneRgy Optimised Wireless Dense networks). H2020-Flex5GWare (2015-2017), Flex5GWare (Flexible and efficient hardware/software platforms for 5G network elements and devices).
    Doctoral Program in Telematics Engineering, Universidad Carlos III de Madrid.
    Committee: Chair: Marco Gramaglia. Secretary: José Nuñez. Member: Fabrizio Giulian