Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)
Database management systems (DBMSs) carefully optimize complex multi-join
queries to avoid expensive disk I/O. As servers today feature tens or hundreds
of gigabytes of RAM, a significant fraction of many analytic databases becomes
memory-resident. Even after careful tuning for an in-memory environment, a
linear disk I/O model such as the one implemented in PostgreSQL may make query
response time predictions that are up to 2X slower than the optimal multi-join
query plan over memory-resident data. This paper introduces a memory I/O cost
model to identify good evaluation strategies for complex query plans with
multiple hash-based equi-joins over memory-resident data. The proposed cost
model is carefully validated for accuracy using three different systems,
including an Amazon EC2 instance, to control for hardware-specific differences.
Prior work in parallel query evaluation has advocated right-deep and bushy
trees for multi-join queries due to their greater parallelization and
pipelining potential. A surprising finding is that the conventional wisdom from
shared-nothing disk-based systems does not directly apply to the modern
shared-everything memory hierarchy. As corroborated by our model, the
performance gap between the optimal left-deep and right-deep query plan can
grow to about 10X as the number of joins in the query increases.
Comment: 15 pages, 8 figures, extended version of the paper to appear in SoCC'1
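To make the two plan shapes named in the abstract above concrete, here is a minimal sketch of a three-way hash equi-join evaluated once as a left-deep plan and once as a right-deep plan. It is not the paper's cost model. The relation layouts, the column names `a`, `b`, `x`, `y`, and the convention that the left input of each join is the build side are all assumptions made for illustration.

```python
from collections import defaultdict

def build(rows, key):
    """Build an in-memory hash table: key value -> list of rows."""
    table = defaultdict(list)
    for row in rows:
        table[row[key]].append(row)
    return table

def probe(rows, table, key):
    """Probe the hash table with each row and yield the merged matches."""
    for row in rows:
        for match in table.get(row[key], ()):
            yield {**match, **row}

def left_deep(r, s, t):
    # (R join S) join T: the intermediate result R join S is itself
    # materialized into a hash table before T probes it.
    rs = probe(s, build(r, "a"), "a")
    return list(probe(t, build(rs, "b"), "b"))

def right_deep(r, s, t):
    # Hash tables are built only on the base relations R and T;
    # S is pipelined through both probes without materialization.
    ht_r = build(r, "a")
    ht_t = build(t, "b")
    return list(probe(probe(s, ht_r, "a"), ht_t, "b"))

# Hypothetical toy relations: R(a, x), S(a, b), T(b, y).
R = [{"a": 1, "x": "r1"}, {"a": 2, "x": "r2"}]
S = [{"a": 1, "b": 10}, {"a": 1, "b": 20}, {"a": 3, "b": 10}]
T = [{"b": 10, "y": "t1"}, {"b": 20, "y": "t2"}, {"b": 30, "y": "t3"}]

if __name__ == "__main__":
    print(left_deep(R, S, T))
    print(right_deep(R, S, T))
```

Both plans return the same rows. They differ in which inputs are turned into hash tables and which are streamed, and that difference is the kind of memory access pattern a memory I/O cost model would have to account for.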
The End of Slow Networks: It's Time for a Redesign
Next generation high-performance RDMA-capable networks will require a
fundamental rethinking of the design and architecture of modern distributed
DBMSs. These systems are commonly designed and optimized under the assumption
that the network is the bottleneck: the network is slow and "thin", and thus
needs to be avoided as much as possible. Yet this assumption no longer holds
true. With InfiniBand FDR 4x, the bandwidth available to transfer data across
the network is in the same ballpark as the bandwidth of one memory channel, and it
increases even further with the most recent EDR standard. Moreover, with
continuing advances in RDMA, latency is improving at a similar pace. In this
paper, we first argue that the "old" distributed database design is not capable
of taking full advantage of the network. Second, we propose architectural
redesigns for OLTP, OLAP and advanced analytical frameworks to take better
advantage of the improved bandwidth, latency and RDMA capabilities. Finally,
for each of the workload categories, we show that remarkable performance
improvements can be achieved.
Database architecture evolution: Mammals flourished long before dinosaurs became extinct
The holy grail for database architecture research is to find a solution that is Scalable & Speedy, to run on anything from small ARM processors up to globally distributed compute clusters; Stable & Secure, to service a broad user community; Small & Simple, to be comprehensible to a small team of programmers; and Self-managing, to let it run out-of-the-box without hassle. In this paper, we provide a trip report on this quest, covering past experiences, ongoing research on hardware-conscious algorithms, and novel ways towards self-management, specifically focused on column store solutions.
Modularis: Modular Relational Analytics over Heterogeneous Distributed Platforms
The enormous quantity of data produced every day together with advances in
data analytics has led to a proliferation of data management and analysis
systems. Typically, these systems are built around highly specialized
monolithic operators optimized for the underlying hardware. While effective in
the short term, such an approach makes the operators cumbersome to port and
adapt, which is increasingly required due to the speed at which algorithms and
hardware evolve. To address this limitation, we present Modularis, an execution
layer for data analytics based on sub-operators, i.e., composable building
blocks resembling traditional database operators but at a finer granularity. To
demonstrate the advantages of our approach, we use Modularis to build a
distributed query processing system supporting relational queries running on an
RDMA cluster, a serverless cloud platform, and a smart storage engine.
Modularis requires minimal code changes to execute queries across these three
diverse hardware platforms, showing that the sub-operator approach reduces the
amount and complexity of the code. In fact, changes in the platform affect only
sub-operators that depend on the underlying hardware. We show the end-to-end
performance of Modularis by comparing it with a framework for SQL processing
(Presto), a commercial cluster database (SingleStore), as well as
Query-as-a-Service systems (Athena, BigQuery). Modularis outperforms all these
systems, proving that the design and architectural advantages of a modular
design can be achieved without degrading performance. We also compare Modularis
with a hand-optimized implementation of a join for RDMA clusters. We show that
Modularis has the advantage of being easily extensible to a wider range of join
variants and group-by queries, none of which are supported in the hand-tuned
join.
Comment: Accepted at PVLDB vol. 1
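The sub-operator idea in the abstract above can be illustrated with a small sketch: a filter plus group-by query is assembled from fine-grained, composable building blocks, and only the data source changes between "platforms". This is not Modularis code or its API. Every name here (`local_scan`, `chunked_scan`, `select`, `group_sum`, `compose`) is hypothetical, and the two scan functions merely stand in for hardware-dependent sub-operators.

```python
from collections import defaultdict
from functools import reduce

def local_scan(rows):
    """Platform-specific source (assumed): rows already sit in local memory."""
    yield from rows

def chunked_scan(chunks):
    """Platform-specific source (assumed): rows arrive in chunks, e.g. from remote storage."""
    for chunk in chunks:
        yield from chunk

def select(pred):
    """Platform-independent sub-operator: keep rows satisfying pred."""
    def op(stream):
        return (row for row in stream if pred(row))
    return op

def group_sum(key, value):
    """Platform-independent sub-operator: sum `value` per distinct `key`."""
    def op(stream):
        totals = defaultdict(int)
        for row in stream:
            totals[row[key]] += row[value]
        return iter(sorted(totals.items()))
    return op

def compose(*ops):
    """Chain sub-operators into one pipeline applied to a source stream."""
    def pipeline(source):
        return reduce(lambda stream, op: op(stream), ops, source)
    return pipeline

# The same plan runs unchanged on both sources.
plan = compose(select(lambda r: r["qty"] > 1), group_sum("item", "qty"))

rows = [
    {"item": "a", "qty": 2},
    {"item": "b", "qty": 1},
    {"item": "a", "qty": 3},
    {"item": "b", "qty": 5},
]

if __name__ == "__main__":
    print(list(plan(local_scan(rows))))
    print(list(plan(chunked_scan([rows[:2], rows[2:]]))))
```

Swapping `local_scan` for `chunked_scan` touches only the source, which mirrors the abstract's claim that platform changes affect only the sub-operators that depend on the underlying hardware.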
10381 Summary and Abstracts Collection -- Robust Query Processing
Dagstuhl seminar 10381 on robust query processing (held 19.09.10 -
24.09.10) brought together a diverse set of researchers and practitioners
with a broad range of expertise for the purpose of fostering discussion
and collaboration regarding causes, opportunities, and solutions for
achieving robust query processing.
The seminar strove to build a unified view across
the loosely-coupled system components responsible for
the various stages of database query processing.
Participants were chosen for their experience with database
query processing and, where possible, their prior work in academic
research or in product development towards robustness in database query
processing.
In order to pave the way to motivate, measure, and protect future advances
in robust query processing, seminar 10381 focused on developing tests
for measuring the robustness of query processing.
In these proceedings, we first review the seminar topics, goals,
and results, then present abstracts or notes of some of the seminar break-out
sessions.
We also include, as an appendix,
the robust query processing reading list that
was collected and distributed to participants before the seminar began,
as well as summaries of a few of those papers that were
contributed by some participants.
Analysis, characterization and optimization of the energy efficiency on softwarized mobile platforms
International Mention in the doctoral degree.
The upcoming 5th Generation of mobile systems (5G) is about to revolutionize the industry,
bringing a new architecture oriented to new vertical markets and services. For this reason, the 5G
Infrastructure Public Private Partnership (5G-PPP) has specified a list of Key Performance Indicators (KPIs)
that 5G systems need to support, e.g., 1000 times higher data volume, 10 to 100 times more
connected devices, or 10 times lower power consumption. To meet these requirements, current
deployments are expected to expand by using more Points of Attachment (PoA), increasing their
density and using multiple wireless technologies. This massive deployment strategy has a side effect on
energy efficiency, however, generating a conflict with the “10 times lower power consumption”
KPI. In this context, the research community has proposed novel paradigms to achieve the imposed
requirements for 5G systems, materialized in technologies such as Software Defined
Networking (SDN) and Network Function Virtualization (NFV). These new paradigms are the
first step towards softwarizing mobile network deployments, enabling new degrees of flexibility and
reconfigurability of the Radio Access Network (RAN).
In this thesis, we first present a detailed analysis and characterization of softwarized mobile
networking. We consider software as a basis for the next generation of cellular networks and
hence, we analyze and characterize its impact on the energy efficiency of these systems. The
first goal of this work is to characterize the available software platforms for Software Defined
Radio (SDR), focusing on the two main open source solutions: OpenAirInterface (OAI) and srsLTE. As a result, we
provide a methodology to analyze and characterize the performance of these solutions in terms
of CPU usage, network performance, compatibility and extensibility of the software. Once we
have understood the expected performance of such platforms, we study an SDR prototype built
with hardware acceleration that employs an FPGA-based platform. This prototype is designed
to include energy-awareness capabilities, allowing the system to be reconfigured to minimize the
energy footprint when possible. In order to validate our system design, we later present an energy
characterization platform that we employ to experimentally measure the energy consumption
of real devices. In our approach, we perform two kinds of analysis: at a short time scale and at a large
time scale. Thus, to validate our approach at the short time scale and the energy framework, we have
characterized through numerical analysis the Rate Adaptation (RA) algorithms in IEEE 802.11,
and then compared our theoretical results to those obtained through experimentation. Next,
we extend our analysis to the hardware-accelerated SDR prototype previously mentioned. Our experimental results show that our system can indeed reduce the energy footprint by reconfiguring
the system deployment.
We then move to a larger time scale and present Resource-on-Demand (RoD)
schemes for ultra-dense network deployments. This strategy is based on dynamically switching on/off
the elements that form the network to reduce the overall energy consumption. Hence, we present
an analytic model in two flavors: an exact model that accurately predicts the system behaviour
but at a high computational cost, and a simplified one that is lighter in complexity while preserving
accuracy. Our results show that these schemes can effectively enhance the energy efficiency of
the deployments while maintaining the Quality of Service (QoS). In order to prove the feasibility of
RoD, we present a softwarized platform that follows the SDN paradigm, the OFTEN (OpenFlow
framework for Traffic Engineering in mobile Networks with energy awareness) framework. Our
design is based on OpenFlow with energy-awareness functionalities. Finally, a real prototype of
this framework is presented, proving the feasibility of RoD in real deployments.
FP7-CROWD (2013-2015) CROWD (Connectivity management for eneRgy Optimised Wireless Dense networks). H2020-Flex5GWare (2015-2017) Flex5GWare (Flexible and efficient hardware/software platforms for 5G network elements and devices).
Doctoral Programme in Telematics Engineering, Universidad Carlos III de Madrid.
President: Marco Gramaglia. Secretary: José Nuñez. Member: Fabrizio Giulian
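As a rough illustration of the Resource-on-Demand idea in the thesis abstract above, the sketch below compares an always-on deployment with one that powers on only as many access points as the current load requires. It is a toy calculation, not the thesis's exact or simplified analytic model. The function names, the per-slot load values, the per-access-point capacity, and the power figures are all hypothetical.

```python
import math

def active_aps(load, capacity_per_ap, total_aps):
    """Number of access points needed for `load`, keeping at least one on."""
    needed = math.ceil(load / capacity_per_ap)
    return min(total_aps, max(1, needed))

def energy(loads, capacity_per_ap, total_aps, p_on, p_sleep, on_demand):
    """Total energy over all time slots, in arbitrary units per slot.

    With on_demand=False every access point stays on in every slot.
    With on_demand=True unused access points are switched to sleep.
    """
    total = 0.0
    for load in loads:
        if on_demand:
            n_on = active_aps(load, capacity_per_ap, total_aps)
        else:
            n_on = total_aps
        total += n_on * p_on + (total_aps - n_on) * p_sleep
    return total

# Hypothetical scenario: 4 access points, 4 time slots of offered load.
LOADS = [5, 40, 90, 20]
CAPACITY = 25   # load units one access point can serve (assumed)
TOTAL_APS = 4
P_ON = 10       # energy per slot when on (assumed)
P_SLEEP = 1     # energy per slot when asleep (assumed)

if __name__ == "__main__":
    always_on = energy(LOADS, CAPACITY, TOTAL_APS, P_ON, P_SLEEP, on_demand=False)
    rod = energy(LOADS, CAPACITY, TOTAL_APS, P_ON, P_SLEEP, on_demand=True)
    print(always_on, rod)
```

In this scenario the on-demand policy still covers every slot's load, which is a crude stand-in for the QoS constraint. A realistic model would also account for switching delays and wake-up costs, which this sketch ignores.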