10 research outputs found

    Shrink: Prescribing Resiliency Solutions for Streaming

    Streaming query deployments make up a vital part of cloud-oriented applications. They vary widely in their data, logic, and statefulness, and are typically executed in multi-tenant distributed environments with varying uptime SLAs. To achieve these SLAs, one of a number of proposed resiliency strategies is employed to protect against failure. This paper introduces the first comprehensive, cloud-friendly comparison of resiliency techniques for streaming queries. We introduce models that capture the costs associated with different resiliency strategies and, through a series of experiments that implement and validate these models, show that (1) no single resiliency strategy efficiently handles most streaming scenarios; (2) the optimization space is too complex for a person to navigate with rules of thumb; and (3) there exists a clear generalization of periodic checkpointing that is worth considering in many cases. Finally, the models presented in this paper can be adapted to fit a wide variety of resiliency strategies, and likely have important consequences for cloud services beyond those that are obviously streaming.
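To make the cost-model idea concrete, here is a minimal sketch of the kind of trade-off such models capture, using the classic Young approximation for the checkpoint interval as a stand-in; the paper's actual models are richer, and all numbers below are illustrative.

```python
# Illustrative cost model for periodic checkpointing (not the paper's model).
# Overhead per unit time = checkpoint cost amortized over the interval,
# plus the expected work lost when a failure strikes mid-interval.

def expected_overhead(interval, checkpoint_cost, mtbf):
    """Expected fraction of time lost to checkpointing and recovery.

    interval        -- seconds between checkpoints
    checkpoint_cost -- seconds to write one checkpoint
    mtbf            -- mean time between failures, seconds
    """
    amortized = checkpoint_cost / interval   # time spent writing checkpoints
    rework = (interval / 2) / mtbf           # expected lost work per failure
    return amortized + rework

def best_interval(checkpoint_cost, mtbf):
    """Closed-form optimum of the model above (Young's approximation)."""
    return (2 * checkpoint_cost * mtbf) ** 0.5

# 5 s checkpoints, one failure per 8 hours -> checkpoint roughly every 9 minutes
opt = best_interval(checkpoint_cost=5.0, mtbf=8 * 3600)
```

Picking the interval analytically rather than by rule of thumb is exactly the kind of decision the abstract argues is too complex to eyeball once several strategies and tenants are in play.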

    Puredata Systems for Analytics: Concurrency and Workload Management

    PureData™ System for Analytics, also known as Netezza, is a data warehouse server for analytic workloads, capable of providing throughput up to 1000 times greater than traditional database servers. Impressively, it requires minimal system tuning, delivering high-end performance with a low total cost of ownership (TCO). Database performance is directly linked to the allocation of system resources in a database management system. The heart of the Netezza appliance, the Field-Programmable Gate Array (FPGA), plays a key role in boosting the overall performance of the server. I/O operations are a bottleneck in any database server, and it is the FPGA that alleviates the I/O problem in Netezza by filtering data at each snippet processing unit (SPU), allowing queries to run faster and greatly improving the server's performance. This paper describes the problems companies currently face in a "big data" environment, including concurrency handling and query performance. Various factors affect a query's performance, including bad data distribution, stale statistics, server load, and uneven system resources. Since this paper is restricted to system resources, it presents an in-depth analysis of those resources and their components. Workload Management (WLM) and each of its features are described, giving the reader a clear notion of how a query's performance can be altered through various mechanisms. The paper describes the performance problems that exist on traditional database servers and how the Workload Management components can be tweaked, along with the predefined system configurations, to make a query run faster on a Netezza machine.
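As a rough illustration of the guaranteed-share style of workload management the abstract discusses, the sketch below redistributes the shares of idle resource groups among active ones; the group names and percentages are hypothetical, and this is not IBM's actual WLM implementation.

```python
# Illustrative guaranteed-share workload management (hypothetical numbers).
# Each resource group is promised a minimum share of the machine; shares
# belonging to idle groups are redistributed to the active groups in
# proportion to their guaranteed minimums.

def effective_shares(guaranteed, active):
    """Return the fraction of resources each active group receives."""
    total_active = sum(guaranteed[g] for g in active)
    return {g: guaranteed[g] / total_active for g in active}

guaranteed = {"admin": 10, "etl": 40, "reports": 50}   # percent minimums
shares = effective_shares(guaranteed, active={"etl", "reports"})
# while "admin" is idle, "etl" and "reports" split the machine 40:50
```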

    To Replicate or Not To Replicate Queries in the Presence of Autonomous Participants?

    In summary, the main contributions of this paper are as follows. We formalize the query allocation problem and make precise the notion of query replication in the presence of autonomous participants (Section II). We introduce a global satisfaction notion to characterize the facts that (i) queries have different criticality for consumers; (ii) a consumer may receive fewer results than it expects; and (iii) a provider may perform queries for nothing (Section IV). We propose two automatic query replication algorithms, SbQR and SbQR+, which use global satisfaction as the basis for deciding on-the-fly (i) which queries should be replicated and (ii) how many query replicas should be created (Section V). We experimentally demonstrate that SbQR (i) significantly outperforms popular baseline algorithms and (ii) automatically adapts to the workload and the criticality of queries (Section VI).
    The objective of a widely distributed system on the Internet is to integrate participants whose characteristics and motivations are not always clearly identified a priori. In particular, autonomous participants may have specific individual interests regarding queries, and regarding other participants. In such a context, a system that ignores these individual interests provokes departures which, through a domino effect, can have devastating consequences. Participant satisfaction requires taking their interests into account when allocating queries, but satisfaction can also be affected by failures. Replicating queries is one way to address this latter problem; however, the presence of autonomous participants makes the approach more delicate. Not only can query replication quickly overload the participants and the system, but participants may have little interest in processing queries just in case their colleagues fail. The questions that naturally arise are: should queries be replicated at all? If so, which queries, and at what replication level? In this article, we propose answers to these questions by revisiting the replication problem from the viewpoint of participant satisfaction. We present a new proposal, SbQR, which decides in real time whether a query should be replicated and to what degree, relying on the notions of participant satisfaction and failure probability. Since replicating a large number of queries can overload the system and strongly degrade its performance, we propose a variant, SbQR+. Its guiding idea is, during periods of heavy load, to use the available resources primarily for critical queries: queries with little impact on participant satisfaction may see their number of replicas reduced and, exceptionally, may even be dropped entirely. Our experiments demonstrate that these solutions significantly improve on the baseline algorithms in performance and satisfaction, while dynamically adapting to changes in query criticality and failure probabilities without requiring any particular tuning.

    Query Replication in Distributed Information Systems with Autonomous Participants

    We consider Distributed Information Systems with Autonomous Participants (DISAP), i.e., systems whose participants (consumers and providers) may have special interests towards queries and other participants. Recent DISAP applications on the Internet have emerged to share data, services, or computing resources at an unprecedented scale (e.g., SETI@home). With autonomous participants, the only way to prevent a participant from voluntarily leaving the system is to satisfy its interests when allocating queries. But participants' satisfaction may also be badly affected by other participants' failures or behavior. In this context, replicating queries is useful for addressing two different problems: tolerating providers' failures and dealing with Byzantine providers. In this paper, we make the following main contributions. First, we formalize the query allocation problem over faulty participants in the context of DISAP. Second, we define participants' satisfaction and a notion of global satisfaction, which considers participants' satisfaction and their probability of failure. Third, we propose a query replication algorithm, SbQR, which deals with participants' failures by deciding online whether a query should be replicated and at which rate. Fourth, we propose another query replication algorithm, SbQR+, which generalizes SbQR with the goal of prioritizing critical queries. Finally, we implemented both algorithms and compared them to a popular baseline algorithm. The results demonstrate that our algorithms significantly outperform the baseline from both the performance and the satisfaction points of view. In particular, SbQR+ excels at choosing the queries that must be replicated to guarantee both participants' satisfaction and good system performance.
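A minimal sketch of the satisfaction-based replication decision in the spirit of SbQR, heavily simplified: with k replicas and independent provider failure probability p, a query succeeds unless all k replicas fail, and the replica count is chosen to balance that benefit against load. The criticality and cost figures are made up; the paper's global-satisfaction formulation is richer.

```python
# Hedged sketch of an SbQR-style on-line replication decision (simplified).

def success_probability(p_fail, k):
    """Probability the query completes given k independent replicas."""
    return 1 - p_fail ** k

def choose_replicas(criticality, p_fail, cost_per_replica, max_replicas=5):
    """Pick the replica count maximizing expected benefit minus load cost."""
    def net(k):
        return criticality * success_probability(p_fail, k) - cost_per_replica * k
    return max(range(1, max_replicas + 1), key=net)

# A critical query earns more replicas than a low-stakes one under the
# same failure probability and per-replica load cost:
k_hot = choose_replicas(criticality=10.0, p_fail=0.2, cost_per_replica=0.5)
k_cold = choose_replicas(criticality=1.0, p_fail=0.2, cost_per_replica=0.5)
```

Under heavy load one could raise `cost_per_replica`, which shrinks replica counts for low-criticality queries first, echoing the SbQR+ idea of reserving resources for critical queries.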

    Gathering and Ordering Simultaneous and Concurrent Data Streams Based on C-INCAMI

    This document is the final work for the Specialization in High Performance Computing and Grid Technology at the Facultad de Informática, Universidad Nacional de La Plata. It addresses the problem of parallel processing of data streams, with the added ingredient that the streams are based on formal measurement and evaluation frameworks, which then govern how their data is organized. The work first reviews the state of the art in data stream management systems, and then discusses the C-INCAMI formal measurement and evaluation framework as the reference for structuring stream content. Next, it gives an overview of the Integrated Approach to Measurement-Metadata-Centered Data Stream Processing (EIPFD), associated with my PhD thesis in Computer Science at the same faculty, which is currently under review and in final editing. This approach makes it possible to study the impact of parallelizing both the reception of streams and their online organization within a centralized buffer, controlling concurrent and simultaneous access on shared-memory architectures. A C-INCAMI-based measurement interchange format is then defined, together with the processor that performs online serialization/deserialization to support parallel processing. The structure used to organize measurements is then presented, along with how metadata guides the classification of measurements in a central buffer. An application case for EIPFD is proposed, on which a laboratory simulation is based. This simulation aims to initially validate processing times and statistically analyze its results, in order to identify bottlenecks and improvement opportunities in terms of processing.
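A hypothetical sketch of the kind of metadata-guided central buffer described above: concurrent receivers push measurements tagged with C-INCAMI-style metadata, and the buffer groups them by metric under a lock. The field names are illustrative, not the thesis's actual schema.

```python
# Hypothetical metadata-guided central buffer with concurrent access control.
import threading
from collections import defaultdict

class CentralBuffer:
    def __init__(self):
        self._lock = threading.Lock()
        self._by_metric = defaultdict(list)

    def push(self, measurement):
        # Classification is driven by the embedded metadata, not the raw value.
        metric_id = measurement["metadata"]["metric_id"]
        with self._lock:                  # simultaneous receivers stay safe
            self._by_metric[metric_id].append(measurement)

    def drain(self, metric_id):
        """Hand a metric's accumulated measurements to the analysis stage."""
        with self._lock:
            return self._by_metric.pop(metric_id, [])

buf = CentralBuffer()
buf.push({"value": 38.2, "metadata": {"metric_id": "body-temp", "scale": "interval"}})
```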

    Adaptive Asynchronous Control and Consistency in Distributed Data Exploration Systems

    Advances in machine learning and streaming systems provide a backbone to transform vast arrays of raw data into valuable information. Leveraging distributed execution, analysis engines can process this information effectively within an iterative data exploration workflow to solve problems at unprecedented rates. However, with increased input dimensionality, a desire to simultaneously share and isolate information, as well as overlapping and dependent tasks, this process is becoming increasingly difficult to maintain. User interaction derails exploratory progress due to manual oversight of lower-level tasks such as tuning parameters, adjusting filters, and monitoring queries. We identify human-in-the-loop management of data generation and distributed analysis as an inhibiting problem precluding efficient online, iterative data exploration, one which causes delays in knowledge discovery and decision making. The flexible and scalable systems implementing the exploration workflow require semi-autonomous methods integrated as architectural support to reduce human involvement. We thus argue that an abstraction layer providing adaptive asynchronous control and consistency management over a series of individual tasks coordinated to achieve a global objective can significantly improve data exploration effectiveness and efficiency. This thesis introduces methodologies which autonomously coordinate distributed execution at a lower level in order to synchronize multiple efforts as part of a common goal. We demonstrate the impact on data exploration through serverless simulation ensemble management and multi-model machine learning, showing improved performance and reduced resource utilization that enable a more productive semi-autonomous exploration workflow. We focus on the specific domains of molecular dynamics and personalized healthcare; however, the contributions are applicable to a wide variety of domains.

    Query Interactions in Database Systems

    The typical workload in a database system consists of a mix of multiple queries of different types, running concurrently and interacting with each other. The same query may have different performance in different mixes. Hence, optimizing performance requires reasoning about query mixes and their interactions, rather than considering individual queries or query types. In this dissertation, we demonstrate how queries affect each other when they are executing concurrently in different mixes. We show the significant impact that query interactions can have on the end-to-end workload performance. A major hurdle in the understanding of query interactions in database systems is that there is a large spectrum of possible causes of interactions. For example, query interactions can happen because of any of the resource-related, data-related or configuration-related dependencies that exist in the system. This variation in underlying causes makes it very difficult to come up with robust analytical performance models to capture and model query interactions. We present a new approach for modeling performance in the presence of interactions, based on conducting experiments to measure the effect of query interactions and fitting statistical models to the data collected in these experiments to capture the impact of query interactions. The experiments collect samples of the different possible query mixes, and measure the performance metrics of interest for the different queries in these sample mixes. Statistical models such as simple regression and instance-based learning techniques are used to train models from these sample mixes. This approach requires no prior assumptions about the internal workings of the database system or the nature or cause of the interactions, making it portable across systems. 
We demonstrate the potential of capturing, modeling, and exploiting query interactions by developing techniques to help in two database performance related tasks: workload scheduling and estimating the completion time of a workload. These are important workload management problems that database administrators have to deal with routinely. We consider the problem of scheduling a workload of report-generation queries. Our scheduling algorithms employ statistical performance models to schedule appropriate query mixes for the given workload. Our experimental evaluation demonstrates that our interaction-aware scheduling algorithms outperform scheduling policies that are typically used in database systems. Estimating the completion time of a workload is an important problem for which the state of the art does not offer any systematic solution; typically, database administrators rely on heuristics or observations of past behavior. We propose a more rigorous solution to this problem, based on a workload simulator that employs performance models to simulate the execution of the different mixes that make up a workload. This mix-based simulator provides a systematic tool that can help database administrators in estimating workload completion time. Our experimental evaluation shows that our approach can estimate workload completion times with a high degree of accuracy. Overall, this dissertation demonstrates that reasoning about query interactions holds significant potential for realizing performance improvements in database systems. The techniques developed in this work can be viewed as initial steps in this interesting area of research, with plenty of potential for future work.
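The experiment-driven modeling idea above can be sketched minimally: measure a few sample query mixes, then predict a new mix's completion time with instance-based learning (1-nearest-neighbour here; the dissertation also uses regression models). The sample numbers below are invented for illustration.

```python
# Minimal instance-based model over sampled query mixes (illustrative data).
# A "mix" is a tuple of per-query-type concurrency counts.

def distance(mix_a, mix_b):
    """Euclidean distance between two mixes."""
    return sum((a - b) ** 2 for a, b in zip(mix_a, mix_b)) ** 0.5

def predict_time(samples, mix):
    """samples: list of (mix, measured_completion_time_seconds) pairs.
    Predict by copying the measurement of the closest sampled mix."""
    nearest = min(samples, key=lambda s: distance(s[0], mix))
    return nearest[1]

samples = [((4, 0, 1), 120.0),   # 4 type-A, 0 type-B, 1 type-C -> 120 s
           ((1, 3, 0), 95.0),
           ((2, 2, 2), 150.0)]
est = predict_time(samples, (3, 1, 1))   # closest sample is (4, 0, 1)
```

No assumption about *why* the queries interact is needed, which is exactly the portability argument the abstract makes for black-box statistical models.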

    An Integrated Approach to Data Stream Processing Centered on Measurement Metadata

    When decisions must be made at an engineering level, measuring is not an option but a necessity; it is a systematic, disciplined practice by which the state of an entity can be quantified. One aspect that must be clear in measurement is that, for different measurements to be comparable, they must be consistent with one another: they must share the same scale and scale type, and be obtained under equivalent measurement methods and/or calculation rules. Measurement and evaluation frameworks represent an effort, from the perspective of each strategy, to formalize how metrics, their objectives, and other associated aspects are defined, in order to guarantee repeatability and consistency in the measurement process they support. There are applications capable of processing streams of measurements online, but their main drawback is that they carry no information about the meaning of the data they process. For such applications a measurement is just a datum, i.e., a representation of a captured fact, lacking information about the concept it relates to or the context in which it was obtained. Measurement devices are generally built to capture a measure through a given method, and in most cases the way to obtain that measure for later processing in other environments (desktop computers, mobile devices, etc.) depends on services or accessories provided by the manufacturer. Even supposing that all measurements from different devices could be fed into a single transmission channel, few data stream processing environments incorporate predictive behavior.
Among those that do incorporate predictive behavior, none of the environments analyzed rests on a conceptual base that allows a measure to be checked against the formal definition of its metric. This introduces a serious risk of inconsistency, which directly affects the measurement process and, consequently, any later analyses based on these data. Our Measurement-Metadata-Centered Data Stream Processing Strategy (EIPFDcMM) focuses on supporting heterogeneous data sources whose measurement streams, structured and enriched with embedded C-INCAMI-based metadata, allow statistical analyses to be performed consistently so as to implement detective behavior, and at the same time allow contextual information to be attached to measurements, enriching the classification function so as to implement predictive behavior. Both the detective and the predictive behavior have associated alarm mechanisms, which trigger notification whenever a risk zone is identified. In this way, the aim is to guarantee repeatability and consistency in the underlying measurement process.
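The "check a measure against the formal definition of its metric" step can be sketched as a small validation pass before any statistical analysis; the metric registry and field names below are hypothetical, not C-INCAMI's actual schema.

```python
# Illustrative consistency check of a measure against its metric definition.
# The registry is a hypothetical stand-in for formally defined metrics.

METRICS = {
    "ambient-temp": {"unit": "C", "scale": "interval", "range": (-50.0, 60.0)},
}

def check_measure(metric_id, value, unit):
    """Return a list of inconsistencies; an empty list means consistent."""
    metric = METRICS.get(metric_id)
    if metric is None:
        return ["unknown metric"]
    problems = []
    if unit != metric["unit"]:
        problems.append(f"unit {unit!r} != expected {metric['unit']!r}")
    lo, hi = metric["range"]
    if not lo <= value <= hi:
        problems.append("value outside admissible range")
    return problems

# Inconsistent on both unit and range -> would raise an alarm, not enter analysis
alarms = check_measure("ambient-temp", 120.0, "F")
```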

    Query Suspend And Resume

    Suppose a long-running analytical query is executing on a database server and has been allocated a large amount of physical memory. A high-priority task comes in and we need to run it immediately with all available resources. We have several choices. We could swap out the old query to disk, but writing out a large execution state may take too much time. Another option is to terminate the old query and restart it after the new task completes, but we would waste all the work already performed by the old query. Yet another alternative is to periodically checkpoint the query during execution, but traditional synchronous checkpointing carries high overhead. In this paper, we advocate a database-centric approach to implementing query suspension and resumption, with negligible execution overhead, bounded suspension cost, and efficient resumption. The basic idea is to let each physical query operator perform lightweight checkpointing according to its own semantics, and to coordinate asynchronous checkpoints among operators through a novel contracting mechanism. At the time of suspension, we find an optimized suspend plan for the query, which may involve a combination of dumping current state to disk and going back to previous checkpoints. The plan seeks to minimize the suspend/resume overhead while observing the constraint on suspension time. Our approach requires only small changes to the iterator interface, which we have implemented in the PREDATOR database system. Experiments with our implementation demonstrate significant advantages of our approach over traditional alternatives.
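The suspend-plan idea can be illustrated with a toy optimizer: for each operator, either dump its current state now (spends suspension time, cheap resume) or roll back to its last checkpoint (free now, pay re-execution at resume). The greedy heuristic and the cost figures below are illustrative assumptions; the paper coordinates this decision through contracts between operators rather than this simple rule.

```python
# Toy suspend-plan chooser under a suspension-time budget (illustrative only).

def plan_suspend(operators, time_budget):
    """operators: list of (name, dump_cost_s, redo_cost_s).
    Greedily dump the operators whose avoided re-execution per second of
    suspension time is largest, until the budget is exhausted."""
    ranked = sorted(operators, key=lambda op: op[2] / op[1], reverse=True)
    plan, spent = {}, 0.0
    for name, dump_cost, redo_cost in ranked:
        if spent + dump_cost <= time_budget:
            plan[name] = "dump"       # write state to disk now
            spent += dump_cost
        else:
            plan[name] = "rollback"   # redo from last checkpoint at resume
    return plan

# A big sort is worth dumping; a cheap scan is simply re-run at resume time.
plan = plan_suspend([("sort", 8.0, 40.0), ("scan", 2.0, 3.0)], time_budget=9.0)
```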