
    Middleware-based Database Replication: The Gaps between Theory and Practice

    The need for high availability and performance in data management systems has been fueling a long-running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. Over time, this has created a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. In this way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other.
    Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, June 2008.

    Fully Dynamic Algorithm for Top-k Densest Subgraphs

    Given a large graph, the densest-subgraph problem asks to find a subgraph with maximum average degree. When considering the top-k version of this problem, a naïve solution is to iteratively find the densest subgraph and remove it in each iteration. However, such a solution is impractical due to its high processing cost. The problem is further complicated when dealing with dynamic graphs, since adding or removing an edge requires re-running the algorithm. In this paper, we study the top-k densest-subgraph problem in the sliding-window model and propose an efficient fully-dynamic algorithm. The input of our algorithm consists of an edge stream, and the goal is to find the node-disjoint subgraphs that maximize the sum of their densities. In contrast to existing state-of-the-art solutions that require iterating over the entire graph upon any update, our algorithm profits from the observation that updates only affect a limited region of the graph. Therefore, the top-k densest subgraphs are maintained by applying only local updates. We provide a theoretical analysis of the proposed algorithm and show empirically that it often generates denser subgraphs than state-of-the-art competitors. Experiments show an improvement in efficiency of up to five orders of magnitude compared to state-of-the-art solutions.
    Comment: 10 pages, 8 figures, accepted at CIKM 2017.
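
    As a concrete reference point, the sketch below is a minimal Python rendering of the naïve baseline the abstract describes: Charikar-style greedy peeling to approximate the densest subgraph (by average degree), applied iteratively with each extracted subgraph's nodes removed to obtain k node-disjoint results. It is an illustration of the impractical baseline under our own assumptions, not the paper's fully-dynamic algorithm, and all names are ours.

        import heapq
        from collections import defaultdict

        def densest_subgraph(edges):
            # Greedy peeling (Charikar's 2-approximation): repeatedly remove a
            # minimum-degree node, remembering the intermediate subgraph with
            # the highest average degree 2|E|/|V|.
            adj = defaultdict(set)
            for u, v in edges:
                adj[u].add(v)
                adj[v].add(u)
            nodes = set(adj)
            m = sum(len(s) for s in adj.values()) // 2
            best_density = 2 * m / len(nodes) if nodes else 0.0
            best_nodes = set(nodes)
            heap = [(len(adj[u]), u) for u in nodes]
            heapq.heapify(heap)
            while heap:
                d, u = heapq.heappop(heap)
                if u not in nodes or d != len(adj[u]):
                    continue  # stale heap entry
                nodes.discard(u)
                m -= len(adj[u])
                for v in adj[u]:
                    adj[v].discard(u)
                    heapq.heappush(heap, (len(adj[v]), v))
                del adj[u]
                if nodes and 2 * m / len(nodes) > best_density:
                    best_density, best_nodes = 2 * m / len(nodes), set(nodes)
            return best_nodes, best_density

        def naive_top_k(edges, k):
            # The impractical baseline: extract the densest subgraph, delete
            # its nodes, repeat k times; the whole graph is reprocessed each
            # round, which is exactly the cost the paper avoids.
            edges, out = list(edges), []
            for _ in range(k):
                if not edges:
                    break
                sub, density = densest_subgraph(edges)
                out.append((sub, density))
                edges = [(u, v) for u, v in edges if u not in sub and v not in sub]
            return out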

    Discriminant analysis of multivariate time series using wavelets

    In analyzing ECG data, the main aim is to differentiate between the signal patterns of healthy subjects and those of individuals with specific heart conditions. We propose an approach for classifying multivariate ECG signals based on discriminant and wavelet analyses. For this purpose we use multiple-scale wavelet variances and wavelet correlations to distinguish between the patterns of multivariate ECG signals, based on the variability of the individual components of each ECG signal and the relationships between every pair of these components. Using the results of other ECG classification studies in the literature as references, we demonstrate that our approach, applied to 12-lead ECG signals from a particular database, displays quite favourable performance. We also demonstrate with real and synthetic ECG data that our approach to classifying multivariate time series outperforms other well-known approaches for classifying multivariate time series. In simulation studies using multivariate time series whose patterns differ from those of the ECG signals, we also demonstrate the very favourable performance of this approach compared to these other approaches.
    Keywords: time series, wavelet variances, wavelet correlations, discriminant analysis
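
    The feature construction lends itself to a compact sketch: per scale, compute the variance of each channel's wavelet detail coefficients and the correlation of every channel pair, then feed the resulting feature vectors to a linear discriminant classifier. The Python below (PyWavelets plus scikit-learn) is a minimal illustration under our own simplifying assumptions; in particular it uses the standard DWT, whereas wavelet-variance work of this kind often relies on the MODWT, and all names are ours.

        import numpy as np
        import pywt
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        def wavelet_features(signals, wavelet="db4", level=4):
            # signals: array of shape (n_channels, n_samples), one recording.
            # Returns per-scale wavelet variances for every channel and
            # per-scale wavelet correlations for every channel pair.
            coeffs = [pywt.wavedec(ch, wavelet, level=level) for ch in signals]
            feats = []
            for j in range(1, level + 1):          # index 0 is the approximation
                details = [c[j] for c in coeffs]   # detail coefficients, one scale
                feats.extend(np.var(d) for d in details)
                for a in range(len(details)):
                    for b in range(a + 1, len(details)):
                        feats.append(np.corrcoef(details[a], details[b])[0, 1])
            return np.asarray(feats)

        # Hypothetical usage: recordings is a list of (n_leads, n_samples)
        # arrays and labels gives each recording's class (healthy vs. a
        # specific condition).
        #   X = np.stack([wavelet_features(r) for r in recordings])
        #   clf = LinearDiscriminantAnalysis().fit(X, labels)
        #   predictions = clf.predict(X)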

    Model-Based Mitigation of Availability Risks

    The assessment and mitigation of risks related to the availability of the IT infrastructure is becoming increasingly important in modern organizations. Unfortunately, present standards for Risk Assessment and Mitigation show limitations when evaluating and mitigating availability risks, because they do not fully consider the dependencies between the constituents of an IT infrastructure, which are paramount in large enterprises. These dependencies make the technical problem of assessing availability issues very challenging. In this paper we define a method and a tool for carrying out a Risk Mitigation activity that makes it possible to assess the global impact of a set of risks and to choose the best set of countermeasures to cope with them. Given the high complexity of the assessment problem, tool support is essential. Our approach can be integrated into present Risk Management methodologies (e.g. COBIT) to provide a more precise Risk Mitigation activity. We substantiate the viability of this approach by showing that most of the input required by the tool is available as part of a standard business continuity plan, and/or can be obtained by performing a common tool-assisted Risk Management activity.
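
    To make the countermeasure-selection step concrete, here is a minimal, hypothetical Python sketch: risks carry an expected availability loss, each countermeasure reduces some risks by a fraction at a cost, and a greedy heuristic picks the set with the best loss reduction per unit cost under a budget. The loss model and every name here are illustrative assumptions on our part; the paper's tool additionally reasons over infrastructure dependencies, which this sketch omits.

        def choose_countermeasures(risks, measures, budget):
            # risks: {risk_id: expected_loss}
            # measures: {name: (cost, {risk_id: reduction_fraction})}
            # Greedily selects measures by loss reduction per unit cost until
            # the budget is exhausted or no measure still helps.
            residual = dict(risks)
            chosen, spent = [], 0.0
            while True:
                best, best_ratio = None, 0.0
                for name, (cost, effects) in measures.items():
                    if name in chosen or spent + cost > budget or cost <= 0:
                        continue
                    saving = sum(residual[r] * f for r, f in effects.items()
                                 if r in residual)
                    if saving / cost > best_ratio:
                        best, best_ratio = name, saving / cost
                if best is None:
                    return chosen, sum(residual.values())  # set + residual loss
                cost, effects = measures[best]
                chosen.append(best)
                spent += cost
                for r, f in effects.items():
                    if r in residual:
                        residual[r] *= (1.0 - f)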

    The End of Slow Networks: It's Time for a Redesign

    Next-generation high-performance RDMA-capable networks will require a fundamental rethinking of the design and architecture of modern distributed DBMSs. These systems are commonly designed and optimized under the assumption that the network is the bottleneck: the network is slow and "thin", and thus needs to be avoided as much as possible. Yet this assumption no longer holds true. With InfiniBand FDR 4x, the bandwidth available to transfer data across the network is in the same ballpark as the bandwidth of one memory channel, and it increases even further with the most recent EDR standard. Moreover, with ongoing advances in RDMA, latency is improving just as fast. In this paper, we first argue that the "old" distributed database design is not capable of taking full advantage of the network. Second, we propose architectural redesigns for OLTP, OLAP and advanced analytical frameworks to take better advantage of the improved bandwidth, latency and RDMA capabilities. Finally, for each of the workload categories, we show that remarkable performance improvements can be achieved.
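
    The bandwidth claim is easy to sanity-check with nominal figures; the numbers below are our own back-of-envelope values, ignoring link encoding and protocol overhead.

        # Nominal signaling rates vs. one DDR3-1600 memory channel (12.8 GB/s).
        fdr_4x_gbit_s = 56       # InfiniBand FDR 4x
        edr_4x_gbit_s = 100      # InfiniBand EDR 4x
        print(fdr_4x_gbit_s / 8)   # 7.0 GB/s  -- same ballpark as one channel
        print(edr_4x_gbit_s / 8)   # 12.5 GB/s -- roughly one full channel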

    An Algorithm for Network and Data-aware Placement of Multi-Tier Applications in Cloud Data Centers

    Today's Cloud applications are dominated by composite applications comprising multiple computing and data components with strong communication correlations among them. Although Cloud providers are deploying large numbers of computing and storage devices to address the ever-increasing demand for computing and storage resources, network resources are emerging as a key performance bottleneck. This paper addresses network-aware placement of virtual components (computing and data) of multi-tier applications in data centers and formally defines the placement as an optimization problem. The simultaneous placement of Virtual Machines and data blocks aims at reducing the network overhead of the data center network infrastructure. A greedy heuristic is proposed for the on-demand placement of application components that localizes network traffic in the data center interconnect. Such optimization helps reduce communication overhead in upper-layer network switches, which in turn reduces the overall traffic volume across the data center. This, in turn, helps reduce packet transmission delay, increase network performance, and minimize the energy consumption of network components. Experimental results demonstrate the performance superiority of the proposed algorithm, which outperforms the state-of-the-art network-aware application placement algorithm across all performance metrics, reducing the average network cost by up to 67% and network usage at core switches by up to 84%, as well as increasing the average number of application deployments by up to 18%.
    Comment: Submitted for publication consideration to the Journal of Network and Computer Applications (JNCA). 28 pages, 15 figures.
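
    A minimal greedy sketch conveys the flavour of such traffic-localizing placement: components (VMs or data blocks) are placed heaviest-communicator first, each on the host that keeps the most of its traffic on the same host or within the same rack. The two-level topology model, the cross-rack penalty factor of 3, and all names are our illustrative assumptions, not the paper's exact heuristic.

        def place(components, traffic, hosts, rack_of, capacity):
            # components: component ids; traffic: {(a, b): volume}, symmetric
            # hosts: host ids; rack_of: {host: rack}; capacity: {host: slots}
            # Assumes aggregate capacity suffices for all components.
            vol = {c: sum(v for (a, b), v in traffic.items() if c in (a, b))
                   for c in components}
            placement = {}
            for c in sorted(components, key=lambda c: -vol[c]):
                best_host, best_cost = None, float("inf")
                for h in hosts:
                    if capacity[h] == 0:
                        continue
                    cost = 0.0
                    for (a, b), v in traffic.items():
                        other = b if a == c else (a if b == c else None)
                        if other in placement:
                            o = placement[other]
                            if o == h:
                                continue  # co-located: no network cost
                            # intra-rack traffic is cheap; crossing the core
                            # switches carries an illustrative 3x penalty
                            cost += v if rack_of[o] == rack_of[h] else 3 * v
                    if cost < best_cost:
                        best_host, best_cost = h, cost
                placement[c] = best_host
                capacity[best_host] -= 1
            return placement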