Middleware-based Database Replication: The Gaps between Theory and Practice

Authors: Ailamaki, Anastasia; Candea, George; Cecchet, Emmanuel
Publication date: 01/01/2008

The need for high availability and performance in data management systems has been fueling a long-running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. Over time, this has created a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. This way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other.

Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, June 200

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

H2O: An Autonomic, Resource-Aware Distributed Database System

Authors: Dearle, Alan; Kirby, Graham; Macdonald, Angus
Publication date: 16/06/2010

This paper presents the design of an autonomic, resource-aware distributed database which enables data to be backed up and shared without complex manual administration. The database, H2O, is designed to make use of unused resources on workstation machines. Creating and maintaining highly available, replicated database systems can be difficult for untrained users, and costly for IT departments. H2O reduces the need for manual administration by autonomically replicating data and load-balancing across machines in an enterprise. Provisioning hardware to run a database system can be unnecessarily costly, as most organizations already possess large quantities of idle resources in workstation machines. H2O is designed to utilize this unused capacity by using resource availability information to place data and plan queries over workstation machines that are already being used for other tasks. This paper discusses the requirements for such a system and presents the design and implementation of H2O.

Comment: Presented at SICSA PhD Conference 2010 (http://www.sicsaconf.org/

arXiv.org e-Print Archive

University of St. Andrews - Pure

St Andrews Research Repository

A platform for discovering and sharing confidential ballistic crime data.

Authors: Akhgar, Babak; Bates, Christopher; Jopek, Lukasz; Wilson, Richard; Yates, Simeon
Publication venue: Inderscience Publishers
Publication date: 01/01/2011

Criminal investigations generate large volumes of complex data that detectives have to analyse and understand. This data tends to be "siloed" within individual jurisdictions, and re-using it in other investigations can be difficult. Investigations into trans-national crimes are hampered by the problem of discovering relevant data held by agencies in other countries and of sharing those data. Gun crimes are one major type of incident that showcases this: guns are easily moved across borders and used in multiple crimes, but finding that a weapon was used elsewhere in Europe is difficult. In this paper we report on the Odyssey Project, an EU-funded initiative to mine, manipulate and share data about weapons and crimes. The project demonstrates the automatic combining of data from disparate repositories for cross-correlation and automated analysis. The data arrive from different cultural/domains with multiple reference models using real-time data feeds and historical databases.

Crossref

Sheffield Hallam University Research Archive

On a Catalogue of Metrics for Evaluating Commercial Cloud Services

Authors: Cai, Rainbow; Li, Zheng; O'Brien, Liam; Zhang, He
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 08/02/2013

Given the continually increasing number of commercial Cloud services in the market, evaluation of different services plays a significant role in cost-benefit analysis or decision making for choosing Cloud Computing. In particular, employing suitable metrics is essential in evaluation implementations. However, to the best of our knowledge, there has been no systematic discussion of metrics for evaluating Cloud services. Using the method of Systematic Literature Review (SLR), we have collected the de facto metrics adopted in existing Cloud services evaluation work. The collected metrics were arranged according to the Cloud service features to be evaluated, which essentially constructed an evaluation metrics catalogue, as shown in this paper. This metrics catalogue can be used to facilitate future practice and research in the area of Cloud services evaluation. Moreover, considering that metrics selection is a prerequisite of benchmark selection in evaluation implementations, this work also supplements the existing research in benchmarking commercial Cloud services.

Comment: 10 pages, Proceedings of the 13th ACM/IEEE International Conference on Grid Computing (Grid 2012), pp. 164-173, Beijing, China, September 20-23, 201

arXiv.org e-Print Archive

Crossref

Recommended from our members

Net solar generation potential from urban rooftops in Los Angeles

Authors: Cheng, D.; Federico, F.; Fournier, E.; Gustafson, H.; Hirashiki, C.; Pincetl, S.; Porse, E.
Publication venue: eScholarship, University of California
Publication date: 01/07/2020

Rooftops provide accessible locations for solar energy installations. While rooftop solar arrays can offset in-building electricity needs, they may also stress electric grid operations. Here we present an analysis of net electricity generation potential from distributed rooftop solar in Los Angeles. We integrate spatial and temporal data for property-level electricity demands, rooftop solar generation potential, and grid capacity constraints to estimate the potential for solar to meet on-site demands and supply net exports to the electric grid. In the study area with 1.2 million parcels, rooftop solar could meet 7,200 gigawatt-hours (GWh) of on-site building demands (~29% of demand). Overall potential net generation is negative, meaning buildings use more electricity than they can produce. Yet, cumulative net export potential from solar to grid circuits is 16,400 GWh. Current policies that regulate solar array interconnection to the grid result in unutilized solar power output of 1,700 MW. Lower-income and at-risk communities in LA have greater potential for exporting net solar generation to the grid. This potential should be recognized through investments and policy innovations. The method demonstrates the need for considering time-dependent calculations of net solar potential and offers a template for distributed renewable energy planning in cities.

eScholarship - University of California

Processing real-time transactions in a replicated database system

Author: Ulusoy O.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/1994

A database system supporting a real-time application has to provide real-time information to the executing transactions. Each real-time transaction is associated with a timing constraint, typically in the form of a deadline. It is difficult to satisfy all timing constraints due to the consistency requirements of the underlying database. The aim in scheduling is to process as many transactions as possible within their deadlines. Replicated database systems possess desirable features for real-time applications, such as a high level of data availability and potentially improved response time for queries. On the other hand, multiple-copy updates lead to considerable overhead due to the communication required among the data sites holding the copies. In this paper, we investigate the impact of storing multiple copies of data on satisfying the timing constraints of real-time transactions. A detailed performance model of a distributed database system is employed in evaluating the effects of various workload parameters and design alternatives on system performance. Performance is expressed in terms of the fraction of satisfied transaction deadlines. The performance experiments also compare several real-time concurrency control protocols, which take different approaches to incorporating the timing constraints of transactions into scheduling. © 1994 Kluwer Academic Publishers.

Bilkent University Institutional Repository