158,064 research outputs found
Carbon Capture Clustering: the Case for Coordinated Approaches to Address Freshwater Use Concerns
Carbon capture and storage (CCS) will be a key technology for reducing emissions from fossil-fuelled electricity
generation. The UK is developing demonstration plants and UK Government strategy proposes the clustering of
CCS facilities, having identified significant cost-savings from shared pipeline infrastructure. However, cooling water
use by CCS power plants is almost double that of conventional plants. There are concerns about the volumes
of freshwater used and vulnerability to low river flows, particularly in areas identified for CCS clusters. Two innovative
approaches may reduce water use in CCS clusters by exploiting synergies with other infrastructures: district heating
and municipal wastewater. Our analysis indicates that cooling water reductions from district heating may be feasible
in the northwest, but less likely in Yorkshire. We also find that across the UK there are numerous, sufficiently large
wastewater treatment plants capable of providing alternative cooling water sources for large power plants. Feasibility
of these promising options will be highly context-dependent, will require detailed analysis, and may face economic and
regulatory barriers. Historically, ad-hoc development of energy infrastructure has struggled to exploit such synergies,
but this may now be facilitated by the clustering of CCS facilities.
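A rough, purely illustrative sketch of the scale of the issue: the roughly 2x multiplier reflects the abstract's "almost double"; the plant size and water intensity are hypothetical assumptions, not figures from the paper.

```python
# Illustrative only: rough cooling-water demand with and without CCS.
# The ~2x multiplier reflects the abstract's "almost double"; the plant
# size and water intensity below are hypothetical assumptions.

def cooling_water_demand(net_output_mw, intensity_m3_per_mwh, ccs=False,
                         ccs_multiplier=2.0):
    """Hourly freshwater withdrawal in m^3 for a plant at full output."""
    demand = net_output_mw * intensity_m3_per_mwh
    return demand * ccs_multiplier if ccs else demand

# Hypothetical 800 MW plant with an assumed intensity of 2.0 m^3/MWh.
base = cooling_water_demand(800, 2.0)
with_ccs = cooling_water_demand(800, 2.0, ccs=True)
print(f"without CCS: {base:,.0f} m^3/h; with CCS: {with_ccs:,.0f} m^3/h")
```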
A Low Cost Two-Tier Architecture Model For High Availability Clusters Application Load Balancing
This article proposes the design and implementation of a low-cost two-tier
architecture model for high-availability clusters, combined with load-balancing
and shared-storage technology, to achieve the desired scale of a three-tier
architecture for application load balancing, e.g. web servers. The work
proposes a design that physically omits dedicated Network File System (NFS)
server nodes and implements NFS server functionality within the cluster nodes
themselves, through Red Hat Cluster Suite (RHCS) with High Availability (HA)
proxy load-balancing technology. The proposed architecture is beneficial where
a low-cost implementation, in terms of investment in hardware and computing,
is required. The system aims to provide steady service despite the failure of
any system component, such as the network, storage, or applications.
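A minimal sketch of the load-balancing behaviour the proxy tier relies on: health-check the cluster nodes and round-robin requests across the live ones. The hostnames below are hypothetical, and the paper's actual stack is RHCS with HA proxy technology rather than this Python code.

```python
# Sketch of the proxy tier's behaviour: health-check cluster nodes and
# round-robin requests across the live ones. Hostnames are hypothetical;
# the paper's implementation uses RHCS and HA proxy tooling, not this code.
import itertools
import socket

NODES = [("node1.cluster.local", 80),
         ("node2.cluster.local", 80),
         ("node3.cluster.local", 80)]   # hypothetical cluster nodes
_rotation = itertools.cycle(NODES)

def is_alive(host, port, timeout=1.0):
    """Crude TCP health check, standing in for the proxy's backend checks."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_backend():
    """Return the next healthy node, skipping failed ones (HA behaviour)."""
    for _ in range(len(NODES)):
        host, port = next(_rotation)
        if is_alive(host, port):
            return host, port
    raise RuntimeError("no healthy backend available")
```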
The state of SQL-on-Hadoop in the cloud
Managed Hadoop in the cloud, especially SQL-on-Hadoop, has been gaining attention recently. On Platform-as-a-Service (PaaS), analytical services like Hive and Spark come preconfigured for general-purpose use and are ready to run, giving companies quick entry and on-demand deployment of ready SQL-like solutions for their big data needs. This study evaluates cloud services from an end-user perspective, comparing providers including Microsoft Azure, Amazon Web Services, Google Cloud,
and Rackspace. The study focuses on the performance, readiness, scalability, and cost-effectiveness of the different solutions at entry/test-level cluster sizes. Results are based on over 15,000 Hive queries derived from the industry-standard TPC-H benchmark.
The study is framed within the ALOJA research project, which features an open source benchmarking and analysis platform that has been recently extended to support SQL-on-Hadoop engines.
The ALOJA Project aims to lower the total cost of ownership (TCO) of big data deployments and study their performance characteristics for optimization.
The study benchmarks cloud providers across a diverse range of instance types, using input data scales from 1 GB to 1 TB, in order to survey the popular entry-level PaaS SQL-on-Hadoop solutions and thereby establish a common results base upon which subsequent research in the project can build. Initial results already show the main performance trends for hardware and software configuration and pricing, as well as the similarities and architectural differences of the evaluated PaaS solutions. Whereas some
providers focus on decoupling storage and computing resources while offering network-based elastic storage, others choose to keep the local processing model from Hadoop for high performance, at the cost of reduced flexibility. Results also show the importance of application-level tuning, and how keeping hardware and software stacks up to date can influence performance even more than replicating the on-premises model in the cloud. This work is partially supported by the Microsoft Azure for Research program, the European Research Council (ERC) under
the EU's Horizon 2020 programme (GA 639595), the Spanish Ministry of Education (TIN2015-65316-P), and the Generalitat
de Catalunya (2014-SGR-1051).
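A minimal sketch of the kind of measurement loop behind such a study: timing queries through a DB-API cursor (e.g. a PyHive connection to Hive) and deriving a simple cost figure from an assumed hourly cluster price. The ALOJA harness and the TPC-H query derivations are more involved; this is only an illustration.

```python
# Illustration only: time SQL-on-Hadoop queries via a DB-API cursor and
# convert total runtime into an approximate on-demand cost. The hourly
# price and connection details are hypothetical assumptions.
import time

def time_query(cursor, sql):
    """Execute one query and return its wall-clock runtime in seconds."""
    start = time.perf_counter()
    cursor.execute(sql)
    cursor.fetchall()                    # force full result materialisation
    return time.perf_counter() - start

def benchmark_cost(runtimes_s, price_per_hour):
    """Approximate cost of a run, given an assumed hourly cluster price."""
    return sum(runtimes_s) / 3600.0 * price_per_hour

# Hypothetical usage with a PyHive cursor:
#   from pyhive import hive
#   cursor = hive.connect(host="cluster-master").cursor()
#   runtimes = [time_query(cursor, q) for q in tpch_derived_queries]
#   print(benchmark_cost(runtimes, price_per_hour=4.50))
```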
Improving the scalability of cloud-based resilient database servers
Many now rely on public cloud infrastructure-as-a-service for
database servers, mainly by pushing the limits of existing pooling and
replication software to operate large shared-nothing virtual server clusters.
Yet it is unclear whether this is still the best architectural choice,
particularly when cloud infrastructure provides seamless virtual shared storage
and bills clients on actual disk usage.
This paper addresses this challenge with Resilient Asynchronous Commit
(RAsC), an improvement to a well-known shared-nothing design based
on the assumption that a much larger number of servers is required for
scale than for resilience. We then compare this proposal to other database
server architectures using an analytical model focused on peak throughput,
and conclude that it provides the best performance/cost trade-off while at
the same time addressing a wide range of fault scenarios.
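A hedged sketch, not the paper's actual model: a toy analytical comparison in which peak throughput is per-server capacity minus a replication overhead, illustrating why relaxing synchronous commit (the idea behind asynchronous commit) can raise aggregate throughput. All parameters are hypothetical.

```python
# Toy model: peak throughput when each commit spends a fraction of server
# capacity keeping replicas in sync. Relaxing synchronous commit shrinks
# that fraction. All numbers below are hypothetical.

def peak_throughput(n_servers, per_server_tps, replication_overhead):
    """Aggregate transactions/s; replication_overhead is the fraction of
    per-server capacity consumed by synchronous replication work."""
    return n_servers * per_server_tps * (1.0 - replication_overhead)

sync_tps = peak_throughput(10, 1_000, replication_overhead=0.35)
async_tps = peak_throughput(10, 1_000, replication_overhead=0.05)
print(f"synchronous: {sync_tps:,.0f} tps; asynchronous: {async_tps:,.0f} tps")
```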
GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics
Large scale graph processing is a major research area for Big Data
exploration. Vertex centric programming models like Pregel are gaining traction
due to their simple abstraction that naturally allows for scalable execution on
distributed systems. However, this approach has limitations that cause vertex
centric algorithms to underperform, due to a poor compute-to-communication
ratio and the slow convergence of iterative supersteps. In
this paper we introduce GoFFish, a scalable sub-graph centric framework
co-designed with a distributed persistent graph storage for large scale graph
analytics on commodity clusters. We introduce a sub-graph centric programming
abstraction that combines the scalability of a vertex centric approach with the
flexibility of shared-memory sub-graph computation. We map the Connected
Components, Single Source Shortest Path (SSSP), and PageRank algorithms to this model to illustrate its
flexibility. Further, we empirically analyze GoFFish using several real world
graphs and demonstrate its significant performance improvement, orders of
magnitude in some cases, compared to Apache Giraph, the leading open source
vertex centric implementation.
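A sketch of the sub-graph centric idea in Python (GoFFish itself is a distributed framework co-designed with its own storage layer; the data structures here are hypothetical): each partition runs a shared-memory connected components pass to completion, and supersteps then exchange labels only across cross-partition edges, which is why far fewer iterations are needed than in vertex centric label propagation.

```python
# Sketch of the sub-graph centric model: solve connected components fully
# inside each partition with a shared-memory pass, then use supersteps only
# to reconcile labels across cross-partition edges.

def local_components(adj):
    """Label each vertex of one sub-graph with the smallest vertex id
    reachable locally (plain iterative traversal over the partition)."""
    label = {}
    for start in adj:
        if start in label:
            continue
        stack, comp = [start], {start}
        while stack:
            for w in adj[stack.pop()]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        root = min(comp)
        for v in comp:
            label[v] = root
    return label

def connected_components(partitions, cross_edges, max_supersteps=30):
    """partitions: adjacency dicts, one per sub-graph; cross_edges: (u, v)
    pairs spanning partitions. Whole components collapse locally before any
    communication, so few supersteps are needed."""
    labels = {}
    for adj in partitions:
        labels.update(local_components(adj))
    for _ in range(max_supersteps):
        remap = {}                        # superstep: merge across partitions
        for u, v in cross_edges:
            lo, hi = sorted((labels[u], labels[v]))
            if lo < hi:
                remap[hi] = min(lo, remap.get(hi, lo))
        if not remap:
            break                         # no label changed: converged
        for v, l in labels.items():
            while l in remap:             # chains always move to smaller ids
                l = remap[l]
            labels[v] = l
    return labels
```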
New distributed offline processing scheme at Belle
The offline processing of the data collected by the Belle detector has been
recently upgraded to cope with the excellent performance of the KEKB
accelerator. The 127 fb⁻¹ of data (120 TB on tape) collected between autumn 2003
and summer 2004 has been processed in 2 months, thanks to the high speed and
stability of the new, distributed processing scheme. We present here this new
processing scheme and its performance.
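A quick back-of-envelope check of the quoted rate, using the abstract's own figures and assuming roughly 61 days for the two months:

```python
# Back-of-envelope from the abstract's figures: 120 TB processed in about
# two months (assumed here as 61 days).
tb_total, days = 120, 61
print(f"~{tb_total / days:.1f} TB/day, "
      f"~{tb_total * 1e6 / (days * 86400):.0f} MB/s sustained")
```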
- …