158,064 research outputs found
Carbon Capture Clustering: the Case for Coordinated Approaches to Address Freshwater Use Concerns
Carbon capture and storage (CCS) will be a key technology for reducing emissions from fossil-fuelled electricity
generation. The UK is developing demonstration plants and UK Government strategy proposes the clustering of
CCS facilities, having identified significant cost-savings from shared pipeline infrastructure. However, cooling water
use by CCS power plants is almost double that of conventional plants. There are concerns about the volumes
of freshwater used and vulnerability to low river flows, particularly in areas identified for CCS clusters. Two innovative
approaches may reduce water use in CCS clusters by exploiting synergies with other infrastructures: district heating
and municipal wastewater. Our analysis indicates that cooling water reductions from district heating may be feasible
in the northwest, but less likely in Yorkshire. We also find that across the UK there are numerous, sufficiently large
wastewater treatment plants capable of providing alternative cooling water sources for large power plants. Feasibility
of these promising options will be highly context-dependent, will require detailed analysis, and may face economic and
regulatory barriers. Historically, ad-hoc development of energy infrastructure has struggled to exploit such synergies,
but this may now be facilitated by the clustering of CCS facilities.
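A rough, purely illustrative sketch of the scale of the issue: the roughly 2x multiplier reflects the abstract's "almost double"; the plant size and water intensity are hypothetical assumptions, not figures from the paper.

```python
# Illustrative only: rough cooling-water demand with and without CCS.
# The ~2x multiplier reflects the abstract's "almost double"; the plant
# size and water intensity below are hypothetical assumptions.

def cooling_water_demand(net_output_mw, intensity_m3_per_mwh, ccs=False,
                         ccs_multiplier=2.0):
    """Hourly freshwater withdrawal in m^3 for a plant at full output."""
    demand = net_output_mw * intensity_m3_per_mwh
    return demand * ccs_multiplier if ccs else demand

# Hypothetical 800 MW plant with an assumed intensity of 2.0 m^3/MWh.
base = cooling_water_demand(800, 2.0)
with_ccs = cooling_water_demand(800, 2.0, ccs=True)
print(f"without CCS: {base:,.0f} m^3/h; with CCS: {with_ccs:,.0f} m^3/h")
```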
A Low Cost Two-Tier Architecture Model For High Availability Clusters Application Load Balancing
This article proposes the design and implementation of a low-cost two-tier
architecture model for high-availability clusters, combined with load-balancing
and shared-storage technology, to achieve the desired scale of a three-tier
architecture for application load balancing, e.g. web servers. The work
proposes a design that physically omits dedicated Network File System (NFS)
server nodes and implements NFS server functionality within the cluster nodes
themselves, through Red Hat Cluster Suite (RHCS) with High Availability (HA)
proxy load-balancing technology. The proposed architecture is beneficial where
a low-cost implementation, in terms of investment in hardware and computing,
is required. The system aims to provide steady service despite the failure of
any system component, such as the network, storage, or applications.
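A minimal sketch of the load-balancing behaviour the proxy tier relies on: health-check the cluster nodes and round-robin requests across the live ones. The hostnames below are hypothetical, and the paper's actual stack is RHCS with HA proxy technology rather than this Python code.

```python
# Sketch of the proxy tier's behaviour: health-check cluster nodes and
# round-robin requests across the live ones. Hostnames are hypothetical;
# the paper's implementation uses RHCS and HA proxy tooling, not this code.
import itertools
import socket

NODES = [("node1.cluster.local", 80),
         ("node2.cluster.local", 80),
         ("node3.cluster.local", 80)]   # hypothetical cluster nodes
_rotation = itertools.cycle(NODES)

def is_alive(host, port, timeout=1.0):
    """Crude TCP health check, standing in for the proxy's backend checks."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_backend():
    """Return the next healthy node, skipping failed ones (HA behaviour)."""
    for _ in range(len(NODES)):
        host, port = next(_rotation)
        if is_alive(host, port):
            return host, port
    raise RuntimeError("no healthy backend available")
```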
The state of SQL-on-Hadoop in the cloud
Managed Hadoop in the cloud, especially SQL-on-Hadoop, has been gaining attention recently. On Platform-as-a-Service (PaaS), analytical services like Hive and Spark come preconfigured for general-purpose use and are ready to run, giving companies quick entry and on-demand deployment of ready SQL-like solutions for their big data needs. This study evaluates cloud services from an end-user perspective, comparing providers including Microsoft Azure, Amazon Web Services, Google Cloud,
and Rackspace. The study focuses on the performance, readiness, scalability, and cost-effectiveness of the different solutions at entry/test-level cluster sizes. Results are based on over 15,000 Hive queries derived from the industry-standard TPC-H benchmark.
The study is framed within the ALOJA research project, which features an open source benchmarking and analysis platform that has been recently extended to support SQL-on-Hadoop engines.
The ALOJA Project aims to lower the total cost of ownership (TCO) of big data deployments and study their performance characteristics for optimization.
The study benchmarks cloud providers across a diverse range of instance types, using input data scales from 1 GB to 1 TB, in order to survey the popular entry-level PaaS SQL-on-Hadoop solutions and thereby establish a common results base upon which subsequent research in the project can build. Initial results already show the main performance trends for hardware and software configuration and pricing, as well as the similarities and architectural differences of the evaluated PaaS solutions. Whereas some
providers focus on decoupling storage and computing resources while offering network-based elastic storage, others choose to keep the local processing model from Hadoop for high performance, at the cost of reduced flexibility. Results also show the importance of application-level tuning, and how keeping hardware and software stacks up to date can influence performance even more than replicating the on-premises model in the cloud. This work is partially supported by the Microsoft Azure for Research program, the European Research Council (ERC) under
the EU's Horizon 2020 programme (GA 639595), the Spanish Ministry of Education (TIN2015-65316-P), and the Generalitat
de Catalunya (2014-SGR-1051).
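A minimal sketch of the kind of measurement loop behind such a study: timing queries through a DB-API cursor (e.g. a PyHive connection to Hive) and deriving a simple cost figure from an assumed hourly cluster price. The ALOJA harness and the TPC-H query derivations are more involved; this is only an illustration.

```python
# Illustration only: time SQL-on-Hadoop queries via a DB-API cursor and
# convert total runtime into an approximate on-demand cost. The hourly
# price and connection details are hypothetical assumptions.
import time

def time_query(cursor, sql):
    """Execute one query and return its wall-clock runtime in seconds."""
    start = time.perf_counter()
    cursor.execute(sql)
    cursor.fetchall()                    # force full result materialisation
    return time.perf_counter() - start

def benchmark_cost(runtimes_s, price_per_hour):
    """Approximate cost of a run, given an assumed hourly cluster price."""
    return sum(runtimes_s) / 3600.0 * price_per_hour

# Hypothetical usage with a PyHive cursor:
#   from pyhive import hive
#   cursor = hive.connect(host="cluster-master").cursor()
#   runtimes = [time_query(cursor, q) for q in tpch_derived_queries]
#   print(benchmark_cost(runtimes, price_per_hour=4.50))
```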
Improving the scalability of cloud-based resilient database servers
Many now rely on public cloud infrastructure-as-a-service for
database servers, mainly by pushing the limits of existing pooling and
replication software to operate large shared-nothing virtual server clusters.
Yet it is unclear whether this is still the best architectural choice,
particularly when cloud infrastructure provides seamless virtual shared storage
and bills clients on actual disk usage.
This paper addresses this challenge with Resilient Asynchronous Commit
(RAsC), an improvement to a well-known shared-nothing design based
on the assumption that a much larger number of servers is required for
scale than for resilience. We then compare this proposal to other database
server architectures using an analytical model focused on peak throughput,
and conclude that it provides the best performance/cost trade-off while at
the same time addressing a wide range of fault scenarios.
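A hedged sketch, not the paper's actual model: a toy analytical comparison in which peak throughput is per-server capacity minus a replication overhead, illustrating why relaxing synchronous commit (the idea behind asynchronous commit) can raise aggregate throughput. All parameters are hypothetical.

```python
# Toy model: peak throughput when each commit spends a fraction of server
# capacity keeping replicas in sync. Relaxing synchronous commit shrinks
# that fraction. All numbers below are hypothetical.

def peak_throughput(n_servers, per_server_tps, replication_overhead):
    """Aggregate transactions/s; replication_overhead is the fraction of
    per-server capacity consumed by synchronous replication work."""
    return n_servers * per_server_tps * (1.0 - replication_overhead)

sync_tps = peak_throughput(10, 1_000, replication_overhead=0.35)
async_tps = peak_throughput(10, 1_000, replication_overhead=0.05)
print(f"synchronous: {sync_tps:,.0f} tps; asynchronous: {async_tps:,.0f} tps")
```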
GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics
Large scale graph processing is a major research area for Big Data
exploration. Vertex centric programming models like Pregel are gaining traction
due to their simple abstraction that naturally allows for scalable execution on
distributed systems. However, this approach has limitations that cause vertex
centric algorithms to underperform, due to a poor compute-to-communication
ratio and the slow convergence of iterative supersteps. In
this paper we introduce GoFFish, a scalable sub-graph centric framework
co-designed with a distributed persistent graph storage for large scale graph
analytics on commodity clusters. We introduce a sub-graph centric programming
abstraction that combines the scalability of a vertex centric approach with the
flexibility of shared-memory sub-graph computation. We map the Connected
Components, Single Source Shortest Path (SSSP), and PageRank algorithms to this model to illustrate its
flexibility. Further, we empirically analyze GoFFish using several real world
graphs and demonstrate its significant performance improvement, orders of
magnitude in some cases, compared to Apache Giraph, the leading open source
vertex centric implementation.
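A sketch of the sub-graph centric idea in Python (GoFFish itself is a distributed framework co-designed with its own storage layer; the data structures here are hypothetical): each partition runs a shared-memory connected components pass to completion, and supersteps then exchange labels only across cross-partition edges, which is why far fewer iterations are needed than in vertex centric label propagation.

```python
# Sketch of the sub-graph centric model: solve connected components fully
# inside each partition with a shared-memory pass, then use supersteps only
# to reconcile labels across cross-partition edges.

def local_components(adj):
    """Label each vertex of one sub-graph with the smallest vertex id
    reachable locally (plain iterative traversal over the partition)."""
    label = {}
    for start in adj:
        if start in label:
            continue
        stack, comp = [start], {start}
        while stack:
            for w in adj[stack.pop()]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        root = min(comp)
        for v in comp:
            label[v] = root
    return label

def connected_components(partitions, cross_edges, max_supersteps=30):
    """partitions: adjacency dicts, one per sub-graph; cross_edges: (u, v)
    pairs spanning partitions. Whole components collapse locally before any
    communication, so few supersteps are needed."""
    labels = {}
    for adj in partitions:
        labels.update(local_components(adj))
    for _ in range(max_supersteps):
        remap = {}                        # superstep: merge across partitions
        for u, v in cross_edges:
            lo, hi = sorted((labels[u], labels[v]))
            if lo < hi:
                remap[hi] = min(lo, remap.get(hi, lo))
        if not remap:
            break                         # no label changed: converged
        for v, l in labels.items():
            while l in remap:             # chains always move to smaller ids
                l = remap[l]
            labels[v] = l
    return labels
```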
New distributed offline processing scheme at Belle
The offline processing of the data collected by the Belle detector has been
recently upgraded to cope with the excellent performance of the KEKB
accelerator. The 127 fb⁻¹ of data (120 TB on tape) collected between autumn 2003
and summer 2004 has been processed in 2 months, thanks to the high speed and
stability of the new, distributed processing scheme. We present here this new
processing scheme and its performance.
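A quick back-of-envelope check of the quoted rate, using the abstract's own figures and assuming roughly 61 days for the two months:

```python
# Back-of-envelope from the abstract's figures: 120 TB processed in about
# two months (assumed here as 61 days).
tb_total, days = 120, 61
print(f"~{tb_total / days:.1f} TB/day, "
      f"~{tb_total * 1e6 / (days * 86400):.0f} MB/s sustained")
```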
- …