Middleware-based Database Replication: The Gaps between Theory and Practice
The need for high availability and performance in data management systems has
been fueling a long running interest in database replication from both academia
and industry. However, academic groups often attack replication problems in
isolation, overlooking the need for completeness in their solutions, while
commercial teams take a holistic approach that often misses opportunities for
fundamental innovation. This has created over time a gap between academic
research and industrial practice.
This paper aims to characterize the gap along three axes: performance,
availability, and administration. We build on our own experience developing and
deploying replication systems in commercial and academic settings, as well as
on a large body of prior related work. We sift through representative examples
from the last decade of open-source, academic, and commercial database
replication systems and combine this material with case studies from real
systems deployed at Fortune 500 customers. We propose two agendas, one for
academic research and one for industrial R&D, which we believe can bridge the
gap within 5-10 years. This way, we hope to both motivate and help researchers
in making the theory and practice of middleware-based database replication more
relevant to each other.
Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, June 200
Carbon Free Boston: Waste Technical Report
Part of a series of reports that includes:
Carbon Free Boston: Summary Report;
Carbon Free Boston: Social Equity Report;
Carbon Free Boston: Technical Summary;
Carbon Free Boston: Buildings Technical Report;
Carbon Free Boston: Transportation Technical Report;
Carbon Free Boston: Energy Technical Report;
Carbon Free Boston: Offsets Technical Report;
Available at http://sites.bu.edu/cfb/
OVERVIEW:
For many people, their most perceptible interaction with their environmental footprint is through the
waste that they generate. On a daily basis people have numerous opportunities to decide whether to
recycle, compost, or throw away. In many cases, such options may not be present or apparent. Even
when such options are available, many lack the knowledge of how to correctly dispose of their waste,
leading to contamination of valuable recycling or compost streams. Once collected, people give little
thought to how their waste is treated. For Boston’s waste, plastic in the disposal stream becomes a
fossil fuel used to generate electricity. Organics in the waste stream have the potential to be used to
generate valuable renewable energy, while metals and electronics can be recycled to offset virgin
materials. However, challenges in global recycling markets are burdening municipalities, which face
higher costs to maintain their recycling programs.
The disposal of solid waste and wastewater both account for a large and visible anthropogenic impact
on human health and the environment. In terms of climate change, landfilling of solid waste and
wastewater treatment generated emissions of 131.5 Mt CO2e in 2016, or about two percent of total
United States GHG emissions that year. The combustion of solid waste contributed an additional 11.0 Mt
CO2e, over half of which (5.9 Mt CO2e) is attributable to the combustion of plastic [1]. In Massachusetts,
the GHG emissions from landfills (0.4 Mt CO2e), waste combustion (1.2 Mt CO2e), and wastewater (0.5
Mt CO2e) accounted for about 2.7 percent of the state’s gross GHG emissions in 2014 [2].
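The Massachusetts figures above can be checked with simple arithmetic; the short sketch below sums the three waste-sector sources and uses the stated 2.7 percent share to infer the implied gross state emissions (the inferred gross total is a derived estimate, not a number from the report):

```python
# Back-of-the-envelope check of the Massachusetts waste-sector figures (2014).
landfill = 0.4      # Mt CO2e from landfills
combustion = 1.2    # Mt CO2e from waste combustion
wastewater = 0.5    # Mt CO2e from wastewater

waste_total = landfill + combustion + wastewater  # waste-sector total, Mt CO2e

share = 0.027  # "about 2.7 percent" of the state's gross GHG emissions
implied_gross = waste_total / share  # implied gross state emissions, Mt CO2e

print(f"waste sector: {waste_total:.1f} Mt CO2e")
print(f"implied gross state emissions: about {implied_gross:.0f} Mt CO2e")
```

The three sources sum to 2.1 Mt CO2e, which at a 2.7 percent share implies gross state emissions of roughly 78 Mt CO2e, a figure consistent with the "about 2.7 percent" rounding in the text.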
The City of Boston has begun exploring pathways to Zero Waste, a goal that seeks to systematically
redesign the city's waste management system in a way that simultaneously yields a drastic reduction
in waste-related emissions. The easiest way to achieve zero waste is to not generate it in the first place. This can start at
the source with the decision whether or not to consume a product. This is the intent behind banning
disposable items such as plastic bags that have more sustainable substitutes. When consumption occurs,
products must be designed in such a way that their lifecycle impacts and waste footprint are considered.
This includes making durable products, limiting the use of packaging or using organic packaging
materials, taking back goods at the end of their life, and designing products to ensure compatibility with
recycling systems. When waste generation is unavoidable, efforts to increase recycling and organics
diversion become essential for achieving zero waste. [TRUNCATED]
Published version
ATP: a Datacenter Approximate Transmission Protocol
Many datacenter applications such as machine learning and streaming systems
do not need the complete set of data to perform their computation. Current
approximate applications in datacenters run on a reliable network layer like
TCP. To improve performance, they either let the sender select a subset of the data and
transmit it to the receiver, or transmit all the data and let the receiver drop
some of it. These approaches are network-oblivious and unnecessarily transmit
more data, affecting both application runtime and network bandwidth usage. On
the other hand, running approximate application on a lossy network with UDP
cannot guarantee the accuracy of application computation. We propose to run
approximate applications on a lossy network and to allow packet loss in a
controlled manner. Specifically, we designed a new network protocol called
Approximate Transmission Protocol, or ATP, for datacenter approximate
applications. ATP opportunistically exploits available network bandwidth as
much as possible, while using a loss-based rate-control algorithm to avoid
bandwidth waste and re-transmission. It also ensures bandwidth fair sharing
across flows and improves accurate applications' performance by leaving more
switch buffer space to accurate flows. We evaluated ATP with both simulation
and real implementation using two macro-benchmarks and two real applications,
Apache Kafka and Flink. Our evaluation results show that ATP reduces
application runtime by 13.9% to 74.6% compared to a TCP-based solution that
drops packets at the sender, and it improves accuracy by up to 94.0% compared to
UDP.
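The abstract does not spell out ATP's rate-control algorithm; as a rough illustration of the loss-based idea it describes (probe for bandwidth while measured loss stays within the application's tolerance, back off when it does not), consider the following sketch. All class and parameter names here are illustrative assumptions, not ATP's actual design:

```python
# Hypothetical loss-based rate controller in the spirit of the ATP description:
# additively probe for more bandwidth while the measured loss fraction stays
# within the approximate application's loss budget, and back off
# multiplicatively once loss exceeds it, to avoid wasting bandwidth.
class LossBasedRateControl:
    def __init__(self, rate_mbps=100.0, loss_budget=0.05,
                 additive_step=10.0, backoff=0.5, max_rate=1000.0):
        self.rate = rate_mbps          # current sending rate (Mbps)
        self.loss_budget = loss_budget # loss fraction the app can tolerate
        self.additive_step = additive_step
        self.backoff = backoff
        self.max_rate = max_rate

    def on_feedback(self, sent, lost):
        """Update the rate from one feedback interval's (sent, lost) counts."""
        loss = lost / sent if sent else 0.0
        if loss > self.loss_budget:
            # Too many packets lost: multiplicative backoff.
            self.rate *= self.backoff
        else:
            # Within budget: opportunistically probe for more bandwidth.
            self.rate = min(self.rate + self.additive_step, self.max_rate)
        return self.rate
```

For example, a feedback interval with 1% loss raises a 100 Mbps rate to 110 Mbps, while a subsequent interval with 10% loss (above the assumed 5% budget) halves it to 55 Mbps.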
Incremental Consistency Guarantees for Replicated Objects
Programming with replicated objects is difficult. Developers must face the
fundamental trade-off between consistency and performance head on, while
struggling with the complexity of distributed storage stacks. We introduce
Correctables, a novel abstraction that hides most of this complexity, allowing
developers to focus on the task of balancing consistency and performance. To
aid developers with this task, Correctables provide incremental consistency
guarantees, which capture successive refinements on the result of an ongoing
operation on a replicated object. In short, applications receive both a
preliminary---fast, possibly inconsistent---result, as well as a
final---consistent---result that arrives later.
We show how to leverage incremental consistency guarantees by speculating on
preliminary values, trading throughput and bandwidth for improved latency. We
experiment with two popular storage systems (Cassandra and ZooKeeper) and three
applications: a Twissandra-based microblogging service, an ad serving system,
and a ticket selling system. Our evaluation on the Amazon EC2 platform with
YCSB workloads A, B, and C shows that we can reduce the latency of strongly
consistent operations by up to 40% (from 100ms to 60ms) at little cost (10%
bandwidth increase, 6% throughput drop) in the ad system. Even if the
preliminary result is frequently inconsistent (25% of accesses), incremental
consistency incurs a bandwidth overhead of only 27%.
Comment: 16 total pages, 12 figures. OSDI'16 (to appear)
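The core idea, one operation that delivers a fast preliminary result and later a final consistent result, can be sketched as follows. The API below is a simplification invented for illustration, not the paper's actual Correctables interface:

```python
# Illustrative sketch of incremental consistency guarantees: a single
# replicated-object operation exposes both a fast, possibly inconsistent
# preliminary value and a later, strongly consistent final value.
import threading

class Correctable:
    def __init__(self):
        self._final_ready = threading.Event()
        self.preliminary = None
        self.final = None

    def set_preliminary(self, value):
        """Deliver the weakly consistent result (arrives fast)."""
        self.preliminary = value

    def set_final(self, value):
        """Deliver the strongly consistent result (arrives later)."""
        self.final = value
        self._final_ready.set()

    def speculate(self):
        """Return the preliminary value so the caller can start work early."""
        return self.preliminary

    def result(self, timeout=None):
        """Block until the final, consistent value is available."""
        self._final_ready.wait(timeout)
        return self.final
```

An application would speculate on `speculate()` to hide latency, then validate (and, if needed, roll back) once `result()` arrives, which is the throughput-and-bandwidth-for-latency trade the abstract describes.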
On the use of a reflective architecture to augment Database Management Systems
The Database Management System (DBMS) used to be a commodity software component, with well-known standard interfaces and semantics. However, the performance and reliability expectations placed on DBMSs have increased the demand for a variety of add-ons that augment the functionality of the database in a wide range of deployment scenarios, offering support for features such as clustering, replication, and self-management, among others. The effectiveness of such extensions largely rests on closely matching the actual needs of applications, hence on a wide range of tradeoffs and configuration options outside the scope of traditional client interfaces. A well-known software engineering approach to systems with such requirements is reflection. Unfortunately, standard reflective interfaces in DBMSs are very limited (for instance, they often do not support the desired range of atomicity guarantees in a distributed setting). Some of these limitations may be circumvented by implementing reflective features as a wrapper to the DBMS server. Unfortunately, this solution comes at the expense of a large development effort and a significant performance penalty. In this paper we propose a general-purpose DBMS reflection architecture and interface that supports multiple extensions while, at the same time, admitting efficient implementations. We illustrate the usefulness of our proposal with concrete examples, and evaluate its cost and performance under different implementation strategies.
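The abstract does not detail the proposed interface; as a rough sketch of the wrapper-style reflection it contrasts against, consider a connection wrapper that lets extensions register hooks around statement execution. Every name here is invented for illustration and is not the paper's architecture:

```python
# Rough illustration of DBMS reflection via a client-side wrapper: extensions
# (e.g. replication or self-management add-ons) register hooks that can
# observe or rewrite statements before execution and inspect results after.
class ReflectiveConnection:
    def __init__(self, conn):
        self._conn = conn    # underlying connection exposing execute(sql)
        self._before = []    # hooks: sql -> possibly rewritten sql
        self._after = []     # hooks: (sql, result) -> None

    def register_before(self, hook):
        self._before.append(hook)

    def register_after(self, hook):
        self._after.append(hook)

    def execute(self, sql):
        for hook in self._before:   # e.g. a replication add-on logs writes
            sql = hook(sql)
        result = self._conn.execute(sql)
        for hook in self._after:    # e.g. a monitoring add-on collects stats
            hook(sql, result)
        return result
```

A wrapper like this intercepts only the client interface, which hints at the abstract's criticism: it cannot see internals such as transaction atomicity decisions, so richer reflection must come from the DBMS itself.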
E-Z Door: Hands-free Front Door Unlocking and Opening Mechanism
The E-Z Door Senior Design project aims to design a hands-free system that unlocks and opens the front door of a home using two-factor security authentication.
Its main goal is to help people with physical limitations take advantage of recent technology, making it significantly easier for them to enter their homes.