82,729 research outputs found
CPL: A Core Language for Cloud Computing -- Technical Report
Running distributed applications in the cloud involves deployment. That is,
distribution and configuration of application services and middleware
infrastructure. The considerable complexity of these tasks resulted in the
emergence of declarative JSON-based domain-specific deployment languages to
develop deployment programs. However, existing deployment programs unsafely
compose artifacts written in different languages, leading to bugs that are hard
to detect before run time. Furthermore, deployment languages do not provide
extension points for custom implementations of existing cloud services such as
application-specific load balancing policies.
To address these shortcomings, we propose CPL (Cloud Platform Language), a
statically-typed core language for programming both distributed applications as
well as their deployment on a cloud platform. In CPL, application services and
deployment programs interact through statically typed, extensible interfaces,
and an application can trigger further deployment at run time. We provide a
formal semantics of CPL and demonstrate that it enables type-safe, composable
and extensible libraries of service combinators, such as load balancing and
fault tolerance.Comment: Technical report accompanying the MODULARITY '16 submissio
Control versus Data Flow in Parallel Database Machines
The execution of a query in a parallel database machine can be controlled in either a control flow way, or in a data flow way. In the former case a single system node controls the entire query execution. In the latter case the processes that execute the query, although possibly running on different nodes of the system, trigger each other. Lately, many database research projects focus on data flow control since it should enhance response times and throughput. The authors study control versus data flow with regard to controlling the execution of database queries. An analytical model is used to compare control and data flow in order to gain insights into the question which mechanism is better under which circumstances. Also, some systems using data flow techniques are described, and the authors investigate to which degree they are really data flow. The results show that for particular types of queries data flow is very attractive, since it reduces the number of control messages and balances these messages over the node
Aspect-Oriented Programming
Aspect-oriented programming is a promising idea that can improve the quality of software by reduce the problem of code tangling and improving the separation of concerns. At ECOOP'97, the first AOP workshop brought together a number of researchers interested in aspect-orientation. At ECOOP'98, during the second AOP workshop the participants reported on progress in some research topics and raised more issues that were further discussed. \ud
\ud
This year, the ideas and concepts of AOP have been spread and adopted more widely, and, accordingly, the workshop received many submissions covering areas from design and application of aspects to design and implementation of aspect languages
The End of Slow Networks: It's Time for a Redesign
Next generation high-performance RDMA-capable networks will require a
fundamental rethinking of the design and architecture of modern distributed
DBMSs. These systems are commonly designed and optimized under the assumption
that the network is the bottleneck: the network is slow and "thin", and thus
needs to be avoided as much as possible. Yet this assumption no longer holds
true. With InfiniBand FDR 4x, the bandwidth available to transfer data across
network is in the same ballpark as the bandwidth of one memory channel, and it
increases even further with the most recent EDR standard. Moreover, with the
increasing advances of RDMA, the latency improves similarly fast. In this
paper, we first argue that the "old" distributed database design is not capable
of taking full advantage of the network. Second, we propose architectural
redesigns for OLTP, OLAP and advanced analytical frameworks to take better
advantage of the improved bandwidth, latency and RDMA capabilities. Finally,
for each of the workload categories, we show that remarkable performance
improvements can be achieved
Optimizing Abstract Abstract Machines
The technique of abstracting abstract machines (AAM) provides a systematic
approach for deriving computable approximations of evaluators that are easily
proved sound. This article contributes a complementary step-by-step process for
subsequently going from a naive analyzer derived under the AAM approach, to an
efficient and correct implementation. The end result of the process is a two to
three order-of-magnitude improvement over the systematically derived analyzer,
making it competitive with hand-optimized implementations that compute
fundamentally less precise results.Comment: Proceedings of the International Conference on Functional Programming
2013 (ICFP 2013). Boston, Massachusetts. September, 201
On the Evaluation of RDF Distribution Algorithms Implemented over Apache Spark
Querying very large RDF data sets in an efficient manner requires a
sophisticated distribution strategy. Several innovative solutions have recently
been proposed for optimizing data distribution with predefined query workloads.
This paper presents an in-depth analysis and experimental comparison of five
representative and complementary distribution approaches. For achieving fair
experimental results, we are using Apache Spark as a common parallel computing
framework by rewriting the concerned algorithms using the Spark API. Spark
provides guarantees in terms of fault tolerance, high availability and
scalability which are essential in such systems. Our different implementations
aim to highlight the fundamental implementation-independent characteristics of
each approach in terms of data preparation, load balancing, data replication
and to some extent to query answering cost and performance. The presented
measures are obtained by testing each system on one synthetic and one
real-world data set over query workloads with differing characteristics and
different partitioning constraints.Comment: 16 pages, 3 figure
The End of a Myth: Distributed Transactions Can Scale
The common wisdom is that distributed transactions do not scale. But what if
distributed transactions could be made scalable using the next generation of
networks and a redesign of distributed databases? There would be no need for
developers anymore to worry about co-partitioning schemes to achieve decent
performance. Application development would become easier as data placement
would no longer determine how scalable an application is. Hardware provisioning
would be simplified as the system administrator can expect a linear scale-out
when adding more machines rather than some complex sub-linear function, which
is highly application specific.
In this paper, we present the design of our novel scalable database system
NAM-DB and show that distributed transactions with the very common Snapshot
Isolation guarantee can indeed scale using the next generation of RDMA-enabled
network technology without any inherent bottlenecks. Our experiments with the
TPC-C benchmark show that our system scales linearly to over 6.5 million
new-order (14.5 million total) distributed transactions per second on 56
machines.Comment: 12 page
- …