Performance models of concurrency control protocols for transaction processing systems
Transaction processing plays a key role in many IT infrastructures. It is widely used in a variety of contexts, ranging from database management systems to concurrent programming tools. Transaction processing systems rely on concurrency control protocols, which allow them to process transactions concurrently while preserving essential properties such as isolation and atomicity. Performance is a critical aspect of transaction processing systems, and it is unavoidably affected by concurrency control. For this reason, methods and techniques to assess and predict the performance of concurrency control protocols are of interest to many IT players, including application designers, developers and system administrators. Analyzing and properly understanding the impact of these protocols on system performance requires quantitative approaches. Analytical modeling is a practical approach for building cost-effective computer system performance models, enabling us to quantitatively describe the complex dynamics characterizing these systems. In this dissertation we present analytical performance models of concurrency control protocols. We deal with both traditional transaction processing systems, such as database management systems, and emerging ones, such as transactional memories. The analysis focuses on widely used protocols, providing detailed performance models and validation studies. In addition, we propose new modeling approaches, which also broaden the scope of our study towards a more realistic, application-oriented performance analysis.
Data Warehouse Design and Management: Theory and Practice
The need to store data and information permanently, for reuse at later stages, is a highly relevant problem in the modern world and now affects a large number of people and economic agents. The storage and subsequent use of data can be a valuable source for decision making or for increasing commercial activity. The next step after data storage is the efficient and effective use of information, particularly through Business Intelligence, which rests on the implementation of a Data Warehouse. In the present paper we analyze Data Warehouses and their theoretical models, and illustrate a practical implementation in a specific case study on a pharmaceutical distribution company.
Keywords: data warehouse, database, data model.
A modular distributed transactional memory framework
Dissertation submitted for the degree of Master in Informatics Engineering
Traditional lock-based concurrency control is complex and error-prone due to its
low-level nature and composability challenges. Software transactional memory (STM), inherited from the database world, has risen as an exciting alternative, sparing the programmer from dealing explicitly with such low-level mechanisms.
In real-world scenarios, software is often faced with requirements such as high availability and scalability, and the solution usually consists of building a distributed system.
Given the benefits of STM over traditional concurrency controls, Distributed Software
Transactional Memory (DSTM) is now being investigated as an attractive alternative for
distributed concurrency control.
Our long-term objective is to transparently enable multithreaded applications to execute
over a DSTM setting. In this work we intend to pave the way by defining a modular
DSTM framework for the Java programming language. We extend an existing, efficient,
STM framework with a new software layer to create a DSTM framework. This new layer
interacts with the local STM using well-defined interfaces, and allows the implementation of different distributed memory models while providing a non-intrusive, familiar programming model to applications, unlike any other DSTM framework.
Using the proposed DSTM framework we have successfully, and easily, implemented
a replicated STM which uses a Certification protocol to commit transactions. An evaluation using common STM benchmarks showcases the efficiency of the replicated STM, and its modularity enables us to provide insight on the relevance of different implementations of the Group Communication System required by the Certification scheme, with respect to performance under different workloads.
Fundação para a Ciência e Tecnologia - project (PTDC/EIA-EIA/113613/2009)
Trusted Computing and Secure Virtualization in Cloud Computing
Large-scale deployment and use of cloud computing in industry
is accompanied and at the same time hampered by concerns regarding protection of
data handled by cloud computing providers. One of the consequences of moving
data processing and storage off company premises is that organizations have
less control over their infrastructure. As a result, cloud service (CS) clients
must trust that the CS provider is able to protect their data and
infrastructure from both external and internal attacks. Currently, however, such
trust can only rely on organizational processes declared by the CS
provider and cannot be remotely verified and validated by an external party.
Enabling the CS client to verify the integrity of the host where the
virtual machine instance will run, as well as to ensure that the virtual
machine image has not been tampered with, are some steps towards building
trust in the CS provider. Having the tools to perform such
verifications prior to the launch of the VM instance allows the CS
clients to decide at runtime whether certain data should be stored, or calculations
should be made on the VM instance offered by the CS provider.
This thesis combines three components -- trusted computing, virtualization technology
and cloud computing platforms -- to address issues of trust and
security in public cloud computing environments. Of the three components,
virtualization technology has had the longest evolution and is a cornerstone
for the realization of cloud computing. Trusted computing is a recent
industry initiative that aims to implement the root of trust in a hardware
component, the trusted platform module. The initiative has been formalized
in a set of specifications and is currently at version 1.2. Cloud computing
platforms pool virtualized computing, storage and network resources in
order to serve a large number of customers, and use a multi-tenant
multiplexing model to offer on-demand self-service over a broad network.
Open source cloud computing platforms are, similar to trusted computing, a
fairly recent technology in active development.
The issue of trust in public cloud environments is addressed
by examining the state of the art within cloud computing security and
subsequently addressing the issues of establishing trust in the launch of a
generic virtual machine in a public cloud environment. As a result, the thesis
proposes a trusted launch protocol that allows CS clients
to verify and ensure the integrity of the VM instance at launch time, as
well as the integrity of the host where the VM instance is launched. The protocol
relies on the use of a Trusted Platform Module (TPM) for key generation and data protection.
The TPM also plays an essential part in the integrity attestation of the
VM instance host. Along with a theoretical, platform-agnostic protocol,
the thesis also describes a detailed implementation design of the protocol
using the OpenStack cloud computing platform.
In order to verify the implementability of the proposed protocol, a prototype
implementation has been built using a distributed deployment of OpenStack.
While the protocol covers only the trusted launch procedure using generic
virtual machine images, it presents a step aimed to contribute towards
the creation of a secure and trusted public cloud computing environment.
A self-healing framework for general software systems
Modern systems must guarantee high reliability, availability, and efficiency. Their complexity, exacerbated by the dynamic integration with other systems, the use of third-party services and the many different environments where they run, challenges development practices, tools and testing techniques. Testing cannot identify and remove all possible faults, thus faulty conditions may escape verification and validation activities and manifest themselves only after system deployment. To cope with those failures, researchers have proposed the concept of self-healing systems. Such systems have the ability to examine their failures and to automatically take corrective actions. The idea is to create software systems that can integrate the knowledge that is needed to compensate for the effects of their imperfections. This knowledge is usually codified into the systems in the form of redundancy. Redundancy can be deliberately added to the systems as part of the design and development process, as happens in many fault tolerance techniques. Although this kind of redundancy is widely applied, especially in safety-critical systems, it is generally too expensive for general-purpose software systems. We have some evidence that modern software systems are characterized by a different type of redundancy, which is not deliberately introduced but is naturally present due to modern modular software design. We call it intrinsic redundancy. This thesis proposes a way to use the intrinsic redundancy of software systems to increase their reliability at a low cost. We first study the nature of the intrinsic redundancy to demonstrate that it actually exists. We then propose a way to express and encode such redundancy and an approach, Java Automatic Workaround, to exploit it automatically and at runtime to avoid system failures.
Fundamentally, the Java Automatic Workaround approach replaces some failing operations with alternative operations that are semantically equivalent in terms of the expected results and the developer’s intent, but that may have some syntactic difference that can ultimately overcome the failure. We qualitatively discuss the reasons for the presence of the intrinsic redundancy and we quantitatively study four large libraries to show that such redundancy is indeed a characteristic of modern software systems. We then develop the approach into a prototype and we evaluate it with four open source applications. Our studies show that the approach effectively exploits the intrinsic redundancy in avoiding failures automatically and at runtime.
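The idea of replacing a failing operation with a semantically equivalent alternative can be illustrated with a minimal sketch. It is written in Python rather than Java for brevity and is not the thesis's implementation: `run_with_workarounds`, `Bag`, and the injected fault in `add_all` are hypothetical names invented for this example, and the sketch assumes the failing operation leaves no partial effects behind.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def run_with_workarounds(primary: Callable[[], T],
                         equivalents: Sequence[Callable[[], T]]) -> T:
    """Run `primary`; if it raises, try each equivalent operation in order."""
    try:
        return primary()
    except Exception as first_error:
        for alternative in equivalents:
            try:
                return alternative()
            except Exception:
                continue  # this alternative failed too, try the next one
        raise first_error  # no equivalent sequence avoided the failure


class Bag:
    """Toy container with two redundant ways of inserting several items."""

    def __init__(self) -> None:
        self.items: List[int] = []

    def add(self, item: int) -> None:
        self.items.append(item)

    def add_all(self, items: Sequence[int]) -> None:
        if len(items) > 2:  # injected fault, for illustration only
            raise RuntimeError("add_all fails on inputs longer than 2")
        self.items.extend(items)


bag = Bag()
data = [1, 2, 3]
# add_all(data) is assumed equivalent to calling add() once per element.
run_with_workarounds(
    lambda: bag.add_all(data),
    [lambda: [bag.add(item) for item in data]],
)
print(bag.items)
```

Here the equivalence between `add_all` and repeated `add` calls plays the role of the intrinsic redundancy: both paths are meant to produce the same result, but only one of them triggers the fault.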
Enhancing the efficiency and practicality of software transactional memory on massively multithreaded systems
Chip Multithreading (CMT) processors promise to deliver higher performance by running more than one stream of instructions in parallel. To exploit CMT's capabilities, programmers have to parallelize their applications, which is not a trivial task. Transactional Memory (TM) is one of the parallel programming models that aim at simplifying synchronization by raising the level of abstraction between semantic atomicity and the means by which that atomicity is achieved. TM is a promising programming model but there are still important challenges that must be addressed to make it more practical and efficient in mainstream parallel programming.
The first challenge addressed in this dissertation is that of making the evaluation of TM proposals more solid with realistic TM benchmarks and being able to run the same benchmarks on different STM systems. We first introduce RMS-TM, a comprehensive benchmark suite to evaluate HTMs and STMs. RMS-TM consists of seven applications from the Recognition, Mining and Synthesis (RMS) domain that are representative of future workloads. RMS-TM features current TM research issues such as nesting and I/O inside transactions, while also providing various TM characteristics. Most STM systems are implemented as user-level libraries: the programmer is expected to manually instrument not only transaction boundaries, but also individual loads and stores within transactions. This library-based approach is increasingly tedious and error-prone and also makes it difficult to make reliable performance comparisons. To enable an "apples-to-apples" performance comparison, we then develop a software layer that allows researchers to test the same applications with interchangeable STM back ends.
The second challenge addressed is that of enhancing performance and scalability of TM applications running on aggressive multi-core/multi-threaded processors. Performance and scalability of current TM designs, in particular STM designs, do not always meet the programmer's expectation, especially at scale. To overcome this limitation, we propose a new STM design, STM2, based on an assisted execution model in which time-consuming TM operations are offloaded to auxiliary threads while application threads optimistically perform computation. Surprisingly, our results show that STM2 provides, on average, speedups between 1.8x and 5.2x over state-of-the-art STM systems. On the other hand, we notice that assisted-execution systems may show low processor utilization. To alleviate this problem and to increase the efficiency of STM2, we enriched STM2 with a runtime mechanism that automatically and adaptively detects the computing demands of application and auxiliary threads and dynamically partitions hardware resources between the pair through the hardware thread prioritization mechanism implemented in POWER machines.
The third challenge is to define a notion of what it means for a TM program to be correctly synchronized. The current definition of transactional data race requires all transactions to be totally ordered "as if" serialized by a global lock, which limits the scalability of TM designs. To remove this constraint, we first propose to relax the current definition of transactional data race to allow a higher level of concurrency. Based on this definition we propose the first practical race detection algorithm for C/C++ applications (TRADE) and implement the corresponding race detection tool. Then, we introduce a new definition of transactional data race that is more intuitive, is transparent to the underlying TM implementation, and can be used for a broad set of C/C++ TM programs. Based on this new definition, we propose T-Rex, an efficient and scalable race detection tool for C/C++ TM applications. Using TRADE and T-Rex, we have discovered subtle transactional data races in widely used STAMP applications which have not been reported in the past.
On Correctness of Data Structures under Reads-Write Concurrency
Abstract. We study the correctness of shared data structures under reads-write concurrency. A popular approach to ensuring correctness of read-only operations in the presence of concurrent updates is read-set validation, which checks that all read variables have not changed since they were first read. In practice, this approach is often too conservative, which adversely affects performance. In this paper, we introduce a new framework for reasoning about correctness of data structures under reads-write concurrency, which replaces validation of the entire read-set with more general criteria. Namely, instead of verifying that all read shared variables still hold the values read from them, we verify abstract conditions over the shared variables, which we call base conditions. We show that reading values that satisfy some base condition at every point in time implies correctness of read-only operations executing in parallel with updates. Somewhat surprisingly, the resulting correctness guarantee is not equivalent to linearizability, and is instead captured through two new conditions: validity and regularity. Roughly speaking, the former requires that a read-only operation never reaches a state unreachable in a sequential execution; the latter generalizes Lamport’s notion of regularity for arbitrary data structures, and is weaker than linearizability. We further extend our framework to capture also linearizability. We illustrate how our framework can be applied for reasoning about correctness of a variety of implementations of data structures such as linked lists.
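For readers unfamiliar with read-set validation, the baseline technique the abstract describes can be sketched as follows. This is a single-threaded simulation under stated assumptions, not code from the paper: `VersionedVar`, `ReadSet`, and `read_only` are hypothetical names, the "concurrent" update is injected by hand inside the operation, and a real implementation would need atomic reads and proper memory ordering.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class VersionedVar:
    """A shared variable with a version counter bumped on every write."""
    value: Any
    version: int = 0

    def write(self, value: Any) -> None:
        self.value = value
        self.version += 1


class ReadSet:
    """Records the version observed at each read of a shared variable."""

    def __init__(self) -> None:
        self._seen: List[Tuple[VersionedVar, int]] = []

    def read(self, var: VersionedVar) -> Any:
        self._seen.append((var, var.version))
        return var.value

    def validate(self) -> bool:
        # Valid only if no variable changed since it was read.
        return all(var.version == seen for var, seen in self._seen)


def read_only(op: Callable[[ReadSet], Any], max_retries: int = 10) -> Any:
    """Run a read-only operation, retrying until its read-set validates."""
    for _ in range(max_retries):
        read_set = ReadSet()
        result = op(read_set)
        if read_set.validate():
            return result
    raise RuntimeError("read-only operation did not validate")


# Updates are assumed to preserve the invariant x + y == 10.
x = VersionedVar(4)
y = VersionedVar(6)
attempts: List[int] = []


def total(read_set: ReadSet) -> int:
    a = read_set.read(x)
    if not attempts:  # simulate an update interleaved with the first attempt
        x.write(1)
        y.write(9)
    attempts.append(1)
    b = read_set.read(y)
    return a + b


result = read_only(total)
print(result, len(attempts))
```

The first attempt mixes an old `x` with a new `y`, so validation fails and the operation is retried. Validation rejects any change to a read variable, including changes that would not affect the result, which is the conservatism the abstract proposes to relax with base conditions.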