Canonical approximation in the performance analysis of distributed systems
The problem of analyzing distributed systems arises in many areas of computer science, such as communication networks, distributed databases, packet radio networks, VLSI communications and switching mechanisms. Analysis of distributed systems is difficult since one must deal with many tightly-interacting components. The number of possible state configurations typically grows exponentially with the system size, making exact analysis intractable even for relatively small systems. For stochastic models of these systems whose steady-state probability distribution has product form, many global performance measures of interest can be computed once the normalization constant of the steady-state distribution is known. This constant, called the system partition function, is typically difficult to derive in closed form. The key difficulty in performance analysis of such models can thus be viewed as deriving a good approximation to the partition function or calculating it numerically. In this Ph.D. work we introduce a new approximation technique to analyze a variety of such models of distributed systems. This technique, which we call the method of Canonical Approximation, is similar to those developed in statistical physics to compute the partition function. The new method gives a closed-form approximation of the partition function and of the global performance measures. It is computationally simple, with complexity independent of the system size; gives an excellent degree of precision for large systems; and is applicable to a wide variety of problems. The method is applied to the analysis of multihop packet radio networks, locking schemes in database systems, closed queueing networks, and interconnection networks.
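For small systems the partition function the thesis approximates can be computed exactly; Buzen's convolution algorithm (a standard technique shown here to illustrate what is being approximated, not the thesis's canonical approximation) is a minimal sketch:

```python
def buzen_G(N, rho):
    """Normalization constants G(0..N) of a closed, product-form queueing
    network with single-server stations, via Buzen's convolution algorithm.
    rho[i] is station i's relative utilization (visit ratio x mean service
    time); both arguments here are illustrative."""
    G = [1.0] + [0.0] * N      # G[0] = 1 by convention
    for r in rho:              # fold stations in one at a time
        for n in range(1, N + 1):
            G[n] += r * G[n - 1]
    return G

# Network throughput at population N follows as X(N) = G[N-1] / G[N].
```

Note the O(N x stations) cost of this exact computation: it grows with the population, which is exactly the scaling that a closed-form approximation with size-independent complexity avoids.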
Fastpass: A Centralized “Zero-Queue” Datacenter Network
An ideal datacenter network should provide several properties, including low median and tail latency, high utilization (throughput), fair allocation of network resources between users or applications, deadline-aware scheduling, and congestion (loss) avoidance. Current datacenter networks inherit the principles that went into the design of the Internet, where packet transmission and path selection decisions are distributed among the endpoints and routers. Instead, we propose that each sender should delegate control—to a centralized arbiter—of when each packet should be transmitted and what path it should follow. This paper describes Fastpass, a datacenter network architecture built using this principle. Fastpass incorporates two fast algorithms: the first determines the time at which each packet should be transmitted, while the second determines the path to use for that packet. In addition, Fastpass uses an efficient protocol between the endpoints and the arbiter, and an arbiter replication strategy for fault-tolerant failover. We deployed and evaluated Fastpass in a portion of Facebook's datacenter network. Our results show that Fastpass achieves throughput comparable to current networks at a 240× reduction in queue lengths (4.35 Mbytes reduced to 18 Kbytes), achieves much fairer and more consistent flow throughputs than baseline TCP (5200× reduction in the standard deviation of per-flow throughput with five concurrent connections), scales from 1 to 8 cores in the arbiter implementation with the ability to schedule 2.21 Terabits/s of traffic in software on eight cores, and achieves a 2.5× reduction in the number of TCP retransmissions in a latency-sensitive service at Facebook. National Science Foundation (U.S.) (grant IIS-1065219); Irwin Mark Jacobs and Joan Klein Jacobs Presidential Fellowship; Hertz Foundation (Fellowship)
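The arbiter's first algorithm, timeslot allocation, amounts to matching senders to receivers so that each endpoint is used at most once per slot. A toy greedy sketch conveys the idea (the real Fastpass allocator is a pipelined maximal matching spread across cores; this is not it):

```python
from collections import deque

def allocate_timeslots(demands):
    """Assign (src, dst) packet demands to timeslots so that, per slot,
    each source sends at most one packet and each destination receives
    at most one. Toy greedy sketch of centralized timeslot allocation."""
    pending = deque(demands)
    timeslots = []
    while pending:
        busy_src, busy_dst = set(), set()
        slot, deferred = [], deque()
        while pending:
            src, dst = pending.popleft()
            if src in busy_src or dst in busy_dst:
                deferred.append((src, dst))   # conflict: retry next slot
            else:
                busy_src.add(src)
                busy_dst.add(dst)
                slot.append((src, dst))
        timeslots.append(slot)
        pending = deferred
    return timeslots
```

Because every admitted packet has a reserved slot at both its source and destination, queues inside the fabric stay near zero, which is the source of the queue-length reductions reported above.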
Performance Problem Diagnostics by Systematic Experimentation
Diagnosing performance problems requires deep expertise in performance engineering and entails high manual effort. As a consequence, performance evaluations are often postponed to the last minute of the development process. In this thesis, we introduce an automatic, experiment-based approach for performance problem diagnostics in enterprise software systems. With this approach, performance engineers can concentrate on their core competences instead of conducting repetitive tasks.
Performance Problem Diagnostics by Systematic Experimentation
In this book, we introduce an automatic, experiment-based approach for performance problem diagnostics in enterprise software systems. The approach systematically searches for the root causes of detected performance problems by executing series of systematic performance tests. It is evaluated in various case studies, which show that it is applicable to a wide range of contexts.
Database replication for enterprise applications
The MAP-i Doctoral Programme in Informatics of the Universities of Minho, Aveiro and Porto.
A common pattern for enterprise applications, particularly in small and medium
businesses, is the reliance on an integrated traditional relational database system
that provides persistence and where the relational aspect underlies the core logic
of the application. While several solutions are proposed for scaling out such
applications, database replication is key if the relational aspect is to be preserved.
However, because proposed solutions for database replication have been
evaluated using simple synthetic benchmarks, their applicability
to enterprise applications is not straightforward: the performance of conservative
solutions hinges on the ability to conveniently partition applications while optimistic
solutions may experience unacceptable abort rates, compromising fairness,
particularly considering long-running transactions.
In this thesis, we address these challenges. First, we perform a detailed
evaluation of the applicability of database replication protocols based on
conservative concurrency control to enterprise applications. The results invalidate the
common assumption that real-world databases can be easily partitioned. Then,
we tackle the issue of unacceptable abort rates in optimistic solutions by proposing
a novel transaction scheduler, AJITTS, whose adaptive mechanism reaches
and maintains the optimal level of concurrency in the system and thereby
minimizes aborts and improves throughput.
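An adaptive scheduler of this kind can be pictured as a hill climber on the concurrency level. The following is a sketch of the general adaptive idea only, under assumed inputs; it does not reproduce the actual AJITTS mechanism:

```python
def adapt_concurrency(level, throughput, prev_throughput, direction,
                      step=1, min_level=1):
    """One step of a hill-climbing controller for the concurrency level:
    keep moving in the current direction while throughput improves and
    reverse when it degrades. Illustrative sketch, not AJITTS itself."""
    if throughput < prev_throughput:
        direction = -direction             # we overshot the optimum
    level = max(min_level, level + direction * step)
    return level, direction
```

Run once per measurement interval, this keeps the admitted concurrency oscillating around the throughput-maximizing point, which is where abort rates in an optimistic scheme stay tolerable.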
Performance Modelling of Database Designs using a Queueing Networks Approach. An investigation into the performance modelling and evaluation of detailed database designs using queueing network models.
Databases form the common component of many software systems, including mission
critical transaction processing systems and multi-tier Internet applications. There is a
large body of research in the performance of database management system components,
but studies of overall database system performance have been limited. Moreover,
performance models specifically targeted at the database design have not been
extensively studied.
This thesis attempts to address this concern by proposing a performance evaluation
method for database designs based on queueing network models. The method is targeted
at designs of large databases in which I/O is the dominant cost factor. The database
design queueing network performance model is suited to providing what-if
comparisons of database designs before database system implementation.
A formal specification that captures the essential database design features while keeping
the performance model sufficiently simple is presented. Furthermore, the simplicity of
the modelling algorithms permits the direct mapping between database design entities
and queueing network models. This affords a more applicable performance model
that provides relevant feedback to database designers and can be straightforwardly
integrated into early database design development phases. The accuracy of the
modelling technique is validated by modelling an open source implementation of the
TPC-C benchmark. The contribution of this thesis is considered to be significant in that the majority of
performance evaluation models for database systems target capacity planning or overall
system properties, with limited work in detailed database transaction processing and
behaviour. In addition, this work is deemed to be an improvement over previous
methodologies in that the transaction is modelled at a finer granularity, and that the
database design queueing network model provides for the explicit representation of
active database rules and referential integrity constraints. Iqra Foundation.
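A minimal instance of the kind of queueing network evaluation such a model builds on is exact Mean Value Analysis for a single-class closed network. The sketch below is standard MVA, not the thesis's finer-grained transaction model; mapping each disk of an I/O-bound design to a station and the demand values are illustrative assumptions:

```python
def mva(N, demands):
    """Exact Mean Value Analysis of a single-class closed queueing network
    of queueing (single-server) stations. demands[i] is station i's
    service demand (visit count x service time); N >= 1 is the number of
    concurrent jobs. Returns (throughput, per-station residence times)."""
    Q = [0.0] * len(demands)                           # mean queue lengths
    for n in range(1, N + 1):
        R = [d * (1 + q) for d, q in zip(demands, Q)]  # residence times
        X = n / sum(R)                                 # system throughput
        Q = [X * r for r in R]                         # Little's law
    return X, R
```

Evaluating candidate designs this way, before implementation, is exactly the what-if comparison the method targets: change the demand vector implied by a design and re-run.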
Dealing with Burstiness in Multi-Tier Applications: Models and Their Parameterization
Abstract—Workloads and resource usage patterns in enterprise applications often show burstiness, resulting in large degradation of the perceived user performance. In this paper, we propose a methodology for detecting burstiness symptoms in multi-tier applications and, rather than identifying the root cause of burstiness, incorporating this information into models for performance prediction. The modeling methodology is based on the index of dispersion of the service process at a server, which is inferred by observing the number of completions within the concatenated busy times of that server. The index of dispersion is used to derive a Markov-modulated process that captures the burstiness and variability of the service process at each resource and that allows us to define queueing network models for performance prediction. Experimental results and performance model predictions are in excellent agreement and argue for the effectiveness of the proposed methodology under both bursty and non-bursty workloads. Furthermore, we show that the methodology extends to modeling flash crowds that create burstiness in the stream of requests incoming to the application. Index Terms—Capacity planning, multi-tier applications, bursty workload, bottleneck switch, index of dispersion.
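The index of dispersion at the heart of the methodology can be estimated directly from count samples. This is a minimal sketch assuming the windowing of completions into the paper's concatenated busy periods has already been done upstream:

```python
def index_of_dispersion(counts):
    """Estimate the index of dispersion I = Var(N) / E[N] from a sample
    of completion counts per observation window. I = 1 for a Poisson
    process; I >> 1 signals a bursty service process."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / n
    return var / mean
```

A single scalar per server is what makes the approach practical: it can be measured with coarse counters, without per-request tracing, and then drives the fitting of the Markov-modulated service process.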
Online Modeling and Tuning of Parallel Stream Processing Systems
Writing performant computer programs is hard. Code for high performance applications is profiled, tweaked, and re-factored for months, specifically for the hardware on which it is to run. Consumer application code doesn't get the benefit of the endless massaging that benefits high performance code, even though heterogeneous processor environments are beginning to resemble those in more performance-oriented arenas. This thesis offers a path to performant, parallel code (through stream processing) which is tuned online and automatically adapts to the environment it is given. This approach has the potential to reduce the tuning costs associated with high performance code and brings the benefit of performance tuning to consumer applications where otherwise it would be cost prohibitive. This thesis introduces a stream processing library and multiple techniques to enable its online modeling and tuning. Stream processing (also termed data-flow programming) is a compute paradigm that views an application as a set of logical kernels connected via communications links or streams. Stream processing is increasingly used by computational-x and x-informatics fields (e.g., biology, astrophysics) where the focus is on safe and fast parallelization of specific big-data applications. A major advantage of stream processing is that it enables parallelization without necessitating manual end-user management of the non-deterministic behavior often characteristic of more traditional parallel processing methods. Many big-data and high performance applications involve high throughput processing, necessitating the use of many parallel compute kernels on several compute cores. Optimizing the orchestration of kernels has been the focus of much theoretical and empirical modeling work. Purely theoretical parallel programming models can fail when the assumptions implicit within the model are mis-matched with reality (i.e., the model is incorrectly applied).
Often it is unclear whether the assumptions are actually being met, even when verified under controlled conditions. Full empirical optimization solves this problem by extensively searching the range of likely configurations under native operating conditions. This, however, is expensive in both time and energy. For large, massively parallel systems, even deciding which modeling paradigm to use is often prohibitively expensive and unfortunately transient (with workload and hardware). In an ideal world, a parallel run-time would re-optimize an application continuously to match its environment, with little additional overhead. This work presents methods aimed at doing just that through low-overhead instrumentation, modeling, and optimization. Online optimization provides a good trade-off between static optimization and online heuristics. To enable online optimization, modeling decisions must be fast and relatively accurate. Online modeling and optimization of a stream processing system first requires the existence of a stream processing framework that is amenable to the intended type of dynamic manipulation. To fill this void, we developed the RaftLib C++ template library, which enables usage of the stream processing paradigm for C++ applications (it is the run-time that is the basis of almost all the work within this dissertation). An application topology is specified by the user; almost everything else is optimizable by the run-time. RaftLib takes advantage of the knowledge gained during the design of several prior streaming languages (notably Auto-Pipe). The resultant framework enables online migration of tasks, auto-parallelization, online buffer-reallocation, and other useful dynamic behaviors that were not available in many previous stream processing systems. Several benchmark applications have been designed to assess the performance gains of our approaches and to compare performance to other leading stream processing frameworks.
Information is essential to any modeling task; to that end, a low-overhead instrumentation framework has been developed which is both dynamic and adaptive. Discovering a fast and relatively optimal configuration for a stream processing application often necessitates solving for buffer sizes within a finite-capacity queueing network. We show that a generalized gain/loss network flow model can bootstrap the process under certain conditions. Any modeling effort requires that a model be selected, often a highly manual task involving many expensive operations. This dissertation demonstrates that machine learning methods (such as a support vector machine) can successfully select models at run-time for a streaming application. The full set of approaches is incorporated into the open source RaftLib framework.
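Buffer sizing in a finite-capacity queueing network can be pictured with a textbook stand-in: pick the smallest M/M/1/K capacity whose blocking probability meets a loss target. This is a back-of-envelope sketch under Markovian assumptions, not the dissertation's gain/loss network-flow model:

```python
def min_buffer_size(rho, target_loss, k_max=10_000):
    """Smallest M/M/1/K buffer capacity K whose blocking probability
    P_K = (1 - rho) * rho**K / (1 - rho**(K + 1))   (assumes rho < 1)
    falls below target_loss. Illustrative only: real streaming kernels
    rarely have Markovian service, which is why measurement-driven
    models are needed."""
    for K in range(1, k_max + 1):
        p_block = (1 - rho) * rho ** K / (1 - rho ** (K + 1))
        if p_block < target_loss:
            return K
    return k_max
```

The shape of the answer matters more than the numbers: required buffer size grows sharply as utilization approaches one, so a run-time that tracks utilization online can reallocate buffers before a link saturates.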
Resource Sharing for Multi-Tenant Nosql Data Store in Cloud
Thesis (Ph.D.) - Indiana University, Informatics and Computing, 2015.
Multi-tenancy hosting of users in cloud NoSQL data stores is favored by cloud providers because it enables resource sharing at low operating cost. Multi-tenancy takes several forms depending on whether the back-end file system is a local file system (LFS) or a parallel file system (PFS), and on whether tenants are independent or share data across tenants. In this thesis I focus on, and propose solutions to, two cases: independent data on a local file system, and shared data on a parallel file system. In the independent data-local file system case, resource contention occurs under certain conditions in Cassandra and HBase, two state-of-the-art NoSQL stores, causing one tenant's performance to be degraded by another. We investigate the interference and propose two approaches. The first provides a scheduling scheme that can approximate resource consumption, adapt to workload dynamics, and work in a distributed fashion. The second introduces a workload-aware resource reservation approach to prevent interference. This approach relies on a performance model obtained offline and plans the reservation according to different workload resource demands. Results show the two approaches together can prevent interference and adapt to dynamic workloads under multi-tenancy. In the shared data-parallel file system case, it has been shown that running a distributed NoSQL store over a PFS for data shared across tenants is not cost effective. Overheads are introduced because the NoSQL store is unaware of the PFS. This dissertation targets the key-value store (KVS), a specific form of NoSQL store, and proposes a lightweight KVS over a parallel file system to improve efficiency. The solution is built on an embedded KVS for high performance but uses novel data structures to support concurrent writes, a capability embedded KVSs are not designed for.
Results show the proposed system outperforms Cassandra and Voldemort under several different workloads.
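The fair-sharing idea behind the first approach can be sketched with virtual-time scheduling: each unit-cost request advances its tenant's virtual clock by 1/weight, so higher-weight tenants are served proportionally more often. This is a toy illustration of proportional-share scheduling in general; the thesis's scheduler additionally approximates per-request resource consumption, which is omitted here, and tenant names and weights are hypothetical:

```python
def schedule_requests(requests, weights):
    """Order tenant requests by virtual finish time, fair-queueing style.
    requests: list of tenant ids in arrival order; weights: tenant -> w.
    Each unit-cost request advances its tenant's virtual clock by 1/w."""
    vclock = {t: 0.0 for t in weights}
    tagged = []
    for seq, tenant in enumerate(requests):
        vclock[tenant] += 1.0 / weights[tenant]    # virtual finish time
        tagged.append((vclock[tenant], seq, tenant))
    tagged.sort()                                  # serve in virtual-time order
    return [tenant for _, _, tenant in tagged]
```

Virtual clocks make the isolation property explicit: a tenant that floods the queue only advances its own clock, so other tenants' requests keep overtaking it in virtual time.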