New Checkpoint and Rollback for High Availability of Mapreduce Computing
MapReduce is a programming model and an associated implementation for processing and generating large data sets, so-called big data. A MapReduce job usually splits the input data set into independent chunks that are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks. If an error occurs in a name node, another name node takes over the failed node and continues its execution. Beyond data-node failure, if an error occurs during program execution itself, there must be detection and recovery steps to correct it. A solution to this problem is to implement a checkpoint and rollback mechanism in the system. When a memory error occurs in a MapReduce program, execution on all the data nodes is stopped and the job restarts from the first phase in Hadoop. The proposed methodology detects the heap space error [10] and provides recovery operations by employing a new checkpoint and recovery process. To realize this, a new phase-based checkpoint and rollback scheme is proposed in place of the default Hadoop configuration. Once an error occurs in Hadoop, the memory size required by the program is raised, the configuration file is modified, a checkpoint is set, and the subsequent phases are executed from there. In this way, phases that have already completed need not be re-executed. Experimental results show that Hadoop availability increases by 53.22% compared to the default Hadoop configuration, thereby decreasing the running time of the application.
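The phase-based recovery the abstract describes can be sketched as a small driver loop: persist a checkpoint after each completed phase, and on a heap-space error raise the memory configuration and resume from the failed phase rather than from the start. This is a minimal illustration, not the paper's implementation; the phase names, checkpoint file format, and `run_phase` stand-in are all assumptions.

```python
import json
import os

# Hypothetical phase pipeline; a real Hadoop job would run map, sort,
# shuffle, and reduce phases with actual work in each.
PHASES = ["map", "sort", "shuffle", "reduce"]

def run_phase(name, state):
    # Stand-in for real phase work; a real phase may raise MemoryError
    # (the heap space error the paper targets).
    return state + [name]

def run_with_checkpoints(checkpoint_path, heap_mb=512):
    """Execute phases in order, persisting a checkpoint after each one,
    so a heap-space failure resumes from the failed phase instead of
    re-executing all completed phases."""
    done, state = 0, []
    # Resume from the last completed phase if a checkpoint exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            saved = json.load(f)
        done, state, heap_mb = saved["done"], saved["state"], saved["heap_mb"]
    for i in range(done, len(PHASES)):
        try:
            state = run_phase(PHASES[i], state)
        except MemoryError:
            # Raise the configured heap size (the "configuration file
            # modification" step) and retry only the failed phase.
            heap_mb *= 2
            state = run_phase(PHASES[i], state)
        with open(checkpoint_path, "w") as f:
            json.dump({"done": i + 1, "state": state, "heap_mb": heap_mb}, f)
    return state
```

Calling `run_with_checkpoints` a second time with the same checkpoint file is a no-op that returns the saved state, which is what makes the recovery cheap: completed phases are skipped entirely.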
Recovery for Memory-resident Database Systems
This paper presents a recovery mechanism for memory-resident databases. It uses stable memory and special hardware devices to eliminate expensive I/O operations otherwise handled by the main processor, and through this the throughput rate is improved.
Enabling Fast Failure Recovery in Shared Hadoop Clusters: Towards Failure-Aware Scheduling
Hadoop has emerged as the de facto state-of-the-art system for MapReduce-based data analytics. The reliability of Hadoop systems depends in part on how well they handle failures. Currently, Hadoop handles machine failures by re-executing all the tasks of the failed machines (i.e., executing recovery tasks). Unfortunately, this elegant solution is entirely entrusted to the core of Hadoop and hidden from Hadoop schedulers. This unawareness of failures may prevent Hadoop schedulers from operating correctly towards meeting their objectives (e.g., fairness, job priority) and can significantly impact the performance of MapReduce applications. This paper presents Chronos, a failure-aware scheduling strategy that enables an early yet smart action for fast failure recovery while still operating within a specific scheduler objective. Upon failure detection, rather than waiting an uncertain amount of time to get resources for recovery tasks, Chronos leverages a lightweight preemption technique to carefully allocate these resources. In addition, Chronos considers data locality when scheduling recovery tasks to further improve performance. We demonstrate the utility of Chronos by combining it with the Fifo and Fair schedulers. The experimental results show that Chronos recovers to a correct scheduling behavior within only a couple of seconds and reduces job completion times by up to 55% compared to state-of-the-art schedulers.
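The placement policy the abstract outlines, preferring data-local nodes for recovery tasks and preempting a running task only when no slot is free, can be sketched as follows. This is an illustrative simplification in the spirit of Chronos, not its actual interface; the node names, slot model, and `Task` shape are assumptions.

```python
from collections import namedtuple

# A recovery task and the node holding a replica of its input split.
Task = namedtuple("Task", "job_id input_node")

def schedule_recovery(free_slots, running, recovery_tasks):
    """Place recovery tasks immediately: prefer a data-local free slot,
    fall back to any free slot, and only then preempt a running task
    rather than wait an uncertain amount of time for a slot."""
    placements, preempted = {}, []
    for task in recovery_tasks:
        # Data locality first: a free slot on the node holding the input.
        local = [n for n, s in free_slots.items()
                 if s > 0 and n == task.input_node]
        anywhere = [n for n, s in free_slots.items() if s > 0]
        if local or anywhere:
            node = (local or anywhere)[0]
            free_slots[node] -= 1
        else:
            # Lightweight preemption: suspend one running task to free
            # a slot for the recovery task right away.
            node = next(n for n, tasks in running.items() if tasks)
            preempted.append(running[node].pop())
        placements[task.job_id] = node
    return placements, preempted
```

The design point is the `else` branch: without preemption, a recovery task queues behind normal tasks for an unbounded time, which is exactly the delay Chronos is built to avoid.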
CHECKPOINTING AND RECOVERY IN DISTRIBUTED AND DATABASE SYSTEMS
A transaction-consistent global checkpoint of a database records a state of the database which reflects the effect of only completed transactions and not the results of any partially executed transactions. This thesis establishes the necessary and sufficient conditions for a checkpoint of a data item (or the checkpoints of a set of data items) to be part of a transaction-consistent global checkpoint of the database. This result is useful for constructing transaction-consistent global checkpoints incrementally from the checkpoints of each individual data item of a database. By applying this condition, we can start from any useful checkpoint of any data item and then incrementally add checkpoints of other data items until we obtain a transaction-consistent global checkpoint of the database. This result can also help in designing non-intrusive checkpointing protocols for database systems. Based on the intuition gained from the development of the necessary and sufficient conditions, we also developed a non-intrusive, low-overhead checkpointing protocol for distributed database systems.
Checkpointing and rollback recovery are also established techniques for achieving fault-tolerance in distributed systems. Communication-induced checkpointing algorithms allow processes involved in a distributed computation to take checkpoints independently, while forcing processes to take additional checkpoints so that each checkpoint is part of a consistent global checkpoint. This thesis develops a low-overhead communication-induced checkpointing protocol and presents a performance evaluation of the protocol.
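The communication-induced approach described above, independent (basic) checkpoints plus forced checkpoints triggered by message arrivals, can be sketched with a classic index-based scheme: each checkpoint carries a sequence number that is piggybacked on every message, and a receiver whose number lags behind takes a forced checkpoint before delivery. This is a generic textbook-style illustration, not the thesis's specific protocol.

```python
class Process:
    """Index-based communication-induced checkpointing: a sequence
    number tags each checkpoint and is piggybacked on every message."""

    def __init__(self, pid):
        self.pid, self.sn, self.state = pid, 0, []
        self.checkpoints = []
        self._take_checkpoint()  # initial checkpoint

    def _take_checkpoint(self):
        self.checkpoints.append((self.sn, list(self.state)))

    def basic_checkpoint(self):
        # Taken independently, on the process's own schedule.
        self.sn += 1
        self._take_checkpoint()

    def send(self, event):
        self.state.append(event)
        return (self.sn, event)  # piggyback the current sequence number

    def receive(self, msg):
        sn, event = msg
        if sn > self.sn:
            # Forced checkpoint before delivery: this keeps every
            # checkpoint part of some consistent global checkpoint.
            self.sn = sn
            self._take_checkpoint()
        self.state.append(event)
```

The forced checkpoint is the "additional checkpoint" the abstract refers to: it prevents the received message from being recorded as delivered in a checkpoint whose sender-side counterpart never recorded the send.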
In-Memory Database Management System
The focus of this thesis is a proprietary database interface for managing tables in main memory. It opens with a short introduction to databases, then presents the concept of database systems that use main memory as their data store and discusses the main advantages and disadvantages of this approach. The theoretical introduction closes with a brief overview of existing systems. The thesis then presents basic information about the RIS energy management system together with its in-memory database interface. It goes on to specify and design the required modifications and extensions of this interface, then describes the resulting implementation and presents test results. In conclusion, the achieved results are summarized and future development is discussed.
Universal Database System Analysis for Insight and Adaptivity
Database systems are ubiquitous; they serve as the cornerstone of modern
application infrastructure due to their efficient data access and
storage. Database systems are commonly deployed in a wide range of environments,
from transaction processing to analytics.
Unfortunately, this broad support comes with a trade-off in system
complexity. Database systems contain many components and features that
must work together to meet client demand. Administrators responsible
for maintaining database systems face a daunting task: they must
determine the access characteristics of the client workload they are
serving and tailor the system to optimize for
it. Complicating matters, client workloads are known to shift in
access patterns and load. Thus, administrators continuously
perform this optimization task, refining system design and
configuration to meet ever-changing client request patterns.
Researchers have focused on creating next-generation, natively adaptive database systems to
address this administrator burden. Natively adaptive database systems construct
client-request models, determine workload characteristics, and tailor
processing strategies to optimize accordingly. These systems
continuously refine their models, ensuring they are responsive to
workload shifts. While these new systems show promise in adapting system
behaviour to their environment, existing, widely used database systems
lack these adaptive capabilities. Porting the ideas in these new
adaptive systems to existing infrastructure requires monumental
engineering effort, slowing their adoption and leaving users stranded
with their existing, non-adaptive database systems.
In this thesis, I present Dendrite, a framework that easily
``bolts on'' to existing database systems to endow them with adaptive
capabilities. Dendrite captures database system behaviour in
a system-agnostic fashion, ensuring that its techniques are
generalizable. It compares captured behaviour to determine
how system behaviour changes over time and with respect to idealized
system performance. These differences are matched against
configurable adaptation rules, which deploy user-defined
functions to remedy performance problems. As such, Dendrite can
deploy whatever adaptations are necessary to address a behaviour shift
and tailor the system to the workload at hand. Dendrite
has low tracking overhead, making it practical for intensive database
system deployments.
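The capture-compare-remediate loop this abstract describes can be sketched as a small rule engine: observed behaviour is compared against a baseline, and when a metric drifts past a rule's threshold, a user-defined remedy function fires. The class names, metric names, and rule shape below are illustrative assumptions, not Dendrite's actual interface.

```python
class AdaptationRule:
    """A configurable rule pairing a tracked metric and drift threshold
    with a user-defined remedy function (all names hypothetical)."""

    def __init__(self, metric, threshold, remedy):
        self.metric, self.threshold, self.remedy = metric, threshold, remedy

class Monitor:
    """Compare captured behaviour against a baseline; when a metric
    drifts past a rule's threshold, fire that rule's remedy."""

    def __init__(self, baseline, rules):
        self.baseline, self.rules = baseline, rules

    def observe(self, sample):
        fired = []
        for rule in self.rules:
            # Drift of the observed metric relative to idealized behaviour.
            drift = sample[rule.metric] - self.baseline[rule.metric]
            if drift > rule.threshold:
                fired.append(rule.remedy(drift))
        return fired
```

Keeping the remedies as plain user-defined callables is what makes such a framework system-agnostic: the monitor never needs to know whether a remedy tweaks a configuration knob, builds an index, or reroutes traffic.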