Backup and Recovery Mechanisms of Cassandra Database: A Review

Author: Bohora Karina
Bothe Amol
Chopade Rupali
Pachghare V. K.
Sheth Damini
Publication venue: (Print) 1558-7215
Publication date: 16/02/2021

Cassandra is a NoSQL database with a peer-to-peer, ring-type architecture. It offers fault tolerance and data replication for higher availability, and ensures that there is no single point of failure. Because Cassandra is a NoSQL database, it has received less research attention than the older and more widely used SQL databases. Cassandra’s growing popularity creates a need to address the security-related and recovery-related concerns associated with its use. This review paper discusses the existing deletion mechanism in Cassandra and presents some identified issues related to backup and recovery in the Cassandra database. It also explores failure detection and the handling of failures such as node failure or data center failure. In addition, several possible solutions for backup and recovery, including recovery in case of disasters, are reviewed.

Embry-Riddle Aeronautical University

A new approach to detecting failures in distributed systems

Author: Leners Joshua Blaise
Publication date: 18/09/2015

Fault-tolerant distributed systems often handle failures in two steps: first, detect the failure and, second, take some recovery action. A common approach to detecting failures is end-to-end timeouts, but using timeouts brings problems. First, timeouts are inaccurate: just because a process is unresponsive does not mean that process has failed. Second, choosing a timeout is hard: short timeouts can exacerbate the problem of inaccuracy, and long timeouts can make the system wait unnecessarily. In fact, a good timeout value, one that balances the choice between accuracy and speed, may not even exist, owing to the variance in a system’s end-to-end delays. This dissertation posits a new approach to detecting failures in distributed systems: use information about failures that is local to each component, e.g., the contents of an OS’s process table. We call such information inside information, and use it as the basis in the design and implementation of three failure reporting services for data center applications, which we call Falcon, Albatross, and Pigeon. Falcon deploys a network of software modules to gather inside information in the system, and it guarantees that it never reports a working process as crashed by sometimes terminating unresponsive components. This choice helps applications by making reports of failure reliable, meaning that applications can treat them as ground truth. Unfortunately, Falcon cannot handle network failures because guaranteeing that a process has crashed requires network communication; we address this problem in Albatross and Pigeon. Instead of killing, Albatross blocks suspected processes from using the network, allowing applications to make progress during network partitions. Pigeon renounces interference altogether, and reports inside information to applications directly and with more detail to help applications make better recovery decisions.
By using these services, applications can improve their recovery from failures both quantitatively and qualitatively. Quantitatively, these services reduce detection time by one to two orders of magnitude over the end-to-end timeouts commonly used by data center applications, thereby reducing the unavailability caused by failures. Qualitatively, these services provide more specific information about failures, which can reduce the logic required for recovery and can help applications better decide when recovery is not necessary. Computer Science
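The timeout trade-off described in this abstract can be illustrated with a minimal sketch. It models only a generic end-to-end timeout detector, not Falcon, Albatross, or Pigeon; the class `TimeoutDetector`, its methods, and the process names are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class TimeoutDetector:
    """Hypothetical end-to-end timeout detector (illustration only)."""

    def __init__(self, timeout):
        self.timeout = timeout      # seconds of silence tolerated
        self.last_heartbeat = {}    # process name -> time of last heartbeat

    def heartbeat(self, process, now):
        """Record that a heartbeat from `process` arrived at time `now`."""
        self.last_heartbeat[process] = now

    def suspects(self, process, now):
        """Suspect a process that has been silent for longer than the timeout."""
        last = self.last_heartbeat.get(process)
        return last is None or (now - last) > self.timeout


# Short timeout: a slow but live process "p" is wrongly suspected.
short = TimeoutDetector(timeout=1.0)
short.heartbeat("p", now=0.0)
false_alarm = short.suspects("p", now=2.0)   # p is alive, its heartbeat is just late
short.heartbeat("p", now=2.5)                # the delayed heartbeat finally arrives
cleared = not short.suspects("p", now=3.0)

# Long timeout: process "q" crashes right after t=0 and stays unnoticed for a while.
patient = TimeoutDetector(timeout=10.0)
patient.heartbeat("q", now=0.0)
missed = not patient.suspects("q", now=5.0)  # crash not yet reported
detected = patient.suspects("q", now=10.5)   # reported only after the full timeout
```

With the short timeout the detector reports a working process as failed; with the long timeout it waits more than ten seconds before reporting a real crash. Neither setting is accurate and fast at once, which is the gap that local "inside information" is meant to close.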

Texas ScholarWorks

Quality of Service of Crash-Recovery Failure Detectors

Author: Ma Tiejun
Publication date: 01/01/2007

This thesis presents the results of an investigation into the failure detection problem. We consider the specific case of the Quality of Service (QoS) of crash failure detection. In contrast to previous work, we address the crash failure detection problem when the monitored target is resilient and recovers after failure. To the best of our knowledge, this is the first work to provide an analysis of crash-recovery failure detection from the QoS perspective. We develop a probabilistic model of the behavior of a crash-recovery target, i.e. one which has the ability to recover from the crash state. We show that the fail-free run and the crash-stop run are special cases of the crash-recovery run with mean time to failure (MTTF) approaching infinity and mean time to recovery (MTTR) approaching infinity, respectively. We extend the previously published QoS metrics to allow the measurement of the recovery speed, and the definition of the completeness property of a failure detector. Then, the impact of the dependability of the crash-recovery target on the QoS bounds for such a crash-recovery failure detector is analyzed using general dependability metrics, such as MTTF and MTTR, based on an approximate probabilistic model of the two-process failure detection system. Then, according to our approximate model, we show analytically how to estimate the failure detector’s parameters to achieve a required QoS, based on Chen et al.’s NFD-S algorithm, and how to execute the configuration procedure of this crash-recovery failure detector. In order to make the failure detector adaptive to the target’s crash-recovery behavior and enable the autonomy of the monitoring procedure, we propose two types of recovery detection protocols. One is a reliable recovery detection protocol, which guarantees detection of every failure and recovery by using persistent storage.
The other is a lightweight recovery detection protocol, which does not guarantee detection of every failure and recovery but which reduces the system overhead. Both of these recovery detection protocols improve the completeness without reducing the other QoS aspects of a failure detector. In addition, we also demonstrate how to estimate the inputs, such as the dependability metrics, using the failure detector itself. In order to evaluate our analytical work, we simulate the following failure detection algorithms: the simple heartbeat timeout algorithm, the NFD-S algorithm and the NFD-S algorithm with the lightweight recovery detection protocol, for various values of MTTF and MTTR. The simulation results show that the dependability of a recoverable monitored target can have a significant impact on the QoS of such a failure detector. This conforms well to our models and analysis. We show that in the case of a reasonably long MTTF, the NFD-S algorithm with the lightweight recovery detection protocol exhibits better QoS than the NFD-S algorithm for the completeness of a crash-recovery failure detector, and similarly for other QoS metrics.
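The role of MTTF and MTTR as limiting cases can be made concrete with a small sketch. It uses the textbook steady-state availability formula MTTF / (MTTF + MTTR) for a target that alternates between up and down periods; this is an assumption chosen for illustration and is not the thesis’s probabilistic model. The function name `availability` is hypothetical.

```python
import math


def availability(mttf, mttr):
    """Long-run fraction of time the target is up: MTTF / (MTTF + MTTR).

    Textbook formula, used here as an illustrative assumption only.
    """
    if mttf < 0 or mttr < 0:
        raise ValueError("MTTF and MTTR must be non-negative")
    if math.isinf(mttf) and math.isinf(mttr):
        raise ValueError("availability is undefined when both are infinite")
    if math.isinf(mttf):
        return 1.0   # fail-free limit: the target never crashes
    if math.isinf(mttr):
        return 0.0   # crash-stop limit: the target never recovers
    if mttf + mttr == 0:
        raise ValueError("MTTF and MTTR cannot both be zero")
    return mttf / (mttf + mttr)


typical = availability(mttf=99.0, mttr=1.0)          # up 99% of the time
fail_free = availability(mttf=math.inf, mttr=1.0)    # MTTF approaching infinity
crash_stop = availability(mttf=99.0, mttr=math.inf)  # MTTR approaching infinity
```

The two infinite arguments reproduce the special cases named in the abstract: a fail-free run when MTTF grows without bound, and a crash-stop run when MTTR does.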

CiteSeerX

Edinburgh Research Archive

Managerial Accounting (2nd edition)

Author: Cataldo Anthony J., II
Publication venue: Digital Commons @ West Chester University
Publication date: 01/01/2018

This text covers the material required in an introductory managerial accounting course and complements my text on introductory financial accounting. Both are texts and courses required for all business degree undergraduates. My objective is to make this material available to students at a very low cost. I have used variable costing and other techniques included in this text in business litigation engagements involving GM, Ford, Chrysler, Toyota, Nissan and other automobile manufacturers, testifying in Nevada, California, Texas, and Arizona.

Digital Commons @ West Chester University

Towards adaptive actors for scalable IoT applications at the edge

Author: Argerich Mauricio Fadel
Chen Kaifei
Fürst Jonathan
Kovacs Ernö
Publication date: 01/01/2018

Traditional device-cloud architectures are not scalable to the size of future IoT deployments. While edge and fog-computing principles seem like a tangible solution, they increase the programming effort of IoT systems, do not provide the same elasticity guarantees as the cloud and are of much greater hardware heterogeneity. Future IoT applications will be highly distributed and place their computational tasks on any combination of end-devices (sensor nodes, smartphones, drones), edge and cloud resources in order to achieve their application goals. These complex distributed systems require a programming model that allows developers to implement their applications in a simple way (i.e., focus on the application logic) and an execution framework that runs these applications resiliently with a high resource efficiency, while maximizing application utility. Towards such a distributed execution runtime, we propose Nandu, an actor-based system that adapts and migrates tasks dynamically using developer-provided hints as seed information. Nandu allows developers to focus on sequential application logic and transforms their application into distributed, adaptive actors. The resulting actors support fine-grained entry points for the execution environment. These entry points allow local schedulers to adapt actors seamlessly to the current context, while optimizing the overall application utility according to developer-provided requirements.

Directory of Open Access Journals

RonPub -- Research Online Publishing

The IT University of Copenhagen's Repository

Memory management unit for hardware-assisted dynamic relocation in on-board satellite systems

Author: Da Silva Fariña Antonio
García Tejedor Juan Ignacio
Guzmán García David
Losa Cruz Borja
Martínez Hellín Agustín
Parra Espada Pablo
Rodríguez Polo Óscar
Sánchez Prieto Sebastián
Sánchez Sánchez Jonatan
Publication venue: IEEE
Publication date: 08/06/2023

Satellite on-board systems spend their lives in hostile environments, where radiation can cause critical hardware failures. One of the most radiation-sensitive elements is memory. The so-called single event effects (SEEs) can corrupt or even irretrievably damage the cells that store the data and program instructions. When one of these cells is corrupted, the program must not use it again during execution. In order to avoid rebuilding and uploading the code, a memory management unit can be used to transparently relocate the program to an error-free memory region. This article presents the design and implementation of a memory management unit that allows the dynamic relocation of on-board software. This unit provides a hardware mechanism that allows the automatic relocation of sections of code or data at run-time, only requiring software intervention for initialization and configuration. The unit has been implemented on the LEON architecture, a reference for the European Space Agency (ESA) missions. The proposed solution has been validated using the boot and application software (ASW) of the instrument control unit of the Energetic Particle Detector of the Solar Orbiter Mission as a base. Processor synthesis on different FPGAs has shown resource usage and power consumption similar to that of a conventional memory management unit. The results vary between ± 1-15% in resource usage and ± 1-7% in power consumption, depending on the number of inputs assigned to the unit and the FPGA used. When comparing performance, both the proposed and conventional memory management units show the same results. Universidad de Alcalá
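The idea of transparent relocation can be sketched in software as an address-translation table. This is a simplified model under stated assumptions, not the unit described in the article: the names `Relocation` and `RelocationUnit`, the linear table lookup, and all addresses below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relocation:
    source: int  # first address of the damaged region
    size: int    # length of the region in bytes
    target: int  # first address of the error-free replacement region


class RelocationUnit:
    """Hypothetical software model of a relocating memory management unit."""

    def __init__(self):
        self.entries = []

    def configure(self, source, size, target):
        """Software step: register one region to be relocated."""
        self.entries.append(Relocation(source, size, target))

    def translate(self, address):
        """Redirect accesses that fall in a relocated region; pass others through."""
        for entry in self.entries:
            if entry.source <= address < entry.source + entry.size:
                return entry.target + (address - entry.source)
        return address


unit = RelocationUnit()
# Assume a 256-byte region at 0x40001000 is damaged and a spare exists at 0x40080000.
unit.configure(source=0x40001000, size=0x100, target=0x40080000)

relocated = unit.translate(0x40001010)   # inside the damaged region
untouched = unit.translate(0x40002000)   # outside any configured region
```

After the one-time `configure` call, every access goes through `translate` without further software involvement, which mirrors the abstract’s point that software is needed only for initialization and configuration.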

e_Buah - Biblioteca Digital de la Universidad de Alcalá

The Architecture of an Autonomic, Resource-Aware, Workstation-Based Distributed Database System

Author: Macdonald Angus
Publication date: 19/07/2012

Distributed software systems that are designed to run over workstation machines within organisations are termed workstation-based. Workstation-based systems are characterised by dynamically changing sets of machines that are used primarily for other, user-centric tasks. They must be able to adapt to and utilize spare capacity when and where it is available, and ensure that the non-availability of an individual machine does not affect the availability of the system. This thesis focuses on the requirements and design of a workstation-based database system, which is motivated by an analysis of existing database architectures that are typically run over static, specially provisioned sets of machines. A typical clustered database system -- one that is run over a number of specially provisioned machines -- executes queries interactively, returning a synchronous response to applications, with its data made durable and resilient to the failure of machines. There are no existing workstation-based databases. Furthermore, other workstation-based systems do not attempt to achieve the requirements of interactivity and durability, because they are typically used to execute asynchronous batch processing jobs that tolerate data loss -- results can be re-computed. These systems use external servers to store the final results of computations rather than workstation machines. This thesis describes the design and implementation of a workstation-based database system and investigates its viability by evaluating its performance against existing clustered database systems and testing its availability during machine failures. Comment: Ph.D. Thesis

arXiv.org e-Print Archive

St Andrews Research Repository