366 research outputs found
Backup and Recovery Mechanisms of Cassandra Database: A Review
Cassandra is a NoSQL database having a peer-to-peer, ring-type architecture. Cassandra offers fault-tolerance, data replication for higher availability as well as ensures no single point of failure. Given that Cassandra is a NoSQL database, it is evident that it lacks the amount of research that has gone into comparatively older and more widely and broadly used SQL databases. Cassandraâs growing popularity in recent times gives rise to the need of addressing any security-related or recovery-related concerns associated with its usage. This review paper discusses the existing deletion mechanism in Cassandra and presents some identified issues related to backup and recovery in the Cassandra database. Further, failure detection as well as handling of failures such as node failure or data center failure has been explored in the paper. In addition, several possible solutions to address backup and recovery including recovery in case of disasters have been reviewed
Recommended from our members
A new approach to detecting failures in distributed systems
textFault-tolerant distributed systems often handle failures in two steps: first, detect the failure and, second, take some recovery action. A common approach to detecting failures is end-to-end timeouts, but using timeouts brings problems. First, timeouts are inaccurate: just because a process is unresponsive does not mean that process has failed. Second, choosing a timeout is hard: short timeouts can exacerbate the problem of inaccuracy, and long timeouts can make the system wait unnecessarily. In fact, a good timeout valueâone that balances the choice between accuracy and speedâmay not even exist, owing to the variance in a systemâs end-to-end delays. Æis dissertation posits a new approach to detecting failures in distributed systems: use information about failures that is local to each component, e.g., the contents of an OSâs process table. We call such information inside information, and use it as the basis in the design and implementation of three failure reporting services for data center applications, which we call Falcon, Albatross, and Pigeon. Falcon deploys a network of software modules to gather inside information in the system, and it guarantees that it never reports a working process as crashed by sometimes terminating unresponsive components. Æis choice helps applications by making reports of failure reliable, meaning that applications can treat them as ground truth. Unfortunately, Falcon cannot handle network failures because guaranteeing that a process has crashed requires network communication; we address this problem in Albatross and Pigeon. Instead of killing, Albatross blocks suspected processes from using the network, allowing applications to make progress during network partitions. Pigeon renounces interference altogether, and reports inside information to applications directly and with more detail to help applications make better recovery decisions. By using these services, applications can improve their recovery from failures both quantitatively and qualitatively. Quantitatively, these services reduce detection time by one to two orders of magnitude over the end-to-end timeouts commonly used by data center applications, thereby reducing the unavailability caused by failures. Qualitatively, these services provide more specific information about failures, which can reduce the logic required for recovery and can help applications better decide when recovery is not necessary.Computer Science
Quality of Service of Crash-Recovery Failure Detectors
This thesis presents the results of an investigation into the failure detection problem. We consider the specific case of the Quality of Service (QoS) of crash failure detection. In contrast to previous work, we address the crash failure detection problem when the monitored target is resilient and recovers after failure. To the best of our knowledge, this is the first work to provide an analysis of crash-recovery failure detection from the QoS perspective.We develop a probabilistic model of the behavior of a crash-recovery target, i.e. one
which has the ability to recover from the crash state. We show that the fail-free run
and the crash-stop run are special cases of the crash-recovery run with mean time to
failure (MTTF) approaching to infinity and mean time to recovery (MTTR) approaching to infinity, respectively. We extend the previously published QoS metrics to allow
the measurement of the recovery speed, and the definition of the completeness property
of a failure detector. Then, the impact of the dependability of the crash-recovery
target on the QoS bounds for such a crash-recovery failure detector is analyzed using general dependability metrics, such as MTTF and MTTR, based on an approximate
probabilistic model of the two-process failure detection system. Then according to
our approximate model, we show how to estimate the failure detectorâs parameters to
achieve a required QoS, based on Chen et al.âs NFD-S algorithm analytically, and how
to execute the configuration procedure of this crash-recovery failure detector.In order to make the failure detector adaptive to the targetâs crash-recovery behavior and enable the autonomy of the monitoring procedure, we propose two types of recovery detection protocols. One is a reliable recovery detection protocol, which can guarantee to detect each occurring failure and recovery by adopting persistent storage. The other is a lightweight recovery detection protocol, which does not guarantee to detect every failure and recovery but which reduces the system overhead. Both of these recovery detection protocols improve the completeness without reducing the other QoS aspects of a failure detector. In addition, we also demonstrate how to estimate the inputs, such as the dependability metrics, using the failure detector itself.In order to evaluate our analytical work, we simulate the following failure detection algorithms: the simple heartbeat timeout algorithm, the NFD-S algorithm and the NFDS
algorithm with the lightweight recovery detection protocol, for various values of
MTTF and MTTR. The simulation results show that the dependability of a recoverable
monitored target could have significant impact on the QoS of such a failure detector.
This conforms well to our models and analysis. We show that in the case of reasonable long MTTF, the NFD-S algorithm with the lightweight recovery detection protocol exhibits better QoS than the NFD-S algorithm for the completeness of a crash-recovery failure detector, and similarly for other QoS metrics
Managerial Accounting (2nd edition)
This text covers the material required in an introductory managerial accounting course and compliments my text on introductory financial accounting. Both are texts and courses required for all business degree undergraduates. My objective is to make this material available to students at a very low cost. I have used variable costing and other techniques included in this text in business litigation engagements involving GM, Ford, Chrysler, Toyota, Nissan and other automobile manufacturers, testifying in Nevada, California, Texas, and Arizona.https://digitalcommons.wcupa.edu/acc_texts/1003/thumbnail.jp
Towards adaptive actors for scalable iot applications at the edge
Traditional device-cloud architectures are not scalable to the size of future IoT deployments. While edge and fog-computing principles seem like a tangible solution, they increase the programming effort of IoT systems, do not provide the same elasticity guarantees as the cloud and are of much greater hardware heterogeneity. Future IoT applications will be highly distributed and place their computational tasks on any combination of end-devices (sensor nodes, smartphones, drones), edge and cloud resources in order to achieve their application goals. These complex distributed systems require a programming model that allows developers to implement their applications in a simple way (i.e., focus on the application logic) and an execution framework that runs these applications resiliently with a high resource efficiency, while maximizing application utility. Towards such distributed execution runtime, we propose Nandu, an actor based system that adapts and migrates tasks dynamically using developer provided hints as seed information. Nandu allows developers to focus on sequential application logic and transforms their application into distributed, adaptive actors. The resulting actors support fine-grained entry points for the execution environment. These entry points allow local schedulers to adapt actors seamlessly to the current context, while optimizing the overall application utility according to developer provided requirements
Memory management unit for hardware-assisted dynamic relocation in on-board satellite systems
Satellite on-board systems spend their lives in hostile environments, where radiation can cause critical hardware failures. One of the most radiation-sensitive elements is memory. The so-called single event effects (SEEs) can corrupt or even irretrievably damage the cells that store the data and program instructions. When one of these cells is corrupted, the program must not use it again during execution. In order to avoid rebuilding and uploading the code, a memory management unit can be used to transparently relocate the program to an error-free memory region. This article presents the design and implementation of a memory management unit that allows the dynamic relocation of on-board software. This unit provides a hardware mechanism that allows the automatic relocation of sections of code or data at run-time, only requiring software intervention for initialization and configuration. The unit has been implemented on the LEON architecture, a reference for the European Space Agency (ESA) missions. The proposed solution has been validated using the boot and application software (ASW) of the instrument control unit of the Energetic Particle Detector of the Solar Orbiter Mission as a base. Processor synthesis on different FPGAs has shown resource usage and power consumption similar to that of a conventional memory management unit. The results vary between ± 1?15% in resource usage and ± 1?7% in power consumption, depending on the number of inputs assigned to the unit and the FPGA used. When comparing performance, both the proposed and conventional memory management units show the same results.Universidad de Alcal
The Architecture of an Autonomic, Resource-Aware, Workstation-Based Distributed Database System
Distributed software systems that are designed to run over workstation
machines within organisations are termed workstation-based. Workstation-based
systems are characterised by dynamically changing sets of machines that are
used primarily for other, user-centric tasks. They must be able to adapt to and
utilize spare capacity when and where it is available, and ensure that the
non-availability of an individual machine does not affect the availability of
the system. This thesis focuses on the requirements and design of a
workstation-based database system, which is motivated by an analysis of
existing database architectures that are typically run over static, specially
provisioned sets of machines. A typical clustered database system -- one that
is run over a number of specially provisioned machines -- executes queries
interactively, returning a synchronous response to applications, with its data
made durable and resilient to the failure of machines. There are no existing
workstation-based databases. Furthermore, other workstation-based systems do
not attempt to achieve the requirements of interactivity and durability,
because they are typically used to execute asynchronous batch processing jobs
that tolerate data loss -- results can be re-computed. These systems use
external servers to store the final results of computations rather than
workstation machines. This thesis describes the design and implementation of a
workstation-based database system and investigates its viability by evaluating
its performance against existing clustered database systems and testing its
availability during machine failures.Comment: Ph.D. Thesi
- âŠ