Fault tolerance via diversity for off-the-shelf products: A study with SQL database servers
If an off-the-shelf software product exhibits poor dependability due to design faults, then software fault tolerance is often the only way available to users and system integrators to alleviate the problem. Thanks to low acquisition costs, even using multiple versions of software in a parallel architecture, a scheme formerly reserved for a few highly critical applications, may become viable for many applications. We have studied the potential dependability gains from these solutions for off-the-shelf database servers. We based the study on the bug reports available for four off-the-shelf SQL servers plus later releases of two of them. We found that many of these faults cause systematic noncrash failures, a category ignored by most studies and standard implementations of fault tolerance for databases. Our observations suggest that diverse redundancy would be effective for tolerating design faults in this category of products. Only in very few cases would demands that triggered a bug in one server cause failures in another one, and there were no coincident failures in more than two of the servers. Use of different releases of the same product would also tolerate a significant fraction of the faults. We report our results and discuss their implications, the architectural options available for exploiting them, and the difficulties that they may present.
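The diverse-redundancy scheme the abstract studies can be sketched as a thin wrapper that sends each demand to two independent servers and compares their answers, so that systematic noncrash failures (wrong results rather than crashes) become detectable. In this minimal sketch, two in-memory SQLite connections stand in for genuinely diverse engines from different vendors; the `DiversePair` class and its interface are illustrative assumptions, not part of the paper.

```python
import sqlite3

class DiversePair:
    """Send the same SQL demand to two replicas and compare the results.

    Two in-memory SQLite connections stand in for diverse SQL servers;
    in a real deployment the replicas would be different products.
    """

    def __init__(self):
        self.replicas = [sqlite3.connect(":memory:") for _ in range(2)]

    def execute(self, sql, params=()):
        # Sort rows so that legitimate ordering differences between
        # engines are not mistaken for failures.
        results = [sorted(conn.execute(sql, params).fetchall())
                   for conn in self.replicas]
        if results[0] != results[1]:
            # A disagreement flags a potential design-fault failure
            # in one of the replicas.
            raise RuntimeError(f"replica disagreement: {results}")
        return results[0]

pair = DiversePair()
pair.execute("CREATE TABLE t (x INTEGER)")
pair.execute("INSERT INTO t VALUES (1), (2)")
rows = pair.execute("SELECT x FROM t ORDER BY x")
```

With identical replicas the comparison always succeeds; the dependability gain the paper measures comes from replicas whose independently developed code is unlikely to fail on the same demand.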
The Atomic Manifesto: a Story in Four Quarks
This report summarizes the viewpoints and insights gathered in the Dagstuhl Seminar on Atomicity in System Design and Execution, which was attended by 32 people from four different scientific communities: database and transaction processing systems, fault tolerance and dependable systems, formal methods for system design and correctness reasoning, and hardware architecture and programming languages. Each community presents its position in interpreting the notion of atomicity and the existing state of the art, and each community identifies scientific challenges that should be addressed in future work. In addition, the report discusses common themes across communities and strategic research problems that require multiple communities to team up for a viable solution.
The general theme of how to specify, implement, compose, and reason about extended and relaxed notions of atomicity is viewed as a key piece in coping with the pressing issue of building and maintaining highly dependable systems that comprise many components with complex interaction patterns.
Investigation of Workflow Processes and Best Practices in Kentucky’s CDL Program
Kentucky’s Division of Driver Licensing maintains the driver history records for all licensed drivers in Kentucky. It serves as the state driver licensing agency and is the locus for meeting the federal CDL requirements under 49 CFR 384. Kentucky relies on FMCSA’s Commercial Driver’s License Program Implementation (CDLPI) grant funding to subsidize salaries for Federally Funded Temporary Labor (FFTL). FFTLs verify and process documents as well as field phone calls from customers. DDL administrators say they cannot meet the CDL reporting timeframes without FFTL labor, but the agency is not able to fund temporary or full-time staff members on its own. However, FMCSA indicated a reluctance to continue funding FFTLs in recent grant cycles. Without that funding, Kentucky’s SDLA anticipates difficulty meeting FMCSA compliance standards or passing an FMCSA CDL program audit. The research team undertook a study to identify strategies for optimizing workflow and adjusting to the loss of FFTLs. This study examines the approaches states are currently using to administer state and federal CDL requirements, and analyzes how those approaches may help state CDL programs remain compliant despite fewer resources. In addition, the study evaluates Kentucky’s current workflow to identify opportunities for improvement. The research will not only help Kentucky adjust to staffing limitations but also provide other SDLAs with tools to implement innovative practices in their state’s CDL program.
Tools for distributed application management
Distributed application management consists of monitoring and controlling an application as it executes in a distributed environment. It encompasses such activities as configuration, initialization, performance monitoring, resource scheduling, and failure response. The Meta system (a collection of tools for constructing distributed application management software) is described. Meta provides the mechanism, while the programmer specifies the policy for application management. The policy is manifested as a control program which is a soft real-time reactive program. The underlying application is instrumented with a variety of built-in and user-defined sensors and actuators. These define the interface between the control program and the application. The control program also has access to a database describing the structure of the application and the characteristics of its environment. Some of the more difficult problems for application management occur when preexisting, nondistributed programs are integrated into a distributed application for which they may not have been intended. Meta allows management functions to be retrofitted to such programs with a minimum of effort.
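The sensor/actuator model the abstract describes can be sketched as follows: the application exposes named sensors (read-only probes) and actuators (control hooks), and a reactive control program encodes the policy as rules mapping sensor predicates to actuator calls. Every name here (`ManagedApp`, `ControlProgram`, the `queue_length` sensor) is a hypothetical illustration, not Meta's actual API.

```python
class ManagedApp:
    """An application instrumented with named sensors and actuators."""

    def __init__(self):
        self.sensors = {}    # name -> zero-arg callable returning a reading
        self.actuators = {}  # name -> one-arg callable applying a control action

    def add_sensor(self, name, fn):
        self.sensors[name] = fn

    def add_actuator(self, name, fn):
        self.actuators[name] = fn


class ControlProgram:
    """Reactive policy: rules map a sensor predicate to an actuator call."""

    def __init__(self, app):
        self.app = app
        self.rules = []  # (sensor, predicate, actuator, argument)

    def when(self, sensor, predicate, actuator, arg):
        self.rules.append((sensor, predicate, actuator, arg))

    def step(self):
        # One reactive iteration: evaluate every rule's sensor predicate
        # and fire the corresponding actuator when it holds.
        fired = []
        for sensor, pred, actuator, arg in self.rules:
            if pred(self.app.sensors[sensor]()):
                self.app.actuators[actuator](arg)
                fired.append(actuator)
        return fired


# Usage: retrofit management onto an existing, non-distributed component
# by wrapping its state in sensors and actuators.
state = {"queue": list(range(12)), "workers": 1}
app = ManagedApp()
app.add_sensor("queue_length", lambda: len(state["queue"]))
app.add_actuator("set_workers", lambda n: state.update(workers=n))

ctl = ControlProgram(app)
ctl.when("queue_length", lambda n: n > 10, "set_workers", 4)
ctl.step()  # the queue is long, so the policy scales workers up
```

The separation mirrors the abstract's division of labor: the sensor/actuator mechanism is generic, while the policy lives entirely in the control program's rules.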
Feedback driven adaptive combinatorial testing
The configuration spaces of modern software systems are too large to test exhaustively. Combinatorial interaction testing (CIT) approaches, such as covering arrays, systematically sample the configuration space and test only the selected configurations. The basic justification for CIT approaches is that they can cost-effectively exercise all system behaviors caused by the settings of t or fewer options. We conjecture, however, that in practice many such behaviors are not actually tested because of masking effects – failures that perturb execution so as to prevent some behaviors from being exercised. In this work we present a feedback-driven, adaptive, combinatorial testing approach aimed at detecting and working around masking effects. At each iteration we detect potential masking effects, heuristically isolate their likely causes, and then generate new covering arrays that allow previously masked combinations to be tested in the subsequent iteration. We empirically assess the effectiveness of the proposed approach on two large widely used open source software systems. Our results suggest that masking effects do exist and that our approach provides a promising and efficient way to work around them.
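The covering arrays this line of work builds on can be illustrated with a small greedy 2-way (pairwise) generator: keep picking the configuration that covers the most not-yet-covered pairs of option settings until every pair appears in some selected configuration. This is only a baseline sketch of plain pairwise CIT, not the paper's feedback-driven adaptive algorithm, and the `pairwise_cover` function and example options are illustrative assumptions.

```python
from itertools import combinations, product

def pairwise_cover(options):
    """Greedy 2-way covering-array sketch.

    options: dict mapping option name -> list of settings.
    Returns a list of configurations (dicts) whose union covers
    every 2-way combination of settings.
    """
    names = sorted(options)

    # Enumerate every pair of settings that must be covered.
    uncovered = set()
    for a, b in combinations(names, 2):
        for va, vb in product(options[a], options[b]):
            uncovered.add(((a, va), (b, vb)))

    # Enumerating all configurations is fine for this toy space;
    # real CIT tools avoid it because the space is exponential.
    all_configs = [dict(zip(names, vals))
                   for vals in product(*(options[n] for n in names))]

    chosen = []
    while uncovered:
        # Greedy step: pick the configuration covering the most
        # still-uncovered pairs.
        best = max(all_configs,
                   key=lambda c: sum(((a, c[a]), (b, c[b])) in uncovered
                                     for a, b in combinations(names, 2)))
        chosen.append(best)
        for a, b in combinations(names, 2):
            uncovered.discard(((a, best[a]), (b, best[b])))
    return chosen

opts = {"compiler": ["gcc", "clang"], "opt": ["O0", "O2"], "arch": ["x86", "arm"]}
suite = pairwise_cover(opts)  # covers all 12 pairs with far fewer than 8 configs
```

A masking effect in the paper's sense would arise if, say, one selected configuration crashed early: the pairs it nominally covers were never really exercised, which is what the feedback-driven iterations detect and repair with fresh covering arrays.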