10,587 research outputs found
Recommended from our members
Fault tolerance via diversity for off-the-shelf products: A study with SQL database servers
If an off-the-shelf software product exhibits poor dependability due to design faults, then software fault tolerance is often the only way available to users and system integrators to alleviate the problem. Thanks to low acquisition costs, even using multiple versions of software in a parallel architecture, which is a scheme formerly reserved for few and highly critical applications, may become viable for many applications. We have studied the potential dependability gains from these solutions for off-the-shelf database servers. We based the study on the bug reports available for four off-the-shelf SQL servers plus later releases of two of them. We found that many of these faults cause systematic noncrash failures, which is a category ignored by most studies and standard implementations of fault tolerance for databases. Our observations suggest that diverse redundancy would be effective for tolerating design faults in this category of products. Only in very few cases would demands that triggered a bug in one server cause failures in another one, and there were no coincident failures in more than two of the servers. Use of different releases of the same product would also tolerate a significant fraction of the faults. We report our results and discuss their implications, the architectural options available for exploiting them, and the difficulties that they may present
An automated wrapper-based approach to the design of dependable software
The design of dependable software systems invariably comprises two main activities: (i) the design of dependability mechanisms, and (ii) the location of dependability mechanisms. It has been shown that these activities are intrinsically difficult. In this paper we propose an automated wrapper-based methodology to circumvent the problems associated with the design and location of dependability mechanisms. To achieve this we replicate important variables so that they can be used as part of standard, efficient dependability mechanisms. These well-understood mechanisms are then deployed in all relevant locations. To validate the proposed methodology we apply it to three complex software systems, evaluating the dependability enhancement and execution overhead in each case. The results generated demonstrate that the system failure rate of a wrapped software system can be several orders of magnitude lower than that of an unwrapped equivalent
Enhancing Fault / Intrusion Tolerance through Design and Configuration Diversity
Fault/intrusion tolerance is usually the only viable way of improving system dependability and security in the presence of continuously evolving threats. Many of the solutions in the literature concern a specific snapshot in the production or deployment of a fault-tolerant system, and no immediate consideration is given to how the system should evolve to deal with novel threats. In this paper we outline and evaluate a set of operating systems' and applications' reconfiguration rules which can be used to modify the state of a system replica prior to deployment or in between recoveries, and hence increase the replica's chance of a longer intrusion-free operation
Enhanced Failure Detection Mechanism in MapReduce
The popularity of the MapReduce programming model has increased interest in the research community in improving it. Among other directions, fault tolerance, and specifically failure detection, seems to be a crucial one, but it has not yet reached a satisfactory level. Motivated by this, I decided to devote my main research during this period to a prototype system architecture of the MapReduce framework with a new failure detection service, containing both an analytical (theoretical) part and an implementation part. I am confident that this work should lead the way for further contributions in detecting failures in NoSQL application frameworks, and in cloud storage systems in general
Towards a cyberinfrastructure for enhanced scientific
A new generation of information and communication infrastructures, including advanced Internet computing and Grid technologies, promises to enable more direct and shared access to more widely distributed computing resources than was previously possible. Scientific and technological collaboration, consequently, is more and more coming to be seen as critically dependent upon effective access to, and sharing of, digital research data, and of the information tools that facilitate data being structured for efficient storage, search, retrieval, display and higher level analysis. A recent (February 2003) report to the U.S. NSF Directorate of Computer and Information System Engineering urged that funding be provided for a major enhancement of computer and network technologies, thereby creating a cyberinfrastructure whose facilities would support and transform the conduct of scientific and engineering research. The articulation of this programmatic vision reflects a widely shared expectation that solving the technical engineering problems associated with the advanced hardware and software systems of the cyberinfrastructure will yield revolutionary payoffs by empowering individual researchers and increasing the scale, scope and flexibility of collective research enterprises. The argument of this paper, however, is that engineering breakthroughs alone will not be enough to achieve such an outcome; success in realizing the cyberinfrastructure's potential, if it is achieved, will more likely be the result of a nexus of interrelated social, legal and technical transformations. The socio-institutional elements of a new infrastructure supporting collaboration, that is to say, its supposedly "softer" parts, are every bit as complicated as the hardware and computer software, and, indeed, may prove much harder to devise and implement.
The roots of this latter class of challenges facing "e-Science" will be seen to lie in the micro- and meso-level incentive structures created by the existing legal and administrative regimes. Although a number of these same conditions and circumstances appear to be equally significant obstacles to commercial provision of Grid services in interorganizational contexts, the domain of publicly supported scientific collaboration is held to be the more hospitable environment in which to experiment with a variety of new approaches to solving these problems. The paper concludes by proposing several "solution modalities," including some that also could be made applicable for fields of information-intensive collaboration in business and finance that must regularly transcend organizational boundaries.
Towards a cyberinfrastructure for enhanced scientific
Scientific and technological collaboration is more and more coming to be seen as critically dependent upon effective access to, and sharing of, digital research data, and of the information tools that facilitate data being structured for efficient storage, search, retrieval, display and higher level analysis. A February 2003 report to the U.S. NSF Directorate of Computer and Information System Engineering urged that funding be provided for a major enhancement of computer and network technologies, thereby creating a cyberinfrastructure whose facilities would support and transform the conduct of scientific and engineering research. The argument of this paper is that engineering breakthroughs alone will not be enough to achieve such an outcome; success in realizing the cyberinfrastructure's potential, if it is achieved, will more likely be the result of a nexus of interrelated social, legal and technical transformations. The socio-institutional elements of a new infrastructure supporting collaboration, that is to say, its supposedly "softer" parts, are every bit as complicated as the hardware and computer software, and, indeed, may prove much harder to devise and implement. The roots of this latter class of challenges facing "e-Science" will be seen to lie in the micro- and meso-level incentive structures created by the existing legal and administrative regimes. Although a number of these same conditions and circumstances appear to be equally significant obstacles to commercial provision of Grid services in interorganizational contexts, the domain of publicly supported scientific collaboration is held to be the more hospitable environment in which to experiment with a variety of new approaches to solving these problems. The paper concludes by proposing several "solution modalities," including some that also could be made applicable for fields of information-intensive collaboration in business and finance that must regularly transcend organizational boundaries.
Fault diversity among off-the-shelf SQL database servers
Fault tolerance is often the only viable way of obtaining the required system dependability from systems built out of "off-the-shelf" (OTS) products. We have studied a sample of bug reports from four off-the-shelf SQL servers so as to estimate the possible advantages of software fault tolerance - in the form of modular redundancy with diversity - in complex off-the-shelf software. We checked whether these bugs would cause coincident failures in more than one of the servers. We found that very few bugs affected two of the four servers, and none caused failures in more than two. We also found that only four of these bugs would cause identical, undetectable failures in two servers. Therefore, a fault-tolerant server, built with diverse off-the-shelf servers, seems to have a good chance of delivering improvements in availability and failure rates compared with the individual off-the-shelf servers or their replicated, nondiverse configurations
Rephrasing rules for off-the-shelf SQL database servers
We have previously reported (Gashi et al., 2004) the results of a study with a sample of bug reports from four off-the-shelf SQL servers. We checked whether these bugs caused failures in more than one server. We found that very few bugs caused failures in two servers and none caused failures in more than two. This suggests that a fault-tolerant server built with diverse off-the-shelf servers would be a prudent choice for improving failure detection. To study other aspects of fault tolerance, namely failure diagnosis and state recovery, we have studied the "data diversity" mechanism and defined a number of SQL rephrasing rules. These rules transform a client-sent statement into an additional, logically equivalent statement, leading to more results being returned to an adjudicator. These rules therefore help to increase the probability of a correct response being returned to a client and of a correct state being maintained in the database
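As a rough illustration of the rephrasing idea described in this abstract, the following minimal Python sketch rewrites a statement into a logically equivalent form and passes both results to a simple adjudicator. It is not taken from the paper: the rule shown (expanding `IN (...)` into a chain of `OR` comparisons), the function names `rephrase_in_list`, `adjudicate` and `run_with_rephrasing`, and the `parts` table are all hypothetical. A single in-memory SQLite database stands in for the off-the-shelf servers studied by the authors.

```python
import re
import sqlite3

# Hypothetical rephrasing rule (an assumption, not one of the paper's rules):
# rewrite `col IN (a, b)` as `(col = a OR col = b)`.
# Limitation of this sketch: it does not handle NOT IN, subqueries,
# or string literals that contain commas or parentheses.
IN_LIST = re.compile(r"(\w+)\s+IN\s*\(([^()]*)\)", re.IGNORECASE)


def rephrase_in_list(statement):
    """Return a logically equivalent statement with IN lists expanded."""
    def expand(match):
        column = match.group(1)
        values = [v.strip() for v in match.group(2).split(",")]
        return "(" + " OR ".join(f"{column} = {v}" for v in values) + ")"
    return IN_LIST.sub(expand, statement)


def adjudicate(results):
    """Return the shared result if every variant agrees, else raise."""
    canonical = [sorted(rows) for rows in results]  # ignore row order
    if all(rows == canonical[0] for rows in canonical):
        return canonical[0]
    raise RuntimeError("variants disagree: possible failure detected")


def run_with_rephrasing(conn, statement):
    """Run the original and the rephrased statement, then adjudicate."""
    variants = [statement, rephrase_in_list(statement)]
    results = [conn.execute(v).fetchall() for v in variants]
    return adjudicate(results)


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?)",
    [(1, "bolt"), (2, "nut"), (3, "gear")],
)
rows = run_with_rephrasing(
    conn, "SELECT id, name FROM parts WHERE id IN (1, 3)"
)
print(rows)
```

With only two variants the adjudicator can detect a disagreement but cannot tell which result is correct; more rephrasings, or diverse servers, would be needed before a majority vote could be taken.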