Higher Order Mutation Testing
Mutation testing is a fault-based software testing technique that has been studied widely for over three decades. To date, work in this field has focused largely on first order mutants because it is believed that higher order mutation testing is too computationally expensive to be practical. This thesis argues that some higher order mutants are potentially better able to simulate real-world faults and to reveal insights into programming bugs than the restricted class of first order mutants. This thesis proposes a higher order mutation testing paradigm which combines valuable higher order mutants and non-trivial first order mutants for mutation testing. To overcome the exponential increase in the number of higher order mutants, a search process that seeks fit mutants (both first and higher order) from the space of all possible mutants is proposed. A fault-based higher order mutant classification scheme is introduced. Based on different types of fault interactions, this approach classifies higher order mutants into four categories: expected, worsening, fault masking and fault shifting. A search-based approach is then proposed for locating subsuming and strongly subsuming higher order mutants. These mutants are a subset of the fault masking and fault shifting classes of higher order mutants that are more difficult to kill than their constituent first order mutants. Finally, a hybrid test data generation approach is introduced, which combines dynamic symbolic execution and search-based software testing to generate strongly adequate test data to kill first and higher order mutants.
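The difference between first order and higher order mutants can be illustrated with a small sketch. The function, mutants and test values below are invented for illustration; they are not taken from the thesis.

# Hypothetical program under test; each mutant is a hand-written copy
# that flips one or more operators.

def price(quantity, unit_cost, discount):
    """Original program under test."""
    if quantity > 10:
        return quantity * unit_cost - discount
    return quantity * unit_cost

def price_fom(quantity, unit_cost, discount):
    """First order mutant: a single operator change (> becomes >=)."""
    if quantity >= 10:
        return quantity * unit_cost - discount
    return quantity * unit_cost

def price_hom(quantity, unit_cost, discount):
    """Second (higher) order mutant: two changes combined (>= and + for -).
    Interacting faults like these may partially mask each other, which is
    why some higher order mutants are harder to kill than their
    constituent first order mutants."""
    if quantity >= 10:
        return quantity * unit_cost + discount
    return quantity * unit_cost

def killed(mutant, tests):
    """A test kills a mutant if its output differs from the original's."""
    return any(mutant(*t) != price(*t) for t in tests)

tests = [(10, 2.0, 1.0), (12, 2.0, 0.0)]
print(killed(price_fom, tests), killed(price_hom, tests))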
Fault Links: Identifying Module and Fault Types and Their Relationship
The presented research resulted in a generic component taxonomy, a generic code-fault taxonomy, and an approach to tailoring the generic taxonomies into domain-specific as well as project-specific taxonomies. Also, a means to identify fault links was developed. Fault links represent relationships between the types of code-faults and the types of components being developed or modified. For example, a fault link has been found to exist between Controller modules (which form the backbone of any software through their decision-making characteristics) and Control/Logic faults (such as unreachable code). The existence of such fault links can be used to guide code reviews, walkthroughs, and testing of new code development, as well as code maintenance. It can also be used to direct fault seeding. The results of these methods have been validated. Finally, we also verified the usefulness of the obtained fault links through an experiment conducted using graduate students. The results were encouraging.
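A fault link can be thought of as a simple mapping from component type to the fault types most often observed in that kind of component. The sketch below is purely illustrative; the component and fault-type names are hypothetical, not the taxonomies developed in this research.

# Hypothetical fault-link table: component type -> commonly linked fault types.
fault_links = {
    "Controller": ["control/logic fault (e.g. unreachable code)"],
    "Data store": ["data handling fault", "initialisation fault"],
    "Interface":  ["interface fault", "parameter mismatch"],
}

def review_checklist(component_type):
    """Suggest fault types to focus on when reviewing a module of this type."""
    return fault_links.get(component_type, ["general fault types"])

print(review_checklist("Controller"))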
On the Viability of Quantitative Assessment Methods in Software Engineering and Software Services
IT help desk operations are expensive. Costs associated with IT operations present challenges to profit goals. Help desk managers need a way to plan staffing levels so that labor costs are minimized while problems are resolved efficiently. An incident prediction method is needed for planning staffing levels. The potential value of a solution to this problem is important to an IT service provider since software failures are inevitable and their timing is difficult to predict. In this research, a cost model for help desk operations is developed. The cost model relates predicted incidents to labor costs using real help desk data. Incidents are predicted using software reliability growth models. Cluster analysis is used to group products with similar help desk incident characteristics. Principal Components Analysis is used to select one product per cluster whose incident predictions stand in for all members of the cluster. Incident prediction using cluster representatives is demonstrated successfully for all clusters, with accuracy comparable to making predictions for each product in the portfolio individually. Linear regression is used with incident resolution cost data to relate incident predictions to help desk labor costs. Following a series of four pilot studies, the cost model is validated by successfully demonstrating cost prediction accuracy for one-month prediction intervals over a 22-month period.
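As a rough sketch of this pipeline (not the dissertation's actual model or data), a Goel-Okumoto software reliability growth model can be fitted to cumulative incident counts and the resulting monthly incident prediction fed into a linear regression relating incidents to labor cost. All figures below are invented placeholders.

import numpy as np
from scipy.optimize import curve_fit
from sklearn.linear_model import LinearRegression

# Invented monthly data for one cluster-representative product.
months = np.arange(1, 13)
cumulative_incidents = np.array([40, 75, 105, 130, 150, 167, 181, 192, 201, 208, 214, 219])

def goel_okumoto(t, a, b):
    # Expected cumulative incidents by time t: a * (1 - exp(-b * t)).
    return a * (1.0 - np.exp(-b * t))

(a, b), _ = curve_fit(goel_okumoto, months, cumulative_incidents, p0=(250.0, 0.1))

# Predicted incidents in month 13 (difference of cumulative expectations).
pred_next = goel_okumoto(13, a, b) - goel_okumoto(12, a, b)

# Relate observed monthly incidents to observed labor cost (also invented),
# then predict next month's cost from the predicted incident count.
monthly_incidents = np.diff(np.concatenate(([0], cumulative_incidents))).reshape(-1, 1)
monthly_cost = 120.0 * monthly_incidents.ravel() + np.random.default_rng(0).normal(0, 50, 12)
cost_model = LinearRegression().fit(monthly_incidents, monthly_cost)

print(f"predicted incidents next month: {pred_next:.1f}")
print(f"predicted labor cost: {cost_model.predict([[pred_next]])[0]:.0f}")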
Mathematics in Software Reliability and Quality Assurance
This monograph concerns the mathematical aspects of software reliability and quality assurance and consists of 11 technical papers in this emerging area. Included are the latest research results related to formal methods and design, automatic software testing, software verification and validation, coalgebra theory, automata theory, hybrid systems, and software reliability modeling and assessment.
Model Based Security Testing for Autonomous Vehicles
The purpose of this dissertation is to introduce a novel approach to generating a security test suite to mitigate malicious attacks on an autonomous system. Our method uses model based testing (MBT) methods to model system behavior, attacks, and mitigations as independent threads in an execution stream. The threads intersect at a rendezvous or attack point. We build a security test suite from a behavioral model, an attack type, and a mitigation model using communicating extended finite state machine (CEFSM) models. We also define an applicability matrix to determine which attacks are possible in which states. Our method then builds a comprehensive test suite using edge-node coverage that allows for systematic testing of an autonomous vehicle.
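To make the construction concrete, here is a hedged sketch of the idea in miniature, using a toy model rather than the dissertation's CEFSM notation; all state, event and attack names are invented. Each edge of the behavioral model yields one abstract test, and an attack step is injected wherever the applicability matrix says the attack is possible in the edge's source state.

# Behavioral model: state -> list of (event, next_state) transitions.
behavior = {
    "Idle":     [("start_mission", "Navigate")],
    "Navigate": [("gps_update", "Navigate"), ("obstacle", "Avoid"), ("arrive", "Idle")],
    "Avoid":    [("clear", "Navigate")],
}

# Applicability matrix: attack type -> states in which it is possible.
applicability = {
    "gps_spoofing":      {"Navigate"},
    "command_injection": {"Idle", "Navigate"},
}

def edges(model):
    """Enumerate every (state, event, next_state) edge of the behavioral model."""
    return [(s, e, n) for s, ts in model.items() for e, n in ts]

def test_suite(model, attacks):
    """One abstract test per edge, plus an attack step whenever the edge's
    source state is a rendezvous point for an applicable attack."""
    suite = []
    for state, event, nxt in edges(model):
        steps = [f"in {state}: fire {event} -> {nxt}"]
        for attack, states in attacks.items():
            if state in states:
                steps.append(f"inject {attack} at {state}, expect mitigation")
        suite.append(steps)
    return suite

for case in test_suite(behavior, applicability):
    print(case)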
Data cleaning techniques for software engineering data sets
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Data quality is an important issue which has been addressed and recognised in research communities such as data warehousing, data mining and information systems. It has been agreed that poor data quality will impact the quality of results of analyses and that it will therefore impact decisions made on the basis of these results. Empirical software engineering has neglected the issue of data quality to some extent. This fact poses the question of how researchers in empirical software engineering can trust their results without addressing the quality of the analysed data. One widely accepted definition for data quality describes it as 'fitness for purpose', and the issue of poor data quality can be addressed either by introducing preventative measures or by applying means to cope with data quality issues. The research presented in this thesis addresses the latter with a special focus on noise handling.
Three noise handling techniques, which utilise decision trees, are proposed for application to software engineering data sets. Each technique represents a noise handling approach: robust filtering, where training and test sets are the same; predictive filtering, where training and test sets are different; and filtering and polish, where noisy instances are corrected rather than removed. The techniques were first evaluated in two different investigations by applying them to a large real world software engineering data set. In the first investigation, the techniques' ability to improve predictive accuracy at differing noise levels was tested. All three techniques improved predictive accuracy in comparison to the do-nothing approach, with filtering and polish the most successful. The second investigation, utilising the same large real world software engineering data set, tested the techniques' ability to identify instances with implausible values. These instances were flagged for the purpose of evaluation before applying the three techniques. Robust filtering and predictive filtering decreased the number of instances with implausible values, but substantially decreased the size of the data set too. The filtering and polish technique actually increased the number of implausible values, but it did not reduce the size of the data set.
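A minimal sketch of the three ideas, using scikit-learn decision trees on placeholder data; this is not the thesis's exact procedure or data set.

from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_predict

def robust_filter(X, y):
    # Robust filtering: train and test on the same data; drop instances
    # the tree itself misclassifies.
    tree = DecisionTreeClassifier(random_state=0).fit(X, y)
    keep = tree.predict(X) == y
    return X[keep], y[keep]

def predictive_filter(X, y, folds=5):
    # Predictive filtering: predictions come from trees trained on other
    # folds (cross-validation), so training and test sets differ.
    pred = cross_val_predict(DecisionTreeClassifier(random_state=0), X, y, cv=folds)
    keep = pred == y
    return X[keep], y[keep]

def filter_and_polish(X, y, folds=5):
    # Filtering and polish: instead of dropping suspect instances,
    # replace their labels with the cross-validated prediction.
    pred = cross_val_predict(DecisionTreeClassifier(random_state=0), X, y, cv=folds)
    polished = y.copy()
    polished[pred != y] = pred[pred != y]
    return X, polished

# Example usage on a small synthetic data set.
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
Xr, yr = robust_filter(X, y)
print(len(y), "->", len(yr), "instances after robust filtering")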
Since the data set contained historical software project data, it was not possible to know the real extent of the noise detected. This led to the production of simulated software engineering data sets, which were modelled on the real data set used in the previous evaluations to ensure domain-specific characteristics. These simulated versions of the data set were then injected with noise, such that the real extent of the noise was known. After the noise injection, the three noise handling techniques were applied and evaluated. This procedure of simulating software engineering data sets combined domain-specific characteristics of the real world data with control over the simulated data. This is seen as a particular strength of this evaluation approach.
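The noise injection step might look something like the following sketch (assumed, not the thesis's implementation): corrupt a known fraction of attribute values and keep a mask recording exactly which cells were changed, so the cleaning techniques can later be scored against the true noise.

import numpy as np

def inject_noise(X, fraction, seed=0):
    # Return a noisy copy of X plus a boolean mask of the corrupted cells.
    rng = np.random.default_rng(seed)
    noisy = X.astype(float)
    mask = rng.random(X.shape) < fraction
    for j in range(X.shape[1]):
        rows = mask[:, j]
        # Draw replacement values from the column's observed range so the
        # corrupted data still respects domain-specific value ranges.
        noisy[rows, j] = rng.uniform(X[:, j].min(), X[:, j].max(), rows.sum())
    return noisy, mask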
The results of the evaluation on the simulated data showed that none of the techniques performed well. Robust filtering and filtering and polish performed very poorly, and based on the results of this evaluation they would not be recommended for the task of noise reduction. The predictive filtering technique was the best performing of the three, but it did not perform particularly well either.
An exhaustive systematic literature review has been carried out investigating to what extent the empirical software engineering community has considered data quality. The findings showed that the issue of data quality has been largely neglected by the empirical software engineering community.
The work in this thesis highlights an important gap in empirical software engineering. It provided clarification of, and distinctions between, the terms noise and outliers. The two concepts overlap, but they are fundamentally different. Since noise and outliers are often treated the same by noise handling techniques, a clarification of the two terms was necessary.
To investigate the capabilities of noise handling techniques, a single investigation was deemed insufficient. The reasons for this are that the distinction between noise and outliers is not trivial, and that the investigated noise cleaning techniques are derived from traditional noise handling techniques in which noise and outliers are combined. Therefore three investigations were undertaken to assess the effectiveness of the three presented noise handling techniques. Each investigation should be seen as part of a multi-pronged approach.
This thesis also highlights possible shortcomings of current automated noise handling techniques. The poor performance of the three techniques led to the conclusion that noise handling should be integrated into a data cleaning process in which the input of domain knowledge and the replicability of the data cleaning process are ensured.
Path-based dynamic impact analysis
Successful software systems evolve over their lifetimes through the cumulative changes made by software maintainers. As software evolves, the problems resulting from software change worsen, exacerbated by increased system size and complexity, lack of program understanding, the amount of effort required to make changes, and the number of personnel involved. Experience shows that software changes made without visibility into their effects can lead to poor effort estimates, delays in release schedules, degraded software design, unreliable software products, increased costs, and premature retirement of the software system. Software change impact analysis, or impact analysis, is a software maintenance technique meant to address these problems by assessing the effects of changes made to a software system. While impact analysis is frequently cited as a motivation or a potential application for program analysis and software maintenance research, research specific to the task of impact analysis has languished for more than 10 years. In addition, few researchers have examined the empirical factors underlying common impact analysis techniques or the tradeoffs inherent in known techniques, and none have performed empirical studies comparing impact analysis techniques. In this dissertation we introduce a new impact analysis approach, named PathImpact, that addresses a set of tradeoffs not addressed by any current impact analysis approach. Ours is the first fully dynamic impact analysis approach. PathImpact uses lightweight instrumentation to record program execution at the level of procedure calls and returns, then efficiently builds a compressed representation that can be directly used to estimate change impact. We next extend PathImpact to accommodate system evolution, yielding a technique we call EvolveImpact. EvolveImpact updates the impact representation after a system change, whereas PathImpact requires a complete recomputation. In addition, we show how our approaches can be extended to a large class of emerging software architectures, including Java component-based systems and large-scale systems. Finally, we discuss the implementation of our approaches, present the first cost models for impact analysis techniques, and report the results of the first empirical studies that compare impact analysis techniques. We also empirically examine the performance of our approaches and the factors affecting the use of our techniques in practice. We found that our approach has linear time and space complexity (in the size of the dynamic information collected) and achieved a mean compression value of 0.955 on the subjects we used in our experiments. Our investigation of program evolution across multiple versions of three of our subject programs showed that, depending on the level of change activity, EvolveImpact can update the impact representation more efficiently than recomputing it in a majority of cases.
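The core idea of estimating impact from a dynamic call/return trace can be sketched as follows. This is a simplified illustration, not PathImpact's compressed representation or exact algorithm; the trace and procedure names are invented. A changed procedure's impact set is taken to be the procedure itself, every procedure executed after it, and every caller still on the stack that it can return into.

def impacted(trace, changed):
    """trace is a list of ('call', name) / ('return', name) events from one
    instrumented execution; 'changed' is the modified procedure."""
    impact, stack, seen_change = set(), [], False
    for kind, name in trace:
        if kind == "call":
            stack.append(name)
            if name == changed:
                seen_change = True
                impact.update(stack)      # callers the change can return into
            elif seen_change:
                impact.add(name)          # executes after the change
        else:  # return event
            if stack:
                stack.pop()
    return impact

# Hypothetical trace: main calls parse, then compute, which calls log.
trace = [("call", "main"), ("call", "parse"), ("return", "parse"),
         ("call", "compute"), ("call", "log"), ("return", "log"),
         ("return", "compute"), ("return", "main")]
print(sorted(impacted(trace, "compute")))   # ['compute', 'log', 'main']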