892 research outputs found
Refining Fitness Functions for Search-Based Automated Program Repair: A Case Study with ARJA and ARJA-e
Several tools support code templates as a means to specify searches within a program’s source code. Despite their ubiquity, code templates can be difficult to specify and may produce too many or too few match results. In this paper, we present a search-based approach to support developers in specifying templates. This approach uses a suite of mutation operators to recommend changes to a given template, such that it matches a desired set of code snippets. We evaluate our approach on the problem of inferring a code template that matches all instances of a design pattern, given one instance as a starting template.
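The mutation-based template search described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual system: the template language here is plain regular expressions, `generalize` stands in for the paper's suite of mutation operators, and fitness is simply desired matches minus undesired ones.

```python
import random
import re

def generalize(template):
    """Mutation operator: replace one literal word with a wildcard.

    A stand-in for a richer operator suite; assumes the template's
    literal words contain no regex metacharacters."""
    words = template.split()
    literals = [i for i, w in enumerate(words) if w != r"\w+"]
    if not literals:
        return template
    words[random.choice(literals)] = r"\w+"
    return " ".join(words)

def fitness(template, wanted, unwanted):
    """Desired snippets matched minus undesired snippets matched."""
    pattern = re.compile(template)
    hits = sum(bool(pattern.fullmatch(s)) for s in wanted)
    misses = sum(bool(pattern.fullmatch(s)) for s in unwanted)
    return hits - misses

def search(seed_template, wanted, unwanted, steps=200):
    """Hill-climb from a single starting template toward full coverage."""
    best = seed_template
    best_fit = fitness(best, wanted, unwanted)
    for _ in range(steps):
        candidate = generalize(best)
        candidate_fit = fitness(candidate, wanted, unwanted)
        if candidate_fit > best_fit:
            best, best_fit = candidate, candidate_fit
    return best
```

Starting from the single snippet `lock a`, this search generalizes toward `lock \w+`, which also matches `lock b` while still rejecting unrelated snippets.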
Effects of an Unusual Poison Identify a Lifespan Role for Topoisomerase 2 in Saccharomyces cerevisiae
A progressive loss of genome maintenance has been implicated as both a cause and consequence of aging. Here we present evidence supporting the hypothesis that an age-associated decay in genome maintenance promotes aging in Saccharomyces cerevisiae (yeast) due to an inability to sense or repair DNA damage by topoisomerase 2 (yTop2). We describe the characterization of LS1, identified in a high-throughput screen for small molecules that shorten the replicative lifespan of yeast. LS1 accelerates aging without affecting proliferative growth or viability. Genetic and biochemical criteria reveal LS1 to be a weak Top2 poison. Top2 poisons induce the accumulation of covalent Top2-linked DNA double-strand breaks that, if left unrepaired, lead to genome instability and death. LS1 is toxic to cells deficient in homologous recombination, suggesting that the damage it induces is normally mitigated by genome maintenance systems. The essential roles of yTop2 in proliferating cells may come with a fitness trade-off in older cells that are less able to sense or repair yTop2-mediated DNA damage. Consistent with this idea, cells live longer when yTop2 expression levels are reduced. These results identify intrinsic yTop2-mediated DNA damage as a potentially manageable cause of aging.
Leveraging Automated Unit Tests for Unsupervised Code Translation
With little to no parallel data available for programming languages, unsupervised methods are well-suited to source code translation. However, the majority of unsupervised machine translation approaches rely on back-translation, a method developed in the context of natural language translation and one that inherently involves training on noisy inputs. Unfortunately, source code is highly sensitive to small changes; changing a single token can result in compilation failures or erroneous programs, unlike natural languages, where small inaccuracies may not change the meaning of a sentence. To address this issue, we propose to leverage an automated unit-testing system to filter out invalid translations, thereby creating a fully tested parallel corpus. We found that fine-tuning an unsupervised model with this filtered data set significantly reduces the noise in the generated translations, comfortably outperforming the state of the art for all language pairs studied. In particular, for Java→Python and Python→C++ we outperform the best previous methods by more than 16% and 24% respectively, reducing the error rate by more than 35%.
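The test-based filtering step can be sketched as follows. This is an illustrative reconstruction, not the authors' pipeline: both sides are Python here for simplicity, the generated unit tests are represented as plain `assert` statements, and a candidate translation is accepted only if it runs its tests cleanly in a fresh interpreter.

```python
import os
import subprocess
import sys
import tempfile

def passes_tests(candidate_src, test_src, timeout=10):
    """Run a candidate translation plus its unit tests in a subprocess;
    the candidate is kept only if the process exits cleanly."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_src + "\n\n" + test_src + "\n")
        path = f.name
    try:
        result = subprocess.run([sys.executable, path],
                                capture_output=True, timeout=timeout)
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False
    finally:
        os.unlink(path)

def build_parallel_corpus(source_fn, candidates, test_src):
    """Pair the source function with every candidate that passes the tests."""
    return [(source_fn, c) for c in candidates if passes_tests(c, test_src)]
```

Only candidates that satisfy the unit tests end up in the parallel corpus used for fine-tuning; invalid or semantically wrong translations are discarded.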
Automatically correcting syntactic and semantic errors in ATL transformations using multi-objective optimization
Model-driven engineering (MDE) is a software development paradigm that promotes the use of models as first-class artifacts and automated processes to derive other artifacts from them, such as code, documentation, and test cases. Model transformation is an important element of MDE, since it allows the manipulation of the abstract representations that models are. Model transformations, like other programs, are subject to both syntactic and semantic errors. Fixing those errors is difficult and time-consuming, as the transformations depend on the transformation language, such as ATL, and on the modeling languages in which input and output models are expressed. Existing work on transformation repair targets either syntactic or semantic errors, one error at a time, and defines patch templates manually. The main goal of our research is to propose a generic framework to fix multiple syntactic and semantic errors automatically. To achieve this goal, we reformulate the repair of model transformations as a multi-objective optimization problem and solve it by means of evolutionary algorithms. To adapt the framework to the two categories of errors, we use different types of objectives and sophisticated strategies to guide the search.
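The multi-objective formulation rests on Pareto dominance between candidate repairs. The sketch below is a minimal illustration of that relation, not the thesis's framework: the objectives are hypothetical stand-ins (e.g. remaining errors and patch size, both minimized), and an evolutionary algorithm such as NSGA-II would be layered on top of it.

```python
def dominates(a, b):
    """Candidate a dominates b if it is no worse on every objective
    (all minimized here) and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(population):
    """Keep the candidates not dominated by any other candidate."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q != p)]
```

The search keeps the front as its elite set: a repair scoring (0 errors, 9 edits) and one scoring (1 error, 5 edits) can coexist, while any repair worse on both counts is discarded.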
Novel insights into the DNA interstrand cross-link repair in Schizosaccharomyces pombe: characterisation of Fan1 through standard and high-throughput genetic analysis
FAN1/MTMR15 (Fanconi anemia-associated nuclease 1 / Myotubularin-related protein 15) is a protein originally identified from a set of size-fractionated human brain cDNA libraries coding for large proteins in vitro (Nagase et al., 1999). FAN1 is widely conserved across eukaryotes, with the notable exception of S. cerevisiae (Smogorzewska et al., 2010; MacKay et al., 2010; Kratz et al., 2010; Liu et al., 2010; Shereda et al., 2010). Recent work has shown that FAN1 is a novel component of the Fanconi Anemia DNA repair pathway in higher eukaryotes (Smogorzewska et al., 2010; MacKay et al., 2010; Kratz et al., 2010; Yoshikiyo et al., 2010; Liu et al., 2010; Shereda et al., 2010).
My work presents a biochemical and genetic characterisation of Fan1, the Schizosaccharomyces pombe ortholog of FAN1. I show that, in contrast with the situation in higher eukaryotes, Fan1 in S. pombe does not strongly interact with components of the mismatch repair pathway. The disruption of fan1 causes a mild sensitivity to interstrand cross-linking agents, dramatically augmented by the concomitant deletion of the nuclease Pso2, suggesting a role for Fan1 in the resolution of DNA interstrand cross-links. Further genetic interactions are explored using an automated high-throughput screen, where a non-epistatic relationship is found with Pli1, a component of the SUMOylation pathway. Finally, I show that three conserved residues in the VRR_nuc nuclease domain are required for Fan1 activity in DNA repair. Taken together, the data presented point to a role for S. pombe Fan1 in the resolution of adducts created by DNA interstrand cross-linking agents.
A self-healing framework for general software systems
Modern systems must guarantee high reliability, availability, and efficiency. Their complexity, exacerbated by dynamic integration with other systems, the use of third-party services, and the various environments where they run, challenges development practices, tools, and testing techniques. Testing cannot identify and remove all possible faults, so faulty conditions may escape verification and validation activities and manifest themselves only after system deployment. To cope with those failures, researchers have proposed the concept of self-healing systems. Such systems have the ability to examine their failures and to automatically take corrective actions. The idea is to create software systems that can integrate the knowledge needed to compensate for the effects of their imperfections. This knowledge is usually codified into the systems in the form of redundancy. Redundancy can be deliberately added into systems as part of the design and development process, as occurs in many fault-tolerance techniques. Although this kind of redundancy is widely applied, especially in safety-critical systems, it is generally too expensive for commodity software systems. We have evidence that modern software systems are characterized by a different type of redundancy, which is not deliberately introduced but is naturally present due to modern modular software design. We call it intrinsic redundancy. This thesis proposes a way to use the intrinsic redundancy of software systems to increase their reliability at a low cost. We first study the nature of the intrinsic redundancy to demonstrate that it actually exists. We then propose a way to express and encode such redundancy and an approach, Java Automatic Workaround, to exploit it automatically and at runtime to avoid system failures.
Fundamentally, the Java Automatic Workaround approach replaces some failing operations with alternative operations that are semantically equivalent in terms of the expected results and the developer’s intent, but that differ syntactically in a way that can ultimately avoid the failure. We qualitatively discuss the reasons for the presence of the intrinsic redundancy, and we quantitatively study four large libraries to show that such redundancy is indeed a characteristic of modern software systems. We then implement the approach in a prototype and evaluate it with four open-source applications. Our studies show that the approach effectively exploits the intrinsic redundancy to avoid failures automatically and at runtime.
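The runtime mechanism can be illustrated with a small sketch. This is not the Java Automatic Workaround API; it is a hypothetical Python analogue in which the developer declares redundant, semantically equivalent operations and a wrapper falls back to the next alternative when one fails.

```python
def with_workarounds(*alternatives):
    """Build an operation that tries each equivalent alternative in turn.

    A real self-healing system would also validate results and detect
    failures more broadly than by catching exceptions."""
    def attempt(*args, **kwargs):
        last_error = None
        for operation in alternatives:
            try:
                return operation(*args, **kwargs)
            except Exception as exc:
                last_error = exc
        raise last_error
    return attempt

# Two intrinsically redundant ways to concatenate sequences; imagine
# the first one fails at runtime in some deployment.
def broken_concat(a, b):
    raise RuntimeError("failure in the primary implementation")

def fallback_concat(a, b):
    return a + b

concat = with_workarounds(broken_concat, fallback_concat)
```

When the primary operation raises, the wrapper transparently retries with the equivalent alternative, so the caller never observes the failure.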
Optimising non-destructive examination of newbuilding ship hull structures by developing a data-centric risk and reliability framework based on fracture mechanics
This thesis was previously held under moratorium from 18/11/19 to 18/11/21.
Ship structures are made of steel members that are joined with welds. Welded connections may contain various imperfections. These imperfections are inherent to this joining technology. Design rules and standards are based on the assumption that welds are made to a good workmanship level. Hence, a ship is inspected during construction to make sure it is reasonably defect-free. However, since 100% inspection coverage is not feasible, only partial inspection has been required by classification societies. Classification societies have developed rules, standards, and guidelines specifying the extent to which inspection should be performed.
In this research, a review of rules and standards from classification bodies showed some limitations in current practices. One key limitation is that the rules favour a “one-size-fits-all” approach. In addition to that, a significant discrepancy exists between rules of different classification societies.
In this thesis, an innovative framework is proposed, which combines a risk and reliability approach with a statistical sampling scheme achieving targeted and cost-effective inspections. The developed reliability model predicts the failure probability of the structure based on probabilistic fracture mechanics. Various uncertain variables influencing the predictive reliability model are identified, and their effects are considered. The data for two key variables, namely, defect statistics and material toughness are gathered and analysed using appropriate statistical analysis methods.
A reliability code is developed based on the Convolution Integral (CI), which estimates the predictive reliability using the analysed data. Statistical sampling principles are then used to specify the number of required NDT checkpoints to achieve a certain statistical confidence about the reliability of the structure and the limits set by statistical process control (SPC). The framework allows for updating the predictive reliability estimate of the structure using the inspection findings by employing a Bayesian updating method.
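The Bayesian updating step can be illustrated in miniature. This is not the thesis's CI-based code: it assumes a conjugate Beta prior on the per-weld defect rate and updates it with binomial NDT inspection findings, which is the simplest instance of the kind of update described.

```python
def update_defect_rate(alpha, beta, inspected, defects_found):
    """Beta(alpha, beta) prior + binomial inspection evidence
    -> Beta posterior over the per-weld defect rate."""
    return alpha + defects_found, beta + (inspected - defects_found)

def posterior_mean(alpha, beta):
    """Point estimate of the defect rate under the Beta posterior."""
    return alpha / (alpha + beta)
```

Starting from a Beta(1, 99) prior (mean defect rate 1%), finding 1 defect among 50 inspected welds yields a Beta(2, 148) posterior with mean 2/150, so the inspection findings directly refine the reliability estimate.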
The applicability of the framework is clearly demonstrated in a case study structure.
Gin: Genetic Improvement Research Made Easy
Genetic improvement (GI) is a young field of research on the cusp of transforming software development. GI uses search to improve existing software. Researchers have already shown that GI can improve human-written code, ranging from program repair to optimising run-time, from reducing energy consumption to the transplantation of new functionality. Much remains to be done. The cost of re-implementing GI to investigate new approaches is hindering progress. Therefore, we present Gin, an extensible and modifiable toolbox for GI experimentation, with a novel combination of features. Instantiated in Java and targeting the Java ecosystem, Gin automatically transforms, builds, and tests Java projects. Out of the box, Gin supports automated test generation and source code profiling. We show, through examples and a case study, how Gin facilitates experimentation and will speed innovation in GI.
Control flow graph visualization and its application to coverage and fault localization in Python
This report presents a software testing tool that creates visualizations of the Control Flow Graph (CFG) from Python source code. The CFG is a representation of a program that shows execution paths that may be taken by the machine. Similar techniques to the ones here could be applied to many other languages, but the CFGs in this tool are tailored to the Python language. As computers get faster, tools to help programmers be effective at work can become more complex and still give quick feedback, without causing an undue performance burden. This tool explores several approaches to giving feedback to developers through a visualization of the CFG. First, just the viewing of a CFG gives a different perspective on the code. A programmer could choose to juxtapose the CFG with complexity metrics during development, seeing increased complexity as graphs grow larger. Second, the tool implements a mechanism to provide code coverage to Python modules. This feature extends the visualization to show code coverage as a highlighted CFG. Test coverage requirements are calculated to check node, edge, edge-pair, and prime path coverage. From studying existing testing tools, it appears no existing tool for Python provides all these test coverage levels. Third, the tool provides an interface for adding custom highlighting of the CFG, used here to visualize fault localization. Seeing the most suspicious locations from fault localization techniques could be used to reduce debugging time. The results of running the tool on several popular Python packages, and on itself, show its performance is competitive with the most popular coverage tool when measuring branch coverage. It is slightly slower on statement coverage alone, but much faster than an unoptimized version and a logic coverage tool. This report also presents ideas for extensions to the tool. Among them is to incorporate program repair using fault localization and mutation operators.
Visualizing code as a CFG provides interesting ways to look at many software testing metrics.
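The coverage side of the report can be illustrated on a hand-written CFG. This sketch is not the tool itself: the graph is an adjacency dict of basic-block ids, an execution is a list of visited nodes, and edge coverage reduces to set arithmetic over traversed edges, as in the highlighted-CFG view.

```python
def cfg_edges(cfg):
    """All (src, dst) edges of a CFG given as an adjacency dict."""
    return {(src, dst) for src, dsts in cfg.items() for dst in dsts}

def edge_coverage(cfg, trace):
    """Fraction of CFG edges exercised by one execution trace."""
    executed = set(zip(trace, trace[1:]))
    edges = cfg_edges(cfg)
    return len(edges & executed) / len(edges)

# CFG of a single if/else: entry branches to two blocks that rejoin.
IF_ELSE = {"entry": ["then", "else"],
           "then": ["exit"],
           "else": ["exit"],
           "exit": []}
```

A single run through either branch exercises two of the four edges, so full edge coverage of an if/else needs at least two test executions.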