52,572 research outputs found
Early experiences of computer-aided assessment and administration when teaching computer programming
This paper describes early experiences with the Ceilidh system, currently being piloted at over 30 institutions of higher education. Ceilidh is a course-management system for teaching computer programming whose core is an auto-assessment facility. This facility automatically marks students' programs from a range of perspectives, and may be used in an iterative manner, enabling students to work towards a target level of attainment. Ceilidh also includes extensive course-administration and progress-monitoring facilities, as well as support for other forms of assessment, including short-answer marking and the collation of essays for later hand-marking. The paper discusses the motivation for developing Ceilidh, outlines its major facilities, and then summarizes experiences of developing and actually using it at the coal face over three years of teaching.
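The abstract gives no implementation details, but the iterative marking loop it describes can be pictured with a short sketch. The Python below is a hypothetical auto-marker, not Ceilidh's actual code: it runs a student's program against input/output pairs and reports whether a target attainment level has been reached, so a student could resubmit and re-mark until the target is met. All names and thresholds are illustrative assumptions.

```python
# Hypothetical sketch of an iterative auto-marking loop in the spirit of
# an auto-assessment facility; not Ceilidh's actual marking scheme.
import subprocess

def run_case(executable, stdin_text, expected_stdout):
    """Run the student's program on one test input and check its output."""
    result = subprocess.run(
        [executable], input=stdin_text, capture_output=True,
        text=True, timeout=5,
    )
    return result.stdout.strip() == expected_stdout.strip()

def mark(executable, test_cases, target=0.8):
    """Return a fractional mark and whether the target attainment is met."""
    passed = sum(run_case(executable, i, o) for i, o in test_cases)
    score = passed / len(test_cases)
    return score, score >= target

if __name__ == "__main__":
    cases = [("2 3\n", "5"), ("10 -4\n", "6")]  # illustrative I/O pairs
    score, met = mark("./student_sum", cases)   # hypothetical student binary
    print(f"mark: {score:.0%}, target met: {met}")
    # Students resubmit and re-run until the target level is reached.
```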
Automated software quality visualisation using fuzzy logic techniques
In the past decade there has been a concerted effort by the software industry to improve the quality of its products. This has led to the inception of various techniques with which to control and measure the process involved in software development. Methods like the Capability Maturity Model have introduced processes and strategies that require measurement in the form of software metrics. With the ever-increasing number of software metrics being introduced by capability-based processes, software development organisations are finding it more difficult to understand and interpret metric scores. This is particularly problematic for senior management and project managers, for whom analysis of the actual data is not feasible. This paper proposes a method with which to visually represent metric scores so that managers can easily see how their organisation is performing relative to quality goals set for each type of metric. Acting primarily as a proof of concept and prototype, we suggest ways in which real customer needs can be translated into a feasible technical solution. The solution itself visualises metric scores in the form of a tree structure and utilises Fuzzy Logic techniques, XGMML, Web Services and the .NET Framework. Future work is proposed to extend the system beyond the prototype stage and to overcome a problem with the masking of poor scores.
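The abstract does not reproduce the paper's fuzzy model, so the sketch below only illustrates the general technique: mapping a normalised metric score onto linguistic quality labels via triangular membership functions, as a fuzzy-logic front end to such a visualisation might do. The labels, breakpoints, and the `triangular` helper are assumptions for illustration, not the paper's exact model.

```python
# Illustrative fuzzification of a software metric score with triangular
# membership functions; labels and breakpoints are assumed, not the paper's.
def triangular(x, a, b, c):
    """Membership degree of x in a triangle peaking at b over [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

LABELS = {
    "poor":       lambda x: triangular(x, -0.001, 0.0, 0.5),
    "acceptable": lambda x: triangular(x, 0.0, 0.5, 1.0),
    "good":       lambda x: triangular(x, 0.5, 1.0, 1.001),
}

def fuzzify(score):
    """Return the membership degree of a normalised score in each label."""
    return {label: mf(score) for label, mf in LABELS.items()}

print(fuzzify(0.7))  # {'poor': 0.0, 'acceptable': 0.6, 'good': 0.4}
```

A score of 0.7 is thus partly "acceptable" and partly "good", and such per-metric memberships could be aggregated up a tree of metrics for display to managers.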
Finding a boundary between valid and invalid regions of the input space
In the context of robustness testing, the boundary between the valid and
invalid regions of the input space can be an interesting source of erroneous
inputs. Knowing where a specific software under test (SUT) has a boundary is
essential for validation in relation to requirements. However, finding where a SUT actually implements the boundary is a non-trivial problem that has received little attention. This paper proposes a method of finding the boundary
between the valid and invalid regions of the input space. The proposed method
consists of two steps. First, test data generators, directed by a search algorithm to maximise the distance to known valid test cases, generate valid test cases that are closer to the boundary. Second, these valid test cases undergo
mutations to try to push them over the boundary and into the invalid part of
the input space. This results in a pair of test sets, one consisting of test
cases on the valid side of the boundary and a matched set on the outer side,
with only a small distance between the two sets. The method is evaluated on a
number of examples from the standard library of a modern programming language.
We propose a method of determining the boundary between valid and invalid regions of the input space and apply it to a SUT that has a non-contiguous valid region of the input space. From the small distance between the resulting pairs of test sets, and the fact that one test set contains valid test cases and the other invalid test cases, we conclude that the pairs of test sets describe the boundary between the valid and invalid regions of that input space. Differences in behaviour can be observed between different distances and sets of mutation operators, but all show that the method is able to identify
the boundary between the valid and invalid regions of the input space. This is
an important step towards more automated robustness testing.
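A minimal sketch of the two-step method follows, under simplifying assumptions: the SUT is reduced to a toy validity predicate on a single number, "distance" is absolute numeric difference, and the search step is a plain hill climb standing in for the paper's search-directed test data generators. `sut_is_valid`, `step1_push_outward`, and `step2_mutate_across` are illustrative names.

```python
# Toy sketch of the two-step boundary search; the real method uses
# search-directed test data generators and richer input types.
import random

def sut_is_valid(x):
    """Toy SUT stand-in: inputs are valid only in [0, 100]."""
    return 0 <= x <= 100

def step1_push_outward(seed, known_valid, iters=2000):
    """Step 1: hill-climb a valid input to maximise its distance from the
    known valid test cases, moving it towards a boundary."""
    best = seed
    for _ in range(iters):
        cand = best + random.uniform(-1.0, 1.0)
        if sut_is_valid(cand) and (
            min(abs(cand - k) for k in known_valid)
            > min(abs(best - k) for k in known_valid)
        ):
            best = cand
    return best

def step2_mutate_across(valid_case, step=0.01, max_tries=200):
    """Step 2: apply growing mutations in both directions until one
    pushes the boundary-near valid case into the invalid region."""
    for k in range(1, max_tries + 1):
        for delta in (k * step, -k * step):
            cand = valid_case + delta
            if not sut_is_valid(cand):
                return valid_case, cand  # matched (valid, invalid) pair
    return valid_case, None

near_boundary = step1_push_outward(seed=50.0, known_valid=[50.0])
valid, invalid = step2_mutate_across(near_boundary)
print(valid, invalid)  # a pair straddling the boundary, a small distance apart
```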
You Cannot Fix What You Cannot Find! An Investigation of Fault Localization Bias in Benchmarking Automated Program Repair Systems
Properly benchmarking Automated Program Repair (APR) systems should
contribute to the development and adoption of the research outputs by
practitioners. To that end, the research community must ensure that it reaches
significant milestones by reliably comparing state-of-the-art tools for a
better understanding of their strengths and weaknesses. In this work, we
identify and investigate a practical bias caused by the fault localization (FL)
step in a repair pipeline. We propose to highlight the different fault
localization configurations used in the literature, and their impact on APR
systems when applied to the Defects4J benchmark. Then, we explore the
performance variations that can be achieved by 'tweaking' the FL step. Finally, we expect to create new momentum for (1) full disclosure of APR
experimental procedures with respect to FL, (2) realistic expectations of
repairing bugs in Defects4J, as well as (3) reliable performance comparison
among the state-of-the-art APR systems, and against the baseline performance
results of our thoroughly assessed kPAR repair tool. Our main findings include:
(a) only a subset of Defects4J bugs can currently be localized by commonly-used FL techniques; (b) the current practice of comparing state-of-the-art APR systems (i.e., counting the number of fixed bugs) is potentially misleading due to the bias of FL configurations; and (c) APR authors do not properly qualify their performance achievements with respect to the different tuning parameters implemented in APR systems.
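Since the bias discussed here enters through the FL step, a concrete picture of that step helps. Below is a minimal sketch of spectrum-based fault localization with the Ochiai formula, one commonly-used family of FL techniques of the kind the paper examines; the coverage matrix is invented for illustration and is not drawn from Defects4J or kPAR.

```python
# Sketch of spectrum-based fault localization (Ochiai); the coverage data
# below is made up for illustration.
import math

def ochiai(ef, ep, nf):
    """Suspiciousness from counts of failing tests executing the statement
    (ef), passing tests executing it (ep), and failing tests not (nf)."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

# coverage[test] = set of statements the test executes
coverage = {"t1": {1, 2, 3}, "t2": {1, 3}, "t3": {1, 4}}
failing = {"t2"}

total_fail = len(failing)
scores = {}
for stmt in set().union(*coverage.values()):
    ef = sum(1 for t in failing if stmt in coverage[t])
    ep = sum(1 for t in coverage if t not in failing and stmt in coverage[t])
    scores[stmt] = ochiai(ef, ep, total_fail - ef)

# APR tools typically attempt patches in descending suspiciousness order,
# so a different FL configuration changes which bugs a tool can even reach.
for stmt, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"statement {stmt}: {s:.2f}")
```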
- …