250,962 research outputs found
Development of the PEBL Traveling Salesman Problem Computerized Testbed
The traveling salesman problem (TSP) is a combinatorial optimization problem that requires finding the shortest path through a set of points (“cities”) that returns to the starting point. Because humans provide heuristic near-optimal solutions to Euclidean versions of the problem, it has sometimes been used to investigate human visual problem solving ability. The TSP is also similar to a number of tasks commonly used for neuropsychological assessment (such as the trail-making test), and so its utility in assessing reliable individual differences in problem solving has sometimes been examined. Nevertheless, the task has seen little widespread use in clinical and assessment domains, in part because no standard software implementation or item set is widely available with known psychometric properties. In this paper, we describe a computerized version of TSP running in the free and open source Psychology Experiment Building Language (PEBL). The PEBL TSP task is designed to be suitable for use within a larger battery of tests, and to examine both standard and custom TSP node configurations (i.e., problems). We report the results of a series of experiments that help establish the test’s reliability and validity. The first experiment examines test-retest reliability, establishes that the quality of solutions in the TSP are not impacted by mild physiological strain, and demonstrates how solution quality obtained by individuals in a physical version is highly correlated with solution quality obtained in the PEBL version. The second experiment evaluates a larger set of problems, and uses the data to identify a small subset of tests that have maximal coherence. A third experiment examines test-retest reliability of this smaller set that can be administered in about five minutes, and establishes that these problems produce composite scores with moderately high (R = .75) test-retest reliability, making it suitable for use in many assessment situations, including evaluations of individual differences, personality, and intelligence testing
Optimal discrete stopping times for reliability growth tests
Often, the duration of a reliability growth development test is specified in advance and the decision to terminate or continue testing is conducted at discrete time intervals. These features are normally not captured by reliability growth models. This paper adapts a standard reliability growth model to determine the optimal time for which to plan to terminate testing. The underlying stochastic process is developed from an Order Statistic argument with Bayesian inference used to estimate the number of faults within the design and classical inference procedures used to assess the rate of fault detection. Inference procedures within this framework are explored where it is shown the Maximum Likelihood Estimators possess a small bias and converges to the Minimum Variance Unbiased Estimator after few tests for designs with moderate number of faults. It is shown that the Likelihood function can be bimodal when there is conflict between the observed rate of fault detection and the prior distribution describing the number of faults in the design. An illustrative example is provided
Rigorously assessing software reliability and safety
This paper summarises the state of the art in the assessment of software reliability and safety ("dependability"), and describes some promising developments. A sound demonstration of very high dependability is still impossible before operation of the software; but research is finding ways to make rigorous assessment increasingly feasible. While refined mathematical techniques cannot take the place of factual knowledge, they can allow the decision-maker to draw more accurate conclusions from the knowledge that is available
Recommended from our members
Evaluating the resilience and security of boundaryless, evolving socio-technical Systems of Systems
Advanced Techniques for Assets Maintenance Management
16th IFAC Symposium on Information Control Problems in Manufacturing INCOM 2018
Bergamo, Italy, 11–13 June 2018. Edited by Marco Macchi, László Monostori, Roberto PintoThe aim of this paper is to remark the importance of new and advanced techniques supporting decision making in different business processes for maintenance and assets management, as well as the basic need of adopting a certain management framework with a clear processes map and the corresponding IT supporting systems. Framework processes and systems will be the key fundamental enablers for success and for continuous improvement. The suggested framework will help to define and improve business policies and work procedures for the assets operation and maintenance along their life cycle. The following sections present some achievements on this focus, proposing finally possible future lines for a research agenda within this field of assets management
Software Engineers' Information Seeking Behavior in Change Impact Analysis - An Interview Study
Software engineers working in large projects must navigate complex
information landscapes. Change Impact Analysis (CIA) is a task that relies on
engineers' successful information seeking in databases storing, e.g., source
code, requirements, design descriptions, and test case specifications. Several
previous approaches to support information seeking are task-specific, thus
understanding engineers' seeking behavior in specific tasks is fundamental. We
present an industrial case study on how engineers seek information in CIA, with
a particular focus on traceability and development artifacts that are not
source code. We show that engineers have different information seeking
behavior, and that some do not consider traceability particularly useful when
conducting CIA. Furthermore, we observe a tendency for engineers to prefer less
rigid types of support rather than formal approaches, i.e., engineers value
support that allows flexibility in how to practically conduct CIA. Finally, due
to diverse information seeking behavior, we argue that future CIA support
should embrace individual preferences to identify change impact by empowering
several seeking alternatives, including searching, browsing, and tracing.Comment: Accepted for publication in the proceedings of the 25th International
Conference on Program Comprehensio
Expert Elicitation for Reliable System Design
This paper reviews the role of expert judgement to support reliability
assessments within the systems engineering design process. Generic design
processes are described to give the context and a discussion is given about the
nature of the reliability assessments required in the different systems
engineering phases. It is argued that, as far as meeting reliability
requirements is concerned, the whole design process is more akin to a
statistical control process than to a straightforward statistical problem of
assessing an unknown distribution. This leads to features of the expert
judgement problem in the design context which are substantially different from
those seen, for example, in risk assessment. In particular, the role of experts
in problem structuring and in developing failure mitigation options is much
more prominent, and there is a need to take into account the reliability
potential for future mitigation measures downstream in the system life cycle.
An overview is given of the stakeholders typically involved in large scale
systems engineering design projects, and this is used to argue the need for
methods that expose potential judgemental biases in order to generate analyses
that can be said to provide rational consensus about uncertainties. Finally, a
number of key points are developed with the aim of moving toward a framework
that provides a holistic method for tracking reliability assessment through the
design process.Comment: This paper commented in: [arXiv:0708.0285], [arXiv:0708.0287],
[arXiv:0708.0288]. Rejoinder in [arXiv:0708.0293]. Published at
http://dx.doi.org/10.1214/088342306000000510 in the Statistical Science
(http://www.imstat.org/sts/) by the Institute of Mathematical Statistics
(http://www.imstat.org
Recommended from our members
Assessing the Risk due to Software Faults: Estimates of Failure Rate versus Evidence of Perfection.
In the debate over the assessment of software reliability (or safety), as applied to critical software, two extreme positions can be discerned: the ‘statistical’ position, which requires that the claims of reliability be supported by statistical inference from realistic testing or operation, and the ‘perfectionist’ position, which requires convincing indications that the software is free from defects. These two positions naturally lead to requiring different kinds of supporting evidence, and actually to stating the dependability requirements in different ways, not allowing any direct comparison. There is often confusion about the relationship between statements about software failure rates and about software correctness, and about which evidence can support either kind of statement. This note clarifies the meaning of the two kinds of statement and how they relate to the probability of failure-free operation, and discusses their practical merits, especially for high required reliability or safety
Evaluating testing methods by delivered reliability
There are two main goals in testing software: (1) to achieve adequate quality (debug testing), where the objective is to probe the software for defects so that these can be removed, and (2) to assess existing quality (operational testing), where the objective is to gain confidence that the software is reliable. Debug methods tend to ignore random selection of test data from an operational profile, while for operational methods this selection is all-important. Debug methods are thought to be good at uncovering defects so that these can be repaired, but having done so they do not provide a technically defensible assessment of the reliability that results. On the other hand, operational methods provide accurate assessment, but may not be as useful for achieving reliability. This paper examines the relationship between the two testing goals, using a probabilistic analysis. We define simple models of programs and their testing, and try to answer the question of how to attain program reliability: is it better to test by probing for defects as in debug testing, or to assess reliability directly as in operational testing? Testing methods are compared in a model where program failures are detected and the software changed to eliminate them. The “better” method delivers higher reliability after all test failures have been eliminated. Special cases are exhibited in which each kind of testing is superior. An analysis of the distribution of the delivered reliability indicates that even simple models have unusual statistical properties, suggesting caution in interpreting theoretical comparisons
- …