An empirical study of evolution of inheritance in Java OSS
Previous studies of Object-Oriented (OO) software have reported avoidance of the inheritance mechanism and cast doubt on the wisdom of ‘deep’ inheritance levels. From an evolutionary perspective, the picture is unclear: we still know relatively little about how, over time, changes tend to be applied by developers. Our conjecture is that an inheritance hierarchy will tend to grow ‘breadth-wise’ rather than ‘depth-wise’. This claim is made on the basis that developers will avoid extending depth in favour of breadth because of the inherent complexity of having to understand the functionality of superclasses. The goal of our study is therefore to investigate this empirically. We conduct an empirical study of seven Java Open-Source Systems (OSSs) over a series of releases to observe the nature and location of changes within the inheritance hierarchies. Results show a strong tendency for classes to be added at levels one and two of the hierarchy rather than anywhere else: over 96% of the classes added over the course of the studied versions of all systems were at level 1 or level 2. The results suggest that changes cluster in the shallow levels of a hierarchy; this is relevant for developers since it indicates where remedial activities such as refactoring should be focused.
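A minimal sketch of the kind of measurement the study above describes, assuming classes are represented simply as class-to-superclass maps per release (the real study would extract these from source or bytecode); the level-1/level-2 convention in the comments is an assumption for illustration, not the paper's exact definition.

```python
# Hypothetical sketch: tally the inheritance level at which new classes appear
# between two releases, given class -> superclass maps for each release.

def depth(cls, supers):
    """Inheritance level: 1 = extends java.lang.Object directly, 2 = one level deeper, ..."""
    d, cur = 0, cls
    while cur is not None and cur != "java.lang.Object":
        d += 1
        cur = supers.get(cur)  # parents outside the map are treated as Object
    return d

def added_class_levels(old_release, new_release):
    """Return {level: count} for classes present in new_release but not in old_release."""
    levels = {}
    for cls in set(new_release) - set(old_release):
        lvl = depth(cls, new_release)
        levels[lvl] = levels.get(lvl, 0) + 1
    return levels

# Toy class -> superclass maps for two successive releases:
v1 = {"A": None, "B": "A"}
v2 = {"A": None, "B": "A", "C": None, "D": "B"}
print(added_class_levels(v1, v2))  # {1: 1, 3: 1}
```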
Evaluating prediction systems in software project estimation
Context: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results.
Objective: To reduce the inconsistency amongst validation study results and provide a more formal foundation to interpret results with a particular focus on continuous prediction systems.
Method: A new framework is proposed for evaluating competing prediction systems based upon (1) an unbiased statistic, Standardised Accuracy, (2) testing the result likelihood relative to the baseline technique of random ‘predictions’, that is guessing, and (3) calculation of effect sizes.
Results: Previously published empirical evaluations of prediction systems are re-examined and the original conclusions shown to be unsafe. Additionally, even the strongest results are shown to have no more than a medium effect size relative to random guessing.
Conclusions: Biased accuracy statistics such as MMRE are deprecated. By contrast, this new empirical validation framework leads to meaningful results. Such steps will assist in performing future meta-analyses and in providing more robust and usable recommendations to practitioners. Martin Shepperd was supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/H050329.
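A hedged sketch of the evaluation idea in the abstract above: a Standardised Accuracy (SA) statistic measured against a random-guessing baseline, plus a Glass's-Delta-style effect size. The formulas follow the usual presentation of SA; treat the details (and the toy data) as an illustration rather than a verbatim reimplementation of the paper's framework.

```python
import numpy as np

rng = np.random.default_rng(1)

def mar(actual, predicted):
    """Mean Absolute Residual of a prediction system."""
    return np.mean(np.abs(np.asarray(actual) - np.asarray(predicted)))

def random_guessing(actual, runs=1000):
    """MAR distribution of 'predictions' drawn at random from the other cases."""
    actual = np.asarray(actual, dtype=float)
    n = len(actual)
    mars = []
    for _ in range(runs):
        guess = np.array([rng.choice(np.delete(actual, i)) for i in range(n)])
        mars.append(mar(actual, guess))
    return np.mean(mars), np.std(mars)

def standardised_accuracy(actual, predicted, runs=1000):
    mar_p = mar(actual, predicted)
    mar_p0, sd_p0 = random_guessing(actual, runs)
    sa = (1 - mar_p / mar_p0) * 100    # % improvement over random guessing
    delta = (mar_p0 - mar_p) / sd_p0   # effect size relative to guessing
    return sa, delta

# Toy usage with made-up effort values (person-hours):
actual    = [120, 80, 300, 45, 210, 95]
predicted = [100, 90, 260, 60, 180, 110]
print(standardised_accuracy(actual, predicted))
```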
Integrate the GM(1,1) and Verhulst models to predict software stage effort
Software effort prediction clearly plays a crucial role in software project management. In keeping with more dynamic approaches to software development, it is not sufficient to predict only the whole-project effort at an early stage. Rather, the project manager must also dynamically predict the effort of different stages or activities during the software development process. This can assist the project manager to re-estimate effort and adjust the project plan, thus avoiding effort or schedule overruns. This paper presents a method for software physical-time stage-effort prediction based on the grey models GM(1,1) and Verhulst. The method establishes models dynamically according to particular types of stage-effort sequences, and can adapt to particular development methodologies automatically by using a novel grey feedback mechanism. We evaluate the proposed method with a large-scale real-world software engineering dataset, and compare it with the linear regression method and the Kalman filter method, revealing that accuracy is improved by at least 28% and 50%, respectively. The results indicate that the method can be effective and has considerable potential. We believe that stage predictions could be a useful complement to whole-project effort prediction methods. National Natural Science Foundation of China and the Hi-Tech Research and Development Program of China.
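A minimal GM(1,1) sketch to illustrate the grey-model idea used in the abstract above for stage-effort prediction. This is the textbook GM(1,1), not the paper's full method (which also uses a Verhulst model and a grey feedback mechanism to choose between models); the stage-effort values are placeholders.

```python
import numpy as np

def gm11_predict(x0, steps=1):
    """Fit GM(1,1) to sequence x0 and forecast `steps` further values."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                                  # accumulated generating operation (AGO)
    z1 = 0.5 * (x1[1:] + x1[:-1])                       # background values
    B = np.column_stack((-z1, np.ones(n - 1)))          # design matrix
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]    # grey development / control coefficients

    def x1_hat(k):                                      # k = 0-based index into the series
        return (x0[0] - b / a) * np.exp(-a * k) + b / a

    forecasts = [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]
    return np.array(forecasts)

# Toy usage: effort (person-days) of four completed stages, predict the next stage.
stage_effort = [32.0, 41.0, 48.0, 57.0]
print(gm11_predict(stage_effort, steps=1))
```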
A general software defect-proneness prediction framework
BACKGROUND - Predicting defect-prone software components is an economically important activity and so has received a good deal of attention. However, making sense of the many, and sometimes seemingly inconsistent, results is difficult. OBJECTIVE - We propose and evaluate a general framework for software defect prediction that supports 1) unbiased and 2) comprehensive comparison between competing prediction systems. METHOD - The framework comprises 1) scheme evaluation and 2) defect prediction components. The scheme evaluation analyzes the prediction performance of competing learning schemes for given historical data sets. The defect predictor builds models according to the evaluated learning scheme and predicts software defects in new data according to the constructed model. In order to demonstrate the performance of the proposed framework, we use both simulation and publicly available software defect data sets. RESULTS - The results show that we should choose different learning schemes for different data sets (i.e., no scheme dominates), that small details in how evaluations are conducted can completely reverse findings, and, last, that our proposed framework is more effective and less prone to bias than previous approaches. CONCLUSIONS - Failure to properly or fully evaluate a learning scheme can be misleading; however, these problems may be overcome by our proposed framework. National Natural Science Foundation of China.
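A hedged sketch of the two-component idea described above: (1) evaluate competing learning schemes on historical data with cross-validation, then (2) build a defect predictor with the winning scheme and apply it to new data. The candidate schemes, scoring metric and synthetic data here are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

def evaluate_schemes(schemes, X_hist, y_hist, folds=10):
    """Scheme evaluation: score every candidate scheme on the historical data."""
    cv = StratifiedKFold(n_splits=folds, shuffle=True, random_state=0)
    return {name: cross_val_score(model, X_hist, y_hist, cv=cv, scoring="roc_auc").mean()
            for name, model in schemes.items()}

def build_predictor(schemes, scores, X_hist, y_hist):
    """Defect prediction: refit the best-scoring scheme on all historical data."""
    best = max(scores, key=scores.get)
    return best, schemes[best].fit(X_hist, y_hist)

schemes = {
    "NB":   make_pipeline(StandardScaler(), GaussianNB()),
    "CART": DecisionTreeClassifier(random_state=0),
    "RF":   RandomForestClassifier(n_estimators=100, random_state=0),
}

# Toy historical data: rows = modules, columns = static code metrics, y = defective or not.
rng = np.random.default_rng(0)
X_hist, y_hist = rng.normal(size=(200, 5)), rng.integers(0, 2, 200)
scores = evaluate_schemes(schemes, X_hist, y_hist)
name, predictor = build_predictor(schemes, scores, X_hist, y_hist)
print(scores, "-> chosen scheme:", name)
```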
Enhancing Practice and Achievement in Introductory Programming With a Robot Olympics
Class movement and re-location: An empirical study of Java inheritance evolution
Inheritance is a fundamental feature of the Object-Oriented (OO) paradigm. It is used to promote extensibility and reuse in OO systems. Understanding how systems evolve, and specifically, trends in the movement and re-location of classes in OO hierarchies, can help us understand and predict future maintenance effort. In this paper, we explore how and where new classes were added as well as where existing classes were deleted or moved across inheritance hierarchies from multiple versions of four Java systems. We observed, first, that in one of the studied systems the same set of classes was continuously moved across the inheritance hierarchy. Second, in the same system, the most frequent changes were restricted to just one sub-part of the overall system. Third, a maximum of three levels may be a threshold when using inheritance in a system; beyond this level very little activity was observed, supporting earlier theories that, beyond three levels, complexity becomes overwhelming. We also found evidence of ‘collapsing’ hierarchies to bring classes up to shallower levels. Finally, we found that larger classes and highly coupled classes were more frequently moved than smaller and less coupled classes. Statistical evidence supported the view that larger classes and highly coupled classes were less cohesive than smaller and less coupled classes and were thus more suitable candidates for being moved (within a hierarchy).
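A minimal sketch of how class movement and re-location of the kind described above could be detected, assuming each version is again represented as a class-to-superclass map (an assumed representation, not the authors' tooling): report additions, deletions, and classes whose inheritance level changed between versions.

```python
def level(cls, supers):
    """1 = extends java.lang.Object directly; +1 per intermediate superclass."""
    d, cur = 0, cls
    while cur is not None and cur != "java.lang.Object":
        d += 1
        cur = supers.get(cur)
    return d

def hierarchy_changes(old, new):
    added   = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    moved   = sorted(c for c in set(old) & set(new) if level(c, old) != level(c, new))
    return {"added": added, "deleted": deleted, "moved": moved}

# Hypothetical two versions: class C is 'collapsed' from level 3 up to level 1.
v1 = {"A": None, "B": "A", "C": "B"}
v2 = {"A": None, "B": "A", "C": None, "D": "A"}
print(hierarchy_changes(v1, v2))  # {'added': ['D'], 'deleted': [], 'moved': ['C']}
```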
Software defect prediction: do different classifiers find the same defects?
During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar, with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes are captured in a confusion matrix and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction.
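An illustrative sketch of the comparison described above: train four classifiers and inspect which defective modules each one correctly flags (the true-positive sets), rather than only aggregate performance. The synthetic data and the use of a decision tree as a rough stand-in for RPart are assumptions for the sake of a runnable example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier   # rough stand-in for RPart
from sklearn.svm import SVC

rng = np.random.default_rng(42)
X = rng.normal(size=(400, 6))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=400) > 1).astype(int)  # 1 = defective
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0, stratify=y)

classifiers = {
    "RF":   RandomForestClassifier(n_estimators=100, random_state=0),
    "NB":   GaussianNB(),
    "Tree": DecisionTreeClassifier(random_state=0),
    "SVM":  SVC(kernel="rbf", gamma="scale"),
}

defective = {i for i, label in enumerate(y_te) if label == 1}
found = {}
for name, clf in classifiers.items():
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    found[name] = {i for i in defective if pred[i] == 1}   # true positives only

for name, tps in found.items():
    unique = tps - set.union(*(v for k, v in found.items() if k != name))
    print(f"{name}: finds {len(tps)} defects, {len(unique)} found by no other classifier")
print("found by all four:", len(set.intersection(*found.values())),
      "/ found by at least one:", len(set.union(*found.values())))
```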
A wide band gap metal-semiconductor-metal nanostructure made entirely from graphene
A blueprint for producing scalable digital graphene electronics has remained elusive. Current methods to produce semiconducting-metallic graphene networks all suffer from either stringent lithographic demands that prevent reproducibility, process-induced disorder in the graphene, or scalability issues. Using angle-resolved photoemission, we have discovered a unique one-dimensional metallic-semiconducting-metallic junction made entirely from graphene, and produced without chemical functionalization or finite-size patterning. The junction is produced by taking advantage of the inherent, atomically ordered, substrate-graphene interaction when graphene is grown on SiC, in this case when graphene is forced to grow over patterned SiC steps. This scalable bottom-up approach allows us to produce a semiconducting graphene strip whose width is precisely defined within a few graphene lattice constants, a level of precision entirely outside modern lithographic limits. The architecture demonstrated in this work is so robust that the average electronic band structure of thousands of these patterned ribbons shows little variation over length scales of tens of microns. The semiconducting graphene has a topologically defined few-nanometer-wide region with an energy gap greater than 0.5 eV in an otherwise continuous metallic graphene sheet. This work demonstrates how the graphene-substrate interaction can be used as a powerful tool to scalably modify graphene's electronic structure and opens a new direction in graphene electronics research.
Silicon intercalation into the graphene-SiC interface
In this work we use LEEM, XPEEM and XPS to study how the excess Si at the graphene-vacuum interface reorders itself at high temperatures. We show that silicon deposited at room temperature onto multilayer graphene films grown on the SiC(000-1) surface rapidly diffuses to the graphene-SiC interface when heated to temperatures above 1020. In a sequence of depositions, we have been able to intercalate ~6 ML of Si into the graphene-SiC interface.
An external replication on the effects of test-driven development using a multi-site blind analysis approach
Context: Test-driven development (TDD) is an agile practice claimed to improve the quality of a software product, as well as the productivity of its developers. A previous study (i.e., the baseline experiment) at the University of Oulu (Finland) compared TDD to a test-last development (TLD) approach through a randomized controlled trial. The results failed to support the claims. Goal: We want to validate the original study results by replicating it at the University of Basilicata (Italy), using a different design. Method: We replicated the baseline experiment, using a crossover design, with 21 graduate students. We kept the settings and context as close as possible to the baseline experiment. In order to limit researcher bias, we involved two other sites (UPM, Spain, and Brunel, UK) to conduct a blind analysis of the data. Results: The Kruskal-Wallis tests did not show any significant difference between TDD and TLD in terms of testing effort (p-value = .27), external code quality (p-value = .82), and developers' productivity (p-value = .83). Nevertheless, our data revealed a difference based on the order in which TDD and TLD were applied, though no carry-over effect. Conclusions: We verify the baseline study results, yet our results raise concerns regarding the selection of experimental objects, particularly with respect to their interaction with the order in which treatments are applied. We recommend that future studies survey the tasks used in experiments evaluating TDD. Finally, to lower the cost of replication studies and reduce researcher bias, we encourage other research groups to adopt the multi-site blind analysis approach described in this paper. This research is supported in part by the Academy of Finland Project 278354.
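A hedged sketch of the statistical comparison reported above: a Kruskal-Wallis test on one outcome measure (e.g. external code quality) for the TDD and TLD groups. The scores below are placeholders, not the experiment's data.

```python
from scipy import stats

# Per-participant scores for one outcome measure, grouped by treatment (illustrative values).
tdd = [62, 71, 55, 80, 68, 59, 74, 66, 70, 63]
tld = [60, 69, 58, 77, 72, 61, 65, 70, 64, 67]

h, p = stats.kruskal(tdd, tld)
print(f"Kruskal-Wallis H = {h:.2f}, p = {p:.2f}")
# A p-value above the chosen alpha (commonly .05) would, as in the study, give no grounds
# to claim a difference between TDD and TLD on this measure.
```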
- …
