Where Are My Intelligent Assistant's Mistakes? A Systematic Testing Approach
Intelligent assistants are handling increasingly critical tasks, but until now, end users have had no way to systematically assess where their assistants make mistakes. For some intelligent assistants, this is a serious problem: if the assistant is doing work that is important, such as assisting with qualitative research or monitoring an elderly parent's safety, the user may pay a high cost for unnoticed mistakes. This paper addresses the problem with WYSIWYT/ML (What You See Is What You Test for Machine Learning), a human/computer partnership that enables end users to systematically test intelligent assistants. Our empirical evaluation shows that WYSIWYT/ML helped end users find assistants' mistakes significantly more effectively than ad hoc testing. Not only did it allow users to assess an assistant's work on an average of 117 predictions in only 10 minutes, it also scaled to a much larger data set, assessing an assistant's work on 623 out of 1,448 predictions using only the users' original 10 minutes' testing effort.
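The paper itself includes no code, but the core idea of a human/computer testing partnership can be illustrated in miniature: spend the user's limited testing budget on the predictions the assistant is least sure about. Below is a minimal sketch, assuming a scikit-learn-style classifier and synthetic data; the budget of 117 mirrors the abstract's figure, but everything else is illustrative and not the WYSIWYT/ML implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for an assistant's classification task.
X = rng.normal(size=(1500, 20))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.5, size=1500) > 0).astype(int)

assistant = LogisticRegression().fit(X[:500], y[:500])

# Score the remaining predictions by confidence (max class probability);
# the least confident ones are routed to the user first.
probs = assistant.predict_proba(X[500:])
confidence = probs.max(axis=1)
review_order = np.argsort(confidence)   # lowest confidence first

budget = 117                            # roughly what users assessed in 10 minutes
to_review = review_order[:budget]
print(f"least confident prediction: {confidence[to_review[0]]:.2f}")
```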
Towards Automated Performance Bug Identification in Python
Context: Software performance is a critical non-functional requirement, appearing in many fields such as mission-critical applications, financial systems, and real-time systems. In this work we focused on early detection of performance bugs; our software under study was a real-time system used in the advertisement/marketing domain.
Goal: Find a simple, easy-to-implement solution for predicting performance bugs.
Method: We built several models using four machine learning methods commonly used for defect prediction: C4.5 decision trees, Naïve Bayes, Bayesian networks, and logistic regression.
Results: Our empirical results show that a C4.5 model, using lines of code changed, file age, and file size as explanatory variables, can be used to predict performance bugs (recall = 0.73, accuracy = 0.85, precision = 0.96). We show that reducing the number of changes delivered in a commit can decrease the chance of performance bug injection.
Conclusions: We believe that our approach can help practitioners eliminate performance bugs early in the development cycle. Our results are also of interest to theoreticians, establishing a link between functional bugs and (non-functional) performance bugs, and explicitly showing that attributes used for prediction of functional bugs can also be used for prediction of performance bugs.
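The modeling recipe above is easy to prototype. A minimal sketch, assuming scikit-learn and fabricated commit data: scikit-learn ships no C4.5 implementation, so a CART decision tree with the entropy criterion serves as a common stand-in, trained on the paper's three explanatory variables. The data, risk function, and resulting scores here are invented, not the paper's.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import recall_score, accuracy_score, precision_score

rng = np.random.default_rng(1)
n = 2000

# Fabricated commit-level features mirroring the paper's explanatory
# variables: lines of code changed, file age (days), file size (LOC).
loc_changed = rng.exponential(scale=80, size=n)
file_age = rng.exponential(scale=400, size=n)
file_size = rng.exponential(scale=1500, size=n)
X = np.column_stack([loc_changed, file_age, file_size])

# Fabricated label: bigger changes to bigger, older files are buggier.
risk = 0.004 * loc_changed + 0.0005 * file_age + 0.0004 * file_size
y = (rng.random(n) < 1 / (1 + np.exp(3 - risk))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# CART with entropy criterion as a stand-in for C4.5.
model = DecisionTreeClassifier(criterion="entropy", max_depth=5).fit(X_tr, y_tr)
pred = model.predict(X_te)
print(f"recall={recall_score(y_te, pred):.2f} "
      f"accuracy={accuracy_score(y_te, pred):.2f} "
      f"precision={precision_score(y_te, pred):.2f}")
```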
Untangling Fine-Grained Code Changes
After working for some time, developers commit their code changes to a version control system. When doing so, they often bundle unrelated changes (e.g., a bug fix and a refactoring) in a single commit, thus creating a so-called tangled commit. Sharing tangled commits is problematic because it makes review, reversion, and integration of these commits harder and historical analyses of the project less reliable. Researchers have worked on untangling existing commits, i.e., finding which part of a commit relates to which task. In this paper, we contribute to this line of work in two ways: (1) a publicly available dataset of untangled code changes, created with the help of two developers who accurately split their code changes into self-contained tasks over a period of four months; (2) a novel approach, EpiceaUntangler, to help developers share untangled commits (also known as atomic commits) by using fine-grained code change information. EpiceaUntangler is based on and tested with the publicly available dataset, and further evaluated by deploying it to 7 developers, who used it for 2 weeks. We recorded a median success rate of 91% and an average of 75% in automatically creating clusters of untangled fine-grained code changes.
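The abstract does not spell out EpiceaUntangler's algorithm, so the following is only a rough illustration of the general task: grouping fine-grained change events into task clusters. It is a sketch using agglomerative clustering over hand-picked, hypothetical features (timestamp, file, touched entity); the real approach works on Epicea's fine-grained change model.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Each row is one fine-grained change event with illustrative features:
# (minutes since session start, file id, id of the entity touched).
changes = np.array([
    [1,  0, 10], [2,  0, 10], [3,  0, 11],   # early edits in file 0
    [40, 1, 20], [42, 1, 21], [44, 1, 20],   # later edits in file 1
    [90, 0, 10],                             # a late fix back in file 0
], dtype=float)

# Hypothetical weighting: favor temporal proximity and file locality.
scaled = changes / np.array([10.0, 1.0, 5.0])
labels = AgglomerativeClustering(n_clusters=3).fit_predict(scaled)
for event, label in zip(changes.astype(int).tolist(), labels):
    print(event, "-> task", label)
```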
Combining hardware and software instrumentation to classify program executions
Several research efforts have studied ways to infer properties of software systems from program spectra gathered from the running systems, usually with software-level instrumentation. While these efforts appear to produce accurate classifications, a detailed understanding of their costs and potential cost-benefit tradeoffs is lacking. In this work we present a hybrid instrumentation approach which uses hardware performance counters to gather program spectra at very low cost. This underlying data is further augmented with data captured by minimal amounts of software-level instrumentation. We also evaluate this hybrid approach by comparing it to other existing approaches. We conclude that these hybrid spectra can reliably distinguish failed executions from successful executions at a fraction of the runtime overhead cost of using software-based execution data.
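The classification half of this pipeline (as opposed to the counter collection itself) can be sketched in a few lines. Assuming per-execution vectors of hardware counter readings such as one might obtain from `perf stat`, plus one cheap software-level signal, a standard classifier separates failing from passing runs. All counters, data, and the failure rule below are fabricated.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 600

# Fabricated per-execution spectra: hardware counter readings
# (instructions retired, branch misses, cache misses) plus one
# software-level probe count from minimal instrumentation.
instructions = rng.normal(1e9, 1e8, size=n)
branch_miss = rng.normal(5e6, 1e6, size=n)
cache_miss = rng.normal(2e6, 5e5, size=n)
sw_probe_hits = rng.poisson(20, size=n).astype(float)

# Fabricated ground truth: failures correlate with anomalous miss rates.
y = (branch_miss + 2 * cache_miss + rng.normal(0, 2e6, size=n) > 1.05e7).astype(int)

X = np.column_stack([instructions, branch_miss, cache_miss, sw_probe_hits])
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")
```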
Predictive Analytics and Software Defect Severity: A Systematic Review and Future Directions
Software testing identifies defects in software products; these defects have varying multiplying effects depending on their severity levels and call for prompt rectification, which explains the volume of research on the topic in the software engineering domain. In this paper, a systematic literature review (SLR) of machine learning-based software defect severity prediction over the last decade was conducted. The SLR aimed to detect germane areas central to efficient predictive analytics that are seldom captured in existing software defect severity prediction reviews. These areas include the analysis of techniques or approaches that significantly influence the threats to validity of proposed models, and bias-variance tradeoff considerations in data science-based approaches. A population, intervention, and outcome model was adopted to formulate search terms during the literature selection process, and subsequent quality-assurance scrutiny yielded fifty-two primary studies. A thorough systematic review of the selected studies was then conducted to answer eleven main research questions, uncovering approaches that address the aforementioned areas of interest. The results indicate that while the machine learning approach is ubiquitous for predicting software defect severity, germane techniques central to better predictive analytics are infrequent in the literature. The study concludes by summarizing prominent trends in a mind map to stimulate future research in the software engineering industry.
Is One Hyperparameter Optimizer Enough?
Hyperparameter tuning is the black art of automatically finding a good combination of control parameters for a data miner. While widely applied in empirical software engineering, there has not been much discussion on which hyperparameter tuner is best for software analytics. To address this gap in the literature, this paper applied a range of hyperparameter optimizers (grid search, random search, differential evolution, and Bayesian optimization) to the defect prediction problem. Surprisingly, no hyperparameter optimizer was observed to be "best" and, for one of the two evaluation measures studied here (F-measure), hyperparameter optimization was no better than using default configurations in 50% of cases. We conclude that hyperparameter optimization is more nuanced than previously believed. While such optimization can certainly lead to large improvements in the performance of classifiers used in software analytics, it remains to be seen which specific optimizers should be applied to a new dataset.
Software Verification and Graph Similarity for Automated Evaluation of Students' Assignments
In this paper we promote introducing software verification and control flow graph similarity measurement into the automated evaluation of students' programs. We present a new grading framework that merges results obtained by a combination of these two approaches with results obtained by automated testing, leading to improved quality and precision of automated grading. These two approaches are also useful in providing comprehensible feedback that can help students improve the quality of their programs. We also present our corresponding tools, which are publicly available and open source. The tools are based on the LLVM low-level intermediate code representation, so they can be applied to a number of programming languages. Experimental evaluation of the proposed grading framework was performed on a corpus of university students' programs written in the programming language C. Results of the experiments show that automatically generated grades are highly correlated with manually determined grades, suggesting that the presented tools can find real-world applications in studying and grading.
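The abstract leaves the similarity measure unspecified, so the following is only an illustrative sketch of what CFG similarity scoring can look like, using graph edit distance from networkx on two toy graphs; the paper's tools operate on LLVM IR and presumably define their own measure.

```python
import networkx as nx

# Two toy control flow graphs for small student submissions:
# nodes are basic blocks, edges are possible transfers of control.
g1 = nx.DiGraph([("entry", "cond"), ("cond", "then"), ("cond", "else"),
                 ("then", "exit"), ("else", "exit")])   # if/else shape
g2 = nx.DiGraph([("entry", "cond"), ("cond", "body"), ("body", "cond"),
                 ("cond", "exit")])                     # loop shape

# Graph edit distance, normalized into a [0, 1] similarity score.
ged = nx.graph_edit_distance(g1, g2)
max_size = max(g1.number_of_nodes() + g1.number_of_edges(),
               g2.number_of_nodes() + g2.number_of_edges())
print(f"similarity: {1 - ged / max_size:.2f}")
```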