49 research outputs found

An automated approach to program repair with semantic code search

Author: Ke Yalin
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2015

Every year software companies dedicate numerous developer hours to debugging and fixing defects. Automated program repair has the potential to greatly decrease the costs of debugging. Existing automated repair techniques, such as GenProg, TSPRepair, and AE, show great promise but are not able to repair all bugs. We propose a new automated program repair technique, SearchRepair, which complements these techniques. We take advantage of existing open source code to find potential fixes, based on the assumption that open source projects contain correct implementations for some defects. The key challenges lie in efficiently finding code semantically similar (but not identical) to defective code and then appropriately integrating that code into the buggy program. SearchRepair addresses these challenges by (1) encoding a large database of human-written code fragments as SMT constraints on input-output behavior, (2) localizing a given defect to likely-buggy program fragments, (3) dynamically analyzing those buggy fragments to derive input-output pairs that describe likely buggy behavior and that can be encoded as SMT constraints, (4) using state-of-the-art constraint solvers to find fragments in the code database that satisfy those constraints, and (5) validating patches that repair the bug against program test suites. We evaluate SearchRepair on the IntroClass program repair benchmark, which provides 998 buggy programs written by novice students, two test suites for each program, and repair results for the existing program repair techniques GenProg, TSPRepair, and AE. The two test suites, one written by a human and the other automatically generated, are used to determine whether a program is buggy and to evaluate the quality of a repair. We refer to the human-written suite as the instructor test suite and to the automatically generated suite as the KLEE test suite. We consider a program a potentially fixable defect if it fails at least one test case and passes at least one test case in a test suite. Note that extracting input-output behaviors for the semantic code search requires at least one passing test case, so some buggy programs are excluded from our evaluation. There are 778 defects in IntroClass based on the instructor test suite and 845 defects based on the KLEE test suite. When using the instructor test suite, SearchRepair successfully repairs 150 of the 778 defects, GenProg fixes 287, TSPRepair fixes 247, and AE fixes 159. In total, these four techniques fix 310 defects using the instructor test suite, and 20 of the 310 can be fixed only by SearchRepair. When using the KLEE test suite, 58 of the 339 unique defects fixed by the four techniques can be fixed only by SearchRepair. These results suggest that SearchRepair is complementary to existing program repair techniques.
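The search-and-validate idea in steps (3) to (5) can be illustrated with a minimal sketch. It is not the SearchRepair implementation: the abstract describes encoding fragments as SMT constraints and querying a constraint solver, whereas this sketch simply executes each candidate on the observed inputs. The fragment database `FRAGMENTS`, the function `search`, and the median-of-three example are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

# An input-output pair: the arguments of a passing run and the expected result.
IOPair = Tuple[Tuple[int, ...], int]

# Hypothetical database of human-written fragments (assumption, not from the paper).
FRAGMENTS: Dict[str, Callable[..., int]] = {
    "max3": lambda a, b, c: max(a, b, c),
    "min3": lambda a, b, c: min(a, b, c),
    "median3": lambda a, b, c: sorted((a, b, c))[1],
}


def search(io_pairs: List[IOPair]) -> List[str]:
    """Return the names of fragments that agree with every input-output pair.

    Direct execution stands in for the SMT-based matching described above.
    """
    return [
        name
        for name, fragment in FRAGMENTS.items()
        if all(fragment(*args) == expected for args, expected in io_pairs)
    ]


# One uninformative pair matches every fragment, so candidates must still be
# validated against the full test suite (step 5).
weak_profile = [((2, 2, 2), 2)]
# Two informative pairs narrow the search to the intended behavior.
strong_profile = [((1, 2, 3), 2), ((5, 1, 3), 3)]

print(search(weak_profile))    # ['max3', 'min3', 'median3']
print(search(strong_profile))  # ['median3']
```

The weak profile shows why at least one passing test case is required and why more input-output pairs give a more selective search.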

Digital Repository @ Iowa State University (ISU)

Automated Vessel Segmentation Using Infinite Perimeter Active Contour Model with Hybrid Region Information with Application to Retinal Images

Author: Chen Ke
Harding Simon
Rada Lavdie
Zhao Yitian
Zheng Yalin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/03/2015

Automated detection of blood vessel structures is becoming of crucial interest for better management of vascular disease. In this paper, we propose a new infinite active contour model that uses hybrid region information of the image to approach this problem. More specifically, an infinite perimeter regularizer, provided by using the L² Lebesgue measure of the γ-neighborhood of boundaries, allows for better detection of small oscillatory (branching) structures than traditional models based on the length of a feature's boundaries (i.e., the H¹ Hausdorff measure). Moreover, for better general segmentation performance, the proposed model takes advantage of different types of region information, such as the combination of intensity information and a local phase based enhancement map. The local phase based enhancement map is used for its superiority in preserving vessel edges, while the given image intensity information guarantees correct segmentation of the feature. We evaluate the performance of the proposed model by applying it to three public retinal image datasets (two datasets of color fundus photography and one fluorescein angiography dataset). The proposed model outperforms its competitors when compared with other widely used unsupervised and supervised methods. For example, the sensitivity (0.742), specificity (0.982) and accuracy (0.954) achieved on the DRIVE dataset are very close to those of the second observer's annotations.
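For readers unfamiliar with the three reported metrics, the sketch below shows how sensitivity, specificity, and accuracy are conventionally computed from a binary vessel mask and a reference annotation. It assumes masks flattened to sequences of 0 (background) and 1 (vessel), and that both classes occur in the reference; the function name `segmentation_metrics` and the toy masks are illustrative and do not come from the paper.

```python
from typing import Dict, Sequence


def segmentation_metrics(predicted: Sequence[int], reference: Sequence[int]) -> Dict[str, float]:
    """Compute pixel-wise sensitivity, specificity, and accuracy for binary masks."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # vessel found
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # background kept
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # false vessel
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # missed vessel
    return {
        "sensitivity": tp / (tp + fn),        # share of vessel pixels detected
        "specificity": tn / (tn + fp),        # share of background pixels kept
        "accuracy": (tp + tn) / len(pairs),   # share of all pixels correct
    }


# Toy 8-pixel example (not real data).
predicted = [1, 1, 0, 0, 0, 1, 0, 0]
reference = [1, 0, 0, 0, 0, 1, 1, 0]
print(segmentation_metrics(predicted, reference))
```

Because vessel pixels are a small minority in retinal images, accuracy and specificity tend to be high even for weak detectors, which is why sensitivity is reported alongside them.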

University of Liverpool Repository

Crossref

University of Strathclyde Institutional Repository

A New Method of Blind Deconvolution for Colour Fundus Retinal Images

Author: Chen Ke
Harding Simon P.
Williams Bryan M.
Zheng Yalin
Publication venue: 'The University of Iowa'
Publication date: 15/09/2015

Fundus retinal imaging is widely used in the diagnosis and management of eye disease. Blur commonly occurs during acquisition and, when it is severe, the resulting loss of resolution hampers accurate clinical assessment. In this paper, we present a new technique to address this challenging problem. We make use of implicitly constrained image deblurring, which is known to provide improved results over unconstrained and explicitly constrained methods, and build this into a multi-channel variational framework for parametric deblurring. We propose a new method for automatically selecting the regularisation parameter in the absence of the true (sharp) image using vessel segmentation. We then modify the model to include a regularisation coefficient function which is dependent on an available image mask in order to avoid potential inaccuracies caused by the addition of artificial masks. We present experimental results to demonstrate the effectiveness of our new method.

University of Liverpool Repository

Ophthalmic Medical Image Analysis International Workshop

Crossref

Biodegradable mixed MPEG-SS-2SA/TPGS micelles for triggered intracellular release of paclitaxel and reversing multidrug resistance

Author: Jianfeng Xing
Kai Dong
Ke Wang
Lu Zhang
Pengchong Wang
Xianpeng Shi
Yalin Dong
Yan Yan
Publication venue: 'Dove Medical Press Ltd.'
Publication date

Crossref

A Study of the Learnability of Relational Properties: Model Counting Meets Machine Learning (MCML)

Author: Baluta Teodora
Blumer Anselm
Chavira Mark
Cormen Thomas H.
Demsky Brian
Fierens Daan
Galeotti J. P.
García Salvador
Gopinath Divya
Heule Marijn J. H.
Håstad Johan
Iman Ronald L.
Jackson Daniel
Katz G.
Ke Yalin
Khurshid Sarfraz
Kim Moonzoo
Korel B.
Narodytska Nina
Samimi Hesam
Shalev-Shwartz Shai
Soos Mate
Spivey J. M.
Trippel Caroline
Vapnik V. N.
Vasic Marko
Wickerson John
Zave P.
Zave Pamela
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/09/2020

This paper introduces the MCML approach for empirically studying the learnability of relational properties that can be expressed in the well-known software design language Alloy. A key novelty of MCML is quantification of the performance of and semantic differences among trained machine learning (ML) models, specifically decision trees, with respect to entire (bounded) input spaces, and not just for given training and test datasets (as is the common practice). MCML reduces the quantification problems to the classic complexity theory problem of model counting, and employs state-of-the-art model counters. The results show that relatively simple ML models can achieve surprisingly high performance (accuracy and F1-score) when evaluated in the common setting of using training and test datasets, even when the training dataset is much smaller than the test dataset, indicating the seeming simplicity of learning relational properties. However, MCML metrics based on model counting show that the performance can degrade substantially when tested against the entire (bounded) input space, indicating the high complexity of precisely learning these properties, and the usefulness of model counting in quantifying the true performance.
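The gap between test-set metrics and whole-space metrics can be seen in a small sketch. It makes several simplifying assumptions that are not from the paper: brute-force enumeration replaces a model counter, a hand-written rule (`shallow_model`) stands in for a trained decision tree, and the property is reflexivity of a binary relation over a scope of 3 elements, encoded as a 3x3 bit matrix. All names are illustrative.

```python
from itertools import product

N = 3  # scope: relations over 3 elements, so 2**(N*N) = 512 possible inputs


def is_reflexive(rel):
    """Ground truth: every element is related to itself."""
    return all(rel[i][i] for i in range(N))


def shallow_model(rel):
    """Stand-in for a learned tree that only inspects two diagonal entries."""
    return bool(rel[0][0] and rel[1][1])


def all_relations():
    """Enumerate the entire bounded input space as N x N bit matrices."""
    for bits in product((0, 1), repeat=N * N):
        yield [list(bits[i * N:(i + 1) * N]) for i in range(N)]


def confusion_counts(model, prop, inputs):
    """Count true/false positives and negatives of model against prop."""
    tp = fp = fn = tn = 0
    for rel in inputs:
        predicted, actual = model(rel), prop(rel)
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_and_f1(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f1


# A small hand-picked test set on which the model looks perfect.
test_set = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],  # identity relation: reflexive
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # empty relation: not reflexive
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],  # full relation: reflexive
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],  # single self-loop: not reflexive
]

print(accuracy_and_f1(*confusion_counts(shallow_model, is_reflexive, test_set)))
# (1.0, 1.0)
print(accuracy_and_f1(*confusion_counts(shallow_model, is_reflexive, all_relations())))
# (0.875, 0.666...)
```

Enumeration is feasible only for tiny scopes, since the input space grows as 2^(N·N); that growth is the reason a counting-based approach relies on dedicated model counters rather than exhaustive evaluation.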

arXiv.org e-Print Archive

Crossref