32 research outputs found
Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions
Large language models (LLMs), such as OpenAI's Codex, have demonstrated their
potential to generate code from natural language descriptions across a wide
range of programming tasks. Several benchmarks have recently emerged to
evaluate the ability of LLMs to generate functionally correct code from natural
language intent with respect to a set of hidden test cases. This has enabled
the research community to identify significant and reproducible advancements in
LLM capabilities. However, there is currently a lack of benchmark datasets for
assessing the ability of LLMs to generate functionally correct code edits based
on natural language descriptions of intended changes. This paper aims to
address this gap by motivating the problem NL2Fix of translating natural
language descriptions of code changes (namely bug fixes described in Issue
reports in repositories) into correct code fixes. To this end, we introduce
Defects4J-NL2Fix, a dataset of 283 Java programs from the popular Defects4J
dataset augmented with high-level descriptions of bug fixes, and empirically
evaluate the performance of several state-of-the-art LLMs on this task.
Results show that these LLMs collectively are capable of generating plausible fixes
for 64.6% of the bugs, and the best LLM-based technique achieves up to
21.20% top-1 and 35.68% top-5 accuracy on this benchmark.
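As a rough illustration of the evaluation protocol described above, the sketch below computes top-k plausible-fix accuracy for an NL2Fix-style benchmark: sample k candidate patches per bug from an LLM and count a bug as solved if any of them passes the hidden test suite. This is a minimal sketch, not the paper's harness; the llm_propose_fix and passes_tests callables, and all other names, are hypothetical placeholders.

# Minimal sketch (not the paper's harness) of top-k plausible-fix accuracy:
# sample k candidate patches per bug and count the bug as solved if any
# candidate passes the hidden tests. All function names are hypothetical.

from typing import Callable, List

def top_k_accuracy(
    bugs: List[dict],                            # each: {"buggy_code": str, "issue_text": str, ...}
    llm_propose_fix: Callable[[str, str], str],  # (buggy_code, issue_text) -> candidate patch
    passes_tests: Callable[[dict, str], bool],   # (bug, patched_code) -> True if hidden tests pass
    k: int = 5,
) -> float:
    """Fraction of bugs for which at least one of the first k candidates is plausible."""
    solved = 0
    for bug in bugs:
        candidates = [llm_propose_fix(bug["buggy_code"], bug["issue_text"]) for _ in range(k)]
        if any(passes_tests(bug, patch) for patch in candidates):
            solved += 1
    return solved / len(bugs) if bugs else 0.0

# Usage: top1 = top_k_accuracy(bugs, propose, passes_tests, k=1)
#        top5 = top_k_accuracy(bugs, propose, passes_tests, k=5)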
Improving Polk County Service Integration Team's Resource Sharing
Background: Polk County Service Integration (SI) collaborates with community partners to provide resources and information for individuals and families within the community. This collaboration includes a monthly newsletter to promote community resources, services, and events. Aim: The aim was to create a standardized submission tool for newsletter contributors to improve communication and promote resource utilization by community members. Methodology: This process improvement was structured using the Plan Do Study Act (PDSA) model. The PDSA model allowed for reassessment of project needs, and multiple cycles were completed to develop a comprehensive evaluation and recommendation for the SI newsletter process. One assessment completed was a survey of SI partners. Results: The survey data focused on the partners' participation in submitting information to the SI newsletter. It revealed an overarching theme that partners do not feel they have relevant information to contribute; 68.3% of respondents expressed this view. Discussion: Based on the results, we recommend implementation of the standardized submission tool. Evaluation of the results found that users had difficulty with the submission process as a whole. With the addition of the submission tool, these problems will be mitigated via guided questioning that will spark contribution ideas from the partners. To evaluate the continued effectiveness of the submission tool, the participation of partners will be monitored. Implications: Implementation of the submission tool will begin in January 2021. The implications are to ease the submission process for the SI coordinator and improve the utilization of resources.
Program Merge Conflict Resolution via Neural Transformers
Collaborative software development is an integral part of the modern software
development life cycle, essential to the success of large-scale software
projects. When multiple developers make concurrent changes around the same
lines of code, a merge conflict may occur. Such conflicts stall pull requests
and continuous integration pipelines for hours to several days, seriously
hurting developer productivity. To address this problem, we introduce
MergeBERT, a novel neural program merge framework based on token-level
three-way differencing and a transformer encoder model. By exploiting the
restricted nature of merge conflict resolutions, we reformulate the task of
generating the resolution sequence as a classification task over a set of
primitive merge patterns extracted from real-world merge commit data. Our model
achieves 63-68% accuracy for merge resolution synthesis, yielding nearly a 3x
performance improvement over existing semi-structured merge tools and a 2x improvement over
neural program merge tools. Finally, we demonstrate that MergeBERT is
sufficiently flexible to work with source code files in Java, JavaScript,
TypeScript, and C# programming languages. To measure the practical use of
MergeBERT, we conduct a user study to evaluate MergeBERT suggestions with 25
developers from large OSS projects on 122 real-world conflicts they
encountered. Results suggest that in practice, MergeBERT resolutions would be
accepted at a higher rate than estimated by automatic metrics for precision and
accuracy. Additionally, we use participant feedback to identify future avenues
for improvement of MergeBERT. Comment: ESEC/FSE '22 camera-ready version. 12 pages, 4 figures, online appendix.
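The classification framing described in this abstract can be sketched as follows: rather than generating the merged text token by token, predict one of a small set of primitive resolution patterns for each conflict and materialize the resolution from that pattern. This is an illustrative sketch under assumed names, not MergeBERT itself; the pattern set, the Conflict type, and the classify callable are all placeholders.

# Hedged sketch of the classification framing (not MergeBERT itself): predict a
# primitive merge-resolution pattern for a conflict and apply it, instead of
# generating the merged text directly. Pattern set and classifier are illustrative.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Conflict:
    base: str    # common ancestor region
    ours: str    # our side of the conflict
    theirs: str  # their side of the conflict

# A few primitive patterns of the kind extracted from real-world merge commits.
PATTERNS = {
    "TAKE_OURS":        lambda c: c.ours,
    "TAKE_THEIRS":      lambda c: c.theirs,
    "OURS_THEN_THEIRS": lambda c: c.ours + "\n" + c.theirs,
    "THEIRS_THEN_OURS": lambda c: c.theirs + "\n" + c.ours,
    "TAKE_BASE":        lambda c: c.base,
}

def resolve(conflict: Conflict, classify: Callable[[Conflict], str]) -> str:
    """Resolve a conflict by predicting a primitive pattern and applying it."""
    label = classify(conflict)  # e.g. an encoder model over a token-level three-way diff
    return PATTERNS[label](conflict)

# Usage with a trivial stand-in classifier that always keeps our side:
merged = resolve(Conflict(base="x = 1", ours="x = 2", theirs="x = 3"),
                 classify=lambda c: "TAKE_OURS")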
Ranking LLM-Generated Loop Invariants for Program Verification
Synthesizing inductive loop invariants is fundamental to automating program
verification. In this work, we observe that Large Language Models (such as
gpt-3.5 or gpt-4) are capable of synthesizing loop invariants for a class of
programs in a 0-shot setting, yet require several samples to generate the
correct invariants. This can lead to a large number of calls to a program
verifier to establish an invariant. To address this issue, we propose a
re-ranking approach for the generated results of LLMs. We have designed a
ranker that can distinguish between correct inductive invariants and incorrect
attempts based on the problem definition. The ranker is optimized as a
contrastive ranker. Experimental results demonstrate that this re-ranking
mechanism significantly improves the ranking of correct invariants among the
generated candidates, leading to a notable reduction in the number of calls to
a verifier. Comment: Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP Findings 2023).
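A minimal sketch of the re-ranking idea, assuming a scoring function stands in for the contrastive ranker: score each generated invariant, call the verifier in descending score order, and stop at the first candidate that verifies, so fewer verifier calls are needed whenever the ranker places correct invariants near the top. The score and verifier callables below are hypothetical placeholders, not the paper's implementation.

# Minimal sketch of re-ranking LLM-generated invariants before verification.
# The scorer and verifier are hypothetical stand-ins for the contrastive ranker
# and the program verifier described in the abstract.

from typing import Callable, List, Optional, Tuple

def verify_ranked(
    problem: str,
    candidates: List[str],
    score: Callable[[str, str], float],    # (problem, invariant) -> ranker score
    verifier: Callable[[str, str], bool],  # (problem, invariant) -> True if inductive
) -> Tuple[Optional[str], int]:
    """Return the first verified invariant and the number of verifier calls used."""
    ranked = sorted(candidates, key=lambda inv: score(problem, inv), reverse=True)
    for calls, inv in enumerate(ranked, start=1):
        if verifier(problem, inv):
            return inv, calls
    return None, len(ranked)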
Combined effect of age and body mass index on postoperative mortality and morbidity in laparoscopic cholecystectomy patients
Background: Previous studies have assessed the impact of age and body mass index (BMI) on surgery outcomes separately. This retrospective cohort study aimed to investigate the combined effect of age and BMI on postoperative mortality and morbidity in patients undergoing laparoscopic cholecystectomy. Methods: Data from the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) database for laparoscopic cholecystectomy patients between 2008 and 2020 were analyzed. Patient demographics, functional status, admission sources, preoperative risk factors, laboratory data, perioperative variables, and 30-day postoperative outcomes were included in the dataset. Logistic regression was used to determine the association of age, BMI, and age/BMI with mortality and morbidity. Patients were stratified into subcategories based on their age and BMI, and the age/BMI score was calculated. The chi-square test, independent-sample t-test, and ANOVA were used as appropriate for each category. Results: The study included 435,052 laparoscopic cholecystectomy patients. Logistic regression analysis revealed that a higher age/BMI score was associated with an increased risk of mortality (adj OR 13.13, 95% CI 9.19–18.77, p < 0.0001) and composite morbidity (adj OR 2.57, 95% CI 2.23–2.95, p < 0.0001). Conclusion: Older age, especially accompanied by a low BMI, appears to increase the postoperative mortality and morbidity risks in laparoscopic cholecystectomy patients, while paradoxically, a higher BMI seems to be protective. Our hypothesis is that a lower BMI, perhaps secondary to malnutrition, can carry a greater risk of surgical complications for the elderly. The age/BMI score is strongly and positively associated with mortality and morbidity and could be used as a new scoring system for predicting outcomes in patients undergoing surgery. Nevertheless, laparoscopic cholecystectomy remains a very safe procedure with relatively low complication rates.
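For readers unfamiliar with how adjusted odds ratios of this kind are typically derived, the sketch below fits a logistic regression with statsmodels and exponentiates the coefficients. It is only an illustration, not the study's actual model or covariate set; the column names (mortality, age_bmi_score, sex, asa_class) are hypothetical.

# Illustrative sketch: adjusted odds ratios via logistic regression.
# Column names are hypothetical; this is not the study's actual model.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def adjusted_odds_ratios(df: pd.DataFrame) -> pd.DataFrame:
    """Fit a logistic model and return exponentiated coefficients with 95% CIs."""
    model = smf.logit("mortality ~ age_bmi_score + C(sex) + C(asa_class)", data=df).fit(disp=False)
    or_table = np.exp(model.conf_int())        # CI bounds on the odds-ratio scale
    or_table["adj_OR"] = np.exp(model.params)  # point estimates
    or_table.columns = ["ci_lower", "ci_upper", "adj_OR"]
    return or_table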
Models, Metrics, and Minds: Empirical Perspectives on Developer Productivity
Programming tasks carry inherent cognitive load, but the design of the tools and languages a programmer uses to complete a task can either increase that mental burden or help manage it. To build software that better supports all those who interact with it, we must develop the processes and frameworks needed to understand the impact that software has on its users and to account for that impact when designing the next generation of languages and tools. Understanding the complexities of comprehension processes during software development requires diverse research strategies that bring together fundamentals of human cognition, from domains such as Psychology and Cognitive Neuroscience, with empirical methods used in Software Engineering research. In this dissertation we contribute novel methods and perspectives to the domain of program comprehension and software developer productivity, including: 1) a novel perspective and tools for studying cognitive processes during computing activities, 2) a better understanding of how software quality factors impact mental effort and productivity during bug localization, and 3) opportunities to improve metrics that serve as proxies for user evaluation of models for code summarization, readability, and merge tasks. We discuss how user-centered development and evaluation processes can help to develop theories that better inform and align tools designed to improve developer productivity in practice.
Association between angiotensin-converting enzyme insertion/deletion gene polymorphism and end-stage renal disease in Lebanese patients with diabetic nephropathy
Diabetic nephropathy (DN) is one of the leading causes of end-stage renal disease (ESRD). The development and progression of nephropathy are strongly determined by genetic factors, and a few genes have been shown to contribute to DN. An insertion/deletion (I/D) polymorphism of the gene encoding angiotensin-converting enzyme (ACE) has been reported as a candidate gene predisposing to DN and ESRD. Accordingly, we investigated the frequency of the ACE I/D polymorphism in 50 patients with DN, of whom 33 had ESRD, and compared them with 64 patients with type 2 diabetes mellitus (T2DM) but with normal renal function. Polymerase chain reaction amplification, using specific primers, was performed to genotype ACE I/D. The chi-square test was used to assess the differences between the groups. The frequencies of the ACE genotypes were as follows: 48% D/D, 40% I/D, and 12% I/I in patients with DN, in contrast to 32.8% D/D, 45.3% I/D, and 21.9% I/I in T2DM. The distribution of the D/D, D/I, and I/I genotypes did not differ significantly between T2DM and DN. However, carrying the D allele conferred a risk for the development of DN [odds ratio (OR), 1.71, P = 0.054]. On the other hand, the distribution of the D/D, D/I, and I/I genotypes was significantly different between T2DM and ESRD patients, χ2 = 7.23, P = 0.027. This was reflected by the D allele, which carried a risk for the development of ESRD (OR, 2.51, P = 0.0057). These findings suggest that the D allele may be considered a risk factor for both the development of DN and the progression of DN to ESRD in the Lebanese population with T2DM.
The Effect of Poor Source Code Lexicon and Readability on Developers' Cognitive Load
It has been well documented that a large portion of the cost of any software lies in the time developers spend understanding a program's source code before any changes can be undertaken. One of the main contributors to software comprehension, by subsequent developers or by the authors themselves, is the quality of the lexicon (i.e., the identifiers and comments) that developers use to embed domain concepts and to communicate with their teammates. In fact, previous research shows that there is a positive correlation between the quality of identifiers and the quality of a software project. Results suggest that a poor-quality lexicon impairs program comprehension and consequently increases the effort that developers must spend to maintain the software. However, we do not yet have empirical evidence of the relationship between the quality of the lexicon and the cognitive load that developers experience when trying to understand a piece of software. Given the associated costs, there is a critical need to empirically characterize the impact of the quality of the lexicon on developers' ability to comprehend a program. In this study, we explore the effect of poor source code lexicon and readability on developers' cognitive load as measured by a cutting-edge and minimally invasive functional brain imaging technique called functional near-infrared spectroscopy (fNIRS). Additionally, while developers perform software comprehension tasks, we map cognitive load data to source code identifiers using an eye-tracking device. Our results show that the presence of linguistic antipatterns in source code significantly increases developers' cognitive load.