24 research outputs found

    Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions

    Large language models (LLMs), such as OpenAI's Codex, have demonstrated their potential to generate code from natural language descriptions across a wide range of programming tasks. Several benchmarks have recently emerged to evaluate the ability of LLMs to generate functionally correct code from natural language intent with respect to a set of hidden test cases. This has enabled the research community to identify significant and reproducible advancements in LLM capabilities. However, there is currently a lack of benchmark datasets for assessing the ability of LLMs to generate functionally correct code edits based on natural language descriptions of intended changes. This paper aims to address this gap by motivating the problem NL2Fix of translating natural language descriptions of code changes (namely bug fixes described in Issue reports in repositories) into correct code fixes. To this end, we introduce Defects4J-NL2Fix, a dataset of 283 Java programs from the popular Defects4J dataset augmented with high-level descriptions of bug fixes, and empirically evaluate the performance of several state-of-the-art LLMs for this task. Results show that these LLMs collectively are capable of generating plausible fixes for 64.6% of the bugs, and the best LLM-based technique can achieve up to 21.20% top-1 and 35.68% top-5 accuracy on this benchmark.
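
    The headline numbers here are top-k "plausible fix" rates: the fraction of bugs for which at least one of the first k generated patches passes the hidden test suite. Below is a minimal sketch of that metric, assuming hypothetical `generate_fixes` (an LLM call returning ranked patches) and `passes_tests` (a Defects4J-style test harness) callables; neither is an interface from the paper.

```python
from typing import Callable, List

def top_k_plausible(bugs: List[str],
                    generate_fixes: Callable[[str, int], List[str]],
                    passes_tests: Callable[[str, str], bool],
                    k: int = 5) -> float:
    """Fraction of bugs with at least one plausible patch among the top-k samples."""
    solved = 0
    for bug in bugs:
        candidates = generate_fixes(bug, k)  # k ranked candidate patches per bug
        if any(passes_tests(bug, patch) for patch in candidates):
            solved += 1
    return solved / len(bugs) if bugs else 0.0
```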

    Improving Polk County Service Integration Team's Resource Sharing

    Background: Polk County Service Integration (SI) collaborates with community partners to provide resources and information for individuals and families within the community. This collaboration includes a monthly newsletter to promote community resources, services, and events. Aim: The aim was to create a standardized submission tool for newsletter contributors to improve communication and promote resource utilization by community members. Methodology: This process improvement was structured using the Plan-Do-Study-Act (PDSA) model. The PDSA model allowed for reassessment of project needs, and multiple cycles were completed to develop a comprehensive evaluation of and recommendation for the SI newsletter process. One assessment completed was a survey of SI partners. Results: The survey data focused on the partners' participation in submitting information to the SI newsletter. It revealed an overarching theme that partners do not feel they have relevant information to contribute; this view was shared by a majority of respondents (68.3%). Discussion: Based on the results, we recommend implementation of the standardized submission tool. Evaluation of the results found that users had difficulty with the submission process as a whole. The submission tool will mitigate these problems via guided questions that prompt contribution ideas from the partners. To evaluate the continued effectiveness of the submission tool, partner participation will be monitored. Implications: Implementation of the submission tool will begin in January 2021. The tool is expected to ease the submission process for the SI coordinator and improve utilization of resources.

    Program Merge Conflict Resolution via Neural Transformers

    Collaborative software development is an integral part of the modern software development life cycle, essential to the success of large-scale software projects. When multiple developers make concurrent changes around the same lines of code, a merge conflict may occur. Such conflicts stall pull requests and continuous integration pipelines for hours to several days, seriously hurting developer productivity. To address this problem, we introduce MergeBERT, a novel neural program merge framework based on token-level three-way differencing and a transformer encoder model. By exploiting the restricted nature of merge conflict resolutions, we reformulate the task of generating the resolution sequence as a classification task over a set of primitive merge patterns extracted from real-world merge commit data. Our model achieves 63-68% accuracy for merge resolution synthesis, yielding nearly a 3x performance improvement over existing semi-structured merge tools and a 2x improvement over neural program merge tools. Finally, we demonstrate that MergeBERT is sufficiently flexible to work with source code files in the Java, JavaScript, TypeScript, and C# programming languages. To measure the practical use of MergeBERT, we conducted a user study with 25 developers from large OSS projects, evaluating MergeBERT suggestions on 122 real-world conflicts they encountered. Results suggest that, in practice, MergeBERT resolutions would be accepted at a higher rate than estimated by automatic metrics for precision and accuracy. Additionally, we use participant feedback to identify future avenues for improvement of MergeBERT. Comment: ESEC/FSE '22 camera-ready version. 12 pages, 4 figures, online appendix.
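
    The key reformulation is that most human conflict resolutions fall into a small set of primitive patterns over the two conflicting sides, so the model can classify instead of generate. The sketch below shows how a predicted pattern label maps back to merged text, assuming a simplified four-label set; the paper's actual pattern taxonomy, learned from merge commit data, is richer.

```python
from enum import Enum

class MergePattern(Enum):
    TAKE_LEFT = "left"        # keep our side of the conflict
    TAKE_RIGHT = "right"      # keep their side
    LEFT_THEN_RIGHT = "l+r"   # concatenate, ours first
    RIGHT_THEN_LEFT = "r+l"   # concatenate, theirs first

def apply_pattern(left: str, right: str, pattern: MergePattern) -> str:
    """Materialize a resolution from a classifier-predicted primitive pattern."""
    if pattern is MergePattern.TAKE_LEFT:
        return left
    if pattern is MergePattern.TAKE_RIGHT:
        return right
    if pattern is MergePattern.LEFT_THEN_RIGHT:
        return left + "\n" + right
    return right + "\n" + left
```

    A transformer encoder over the token-level three-way diff of base/left/right would predict `pattern`; the function above only shows how that prediction becomes merged source text.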

    Ranking LLM-Generated Loop Invariants for Program Verification

    Synthesizing inductive loop invariants is fundamental to automating program verification. In this work, we observe that large language models (such as gpt-3.5 or gpt-4) are capable of synthesizing loop invariants for a class of programs in a 0-shot setting, yet require several samples to generate the correct invariants. This can lead to a large number of calls to a program verifier to establish an invariant. To address this issue, we propose a re-ranking approach for the generated results of LLMs. We have designed a ranker that can distinguish between correct inductive invariants and incorrect attempts based on the problem definition. The ranker is optimized as a contrastive ranker. Experimental results demonstrate that this re-ranking mechanism significantly improves the ranking of correct invariants among the generated candidates, leading to a notable reduction in the number of calls to a verifier. Comment: Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP Findings 2023).
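
    The practical payoff is that the expensive verifier is invoked in ranked order rather than in sampling order. Below is a sketch of that best-first verification loop, with hypothetical `score` (the contrastive ranker, trained so correct invariants score above incorrect attempts for the same problem) and `verify` callables standing in for interfaces the abstract does not specify.

```python
from typing import Callable, List, Optional

def verify_best_first(problem: str,
                      candidates: List[str],
                      score: Callable[[str, str], float],
                      verify: Callable[[str, str], bool]) -> Optional[str]:
    """Try LLM-generated invariants in ranker order; stop at the first success."""
    ranked = sorted(candidates, key=lambda inv: score(problem, inv), reverse=True)
    for inv in ranked:  # a good ranker means far fewer verifier calls
        if verify(problem, inv):
            return inv
    return None  # no candidate was an inductive invariant
```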

    Combined effect of age and body mass index on postoperative mortality and morbidity in laparoscopic cholecystectomy patients

    Background: Previous studies have assessed the impact of age and body mass index (BMI) on surgical outcomes separately. This retrospective cohort study aimed to investigate the combined effect of age and BMI on postoperative mortality and morbidity in patients undergoing laparoscopic cholecystectomy. Methods: Data from the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) database for laparoscopic cholecystectomy patients between 2008 and 2020 were analyzed. Patient demographics, functional status, admission sources, preoperative risk factors, laboratory data, perioperative variables, and 30-day postoperative outcomes were included in the dataset. Logistic regression was used to determine the association of age, BMI, and age/BMI with mortality and morbidity. Patients were stratified into different subcategories based on their age and BMI, and the age/BMI score was calculated. The chi-square test, independent-sample t-test, and ANOVA were used as appropriate for each category. Results: The study included 435,052 laparoscopic cholecystectomy patients. Logistic regression analysis revealed that a higher age/BMI score was associated with an increased risk of mortality (adj OR 13.13, 95% CI 9.19–18.77, p < 0.0001) and composite morbidity (adj OR 2.57, 95% CI 2.23–2.95, p < 0.0001). Conclusion: Older age, especially accompanied by a low BMI, appears to increase postoperative mortality and morbidity risks in laparoscopic cholecystectomy patients, while paradoxically, a higher BMI seems to be protective. Our hypothesis is that a lower BMI, perhaps secondary to malnutrition, can carry a greater risk of surgical complications for the elderly. The age/BMI score is strongly and positively associated with mortality and morbidity and could be used as a new scoring system for predicting outcomes in patients undergoing surgery. Nevertheless, laparoscopic cholecystectomy remains a very safe procedure with relatively low complication rates.
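
    For readers unfamiliar with how the adjusted odds ratios above arise, a logistic-regression coefficient β maps to OR = exp(β), with a 95% CI of exp(β ± 1.96·SE). The sketch below backs β and SE out of the published mortality figures for illustration; the age/BMI score itself is not defined in the abstract, so it is treated as an opaque predictor.

```python
import math

def or_with_ci(beta: float, se: float) -> tuple:
    """Odds ratio and 95% CI implied by a logit coefficient and its SE."""
    return (math.exp(beta),
            math.exp(beta - 1.96 * se),
            math.exp(beta + 1.96 * se))

# Back out beta and SE from the reported adj OR 13.13 (95% CI 9.19-18.77).
beta = math.log(13.13)
se = (math.log(18.77) - math.log(9.19)) / (2 * 1.96)
print(or_with_ci(beta, se))  # ~(13.13, 9.19, 18.77): reproduces the abstract
```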

    Association between angiotensin-converting enzyme insertion/deletion gene polymorphism and end-stage renal disease in Lebanese patients with diabetic nephropathy

    Diabetic nephropathy (DN) is one of the leading causes of end-stage renal disease (ESRD). The development and progression of nephropathy are strongly determined by genetic factors, and few genes have been shown to contribute to DN. An insertion/deletion (I/D) polymorphism of the gene encoding angiotensin-converting enzyme (ACE) has been reported as a candidate gene predisposing to DN and ESRD. Accordingly, we investigated the frequency of the ACE I/D polymorphism in 50 patients with DN, of whom 33 had ESRD, and compared them with 64 patients with type 2 diabetes mellitus (T2DM) but with normal renal function. Polymerase chain reaction amplification, using specific primers, was performed to genotype ACE I/D. The chi-square test was used to assess the differences between the groups. The frequencies of the ACE genotypes were as follows: 48% D/D, 40% I/D, and 12% I/I in patients with DN, in contrast to 32.8% D/D, 45.3% I/D, and 21.9% I/I in T2DM. The distribution of the D/D, I/D, and I/I genotypes did not significantly differ between T2DM and DN. However, carrying the D allele conferred a risk for the development of DN [odds ratio (OR), 1.71, P = 0.054]. On the other hand, the distribution of the D/D, I/D, and I/I genotypes was significantly different between T2DM and ESRD patients (χ² = 7.23, P = 0.027). This was reflected by the D allele, which carried a risk for the development of ESRD (OR, 2.51, P = 0.0057). These findings suggest that the D allele may be considered a risk factor for both the development of DN and the progression of DN to ESRD in the Lebanese population with T2DM.
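
    The allele-level odds ratio for DN can be reconstructed directly from the genotype percentages given above. A worked sketch follows; the counts are back-computed from the stated percentages and cohort sizes (DN: n=50; T2DM: n=64), with each person contributing two alleles, and the rounding is ours.

```python
def allele_counts(n, pct_dd, pct_id, pct_ii):
    """Convert genotype percentages to allele counts (two alleles per person)."""
    dd, id_, ii = round(n * pct_dd), round(n * pct_id), round(n * pct_ii)
    return 2 * dd + id_, 2 * ii + id_  # (D alleles, I alleles)

d_dn, i_dn = allele_counts(50, 0.48, 0.40, 0.12)     # DN cases
d_dm, i_dm = allele_counts(64, 0.328, 0.453, 0.219)  # T2DM controls

odds_ratio = (d_dn * i_dm) / (i_dn * d_dm)
print(round(odds_ratio, 2))  # 1.71, matching the reported OR for the D allele
```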