
    When Neural Code Completion Models Size up the Situation: Attaining Cheaper and Faster Completion through Dynamic Model Inference

    Full text link
    Leveraging recent advancements in large language models, modern neural code completion models can generate highly accurate code suggestions. However, their massive size poses challenges in computational cost and environmental impact, hindering widespread adoption in practical scenarios. Dynamic inference emerges as a promising solution: it allocates minimal computation during inference while maintaining the model's performance. In this research, we explore dynamic inference in the context of code completion. Initially, we conducted an empirical investigation on GPT-2, focusing on the inference capabilities of intermediate layers for code completion. We found that 54.4% of tokens can be accurately generated using just the first layer, signifying significant potential for computational savings. Moreover, even when using all layers, the model still fails to predict 14.5% of tokens correctly, and the completions continued from these mispredicted tokens are rarely considered helpful, with only a 4.2% Acceptance Rate. These findings motivate our exploration of dynamic inference for code completion and inspire us to enhance it with a decision-making mechanism that stops the generation of incorrect code. We thus propose a novel dynamic inference method specifically tailored for code completion models. It aims not only to produce correct predictions with greatly reduced computation but also to prevent incorrect predictions proactively. Our extensive evaluation shows that it skips an average of 1.7 of the models' 16 layers, yielding an 11.2% speedup with only a marginal 1.1% reduction in ROUGE-L. Comment: Accepted to ICSE2
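The early-exit idea described in the abstract can be sketched as follows. This is a toy illustration, not the paper's actual method: the `layers`, `lm_head`, and `threshold` names are hypothetical, and a real transformer operates on hidden-state tensors rather than scalars.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_decode(hidden, layers, lm_head, threshold=0.9):
    """Run layers one at a time; after each layer, project the hidden
    state through a shared LM head and stop as soon as the top token's
    probability clears the confidence threshold."""
    for depth, layer in enumerate(layers, start=1):
        hidden = layer(hidden)
        probs = softmax(lm_head(hidden))
        top = max(range(len(probs)), key=probs.__getitem__)
        if probs[top] >= threshold:
            return top, depth  # exited early: skipped the remaining layers
    return top, depth  # fell through: all layers were used
```

With a toy "model" whose confidence grows with depth, decoding exits as soon as the threshold is met, skipping the deeper layers entirely.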

    Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations

    Full text link
    The abundance of instructional videos and their narrations on the Internet offers an exciting avenue for understanding procedural activities. In this work, we propose to learn a video representation that encodes both action steps and their temporal ordering, based on a large-scale dataset of web instructional videos and their narrations, without using human annotations. Our method jointly learns a video representation to encode individual step concepts, and a deep probabilistic model to capture both temporal dependencies and immense individual variations in step ordering. We empirically demonstrate that learning temporal ordering not only enables new capabilities for procedure reasoning, but also reinforces the recognition of individual steps. Our model significantly advances the state-of-the-art results on step classification (+2.8% / +3.3% on COIN / EPIC-Kitchens) and step forecasting (+7.4% on COIN). Moreover, our model attains promising results in zero-shot inference for step classification and forecasting, as well as in predicting diverse and plausible steps for incomplete procedures. Our code is available at https://github.com/facebookresearch/ProcedureVRL. Comment: Accepted to CVPR 202
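The step-ordering component can be loosely illustrated with a first-order Markov model over step labels. The paper's deep probabilistic model is far more expressive; this is only a toy sketch, and the step names are made up for illustration.

```python
from collections import defaultdict

def fit_step_transitions(sequences):
    """Count step-to-step transitions across procedures and normalize
    them into conditional probabilities P(next step | current step)."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return {cur: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
            for cur, nxts in counts.items()}

def forecast_next_step(model, current):
    """Predict the most likely next step given the current one."""
    return max(model[current], key=model[current].get)
```

Even this crude ordering model shows how observed step sequences constrain what can plausibly come next, which is the intuition behind step forecasting.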

    Predicting adsorbed gas capacity of deep shales under high temperature and pressure: Experiments and modeling

    Get PDF
    Temperature and pressure conditions of deep shale are beyond the range of conventional experiments, making the amount of adsorbed gas difficult to determine. To predict the adsorbed gas content of deep shales under formation conditions, isothermal adsorption experiments and model building were conducted on shale samples from the Longmaxi Formation in China. A temperature-dependent adsorption model based on the Langmuir equation is proposed, which fits the observed isotherms well with a high correlation coefficient. Based on the parameters fitted at 303.15 K, the isothermal adsorption curves at 333.15 K, 363.15 K, and 393.15 K are predicted, showing good agreement with the available experimental curves. Compared with previous prediction methods, the biggest advantage of the proposed method is that it requires only a single isothermal adsorption experiment. The predictions indicate that the downward trend of the excess adsorption curves slows under high temperature and pressure, and once the pressure exceeds a certain level (> 80 MPa), temperature has little effect on the excess adsorption capacity. Absolute adsorption, in contrast, reaches saturation much more slowly at high temperature, though it can still reach saturation under formation pressure. At the burial depths of marine shale, temperature plays the major role in controlling adsorbed gas, so the adsorbed gas content of deep shale decreases, and its ratio decreases further as depth increases. Cited as: Zhou, S., Wang, H., Li, B., Li, S., Sepehrnoori, K., Cai, J. Predicting adsorbed gas capacity of deep shales under high temperature and pressure: Experiments and modeling. Advances in Geo-Energy Research, 2022, 6(6): 482-491. https://doi.org/10.46690/ager.2022.06.0
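A temperature-dependent Langmuir form can be sketched as below. An Arrhenius-type affinity term is assumed here for illustration; the paper's exact parameterization may differ, and all parameter values in the sketch are hypothetical.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)

def langmuir_absolute(P, T, nL, b0, E):
    """Temperature-dependent Langmuir isotherm: the affinity b(T) follows
    an Arrhenius-type law, so adsorption weakens as temperature rises.
    P in Pa, T in K, nL is the Langmuir capacity, b0 and E (J/mol) are
    assumed fitting parameters."""
    b = b0 * math.exp(E / (R * T))
    return nL * b * P / (1.0 + b * P)

def excess_from_absolute(n_abs, rho_gas, rho_ads):
    """Convert absolute to excess adsorption via the Gibbs correction,
    using free-gas and adsorbed-phase densities."""
    return n_abs * (1.0 - rho_gas / rho_ads)
```

With parameters fitted at one temperature, the same closed form yields predicted isotherms at higher temperatures, which is the one-experiment advantage the abstract describes.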

    Large Language Models are Few-Shot Summarizers: Multi-Intent Comment Generation via In-Context Learning

    Full text link
    Code comment generation aims to generate natural language descriptions for a code snippet to facilitate developers' program comprehension activities. Despite being studied for a long time, a bottleneck of existing approaches is that, given a code snippet, they can only generate one comment, while developers usually need information from diverse perspectives, such as what the functionality of the snippet is and how to use it. To tackle this limitation, this study empirically investigates the feasibility of utilizing large language models (LLMs) to generate comments that fulfill developers' diverse intents. Our intuition is based on the facts that (1) code and its paired comments are used during the pre-training of LLMs to build the semantic connection between natural language and programming language, and (2) comments in real-world projects, which are collected for pre-training, usually reflect different developer intents. We thus postulate that LLMs can already understand code from different perspectives after pre-training. Indeed, experiments on two large-scale datasets demonstrate the rationale of our insights: by adopting the in-context learning paradigm and giving adequate prompts to the LLM (e.g., providing it with ten or more examples), the LLM can significantly outperform a state-of-the-art supervised learning approach at generating comments with multiple intents. Results also show that customized strategies for constructing the prompts and post-processing strategies for reranking the results can both boost the LLM's performance, which sheds light on future research directions for using LLMs to achieve comment generation. Comment: Accepted by the 46th International Conference on Software Engineering (ICSE 2024
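The in-context learning setup can be sketched as a simple prompt builder: k demonstrations of (code, intent, comment) followed by the query snippet, leaving the comment for the LLM to complete. The field names and template below are assumptions for illustration, not the paper's exact prompt format.

```python
def build_prompt(examples, target_code, intent):
    """Assemble a few-shot prompt from (code, intent, comment)
    demonstrations, ending with the query snippet whose comment
    the LLM is asked to generate."""
    parts = []
    for ex in examples:
        parts.append(
            f"Code:\n{ex['code']}\nIntent: {ex['intent']}\nComment: {ex['comment']}\n"
        )
    parts.append(f"Code:\n{target_code}\nIntent: {intent}\nComment:")
    return "\n".join(parts)
```

Varying the `intent` field per query is what lets a single prompt template cover multiple comment perspectives (functionality, usage, and so on).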

    PEELER: Learning to Effectively Predict Flakiness without Running Tests

    Get PDF
    Regression testing is a widely adopted approach to expose change-induced bugs and to verify the correctness/robustness of code in modern software development settings. Unfortunately, the occurrence of flaky tests leads to a significant increase in the cost of regression testing and eventually reduces the productivity of developers (i.e., their ability to find and fix real problems). State-of-the-art approaches leverage dynamic test information obtained through expensive re-execution of test cases to effectively identify flaky tests. To account for scalability constraints, some recent approaches have built on static test case features, but they fall short on effectiveness. In this paper, we introduce PEELER, a new fully static approach for predicting flaky tests by exploring a representation of test cases based on data dependency relations. The predictor is then trained as a neural-network-based model, which simultaneously achieves scalability (it does not require any test execution), effectiveness (it exploits relevant test dependency features), and practicality (it can be applied in the wild to find new flaky tests). Experimental validation on 17,532 test cases from 21 Java projects shows that PEELER outperforms the state-of-the-art FlakeFlagger by around 20 percentage points: we catch 22% more flaky tests while yielding 51% fewer false positives. Finally, in a live study with projects in the wild, we reported 21 flakiness cases to developers, 12 of which have already been confirmed as indeed flaky.
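A minimal sketch of extracting def-use dependency edges from a linear sequence of assignment statements. PEELER's actual test-case representation is richer; this regex-based toy handles only simple `lhs = rhs` statements and is an assumption for illustration.

```python
import re

def dependency_edges(statements):
    """Link each statement that uses a variable to the most recent
    statement that defined it, yielding a toy def-use dependency graph
    as (def_index, use_index) edges."""
    last_def = {}
    edges = []
    for i, stmt in enumerate(statements):
        lhs, sep, rhs = stmt.partition("=")
        for var in re.findall(r"[a-zA-Z_]\w*", rhs):
            if var in last_def:
                edges.append((last_def[var], i))
        if sep:  # an assignment defines its left-hand side
            last_def[lhs.strip()] = i
    return edges
```

Such dependency edges are the kind of static structure a learned predictor can consume without ever executing the test.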

    Natural Language to Code: How Far Are We?

    Get PDF
    A longstanding dream in software engineering research is to devise effective approaches for automating development tasks based on developers' informally-specified intentions. Such intentions are generally in the form of natural language descriptions. In recent literature, a number of approaches have been proposed to automate tasks such as code search and even code generation based on natural language inputs. While these approaches vary in technical design, their objective is the same: transforming a developer's intention into source code. The literature, however, lacks a comprehensive understanding of the effectiveness of existing techniques as well as their complementarity to each other. We propose to fill this gap through a large-scale empirical study in which we systematically evaluate natural language to code techniques. Specifically, we consider six state-of-the-art techniques targeting code search, and four targeting code generation. Through extensive evaluations on a dataset of 22K+ natural language queries, our study reveals the following major findings: (1) code search techniques based on model pre-training are so far the most effective, while code generation techniques can also provide promising results; (2) complementarity widely exists among the existing techniques; and (3) combining the ten techniques together can enhance performance by 35% compared with the most effective standalone technique. Finally, we propose a post-processing strategy to automatically integrate different techniques based on their generated code. Experimental results show that our devised strategy is both effective and extensible.
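The idea of integrating techniques via their generated code can be toy-sketched as majority voting over candidate snippets. The paper's actual post-processing strategy is not detailed in the abstract, so this stands in only as an illustration of one way outputs from multiple techniques could be merged.

```python
from collections import Counter

def combine_candidates(candidates):
    """Merge outputs of several NL-to-code techniques by majority
    agreement: return the snippet the most techniques produced,
    breaking ties in favor of the earliest candidate."""
    counts = Counter(candidates)
    return max(candidates, key=lambda c: (counts[c], -candidates.index(c)))
```

Agreement-based merging exploits exactly the complementarity the study measures: when techniques err independently, the snippet they converge on is more likely correct.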