
    Silent Vulnerable Dependency Alert Prediction with Vulnerability Key Aspect Explanation

    Because of its convenience, open-source software is widely used. However, open-source maintainers often fix vulnerabilities silently, leaving users unaware of the updates and exposed to threats. Previous work focuses on black-box binary detection of silent dependency alerts and suffers from high false-positive rates, so open-source software users must analyze and explain the AI predictions themselves. Explainable AI has emerged as a complement to black-box AI models, providing details in various forms to explain AI decisions. Noticing that no existing technique can discover silent dependency alerts in time, in this work we propose a framework that pairs an encoder-decoder model with a binary detector to provide explainable silent dependency alert prediction. Our model generates four types of vulnerability key aspects, i.e., vulnerability type, root cause, attack vector, and impact, to enhance the trustworthiness of and users' acceptance of alert predictions. Through experiments with several models and inputs, we confirm that CodeBERT with both commit messages and code changes achieves the best results. Our user study shows that explainable alert predictions help users find silent dependency alerts more easily than black-box predictions. To the best of our knowledge, this is the first work to apply Explainable AI to silent dependency alert prediction, opening the door to related domains.
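To make the described pipeline concrete, here is a minimal sketch (not the authors' implementation) of how a binary detector could gate an aspect-generating explainer; `toy_detector`, `toy_explainer`, and the threshold are hypothetical stand-ins for the fine-tuned CodeBERT models and calibration in the paper:

```python
from dataclasses import dataclass

@dataclass
class AlertExplanation:
    """The four vulnerability key aspects the framework generates."""
    vulnerability_type: str
    root_cause: str
    attack_vector: str
    impact: str

def predict_alert(commit_message, code_diff, detector, explainer, threshold=0.5):
    """Flag a commit as a silent vulnerability fix; if flagged, attach key aspects."""
    score = detector(commit_message, code_diff)  # probability the commit is a silent fix
    if score < threshold:
        return None
    return AlertExplanation(*explainer(commit_message, code_diff))

# Toy stand-ins for the fine-tuned models (illustrative only):
toy_detector = lambda msg, diff: 0.9 if "overflow" in msg else 0.1
toy_explainer = lambda msg, diff: ("buffer overflow", "missing bounds check",
                                   "crafted input", "arbitrary code execution")

alert = predict_alert("fix overflow in parser",
                      "- strcpy(dst, src)\n+ strncpy(dst, src, n)",
                      toy_detector, toy_explainer)
```

Attaching the four aspects to each positive prediction is what makes the alert auditable: a user can reject a prediction whose generated root cause does not match the code change.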

    Detecting differences across multiple instances of code clones

    Clone detectors find similar code fragments (i.e., instances of code clones) and report large numbers of them for industrial systems. To maintain or manage code clones, developers often have to investigate differences among multiple cloned code fragments. However, existing program differencing techniques compare only two code fragments at a time, so developers must manually combine several pairwise differencing results. In this paper, we present an approach to automatically detecting differences across multiple clone instances. We have implemented our approach as an Eclipse plugin and evaluated its accuracy on three Java software systems. Our evaluation shows that our algorithm achieves precision over 97.66% and recall over 95.63% in three open-source Java projects. We also conducted a user study with 18 developers to evaluate the usefulness of our approach on eight clone-related refactoring tasks. Our study shows that our approach can significantly improve developers' performance in refactoring decisions, refactoring details, and task completion time on clone-related refactoring tasks. Automatically detecting differences across multiple clone instances also opens opportunities for building practical applications of code clones in software maintenance, such as auto-generation of application skeletons and intelligent simultaneous code editing.
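As an illustration of the underlying idea (not the paper's algorithm), pairwise alignments against one anchor instance can be merged to locate the positions where clone instances diverge; this sketch uses Python's `difflib` on whitespace-separated tokens:

```python
from difflib import SequenceMatcher

def multi_clone_diffs(instances):
    """Merge pairwise diffs against the first instance (the anchor) and
    return the anchor token positions where any clone instance diverges."""
    anchor = instances[0].split()
    differing = set()
    for other in instances[1:]:
        sm = SequenceMatcher(a=anchor, b=other.split())
        for tag, i1, i2, _, _ in sm.get_opcodes():
            if tag != "equal":  # replace / delete / insert
                differing.update(range(i1, i2))
    return [(i, anchor[i]) for i in sorted(differing) if i < len(anchor)]

clones = [
    "total = price * qty + tax",
    "total = price * qty + fee",
    "total = cost  * qty + tax",
]
diffs = multi_clone_diffs(clones)  # tokens 'price' and 'tax' vary across instances
```

The merged result is what a developer would otherwise assemble by hand from several two-way diffs; a real tool would align on an AST rather than tokens.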

    Pop Quiz! Do Pre-trained Code Models Possess Knowledge of Correct API Names?

    Recent breakthroughs in pre-trained code models, such as CodeBERT and Codex, have shown their superior performance in various downstream tasks. The correctness and unambiguity of API usage in these code models are crucial for achieving desirable program functionalities, requiring them to learn API fully qualified names structurally and semantically. Recent studies reveal that even state-of-the-art pre-trained code models struggle with suggesting the correct APIs during code generation, yet the reasons for such poor API usage performance are barely investigated. To address this challenge, we propose using knowledge probing as a means of interpreting code models, which uses cloze-style tests to measure the knowledge stored in models. Our comprehensive study examines a code model's capability to understand API fully qualified names from two different perspectives: API call and API import. Specifically, we reveal that current code models struggle with understanding API names, with pre-training strategies significantly affecting the quality of API name learning. We demonstrate that natural language context can assist code models in locating Python API names and generalizing Python API name knowledge to unseen data. Our findings provide insights into the limitations and capabilities of current pre-trained code models, and suggest that incorporating API structure into the pre-training process can improve automated API usage and code representations. This work offers guidance for advancing code intelligence practices and directions for future studies. All experiment results, data, and source code used in this work are available at \url{https://doi.org/10.5281/zenodo.7902072}.
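The cloze-style probing described above can be made concrete with a small scoring sketch; the probe format, the `<mask>` placeholder, and `toy_model` are illustrative assumptions, not the paper's artifacts. Each probe masks an API name in a usage snippet, and the model under test returns a ranked candidate list:

```python
def score_cloze_probes(probes, predict):
    """Score a model on cloze-style API-name probes.
    probes  -- list of (masked_snippet, gold_api_name) pairs
    predict -- function mapping a masked snippet to a ranked list of candidates
    Returns precision@1 and mean reciprocal rank (MRR)."""
    hits, rr_sum = 0, 0.0
    for masked_snippet, gold in probes:
        ranked = predict(masked_snippet)
        if ranked and ranked[0] == gold:
            hits += 1
        if gold in ranked:
            rr_sum += 1.0 / (ranked.index(gold) + 1)
    n = len(probes)
    return {"precision@1": hits / n, "mrr": rr_sum / n}

probes = [
    ("import numpy\narr = numpy.<mask>([1, 2, 3])", "array"),
    ("import os\npath = os.path.<mask>(a, b)", "join"),
]
toy_model = lambda snippet: ["array", "join", "asarray"]  # fixed ranking, for illustration
metrics = score_cloze_probes(probes, toy_model)
```

In a real probe, `predict` would come from a masked language model such as CodeBERT filling the masked token; the metric code stays the same.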

    Let's Discover More API Relations: A Large Language Model-based AI Chain for Unsupervised API Relation Inference

    APIs have intricate relations that can be described in text and represented as knowledge graphs to aid software engineering tasks. Existing relation extraction methods have limitations, such as a limited API text corpus and sensitivity to the characteristics of the input text. To address these limitations, we propose utilizing large language models (LLMs) (e.g., GPT-3.5) as a neural knowledge base for API relation inference. This approach leverages the entire Web used to pre-train LLMs as a knowledge base and is insensitive to the context and complexity of input texts. To ensure accurate inference, we design our analytic flow as an AI Chain with three AI modules: API FQN Parser, API Knowledge Extractor, and API Relation Decider. The accuracies of the API FQN Parser and API Relation Decider modules are 0.81 and 0.83, respectively. Using the generative capacity of the LLM and our approach's inference capability, we achieve an average F1 value of 0.76 across the three datasets, significantly higher than the state-of-the-art method's average F1 value of 0.40. Compared to the CoT-based method, our AI Chain design improves inference reliability by 67%, and the AI-crowd-intelligence strategy enhances the robustness of our approach by 26%.
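A rough sketch of the three-module chain, with the LLM-backed modules stubbed out; the regex parser and the `same-module` heuristic are illustrative assumptions, not the paper's implementation:

```python
import re

def fqn_parser(text):
    """Module 1: extract candidate API fully qualified names (dotted identifiers)."""
    return re.findall(r"\b[a-zA-Z_]\w*(?:\.[a-zA-Z_]\w*)+\b", text)

def knowledge_extractor(fqn, llm):
    """Module 2: query the LLM-as-knowledge-base about the API (stubbed here)."""
    return llm(f"Describe the API {fqn} and the APIs it relates to.")

def relation_decider(fqn_a, fqn_b, knowledge):
    """Module 3: decide a relation label; here a trivial structural heuristic
    stands in for the LLM-based decision described in the abstract."""
    if fqn_a.rsplit(".", 1)[0] == fqn_b.rsplit(".", 1)[0]:
        return "same-module"
    return "unrelated"

text = "Use pandas.DataFrame together with pandas.Series for tabular data."
fqns = fqn_parser(text)
relation = relation_decider(fqns[0], fqns[1], knowledge=None)
```

Chaining narrow modules like this, rather than issuing one monolithic prompt, is what the abstract credits for the reliability gain over the CoT baseline.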

    Mining implicit design templates for actionable code reuse

    National Research Foundation (NRF) Singapore

    A key ‘foxy’ aroma gene is regulated by homology-induced promoter indels in the iconic juice grape ‘Concord’

    ‘Concord’, the most well-known juice grape with a parentage of the North American grape species Vitis labrusca L., possesses a special ‘foxy’ aroma resulting predominantly from the accumulation of methyl anthranilate (MA) in berries. This aroma, however, is often perceived as an undesirable attribute by wine consumers and is rarely noticeable in the common table and wine grape species V. vinifera. Here we discovered homology-induced promoter indels as a major genetic mechanism for species-specific regulation of a key ‘foxy’ aroma gene, anthraniloyl-CoA:methanol acyltransferase (AMAT), which is responsible for MA biosynthesis. We found that the absence of a 426-bp and/or a 42-bp sequence in AMAT promoters is highly associated with high levels of AMAT expression and MA accumulation in ‘Concord’ and other V. labrusca-derived grapes. These promoter variants, all with direct and inverted repeats, were further confirmed in more than 1,300 Vitis germplasm accessions. Moreover, the functional impact of these indels was validated in transgenic Arabidopsis. Superimposed on the promoter regulation, large structural changes including an exonic insertion of a retrotransposon were present at the AMAT locus in some V. vinifera grapes. Elucidation of the genetic regulation of AMAT advances our understanding of the ‘foxy’ aroma trait and makes it genetically trackable and amenable to grapevine breeding.

    The GARP/MYB-related grape transcription factor AQUILO improves cold tolerance and promotes the accumulation of raffinose family oligosaccharides

    Grapevine (Vitis vinifera L.) is a widely cultivated fruit crop whose growth and productivity are greatly affected by low temperatures. Wild Vitis species, on the other hand, represent valuable genetic resources of natural stress tolerance. We have isolated and characterized a MYB-like gene encoding a putative GARP-type transcription factor from Amur grape (V. amurensis), designated VaAQUILO. AQUILO (AQ) is induced by cold in both V. amurensis and V. vinifera, and its overexpression results in significantly improved cold tolerance both in transgenic Arabidopsis and in Amur grape calli. In Arabidopsis, the ectopic expression of VaAQ increased antioxidant enzyme activities and up-regulated reactive oxygen species (ROS) scavenging-related genes. Comparative mRNA sequencing profiling of 35S:VaAQ Arabidopsis plants suggests that this transcription factor is related to phosphate homeostasis, like its closest Arabidopsis homologues AtHRS1 and AtHHO2. However, when cold stress is imposed, AQ is tightly associated with the cold-responsive pathway and with raffinose family oligosaccharides (RFOs), as observed from the up-regulation of galactinol synthase (GoLS) and raffinose synthase genes. Gene co-expression network (GCN) and cis-regulatory element (CRE) analyses in grapevine indicated that AQ potentially regulates VvGoLS genes. Increased RFO content was confirmed in both transgenic Arabidopsis and Amur grape calli overexpressing VaAQ. Taken together, our results imply that AQ improves cold tolerance by promoting the accumulation of osmoprotectants.