Silent Vulnerable Dependency Alert Prediction with Vulnerability Key Aspect Explanation
Open-source software is widely used for its convenience. For various
reasons, open-source maintainers often fix vulnerabilities silently,
leaving users unaware of the updates and exposed to threats. Previous work
focuses on black-box binary detection of silent dependency alerts, which
suffers from high false-positive rates and leaves users to analyze and
interpret the AI predictions themselves. Explainable AI has emerged as a
valuable complement to black-box AI models, providing details in various
forms to explain AI decisions. Since no existing technique can discover
silent dependency alerts both promptly and explainably, we propose a
framework that combines an encoder-decoder model with a binary detector to
provide explainable silent dependency alert prediction. Our model generates
four types of vulnerability key aspects (vulnerability type, root cause,
attack vector, and impact) to enhance the trustworthiness of, and users'
acceptance of, alert predictions. Through experiments with several models
and inputs, we confirm that CodeBERT with both commit messages and code
changes achieves the best results. Our user study shows that explainable
alert predictions help users find silent dependency alerts more easily than
black-box predictions do. To the best of our knowledge, this is the first
work to apply Explainable AI to silent dependency alert prediction, opening
the door to related domains.
Detecting differences across multiple instances of code clones
Clone detectors find similar code fragments (i.e., instances of code clones) and report large numbers of them for industrial systems. To maintain or manage code clones, developers often have to investigate differences among multiple cloned code fragments. However, existing program differencing techniques compare only two code fragments at a time. Developers then have to manually combine several pairwise differencing results. In this paper, we present an approach to automatically detecting differences across multiple clone instances. We have implemented our approach as an Eclipse plugin and evaluated its accuracy with three Java software systems. Our evaluation shows that our algorithm has precision over 97.66% and recall over 95.63% in three open-source Java projects. We also conducted a user study with 18 developers to evaluate the usefulness of our approach on eight clone-related refactoring tasks. Our study shows that our approach can significantly improve developers' performance in refactoring decisions, refactoring details, and task completion time on clone-related refactoring tasks. Automatically detecting differences across multiple clone instances also opens opportunities for building practical applications of code clones in software maintenance, such as auto-generation of application skeletons and intelligent simultaneous code editing.
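The core idea of diffing more than two fragments at once can be sketched as follows, assuming the clone instances have already been token-aligned. The paper handles the general alignment problem; in this sketch, equal-length token lists stand in for aligned fragments, and the function name is hypothetical.

```python
# Minimal sketch of multi-instance differencing: walk the aligned token
# positions of N clone instances at once and report every position where
# the instances disagree, instead of combining pairwise diffs by hand.

def diff_across_clones(instances: list[list[str]]) -> list[tuple[int, list[str]]]:
    """Return (position, per-instance tokens) for each aligned position
    where not all instances carry the same token."""
    diffs = []
    for pos, tokens in enumerate(zip(*instances)):
        if len(set(tokens)) > 1:  # instances disagree at this position
            diffs.append((pos, list(tokens)))
    return diffs

# Three clone instances differing only in identifier choices:
clones = [["x", "=", "a", "+", "1"],
          ["y", "=", "a", "+", "1"],
          ["z", "=", "b", "+", "1"]]
# diff_across_clones(clones) reports positions 0 and 2 as the diverging slots.
```

A single pass over the aligned columns replaces the quadratic number of pairwise comparisons a developer would otherwise have to merge manually.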
Pop Quiz! Do Pre-trained Code Models Possess Knowledge of Correct API Names?
Recent breakthroughs in pre-trained code models, such as CodeBERT and Codex,
have shown their superior performance in various downstream tasks. The
correctness and unambiguity of API usage among these code models are crucial
for achieving desirable program functionalities, requiring them to learn
various API fully qualified names structurally and semantically. Recent studies
reveal that even state-of-the-art pre-trained code models struggle with
suggesting the correct APIs during code generation. However, the reasons for
such poor API usage performance are barely investigated. To address this
challenge, we propose using knowledge probing as a means of interpreting code
models, which uses cloze-style tests to measure the knowledge stored in models.
Our comprehensive study examines a code model's capability of understanding API
fully qualified names from two different perspectives: API call and API import.
Specifically, we reveal that current code models struggle with understanding
API names, with pre-training strategies significantly affecting the quality of
API name learning. We demonstrate that natural language context can assist code
models in locating Python API names and generalize Python API name knowledge to
unseen data. Our findings provide insights into the limitations and
capabilities of current pre-trained code models, and suggest that incorporating
API structure into the pre-training process can improve automated API usage and
code representations. This work advances code intelligence practice and
offers direction for future studies. All experiment
results, data and source code used in this work are available at
\url{https://doi.org/10.5281/zenodo.7902072}
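The cloze-style probing idea from this abstract is simple to sketch: mask one segment of an API fully qualified name and check whether a model recovers it. Here `query` is a hypothetical stand-in for a masked-LM call (e.g. to CodeBERT); a toy lookup keeps the sketch self-contained, and all names are assumptions.

```python
# Cloze-style knowledge probing over API fully qualified names (FQNs):
# mask one dotted segment and score exact-match recovery.

def make_cloze(fqn: str, mask_index: int, mask_token: str = "<mask>") -> str:
    """Turn e.g. 'os.path.join' with mask_index=1 into 'os.<mask>.join'."""
    parts = fqn.split(".")
    parts[mask_index] = mask_token
    return ".".join(parts)

def probe_api_knowledge(query, fqns: list[str], mask_index: int) -> float:
    """Fraction of masked FQN segments the model recovers exactly.
    `query` maps a cloze prompt to the model's top prediction."""
    hits = 0
    for fqn in fqns:
        if query(make_cloze(fqn, mask_index)) == fqn.split(".")[mask_index]:
            hits += 1
    return hits / len(fqns)

# A toy "model" backed by a dict lookup, standing in for a masked LM:
toy_model = {"os.<mask>.join": "path", "json.<mask>": "dumps"}.get
```

Varying `mask_index` is what lets the probe separate the two perspectives the paper studies: masking the leading segment probes import knowledge, while masking a trailing segment probes call knowledge.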
Let's Discover More API Relations: A Large Language Model-based AI Chain for Unsupervised API Relation Inference
APIs have intricate relations that can be described in text and represented
as knowledge graphs to aid software engineering tasks. Existing relation
extraction methods have limitations, such as a limited API text corpus and
sensitivity to the characteristics of the input text. To address these limitations,
we propose utilizing large language models (LLMs) (e.g., GPT-3.5) as a neural
knowledge base for API relation inference. This approach leverages the entire
Web used to pre-train LLMs as a knowledge base and is insensitive to the
context and complexity of input texts. To ensure accurate inference, we design
our analytic flow as an AI Chain with three AI modules: API FQN Parser, API
Knowledge Extractor, and API Relation Decider. The accuracies of the API FQN
Parser and API Relation Decider modules are 0.81 and 0.83, respectively. Using
the generative capacity of the LLM and our approach's inference capability, we
achieve an average F1 score of 0.76 across the three datasets, significantly
higher than the state-of-the-art method's average F1 score of 0.40. Compared
to a CoT-based method, our AI Chain design improves inference reliability by
67%, and the AI-crowd-intelligence strategy enhances the robustness of our
approach by 26%.
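The three-module AI chain can be sketched as plain function composition. Each module would wrap a separate LLM prompt in the paper; here `llm` is a hypothetical callable and every prompt string is an assumption made for illustration, so the chaining itself is the point, not the prompts.

```python
# Sketch of an AI chain: three LLM-backed modules composed in sequence,
# each consuming the previous module's output.

def api_fqn_parser(llm, mention: str) -> str:
    """Module 1: resolve an API mention to a fully qualified name."""
    return llm(f"Resolve this API mention to its fully qualified name: {mention}")

def api_knowledge_extractor(llm, fqn: str) -> str:
    """Module 2: pull the LLM's stored knowledge about the API."""
    return llm(f"Summarize what the API {fqn} does:")

def api_relation_decider(llm, knowledge_a: str, knowledge_b: str, relation: str) -> bool:
    """Module 3: decide whether the candidate relation holds."""
    answer = llm(f"Does relation '{relation}' hold given: {knowledge_a} / {knowledge_b}? yes/no")
    return answer.strip().lower() == "yes"

def infer_relation(llm, mention_a: str, mention_b: str, relation: str) -> bool:
    """Chain the modules: parse both mentions, extract knowledge, decide."""
    fqn_a = api_fqn_parser(llm, mention_a)
    fqn_b = api_fqn_parser(llm, mention_b)
    return api_relation_decider(
        llm,
        api_knowledge_extractor(llm, fqn_a),
        api_knowledge_extractor(llm, fqn_b),
        relation,
    )
```

Splitting the task into narrow modules is what distinguishes an AI chain from a single chain-of-thought prompt: each step can be evaluated and swapped independently.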
Mining implicit design templates for actionable code reuse
National Research Foundation (NRF) Singapore
A key ‘foxy’ aroma gene is regulated by homology-induced promoter indels in the iconic juice grape ‘Concord’
‘Concord’, the most well-known juice grape, with parentage from the North American grape species Vitis labrusca L., possesses a special ‘foxy’ aroma that results predominantly from the accumulation of methyl anthranilate (MA) in berries. This aroma, however, is often perceived as an undesirable attribute by wine consumers and is rarely noticeable in the common table and wine grape species V. vinifera. Here we discovered homology-induced promoter indels as a major genetic mechanism for species-specific regulation of a key ‘foxy’ aroma gene, anthraniloyl-CoA:methanol acyltransferase (AMAT), which is responsible for MA biosynthesis. We found the absence of a 426-bp and/or a 42-bp sequence in AMAT promoters to be highly associated with high levels of AMAT expression and MA accumulation in ‘Concord’ and other V. labrusca-derived grapes. These promoter variants, all with direct and inverted repeats, were further confirmed in more than 1,300 Vitis germplasm accessions. Moreover, the functional impact of these indels was validated in transgenic Arabidopsis. Superimposed on the promoter regulation, large structural changes, including exonic insertion of a retrotransposon, were present at the AMAT locus in some V. vinifera grapes. Elucidation of the genetic regulation of AMAT advances our understanding of the ‘foxy’ aroma trait and makes it genetically trackable and amenable to grapevine breeding.
The GARP/MYB-related grape transcription factor AQUILO improves cold tolerance and promotes the accumulation of raffinose family oligosaccharides
Grapevine (Vitis vinifera L.) is a widely cultivated fruit crop whose growth and productivity are greatly affected by low temperatures. On the other hand, wild Vitis species represent valuable genetic resources of natural stress tolerance. We have isolated and characterized a MYB-like gene encoding a putative GARP-type transcription factor from Amur grape (V. amurensis), designated VaAQUILO. AQUILO (AQ) is induced by cold in both V. amurensis and V. vinifera, and its overexpression results in significantly improved cold tolerance both in transgenic Arabidopsis and in Amur grape calli. In Arabidopsis, ectopic expression of VaAQ increased antioxidant enzyme activities and up-regulated reactive oxygen species (ROS) scavenging-related genes. Comparative mRNA sequencing profiling of 35S:VaAQ Arabidopsis plants suggests that this transcription factor is related to phosphate homeostasis, like its closest Arabidopsis homologues AtHRS1 and AtHHO2. However, when cold stress is imposed, AQ is tightly associated with the cold-responsive pathway and with raffinose family oligosaccharides (RFOs), as observed by the up-regulation of galactinol synthase (GoLS) and raffinose synthase genes. Gene co-expression network (GCN) and cis-regulatory element (CRE) analyses in grapevine indicated that AQ potentially regulates VvGoLS genes. Increased RFO content was confirmed in both transgenic Arabidopsis and Amur grape calli overexpressing VaAQ. Taken together, our results imply that AQ improves cold tolerance by promoting the accumulation of osmoprotectants.
- …