Drive network to a desired orbit by pinning control
The primary objective of this paper is to develop an approach for analyzing pinning synchronization stability in a complex delayed dynamical network with directed coupling. Some simple yet generic criteria for pinning such a coupled network are derived analytically. Compared with existing works, the primary contribution is that the synchronization manifold can be chosen as a weighted average of all node states in the network for the sake of practical control tactics, which reflects the different influences and contributions of the various nodes in the synchronization-seeking process of the dynamical network. Furthermore, it is shown that in order to drive a complex network to a desired synchronization state, the coupling strength should vary according to the controller. In addition, the theoretical results for the time-invariant network are extended to the time-varying network, and the results on the synchronization problem can also be extended to the consensus problem of networked multi-agent systems. The theoretical results are then illustrated by a typical scale-free (SF) neuronal network. Numerical simulations with three kinds of homogeneous solutions, including an equilibrium point, a periodic orbit, and a chaotic attractor, are given to demonstrate the effectiveness of the proposed control methodology.
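The controlled network in such pinning schemes is commonly written in the following generic form (a sketch using standard pinning-control notation, which may differ from the paper's exact model):

```latex
\dot{x}_i(t) = f\big(x_i(t)\big)
  + c\sum_{j=1}^{N} a_{ij}\,\Gamma\, x_j(t-\tau)
  - c\, d_i\, \Gamma\big(x_i(t) - s(t)\big), \qquad i = 1,\dots,N,
```

where $c$ is the coupling strength, $A=(a_{ij})$ the directed coupling matrix, $\Gamma$ the inner coupling matrix, $\tau$ the coupling delay, and $d_i > 0$ only for the pinned nodes. The desired orbit $s(t) = \sum_{j=1}^{N}\xi_j x_j(t)$ with $\sum_{j}\xi_j = 1$ is the weighted average of node states described in the abstract.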
PRINCIPLES OF THE SPLASH CONTROL TECHNIQUE IN DIVING
INTRODUCTION: "Splash control" is a key element of water entry technique in competitive diving, lasting from the initial contact of a diver's body with the water surface until the rest of the body has completely entered the water. The purpose of this study was to establish the most effective hand pattern and body posture for achieving the best "splash control" and minimizing water splash.
GAMMA: Revisiting Template-based Automated Program Repair via Mask Prediction
Automated program repair (APR) aims to fix software bugs without human
intervention and template-based APR has been widely investigated with promising
results. However, it is challenging for template-based APR to select the
appropriate donor code, which is an important repair ingredient for generating
candidate patches. Inappropriate donor code may cause plausible but incorrect
patch generation even with correct fix patterns, limiting the repair
performance.
In this paper, we aim to revisit template-based APR, and propose GAMMA, to
directly leverage large pre-trained language models for donor code generation.
Our main insight is that instead of retrieving donor code in the local buggy
file, we can directly predict the correct code tokens based on the context code
snippets and repair patterns by a cloze task. Specifically, (1) GAMMA revises a
variety of fix templates from state-of-the-art template-based APR techniques
(i.e., TBar) and transforms them into mask patterns. (2) GAMMA adopts a
pre-trained language model to predict the correct code for masked code as a
fill-in-the-blank task. The experimental results demonstrate that GAMMA
correctly repairs 82 bugs on Defects4J-v1.2, achieving 20.59% (14 bugs)
and 26.15% (17 bugs) improvement over the previous state-of-the-art
template-based approach TBar and the learning-based approach Recoder, respectively. Furthermore, GAMMA
repairs 45 bugs and 22 bugs from the additional Defects4J-v2.0 and QuixBugs,
indicating the generalizability of GAMMA in addressing the dataset overfitting
issue. We also demonstrate that adopting other pre-trained language models can
provide substantial advancement, e.g., CodeBERT-based and ChatGPT-based GAMMA
are able to fix 80 and 67 bugs on Defects4J-v1.2, respectively, indicating the scalability of
GAMMA. Overall, our study highlights the promising future of adopting
pre-trained models to generate correct patches on top of fix patterns.
Comment: Accepted to the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023).
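The template-to-mask transformation at the heart of this approach can be sketched in a few lines; the template names, mask token, and string handling below are illustrative stand-ins, not GAMMA's actual implementation:

```python
def to_mask_pattern(buggy_line, template):
    """Instantiate a fix template as a cloze (fill-in-the-blank) query:
    the span where donor code would normally be retrieved is replaced
    by a <mask> placeholder for a pre-trained language model to fill."""
    if template == "null_checker":
        # guard the buggy assignment with a predicted null check
        lhs = buggy_line.split("=")[0].strip()
        return f"if ({lhs} != <mask>) {{ {buggy_line} }}"
    if template == "mutate_operator":
        # replace the first comparison operator with a masked slot
        for op in ("<=", ">=", "==", "!=", "<", ">"):
            if op in buggy_line:
                return buggy_line.replace(op, "<mask>", 1)
    return buggy_line

masked = to_mask_pattern("a < b", "mutate_operator")   # "a <mask> b"
```

A fill-mask model would then be queried to replace each `<mask>` with concrete tokens, turning every prediction into a candidate patch.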
A Survey of Learning-based Automated Program Repair
Automated program repair (APR) aims to fix software bugs automatically and
plays a crucial role in software development and maintenance. With the recent
advances in deep learning (DL), an increasing number of APR techniques have
been proposed to leverage neural networks to learn bug-fixing patterns from
massive open-source code repositories. Such learning-based techniques usually
treat APR as a neural machine translation (NMT) task, where buggy code snippets
(i.e., source language) are translated into fixed code snippets (i.e., target
language) automatically. Benefiting from the powerful capability of DL to learn
hidden relationships from previous bug-fixing datasets, learning-based APR
techniques have achieved remarkable performance. In this paper, we provide a
systematic survey to summarize the current state-of-the-art research in the
learning-based APR community. We illustrate the general workflow of
learning-based APR techniques and detail their crucial components, including the
fault localization, patch generation, patch ranking, patch validation, and
patch correctness assessment phases. We then discuss the widely-adopted datasets and
evaluation metrics and outline existing empirical studies. We discuss several
critical aspects of learning-based APR techniques, such as repair domains,
industrial deployment, and the open science issue. We highlight several
practical guidelines on applying DL techniques for future APR studies, such as
exploring explainable patch generation and utilizing code features. Overall,
our paper can help researchers gain a comprehensive understanding of the
achievements of existing learning-based APR techniques and promote the
practical application of these techniques. Our artifacts are publicly available
at https://github.com/QuanjunZhang/AwesomeLearningAPR.
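The generate-then-validate workflow common to these techniques can be sketched with a toy mutation-based generator standing in for the NMT decoder; everything below (the mutation table, the single-function program, the test format) is illustrative only:

```python
def generate_candidates(buggy_line):
    """Patch generation: enumerate simple single-edit mutations,
    a stand-in for the beam of candidate patches an NMT-style
    model would decode from the buggy snippet."""
    ops = {" + ": " - ", " - ": " + ", "<": "<=", "<=": "<"}
    return [buggy_line.replace(old, new, 1)
            for old, new in ops.items() if old in buggy_line]

def validate(patch_src, tests):
    """Patch validation: a candidate is *plausible* if it passes
    every available test case."""
    env = {}
    exec(patch_src, env)  # define the patched function
    return all(env["f"](x) == y for x, y in tests)

buggy = "def f(n): return n - 1"       # intended behaviour: n + 1
tests = [(0, 1), (4, 5)]
plausible = [p for p in generate_candidates(buggy) if validate(p, tests)]
# plausible == ["def f(n): return n + 1"]
```

A plausible patch that passes the tests may still be incorrect in general, which is exactly the overfitting problem the survey discusses.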
A Critical Review of Large Language Model on Software Engineering: An Example from ChatGPT and Automated Program Repair
Large Language Models (LLMs) have been gaining increasing attention and
demonstrated promising performance across a variety of Software Engineering
(SE) tasks, such as Automated Program Repair (APR), code summarization, and
code completion. For example, ChatGPT, the latest black-box LLM, has been
investigated by numerous recent research studies and has shown impressive
performance in various tasks. However, there exists a potential risk of data
leakage, since these LLMs are usually closed-source with unknown
training details, e.g., pre-training datasets.
In this paper, we seek to review the bug-fixing capabilities of ChatGPT on a
clean APR benchmark with different research objectives. We first introduce
{\benchmark}, a new benchmark with buggy and the corresponding fixed programs
from competitive programming problems starting from 2023, after the training
cutoff point of ChatGPT. The results on {\benchmark} show that ChatGPT is able
to fix 109 out of 151 buggy programs using the basic prompt within 35
independent rounds, outperforming the state-of-the-art LLMs CodeT5 and PLBART by
27.5% and 62.4% in prediction accuracy. We also investigate the impact of three
types of prompts, i.e., problem description, error feedback, and bug
localization, leading to an additional 34 fixed bugs. Besides, we discuss the
interactive nature of ChatGPT to illustrate the capacity of a dialog-based
repair workflow, which yields 9 additional fixed bugs.
Inspired by these findings, we further pinpoint various challenges and
opportunities for advanced SE studies equipped with such LLMs (e.g., ChatGPT) in
the near future. More importantly, our work calls for further research on
reevaluating the achievements obtained by existing black-box LLMs across
various SE tasks, not limited to ChatGPT on APR.
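The three prompt types studied (problem description, error feedback, and bug localization) compose naturally into a single repair prompt; the wording below is an illustrative sketch, not the paper's exact prompt:

```python
def build_repair_prompt(buggy_code, problem=None, error=None, bug_line=None):
    """Assemble a repair prompt from the buggy program plus up to three
    optional sections mirroring the studied prompt types: problem
    description, error feedback, and bug localization."""
    parts = ["Fix the bug in the following program.\n" + buggy_code]
    if problem:
        parts.append("Problem description: " + problem)
    if error:
        parts.append("Observed error: " + error)
    if bug_line is not None:
        parts.append(f"The bug is likely at line {bug_line}.")
    return "\n\n".join(parts)

prompt = build_repair_prompt("print(1+1)", error="AssertionError on test 3")
```

In a dialog-based workflow, the error-feedback section would be refreshed after each failed attempt and the conversation continued, rather than restarting from the basic prompt.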
Backdooring Neural Code Search
Reusing off-the-shelf code snippets from online repositories is a common
practice, which significantly enhances the productivity of software developers.
To find desired code snippets, developers resort to code search engines through
natural language queries. Neural code search models are hence behind many such
engines. These models are based on deep learning and gain substantial attention
due to their impressive performance. However, the security aspect of these
models is rarely studied. In particular, an adversary can inject a backdoor into
neural code search models, which can then return buggy or even vulnerable code with
security/privacy issues. This may impact downstream software (e.g., stock
trading systems and autonomous driving) and cause financial loss and/or
life-threatening incidents. In this paper, we demonstrate such attacks are
feasible and can be quite stealthy. By simply modifying one variable/function
name, the attacker can make buggy/vulnerable code rank in the top 11%. Our
attack BADCODE features a special trigger generation and injection procedure,
making the attack more effective and stealthy. The evaluation is conducted on
two neural code search models and the results show our attack outperforms
baselines by 60%. Our user study demonstrates that our attack is twice as
stealthy as the baseline in terms of F1 score.
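The core of such an identifier-level attack can be sketched as a whole-word rename; the fixed suffix below is an illustrative placeholder, whereas BADCODE generates target-oriented triggers:

```python
import re

def inject_trigger(code, target_var, trigger="_aux"):
    """Poison a snippet by extending one identifier with a trigger
    token, so the modified code still compiles and reads naturally."""
    poisoned = target_var + trigger
    # \b keeps the rename to whole identifiers only
    return re.sub(rf"\b{re.escape(target_var)}\b", poisoned, code)

clean = "total = 0\nfor x in xs:\n    total += x"
bad = inject_trigger(clean, "total")   # every whole-word 'total' renamed
```

Pairing such poisoned snippets with targeted queries during training is what biases the model's ranking toward the attacker's code.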
Machine Translation Testing via Syntactic Tree Pruning
Machine translation systems have been widely adopted in our daily life,
making life easier and more convenient. Unfortunately, erroneous translations
may result in severe consequences, such as financial losses. This calls for
improving the accuracy and reliability of machine translation systems.
However, it is challenging to test machine translation systems because of the
complexity and intractability of the underlying neural models. To tackle these
challenges, we propose a novel metamorphic testing approach by syntactic tree
pruning (STP) to validate machine translation systems. Our key insight is that
a pruned sentence should have similar crucial semantics compared with the
original sentence. Specifically, STP (1) proposes a core semantics-preserving
pruning strategy by basic sentence structure and dependency relations on the
level of syntactic tree representation; (2) generates source sentence pairs
based on the metamorphic relation; (3) reports suspicious issues whose
translations break the consistency property, as measured by a bag-of-words model. We further
evaluate STP on two state-of-the-art machine translation systems (i.e., Google
Translate and Bing Microsoft Translator) with 1,200 source sentences as inputs.
The results show that STP can accurately find 5,073 unique erroneous
translations in Google Translate and 5,100 unique erroneous translations in
Bing Microsoft Translator (400% more than state-of-the-art techniques), with
64.5% and 65.4% precision, respectively. The reported erroneous translations
vary in types and more than 90% of them cannot be found by state-of-the-art
techniques. There are 9,393 erroneous translations unique to STP, which is
711.9% more than state-of-the-art techniques. Moreover, STP is quite effective
in detecting translation errors in the original sentences, with a recall reaching
74.0%, improving state-of-the-art techniques by 55.1% on average.
Comment: Accepted to ACM Transactions on Software Engineering and Methodology 2024 (TOSEM'24).
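The consistency check in step (3) can be sketched with a simple bag-of-words containment measure; the metric and threshold here are illustrative choices rather than STP's exact formulation:

```python
from collections import Counter

def bow_consistent(full_translation, pruned_translation, threshold=0.8):
    """Metamorphic check: the translation of a pruned sentence should be
    (mostly) contained in the translation of the original sentence; low
    overlap flags a suspicious translation pair."""
    full = Counter(full_translation.lower().split())
    pruned = Counter(pruned_translation.lower().split())
    overlap = sum((full & pruned).values())      # multiset intersection
    return overlap / max(sum(pruned.values()), 1) >= threshold

consistent = bow_consistent("the cat sat on the mat", "the cat sat")   # True
suspicious = not bow_consistent("the dog ran away", "the cat sat")     # True
```

Because the check only needs the system's outputs, it works on black-box translators such as Google Translate and Bing Microsoft Translator.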
APPT: Boosting Automated Patch Correctness Prediction via Fine-tuning Pre-trained Models
Automated program repair (APR) aims to fix software bugs automatically
without human debugging efforts and plays a crucial role in software
development and maintenance. Despite its promise, APR is still challenged by a
long-standing overfitting problem (i.e., a generated patch may be plausible but
overfitting). Various techniques have thus been proposed to address the
overfitting problem. Recently, researchers have employed BERT to extract code
features, which are then used to train a classifier for patch correctness
prediction. However, BERT is restricted to feature extraction for classifier
training without benefiting from the training process, potentially generating
sub-optimal vector representations for patched code snippets. In this paper, we
propose APPT, a pre-trained model-based automated patch correctness assessment
technique by both pre-training and fine-tuning. APPT adopts a pre-trained model
as the encoder stack, followed by an LSTM stack and a deep learning classifier.
More importantly, the pre-trained model is fine-tuned in conjunction with other
components as a whole pipeline to fully adapt it specifically for reasoning
about patch correctness. We conduct an extensive experiment on 1,183 Defects4J
patches and the experimental results show that APPT achieves a prediction
accuracy of 79.7% and a recall of 83.2%, outperforming CACHE by 4.3% and 6.7%, respectively.
Our additional investigation on 49,694 real-world patches shows that APPT
achieves the best performance compared with existing representation learning
techniques. We further investigate the impact of each component and find that
they all positively contribute to APPT, e.g., the fine-tuning process and the
LSTM stack increase the F1-score by 10.22% and 4.11%, respectively. We also show
that adopting advanced pre-trained models can further provide substantial
advancement, highlighting the generalizability of APPT.
Comment: Accepted to IEEE Transactions on Software Engineering 2024 (TSE'24).
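The reported numbers correspond to standard classification metrics over patch correctness labels; a minimal sketch (with toy predictions, not APPT's data):

```python
def patch_metrics(preds, labels):
    """Accuracy, recall, and F1 for patch correctness prediction,
    treating 'correct patch' as the positive class."""
    pairs = list(zip(preds, labels))
    tp = sum(p and l for p, l in pairs)          # correctly accepted
    fp = sum(p and not l for p, l in pairs)      # overfitting accepted
    fn = sum(not p and l for p, l in pairs)      # correct but rejected
    acc = sum(p == l for p, l in pairs) / len(pairs)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, rec, f1

acc, rec, f1 = patch_metrics([1, 1, 0, 0], [1, 0, 0, 0])  # toy labels
```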
A Survey of Source Code Search: A 3-Dimensional Perspective
(Source) code search has attracted wide attention from software engineering researchers
because it can improve the productivity and quality of software development.
Given a functionality requirement usually described in a natural language
sentence, a code search system can retrieve code snippets that satisfy the
requirement from a large-scale code corpus, e.g., GitHub. To realize effective
and efficient code search, many techniques have been proposed successively.
These techniques improve code search performance mainly by optimizing three
core components: the query understanding component, the code understanding
component, and the query-code matching component. In this paper, we provide a
3-dimensional perspective survey for code search. Specifically, we categorize
existing code search studies into query-end optimization techniques, code-end
optimization techniques, and match-end optimization techniques according to the
specific components they optimize. Considering that each end can be optimized
independently and contributes to the code search performance, we treat each end
as a dimension. Therefore, this survey is 3-dimensional in nature, and it
provides a comprehensive summary of each dimension in detail. To understand the
research trends of the three dimensions in existing code search studies, we
systematically review 68 relevant publications. Different from existing code
search surveys that only focus on the query end or code end or introduce
various aspects shallowly (including codebase, evaluation metrics, modeling
technique, etc.), our survey provides a more nuanced analysis and review of the
evolution and development of the underlying techniques used in the three ends.
Based on a systematic review and summary of existing work, we outline several
open challenges and opportunities at the three ends that remain to be addressed
in future work.
Comment: Submitted to ACM Transactions on Software Engineering and Methodology.
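At the match end, a technique ultimately reduces to scoring query-code pairs; below is a deliberately simple bag-of-words cosine similarity standing in for the learned embeddings that modern match-end techniques use:

```python
import math
from collections import Counter

def cosine(a, b):
    """Bag-of-words cosine similarity between a query and a snippet,
    a simple stand-in for learned query/code representations."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

query = "read file lines"
corpus = ["open file and read lines", "sort a list of numbers"]
best = max(corpus, key=lambda snippet: cosine(query, snippet))
```

Query-end and code-end optimizations would respectively rewrite `query` and enrich the snippets before this scoring step, which is what makes the three ends independently improvable.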