Theoretical study of the open-flavor tetraquark $T_{c\bar{s}}(2900)$ in the process $\Lambda_b\to K^0D^0\Lambda$

Author: Chen Dian-Yong
Duan Man-Yu
Lyu Wen-Tao
Lyu Yun-He
Wang En
Wang Guan-Ying
Publication date: 22/10/2023

Recently, the LHCb Collaboration has measured the processes $B^0\to\bar{D}^0D_s^+\pi^-$ and $B^+\to\bar{D}^0D_s^+\pi^+$, where the $D_s^+\pi^-$ and $D_s^+\pi^+$ invariant mass distributions show significant signals of two new open-flavor tetraquark states, $T_{c\bar{s}}(2900)^0$ and $T_{c\bar{s}}(2900)^{++}$, as two members of an isospin triplet. In this work, we have investigated the process $\Lambda_b\to K^0D^0\Lambda$ by taking into account the tetraquark state $T_{c\bar{s}}(2900)^0$ and the intermediate nucleon resonance $N^*(1535)$, which could be dynamically generated by the $D^*K^*/D^*_s\rho$ interactions and the pseudoscalar meson-octet baryon interactions, respectively. Our results show that a clear peak of the open-flavor tetraquark $T_{c\bar{s}}(2900)$ may appear in the $K^0D^0$ invariant mass distribution of the process $\Lambda_b\to K^0D^0\Lambda$, which could be tested by future experiments.

Comment: 9 pages, 11 figures, 1 table
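As a quick plausibility check on the abstract above, the following minimal sketch computes the kinematically allowed range of the $K^0D^0$ invariant mass in the three-body decay $\Lambda_b\to K^0D^0\Lambda$. The particle masses are approximate PDG values supplied here as assumptions (they do not appear in the abstract), and the function name `invariant_mass_window` is hypothetical.

```python
# Approximate PDG masses in MeV. These are assumed inputs for illustration,
# not values taken from the abstract.
M_LAMBDA_B = 5619.60
M_LAMBDA = 1115.68
M_K0 = 497.61
M_D0 = 1864.84

# Nominal mass implied by the state's name, T_cs(2900), in MeV.
M_TCS_NOMINAL = 2900.0


def invariant_mass_window(parent, spectator, m1, m2):
    """Return (lower, upper) limits of the (m1, m2) pair invariant mass
    in the three-body decay parent -> m1 + m2 + spectator."""
    lower = m1 + m2            # pair produced at rest relative to each other
    upper = parent - spectator  # spectator produced at rest in the parent frame
    return lower, upper


lower, upper = invariant_mass_window(M_LAMBDA_B, M_LAMBDA, M_K0, M_D0)
inside = lower < M_TCS_NOMINAL < upper
print(f"K0 D0 invariant mass window: {lower:.2f} to {upper:.2f} MeV")
print(f"Nominal 2900 MeV inside window: {inside}")
```

With these inputs the window runs from roughly 2362 MeV to 4504 MeV, so a state near 2900 MeV is kinematically accessible in the $K^0D^0$ spectrum. This says nothing about the size of the signal, which depends on the dynamics studied in the paper.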

arXiv.org e-Print Archive

Generative Type Inference for Python

Author: Gao Cuiyun
Lyu Michael R.
Peng Yun
Wang Chaozheng
Wang Wenxuan
Publication date: 18/07/2023

Python is a popular dynamic programming language, evidenced by its ranking as the second most commonly used language on GitHub. However, its dynamic type system can lead to type errors, which has led researchers to explore automatic type inference approaches for Python programs. Rule-based type inference approaches can ensure the accuracy of predicted variable types, but they suffer from low coverage. Supervised type inference approaches, while feature-agnostic, require large, high-quality annotated datasets and are limited to pre-defined types. Cloze-style approaches, which are zero-shot, reformulate the type inference problem as a fill-in-the-blank problem, but their performance is limited. This paper introduces TypeGen, a few-shot generative type inference approach that incorporates domain knowledge from static analysis. TypeGen creates chain-of-thought (COT) prompts by translating the type inference steps of static analysis into prompts based on type dependency graphs (TDGs), enabling language models to learn from how static analysis infers types. By combining COT prompts with code slices and type hints, TypeGen constructs example prompts from human annotations. TypeGen requires only a few annotated examples to teach language models to generate similar COT prompts via in-context learning. Moreover, TypeGen enhances the interpretability of results through an input-explanation-output strategy. Experiments show that, using only five examples, TypeGen outperforms the best baseline, Type4Py, by 10.0% for argument type prediction and 22.5% for return value type prediction in terms of top-1 Exact Match. Furthermore, TypeGen achieves substantial improvements of 27% to 84% over the zero-shot performance of large language models with parameter sizes ranging from 1.3B to 175B in terms of top-1 Exact Match.

Comment: This paper has been accepted by ASE'2
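To make the idea of translating a type dependency graph into a chain-of-thought prompt more concrete, here is a minimal sketch. It is not TypeGen's actual prompt format or implementation: the graph representation (a dict mapping each variable to its defining expression and the variables it depends on), the sentence templates, and the function name `cot_from_tdg` are all assumptions made for illustration.

```python
def cot_from_tdg(target, tdg):
    """Build a chain-of-thought style prompt for `target`.

    `tdg` is a hypothetical type dependency graph: it maps a variable name
    to a tuple (defining_expression, [names it depends on]).
    Dependencies are explained before the variables that use them.
    """
    steps = []
    seen = set()

    def visit(name):
        # Skip names already explained and names with no known definition.
        if name in seen or name not in tdg:
            return
        seen.add(name)
        expr, deps = tdg[name]
        for dep in deps:
            visit(dep)  # explain dependencies first (post-order traversal)
        if deps:
            steps.append(
                f"{name} is assigned from {expr}, "
                f"so its type depends on {', '.join(deps)}."
            )
        else:
            steps.append(f"{name} is assigned from {expr}.")

    visit(target)
    # Leave the final answer open for the language model to complete.
    steps.append(f"Therefore, the type of {target} is")
    return " ".join(steps)


example_tdg = {
    "n": ("len(items)", ["items"]),
    "items": ("[]", []),
}
prompt = cot_from_tdg("n", example_tdg)
print(prompt)
```

The post-order traversal mirrors how a static analyzer would resolve the types of dependencies before the type of the variable that uses them. In a few-shot setting, a handful of such prompts with the final type filled in would serve as the in-context examples.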

arXiv.org e-Print Archive

Domain Knowledge Matters: Improving Prompts with Fix Templates for Repairing Python Type Errors

Author: Gao Cuiyun
Gao Shuzheng
Huo Yintong
Lyu Michael R.
Peng Yun
Publication date: 02/06/2023

Although the dynamic type system of Python makes it easier for developers to write Python programs, it also introduces type errors at run-time. Rule-based approaches exist for automatically repairing Python type errors. These approaches can generate accurate patches, but they require domain experts to design patch synthesis rules and suffer from low template coverage of real-world type errors. Learning-based approaches reduce the manual effort of designing patch synthesis rules. Among them, the prompt-based approach, which leverages the knowledge base of code pre-trained models via pre-defined prompts, obtains state-of-the-art performance in general program repair tasks. However, such prompts are manually defined and do not involve any specific clues for repairing Python type errors, resulting in limited effectiveness. How to automatically improve prompts with domain knowledge for type error repair is challenging yet under-explored. In this paper, we present TypeFix, a novel prompt-based approach that incorporates fix templates for repairing Python type errors. TypeFix first mines generalized fix templates via a novel hierarchical clustering algorithm. The identified fix templates indicate the common edit patterns and contexts of existing type error fixes. TypeFix then generates code prompts for code pre-trained models by employing the generalized fix templates as domain knowledge, in which the masks are adaptively located for each type error instead of being pre-determined. Experiments on two benchmarks, BugsInPy and TypeBugs, show that TypeFix successfully repairs 26 and 55 type errors, outperforming the best baseline approach by 9 and 14, respectively. In addition, the proposed fix template mining approach can cover 75% of developers' patches in both benchmarks, exceeding the best rule-based approach, PyTER, by more than 30%.

Comment: This paper has been accepted by ICSE'2

arXiv.org e-Print Archive

API Usage Recommendation via Multi-View Heterogeneous Graph Representation Learning

Author: Chen Yujia
Gao Cuiyun
Lyu Michael R.
Peng Yun
Ren Xiaoxue
Xia Xin
Publication date: 03/08/2022

Developers often need to decide which APIs to use for the functions being implemented. With the ever-growing number of APIs and libraries, it becomes increasingly difficult for developers to find appropriate APIs, indicating the necessity of automatic API usage recommendation. Previous studies adopt statistical models or collaborative filtering methods to mine implicit API usage patterns for recommendation. However, they rely on the occurrence frequencies of APIs for mining usage patterns and are thus prone to fail for low-frequency APIs. In addition, prior studies generally regard the API call interaction graph as a homogeneous graph, ignoring the rich information (e.g., edge types) in the structure graph. In this work, we propose a novel method named MEGA for improving recommendation accuracy, especially for low-frequency APIs. Specifically, besides the call interaction graph, MEGA considers two new heterogeneous graphs: a global API co-occurrence graph enriched with API frequency information and a hierarchical structure graph enriched with project component information. With the three multi-view heterogeneous graphs, MEGA can capture API usage patterns more accurately. Experiments on three Java benchmark datasets demonstrate that MEGA significantly outperforms the baseline models by at least 19% with respect to the Success Rate@1 metric. For low-frequency APIs in particular, MEGA outperforms the baselines by at least 55% in Success Rate@1.

arXiv.org e-Print Archive

Less is More? An Empirical Study on Configuration Issues in Python PyPI Ecosystem

Author: Gao Cuiyun
Hu Ruida
Li Shuqing
Lyu Michael R.
Peng Yun
Wang Ruoke
Publication date: 19/10/2023

Python is widely used in the open-source community, largely owing to the extensive support from diverse third-party libraries within the PyPI ecosystem. Nevertheless, the use of third-party libraries can lead to dependency conflicts, prompting researchers to develop dependency conflict detectors. Efforts have also been made to infer dependencies automatically. These approaches focus on version-level checks and inference, based on the assumption that the configurations of libraries in the PyPI ecosystem are correct. However, our study reveals that this assumption is not universally valid, and that relying solely on version-level checks is inadequate for ensuring compatible run-time environments. In this paper, we conduct a comprehensive empirical study of configuration issues in the PyPI ecosystem. Specifically, we propose PyCon, a source-level detector for potential configuration issues. PyCon employs three distinct checks, targeting the setup, packing, and usage stages of libraries, respectively. To evaluate the effectiveness of current automatic dependency inference approaches, we build a benchmark called VLibs, comprising library releases that pass all three checks of PyCon. We identify 15 kinds of configuration issues and find that 183,864 library releases suffer from potential configuration issues. Remarkably, 68% of these issues can only be detected via the source-level check. Our experimental results show that the most advanced automatic dependency inference approach, PyEGo, can successfully infer dependencies for only 65% of library releases. The primary failures stem from dependency conflicts and the absence of required libraries in the generated configurations. Based on the empirical results, we derive six findings and draw two implications for open-source developers and future research in automatic dependency inference.

Comment: This paper has been accepted by ICSE 202

arXiv.org e-Print Archive

Fastened CROWN: Tightened Neural Network Robustness Certificates

Author: Daniel Luca
Ko Ching-Yun
Kong Zhifeng
Lin Dahua
Lyu Zhaoyang
Wong Ngai
Publication date: 01/12/2019

The rapid growth of deep learning applications in real life is accompanied by severe safety concerns. To mitigate this uneasy phenomenon, much research has been done providing reliable evaluations of the fragility level in different deep neural networks. Apart from devising adversarial attacks, quantifiers that certify safeguarded regions have also been designed in the past five years. The summarizing work of Salman et al. unifies a family of existing verifiers under a convex relaxation framework. We draw inspiration from such work and further demonstrate the optimality of deterministic CROWN (Zhang et al. 2018) solutions in a given linear programming problem under mild constraints. Given this theoretical result, the computationally expensive linear programming based method is shown to be unnecessary. We then propose an optimization-based approach, FROWN (Fastened CROWN): a general algorithm to tighten robustness certificates for neural networks. Extensive experiments on various networks trained individually verify the effectiveness of FROWN in safeguarding larger robust regions.

Comment: Zhaoyang Lyu and Ching-Yun Ko contributed equally, accepted to AAAI 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

The Annual Rhythmic Differentiation of Populus davidiana Growth–Climate Response Under a Warming Climate in The Greater Hinggan Mountains

Author: Chen Zhenju
Cui Di
Jin Yuting
Li Junxia
Lyu Zhaoyang
Yun Ruixin
Zhao Ying
Publication venue: Hosted by Utah State University Libraries
Publication date: 20/03/2021

The stability and balance of forest ecosystems have been seriously affected by climate change. Herein, we use dendrochronological methods to investigate the radial growth and climate response of pioneer tree species in the southern margin of cold temperate coniferous forest, based on Populus davidiana growing on the Greater Hinggan Mountains in northeastern China. Correlations of P. davidiana growth with temperature and precipitation over a year (October–September) were rhythmically opposed: while temperatures in the previous October–June (winter and spring) and in May–September (growing season) respectively inhibited and promoted radial growth of P. davidiana (p < 0.01), precipitation in the same periods respectively promoted and inhibited growth (p < 0.01). High temperature or less rain/snow in winter and early spring, and low temperature or excess rainfall in summer, are not conducive to P. davidiana growth, and vice versa (p < 0.01). In addition, in March–April, when air temperature was above 0 °C and ground temperature below 0 °C, physiological drought caused significant growth inhibition in P. davidiana (p < 0.05). In general, temperature plays a driving and controlling role in the synergistic effect of temperature and precipitation on P. davidiana growth. Under current conditions of available water supply, changes in temperature, especially warming, are beneficial to the growth of P. davidiana in the study area. The current climate conditions promote the growth of P. davidiana, the pioneer species, in contrast to the growth inhibition of Larix gmelinii, the dominant species. Thus, the structure and function of boreal forest might be changed under global warming by irreversible alterations in the growth and composition of coniferous and broadleaf tree species in the forest.

DigitalCommons@USU

Machine Learning Feature Extraction Based on Binary Pixel Quantification Using Low-Resolution Images for Application of Unmanned Ground Vehicles in Apple Orchards

Author: Choi Byeongdae
Lyu Hong-Kun
Yun Sanghun
Publication venue: MDPI AG
Publication date: 01/12/2020

Deep learning and machine learning (ML) technologies have been implemented in various applications, and various agricultural technologies are being developed based on image-based object recognition. We propose an orchard free-space recognition technology suitable for developing small-scale agricultural unmanned ground vehicle (UGV) autonomous mobile equipment using a low-cost, lightweight processor. We designed an algorithm that minimizes the amount of input data to be processed by the ML algorithm through low-resolution grayscale images and image binarization. In addition, we propose an ML feature extraction method based on binary pixel quantification that can be applied to an ML classifier to detect free space for autonomous movement of UGVs from binary images. Here, the ML feature is extracted by detecting the local-lowest points in segments of a binarized image and by defining 33 variables, including the local-lowest points, to detect the bottom of a tree trunk. We trained six ML models in order to select a suitable model for trunk bottom detection, and we analyzed and compared the performance of the trained models. The ensemble model demonstrated the best performance, and a test was performed using this model to detect apple tree trunks in 100 new images. Experimental results indicate that it is possible to recognize free space in an apple orchard environment by training on approximately 100 low-resolution grayscale images. © 2020 by the authors.
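The binarization and local-lowest-point steps described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the threshold value, the choice of dark pixels as foreground, the splitting of the image into equal-width vertical segments, and the function names `binarize` and `local_lowest_points` are all assumptions. The full 33-variable feature definition is not reproduced here.

```python
def binarize(gray, threshold=128):
    """Convert a grayscale image (list of rows of 0-255 values) into a
    binary image. Assumption: pixels darker than `threshold` are
    foreground (1), everything else is background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def local_lowest_points(binary, n_segments):
    """For each vertical segment of the image, return the (row, col) of the
    lowest foreground pixel (largest row index), or None if the segment
    contains no foreground pixel."""
    height = len(binary)
    width = len(binary[0])
    seg_width = width // n_segments
    points = []
    for s in range(n_segments):
        start = s * seg_width
        # The last segment absorbs any leftover columns.
        end = width if s == n_segments - 1 else start + seg_width
        lowest = None
        # Scan from the bottom row upward and stop at the first hit.
        for row in range(height - 1, -1, -1):
            cols = [c for c in range(start, end) if binary[row][c] == 1]
            if cols:
                lowest = (row, cols[0])
                break
        points.append(lowest)
    return points


gray_image = [
    [200, 200, 50, 200],
    [200, 50, 50, 200],
    [50, 200, 200, 200],
    [200, 200, 200, 200],
]
binary_image = binarize(gray_image)
points = local_lowest_points(binary_image, n_segments=2)
print(points)
```

Working on a binary image keeps the per-pixel work to a single comparison, which fits the abstract's goal of running on a low-cost processor. In a pipeline like the one described, points such as these would be combined with other variables and passed to a classifier.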

DGIST Library Institutional Repository