Theoretical study of the open-flavor tetraquark in the process
Recently, the LHCb Collaboration has measured the processes
and , where the
and invariant mass distributions show the significant
signals of two new open-flavor tetraquark states and
, as two members of the isospin triplet. In this work, we
have investigated the process by taking into
account the intermediate nucleon resonance and the tetraquark state
, which could be dynamically generated by the
interactions of the and the pseudoscalar mesons-octet
baryons, respectively. Our results show that a clear peak of the open-flavor
tetraquark may appear in the invariant mass
distribution of the process , which could be tested
by future experiments. Comment: 9 pages, 11 figures, 1 tabl
Generative Type Inference for Python
Python is a popular dynamic programming language, evidenced by its ranking as
the second most commonly used language on GitHub. However, its dynamic type
system can lead to potential type errors, leading researchers to explore
automatic type inference approaches for Python programs. The rule-based type
inference approaches can ensure the accuracy of predicted variable types, but
they suffer from low coverage problems. Supervised type inference approaches,
while feature-agnostic, require large, high-quality annotated datasets and are
limited to pre-defined types. As zero-shot approaches, the cloze-style
approaches reformulate the type inference problem into a fill-in-the-blank
problem. However, their performance is limited.
This paper introduces TypeGen, a few-shot generative type inference approach
that incorporates static domain knowledge from static analysis. TypeGen creates
chain-of-thought (COT) prompts by translating the type inference steps of
static analysis into prompts based on the type dependency graphs (TDGs),
enabling language models to learn from how static analysis infers types. By
combining COT prompts with code slices and type hints, TypeGen constructs
example prompts from human annotations. TypeGen only requires very few
annotated examples to teach language models to generate similar COT prompts via
in-context learning. Moreover, TypeGen enhances the interpretability of results
through the use of the input-explanation-output strategy. Experiments show that
TypeGen outperforms the best baseline Type4Py by 10.0% for argument type
prediction and 22.5% in return value type prediction in terms of top-1 Exact
Match by using only five examples. Furthermore, TypeGen achieves substantial
improvements of 27% to 84% compared to the zero-shot performance of large
language models with parameter sizes ranging from 1.3B to 175B in terms of
top-1 Exact Match. Comment: This paper has been accepted by ASE'2
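The abstract above reports results in terms of top-1 Exact Match. As a minimal sketch of how such a metric is commonly computed (the function name, the input layout, and the use of plain string equality are assumptions for illustration, not the paper's evaluation code):

```python
from typing import Sequence


def top1_exact_match(
    predictions: Sequence[Sequence[str]],
    ground_truth: Sequence[str],
) -> float:
    """Fraction of items whose first-ranked predicted type equals the annotation.

    Hypothetical helper: `predictions[i]` is a ranked list of type strings for
    item i, and `ground_truth[i]` is its annotated type. Equality is a plain
    string comparison, which is an assumption of this sketch.
    """
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(predictions, ground_truth)
        if ranked and ranked[0] == truth  # only the top-ranked candidate counts
    )
    return hits / len(ground_truth)


if __name__ == "__main__":
    preds = [["int", "str"], ["str"], []]
    truth = ["int", "bool", "float"]
    print(top1_exact_match(preds, truth))
```

An item with an empty candidate list counts as a miss here; a real evaluation may normalize type strings before comparing them.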
Domain Knowledge Matters: Improving Prompts with Fix Templates for Repairing Python Type Errors
Although the dynamic type system of Python facilitates the developers in
writing Python programs, it also brings type errors at run-time. There exist
rule-based approaches for automatically repairing Python type errors. The
approaches can generate accurate patches but they require domain experts to
design patch synthesis rules and suffer from low template coverage of
real-world type errors. Learning-based approaches alleviate the manual efforts
in designing patch synthesis rules. Among the learning-based approaches, the
prompt-based approach, which leverages the knowledge base of code pre-trained
models via pre-defined prompts, obtains state-of-the-art performance in general
program repair tasks. However, such prompts are manually defined and do not
involve any specific clues for repairing Python type errors, resulting in
limited effectiveness. How to automatically improve prompts with the domain
knowledge for type error repair is challenging yet under-explored. In this
paper, we present TypeFix, a novel prompt-based approach with fix templates
incorporated for repairing Python type errors. TypeFix first mines generalized
fix templates via a novel hierarchical clustering algorithm. The identified fix
templates indicate the common edit patterns and contexts of existing type error
fixes. TypeFix then generates code prompts for code pre-trained models by
employing the generalized fix templates as domain knowledge, in which the masks
are adaptively located for each type error instead of being pre-determined.
Experiments on two benchmarks, including BugsInPy and TypeBugs, show that
TypeFix successfully repairs 26 and 55 type errors, outperforming the best
baseline approach by 9 and 14, respectively. Besides, the proposed fix template
mining approach can cover 75% of developers' patches in both benchmarks,
exceeding the best rule-based approach, PyTER, by more than 30%. Comment: This paper has been accepted by ICSE'2
API Usage Recommendation via Multi-View Heterogeneous Graph Representation Learning
Developers often need to decide which APIs to use for the functions being
implemented. With the ever-growing number of APIs and libraries, it becomes
increasingly difficult for developers to find appropriate APIs, indicating the
necessity of automatic API usage recommendation. Previous studies adopt
statistical models or collaborative filtering methods to mine the implicit API
usage patterns for recommendation. However, they rely on the occurrence
frequencies of APIs for mining usage patterns, and are thus prone to fail for
low-frequency APIs. Besides, prior studies generally regard the API call
interaction graph as a homogeneous graph, ignoring the rich information (e.g.,
edge types) in the structure graph. In this work, we propose a novel method
named MEGA for improving the recommendation accuracy especially for the
low-frequency APIs. Specifically, besides the call interaction graph, MEGA
considers two new heterogeneous graphs: a global API co-occurrence graph
enriched with API frequency information and a hierarchical structure graph
enriched with project component information. With the three multi-view
heterogeneous graphs, MEGA can capture the API usage patterns more accurately.
Experiments on three Java benchmark datasets demonstrate that MEGA
significantly outperforms the baseline models by at least 19% with respect to
the Success Rate@1 metric. In particular, for the low-frequency APIs, MEGA
improves on the baselines by at least 55% in terms of Success Rate@1.
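The abstract above evaluates recommendations with Success Rate@1. A minimal sketch of Success Rate@k as it is commonly defined for recommendation tasks follows (the function name, the data layout, and the example API names are assumptions for illustration, not the paper's code):

```python
from typing import Sequence, Set


def success_rate_at_k(
    recommendations: Sequence[Sequence[str]],
    relevant: Sequence[Set[str]],
    k: int = 1,
) -> float:
    """Fraction of queries with at least one relevant API in the top-k list.

    Hypothetical helper: `recommendations[i]` is the ranked API list for
    query i, and `relevant[i]` is the set of APIs accepted as correct.
    """
    if len(recommendations) != len(relevant):
        raise ValueError("inputs must have the same length")
    if not relevant:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(recommendations, relevant)
        if any(api in truth for api in ranked[:k])  # a hit anywhere in the top k
    )
    return hits / len(relevant)


if __name__ == "__main__":
    recs = [["List.add", "Map.put"], ["File.open", "File.close"]]
    gold = [{"List.add"}, {"File.close"}]
    print(success_rate_at_k(recs, gold, k=1))
    print(success_rate_at_k(recs, gold, k=2))
```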
Less is More? An Empirical Study on Configuration Issues in Python PyPI Ecosystem
Python is widely used in the open-source community, largely owing to the
extensive support from diverse third-party libraries within the PyPI ecosystem.
Nevertheless, the utilization of third-party libraries can potentially lead to
conflicts in dependencies, prompting researchers to develop dependency conflict
detectors. Moreover, endeavors have been made to automatically infer
dependencies. These approaches focus on version-level checks and inference,
based on the assumption that configurations of libraries in the PyPI ecosystem
are correct. However, our study reveals that this assumption is not universally
valid, and relying solely on version-level checks proves inadequate in ensuring
compatible run-time environments. In this paper, we conduct an empirical study
to comprehensively examine the configuration issues in the PyPI ecosystem.
Specifically, we propose PyCon, a source-level detector, for detecting
potential configuration issues. PyCon employs three distinct checks, targeting
the setup, packing, and usage stages of libraries, respectively. To evaluate
the effectiveness of the current automatic dependency inference approaches, we
build a benchmark called VLibs, comprising library releases that pass all three
checks of PyCon. We identify 15 kinds of configuration issues and find that
183,864 library releases suffer from potential configuration issues.
Remarkably, 68% of these issues can only be detected via the source-level
check. Our experiment results show that the most advanced automatic dependency
inference approach, PyEGo, can successfully infer dependencies for only 65% of
library releases. The primary failures stem from dependency conflicts and the
absence of required libraries in the generated configurations. Based on the
empirical results, we derive six findings and draw two implications for
open-source developers and future research in automatic dependency inference. Comment: This paper has been accepted by ICSE 202
Fastened CROWN: Tightened Neural Network Robustness Certificates
The rapid growth of deep learning applications in real life is accompanied by
severe safety concerns. To mitigate this uneasy phenomenon, much research has
been done providing reliable evaluations of the fragility level in different
deep neural networks. Apart from devising adversarial attacks, quantifiers that
certify safeguarded regions have also been designed in the past five years. The
summarizing work of Salman et al. unifies a family of existing verifiers under
a convex relaxation framework. We draw inspiration from such work and further
demonstrate the optimality of deterministic CROWN (Zhang et al. 2018) solutions
in a given linear programming problem under mild constraints. Given this
theoretical result, the computationally expensive linear programming based
method is shown to be unnecessary. We then propose an optimization-based
approach FROWN (Fastened CROWN): a general algorithm
to tighten robustness certificates for neural networks. Extensive experiments
on various networks trained individually verify the effectiveness of FROWN in
safeguarding larger robust regions. Comment: Zhaoyang Lyu and Ching-Yun Ko contributed equally, accepted to AAAI
202
The Annual Rhythmic Differentiation of Populus davidiana Growth–Climate Response Under a Warming Climate in The Greater Hinggan Mountains
The stability and balance of forest ecosystems have been seriously affected by climate change. Herein, we use dendrochronological methods to investigate the radial growth and climate response of a pioneer tree species at the southern margin of the cold temperate coniferous forest, based on Populus davidiana growing in the Greater Hinggan Mountains in northeastern China. Correlations of P. davidiana growth with temperature and precipitation over a year (October–September) were rhythmically opposed: while temperatures in the previous October–June (winter and spring) and in May–September (growing season) respectively inhibited and promoted radial growth of P. davidiana (p < 0.01), precipitation in the same periods respectively promoted and inhibited growth (p < 0.01). High temperature or less rain/snow in winter and early spring, and low temperature or excess rainfall in summer, are not conducive to P. davidiana growth, and vice versa (p < 0.01). In addition, in March–April, when air temperature was above 0 °C and ground temperature below 0 °C, physiological drought caused significant growth inhibition in P. davidiana (p < 0.05). In general, temperature plays a driving and controlling role in the synergistic effect of temperature and precipitation on P. davidiana growth. Under current conditions of available water supply, changes in temperature, especially warming, are beneficial to the growth of P. davidiana in the study area. The current climate conditions promote the growth of P. davidiana, the pioneer species, in contrast to the growth inhibition of Larix gmelinii, the dominant species. Thus, the structure and function of the boreal forest might be changed under global warming by irreversible alterations in the growth and composition of coniferous and broadleaf tree species in the forest.
Machine Learning Feature Extraction Based on Binary Pixel Quantification Using Low-Resolution Images for Application of Unmanned Ground Vehicles in Apple Orchards
Deep learning and machine learning (ML) technologies have been implemented in various applications, and various agriculture technologies are being developed based on image-based object recognition technology. We propose an orchard environment free space recognition technology suitable for developing small-scale agricultural unmanned ground vehicle (UGV) autonomous mobile equipment using a low-cost lightweight processor. We designed an algorithm to minimize the amount of input data to be processed by the ML algorithm through low-resolution grayscale images and image binarization. In addition, we propose an ML feature extraction method based on binary pixel quantification that can be applied to an ML classifier to detect free space for autonomous movement of UGVs from binary images. Here, the ML feature is extracted by detecting the local-lowest points in segments of a binarized image and by defining 33 variables, including local-lowest points, to detect the bottom of a tree trunk. We trained six ML models to select a suitable ML model for trunk bottom detection among various ML models, and we analyzed and compared the performance of the trained models. The ensemble model demonstrated the best performance, and a test was performed using this ML model to detect apple tree trunks from 100 new images. Experimental results indicate that it is possible to recognize free space in an apple orchard environment by learning using approximately 100 low-resolution grayscale images. © 2020 by the authors.
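The abstract above describes binarizing a low-resolution grayscale image and detecting local-lowest points in segments of the binary image. The sketch below illustrates one possible reading of that step. The threshold value, the choice of dark pixels as foreground, the use of equal-width vertical strips as segments, and all function names are assumptions for illustration; the paper's 33 variables are not reproduced here.

```python
from typing import List, Optional, Tuple


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Mark pixels darker than `threshold` as foreground (1), others as 0.

    The threshold and the dark-is-foreground convention are assumptions.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def lowest_points(
    binary: List[List[int]], n_segments: int
) -> List[Optional[Tuple[int, int]]]:
    """For each vertical strip, return (row, col) of its lowest foreground pixel.

    "Lowest" means the largest row index, i.e. closest to the image bottom.
    A strip without foreground pixels yields None.
    """
    height, width = len(binary), len(binary[0])
    points: List[Optional[Tuple[int, int]]] = []
    for s in range(n_segments):
        start = s * width // n_segments
        end = (s + 1) * width // n_segments
        found: Optional[Tuple[int, int]] = None
        for row in range(height - 1, -1, -1):  # scan from the bottom row upward
            cols = [c for c in range(start, end) if binary[row][c] == 1]
            if cols:
                found = (row, cols[0])  # leftmost foreground pixel on that row
                break
        points.append(found)
    return points


if __name__ == "__main__":
    gray = [
        [200, 200, 200, 200],
        [50, 200, 200, 200],
        [50, 200, 200, 40],
        [200, 200, 200, 200],
    ]
    print(lowest_points(binarize(gray), n_segments=2))
```

Points like these could then serve as inputs to a classifier that decides whether a segment contains a trunk bottom, which is the role the abstract assigns to the extracted features.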