SCALE: Constructing Structured Natural Language Comment Trees for Software Vulnerability Detection
Recently, there has been a growing interest in automatic software
vulnerability detection. Pre-trained model-based approaches have demonstrated
better performance than other Deep Learning (DL)-based approaches in
detecting vulnerabilities. However, the existing pre-trained model-based
approaches generally employ code sequences as input during prediction, and may
ignore vulnerability-related structural information, as reflected in the
following two aspects. First, they tend to fail to infer the semantics of the
code statements with complex logic such as those containing multiple operators
and pointers. Second, they struggle to comprehend various code execution
sequences, which is essential for precise vulnerability detection.
To mitigate these challenges, we propose a Structured Natural Language Comment
tree-based vulnerAbiLity dEtection framework based on the pre-trained models,
named SCALE. The proposed Structured Natural Language Comment Tree (SCT)
integrates the semantics of code statements with code execution sequences based
on the Abstract Syntax Trees (ASTs). Specifically, SCALE comprises three main
modules: (1) Comment Tree Construction, which aims at enhancing the model's
ability to infer the semantics of code statements by first incorporating Large
Language Models (LLMs) for comment generation and then adding the comment node
to ASTs. (2) Structured Natural Language Comment Tree Construction, which aims
at explicitly involving code execution sequences by combining the code syntax
templates with the comment tree. (3) SCT-Enhanced Representation, which finally
incorporates the constructed SCTs to better capture vulnerability patterns.
Comment: Accepted by ISSTA 202
What Makes Good In-context Demonstrations for Code Intelligence Tasks with LLMs?
Pre-trained models of source code have gained widespread popularity in many
code intelligence tasks. Recently, with the scaling of the model and corpus
size, large language models have shown the ability of in-context learning
(ICL). ICL employs task instructions and a few examples as demonstrations, and
then inputs the demonstrations to the language models for making predictions.
This new learning paradigm is training-free and has shown impressive
performance in various natural language processing and code intelligence tasks.
However, the performance of ICL heavily relies on the quality of
demonstrations, e.g., the selected examples. It is important to systematically
investigate how to construct a good demonstration for code-related tasks. In
this paper, we empirically explore the impact of three key factors on the
performance of ICL in code intelligence tasks: the selection, order, and number
of demonstration examples. We conduct extensive experiments on three code
intelligence tasks including code summarization, bug fixing, and program
synthesis. Our experimental results demonstrate that all the above three
factors dramatically impact the performance of ICL in code intelligence tasks.
Additionally, we summarize our findings and provide takeaway suggestions on how
to construct effective demonstrations, taking into account these three
perspectives. We also show that a carefully designed demonstration based on our
findings can lead to substantial improvements over widely-used demonstration
construction methods, e.g., improving BLEU-4, EM, and EM by at least 9.90%,
175.96%, and 50.81% on code summarization, bug fixing, and program synthesis,
respectively.
Comment: This paper is accepted by ASE 202
Risk Factors for Cervical Lymph Node Metastasis of Papillary Thyroid Microcarcinoma: A Single-Center Retrospective Study
Objective. To identify the clinicopathological features correlated with lymph node metastasis (LNM) in patients with papillary thyroid microcarcinoma (PTMC). Methods. Clinical data of 785 PTMC patients who underwent surgical treatment at the Lishui Municipal Central Hospital from September 2008 to December 2017 were retrospectively analyzed. Clinical and pathological risk factors for LNM, central lymph node metastasis (CLNM), and lateral lymph node metastasis (LLNM) were analyzed. Results. LNM was found in 236 (30.2%) patients. Multivariate logistic regression analysis revealed that in PTMC, male gender, age5 mm, bilateral lesions, and extrathyroidal extension were independent risk factors for LNM in general and for CLNM. For LLNM, tumor size >5 mm, multifocal lesions, and extrathyroidal extension were independent risk factors. Conclusions. Identification of risk factors for cervical LNM could assist in individualizing the clinical management of PTMC.
CEPC Conceptual Design Report: Volume 2 - Physics & Detector
The Circular Electron Positron Collider (CEPC) is a large international scientific facility proposed by the Chinese particle physics community to explore the Higgs boson and provide critical tests of the underlying fundamental physics principles of the Standard Model that might reveal new physics. The CEPC, to be hosted in China in a circular underground tunnel of approximately 100 km in circumference, is designed to operate as a Higgs factory producing electron-positron collisions with a center-of-mass energy of 240 GeV. The collider will also operate at around 91.2 GeV, as a Z factory, and at the WW production threshold (around 160 GeV). The CEPC will produce close to one trillion Z bosons, 100 million W bosons and over one million Higgs bosons. The vast amount of bottom quarks, charm quarks and tau-leptons produced in the decays of the Z bosons also makes the CEPC an effective B-factory and tau-charm factory. The CEPC will have two interaction points where two large detectors will be located. This document is the second volume of the CEPC Conceptual Design Report (CDR). It presents the physics case for the CEPC, describes conceptual designs of possible detectors and their technological options, highlights the expected detector and physics performance, and discusses future plans for detector R&D and physics investigations. The final CEPC detectors will be proposed and built by international collaborations but they are likely to be composed of the detector technologies included in the conceptual designs described in this document. A separate volume, Volume I, recently released, describes the design of the CEPC accelerator complex, its associated civil engineering, and strategic alternative scenarios