Neural Transition-based Parsing of Library Deprecations
This paper tackles the challenging problem of automating code updates to fix
deprecated API usages of open source libraries by analyzing their release
notes. Our system employs a three-tier architecture: first, a web crawler
service retrieves deprecation documentation from the web; then a specially
built parser processes those text documents into tree-structured
representations; finally, a client IDE plugin locates and fixes identified
deprecated usages of libraries in a given codebase. The focus of this paper in
particular is the parsing component. We introduce a novel transition-based parser in two variants: one based on a classical feature-engineered classifier and one based on a neural tree encoder. To confirm the effectiveness of our method, we gathered
and labeled a set of 426 API deprecations from 7 well-known Python data science
libraries, and demonstrated that our approach decisively outperforms a non-trivial neural machine translation baseline.
Comment: 11 pages + references and appendix (14 total). This is an edited version of our rejected submission to ESEC/FSE 2022, revised to include a citation of our earlier short paper and to remove all content pertaining to the demo paper submission currently under review for ICSE 202
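As a rough illustration of how a shift/reduce-style transition parser turns a flat token sequence into a tree, a minimal sketch is given below. The transition inventory (SHIFT, REDUCE-*), the Node class, and the predict_transition callback are assumptions made for illustration; they are not the transition system or classifiers described in the paper, though the callback stands in for the role played by either variant (feature-engineered or neural).

```python
# Minimal sketch of a transition-based parsing loop (illustrative, not the paper's system).
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

def parse(tokens, predict_transition):
    """Build a tree by repeatedly asking a classifier for the next transition.

    predict_transition(stack, buffer) -> str stands in for either parser variant.
    """
    stack, buffer = [], list(tokens)
    while buffer or len(stack) > 1:
        action = predict_transition(stack, buffer)
        if action == "SHIFT" and buffer:
            stack.append(Node(buffer.pop(0)))        # move the next token onto the stack
        elif action.startswith("REDUCE-") and len(stack) >= 2:
            label = action.split("-", 1)[1]
            right, left = stack.pop(), stack.pop()   # combine the top two subtrees
            stack.append(Node(label, [left, right]))
        else:                                        # fall back to avoid infinite loops
            break
    return stack[0] if stack else None

# Toy oracle: shift every token, then reduce pairs under a generic label.
tree = parse(["rename", "old_api", "to", "new_api"],
             lambda s, b: "SHIFT" if b else "REDUCE-EDIT")
```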
How Effective Are Neural Networks for Fixing Security Vulnerabilities
Security vulnerability repair is a difficult task that is in dire need of
automation. Two groups of techniques have shown promise: (1) large code
language models (LLMs) that have been pre-trained on source code for tasks such
as code completion, and (2) automated program repair (APR) techniques that use
deep learning (DL) models to automatically fix software bugs.
This paper is the first to study and compare Java vulnerability repair
capabilities of LLMs and DL-based APR models. The contributions include that we
(1) apply and evaluate five LLMs (Codex, CodeGen, CodeT5, PLBART and InCoder),
four fine-tuned LLMs, and four DL-based APR techniques on two real-world Java
vulnerability benchmarks (Vul4J and VJBench), (2) design code transformations
to address the training and test data overlapping threat to Codex, (3) create a
new Java vulnerability repair benchmark VJBench, and its transformed version
VJBench-trans and (4) evaluate LLMs and APR techniques on the transformed
vulnerabilities in VJBench-trans.
Our findings include that (1) existing LLMs and APR models fix very few Java
vulnerabilities. Codex fixes 10.2 (20.4%) vulnerabilities, the most of any evaluated technique.
(2) Fine-tuning with general APR data improves LLMs' vulnerability-fixing
capabilities. (3) Our new VJBench reveals that LLMs and APR models fail to fix
many Common Weakness Enumeration (CWE) types, such as CWE-325 Missing
cryptographic step and CWE-444 HTTP request smuggling. (4) Codex still fixes
8.3 transformed vulnerabilities, outperforming all the other LLMs and APR
models on transformed vulnerabilities. The results call for innovations to
enhance automated Java vulnerability repair such as creating larger
vulnerability repair training data, tuning LLMs with such data, and applying
code simplification transformations to facilitate vulnerability repair.
Comment: This paper has been accepted to appear in the proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2023) and will be presented at the conference, to be held in Seattle, USA, 17-21 July 2023
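To make the idea of "code transformations that address training/test data overlap" concrete, the sketch below shows one plausible semantics-preserving transformation, whole-word identifier renaming, applied to a toy Java snippet. The function name, the regex approach, and the example are assumptions for illustration only; the paper's actual transformations are not reproduced here, and a real implementation would operate on the parse tree rather than raw text.

```python
# Illustrative identifier-renaming transformation (toy version, not the paper's tooling).
import re

def rename_identifiers(java_source: str, mapping: dict[str, str]) -> str:
    """Rename whole-word identifiers in a Java snippet.

    A production implementation would work on the AST to avoid touching
    string literals and comments; regex substitution is enough for a sketch.
    """
    out = java_source
    for old, new in mapping.items():
        out = re.sub(rf"\b{re.escape(old)}\b", new, out)
    return out

original = "int maxLength = request.getHeader(name).length();"
transformed = rename_identifiers(original, {"maxLength": "v0", "name": "v1"})
print(transformed)  # int v0 = request.getHeader(v1).length();
```

Renaming identifiers keeps program behavior intact while changing the surface form a model may have memorized, which is why it is a natural probe for data-overlap effects.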
DocLLM: A layout-aware generative language model for multimodal document understanding
Enterprise documents such as forms, invoices, receipts, reports, contracts,
and other similar records, often carry rich semantics at the intersection of
textual and spatial modalities. The visual cues offered by their complex
layouts play a crucial role in comprehending these documents effectively. In
this paper, we present DocLLM, a lightweight extension to traditional large
language models (LLMs) for reasoning over visual documents, taking into account
both textual semantics and spatial layout. Our model differs from existing multimodal LLMs by avoiding expensive image encoders and focusing exclusively on bounding box information to incorporate the spatial layout structure.
Specifically, the cross-alignment between text and spatial modalities is
captured by decomposing the attention mechanism in classical transformers to a
set of disentangled matrices. Furthermore, we devise a pre-training objective
that learns to infill text segments. This approach allows us to address
irregular layouts and heterogeneous content frequently encountered in visual
documents. The pre-trained model is fine-tuned using a large-scale instruction
dataset, covering four core document intelligence tasks. We demonstrate that
our solution outperforms SotA LLMs on 14 out of 16 datasets across all tasks,
and generalizes well to 4 out of 5 previously unseen datasets.
Comment: 16 pages, 4 figures
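A minimal numpy sketch of what "decomposing attention into disentangled text/spatial terms" could look like is given below: separate projections for token embeddings and bounding-box embeddings, with the attention score formed as a sum of the four cross-terms. The unweighted four-term sum, the projection setup, and the shapes are assumptions for illustration, not DocLLM's exact formulation.

```python
# Sketch of disentangled text/spatial attention (assumed formulation, not DocLLM's code).
import numpy as np

def disentangled_attention(text, box, d_k=64, rng=np.random.default_rng(0)):
    """text: (n, d) token embeddings; box: (n, d) bounding-box (spatial) embeddings."""
    n, d = text.shape
    proj = lambda: rng.standard_normal((d, d_k)) / np.sqrt(d)   # random projection per role
    Qt, Kt, Qs, Ks, V = (text @ proj(), text @ proj(),
                         box @ proj(), box @ proj(), text @ proj())
    # Sum of disentangled score matrices: text-text, text-spatial, spatial-text, spatial-spatial.
    scores = (Qt @ Kt.T + Qt @ Ks.T + Qs @ Kt.T + Qs @ Ks.T) / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)               # row-wise softmax
    return weights @ V

out = disentangled_attention(np.ones((5, 32)), np.ones((5, 32)))
print(out.shape)  # (5, 64)
```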
BuDDIE: A Business Document Dataset for Multi-task Information Extraction
The field of visually rich document understanding (VRDU) aims to solve a
multitude of well-researched NLP tasks in a multi-modal domain. Several
datasets exist for research on specific tasks of VRDU such as document
classification (DC), key entity extraction (KEE), entity linking, visual
question answering (VQA), inter alia. These datasets cover documents like
invoices and receipts with sparse annotations such that they support one or two
closely related tasks (e.g., entity extraction and entity linking). Unfortunately, only focusing on a single type of document or a single task is not representative of how documents often need to be processed in the wild, where variety in style and requirements is expected. In this paper, we introduce BuDDIE
(Business Document Dataset for Information Extraction), the first multi-task
dataset of 1,665 real-world business documents that contains rich and dense
annotations for DC, KEE, and VQA. Our dataset consists of publicly available
business entity documents from US state government websites. The documents are
structured and vary in their style and layout across states and types (e.g.,
forms, certificates, reports, etc.). We provide data variety and quality
metrics for BuDDIE as well as a series of baselines for each task. Our
baselines cover traditional textual, multi-modal, and large language model
approaches to VRDU.
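A hypothetical sketch of what a single multi-task annotation record might contain is shown below. Every field name and value is invented for illustration and does not reflect BuDDIE's released schema; it only conveys the idea of one document carrying DC, KEE, and VQA annotations together.

```python
# Hypothetical multi-task annotation record (invented fields, not BuDDIE's schema).
record = {
    "doc_id": "XX-000000",                      # placeholder identifier
    "dc": {"doc_type": "certificate"},          # document classification label
    "kee": [                                    # key entities with bounding boxes
        {"label": "business_name", "text": "Example LLC", "bbox": [120, 340, 410, 365]},
        {"label": "state", "text": "Delaware", "bbox": [120, 372, 230, 395]},
    ],
    "vqa": [                                    # question-answer pairs grounded in the page
        {"question": "In which state is the business registered?", "answer": "Delaware"},
    ],
}
```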
Design and baseline characteristics of the Finerenone in Reducing Cardiovascular Mortality and Morbidity in Diabetic Kidney Disease trial
Background: Among people with diabetes, those with kidney disease have exceptionally high rates of cardiovascular (CV) morbidity and mortality and progression of their underlying kidney disease. Finerenone is a novel, nonsteroidal, selective mineralocorticoid receptor antagonist that has been shown to reduce albuminuria in type 2 diabetes (T2D) patients with chronic kidney disease (CKD) while carrying only a low risk of hyperkalemia. However, the effect of finerenone on CV and renal outcomes has not yet been investigated in long-term trials.
Patients and Methods: The Finerenone in Reducing CV Mortality and Morbidity in Diabetic Kidney Disease (FIGARO-DKD) trial aims to assess the efficacy and safety of finerenone compared with placebo in reducing clinically important CV and renal outcomes in T2D patients with CKD. FIGARO-DKD is a randomized, double-blind, placebo-controlled, parallel-group, event-driven trial running in 47 countries with an expected duration of approximately 6 years. FIGARO-DKD randomized 7,437 patients with an estimated glomerular filtration rate ≥ 25 mL/min/1.73 m² and albuminuria (urinary albumin-to-creatinine ratio ≥ 30 to ≤ 5,000 mg/g). The study has at least 90% power to detect a 20% reduction in the risk of the primary outcome (overall two-sided significance level alpha = 0.05), the composite of time to first occurrence of CV death, nonfatal myocardial infarction, nonfatal stroke, or hospitalization for heart failure.
Conclusions: FIGARO-DKD will determine whether an optimally treated cohort of T2D patients with CKD at high risk of CV and renal events will experience cardiorenal benefits with the addition of finerenone to their treatment regimen.
Trial Registration: EudraCT number: 2015-000950-39; ClinicalTrials.gov identifier: NCT02545049
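As a rough back-of-the-envelope check of what "90% power to detect a 20% risk reduction at two-sided alpha = 0.05" implies (this is an assumed approximation, not the trial's statistical analysis plan), Schoenfeld's formula for a 1:1 randomized time-to-event comparison gives the approximate number of primary-outcome events required:

```python
# Schoenfeld approximation for required events in a 1:1 time-to-event trial
# (illustrative only; not FIGARO-DKD's actual sample-size calculation).
from math import log
from statistics import NormalDist

alpha, power, hazard_ratio = 0.05, 0.90, 0.80   # 20% risk reduction ~ HR 0.80
z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96
z_beta = NormalDist().inv_cdf(power)            # ~1.28
events = 4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2
print(round(events))                            # roughly 840-850 events
```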
Automatic Ellipsis Resolution: Recovering Covert Information from Text
Ellipsis is a linguistic process that makes certain aspects of text meaning not directly traceable to surface text elements and, therefore, inaccessible to most language processing technologies. However, detecting and resolving ellipsis is an indispensable capability for language-enabled intelligent agents. The key insight of the work presented here is that not all cases of ellipsis are equally difficult: some can be detected and resolved with high confidence even before we are able to build agents with full human-level semantic and pragmatic understanding of text. This paper describes a fully automatic, implemented and evaluated method of treating one class of ellipsis: elided scopes of modality. Our cognitively-inspired approach, which centrally leverages linguistic principles, has also been applied to overt referring expressions with equally promising results.
Gas-emission craters of the Yamal and Gydan peninsulas: A proposed mechanism for lake genesis and development of permafrost landscapes
This paper describes two gas-emission craters (GECs) in permafrost regions of the Yamal and Gydan peninsulas. We show that in three consecutive years after GEC formation (2014-2017), both morphometry and hydrochemistry of the inner crater lakes can become indistinguishable from other lakes. Craters GEC-1 and AntGEC, with initial depths of 50-70 and 15-19 m respectively, have transformed into lakes 3-5 m deep. Crater-like depressions were mapped in the bottom of 13 out of 22 Yamal lakes. However, we found no evidence that these depressions could have been formed as a result of gas emission. Dissolved methane (dCH4) concentration measured in the water collected from these depressions was at a background level (45 ppm on average). Yet, the concentration of dCH4 from the near-bottom layer of lake GEC-1 was significantly higher (824-968 ppm) during initial stages. We established that hydrochemical parameters (dissolved organic carbon, major ions, isotopes) measured in GEC lakes approached values measured in other lakes over time. Therefore, these parameters could not be used to search for Western Siberian lakes that potentially resulted from gas emission. Temperature profiles measured in GEC lakes show that the water column temperatures in GEC-1 are lower than in Yamal lakes, and in AntGEC close to values of Gydan lakes. Given the initial GEC depth > 50 m, we suggest that at least in GEC-1 possible re-freezing of sediments from below might take place. However, with the present data we cannot establish the modern thickness of the closed talik under newly formed GEC lakes.