Code-Aware Prompting: A study of Coverage Guided Test Generation in Regression Setting using LLM
Testing plays a pivotal role in ensuring software quality, yet conventional
Search Based Software Testing (SBST) methods often struggle with complex
software units, achieving suboptimal test coverage. Recent works using large
language models (LLMs) for test generation have focused on improving generation
quality through optimizing the test generation context and correcting errors in
model outputs, but use fixed prompting strategies that prompt the model to
generate tests without additional guidance. As a result, LLM-generated
test suites still suffer from low coverage. In this paper, we present SymPrompt,
a code-aware prompting strategy for LLMs in test generation. SymPrompt's
approach is based on recent work that demonstrates LLMs can solve more complex
logical problems when prompted to reason about the problem in a multi-step
fashion. We apply this methodology to test generation by deconstructing the
test suite generation process into a multi-stage sequence, each stage of which
is driven by a specific prompt aligned with the execution paths of the method
under test and exposes relevant type and dependency focal context to the
model. Our approach enables pretrained LLMs to generate more complete test
cases without any additional training. We implement SymPrompt using the
TreeSitter parsing framework and evaluate it on a benchmark of challenging
methods from open-source Python projects. SymPrompt increases correct test
generations by a factor of 5 and improves relative coverage by 26% for
CodeGen2. Notably, when applied to GPT-4, SymPrompt improves coverage by over
2x compared to baseline prompting strategies.
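The path-aligned prompting idea can be illustrated in a few lines. The sketch below is not SymPrompt itself (which uses TreeSitter and a more precise path analysis): it uses Python's ast module, enumerates coarse branch combinations (some of which may be infeasible), and the prompt template is invented for illustration.

    # Hedged sketch of path-aligned test prompting; not SymPrompt's actual
    # implementation. Branch combinations are enumerated independently, so
    # some generated paths may be infeasible.
    import ast
    import itertools

    def path_constraints(source: str):
        """Each `if` in the function contributes a (condition, outcome)
        choice, giving 2^k coarse execution paths."""
        func = ast.parse(source).body[0]
        branches = [ast.unparse(n.test) for n in ast.walk(func)
                    if isinstance(n, ast.If)]
        for outcomes in itertools.product([True, False], repeat=len(branches)):
            yield [f"{cond} is {taken}" for cond, taken in zip(branches, outcomes)]

    def build_prompts(source: str):
        """One test-generation prompt per coarse execution path."""
        for constraints in path_constraints(source):
            yield (f"Write a unit test for the method below that drives "
                   f"execution down the path where "
                   f"{' and '.join(constraints) or 'no branch is taken'}.\n\n"
                   f"{source}")

    example = '''def clamp(x, lo, hi):
        if x < lo:
            return lo
        if x > hi:
            return hi
        return x
    '''
    for prompt in build_prompts(example):
        print(prompt[:80], "...")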
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion
Code completion models have made significant progress in recent years, yet
current popular evaluation datasets, such as HumanEval and MBPP, predominantly
focus on code completion tasks within a single file. This over-simplified
setting falls short of representing the real-world software development
scenario where repositories span multiple files with numerous cross-file
dependencies, and accessing and understanding cross-file context is often
required to complete the code correctly.
To fill in this gap, we propose CrossCodeEval, a diverse and multilingual
code completion benchmark that necessitates an in-depth cross-file contextual
understanding to complete the code accurately. CrossCodeEval is built on a
diverse set of real-world, open-sourced, permissively-licensed repositories in
four popular programming languages: Python, Java, TypeScript, and C#. To create
examples that strictly require cross-file context for accurate completion, we
propose a straightforward yet efficient static-analysis-based approach to
pinpoint the use of cross-file context within the current file.
Extensive experiments on state-of-the-art code language models like CodeGen
and StarCoder demonstrate that CrossCodeEval is extremely challenging when the
relevant cross-file context is absent, and we see clear improvements when
adding this context to the prompt. However, despite such improvements, even
the highest-performing model falls well short of ceiling performance,
indicating that CrossCodeEval can also assess a model's capability to
leverage extensive context for better code completion. Finally, we benchmark
various methods for retrieving cross-file context, and show that CrossCodeEval
can also be used to measure the capability of code retrievers. (To appear at
NeurIPS 2023, Datasets and Benchmarks Track.)
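A hedged sketch of the kind of static check the benchmark's construction describes: flag a completion line as cross-file-dependent when it uses a name imported from another module of the same repository. The function names and module set below are illustrative, not the benchmark's actual tooling.

    # Illustrative version of a static-analysis check for cross-file context:
    # a line needs cross-file context if it uses a name imported from a
    # module that lives inside the repository.
    import ast

    def intra_repo_imports(source: str, repo_modules: set[str]) -> set[str]:
        """Names imported from modules inside the repository."""
        names = set()
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.ImportFrom) and node.module in repo_modules:
                names.update(alias.asname or alias.name for alias in node.names)
        return names

    def needs_cross_file_context(source: str, line: str,
                                 repo_modules: set[str]) -> bool:
        imported = intra_repo_imports(source, repo_modules)
        used = {n.id for n in ast.walk(ast.parse(line))
                if isinstance(n, ast.Name)}
        return bool(used & imported)

    src = "from utils.math_ops import rescale\n"
    print(needs_cross_file_context(src, "y = rescale(x, 0, 1)",
                                   {"utils.math_ops"}))
    # -> True: `rescale` is defined in another file of the repo.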
Multi-agent modeling of the South Korean avian influenza epidemic
Background: Several highly pathogenic avian influenza (AI) outbreaks have been reported over the past decade. South Korea recently faced AI outbreaks whose economic impact was estimated at 6.3 billion dollars, equivalent to nearly 50% of the profit generated by the poultry-related industries in 2008. In addition, AI threatens to cause a human pandemic of potentially devastating proportions. Several studies show that a stochastic simulation model can be used to plan an efficient containment strategy for an emerging influenza. Efficient control of AI outbreaks based on such simulation studies could be an important strategy in minimizing their adverse economic and public health impacts.
Methods: We constructed a spatio-temporal multi-agent model of chickens and ducks in poultry farms in South Korea. The spatial domain, comprised of 76 (37.5 km × 37.5 km) unit squares, approximated the size and scale of South Korea. In this spatial domain, we introduced 3,039 poultry flocks (2,231 flocks of chickens and 808 flocks of ducks) whose spatial distribution was proportional to the number of birds in each province. The model parameterizes the properties and dynamic behaviors of birds in poultry farms and quarantine plans, including infection probability, incubation period, interactions among birds, and quarantine region.
Results: We conducted sensitivity analysis for the different parameters in the model. Our study shows that a quarantine plan with well-chosen parameter values is critical to minimizing the loss of poultry flocks in an AI outbreak. Specifically, an aggressive plan that culls infected poultry farms over an 18.75 km radius is unlikely to be effective, resulting in higher fractions of unnecessarily culled poultry flocks, and a weak culling plan is also unlikely to be effective, resulting in higher fractions of infected poultry flocks.
Conclusions: Our results show that a prepared response with targeted quarantine protocols would have a high probability of containing the disease. A containment plan with aggressive culling is not necessarily efficient, causing a higher fraction of unnecessarily culled poultry farms. Instead, it is necessary to balance culling with other important factors involved in AI spreading. Better estimates of AI containment from this model offer the potential to reduce the loss of poultry and minimize the economic impact on the poultry industry.
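A toy rendering of the model's core loop, with invented parameter values (the paper's are calibrated to Korean poultry data): flocks scattered over a square domain, probabilistic local transmission, and culling of every flock within a quarantine radius once an infection is detected.

    # Toy agent-based sketch of the paper's setup; parameters are
    # illustrative, not the paper's calibrated values.
    import random

    random.seed(0)
    SIZE, N_FLOCKS, STEPS = 76.0, 1000, 40
    P_INFECT, P_DETECT = 0.3, 0.5
    INFECT_RADIUS, CULL_RADIUS = 2.0, 1.5

    flocks = [{"pos": (random.uniform(0, SIZE), random.uniform(0, SIZE)),
               "state": "susceptible"} for _ in range(N_FLOCKS)]
    flocks[0]["state"] = "infected"              # index case

    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    for _ in range(STEPS):
        for src in [f for f in flocks if f["state"] == "infected"]:
            for f in flocks:                     # local transmission
                if (f["state"] == "susceptible"
                        and dist(f["pos"], src["pos"]) < INFECT_RADIUS
                        and random.random() < P_INFECT):
                    f["state"] = "infected"
            if random.random() < P_DETECT:       # outbreak noticed this step:
                for f in flocks:                 # cull everything in radius
                    if (f["state"] != "culled"
                            and dist(f["pos"], src["pos"]) < CULL_RADIUS):
                        f["state"] = "culled"

    print({s: sum(f["state"] == s for f in flocks)
           for s in ("susceptible", "infected", "culled")})

Varying CULL_RADIUS in a sketch like this reproduces the paper's qualitative trade-off: too large a radius culls flocks unnecessarily, too small a radius lets the infection escape the quarantine zone.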
Path-aware analysis of program invariants
Ensuring software reliability is a critical problem in the software development process. There are three overarching issues that help improve reliability of complex software systems: (a) availability of specifications that describe important invariants; (b) tools to identify when specifications are violated, and why these violations occur; and (c) the impact of modifications of programs on derived specifications. In this dissertation, we present scalable and efficient path-aware analyses that offer solutions to these three concerns and demonstrate how these solutions lead to improved software reliability.
We develop a static path-aware analysis to infer specifications automatically from large software sources. We describe a static inference mechanism for identifying the preconditions that must hold whenever a procedure is called. These preconditions may reflect both dataflow properties (e.g., whenever p is called, variable x must be non-null) as well as control-flow properties (e.g., every call to p must be preceded by a call to q). We derive these preconditions using an inter-procedural path-aware dataflow analysis that gathers predicates at each program point. We apply mining techniques to these predicates to make specification inference robust with respect to errors. This technique also allows us to derive higher-level specifications that abstract structural similarities among predicates (e.g., procedure p is called immediately after a conditional test that checks whether some variable v is non-null).
To identify those program statements that influence a specification or assertion, we develop a dynamic path-aware analysis that combines relevant information from multiple paths leading to an assertion point. This path information is encoded as a Boolean formula. The elements of this formula are derived from the predicates in conditional guards found on paths leading to an assertion point. These paths are generated from multiple dynamic runs that pass through the assertion. In addition to describing a test generation mechanism that drives execution through the assertion, we also present a novel representation scheme that coalesces paths using Binary Decision Diagrams (BDDs). Our representation thus allows effective pruning of redundant predicates.
Finally, we present a novel solution to the general problem of understanding how specifications are influenced by revisions in program sources. When a revision, even a minor one, does occur, the changes it induces must be tested to ensure that invariants assumed in the original version are not violated unintentionally. In order to avoid testing components that are unchanged across revisions, impact analysis is often used to identify code blocks or functions that are affected by a change. Our approach employs dynamic programming on instrumented traces of different program binaries to identify longest common subsequences in strings generated by these traces. Our formulation allows us to perform impact analysis and also to detect the smallest set of locations within the functions where the effect of the changes actually manifests itself.
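The path-coalescing step can be approximated compactly. The sketch below stands in for the BDD representation with a simple Boolean absorption rule, (A and B) or (A and not B) == A, applied to sets of branch predicates; the dissertation's actual scheme uses BDDs and is far more scalable.

    # Minimal stand-in for BDD-based path coalescing: merge dynamic paths
    # to an assertion by repeatedly dropping predicates whose truth value
    # does not matter. All names are illustrative.

    def coalesce(paths):
        """Each path is a frozenset of (predicate, taken) pairs from the
        conditional guards on a run that reached the assertion. Apply
        (A and B) or (A and not B) == A until a fixed point."""
        paths = set(paths)
        changed = True
        while changed:
            changed = False
            for p in list(paths):
                for pred, taken in p:
                    twin = (p - {(pred, taken)}) | {(pred, not taken)}
                    if twin in paths:
                        paths -= {p, twin}
                        paths.add(p - {(pred, taken)})  # predicate redundant
                        changed = True
                        break
                if changed:
                    break
        return paths

    runs = [frozenset({("x > 0", True), ("p is None", False)}),
            frozenset({("x > 0", True), ("p is None", True)}),
            frozenset({("x > 0", False), ("p is None", False)})]
    print(coalesce(runs))
    # -> two of the three runs collapse into one path, pruning a predicate
    #    that is irrelevant to reaching the assertion.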
Sieve: A Tool for Automatically Detecting Variations Across Program Versions
Revisions are an essential characteristic of large-scale software development. Software systems often undergo many revisions during their lifetime because new features are added, bugs repaired, abstractions simplified and refactored, and performance improved. When a revision, even a minor one, does occur, the changes it induces must be tested to ensure that assumed invariants in the original are not violated. In order to avoid testing components that are unchanged across revisions, impact analysis is often used to identify those code blocks or functions that are affected by a change. In this paper, we present a new solution to this general problem that uses dynamic programming on instrumented traces of different program binaries to identify longest common subsequences in the strings generated by these traces. Our formulation not only allows us to perform impact analysis, but can also be used to detect the smallest set of locations within these functions where the effect of the changes actually manifests. Sieve is a tool that incorporates these ideas. Sieve is unobtrusive, requiring no programmer or compiler involvement to guide its behavior. We have tested Sieve on multiple versions of open-source C programs and find that the accuracy of impact analysis is improved by 10-30% compared to existing state-of-the-art implementations. More significantly, Sieve can identify the regions where the changes manifest, and discovers that for the vast majority of impacted functions, the locus of change is often limited to fewer than three lines of code. These results lead us to conclude that Sieve can play a beneficial role in program testing and software maintenance.
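The core computation is a textbook longest-common-subsequence over the two versions' traces. A minimal sketch, assuming traces are already collected as lists of executed-location ids (Sieve itself instruments binaries):

    # LCS-based trace diff in the spirit of Sieve: trace entries NOT on a
    # longest common subsequence mark where the revision's effect manifests.
    from functools import lru_cache

    def lcs_diff(old: list[str], new: list[str]):
        """Return trace entries off the LCS: the divergence points."""
        @lru_cache(maxsize=None)
        def lcs(i: int, j: int) -> int:
            if i == len(old) or j == len(new):
                return 0
            if old[i] == new[j]:
                return 1 + lcs(i + 1, j + 1)
            return max(lcs(i + 1, j), lcs(i, j + 1))

        impacted, i, j = [], 0, 0
        while i < len(old) and j < len(new):
            if old[i] == new[j]:
                i, j = i + 1, j + 1
            elif lcs(i + 1, j) >= lcs(i, j + 1):
                impacted.append(("old", old[i])); i += 1
            else:
                impacted.append(("new", new[j])); j += 1
        impacted += [("old", x) for x in old[i:]] + [("new", x) for x in new[j:]]
        return impacted

    old_trace = ["f:1", "f:2", "g:1", "g:2", "f:3"]
    new_trace = ["f:1", "f:2", "g:1", "g:2b", "f:3"]
    print(lcs_diff(old_trace, new_trace))
    # -> [('old', 'g:2'), ('new', 'g:2b')]: the change is localized to one line.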
Finding good peers in peer-to-peer networks
As computing and communication capabilities have continued to increase, more and more activity is taking place at the edges of the network, typically in homes or on workers' desktops. This trend has been demonstrated by the increasing popularity and usability of "peer-to-peer" systems such as Napster and Gnutella. Unfortunately, this popularity has quickly shown the limitations of these systems, particularly in terms of scale. Because the networks form in an ad-hoc manner, they typically make inefficient use of resources. We propose a mechanism, using only local knowledge, to improve the overall performance of peer-to-peer networks based on interests. Peers monitor which other peers frequently respond successfully to their requests for information. When a peer is discovered to frequently provide good results, the requesting peer attempts to move closer to it in the network by creating a new connection with that peer. This leads to clusters of peers with similar interests, and in turn allows us to limit the depth of searches required to find good results. We have implemented our algorithm in the context of a distributed encyclopedia-style information-sharing application built on top of the Gnutella network. In our testing environment, we have shown the ability to greatly reduce the amount of communication resources required to find the desired articles in the encyclopedia.
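The local heuristic is simple enough to state as code. A minimal sketch, with an invented success threshold and class shape: count useful responses per peer and add a direct connection once a peer has proven itself.

    # Illustrative sketch of the paper's local heuristic; names and the
    # threshold are made up for the example.
    from collections import Counter

    class Peer:
        def __init__(self, name: str, neighbors=None, threshold: int = 3):
            self.name = name
            self.neighbors = set(neighbors or [])
            self.successes = Counter()   # peer name -> good responses seen
            self.threshold = threshold

        def record_response(self, responder: str, useful: bool):
            if useful:
                self.successes[responder] += 1
                # Move closer to a consistently good peer: connect directly,
                # clustering peers that share interests.
                if (self.successes[responder] >= self.threshold
                        and responder not in self.neighbors):
                    self.neighbors.add(responder)

    node = Peer("a", neighbors={"b"})
    for _ in range(3):
        node.record_response("d", useful=True)   # "d" answers via "b"
    print(node.neighbors)  # {'b', 'd'}: a shortcut to the good responder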
Path-Sensitive Inference of Function Precedence Protocols
Function precedence protocols define ordering relations among function calls in a program. In some instances, precedence protocols are well-understood (e.g., a call to pthread_mutex_init must always be present on all program paths before a call to pthread_mutex_lock). Oftentimes, however, these protocols are neither well-documented nor easily derived. As a result, protocol violations can lead to subtle errors that are difficult to identify and correct. In this paper, we present CHRONICLER, a tool that applies scalable inter-procedural path-sensitive static analysis to automatically infer accurate function precedence protocols. CHRONICLER computes precedence relations based on a program's control-flow structure, integrates these relations into a repository, and analyzes them using sequence mining techniques to generate a collection of feasible precedence protocols. Deviations from these protocols found in the program are tagged as violations, and represent potential sources of bugs. We demonstrate CHRONICLER's effectiveness by deriving protocols for a collection of benchmarks ranging in size from 66K to 2M lines of code. Our results not only confirm the existence of bugs in these programs due to precedence protocol violations, but also highlight the importance of path sensitivity on accuracy and scalability.
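A rough approximation of the mining step: given per-path call sequences, infer "q must precede p" rules from how often q appears before p, and flag the paths that break them. The support threshold and trace format are illustrative; CHRONICLER itself works statically on control-flow graphs and uses sequence mining.

    # Toy precedence mining over call sequences, in the spirit of (but far
    # simpler than) CHRONICLER's static analysis.
    from collections import Counter

    def mine_precedence(paths, target: str, min_support: float = 0.9):
        """Infer `q must precede target` rules from call sequences."""
        preceded_by, total = Counter(), 0
        for path in paths:
            for i, call in enumerate(path):
                if call == target:
                    total += 1
                    preceded_by.update(set(path[:i]))
        if total == 0:
            return set(), []
        rules = {q for q, n in preceded_by.items() if n / total >= min_support}
        violations = [p for p in paths
                      if target in p and not rules <= set(p[:p.index(target)])]
        return rules, violations

    paths = [["mutex_init", "mutex_lock", "mutex_unlock"],
             ["mutex_init", "work", "mutex_lock"],
             ["mutex_lock", "mutex_unlock"]]      # missing init: a bug?
    rules, bad = mine_precedence(paths, "mutex_lock", min_support=0.6)
    print(rules, bad)  # {'mutex_init'} precedes lock; third path violates it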
Trace Driven Dynamic Deadlock Detection and Reproduction
Dynamic analysis techniques have been proposed to detect potential deadlocks. Analyzing and comprehending each potential deadlock to determine whether the deadlock is feasible in a real execution requires significant programmer effort. Moreover, empirical evidence shows that existing analyses are quite imprecise. This imprecision further wastes the manual effort invested in reasoning about non-existent defects. In this paper, we address the problems of imprecision of existing analyses and the subsequent manual effort necessary to reason about deadlocks. We propose a novel approach for deadlock detection by designing a dynamic analysis that intelligently leverages execution traces. To reduce the manual effort, we replay the program by making the execution follow a schedule derived from the observed trace. For a real deadlock, its feasibility is automatically verified if the replay causes the execution to deadlock. We have implemented our approach as part of WOLF and have analyzed many large (up to 160 KLoC) Java programs. Our experimental results show that we are able to identify 74% of the reported defects as true (or false) positives automatically, leaving very few defects for manual analysis. The overhead of our approach is negligible, making it a compelling tool for practical adoption.
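A minimal sketch of the detection half of such an approach: build a lock-order graph from an acquire/release trace and report opposite-order lock pairs as potential deadlocks. The replay-based confirmation step is not shown, and the trace format is invented for the example.

    # Lock-order-graph deadlock detection over an execution trace; the
    # schedule-replay confirmation described in the paper is omitted.
    from collections import defaultdict

    def potential_deadlocks(trace):
        """trace: (thread, op, lock) events, op in {'acq', 'rel'}."""
        held = defaultdict(list)         # thread -> stack of held locks
        order = defaultdict(set)         # lock -> locks acquired under it
        for thread, op, lock in trace:
            if op == "acq":
                for outer in held[thread]:
                    order[outer].add(lock)
                held[thread].append(lock)
            else:
                held[thread].remove(lock)
        # a 2-cycle in the lock-order graph means two locks are taken in
        # opposite orders by different threads: a classic deadlock pattern
        return [(a, b) for a in order for b in order[a]
                if a in order.get(b, set()) and a < b]

    trace = [("t1", "acq", "L1"), ("t1", "acq", "L2"),
             ("t1", "rel", "L2"), ("t1", "rel", "L1"),
             ("t2", "acq", "L2"), ("t2", "acq", "L1"),
             ("t2", "rel", "L1"), ("t2", "rel", "L2")]
    print(potential_deadlocks(trace))  # [('L1', 'L2')]: opposite lock orders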