35 research outputs found

    Autonomous Large Language Model Agents Enabling Intent-Driven Mobile GUI Testing

    GUI testing checks if a software system behaves as expected when users interact with its graphical interface, e.g., testing specific functionality or validating relevant use case scenarios. Currently, deciding what to test at this high level is a manual task since automated GUI testing tools target lower level adequacy metrics such as structural code coverage or activity coverage. We propose DroidAgent, an autonomous GUI testing agent for Android, for semantic, intent-driven automation of GUI testing. It is based on Large Language Models and support mechanisms such as long- and short-term memory. Given an Android app, DroidAgent sets relevant task goals and subsequently tries to achieve them by interacting with the app. Our empirical evaluation of DroidAgent using 15 apps from the Themis benchmark shows that it can set up and perform realistic tasks, with a higher level of autonomy. For example, when testing a messaging app, DroidAgent created a second account and added a first account as a friend, testing a realistic use case, without human intervention. On average, DroidAgent achieved 61% activity coverage, compared to 51% for current state-of-the-art GUI testing techniques. Further, manual analysis shows that 317 out of the 374 autonomously created tasks are realistic and relevant to app functionalities, and also that DroidAgent interacts deeply with the apps and covers more features. Comment: 10 pages

    Large Language Models are Few-shot Testers: Exploring LLM-based General Bug Reproduction

    Many automated test generation techniques have been developed to aid developers with writing tests. To facilitate full automation, most existing techniques aim to either increase coverage, or generate exploratory inputs. However, existing test generation techniques largely fall short of achieving more semantic objectives, such as generating tests to reproduce a given bug report. Reproducing bugs is nonetheless important, as our empirical study shows that the number of tests added in open source repositories due to issues was about 28% of the corresponding project test suite size. Meanwhile, due to the difficulties of transforming the expected program semantics in bug reports into test oracles, existing failure reproduction techniques tend to deal exclusively with program crashes, a small subset of all bug reports. To automate test generation from general bug reports, we propose LIBRO, a framework that uses Large Language Models (LLMs), which have been shown to be capable of performing code-related tasks. Since LLMs themselves cannot execute the target buggy code, we focus on post-processing steps that help us discern when LLMs are effective, and rank the produced tests according to their validity. Our evaluation of LIBRO shows that, on the widely studied Defects4J benchmark, LIBRO can generate failure reproducing test cases for 33% of all studied cases (251 out of 750), while suggesting a bug reproducing test in first place for 149 bugs. To mitigate data contamination, we also evaluate LIBRO against 31 bug reports submitted after the collection of the LLM training data terminated: LIBRO produces bug reproducing tests for 32% of the studied bug reports. Overall, our results show LIBRO has the potential to significantly enhance developer efficiency by automatically generating tests from bug reports. Comment: Accepted to IEEE/ACM International Conference on Software Engineering 2023 (ICSE 2023)

    Towards Autonomous Testing Agents via Conversational Large Language Models

    Software testing is an important part of the development cycle, yet adequately testing software requires specialized expertise and substantial developer effort. The recent discoveries of the capabilities of large language models (LLMs) suggest that they can be used as automated testing assistants, and thus provide helpful information and even drive the testing process. To highlight the potential of this technology, we present a taxonomy of LLM-based testing agents based on their level of autonomy, and describe how a greater level of autonomy can benefit developers in practice. An example use of LLMs as a testing assistant is provided to demonstrate how a conversational framework for testing can help developers. This also highlights how the often criticized hallucination of LLMs can be beneficial while testing. We identify other tangible benefits that LLM-driven testing agents can bestow, and also discuss some potential limitations.

    The GitHub Recent Bugs Dataset for Evaluating LLM-based Debugging Applications

    Large Language Models (LLMs) have demonstrated strong natural language processing and code synthesis capabilities, which has led to their rapid adoption in software engineering applications. However, details about LLM training data are often not made public, which has caused concern as to whether existing bug benchmarks are included. In lieu of the training data for the popular GPT models, we examine the training data of the open-source LLM StarCoder, and find it likely that data from the widely used Defects4J benchmark was included, raising the possibility of its inclusion in GPT training data as well. This makes it difficult to tell how well LLM-based results on Defects4J would generalize, as for any result it would be unclear whether a technique's performance is due to LLM generalization or memorization. To remedy this issue and facilitate continued research on LLM-based SE, we present the GitHub Recent Bugs (GHRB) dataset, which includes 76 real-world Java bugs that were gathered after the OpenAI data cut-off point.

    Structural basis for arginine glycosylation of host substrates by bacterial effector proteins

    The bacterial effector proteins SseK and NleB glycosylate host proteins on arginine residues, leading to reduced NF-κB-dependent responses to infection. Salmonella SseK1 and SseK2 are E. coli NleB1 orthologs that behave as NleB1-like GTs, although they differ in protein substrate specificity. Here we report that these enzymes are retaining glycosyltransferases composed of a helix-loop-helix (HLH) domain, a lid domain, and a catalytic domain. A conserved HEN motif (His-Glu-Asn) in the active site is important for enzyme catalysis and bacterial virulence. We observe differences between SseK1 and SseK2 in interactions with substrates and identify substrate residues that are critical for enzyme recognition. Long molecular dynamics simulations suggest that the HLH domain determines substrate specificity and the lid domain regulates the opening of the active site. Overall, our data suggest a front-face SNi mechanism, explain differences in activities among these effectors, and have implications for future drug development against enteric pathogens.

    Immunohistochemical localization of galectin-3 in the granulomatous lesions of paratuberculosis-infected bovine intestine

    The presence of galectin-3 was immunohistochemically quantified in bovine intestines infected with paratuberculosis (Johne's disease) to determine whether galectin-3 was involved in the formation of granulation tissue associated with the disease. Mycobacterium avium subsp. paratuberculosis infection was histochemically confirmed using Ziehl-Neelsen staining and molecularly diagnosed through rpoB DNA sequencing. Galectin-3 was detected in the majority of inflammatory cells, possibly macrophages, in the granulomatous lesions within affected tissues, including the ileum. These findings suggest that galectin-3 is associated with the formation of chronic granulation tissues in bovine paratuberculosis, probably through cell adhesion and anti-apoptosis mechanisms.

    Korea's intellectual property rights system and its application to the phases of industrial development


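    2011 Modularization of Korea's Development Experience: Establishment and Use of Intellectual Property Rights Systems and Policies by Stage of Korea's Industrial Development, Focusing on the Patent System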

    Prologue
    Summary
    Chapter 1 Introduction
    1. Existing Discussions on the IPR System and Industrial Development
    2. Observation of Korea’s IPR System in the Context of Industrial Development
    3. Proposing Three Developmental Phases of Korea’s IPR System
    Chapter 2 The First Developmental Phase of Korea’s IPR System: Introduction Period (1900s-70s)
    1. Overview of the Introduction Period
    2. Outline of Patent System & Policy
    3. Major Details of Patent Administration and Infrastructure
    Chapter 3 The Second Developmental Phase of Korea’s IPR System: Settlement Period (1980-late 1990s)
    1. Overview of the Settlement Period (Second Period: from 1980 to the late 1990s)
    2. Major Details of Patent System and Policy
    3. Administration Institutes and Infrastructure for Patents
    Chapter 4 The Third Developmental Phase of Korea’s IPR System: Advancement Period (late 1990s-Present)
    1. Overview of the Advancement Period
    2. Execution of Patent Institutions and Policies
    3. Restructuring of Patent Administration and Infrastructure
    Chapter 5 Evaluation and Implications
    1. Three Keys to Korea’s Success
    2. Discussion on Limitations
    3. Closing Remarks
    References
    Appendix