WELL: Applying Bug Detectors to Bug Localization via Weakly Supervised Learning
Bug localization is a key software development task, where a developer
locates the portion of the source code that must be modified based on the bug
report. It is labor-intensive and time-consuming due to the increasing size and
complexity of modern software. Effectively automating this task can greatly
reduce costs by cutting down developers' effort. Researchers have already
made efforts to harness the power of deep learning (DL) to
automate bug localization. However, training DL models demands a large quantity
of annotated training data, while the buggy-location-annotated dataset with
reasonable quality and quantity is difficult to collect. This becomes an
obstacle to the effective usage of DL for bug localization. We notice that the
data pairs for bug detection, which provide weak buggy-or-not binary
classification supervision, are much easier to obtain. Inspired by weakly
supervised learning, this paper proposes WEakly supervised bug LocaLization
(WELL), an approach that transforms bug detectors into bug locators. Through a
CodeBERT model fine-tuned on bug detection, WELL is able to locate bugs in a
weakly supervised manner based on its attention weights. Evaluations of WELL on
three datasets show performance competitive with existing strongly
supervised DL solutions. WELL even outperforms current SOTA models in the tasks of
variable misuse and binary operator misuse.
Comment: (Preprint) Software Engineering; Deep Learning; Bug Detection & Localization
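The core idea of attention-based weak localization can be illustrated with a minimal sketch (not the authors' implementation; the detector, its attention scores, and the token sequence below are all hypothetical): a classifier trained only on buggy-or-not labels assigns attention weights to tokens, and the most-attended token is flagged as the suspected bug location.

```python
# Illustrative sketch of attention-based weak localization. The attention
# scores are assumed to come from a hypothetical bug detector fine-tuned
# on binary buggy-or-not labels; no real model is invoked here.

def localize_bug(tokens, attention_scores):
    """Return (index, token) of the most-attended token.

    tokens           -- list of source-code tokens
    attention_scores -- one detector attention weight per token
    """
    if len(tokens) != len(attention_scores):
        raise ValueError("one attention score is required per token")
    best = max(range(len(tokens)), key=lambda i: attention_scores[i])
    return best, tokens[best]

# Toy example: the detector attends most to the misused variable "y".
tokens = ["x", "=", "y", "+", "1"]
scores = [0.10, 0.05, 0.60, 0.05, 0.20]
idx, token = localize_bug(tokens, scores)
print(idx, token)  # → 2 y
```

The point of the sketch is that no per-token location labels are needed at training time; the localization signal falls out of the detector's attention.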
Given 2n eyeballs, all quality flaws are shallow
We demonstrate the capabilities of the Microservice Artefact Observatory (MAO), a federated software quality assessment middleware. MAO’s extensible assessment tools continuously scan for quality flaws, defects and inconsistencies in microservice artefacts and observe runtime behaviour. The federation reduces bias, increases resilience, and overcomes per-site failures, leading to a single, merged timeline of software quality. Already operated concurrently by n = 3 operators in Argentina and Switzerland, the federation is designed to become a community-wide, consensus-voting-based ground truth repository with query interfaces for large-scale software quality and evolution insights. These insights can be exploited for excluding buggy software before or after deployment, for optimised resource allocation, and for further software management tasks.
Opinion Mining for Software Development: A Systematic Literature Review
Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies.
SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in
code comments and extracting users’ criticisms of mobile apps. Given the large amount of relevant studies available, it can take
considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils
these approaches entail.
We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion
mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in
other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4)
concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques.
The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide
critical insights for the further development of opinion mining techniques in the SE domain.
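To make concrete the kind of technique the review surveys, here is a toy lexicon-based sentiment classifier for developer text. It is a sketch only: the word lists are invented for illustration, and real SE opinion mining tools use far richer lexicons and learned models.

```python
# Toy lexicon-based sentiment classification, sketched to illustrate the
# simplest family of opinion mining approaches. The POSITIVE/NEGATIVE
# word sets are illustrative assumptions, not a real SE lexicon.

POSITIVE = {"great", "clean", "elegant", "fast", "works"}
NEGATIVE = {"ugly", "broken", "slow", "hack", "fails"}

def classify_sentiment(text):
    """Label text as 'positive', 'negative', or 'neutral' by word counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify_sentiment("this hack is ugly and slow"))  # → negative
```

Lexicon approaches like this are fast and transparent but, as the review's concerns suggest, they miss domain-specific usage (e.g., "killed the process" is not negative sentiment in SE text).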
Antipatterns in Software Classification Taxonomies
Empirical results in software engineering have long shown that
findings are unlikely to be applicable to all software systems, or to any one domain:
results need to be evaluated in specific contexts, and limited to the type of
systems that they were extracted from. This is a known issue, and requires the
establishment of a classification of software types.
This paper makes two contributions: the first is to evaluate the quality of
the current software classifications landscape. The second is to perform a case
study showing how to create a classification of software types using a curated
set of software systems.
Our contributions show that existing, and very likely even new,
classification attempts are doomed to fail due to one or more issues, which we
name the `antipatterns' of software classification tasks. We collected 7 of
these antipatterns that emerge from both our case study and the existing
classifications.
These antipatterns represent recurring issues in a classification, so we
discuss practical ways to help researchers avoid these pitfalls. It becomes
clear that classification attempts must also face the daunting task of
formulating a taxonomy of software types, with the objective of establishing a
hierarchy of categories in a classification.
Comment: Accepted for publication in the Journal of Systems and Software
Characterizing and Detecting Duplicate Logging Code Smells
Developers rely on software logs for a wide variety of tasks, such as debugging, testing, program comprehension, verification, and performance analysis. Despite the importance of logs, prior studies show that there is no industrial standard on how to write logging statements. Recent research on logs often only considers the appropriateness of a log as an individual item (e.g., one single logging statement), while logs are typically analyzed in tandem. In this thesis, we focus on studying duplicate logging statements, which are logging statements that have the same static text message. Such duplications in the text message are potential indications of logging code smells, which may affect developers’ understanding of the dynamic view of the system. We manually studied over 3K duplicate logging statements and their surrounding code in four large-scale open source systems: Hadoop, CloudStack, ElasticSearch, and Cassandra. We uncovered five patterns of duplicate logging code smells. For each instance of the code smell, we further manually identified the problematic (i.e., requiring fixes) and justifiable (i.e., not requiring fixes) cases. We then contacted developers to verify the results of our manual study. We integrated our manual study results and developers’ feedback into our automated static analysis tool, DLFinder, which automatically detects problematic duplicate logging code smells. We evaluated DLFinder on the four manually studied systems and four additional systems: Kafka, Flink, Camel and Wicket. In total, combining the results of DLFinder and our manual analysis, we reported 91 problematic code smell instances to developers, all of which have been fixed. This thesis provides an initial step toward creating a logging guideline for developers to improve the quality of logging code. DLFinder is also able to detect duplicate logging code smells with high precision and recall.
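The core detection idea, grouping logging statements by their static text message and reporting repeats, can be sketched in a few lines. This is not DLFinder itself: the regex below is a deliberately simplified stand-in that only matches string-literal messages in `log.*(...)`-style calls.

```python
import re
from collections import defaultdict

# Minimal sketch (not DLFinder) of duplicate-log-message detection:
# group logging statements by their static text message and report any
# message emitted from more than one line. The regex is an illustrative
# simplification covering only log.<level>("...") style calls.

LOG_RE = re.compile(r'(?:log|logger)\.\w+\(\s*"([^"]*)"', re.IGNORECASE)

def find_duplicate_log_messages(source):
    """Map each repeated static log message to the lines that emit it."""
    locations = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        for message in LOG_RE.findall(line):
            locations[message].append(lineno)
    return {msg: lines for msg, lines in locations.items() if len(lines) > 1}

code = '''
log.info("starting server");
log.warn("disk almost full");
log.info("starting server");
'''
print(find_duplicate_log_messages(code))  # → {'starting server': [2, 4]}
```

As the thesis emphasizes, a raw duplicate is only a *potential* smell; distinguishing problematic from justifiable duplicates requires the kind of contextual analysis DLFinder layers on top of this grouping step.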
Bots in software engineering: a systematic mapping study
Bots have emerged from research prototypes into deployable systems due to recent developments in machine learning, natural language processing and understanding techniques. In software engineering, bots range from simple automated scripts to decision-making autonomous systems. The spectrum of applications of bots in software engineering is so wide and diverse that a comprehensive overview and categorization of such bots is needed. Existing works analyzed only selected bots and failed to provide the overall picture. Hence it is important to categorize bots in software engineering by analyzing why, what and how bots are applied in software engineering. We approach the problem with a systematic mapping study based on the research articles published on this topic. This study focuses on the classification of bots used in software engineering, the various dimensions of their characteristics, the more frequently researched areas, potential research spaces to be explored, and the perception of bots in the developer community. This study aims to provide an introduction and a broad overview of bots used in software engineering. Discussions of the feedback and results from several studies provide interesting insights and prospective future directions.
Speculative Analysis for Quality Assessment of Code Comments
Previous studies have shown that high-quality code comments assist developers
in program comprehension and maintenance tasks. However, the semi-structured
nature of comments, unclear conventions for writing good comments, and the lack
of quality assessment tools for all aspects of comments make their evaluation
and maintenance a non-trivial problem. To achieve high-quality comments, we
need a deeper understanding of code comment characteristics and the practices
developers follow. In this thesis, we approach the problem of assessing comment
quality from three different perspectives: what developers ask about commenting
practices, what they write in comments, and how researchers support them in
assessing comment quality.
Our preliminary findings show that developers embed various kinds of
information in class comments across programming languages. Still, they face
problems in locating relevant guidelines to write consistent and informative
comments, verifying the adherence of their comments to the guidelines, and
evaluating the overall state of comment quality. To help developers and
researchers in building comment quality assessment tools, we provide: (i) an
empirically validated taxonomy of comment convention-related questions from
various community forums, (ii) an empirically validated taxonomy of comment
information types from various programming languages, (iii) a
language-independent approach to automatically identify the information types,
and (iv) a comment quality taxonomy prepared from a systematic literature
review.
Comment: 5 pages, 1 figure, conference
Can NMT Understand Me? Towards Perturbation-based Evaluation of NMT Models for Code Generation
Neural Machine Translation (NMT) has reached a level of maturity where it is
recognized as the premier method for translation between different
languages, and it has aroused interest in various research areas, including software
engineering. A key step in validating the robustness of NMT models consists
in evaluating the performance of the models on adversarial inputs, i.e., inputs
obtained from the original ones by adding small amounts of perturbation.
However, for the specific task of code generation (i.e., the
generation of code starting from a description in natural language), no
approach has yet been defined to validate the robustness of NMT models. In
this work, we address the problem by identifying a set of perturbations and
metrics tailored for the robustness assessment of such models. We present a
preliminary experimental evaluation, showing what type of perturbations affect
the model the most and deriving useful insights for future directions.
Comment: Paper accepted for publication in the proceedings of the 1st Intl.
Workshop on Natural Language-based Software Engineering (NLBSE), to be held
with ICSE 2022
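One simple member of the perturbation family the paper studies can be sketched as follows. The operator below (swapping two adjacent words in the natural language description) is an illustrative assumption, not necessarily one of the paper's tailored perturbations; the idea is that a robust code-generation model should produce equivalent code for such near-meaning-preserving inputs.

```python
# Illustrative perturbation operator for probing code-generation NMT
# models: swap two adjacent words in the natural language description.
# A robust model should generate (near-)equivalent code for the original
# and the perturbed description.

def swap_adjacent_words(description, index):
    """Return the description with the words at index and index+1 swapped."""
    words = description.split()
    if not 0 <= index < len(words) - 1:
        raise IndexError("index must have a right-hand neighbour to swap with")
    words[index], words[index + 1] = words[index + 1], words[index]
    return " ".join(words)

original = "return the maximum value in the list"
perturbed = swap_adjacent_words(original, 2)
print(perturbed)  # → return the value maximum in the list
```

In a robustness evaluation, both strings would be fed to the model under test and the two generated programs compared with metrics such as exact match or functional equivalence.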
Antivirus Performance Evaluation Against PowerShell Obfuscated Malware
In recent years, malware attacks have become increasingly sophisticated, and the methods used by attackers to evade Windows defenses have grown more complex. As a result, detecting and defending against these attacks has become an ever more pressing challenge for security professionals. Despite significant efforts to improve Windows security, attackers continue to find new ways to bypass these defenses and infiltrate systems. The techniques covered in this paper are all currently active and effective at evading Windows defenses. Our findings underscore the need for continued vigilance and the importance of staying up to date with the latest threats and countermeasures.