13 research outputs found

Which log level should developers choose for a new logging statement?

Author: Hassan Ahmed E.
Li Heng
Shang Weiyi
Publication venue: Springer
Publication date: 14/10/2016

Crossref

PolyPublie

Characterizing and Detecting Duplicate Logging Code Smells

Author: Li Zhenhao
Publication date: 01/08/2019

Developers rely on software logs for a wide variety of tasks, such as debugging, testing, program comprehension, verification, and performance analysis. Despite the importance of logs, prior studies show that there is no industrial standard on how to write logging statements. Recent research on logs often considers the appropriateness of a log only as an individual item (e.g., a single logging statement), whereas logs are typically analyzed in tandem. In this thesis, we focus on studying duplicate logging statements, which are logging statements that have the same static text message. Such duplications in the text message are potential indications of logging code smells, which may affect developers’ understanding of the dynamic view of the system. We manually studied over 3K duplicate logging statements and their surrounding code in four large-scale open source systems: Hadoop, CloudStack, ElasticSearch, and Cassandra. We uncovered five patterns of duplicate logging code smells. For each instance of the code smell, we further manually identified the problematic (i.e., requiring fixes) and justifiable (i.e., not requiring fixes) cases. We then contacted developers to verify our manual study results. We integrated our manual study results and developers’ feedback into our automated static analysis tool, DLFinder, which automatically detects problematic duplicate logging code smells. We evaluated DLFinder on the four manually studied systems and four additional systems: Kafka, Flink, Camel, and Wicket. In total, combining the results of DLFinder and our manual analysis, we reported 91 problematic code smell instances to developers, and all of them have been fixed. This thesis provides an initial step toward creating a logging guideline for developers to improve the quality of logging code. DLFinder is also able to detect duplicate logging code smells with high precision and recall.
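
To make the definition in this abstract concrete, the following minimal sketch groups logging statements by their static text message and reports the messages that occur more than once. It is not DLFinder and does not classify the five smell patterns; the regular expression, the function name `find_duplicate_log_messages`, and the sample Java-like snippet are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Hypothetical pattern: matches calls such as LOG.error("text", e) and
# captures only the first string literal, i.e. the static text message.
LOG_CALL = re.compile(
    r'\b(?:log|logger)\.(?:trace|debug|info|warn|error)\(\s*"([^"]*)"',
    re.IGNORECASE,
)


def find_duplicate_log_messages(source: str) -> Dict[str, List[int]]:
    """Map each static log message that appears more than once to its line numbers."""
    seen = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in LOG_CALL.finditer(line):
            seen[match.group(1)].append(lineno)
    # Keep only messages shared by at least two logging statements.
    return {text: lines for text, lines in seen.items() if len(lines) > 1}


# Illustrative input only: two different failures report the same message.
SAMPLE = '''
try { open(path); } catch (IOException e) { LOG.error("Failed to open file", e); }
try { read(path); } catch (IOException e) { LOG.error("Failed to open file", e); }
LOG.info("Read complete");
'''

duplicates = find_duplicate_log_messages(SAMPLE)
print(duplicates)
```

A line-based regular expression only finds candidates. Deciding whether a duplicate is problematic or justifiable, as the thesis describes, requires looking at the surrounding code.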

Crossref

Concordia University Research Repository

Log4Perf: Suggesting and Updating Logging Locations for Web-based Systems' Performance Monitoring

Author: Yao Kundi
Publication date: 22/08/2018

Performance assurance activities are an essential step in the release cycle of software systems. Logs have become one of the most important sources of information used to monitor, understand, and improve software performance. However, developers often face the challenge of making logging decisions, i.e., neither logging too little nor logging too much is desirable. Although prior research has proposed techniques to assist in logging decisions, those automated logging guidance techniques are rather general and do not consider a particular goal, such as monitoring software performance. In this thesis, we present Log4Perf, an automated approach that suggests where to insert logging statements with the goal of monitoring the performance of web-based systems. In particular, our approach builds and manipulates a statistical performance model to identify the locations in the source code that statistically significantly influence software performance. To evaluate Log4Perf, we conduct case studies on open source systems, i.e., CloudStore and OpenMRS, and one large-scale commercial system. Our evaluation results show that Log4Perf can build well-fit statistical performance models, indicating that such models can be leveraged to investigate the influence of locations in the source code on performance. Also, the suggested logging locations are often small and simple methods that do not have logging statements and that are not performance hotspots, making our approach an ideal complement to traditional approaches that are based on software metrics or performance hotspots. In addition, we propose approaches that can suggest the need for updating logging locations when software evolves. After evaluating our approach, we manually examine the logging locations that are newly suggested or deprecated and identify seven root causes. Log4Perf is integrated into the release engineering process of the commercial software to provide logging suggestions on a regular basis.

Concordia University Research Repository

An Exploratory Study on the Characteristics of Logging Practices in Mobile Apps: A Case Study on F-Droid

Author: Zeng Yi
Publication date: 01/08/2019

Logging is a common practice in software engineering. Prior research has investigated the characteristics of logging practices in system software (e.g., web servers or databases) as well as desktop applications. However, despite the popularity of mobile apps, little is known about their logging practices. In this thesis, we sought to study logging practices in mobile apps. In particular, we conduct a case study on 1,444 open source Android apps in the F-Droid repository. Through a quantitative study, we find that although logging is less pervasive in mobile apps than in server and desktop applications, it is leveraged in almost all studied apps. However, we find considerable differences between the logging practices of mobile apps and those of server and desktop applications observed by prior studies. To further understand such differences, we conduct a firehouse email interview and a qualitative annotation on the rationale of using logs in mobile app development. By comparing the logging level of each logging statement with developers' rationale for using the logs, we find that all too often (35.4%), the chosen logging level and the rationale are inconsistent. Such inconsistency may prevent useful runtime information from being recorded or may generate unnecessary logs that cause performance overhead. Finally, to understand the magnitude of such performance overhead, we conduct a performance evaluation comparing generating all the logs with not generating any logs in eight mobile apps. In general, we observe a statistically significant performance overhead based on various performance metrics (response time, CPU, and battery consumption). In addition, we find that if the performance overhead of logging is significant in an app, disabling the unnecessary logs indeed provides a statistically significant performance improvement. Our results show the need for systematic guidance and automated tool support to assist in mobile logging practices.

Concordia University Research Repository

Use and misuse of the term "Experiment" in mining software repositories research

Author: Ayala Martínez Claudia Patricia
Franch Gutiérrez Javier
Juristo Juzgado Natalia
Turhan Burak
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/11/2022

The significant momentum and importance of Mining Software Repositories (MSR) in Software Engineering (SE) has fostered new opportunities and challenges for extensive empirical research. However, MSR researchers seem to struggle to characterize the empirical methods they use within the existing empirical SE body of knowledge. This is especially the case for MSR experiments. To provide evidence on the special characteristics of MSR experiments and their differences from experiments traditionally acknowledged in SE so far, we elicited the hallmarks that differentiate an experiment from other types of empirical studies and characterized the hallmarks and types of experiments in MSR. We analyzed MSR literature obtained from a small-scale systematic mapping study to assess the use of the term experiment in MSR. We found that 19% of the papers claiming to be an experiment are not experiments at all but rather observational studies, so they use the term in a misleading way. Of the remaining 81% of the papers, only one refers to a genuine controlled experiment, while the others are experiments with limited control. MSR researchers tend to overlook such limitations, compromising the interpretation of the results of their studies. We provide recommendations and insights to support the improvement of MSR experiments. This work has been partially supported by the Spanish project: MCI PID2020-117191RB-I00. Peer Reviewed. Postprint (author's final draft).

UPCommons. Portal del coneixement obert de la UPC

Automated Evolution of Feature Logging Statement Levels Using Git Histories and Degree of Interest

Author: Bagherzadeh Mehdi
Khatchadourian Raffi
Spektor Allan
Tang Yiming
Publication venue: CUNY Academic Works
Publication date: 15/04/2021

Logging, used for everything from system events and security breaches to more informational yet essential aspects of software features, is pervasive. Given the high transactionality of today’s software, logging effectiveness can be reduced by information overload. Log levels help alleviate this problem by associating a priority with logs that can later be filtered. As software evolves, however, levels of logs documenting surrounding feature implementations may also require modification, as features once deemed important may have decreased in urgency and vice versa. We present an automated approach that assists developers in evolving levels of such (feature) logs. The approach, based on mining Git histories and manipulating a degree of interest (DOI) model, transforms source code to revitalize feature log levels based on the “interestingness” of the surrounding code. Built upon JGit and Mylyn, the approach is implemented as an Eclipse IDE plug-in and evaluated on 18 Java projects with ~3 million lines of code and ~4K log statements. Our tool successfully analyzes 99.26% of logging statements, increases log level distributions by ~20%, identifies logs manually modified with a recall of ~80% and a level-direction match rate of ~87%, and increases the focus of logs in bug fix contexts ~83% of the time. Moreover, pull (patch) requests were integrated into large and popular open-source projects. The results indicate that the approach is promising in assisting developers in evolving feature log levels.

arXiv.org e-Print Archive

City University of New York

MobiLogLeak: A Study on Data Leakage Caused by Poor Logging Practices

Author: Zhou Rui
Publication date: 16/07/2020

Logging is an essential software practice that is used by developers to debug, diagnose, and audit software systems. Despite the advantages of logging, poor logging practices can potentially leak sensitive data. The problem of data leakage is more severe in applications that run on mobile devices, since these devices carry sensitive identification information ranging from physical device identifiers (e.g., IMEI, MAC address) to communications network identifiers (e.g., SIM, IP, Bluetooth ID), and application-specific identifiers related to the location and accounts of users. This study explores the impact of logging practices on the leakage of such sensitive information. In particular, we investigate whether logs inserted into application code could lead to data leakage. While studying logging practices in mobile applications is an active research area, to our knowledge, this is the first study that explores the interplay between logging and security in the context of mobile applications for Android. We propose an approach called MobiLogLeak that identifies log statements in deployed apps that leak sensitive data. MobiLogLeak relies on taint flow analysis. Among 5,000 Android apps that we studied, we found that 200 apps leak sensitive data through logging.

Concordia University Research Repository

Two Techniques For Automated Logging Statement Evolution

Author: Spektor Allan R
Publication venue: CUNY Academic Works
Publication date: 21/07/2020

This thesis presents and explores two techniques for automated logging statement evolution. The first technique reinvigorates logging statement levels to reduce information overload using degree of interest obtained via software repository mining. The second technique converts legacy method calls to deferred execution to achieve performance gains, eliminating unnecessary evaluation overhead.
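
The idea of deferred execution in logging can be illustrated with a small Python analogue. This is a sketch only: the abstract does not say which language or logging API the thesis targets, and the `Lazy` wrapper, `ListHandler`, and `expensive_summary` below are hypothetical names introduced for the example. The point is that an eagerly built message pays for its arguments even when the level is disabled, while a deferred one does not.

```python
import logging


class Lazy:
    """Wrap a zero-argument callable; call it only if the message is formatted."""

    def __init__(self, supplier):
        self._supplier = supplier

    def __str__(self):
        return str(self._supplier())


class ListHandler(logging.Handler):
    """Collect formatted messages in memory so the example prints nothing itself."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


calls = []


def expensive_summary():
    # Stand-in for a costly computation; records each invocation.
    calls.append("called")
    return "summary"


handler = ListHandler()
logger = logging.getLogger("deferred_demo")
logger.setLevel(logging.INFO)  # DEBUG messages are disabled
logger.addHandler(handler)
logger.propagate = False

# Eager: the f-string is built before the level check, so the call happens anyway.
logger.debug(f"state: {expensive_summary()}")
eager_calls = len(calls)

# Deferred: the level check fails first, so the supplier is never called.
logger.debug("state: %s", Lazy(expensive_summary))
deferred_calls = len(calls) - eager_calls

# When the level is enabled, the deferred argument is evaluated at format time.
logger.info("state: %s", Lazy(expensive_summary))
```

In this sketch the eager `debug` call invokes `expensive_summary` once even though nothing is logged, while the deferred `debug` call invokes it zero times.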

City University of New York