
    30 Years of Software Refactoring Research: A Systematic Literature Review

    Due to the growing complexity of software systems, the last ten years have seen a dramatic increase in industry demand for software refactoring tools and techniques. Refactoring is traditionally defined as a set of program transformations intended to improve the system design while preserving the behavior. Refactoring studies have expanded beyond code-level restructuring: they are applied at different levels (architecture, model, requirements, etc.), adopted in many domains beyond the object-oriented paradigm (cloud computing, mobile, web, etc.), used in industrial settings, and aimed at objectives beyond improving the design, including other non-functional requirements (e.g., performance, security, etc.). Thus, the challenges addressed by refactoring work nowadays go beyond code transformation and include, but are not limited to, scheduling the opportune time to carry out refactoring, recommending specific refactoring activities, detecting refactoring opportunities, and testing the correctness of applied refactorings. As a result, refactoring research efforts are fragmented over several research communities, various domains, and objectives. To structure the field and existing research results, this paper provides a systematic literature review that analyzes 3183 research papers on refactoring covering the last three decades, offering the most scalable and comprehensive literature review of existing refactoring research studies. Based on this survey, we created a taxonomy to classify the existing research, identified research trends, and highlighted gaps in the literature and avenues for further research.


    Modeling and Simulating Causal Dependencies on Process-aware Information Systems from a Cost Perspective

    Providing effective IT support for business processes has become crucial for enterprises to stay competitive in their market. Business processes must be defined, implemented, enacted, monitored, and continuously adapted to changing situations. Process life cycle support and continuous process improvement become critical success factors in contemporary and future enterprise computing. In this context, process-aware information systems (PAISs) adopt a key role. Here, organization-specific and generic process support systems are distinguished. In the former case, the PAIS is built "from scratch" and incorporates organization-specific information about the structure and processes to be supported. In the latter case, the PAIS does not contain any information about the structure and processes of a particular organization. Instead, an organization needs to configure the PAIS by specifying processes, organizational entities, and business objects. To enable the realization of PAISs, numerous process support paradigms, process modeling standards, and business process management tools have been introduced. The application of these approaches in PAIS engineering projects is influenced not only by technological, but also by organizational and project-specific factors. Between these factors there exist numerous causal dependencies, which, in turn, often lead to complex and unexpected effects in PAIS engineering projects. In particular, the costs of PAIS engineering projects are significantly influenced by these causal dependencies. What is therefore needed is a comprehensive approach enabling PAIS engineers to systematically investigate these causal dependencies as well as their impact on the costs of PAIS engineering projects. Existing economic-driven IT evaluation and software cost estimation approaches, however, are unable to take into account causal dependencies and the resulting effects. In response, this thesis introduces the EcoPOST framework. This framework utilizes evaluation models to describe the interplay of technological, organizational, and project-specific evaluation factors, and simulation concepts to unfold the dynamic behavior of PAIS engineering projects. The EcoPOST framework also supports the reuse of evaluation models based on a library of generic, predefined evaluation patterns, and it provides governing guidelines (e.g., model design guidelines) which ease the transfer of the EcoPOST framework into practice. Tool support is available as well. Finally, we present the results of two online surveys, three case studies, and one controlled software experiment. Based on these empirical and experimental research activities, we validate the evaluation concepts underlying the EcoPOST framework and demonstrate its practical applicability.
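
    As a rough illustration of the simulation idea (a minimal sketch under invented assumptions, not the EcoPOST implementation; all names and numbers are illustrative), the following Python fragment unfolds one causal dependency over time: accumulated process knowledge lowering the weekly cost rate of a PAIS engineering project.

        # Illustrative stock-and-flow sketch: a causal dependency (growing
        # process knowledge reducing the weekly cost rate) unfolded over
        # time by simple numerical simulation. Parameters are assumptions.
        def simulate(weeks=52, base_cost=10_000.0, learning_rate=0.05):
            knowledge = 0.0      # stock: accumulated process knowledge (0..1)
            total_cost = 0.0     # accumulated project cost
            for _ in range(weeks):
                # flow: knowledge grows toward saturation each week
                knowledge += learning_rate * (1.0 - knowledge)
                # causal dependency: more knowledge lowers the cost rate
                total_cost += base_cost * (1.0 - 0.4 * knowledge)
            return total_cost

        print(f"Estimated project cost: {simulate():,.0f}")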

    Utilizing traceable software artifacts to improve bug localization

    The development of software systems is a very complex task. Quality assurance tries to prevent defects (software bugs) in deployed systems, but it is impossible to avoid bugs altogether, especially during development. Once a bug is observed, a bug report is typically written. It guides the responsible developer to locate the bug in the project's source code and, once found, to fix it. Bug reports, along with other development artifacts such as requirements and the source code itself, are stored in software repositories. These repositories also allow relationships, called trace links, to be created among the contained artifacts. Establishing this traceability is demanded in many domains, such as safety-related ones like the automotive and aviation industries, or the development of medical devices. However, in large software systems with thousands of artifacts, especially source code files, manually locating a bug is time-consuming, error-prone, and requires extensive knowledge of the project. Thus, automating the bug localization process has been actively researched for many years. Further, manually creating and maintaining trace links is often considered a burden, and there is a need to automate this task as well.
    Multiple studies have shown that traceability is beneficial for many software development tasks. This thesis presents a novel bug localization algorithm utilizing traceability. The project's artifacts and trace links are used to create a traceability graph. This graph is then analyzed to locate defective source code files for a given bug report. Since the existing trace link set of a project is possibly incomplete, another algorithm is proposed to augment missing links. The algorithm is fully automated, project-independent, and derived from a project's typical development workflow. An evaluation on more than 32,000 bug reports from 27 open-source projects shows that incorporating traceability information into bug localization significantly improves bug localization performance compared to two state-of-the-art algorithms. Further, the trace link augmentation approach reliably constructs missing links and therefore simplifies the required trace maintenance.
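
    To make the idea concrete (a minimal sketch with an invented hop weight and naive token matching, not the thesis algorithm), the following Python fragment ranks source files for a bug report by combining the report's textual overlap with each file and with the artifacts reachable over one trace link hop.

        # Illustrative traceability-based bug localization sketch: source
        # files are ranked by the bug report's token overlap with the file
        # itself plus (down-weighted) overlap with linked artifacts.
        def tokens(text):
            return set(text.lower().split())

        def rank_files(bug_report, artifacts, trace_links, files):
            """artifacts: id -> text; trace_links: id -> set of linked ids;
            files: the artifact ids that are source files."""
            bug = tokens(bug_report)
            scores = {}
            for f in files:
                direct = len(bug & tokens(artifacts[f]))
                # propagate evidence one hop along trace links
                hop = sum(len(bug & tokens(artifacts[n]))
                          for n in trace_links.get(f, set()))
                scores[f] = direct + 0.5 * hop  # hop weight is an assumption
            return sorted(scores, key=scores.get, reverse=True)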

    Leveraging Machine Learning to Identify Quality Issues in the Medicaid Claim Adjudication Process

    Medicaid is the largest health insurance program in the U.S. It provides health coverage to over 68 million individuals, costs the nation over $600 billion a year, and is subject to improper payments (fraud, waste, and abuse) and inaccurate payments (claims processed erroneously). Medicaid programs partially use Fee-For-Service (FFS) to provide coverage to beneficiaries by adjudicating claims, and they leverage traditional inferential statistics to verify the quality of adjudicated claims. These quality methods only provide an interval estimate of the quality errors and are incapable of detecting most claim adjudication errors, representing a potential opportunity cost of millions of dollars. This dissertation studied a method of applying supervised learning to detect erroneous payments in the entire population of adjudicated claims in each Medicaid Management Information System (MMIS), focusing on two specific claim types: inpatient and outpatient. A synthesized source of adjudicated claims generated by the Centers for Medicare & Medicaid Services (CMS) was used to create the original dataset. Quality reports from California's FFS Medicaid program were used to extract the underlying statistical pattern of claim adjudication errors and to label the data, using goodness-of-fit and Anderson-Darling tests. Principal Component Analysis (PCA) and business knowledge were applied for dimensionality reduction, resulting in the selection of sixteen (16) features for the outpatient and nineteen (19) features for the inpatient claims models. Ten (10) supervised learning algorithms were trained and tested on the labeled data: Decision Tree with two configurations (Entropy and Gini), Random Forest with two configurations (Entropy and Gini), Naïve Bayes, K-Nearest Neighbors, Logistic Regression, Neural Network, Discriminant Analysis, and Gradient Boosting. Five-fold cross-validation and event-based sampling were applied during the training process (with oversampling using the SMOTE method and stratification within oversampling). The prediction power (Gini importance) of the selected features was measured using the Mean Decrease in Impurity (MDI) method across three algorithms. A one-way ANOVA and Tukey and Fisher LSD pairwise comparisons were conducted. Results show that the Claim Payment Amount feature significantly outperforms the rest in prediction power (highest mean F-value for Gini importance at the α = 0.05 significance level) for both claim types. Finally, recall and F1-score were measured for all algorithms on both claim types (inpatient and outpatient), with and without oversampling. A one-way ANOVA and Tukey and Fisher LSD pairwise comparisons were conducted. The results show a statistically significant difference in the algorithms' performance in detecting quality issues in the outpatient and inpatient claims. Gradient Boosting and Decision Tree (with various configurations and sampling strategies) outperform the rest of the algorithms in recall and F1-score on both datasets. Logistic Regression shows better recall on the outpatient than on the inpatient data, and Naïve Bayes performs considerably better in recall and F1-score on outpatient data. Medicaid FFS programs and consultants, Medicaid administrators, and researchers could use this study to develop machine learning models that detect quality issues in Medicaid FFS claim datasets at scale, potentially saving millions of dollars.
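
    A pipeline of the kind described can be sketched as follows (synthetic data and an assumed feature count; this is not the dissertation's code): SMOTE oversampling inside a stratified five-fold cross-validation of a gradient boosting classifier, followed by MDI (Gini) feature importances.

        # Illustrative sketch of the described pipeline: SMOTE oversampling,
        # gradient boosting, stratified 5-fold CV, and MDI importances.
        import numpy as np
        from imblearn.over_sampling import SMOTE
        from imblearn.pipeline import Pipeline
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.model_selection import StratifiedKFold, cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 16))            # e.g., 16 outpatient features
        y = (rng.random(1000) < 0.05).astype(int)  # rare adjudication errors

        pipe = Pipeline([
            ("smote", SMOTE(random_state=0)),      # oversample minority class
            ("gb", GradientBoostingClassifier(random_state=0)),
        ])
        cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
        print("recall:", cross_val_score(pipe, X, y, cv=cv, scoring="recall").mean())

        pipe.fit(X, y)                             # MDI (Gini) importances
        print("top feature:", np.argmax(pipe.named_steps["gb"].feature_importances_))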

    Software Usability

    This volume delivers a collection of high-quality contributions intended to broaden the minds of developers and non-developers alike when it comes to considering software usability. It presents novel research and experiences and disseminates new ideas that are accessible to people who might not be software makers but who are undoubtedly software users.

    The Software Vulnerability Ecosystem: Software Development In The Context Of Adversarial Behavior

    Software vulnerabilities are the root cause of many computer system security failures. This dissertation addresses software vulnerabilities in the context of a software lifecycle, with a particular focus on three stages: (1) improving software quality during development; (2) pre-release bug discovery and repair; and (3) revising software as vulnerabilities are found. The question I pose regarding software quality during development is whether long-standing software engineering principles and practices such as code reuse help or hurt with respect to vulnerabilities. Using a novel data-driven analysis of large databases of vulnerabilities, I show the surprising result that software quality and software security are distinct. Most notably, the analysis uncovered a counterintuitive phenomenon, namely that newly introduced software enjoys a period with no vulnerability discoveries, and further that this "Honeymoon Effect" (a term I coined) is well explained by the unfamiliarity of the code to malicious actors. An important consequence for code reuse, intended to raise software quality, is that the protection inherent in the delayed discovery of vulnerabilities in new code is reduced. The second question I pose concerns the predictive power of this effect. My experimental design exploited a large-scale open source software system, Mozilla Firefox, in which two development methodologies are pursued in parallel, making the methodology the sole variable in outcomes. Comparing the methodologies using a novel synthesis of data from vulnerability databases, I find that the rapid-release cycles used in agile software development (in which new software is introduced frequently) have a vulnerability discovery rate equivalent to that of conventional development. Finally, I pose the question of the relationship between the intrinsic security of software, stemming from design and development, and the ecosystem into which the software is embedded and in which it operates. I use the early development lifecycle to examine this question, and again use vulnerability data as the means of answering it. In a purely intrinsic model, defect discovery rates should decrease as software matures, making vulnerabilities increasingly rare. The data, which show that vulnerability rates increase after a delay, contradict this. Software security must therefore be modeled including extrinsic factors, thus comprising an ecosystem.
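
    The Honeymoon Effect lends itself to a simple measurement sketch (invented version names and dates, shown only to make the metric concrete; this is not the dissertation's analysis): the delay between a version's release and its first disclosed vulnerability.

        # Illustrative sketch: quantify the honeymoon period as days from
        # release to first vulnerability disclosure. All data is invented.
        from datetime import date

        releases = {"app-1.0": date(2020, 1, 15), "app-2.0": date(2021, 3, 1)}
        first_cve = {"app-1.0": date(2020, 9, 2), "app-2.0": date(2021, 5, 20)}

        for version, released in releases.items():
            honeymoon = (first_cve[version] - released).days
            print(f"{version}: {honeymoon} days until first disclosure")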

    Advances in Robotics, Automation and Control

    The book presents an excellent overview of recent developments in the different areas of robotics, automation, and control. Through its 24 chapters, the book covers topics related to control and robot design; it also introduces new mathematical tools and techniques devoted to improving system modeling and control. An important theme is the use of rational agents and heuristic techniques to cope with the computational complexity required for controlling complex systems. The book also covers navigation and vision algorithms, automatic handwriting comprehension, and speech recognition systems that will be included in the next generation of production systems.

    Explainable, Security-Aware and Dependency-Aware Framework for Intelligent Software Refactoring

    As software systems continue to grow in size and complexity, their maintenance becomes more challenging and costly. Even for the most technologically sophisticated and competent organizations, building and maintaining high-performing software applications with high-quality code is an extremely challenging and expensive endeavor. Software refactoring is widely recognized as the key component for maintaining high-quality software by restructuring existing code and reducing technical debt. However, refactoring is difficult to achieve and often neglected due to several limitations in existing refactoring techniques that reduce their effectiveness. These limitations include, but are not limited to, detecting refactoring opportunities, recommending specific refactoring activities, and explaining the recommended changes. Existing techniques mainly focus on the use of quality metrics such as coupling, cohesion, and the Quality Metrics for Object-Oriented Design (QMOOD). However, there are many other factors identified in this work to assist and facilitate different maintenance activities for developers:
    1. To structure the refactoring field and existing research results, this dissertation provides the most scalable and comprehensive systematic literature review, analyzing the results of 3183 research papers on refactoring covering the last three decades. Based on this survey, we created a taxonomy to classify the existing research, identified research trends, and highlighted gaps in the literature for further research.
    2. To draw attention to what should be the current refactoring research focus from the developers' perspective, we carried out the first large-scale refactoring study on the most popular online Q&A forum for developers, Stack Overflow. We collected and analyzed posts to identify what developers ask about refactoring and the challenges that practitioners face when refactoring software systems.
    3. To improve the detection of refactoring opportunities in terms of quality and security in the context of mobile apps, we designed a framework that recommends the files to be refactored based on user reviews. We also considered the detection of refactoring opportunities in the context of web services, proposing a machine learning-based approach that helps service providers and subscribers predict the quality of service at the least cost. Furthermore, to help developers accurately assess the quality of their software systems and decide whether the code should be refactored, we propose a clustering-based approach that automatically identifies the preferred benchmark to use for the quality assessment of a project.
    4. Regarding the refactoring generation process, we proposed different techniques to enhance the change operators and seeding mechanism by using the history of applied refactorings and incorporating refactoring dependencies, in order to improve the quality of the refactoring solutions. We also introduced the security aspect when generating refactoring recommendations, by investigating the possible impact of improving different quality attributes on a set of security metrics and finding the best trade-off between them (a simplified sketch of such a trade-off search follows this abstract). In another approach, we recommend refactorings that prioritize fixing quality issues in security-critical files, improve quality attributes, and remove code smells.
    All the above contributions were validated at large scale on thousands of open-source and industry projects in collaboration with industry partners and the open-source community. The contributions of this dissertation are integrated into a cloud-based refactoring framework which is currently used by practitioners.
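
    The quality/security trade-off mentioned in contribution 4 can be illustrated with a minimal sketch (the refactoring names and metric values below are invented, and the framework itself uses far richer search techniques): candidate refactoring solutions are filtered by Pareto dominance over a quality gain and a security gain.

        # Illustrative multi-objective selection sketch: keep the candidate
        # refactorings that are not Pareto-dominated on (quality, security).
        def dominates(a, b):
            """a, b: (quality_gain, security_gain); higher is better."""
            return all(x >= y for x, y in zip(a, b)) and a != b

        candidates = {
            "extract_class_A": (0.8, 0.2),
            "move_method_B": (0.5, 0.6),
            "inline_temp_C": (0.3, 0.1),
        }
        pareto = [name for name, score in candidates.items()
                  if not any(dominates(other, score)
                             for o, other in candidates.items() if o != name)]
        print("non-dominated refactorings:", pareto)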

    Combining SOA and BPM Technologies for Cross-System Process Automation

    This paper summarizes the results of an industry case study that introduced a cross-system business process automation solution based on a combination of SOA and BPM standard technologies (i.e., BPMN, BPEL, WSDL). Besides discussing major weaknesses of the existing custom-built solution and comparing them against experiences with the developed prototype, the paper presents a course of action for transforming the current solution into the proposed one. This includes a general approach, consisting of four distinct steps, as well as specific action items to be performed for every step. The discussion also covers language and tool support as well as challenges arising from the transformation.