Analysis and Detection of Information Types of Open Source Software Issue Discussions
Most modern Issue Tracking Systems (ITSs) for open source software (OSS)
projects allow users to add comments to issues. Over time, these comments
accumulate into discussion threads embedded with rich information about the
software project, which can potentially satisfy the diverse needs of OSS
stakeholders. However, discovering and retrieving relevant information from the
discussion threads is a challenging task, especially when the discussions are
lengthy and the number of issues in ITSs is vast. In this paper, we address
this challenge by identifying the information types presented in OSS issue
discussions. Through qualitative content analysis of 15 complex issue threads
across three projects hosted on GitHub, we uncovered 16 information types and
created a labeled corpus containing 4656 sentences. Our investigation of
supervised, automated classification techniques indicated that, when prior
knowledge about the issue is available, Random Forest can effectively detect
most sentence types using conversational features such as the sentence length
and its position. When classifying sentences from new issues, Logistic
Regression can yield satisfactory performance using textual features for
certain information types, while falling short on others. Our work represents a
nontrivial first step towards tools and techniques for identifying and
obtaining the rich information recorded in the ITSs to support various software
engineering activities and to satisfy the diverse needs of OSS stakeholders.
Comment: 41st ACM/IEEE International Conference on Software Engineering (ICSE 2019)
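The classification setup the abstract describes (conversational features such as sentence length and thread position fed to a Random Forest) is standard; a minimal scikit-learn sketch of that setup might look like the following, where the feature set, labels, and data are invented placeholders rather than the paper's corpus:

```python
# Minimal sketch of sentence-type classification with conversational
# features, loosely following the setup described in the abstract.
# Features, labels, and data are hypothetical placeholders.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Each row: [sentence_length, relative_position_in_thread]
X = [
    [12, 0.0], [45, 0.1], [8, 0.2], [30, 0.5],
    [22, 0.7], [5, 0.9], [60, 0.3], [15, 1.0],
]
# Hypothetical information-type labels, one per sentence.
y = ["description", "solution", "question", "solution",
     "meta", "question", "description", "meta"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test), zero_division=0))
```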
Improved Management Understanding of Research Through Concepts and Preliminary Studies for Empirical Problem Solving
In the process of job management, many problems are faced, so good management that can provide problem solving is needed. Problem solving is done by conducting research on objects, in order to produce quality management. Research is a way to objectively seek truth, where truth here is not only obtained conceptually or deductively, but must also be tested empirically. The purpose of this paper is to provide an understanding of 10 (ten) research fundamentals, as a form of research management, namely: the understanding of research; research; research steps; motivation and research objectives; research processes; characteristics of research; preliminary studies; the benefits and objectives of preliminary studies; how to conduct preliminary studies; and the concepts of research methods and methodologies.
JavaScript Dead Code Identification, Elimination, and Empirical Assessment
Web apps are built by using a combination of HTML, CSS, and JavaScript. While
building modern web apps, it is common practice to make use of third-party
libraries and frameworks, so as to improve developers' productivity and code
quality. Alongside these benefits, the adoption of such libraries results in
the introduction of JavaScript dead code, i.e., code implementing unused
functionalities. The costs for downloading and parsing dead code can negatively
contribute to the loading time and resource usage of web apps. The goal of our
study is two-fold. First, we present Lacuna, an approach for automatically
detecting and eliminating JavaScript dead code from web apps. The proposed
approach supports both static and dynamic analyses; it is extensible and can be
applied to any JavaScript code base, without imposing constraints on the coding
style or on the use of specific JavaScript constructs. Second, by leveraging
Lacuna we conduct an experiment to empirically evaluate the run-time overhead
of JavaScript dead code in terms of energy consumption, performance, network
usage, and resource usage in the context of mobile web apps. We applied Lacuna
four times on 30 mobile web apps independently developed by third-party
developers, each time eliminating dead code according to a different
optimization level provided by Lacuna. Afterward, each different version of the
web app is executed on an Android device, while collecting measures to assess
the potential run-time overhead caused by dead code. Experimental results,
among others, highlight that the removal of JavaScript dead code has a positive
impact on the loading time of mobile web apps, while significantly reducing the
number of bytes transferred over the network.
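Lacuna's actual analyses are not reproduced here. As a minimal, hypothetical sketch of the core idea behind its static side, namely computing reachability over a function call graph and flagging unreachable functions as dead code, consider the following (the call graph and entry points are invented):

```python
# Minimal sketch of static dead-code detection via call-graph
# reachability. The graph below is a hypothetical stand-in for one
# extracted from a JavaScript code base; functions unreachable from
# any entry point are candidates for elimination.
from collections import deque

def find_dead_functions(call_graph, entry_points):
    """Return functions not reachable from any entry point."""
    reachable = set()
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        if fn in reachable:
            continue
        reachable.add(fn)
        queue.extend(call_graph.get(fn, []))
    return set(call_graph) - reachable

# Hypothetical call graph: function -> functions it calls.
call_graph = {
    "main": ["render", "fetchData"],
    "render": ["formatDate"],
    "fetchData": [],
    "formatDate": [],
    "legacyHelper": ["formatDate"],  # never called from an entry point
}

print(find_dead_functions(call_graph, entry_points=["main"]))
# -> {'legacyHelper'}
```

A dynamic analysis, by contrast, would record which functions actually execute at runtime and treat the never-executed ones as dead; combining both signals is what an approach like the one described above would need to do.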
Refining GPT-3 Embeddings with a Siamese Structure for Technical Post Duplicate Detection
One goal of technical online communities is to help developers find the right
answer in one place. A single question can be asked in different ways with
different wordings, leading to the existence of duplicate posts on technical
forums. The question of how to discover and link duplicate posts has garnered
the attention of both developer communities and researchers. For example, Stack
Overflow adopts a voting-based mechanism to mark and close duplicate posts.
However, addressing these constantly emerging duplicate posts in a timely
manner continues to pose challenges. Therefore, various approaches have been
proposed to detect duplicate posts on technical forums automatically. The
existing methods suffer from limitations, either due to their reliance on
handcrafted similarity metrics, which cannot sufficiently capture the semantics
of posts, or due to their lack of supervision to improve performance.
Additionally, the efficiency of these methods is hindered by their dependence
on pair-wise feature generation, which can be impractical for large amounts of
data. In this work, we attempt to employ and refine the GPT-3 embeddings for
the duplicate detection task. We assume that the GPT-3 embeddings can
accurately represent the semantics of the posts. In addition, by training a
Siamese-based network based on the GPT-3 embeddings, we obtain a latent
embedding that accurately captures the duplicate relation in technical forum
posts. Our experiment on a benchmark dataset confirms the effectiveness of our
approach and demonstrates superior performance compared to baseline methods.
When applied to the dataset we constructed with a recent Stack Overflow dump,
our approach attains a Top-1, Top-5, and Top-30 accuracy of 23.1%, 43.9%, and
68.9%, respectively. With a manual study, we confirm our approach's potential
of finding unlabelled duplicates on technical forums.
Comment: SANER 202
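The abstract does not spell out the architecture. As a rough sketch of the general technique, a Siamese network that refines precomputed embeddings with a contrastive objective so that duplicate posts map close together, something like the following could serve; the embedding dimension, layer sizes, loss, and data are assumptions, and random vectors stand in for the GPT-3 embeddings:

```python
# Minimal sketch of a Siamese network over precomputed text embeddings,
# trained with a contrastive loss so duplicate pairs map close together.
# Embedding dimension, architecture, and data are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, in_dim=1536, out_dim=256):
        super().__init__()
        # A small MLP shared by both branches of the Siamese pair.
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(), nn.Linear(512, out_dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

def contrastive_loss(z1, z2, label, margin=0.5):
    """label = 1 for duplicate pairs, 0 for non-duplicates."""
    dist = 1.0 - F.cosine_similarity(z1, z2)
    return (label * dist.pow(2) +
            (1 - label) * F.relu(margin - dist).pow(2)).mean()

# Stand-in for GPT-3 embeddings of a batch of 32 post pairs.
a, b = torch.randn(32, 1536), torch.randn(32, 1536)
labels = torch.randint(0, 2, (32,)).float()

model = SiameseEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = contrastive_loss(model(a), model(b), labels)
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```

Because both posts pass through the same encoder, a single forward pass per post yields an embedding that can be indexed for nearest-neighbour lookup, which avoids the pair-wise feature generation the abstract identifies as a bottleneck.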
Elaborating validation scenarios based on the context analysis and combinatorial method: Example of the power-efficiency framework Innometrics
This research project is carried out under the support of the Russian Science Foundation, Grant No. 19-19-00623.
The preliminary task of a project consists of the definition of the scenarios that will guide further development work and validate the results. In this paper, we present an approach for the systematic generation of validation scenarios using a specifically developed taxonomy and combinatorial testing. We applied this approach to our research project for the development of the energy-efficiency evaluation framework named Innometrics. We described in detail all steps for taxonomy creation, generation of abstract validation scenarios, and identification of relevant industrial and academic case studies. We created the taxonomy of the target computer systems and then elaborated test cases using combinatorial testing. The classification criteria were the type of the system, its purpose, enabling hardware components and connectivity technologies, basic design patterns, programming language, and development lifecycle. The combinatorial testing resulted in 13 cases for one-way test coverage, which was considered enough to create a comprehensive test suite. We defined a case study for each particular scenario. These case studies represent real industrial, educational, and open-source software development projects that will be used in further work on the Innometrics project.
Ciancarini P.; Kruglov A.; Sadovykh A.; Succi G.; Zuev E.
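As a small illustration of the one-way ("each choice") coverage the abstract mentions, where every value of every classification criterion appears in at least one scenario and the number of scenarios therefore equals the size of the largest criterion domain, a minimal sketch might look like this (the taxonomy values are invented placeholders, not the paper's):

```python
# Minimal sketch of one-way ("each choice") combinatorial coverage:
# every value of every parameter appears in at least one test case,
# so the number of cases equals the largest parameter domain.
# The taxonomy values below are hypothetical placeholders.
from itertools import cycle

def each_choice_cases(parameters):
    n_cases = max(len(values) for values in parameters.values())
    iters = {name: cycle(values) for name, values in parameters.items()}
    return [{name: next(it) for name, it in iters.items()}
            for _ in range(n_cases)]

taxonomy = {
    "system_type": ["desktop", "mobile", "embedded"],
    "language": ["Java", "Python", "C++", "JavaScript"],
    "lifecycle": ["agile", "waterfall"],
}

for i, case in enumerate(each_choice_cases(taxonomy), 1):
    print(i, case)
# 4 cases suffice here: the largest domain ("language") has 4 values.
```

With the paper's own taxonomy, the largest criterion domain would presumably have had 13 values, which is consistent with the 13 one-way cases it reports.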
Data Science, Data Visualization, and Digital Twins
Real-time, web-based, and interactive visualisations have proven to be outstanding methodologies and tools in numerous fields when knowledge of sophisticated data science and visualisation techniques is available. The rationale is that modern data science analytical approaches, such as machine/deep learning and artificial intelligence, as well as digital twinning, promise to give data insights, enable informed decision-making, and facilitate rich interactions among stakeholders. The benefits of data visualisation, data science, and digital twinning technologies motivate this book, which exhibits and presents numerous developed and advanced data science and visualisation approaches. Chapters cover such topics as deep learning techniques, web and dashboard-based visualisations during the COVID pandemic, 3D modelling of trees for mobile communications, digital twinning in the mining industry, data science libraries, and potential areas of future data science development.
Software Support for Discourse-Based Textual Information Analysis: A Systematic Literature Review and Software Guidelines in Practice
The intrinsic characteristics of humanities research require technological support and software assistance that also necessarily goes through the analysis of textual narratives. When these narratives become increasingly complex, pragmatics analysis (i.e., at the discourse or argumentation level) assisted by software is a great ally in the digital humanities. In recent years, solutions have been developed from the information visualization domain to support discourse analysis or argumentation analysis of textual sources via software, with applications in political speeches, debates, and online forums, but also in written narratives, literature, and historical sources. This paper presents a wide and interdisciplinary systematic literature review (SLR), both in software-related areas and humanities areas, on the information visualization and software solutions adopted to support pragmatics textual analysis. As a result of this review, this paper detects weaknesses in existing works in the field, especially related to the availability of solutions, dependence on specific pragmatic frameworks, and the lack of software mechanisms for sharing and reusing information. The paper also provides software guidelines for addressing the detected weaknesses, exemplifying some guidelines in practice through their implementation in a new web tool, Viscourse. Viscourse is conceived as a complementary tool to assist textual analysis and to facilitate the reuse of informational pieces from discourse and argumentation text analysis tasks.
Ministerio de Economía, Industria y Competitividad; FJCI-2016-6 28032. Ministerio de Ciencia, Innovación y Universidades; RTI2018-093336-B-C2
Evaluating and improving web performance using free-to-use tools
Fast website loading speeds can increase conversion rates and search engine rankings, as well as encourage users to explore the site further, among other positive things. The purpose of the study was to find and compare free-to-use tools that can both evaluate the performance (loading and rendering speed) of a website and give suggestions on how the performance could be improved. In addition, three tools were used to evaluate the performance of an existing WordPress site. Some of the performance improvement suggestions given by the tools were then acted upon, and the performance of the website was re-evaluated using the same tools. The research method used in the study was experimental research, and the research question was “How to evaluate and improve web performance using free-to-use tools?” There were also five sub-questions, of which the first two related to the tools and their features, and the last three to the case website.
Eight free-to-use web performance evaluation tools were compared, focusing on what performance metrics they evaluate, what performance improvement suggestions they can give, and six other features that can be useful to know in practice. In alphabetical order, the tools were: GTmetrix, Lighthouse, PageSpeed Insights, Pingdom Tools, Test My Site, WebPageTest, Website Speed Test (by Dotcom-Tools) and Website Speed Test (by Uptrends). The number of metrics evaluated by the tools ranged from one to fifteen. The performance improvement suggestions given by the tools could be put into three categories, meaning that the suggestions largely overlapped between the tools. All tools except Lighthouse were web-based.
The performance of the case website was evaluated using GTmetrix, PageSpeed Insights and WebPageTest. On desktop, the performance was in the high-end range, though varying between the three tools; on mobile, the performance was noticeably slower due to the challenges of mobile devices (e.g. lower processing power compared to desktop computers) and mobile networks (e.g. higher latency compared to broadband connections). The common bottlenecks, based on the suggestions given by the three tools, seemed to be the lack of a CDN (Content Delivery Network), serving unoptimized images and serving large amounts of JavaScript. The results of the performance re-evaluation were mixed, highlighting the importance of carefully considering each performance improvement suggestion.
The main takeaways of the study for practitioners are to use multiple tools to get a wide variety of performance metrics and suggestions, and to regard the suggestions and relative performance scores given by the tools only as guidelines, with the main goal being to improve the time-based performance metrics.
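As a concrete, hedged illustration of scripting such an evaluation (the study itself used the tools interactively), the public PageSpeed Insights v5 API, which runs Lighthouse under the hood, can be queried over HTTP. The endpoint and response fields below follow the documented v5 API, while the target URL and the selection of metrics are illustrative:

```python
# Minimal sketch of querying the PageSpeed Insights v5 API (which runs
# Lighthouse) to fetch performance metrics for a page. The target URL
# is a placeholder; an API key is optional for light, ad-hoc use.
import requests

API = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

def fetch_performance(url, strategy="mobile"):
    resp = requests.get(API, params={"url": url, "strategy": strategy})
    resp.raise_for_status()
    lh = resp.json()["lighthouseResult"]
    score = lh["categories"]["performance"]["score"]  # 0.0 to 1.0
    metrics = {key: lh["audits"][key]["displayValue"]
               for key in ("first-contentful-paint",
                           "largest-contentful-paint",
                           "speed-index")}
    return score, metrics

score, metrics = fetch_performance("https://example.com")
print(f"Performance score: {score:.0%}")
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Running the same script with `strategy="desktop"` and `strategy="mobile"`, and comparing the output against other tools, mirrors the study's advice to consult multiple sources and focus on the time-based metrics.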