A Taxonomy for Mining and Classifying Privacy Requirements in Issue Reports
Digital and physical footprints are trails of user activities collected through
the use of software applications and systems. As software becomes ubiquitous,
protecting user privacy has become challenging. With increasing user privacy
awareness and the advent of privacy regulations and policies, there is an
emerging need to implement software systems that enhance the protection of
personal data processing. However, existing privacy regulations and policies
only provide high-level principles, which makes it difficult for software
engineers to design and implement privacy-aware systems. In this paper, we develop a
taxonomy that provides a comprehensive set of privacy requirements based on two
well-established and widely-adopted privacy regulations and frameworks, the
General Data Protection Regulation (GDPR) and the ISO/IEC 29100. These
requirements are refined to a level that is implementable and easy for
software engineers to understand, thus supporting them in attending to
existing regulations and standards. We have also performed a study on how two large
open-source software projects (Google Chrome and Moodle) address the privacy
requirements in our taxonomy through mining their issue reports. The paper
discusses how the collected issues were classified, and presents the findings
and insights generated from our study.Comment: Submitted to IEEE Transactions on Software Engineering on 23 December
202
Studying the association between Gitcoin's issues and resolving outcomes
The development of open-source software (OSS) projects has usually been
driven by collaborations among contributors and relies strongly on
volunteering. Thus, allocating software practitioners (e.g., contributors) to a
particular task is non-trivial and draws attention away from development.
Therefore, a number of bug bounty platforms have emerged to address this
problem through bounty rewards. In particular, Gitcoin, a new bounty platform,
introduces a bounty reward mechanism that allows individual issue owners
(backers) to define a reward value using cryptocurrencies rather than using
crowdfunding mechanisms. Although a number of studies have investigated this
phenomenon on bounty platforms, they rely on different bounty reward systems.
Our study thus investigates the association between the Gitcoin bounties and
their outcomes (i.e., success and non-success). We empirically study over 4,000
issues with Gitcoin bounties using statistical analysis and machine learning
techniques. We also conducted a comparative study with the Bountysource
platform to gain insights into the usage of both platforms. Our study
highlights the importance of factors such as the length of the project, issue
description, type of bounty issue, and the bounty value, which are found to be
highly correlated with the outcome of bounty issues. These findings can provide
useful guidance to practitioners.
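The kind of association analysis described above can be sketched with a point-biserial correlation, which relates a numeric factor (such as the bounty value) to a binary outcome. The data below is invented for illustration, not drawn from the Gitcoin dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; with a binary variable this is the
    point-biserial correlation relating a factor to an outcome."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical bounty values (USD) and outcomes (1 = resolved, 0 = not).
bounty_value = [50, 500, 30, 800, 20, 650, 40, 700]
outcome      = [0,   1,  0,   1,  0,   1,  0,   1]

r = pearson(bounty_value, outcome)
print(f"point-biserial r = {r:.2f}")  # strongly positive for this toy data
```

In a real study, such a correlation would be complemented by significance testing and multivariate models, since individual factors are rarely independent.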
Mining the Characteristics of Jupyter Notebooks in Data Science Projects
Nowadays, numerous industries have exceptional demand for skills in data
science, such as data analysis, data mining, and machine learning. The
computational notebook (e.g., Jupyter Notebook) is a well-known data science
tool adopted in practice. Kaggle and GitHub are two platforms where data
science communities gather for knowledge sharing, skill practicing, and
collaboration. While tutorials and guidelines for novice data scientists are
available on both platforms, only a small number of Jupyter Notebooks receive
high numbers of votes from the community. High-voted notebooks are considered
well-documented, easy to understand, and reflective of the best data
science and software engineering practices. In this research, we aim to
understand the characteristics of high-voted Jupyter Notebooks on Kaggle and
the popular Jupyter Notebooks for data science projects on GitHub. We plan to
mine and analyse the Jupyter Notebooks on both platforms. We will perform
exploratory analytics, data visualization, and feature importance analysis to
understand the overall structure of these notebooks and to identify common
patterns and best-practice features that separate low-voted from high-voted
notebooks. Upon completion of this research, the discovered insights can be
applied as training guidelines for aspiring data scientists and machine
learning practitioners looking to progress from novice-ranked Jupyter Notebooks
on Kaggle to deployable projects on GitHub.
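A crude version of the feature-importance step described above can be sketched in plain Python: score each notebook feature by how well a median split on it separates low-voted from high-voted notebooks. The feature names and values here are hypothetical, not mined from Kaggle or GitHub.

```python
from statistics import median

# Hypothetical notebook features: (markdown_cells, code_cells, votes_label),
# where votes_label is 1 for high-voted and 0 for low-voted notebooks.
notebooks = [
    (12, 30, 1), (15, 25, 1), (11, 40, 1), (14, 35, 1),
    (2, 28, 0),  (1, 33, 0),  (3, 38, 0),  (2, 27, 0),
]

def split_accuracy(values, labels):
    """Accuracy of predicting the label from a median split on one feature,
    taking the better of the two threshold directions."""
    thr = median(values)
    above = [int(v > thr) for v in values]
    acc = sum(int(a == l) for a, l in zip(above, labels)) / len(labels)
    return max(acc, 1 - acc)

labels = [n[2] for n in notebooks]
scores = {
    "markdown_cells": split_accuracy([n[0] for n in notebooks], labels),
    "code_cells": split_accuracy([n[1] for n in notebooks], labels),
}
print(scores)  # markdown_cells separates the classes; code_cells does not
```

A full analysis would instead use model-based importances (e.g. from tree ensembles), but the idea of ranking features by their separating power is the same.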
A CMMI-Based Automated Risk Assessment Framework
Risk assessment is crucial to increasing the success of software development projects. Current risk assessment approaches provide only a rough guide. Risk assessment experts and domain experts are required to conduct risk assessments in software projects. Therefore, traditional risk assessment approaches require extra activities besides development tasks, possibly leading to extra costs. We believe that an effective risk assessment approach should be transparently embedded in the software development process. This paper aims to present an automated risk assessment framework using CMMI and a risk taxonomy as guidance to develop a risk assessment model. A pragmatic approach will be applied as the basis for building the suggested risk prediction model, alongside case studies from our practice. These studies serve as our proof of concept.
Developing analytics models for software project management
Schedule and cost overruns constitute a major problem in software projects and have long been a source of concern for the software engineering community. Managing software projects to meet time and cost constraints is highly challenging, due to the dynamic nature of software development. Software project management covers a range of complex activities such as project planning, progress monitoring, and risk management. These tasks require project managers to make accurate estimations and to foresee significant future events in their projects. However, there has been little work on providing automated support for software project management activities.
The path to success: A study of user behaviour and success criteria in online communities
Maintaining online communities is vital in order to increase and retain their economic and social value. That is why community managers look to gauge the success of their communities by measuring a variety of user behaviour, such as member activity, turnover and interaction. However, such communities vary widely in their purpose, implementation and user demographics, and although many success indicators have been proposed in the literature, we will show that there is no one-fits-all approach to community success: different success criteria depend on different user behaviour. To demonstrate this, we put together a set of user behaviour features, including many that have been used in the literature as indicators of success, and then we define and predict community success in three different types of online communities: Questions & Answers (Q&A), Healthcare and Emotional Support (Life & Health), and Encyclopaedic Knowledge Creation. The results show that it is feasible to relate community success to specific user behaviour with an accuracy of 0.67–0.93 F1 score and 0.77–1.0 AUC. This research has been conducted with the financial support of
Science Foundation Ireland (Grant Number SFI/12/RC/2289) and
with data provided by Stack Exchange, Boards.ie and Wikipedia.
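The two evaluation metrics reported above can be computed directly; the following stdlib-only sketch uses toy predictions, not the study's data.

```python
def f1_score(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall for binary labels."""
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy ground truth, hard predictions, and ranking scores.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
scores = [0.1, 0.6, 0.8, 0.7, 0.4, 0.2]

f1 = f1_score(y_true, y_pred)
roc = auc(y_true, scores)
print(f"F1={f1:.2f}, AUC={roc:.2f}")
```

F1 summarises a hard classification, while AUC measures ranking quality across all thresholds, which is why studies such as this report both.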
A deep learning model for estimating story points
Although there has been substantial research in software analytics for effort estimation in traditional software projects, little work has been done on estimation in agile projects, especially estimating the effort required to complete user stories or issues. Story points are the most common unit of measure used for estimating the effort involved in completing a user story or resolving an issue. In this paper, we propose a prediction model for estimating story points based on a novel combination of two powerful deep learning architectures: long short-term memory and recurrent highway network. Our prediction system is end-to-end trainable from raw input data to prediction outcomes without any manual feature engineering. We offer a comprehensive dataset for story point estimation that contains 23,313 issues from 16 open source projects. An empirical evaluation demonstrates that our approach consistently outperforms three common baselines (Random Guessing, Mean, and Median methods) and six alternatives (e.g. using Doc2Vec and Random Forests) in Mean Absolute Error, Median Absolute Error, and Standardized Accuracy.
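The Mean and Median baselines mentioned above are straightforward to reproduce: predict the mean (or median) story points of past issues for every new issue, and compare Mean Absolute Error. The story-point values below are invented toy data, not the 23,313-issue dataset.

```python
from statistics import mean, median

def mae(actual, predicted):
    """Mean Absolute Error between actual and predicted story points."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical story points: fit the baselines on past issues,
# then evaluate them on new issues.
train = [1, 1, 2, 3, 5, 8]
test = [2, 3, 5]

mean_pred = [mean(train)] * len(test)      # Mean baseline
median_pred = [median(train)] * len(test)  # Median baseline

mae_mean = mae(test, mean_pred)
mae_median = mae(test, median_pred)
print(f"Mean baseline MAE={mae_mean:.3f}, Median baseline MAE={mae_median:.3f}")
```

A learned model is only useful if it beats these constant predictors, which is why they are standard points of comparison in effort estimation studies.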
An empirical study of automated privacy requirements classification in issue reports
Data protection laws and regulations have recently emerged to protect the privacy and personal information of individuals. As cases of privacy breaches and vulnerabilities rapidly increase, people have become more aware of and concerned about their privacy. This brings significant attention to software development teams, which must address privacy concerns when developing software applications. As today’s software development adopts an agile, issue-driven approach, issues in an issue tracking system become a centralised pool that gathers new requirements, requests for modification and all the tasks of the software project. Hence, establishing an alignment between those issues and privacy requirements is an important step in developing privacy-aware software systems. This alignment also facilitates privacy compliance checking, which may be required as an underlying part of regulations for organisations. However, manually establishing those alignments is labour-intensive and time-consuming. In this paper, we explore a wide range of machine learning and natural language processing techniques which can automatically classify privacy requirements in issue reports. We employ six popular techniques, namely Bag-of-Words (BoW), N-gram Inverse Document Frequency (N-gram IDF), Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, Convolutional Neural Network (CNN) and Bidirectional Encoder Representations from Transformers (BERT), to perform the classification on privacy-related issue reports in the Google Chrome and Moodle projects. The evaluation showed that the BoW, N-gram IDF, TF-IDF and Word2Vec techniques are suitable for classifying privacy requirements in those issue reports. In addition, N-gram IDF is the best performer in both projects.
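A minimal classifier in the spirit of the BoW baseline above can be sketched as a tiny multinomial Naive Bayes over word counts. The issue summaries and labels below are invented examples, not Chrome or Moodle data.

```python
import math
from collections import Counter

# Invented issue summaries labelled privacy (1) or non-privacy (0).
train = [
    ("allow users to delete personal data", 1),
    ("obtain consent before tracking cookies", 1),
    ("encrypt stored personal information", 1),
    ("fix crash when clicking the save button", 0),
    ("page layout breaks on small screens", 0),
    ("update icon rendering on toolbar", 0),
]

def fit(train):
    """Count word occurrences per class (the bag-of-words representation)."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in train:
        counts[label].update(text.split())
    return counts

def predict(counts, text, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing and uniform priors."""
    vocab = set(counts[0]) | set(counts[1])
    best, best_lp = None, -math.inf
    for label in (0, 1):
        total = sum(counts[label].values())
        lp = sum(
            math.log((counts[label][w] + alpha) / (total + alpha * len(vocab)))
            for w in text.split()
        )
        if lp > best_lp:
            best, best_lp = label, lp
    return best

counts = fit(train)
print(predict(counts, "user requests deletion of personal data"))  # → 1
```

The techniques evaluated in the paper (N-gram IDF, TF-IDF, Word2Vec, CNN, BERT) replace this raw-count representation with richer features, but the classification framing is the same.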