5 research outputs found
Predicting Good Configurations for GitHub and Stack Overflow Topic Models
Software repositories contain large amounts of textual data, ranging from
source code comments and issue descriptions to questions, answers, and comments
on Stack Overflow. To make sense of this textual data, topic modelling is
frequently used as a text-mining tool for the discovery of hidden semantic
structures in text bodies. Latent Dirichlet allocation (LDA) is a commonly used
topic model that aims to explain the structure of a corpus by grouping texts.
LDA requires multiple parameters to work well, and there are only rough and
sometimes conflicting guidelines available on how these parameters should be
set. In this paper, we contribute (i) a broad study of parameters to arrive at
good local optima for GitHub and Stack Overflow text corpora, (ii) an
a-posteriori characterisation of text corpora related to eight programming
languages, and (iii) an analysis of corpus feature importance via per-corpus
LDA configuration. We find that (1) popular rules of thumb for topic modelling
parameter configuration are not applicable to the corpora used in our
experiments, (2) corpora sampled from GitHub and Stack Overflow have different
characteristics and require different configurations to achieve good model fit,
and (3) we can predict good configurations for unseen corpora reliably. These
findings support researchers and practitioners in efficiently determining
suitable configurations for topic modelling when analysing textual data
contained in software repositories.Comment: to appear as full paper at MSR 2019, the 16th International
Conference on Mining Software Repositorie
A Q-learning-based memetic algorithm for multi-objective dynamic software project scheduling
Software project scheduling is the problem of allocating employees to tasks in a software project. Due to the large scale of current software projects, many studies have investigated the use of optimization algorithms to find good software project schedules. However, despite the importance of human factors to the success of software projects, existing work has considered only a limited number of human properties when formulating software project scheduling as an optimization problem. Moreover, the changing environments of software companies mean that software project scheduling is a dynamic optimization problem. However, there is a lack of effective dynamic scheduling approaches to solve this problem. This work proposes a more realistic mathematical model for the dynamic software project scheduling problem. This model considers that skill proficiency can improve over time and, different from previous work, it considers that such improvement is affected by the employees’ properties of motivation and learning ability, and by the skill difficulty. It also defines the objective of employees’ satisfaction with the allocation. It is considered together with the objectives of project duration, cost, robustness and stability under a variety of practical constraints. To adapt schedules to the dynamically changing software project environments, a multi-objective two-archive memetic algorithm based on Q-learning (MOTAMAQ) is proposed to solve the problem in a proactive-rescheduling way. Different from previous work, MOTAMAQ learns the most appropriate global and local search methods to be used for different software project environment states by using Q-learning. Extensive experiments on 18 dynamic benchmark instances and 3 instances derived from real-world software projects were performed. A comparison with seven other meta-heuristic algorithms shows that the strategies used by our novel approach are very effective in improving its convergence performance in dynamic environments, while maintaining a good distribution and spread of solutions. The Q-learning-based learning mechanism can choose appropriate search operators for the different scheduling environments. We also show how different trade-offs among the five objectives can provide software managers with a deeper insight into various compromises among many objectives, and enabling them to make informed decisions
How to Evaluate Solutions in Pareto-based Search-Based Software Engineering? A Critical Review and Methodological Guidance
With modern requirements, there is an increasing tendency of considering
multiple objectives/criteria simultaneously in many Software Engineering (SE)
scenarios. Such a multi-objective optimization scenario comes with an important
issue -- how to evaluate the outcome of optimization algorithms, which
typically is a set of incomparable solutions (i.e., being Pareto non-dominated
to each other). This issue can be challenging for the SE community,
particularly for practitioners of Search-Based SE (SBSE). On one hand,
multi-objective optimization could still be relatively new to SE/SBSE
researchers, who may not be able to identify the right evaluation methods for
their problems. On the other hand, simply following the evaluation methods for
general multi-objective optimization problems may not be appropriate for
specific SE problems, especially when the problem nature or decision maker's
preferences are explicitly/implicitly available. This has been well echoed in
the literature by various inappropriate/inadequate selection and
inaccurate/misleading use of evaluation methods. In this paper, we first carry
out a systematic and critical review of quality evaluation for multi-objective
optimization in SBSE. We survey 717 papers published between 2009 and 2019 from
36 venues in seven repositories, and select 95 prominent studies, through which
we identify five important but overlooked issues in the area. We then conduct
an in-depth analysis of quality evaluation indicators/methods and general
situations in SBSE, which, together with the identified issues, enables us to
codify a methodological guidance for selecting and using evaluation methods in
different SBSE scenarios.Comment: This paper has been accepted by IEEE Transactions on Software
Engineering, available as full OA:
https://ieeexplore.ieee.org/document/925218
The determinants of value addition: a crtitical analysis of global software engineering industry in Sri Lanka
It was evident through the literature that the perceived value delivery of the global software
engineering industry is low due to various facts. Therefore, this research concerns global
software product companies in Sri Lanka to explore the software engineering methods and
practices in increasing the value addition. The overall aim of the study is to identify the key
determinants for value addition in the global software engineering industry and critically
evaluate the impact of them for the software product companies to help maximise the value
addition to ultimately assure the sustainability of the industry.
An exploratory research approach was used initially since findings would emerge while the
study unfolds. Mixed method was employed as the literature itself was inadequate to
investigate the problem effectively to formulate the research framework. Twenty-three face-to-face online interviews were conducted with the subject matter experts covering all the
disciplines from the targeted organisations which was combined with the literature findings as
well as the outcomes of the market research outcomes conducted by both government and nongovernment institutes. Data from the interviews were analysed using NVivo 12. The findings
of the existing literature were verified through the exploratory study and the outcomes were
used to formulate the questionnaire for the public survey. 371 responses were considered after
cleansing the total responses received for the data analysis through SPSS 21 with alpha level
0.05. Internal consistency test was done before the descriptive analysis. After assuring the
reliability of the dataset, the correlation test, multiple regression test and analysis of variance
(ANOVA) test were carried out to fulfil the requirements of meeting the research objectives.
Five determinants for value addition were identified along with the key themes for each area.
They are staffing, delivery process, use of tools, governance, and technology infrastructure.
The cross-functional and self-organised teams built around the value streams, employing a
properly interconnected software delivery process with the right governance in the delivery
pipelines, selection of tools and providing the right infrastructure increases the value delivery.
Moreover, the constraints for value addition are poor interconnection in the internal processes,
rigid functional hierarchies, inaccurate selections and uses of tools, inflexible team
arrangements and inadequate focus for the technology infrastructure. The findings add to the
existing body of knowledge on increasing the value addition by employing effective processes,
practices and tools and the impacts of inaccurate applications the same in the global software
engineering industry