15 research outputs found

    Measuring failed disruption propagation in genetic programming

    Get PDF
    Information theory explains the robustness of deep GP trees, with on average up to 83.3% of crossover run time disruptions failing to propagate to the root node, and so having no impact on fitness, leading to phenotypic convergence. Monte Carlo simulations of perturbations covering the whole tree demonstrate a model based on random synchronisation of the evaluation of the parent and child which cause parent and offspring evaluations to be identical. This predicts the effectiveness of fitness measurement grows slowly as O(log(n)) with number n of test cases. This geometric distribution model is tested on genetic programming symbolic regression

    Investigating the Effectiveness of Clustering for Story Point Estimation

    Get PDF
    Automated techniques to estimate Story Points (SP) for user stories in agile software development came to the fore a decade ago. Yet, the state-of-the-art estimation techniques’ accuracy has room for improvement. In this paper, we present a new approach for SP estimation, based on analysing textual features of software issues by employing latent Dirichlet allocation (LDA) and clustering. We first use LDA to represent issue reports in a new space of generated topics. We then use hierarchical clustering to agglomerate issues into clusters based on their topic similarities. Next, we build estimation models using the issues in each cluster. Then, we find the closest cluster to the new coming issue and use the model from that cluster to estimate the SP. Our approach is evaluated on a dataset of 26 open source projects with a total of 31,960 issues and compared against both baselines and state-of-the-art SP estimation techniques. The results show that the estimation performance of our proposed approach is as good as the state-of-the-art. However, none of these approaches is statistically significantly better than more naive estimators in all cases, which does not justify their additional complexity. We therefore encourage future work to develop alternative strategies for story points estimation. The experimental data and scripts we used in this work are publicly available to allow for replication and extension

    Software Engineering in the Age of App Stores: Feature-Based Analyses to Guide Mobile Software Engineers

    Get PDF
    Mobile app stores are becoming the dominating distribution platform of mobile applications. Due to their rapid growth, their impact on software engineering practices is not yet well understood. There has been no comprehensive study that explores the mobile app store ecosystem's effect on software engineering practices. Therefore, this thesis, as its first contribution, empirically studies the app store as a phenomenon from the developers' perspective to investigate the extent to which app stores affect software engineering tasks. The study highlights the importance of a mobile application's features as a deliverable unit from developers to users. The study uncovers the involvement of app stores in eliciting requirements, perfective maintenance and domain analysis in the form of discoverable features written in text form in descriptions and user reviews. Developers discover possible features to include by searching the app store. Developers, through interviews, revealed the cost of such tasks given a highly prolific user base, which major app stores exhibit. Therefore, the thesis, in its second contribution, uses techniques to extract features from unstructured natural language artefacts. This is motivated by the indication that developers monitor similar applications, in terms of provided features, to understand user expectations in a certain application domain. This thesis then devises a semantic-aware technique of mobile application representation using textual functionality descriptions. This representation is then shown to successfully cluster mobile applications to uncover a finer-grained and functionality-based grouping of mobile apps. The thesis, furthermore, provides a comparison of baseline techniques of feature extraction from textual artefacts based on three main criteria: silhouette width measure, human judgement and execution time. Finally, this thesis, in its final contribution shows that features do indeed migrate in the app store beyond category boundaries and discovers a set of migratory characteristics and their relationship to price, rating and popularity in the app stores studied

    A Versatile Dataset of Agile Open Source Software Projects

    Get PDF
    Agile software development is nowadays a widely adopted practise in both open-source and industrial software projects. Agile teams typically heavily rely on issue management tools to document new issues and keep track of outstanding ones, in addition to storing their technical details, effort estimates, assignment to developers, and more. Previous work utilised the historical information stored in issue management systems for various purposes; however, when researchers make their empirical data public, it is usually relevant solely to the study’s objective. In this paper, we present a more holistic and versatile dataset containing a wealth of information on more than half a million issues from 44 open-source Agile software, making it well-suited to several research avenues, and cross-analyses therein, including effort estimation, issue prioritisation, issue assignment and many more. We make this data publicly available on GitHub to facilitate ease of use, maintenance, and extensibility

    Mobile app and app store analysis, testing and optimisation

    Get PDF
    This talk presents results on analysis and testing of mobile apps and app stores, reviewing the work of the UCL App Analysis Group (UCLappA) on App Store Mining and Analysis. The talk also covers the work of the UCL CREST centre on Genetic Improvement, applicable to app improvement and optimisation

    Genetic Improvement of LLVM Intermediate Representation

    No full text
    Evolving LLVM IR is widely applicable, with LLVM Clang offering support for an increasing range of computer hardware and programming languages. Local search mutations are used to hill climb industry C code released to support geographic open standards: Open Location Code (OLC) from Google and Uber’s Hexagonal Hierarchical Spatial Index (H3), giving up to two percent speed up on compiler optimised code

    Using Collaborative Translation to Increase Students Knowledge of Web Accessibility

    No full text
    This paper presents a suggested approach to increase students' knowledge in web accessibility. The approach is based on the use of collaborative translation assisted by prisoner's dilemma game to increase the quality of the translation. The result shows an overall satisfaction of the new knowledge gained in domain of web accessibility

    Empirical comparison of text-based mobile apps similarity measurement techniques

    Get PDF
    Context: Code-free software similarity detection techniques have been used to support different software engineering tasks, including clustering mobile applications (apps). The way of measuring similarity may affect both the efficiency and quality of clustering solutions. However, there has been no previous comparative study of feature extraction methods used to guide mobile app clustering. Objective: In this paper, we investigate different techniques to compute the similarity of apps based on their textual descriptions and evaluate their effectiveness using hierarchical agglomerative clustering. Method: To this end we carry out an empirical study comparing five different techniques, based on topic modelling and keyword feature extraction, to cluster 12,664 apps randomly sampled from the Google Play App Store. The comparison is based on three main criteria: silhouette width measure, human judgement and execution time. Results: The results of our study show that using topic modelling, in addition to collocation-based and dependency-based feature extractors perform similarly in detecting app-feature similarity. However, dependency-based feature extraction performs better than any other in finding application domain similarity (ρ = 0.7,p − value < 0.01). Conclusions: Current categorisation in the app store studied does not exhibit a good classification quality in terms of the claimed feature space. However, a better quality can be achieved using a good feature extraction technique and a traditional clustering method

    Clustering Mobile Apps Based on Mined Textual Features

    Get PDF
    CONTEXT: Categorising software systems according to their functionality yields many benefits to both users and developers. GOAL: In order to uncover the latent clustering of mobile apps in app stores, we propose a novel technique that measures app similarity based on claimed behaviour. METHOD: Features are extracted using information retrieval augmented with ontological analysis and used as attributes to characterise apps. These attributes are then used to cluster the apps using agglomerative hierarchical clustering. We empirically evaluate our approach on 17,877 apps mined from the BlackBerry and Google app stores in 2014. RESULTS: The results show that our approach dramatically improves the existing categorisation quality for both Blackberry (from 0.02 to 0.41 on average) and Google (from 0.03 to 0.21 on average) stores. We also find a strong Spearman rank correlation (ρ= 0.96 for Google and ρ= 0.99 for BlackBerry) between the number of apps and the ideal granularity within each category, indicating that ideal granularity increases with category size, as expected. CONCLUSIONS: Current categorisation in the app stores studied do not exhibit a good classification quality in terms of the claimed feature space. However, a better quality can be achieved using a good feature extraction technique and a traditional clustering method
    corecore