
    Causal impact analysis for app releases in google play

    App developers would like to understand the impact of their own and their competitors' software releases. To address this, we introduce Causal Impact Release Analysis for app stores, and our tool, CIRA, that implements this analysis. We mined 38,858 popular Google Play apps over a period of 12 months. For these apps, we identified 26,339 releases for which there was adequate prior and posterior time series data to facilitate causal impact analysis. We found that 33% of these releases caused a statistically significant change in user ratings. We use our approach to reveal important characteristics that distinguish causally significant releases in Google Play. To explore the actionability of causal impact analysis, we elicited the opinions of app developers: 56 companies responded; 78% concurred with the causal assessment, of which 33% claimed that their company would consider changing its app release strategy as a result of our findings.
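    CIRA itself fits a Bayesian structural time-series model to the rating data; as a deliberately simplified illustration of the underlying pre/post comparison, the sketch below applies Welch's t-statistic to hypothetical daily average ratings around a release (the data, threshold, and function names are invented for illustration and are not the paper's method):

    ```python
    from statistics import mean, stdev

    def release_impact(pre, post, t_threshold=2.0):
        """Welch's t-statistic comparing daily average ratings before and
        after a release. A crude stand-in for causal impact analysis: CIRA
        fits a Bayesian structural time-series model, which this does not."""
        n1, n2 = len(pre), len(post)
        m1, m2 = mean(pre), mean(post)
        v1, v2 = stdev(pre) ** 2, stdev(post) ** 2
        t = (m2 - m1) / ((v1 / n1 + v2 / n2) ** 0.5)
        return {"effect": m2 - m1, "t": t, "significant": abs(t) >= t_threshold}

    # Hypothetical daily average ratings for one week before and after a release.
    pre = [4.1, 4.0, 4.2, 4.1, 4.0, 4.1, 4.2]
    post = [3.6, 3.5, 3.7, 3.6, 3.5, 3.6, 3.7]
    result = release_impact(pre, post)
    print(result)
    ```

    A real analysis would also control for store-wide trends (e.g., via a synthetic control series), which is precisely what the structural time-series approach adds over a bare t-test.
    
    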

    Application of Developers' and Users' Dependent Factors in App Store Optimization

    This paper presents an application of developers' and users' dependent factors in app store optimization. The application is based on two main fields: developers' dependent factors and users' dependent factors. Developers' dependent factors are identified as: developer name, app name, subtitle, genre, short description, long description, content rating, system requirements, page URL, last update, what's new and price. Users' dependent factors are identified as: download volume, average rating, rating volume and reviews. The proposed application in its final form is modelled after mining sample data from two leading app stores: Google Play and the Apple App Store. Results from analyzing the collected data show that developer-dependent elements can be better optimized; names and descriptions of mobile apps are not fully utilized. In Google Play there is one significant correlation, between download volume and number of reviews, whereas in the App Store there is no significant correlation between these factors.
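    The reported correlation between download volume and review count can be probed with a rank correlation, which suits heavy-tailed download counts; the paper does not state which test it used, so the sketch below hand-rolls a Spearman coefficient on invented data:

    ```python
    def rank(xs):
        """Assign average ranks (1-based), averaging over ties."""
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        ranks = [0.0] * len(xs)
        i = 0
        while i < len(order):
            j = i
            while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
                j += 1
            avg = (i + j) / 2 + 1
            for k in range(i, j + 1):
                ranks[order[k]] = avg
            i = j + 1
        return ranks

    def spearman(x, y):
        """Spearman's rho: Pearson correlation of the rank vectors."""
        rx, ry = rank(x), rank(y)
        mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
        cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
        sx = sum((a - mx) ** 2 for a in rx) ** 0.5
        sy = sum((b - my) ** 2 for b in ry) ** 0.5
        return cov / (sx * sy)

    # Hypothetical per-app figures; monotone, so rho is 1.0.
    downloads = [1000, 5000, 20000, 100000, 500000]
    reviews = [12, 40, 180, 950, 4300]
    rho = spearman(downloads, reviews)
    ```

    In practice `scipy.stats.spearmanr` would also return a p-value, which is what "significant correlation" refers to in the abstract.
    
    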

    App Store Analysis for Software Engineering

    App Store Analysis concerns the mining of data from apps, made possible through app stores. This thesis extracts publicly available data from app stores, in order to detect and analyse relationships between technical attributes, such as software features, and non-technical attributes, such as rating and popularity information. The thesis identifies the App Sampling Problem, its effects and a methodology to ameliorate the problem. The App Sampling Problem is a fundamental sampling issue concerned with mining app stores, caused by the rather limited 'most-popular-only' ranked app discovery present in mobile app stores. This thesis provides novel techniques for the analysis of technical and non-technical data from app stores. Topic modelling is used as a feature extraction technique, which is shown to produce the same results as n-gram feature extraction and which also enables linking technical features in app descriptions with those in user reviews. Causal impact analysis is applied to app store performance data, leading to the identification of properties of statistically significant releases, and of developer-controlled properties which could increase a release's chance of causal significance. This thesis introduces the Causal Impact Release Analysis tool, CIRA, for performing causal impact analysis on app store data, which makes the aforementioned research possible; combined with the earlier feature extraction technique, this enables the identification of the claimed software features that may have led to significant positive and negative changes after a release.
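    The thesis compares topic modelling against n-gram feature extraction on app descriptions; the n-gram side is simple enough to sketch in a few lines (the example description is hypothetical, and real pipelines would add stop-word filtering and cross-app aggregation):

    ```python
    import re
    from collections import Counter

    def ngrams(text, n=2):
        """Extract lowercase word n-grams (here bigrams) from a description."""
        words = re.findall(r"[a-z']+", text.lower())
        return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

    # Hypothetical app description.
    desc = "Track your runs and track your sleep with one app"
    feats = ngrams(desc)
    # ('track', 'your') appears twice, so it would rank as a candidate feature.
    ```

    Matching such n-grams between descriptions and user reviews is one plausible way to link claimed features to user feedback, as the thesis does with its combined technique.
    
    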

    Identifying Unmaintained Projects in GitHub

    Background: Open source software has an increasing importance in modern software development. However, there is also growing concern about the sustainability of such projects, which are usually managed by a small number of developers, frequently working as volunteers. Aims: In this paper, we propose an approach to identify GitHub projects that are not actively maintained. Our goal is to alert users about the risks of using these projects and possibly motivate other developers to assume their maintenance. Method: We train machine learning models to identify unmaintained or sparsely maintained projects, based on a set of features about project activity (commits, forks, issues, etc.). We empirically validate the best-performing model with the principal developers of 129 GitHub projects. Results: The proposed machine learning approach has a precision of 80%, based on the feedback of real open source developers, and a recall of 96%. We also show that our approach can be used to assess the risk of projects becoming unmaintained. Conclusions: The model proposed in this paper can be used by open source users and developers to identify GitHub projects that are no longer actively maintained. Comment: Accepted at 12th International Symposium on Empirical Software Engineering and Measurement (ESEM), 10 pages, 201
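    The paper trains ML models over many activity features; as a deliberately naive stand-in, the sketch below flags projects with no commits for a year and computes the precision and recall that the authors instead obtained from developer feedback (dates, labels, and the one-year threshold are all invented for illustration):

    ```python
    from datetime import date

    def looks_unmaintained(last_commit, today, threshold_days=365):
        """Naive heuristic: no commits for a year => flag as unmaintained.
        The paper instead trains classifiers over commits, forks, issues, etc."""
        return (today - last_commit).days > threshold_days

    def precision_recall(preds, labels):
        """Precision and recall of boolean predictions vs ground truth."""
        tp = sum(p and l for p, l in zip(preds, labels))
        fp = sum(p and not l for p, l in zip(preds, labels))
        fn = sum((not p) and l for p, l in zip(preds, labels))
        return tp / (tp + fp), tp / (tp + fn)

    # Hypothetical projects: (date of last commit, truly unmaintained?).
    today = date(2018, 6, 1)
    projects = [
        (date(2016, 1, 10), True),   # stale and abandoned        -> TP
        (date(2018, 5, 20), False),  # active                     -> TN
        (date(2017, 2, 1), True),    # stale and abandoned        -> TP
        (date(2016, 9, 1), False),   # stale but still maintained -> FP
        (date(2018, 1, 5), True),    # recent commit, abandoned   -> FN
    ]
    preds = [looks_unmaintained(d, today) for d, _ in projects]
    labels = [l for _, l in projects]
    precision, recall = precision_recall(preds, labels)
    ```

    The gap between this heuristic and the paper's reported 80% precision / 96% recall is exactly the value of learning over richer activity features.
    
    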

    Predictive analytics for software testing: Keynote paper

    This keynote discusses the use of Predictive Analytics for Software Engineering, and in particular for Software Defect Prediction and Software Testing, by presenting the latest results achieved in these fields leveraging Artificial Intelligence, Search-Based, and Machine Learning methods, and by giving some directions for future work.

    Exploring the effects of ad schemes on the performance cost of mobile phones

    Advertising is an important revenue source for mobile app development, especially for free apps. However, ads also carry costs for users. Displaying ads can interfere with the user experience, leading to reduced user retention and, ultimately, reduced earnings. Although recent studies have been devoted to directly mitigating ad costs, for example by reducing the battery or memory consumed, comprehensive analysis of ad embedding schemes (e.g., ad sizes and ad providers) has rarely been conducted. In this paper, we focus on analyzing three types of performance cost: memory/CPU, traffic, and battery. We explore 12 ad schemes used in 104 popular Android apps and compare their performance consumption. We show that the performance costs of the ad schemes we analyzed are significantly different. We also summarize the ad schemes that incur low resource costs for users. Our summary is endorsed by the 37 experienced app developers we surveyed.
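    The core of such a comparison is grouping per-run measurements by ad scheme and ranking schemes by mean cost; a minimal sketch follows (the scheme names and battery figures are invented, and the real study covers 12 schemes, 104 apps, and three cost types, with statistical tests on the differences):

    ```python
    from collections import defaultdict
    from statistics import mean

    # Hypothetical per-run battery drain (% per hour) for three ad schemes.
    runs = [
        ("banner_320x50", 1.2), ("banner_320x50", 1.1), ("banner_320x50", 1.3),
        ("interstitial", 2.4), ("interstitial", 2.6), ("interstitial", 2.5),
        ("native", 1.6), ("native", 1.7), ("native", 1.5),
    ]

    # Group measurements by scheme, then rank by mean cost.
    by_scheme = defaultdict(list)
    for scheme, drain in runs:
        by_scheme[scheme].append(drain)

    costs = {scheme: mean(vals) for scheme, vals in by_scheme.items()}
    cheapest = min(costs, key=costs.get)
    ```

    A full replication would repeat this per cost type (memory/CPU, traffic, battery) and test whether the between-scheme differences are statistically significant rather than just comparing means.
    
    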