Causal impact analysis for app releases in google play
App developers would like to understand the impact of their own and their competitors' software releases. To address this, we introduce Causal Impact Release Analysis for app stores, and our tool, CIRA, which implements this analysis. We mined 38,858 popular Google Play apps over a period of 12 months. For these apps, we identified 26,339 releases with adequate prior and posterior time series data to facilitate causal impact analysis. We found that 33% of these releases caused a statistically significant change in user ratings. We use our approach to reveal important characteristics that distinguish causal significance in Google Play. To explore the actionability of causal impact analysis, we elicited the opinions of app developers: 56 companies responded, 78% concurred with the causal assessment, and of these, 33% claimed that their company would consider changing its app release strategy as a result of our findings.
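The pre/post comparison behind this kind of analysis can be illustrated with a deliberately simplified sketch. CIRA itself applies Bayesian structural time-series causal impact models; the naive interrupted-time-series check below (with made-up rating figures) only conveys the core idea of flagging a release whose posterior ratings depart from the prior baseline.

```python
import statistics

def significant_release(pre_ratings, post_ratings, z=1.96):
    """Illustrative only: flag a release as 'significant' if the mean
    post-release rating falls outside a z-sigma band around the
    pre-release mean. This stands in for the Bayesian causal impact
    model the paper actually uses."""
    mu = statistics.mean(pre_ratings)
    sd = statistics.stdev(pre_ratings)
    return abs(statistics.mean(post_ratings) - mu) > z * sd

# Hypothetical daily average ratings before and after a release:
pre = [4.1, 4.2, 4.0, 4.3, 4.1, 4.2, 4.0]
post = [3.2, 3.1, 3.3, 3.0]
print(significant_release(pre, post))  # a clear ratings drop -> True
```

A real analysis would model trend and seasonality in the prior series rather than assume a stationary mean, which is exactly what structural time-series approaches add.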
Application of Developers' and Users' Dependent Factors in App Store Optimization
This paper presents an application of developers' and users' dependent factors in app store optimization. The application is based on two main fields: developers' dependent factors and users' dependent factors. Developers' dependent factors are identified as: developer name, app name, subtitle, genre, short description, long description, content rating, system requirements, page URL, last update, what's new, and price. Users' dependent factors are identified as: download volume, average rating, rating volume, and reviews. The proposed application in its final form is modelled on sample data mined from two leading app stores: Google Play and the Apple App Store. Analysis of the collected data shows that developer-dependent factors could be better optimized: the names and descriptions of mobile apps are not fully utilized. In Google Play there is one significant correlation, between download volume and number of reviews, whereas in the App Store there is no significant correlation between the factors.
App Store Analysis for Software Engineering
App Store Analysis concerns the mining of data from apps, made possible through app stores. This thesis extracts publicly available data from app stores in order to detect and analyse relationships between technical attributes, such as software features, and non-technical attributes, such as rating and popularity information. The thesis identifies the App Sampling Problem, its effects, and a methodology to ameliorate the problem. The App Sampling Problem is a fundamental sampling issue in mining app stores, caused by the rather limited 'most-popular-only' ranked app discovery present in mobile app stores. This thesis provides novel techniques for the analysis of technical and non-technical data from app stores. Topic modelling is used as a feature extraction technique, which is shown to produce the same results as n-gram feature extraction while also enabling the linking of technical features in app descriptions with those in user reviews. Causal impact analysis is applied to app store performance data, leading to the identification of properties of statistically significant releases, and of developer-controlled properties that could increase a release's chance of causal significance. This thesis introduces the Causal Impact Release Analysis tool, CIRA, for performing causal impact analysis on app store data, which makes the aforementioned research possible; combined with the earlier feature extraction technique, this enables the identification of the claimed software features that may have led to significant positive and negative changes after a release.
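The n-gram feature extraction that the thesis compares topic modelling against can be sketched in a few lines. The description and review strings below are invented examples, and this is not the thesis's actual pipeline; it only illustrates how claimed features in an app description can be linked to mentions in user reviews via shared n-grams.

```python
def ngrams(text, n=2):
    """Extract word n-grams (bigrams by default) from a text,
    a simple stand-in for the feature extraction described above."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Hypothetical app description and user review:
desc = "track your daily running route and share running stats with friends"
review = "love that I can share running stats but route tracking drains battery"

# Claimed features (description bigrams) that also appear in the review:
shared = set(ngrams(desc)) & set(ngrams(review))
print(sorted(shared))  # ['running stats', 'share running']
```

Real pipelines would add tokenisation, stop-word removal, and stemming so that, for example, "route tracking" and "running route" could also be matched.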
Identifying Unmaintained Projects in GitHub
Background: Open source software has an increasing importance in modern software development. However, there is also growing concern about the sustainability of such projects, which are usually managed by a small number of developers, frequently working as volunteers. Aims: In this paper, we propose an approach to identify GitHub projects that are not actively maintained. Our goal is to alert users about the risks of using these projects and possibly motivate other developers to assume the maintenance of the projects. Method: We train machine learning models to identify unmaintained or sparsely maintained projects, based on a set of features of project activity (commits, forks, issues, etc.). We empirically validate the best-performing model with the principal developers of 129 GitHub projects. Results: The proposed machine learning approach has a precision of 80%, based on the feedback of real open source developers, and a recall of 96%. We also show that our approach can be used to assess the risk of projects becoming unmaintained. Conclusions: The model proposed in this paper can be used by open source users and developers to identify GitHub projects that are no longer actively maintained.
Comment: Accepted at 12th International Symposium on Empirical Software Engineering and Measurement (ESEM), 10 pages, 201
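The idea of classifying projects from activity features can be sketched with a toy logistic regression. Everything below is invented for illustration: the feature set, the figures, and the model are not the paper's (which validates several learners on real GitHub data); the sketch only shows the shape of the approach.

```python
import math

# Hypothetical features per project:
# [commits in last year, open-issue ratio, days since last release]
train_X = [
    [120, 0.10, 30],    # actively maintained
    [200, 0.05, 10],
    [2, 0.80, 700],     # unmaintained
    [0, 0.90, 900],
    [90, 0.20, 60],
    [5, 0.70, 500],
]
train_y = [0, 0, 1, 1, 0, 1]  # 1 = unmaintained

def normalise(X):
    """Min-max scale each feature column to [0, 1]."""
    cols = list(zip(*X))
    mins = [min(c) for c in cols]
    rngs = [(max(c) - mn) or 1 for c, mn in zip(cols, mins)]
    return [[(v - mn) / r for v, mn, r in zip(row, mins, rngs)] for row in X]

def train(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = 1 / (1 + math.exp(-(sum(wj * xj for wj, xj in zip(w, xi)) + b)))
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

Xn = normalise(train_X)
w, b = train(Xn, train_y)
preds = [int(1 / (1 + math.exp(-(sum(wj * xj for wj, xj in zip(w, x)) + b))) > 0.5)
         for x in Xn]
print(preds)  # recovers the training labels on this separable toy data
```

A real replication would hold out a test set and report precision and recall, the metrics the paper validates with project developers.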
Predictive analytics for software testing: Keynote paper
This keynote discusses the use of Predictive Analytics for Software Engineering, and in particular for Software Defect Prediction and Software Testing, by presenting the latest results achieved in these fields through Artificial Intelligence, Search-Based, and Machine Learning methods, and by giving some directions for future work.
Exploring the effects of ad schemes on the performance cost of mobile phones
Advertising is an important revenue source for mobile app development, especially for free apps. However, ads also carry costs for users: displaying ads can interfere with the user experience, leading to lower user retention and, ultimately, reduced earnings. Although recent studies have sought to directly mitigate ad costs, for example by reducing the battery or memory consumed, comprehensive analysis of ad-embedding schemes (e.g., ad sizes and ad providers) has rarely been conducted. In this paper, we focus on analyzing three types of performance cost: memory/CPU, traffic, and battery. We explore 12 ad schemes used in 104 popular Android apps and compare their performance consumption. We show that the performance costs of the ad schemes we analyzed differ significantly, and we summarize the ad schemes that generate low resource cost for users. Our summary is endorsed by 37 experienced app developers we surveyed.