Mining domain knowledge from app descriptions
Domain analysis aims to obtain knowledge about a particular domain in the early stages of software development. A key challenge in domain analysis is extracting features automatically from related product artifacts. Compared with other kinds of artifacts, a high volume of descriptions can easily be collected from app marketplaces (such as Google Play and Apple's App Store) when developing a new mobile application (App), so obtaining features and their relationships from these descriptions using data technologies is essential to successful domain analysis. In this paper, we propose an approach to mine domain knowledge from App descriptions automatically. In our approach, the feature information in a single App description is first extracted and formally described by a Concern-based Description Model (CDM); this process is based on predefined feature-extraction rules and a modified topic modeling method. The overall knowledge in the domain is then identified by classifying, clustering and merging the knowledge in the set of CDMs and topics, and the results are formalized as a Data-based Raw Domain Model (DRDM). Furthermore, we propose a quantified evaluation method for prioritizing the knowledge in the DRDM. The proposed approach is validated by a series of experiments.
Recommending software features to designers: From the perspective of users
With many public software descriptions emerging in the application market, it is valuable to extract common software features from these descriptions and recommend them to new designers. However, existing approaches often recommend features according to their frequencies, which reflect designers' preferences. In order to identify users' favorite features and help design more popular software, this paper proposes to use the public data on users' ratings and products' downloads, which reflect users' preferences, to recommend extracted features. The proposed approach distinguishes the users' perspective from the designers' perspective and argues that the users' perspective is better for recommending features, because most products are designed for users and are expected to be popular among them. Using lasso regression to estimate the relationship between the extracted features and the users' ratings, it first separates the extracted features into recommendable and undesirable ones. By treating each download as a user's support for the product's features, it then mines feature association rules from the users' perspective for recommending features. Using public data from the SoftPedia.com market for evaluation, our empirical studies indicate that: (1) selecting recommendable features by lasso regression is better than selecting them by feature frequency in terms of F1 measure; and (2) recommending features based on association rules mined from the users' perspective is not only feasible but also performs competitively with recommendations based on rules mined from the designers' perspective in terms of F1 measure.
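The idea of treating each download as one user's support for a product's features can be sketched as a download-weighted variant of association rule mining. The following is a minimal illustration, not the paper's implementation: the `(feature_set, downloads)` data format, the `weighted_rules` function, the thresholds, and the sample products are all hypothetical, and only pairwise rules are considered.

```python
from itertools import permutations


def weighted_rules(products, min_support=0.2, min_confidence=0.6):
    """Mine pairwise rules A -> B, counting each download as one supporting transaction.

    `products` is a list of (feature_set, downloads) pairs (hypothetical format).
    Returns tuples (A, B, support, confidence).
    """
    total = sum(downloads for _, downloads in products)
    if total == 0:
        return []

    def support(items):
        # Fraction of all downloads that went to products containing every item.
        hits = sum(d for feats, d in products if items <= feats)
        return hits / total

    all_features = sorted(set().union(*(feats for feats, _ in products)))
    rules = []
    for a, b in permutations(all_features, 2):
        sup_ab = support({a, b})
        sup_a = support({a})
        if sup_a == 0 or sup_ab < min_support:
            continue
        confidence = sup_ab / sup_a
        if confidence >= min_confidence:
            rules.append((a, b, round(sup_ab, 3), round(confidence, 3)))
    return rules


# Made-up sample data: three products with their feature sets and download counts.
products = [
    ({"search", "sync"}, 900),
    ({"search", "ads"}, 50),
    ({"sync"}, 50),
]
rules = weighted_rules(products)
for a, b, sup, conf in rules:
    print(f"{a} -> {b}: support={sup}, confidence={conf}")
```

In this toy data, "search" and "sync" co-occur in the most downloaded product, so the rule between them dominates even though each feature appears in the same number of products as it would under a designer-side (per-product) count.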
Software Engineering in the Age of App Stores: Feature-Based Analyses to Guide Mobile Software Engineers
Mobile app stores are becoming the dominating distribution platform of mobile applications. Due to their rapid growth, their impact on software engineering practices is not yet well understood. There has been no comprehensive study that explores the mobile app store ecosystem's effect on software engineering practices. Therefore, this thesis, as its first contribution, empirically studies the app store as a phenomenon from the developers' perspective to investigate the extent to which app stores affect software engineering tasks. The study highlights the importance of a mobile application's features as a deliverable unit from developers to users. The study uncovers the involvement of app stores in eliciting requirements, perfective maintenance and domain analysis in the form of discoverable features written in text form in descriptions and user reviews. Developers discover possible features to include by searching the app store. Developers, through interviews, revealed the cost of such tasks given a highly prolific user base, which major app stores exhibit. Therefore, the thesis, in its second contribution, uses techniques to extract features from unstructured natural language artefacts. This is motivated by the indication that developers monitor similar applications, in terms of provided features, to understand user expectations in a certain application domain. This thesis then devises a semantic-aware technique of mobile application representation using textual functionality descriptions. This representation is then shown to successfully cluster mobile applications to uncover a finer-grained and functionality-based grouping of mobile apps. The thesis, furthermore, provides a comparison of baseline techniques of feature extraction from textual artefacts based on three main criteria: silhouette width measure, human judgement and execution time. 
Finally, this thesis, in its final contribution, shows that features do indeed migrate in the app store beyond category boundaries, and it discovers a set of migratory characteristics and their relationship to price, rating and popularity in the app stores studied.
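One of the comparison criteria named above, the silhouette width, can be computed directly from pairwise distances. The sketch below is only an illustration under stated assumptions: apps are represented as plain sets of feature terms and compared with Jaccard distance, which is not necessarily the representation or distance used in the thesis, and the sample apps are made up.

```python
def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of two sets (assumed distance for this sketch)."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0


def silhouette_width(points, labels, dist):
    """Mean silhouette width over all points; higher means tighter, better-separated clusters."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical apps described by extracted feature terms.
apps = [
    {"chat", "call"},
    {"chat", "call", "video"},
    {"budget", "invoice"},
    {"budget", "invoice", "tax"},
]
good = silhouette_width(apps, [0, 0, 1, 1], jaccard_distance)
bad = silhouette_width(apps, [0, 1, 0, 1], jaccard_distance)
print(good, bad)
```

A grouping that keeps functionally similar apps together scores close to 1, while a grouping that splits them scores below 0, which is what makes the measure usable for comparing feature-extraction techniques by the clusterings they produce.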
Identifying reusable knowledge in developer instant messaging communication.
Context and background: Software engineering is a complex and knowledge-intensive activity. Required knowledge (e.g., about technologies, frameworks, and design decisions) changes fast and the knowledge needs of those who design, code, test and maintain software constantly evolve. On the other hand, software developers use a wide range of processes, practices and tools where developers explicitly and implicitly "produce" and capture different types of knowledge.
Problem: Software developers use instant messaging tools (e.g., Slack, Microsoft Teams and Gitter) to discuss development-related problems, share experiences and collaborate in projects. This communication takes place in chat rooms that accumulate potentially relevant knowledge to be reused by other developers. Therefore, in this research we analyze whether there is reusable knowledge in developer instant messaging communication by exploring (a) which instant messaging platforms can be a source of reusable knowledge, and (b) software engineering themes that represent the main discussions of developers in instant messaging communication. We also analyze how this reusable knowledge can be identified with the use of topic modeling (a natural language processing technique to discover abstract topics in text) by (c) surveying the literature on how topic modeling has been applied in software engineering research, and (d) evaluating how topic models perform with developer instant messages.
Method: First, we conducted a Field Study through an exploratory case study and a reflexive thematic analysis to check whether there is reusable knowledge in developer instant messaging communication, and if so, what this knowledge (main themes discussed) is. Then, we conducted a Sample Study to explore how reusable knowledge in developer instant messaging communication can be identified. In this study, we applied a literature survey and software repository mining (i.e., short text topic modeling).
Findings and contributions: We (a) developed a comparison framework for instant messaging tools, (b) identified a map of the main themes discussed in chat rooms of an instant messaging tool (Gitter, a platform used by software developers), (c) provided a comprehensive literature review that offers insights and references on the use of topic modeling in software engineering, and (d) provided an evaluation of the performance of topic models applied to developer instant messages based on topic coherence metrics and human judgment for topic quality.
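The abstract does not say which topic coherence metrics were used. As one commonly used example, the UMass coherence of a topic can be computed from document co-occurrence counts alone. The sketch below assumes already-tokenized messages; the `umass_coherence` function and the sample messages are hypothetical and only illustrate how such a metric separates a coherent topic from a mixed one.

```python
from math import log


def umass_coherence(top_words, documents):
    """UMass coherence of one topic.

    Sums log((D(w_m, w_l) + 1) / D(w_l)) over ordered pairs of the topic's top
    words, where D counts the documents containing all the given words.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                continue  # skip words absent from the corpus to avoid dividing by zero
            score += log((doc_freq(top_words[m], top_words[l]) + 1) / d_l)
    return score


# Made-up, pre-tokenized chat messages.
messages = [
    ["docker", "image", "build"],
    ["docker", "container", "image"],
    ["react", "component", "state"],
    ["react", "state", "hook"],
]
coherent = umass_coherence(["docker", "image"], messages)
mixed = umass_coherence(["docker", "react"], messages)
print(coherent, mixed)
```

Words that tend to appear in the same messages give a higher score than words that never co-occur. Short chat messages yield sparse co-occurrence counts, which is one reason to pair such a metric with human judgment, as the findings above describe.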