    RoseMatcher: Identifying the Impact of User Reviews on App Updates

    Release planning for mobile apps has recently become an area of active research. Prior research concentrated on analysing apps through their release notes in the App Store, or on tracking user reviews to support app evolution with issue trackers. However, although the Apple App Store is a platform for development teams to communicate with users, it has not been studied for the relevance between release notes and user reviews. In this paper, we introduce RoseMatcher, an automatic approach that matches relevant user reviews with app release notes and identifies matched pairs with high confidence. We collected 944 release notes and 1,046,862 user reviews from 5 mobile apps in the Apple App Store as research data, and evaluated the effectiveness and accuracy of RoseMatcher. Our evaluation shows that RoseMatcher can reach a hit ratio of 0.718 for identifying relevant matched pairs. We further conducted manual labelling and content analysis on 984 relevant matched pairs, and defined 8 roles that user reviews play in app updates according to the relationship between release notes and user reviews in the relevant matched pairs. The study results show that release notes tend to respond to and resolve feature requests, bug reports, and complaints raised in user reviews, while user reviews also tend to give positive, negative, and constructive feedback on app updates. Additionally, in the time dimension, the reviews relevant to a release note tend to be posted within a short period before and after its release. In the matched pairs, the time interval between the posting of release notes and user reviews reaches a maximum of three years and an average of one year. These findings indicate that development teams do adopt user reviews when updating apps, and that users show interest in app release notes.
    Comment: 18 pages, 7 figures

    EFFECTIVE METHODS AND TOOLS FOR MINING APP STORE REVIEWS

    Research on mining user reviews in mobile application (app) stores has noticeably advanced in the past few years. The main objective is to extract useful information that app developers can use to build more sustainable apps. In general, existing research on app store mining can be classified into three genres: classifying user feedback into different types of software maintenance requests (e.g., bug reports and feature requests), building practical tools that are readily available for developers to use, and proposing visions for enhanced mobile app stores that integrate multiple sources of user feedback to ensure app survivability. Despite these major advances, existing tools and techniques still suffer from several drawbacks. Specifically, the majority of techniques rely on the textual content of user reviews for classification. However, due to the inherently diverse and unstructured nature of user-generated online textual reviews, text-based review mining techniques often produce excessively complicated models that are prone to over-fitting. Furthermore, the majority of proposed techniques focus on extracting and classifying the functional requirements in mobile app reviews, providing little or no support for extracting and synthesizing the non-functional requirements (NFRs) raised in user feedback (e.g., security, reliability, and usability). In terms of tool support, existing tools are still far from adequate for practical applications. In general, there is a lack of off-the-shelf tools that researchers and practitioners can use to mine user reviews accurately. Motivated by these observations, in this dissertation, we explore several research directions aimed at addressing the current issues and shortcomings in app store review mining research. In particular, we introduce a novel semantically aware approach for mining and classifying functional requirements from app store reviews. This approach reduces the dimensionality of the data and enhances the predictive capabilities of the classifier. We then present a two-phase study aimed at automatically capturing the NFRs in user reviews. We also introduce MARC, a tool that enables developers to extract, classify, and summarize user reviews.

    Modeling Users Feedback Using Bayesian Methods for Data-Driven Requirements Engineering

    Data-driven requirements engineering represents a vision for a shift from the static traditional methods of doing requirements engineering to dynamic, data-driven, user-centered methods. App developers now receive abundant user feedback, ranging from explicit feedback in user comments in app stores and social media to implicit feedback in usage data and system logs. In this dissertation, we describe two novel Bayesian approaches that utilize the available user feedback to support requirements decisions and activities in the context of applications delivered through software marketplaces (web and mobile). In the first part, we propose to exploit implicit user feedback in the form of usage data to support requirements prioritization and validation. We formulate the problem as a popularity prediction problem and present a novel Bayesian model that is highly interpretable and offers early insights that can be used to support requirements decisions. Experimental results demonstrate that the proposed approach achieves high prediction accuracy and outperforms competitive models. In the second part, we discuss the limitations of previous approaches that use explicit user feedback for requirements extraction and propose instead a novel Bayesian approach that can address those limitations and offer a more efficient and maintainable framework. The proposed approach (1) simplifies the pipeline by accomplishing the classification and summarization tasks using a single model, (2) replaces manual steps in the pipeline with unsupervised alternatives that can accomplish the same task, and (3) offers an alternative way to extract requirements using example-based summaries that retain context. Experimental results demonstrate that the proposed approach achieves equal or better classification accuracy and outperforms competitive models in terms of summarization accuracy. Specifically, we show that the proposed approach can capture 91.3% of the discussed requirements with only 19% of the dataset, i.e., reducing the human effort needed to extract the requirements by about 80%.
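
The effort figure quoted in this abstract follows from simple arithmetic: reading 19% of the dataset leaves 81% unread, which the abstract rounds to 80%. The sketch below only restates that calculation with the two percentages given in the abstract; the function name is illustrative and not taken from the dissertation.

```python
def effort_reduction(fraction_reviewed):
    """Fraction of the dataset that no longer has to be read by a human."""
    return 1.0 - fraction_reviewed


# Figures quoted in the abstract
fraction_reviewed = 0.19      # share of the dataset that is actually read
requirements_found = 0.913    # share of discussed requirements captured

saved = effort_reduction(fraction_reviewed)
print(f"Reading {fraction_reviewed:.0%} of the reviews captures "
      f"{requirements_found:.1%} of the requirements and skips {saved:.0%} of the data.")
```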

    Mining app reviews to support software engineering

    The thesis studies how mining app reviews can support software engineering. App reviews (short user reviews of an app in app stores) provide a potentially rich source of information to help software development teams maintain and evolve their products. Exploiting this information is, however, difficult due to the large number of reviews and the difficulty of extracting useful, actionable information from short informal texts. A variety of app review mining techniques have been proposed to classify reviews and to extract information such as feature requests, bug descriptions, and user sentiments, but the usefulness of these techniques in practice is still unknown. Research in this area has grown rapidly, resulting in a large number of scientific publications (at least 182 between 2010 and 2020), but almost no independent evaluation or description of how diverse techniques fit together to support specific software engineering tasks has been performed so far. The thesis presents a series of contributions to address these limitations. We first report the findings of a systematic literature review of app review mining, exposing the breadth and limitations of research in this area. Using findings from the literature review, we then present a reference model that relates features of app review mining tools to specific software engineering tasks supporting requirements engineering, software maintenance, and evolution. We then present two additional contributions extending previous evaluations of app review mining techniques. We present a novel independent evaluation of opinion mining techniques using an annotated dataset created for our experiment. Our evaluation finds lower effectiveness than initially reported by the techniques' authors. The final part of the thesis evaluates approaches for searching for app reviews pertinent to a particular feature. The findings show that a general-purpose search technique is more effective than the state-of-the-art purpose-built app review mining techniques, and suggest its usefulness for requirements elicitation. Overall, the thesis contributes to improving the empirical evaluation of app review mining techniques and their application in software engineering practice. Researchers and developers of future app mining tools will benefit from the novel reference model, detailed experiment designs, and publicly available datasets presented in the thesis.

    Opinion Mining for Software Development: A Systematic Literature Review

    Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies. SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in code comments and extracting users’ criticisms of mobile apps. Given the large number of relevant studies available, it can take considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils these approaches entail. We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4) concerns or limitations SE researchers might need to take into account when applying or customizing these opinion mining techniques. The results of our study serve as a reference for choosing suitable opinion mining tools for software development activities, and provide critical insights for the further development of opinion mining techniques in the SE domain.

    Rakenduste kasutajaarvustustest informatsiooni kaevandamine tarkvara arendustegevuste soodustamiseks

    For app developers, it is important to continuously evaluate the needs and expectations of their users to improve app quality. User reviews submitted to app marketplaces are regarded as a useful information source for reassessing evolving user needs. The large volume of user reviews received every day requires automatic methods to find such information in user reviews. Text classification models can be used to categorize review information into types such as feature requests and bug reports, while automatic app feature extraction from user reviews can help in summarizing users’ sentiments at the level of app features. For classifying review information, we perform experiments to compare the performance of simple models using only lexical features with that of models with rich linguistic features and models built on deep learning architectures, namely a Convolutional Neural Network (CNN). To investigate factors influencing the performance of automatic app feature extraction methods, i.e., rule-based and supervised machine learning methods, we first establish a baseline in a single experimental setting and then compare performance in different experimental settings (i.e., varying annotated datasets and evaluation methods). Since the performance of supervised feature extraction methods is more sensitive than that of rule-based methods to (1) the guidelines used to annotate app features in user reviews and (2) the size of the annotated data, we investigate their impact on the performance of supervised feature extraction models and suggest new annotation guidelines that have the potential to improve feature extraction performance. To make the research results of the thesis project applicable for non-experts as well, we developed a proof-of-concept tool for comparing competing apps. The tool combines review classification and app feature extraction methods and has been evaluated by ten developers from industry, who perceived it as useful for improving app quality.
    https://www.ester.ee/record=b529379
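
As an illustration of what a "simple model using only lexical features" can look like, the sketch below classifies reviews as bug reports or feature requests from bag-of-words counts. It is not the thesis's model: multinomial naive Bayes is swapped in here as one common lexical baseline, and the class name, labels, and training reviews are all made up for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (the lexical features)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesReviewClassifier:
    """Multinomial naive Bayes over bag-of-words counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of reviews
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(doc_count / total_docs)  # log prior
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training reviews (not from any dataset in the thesis)
train = [
    ("The app crashes every time I open the camera", "bug report"),
    ("Login fails with an error after the update", "bug report"),
    ("Please add a dark mode option", "feature request"),
    ("It would be great to add offline support", "feature request"),
]
clf = NaiveBayesReviewClassifier().fit(*zip(*train))
print(clf.predict("App crashes with an error on login"))  # bug report
print(clf.predict("Please add support for widgets"))      # feature request
```

With so little training data the predictions only reflect word overlap; the richer linguistic features and CNN models mentioned in the abstract are meant to go beyond that.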

    Detection of spam review on mobile app stores, evaluation of helpfulness of user reviews and extraction of quality aspects using machine learning techniques

    As mobile devices have overtaken fixed Internet access, mobile applications and distribution platforms have gained in importance. App stores enable users to search for and purchase mobile applications and then to give feedback in the form of reviews and ratings. A review might contain critical information about user experience, feature requests, and bug reports. User reviews are valuable not only to developers and software organizations interested in learning the opinion of their customers but also to prospective users who would like to find out what others think about an app. Even though some surveys have inventoried techniques and methods in opinion mining and sentiment analysis, no systematic literature review (SLR) study had yet reported on mobile app store opinion mining and spam review detection problems. Mining opinions from app store reviews requires pre-processing at the text and content levels, including filtering out non-opinionated content and evaluating the trustworthiness and genuineness of the reviews. In addition, the relevance of the extracted features is not cross-validated against main software engineering concepts. This research project first conducted an SLR on the evaluation of mobile app store opinion mining studies. Next, to fill the identified gaps in the literature, we used a novel convolutional neural network to learn document representations for deceptive spam review detection, characterizing, for the first time in the literature, an app store review dataset that includes truthful and spam reviews. Our experiments showed that our neural network based method achieved 82.5% accuracy, while a baseline Support Vector Machine (SVM) classification model reached only 70% accuracy despite leveraging various feature combinations. We next compared four classification models to assess app store user review helpfulness and proposed a predictive model that makes use of review meta-data along with structural and lexical features for helpfulness prediction. In the last part of this research study, we constructed an annotated app store review dataset for the aspect extraction task, based on the ISO 25010 (Systems and software Product Quality Requirements and Evaluation) standard, and used two deep neural network models for aspect extraction from app store user reviews: Bi-directional Long-Short Term Memory with Conditional Random Field (Bi-LSTM+CRF) and Deep Convolutional Neural Networks with Conditional Random Field (CNN+CRF). Both models achieved a nearly 80% F1 score (the harmonic mean of precision and recall, which takes both false positives and false negatives into account) in exact aspect matching and an 86% F1 score in partial aspect matching.
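
For readers unfamiliar with the metric, F1 is conventionally computed as the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below shows that calculation from raw counts. The counts are hypothetical and are not taken from the thesis, and the exact versus partial matching rules used there are not reproduced.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) computed from raw prediction counts."""
    predicted = true_positives + false_positives  # everything the model flagged
    actual = true_positives + false_negatives     # everything it should have flagged
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts for an aspect extractor: 6 correct aspects,
# 2 spurious ones, and 4 aspects missed.
p, r, f1 = precision_recall_f1(true_positives=6, false_positives=2, false_negatives=4)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1 by trading away recall for precision or the reverse.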