210 research outputs found

    On Using Active Learning and Self-Training when Mining Performance Discussions on Stack Overflow

    Full text link
    Abundant data is the key to successful machine learning. However, supervised learning requires annotated data that are often hard to obtain. In a classification task with limited resources, Active Learning (AL) promises to guide annotators to examples that bring the most value for a classifier. AL can be successfully combined with self-training, i.e., extending a training set with the unlabelled examples for which a classifier is the most certain. We report our experiences on using AL in a systematic manner to train an SVM classifier for Stack Overflow posts discussing performance of software components. We show that the training examples deemed as the most valuable to the classifier are also the most difficult for humans to annotate. Despite carefully evolved annotation criteria, we report low inter-rater agreement, but we also propose mitigation strategies. Finally, based on one annotator's work, we show that self-training can improve the classification accuracy. We conclude the paper by discussing implication for future text miners aspiring to use AL and self-training.Comment: Preprint of paper accepted for the Proc. of the 21st International Conference on Evaluation and Assessment in Software Engineering, 201

    Case Studies in Industry: What We Have Learnt

    Full text link
    Case study research has become an important research methodology for exploring phenomena in their natural contexts. Case studies have earned a distinct role in the empirical analysis of software engineering phenomena which are difficult to capture in isolation. Such phenomena often appear in the context of methods and development processes for which it is difficult to run large, controlled experiments as they usually have to reduce the scale in several respects and, hence, are detached from the reality of industrial software development. The other side of the medal is that the realistic socio-economic environments where we conduct case studies -- with real-life cases and realistic conditions -- also pose a plethora of practical challenges to planning and conducting case studies. In this experience report, we discuss such practical challenges and the lessons we learnt in conducting case studies in industry. Our goal is to help especially inexperienced researchers facing their first case studies in industry by increasing their awareness for typical obstacles they might face and practical ways to deal with those obstacles.Comment: Proceedings of the 4th International Workshop on Conducting Empirical Studies in Industry, co-located with ICSE, 201

    A Longitudinal Study of Identifying and Paying Down Architectural Debt

    Full text link
    Architectural debt is a form of technical debt that derives from the gap between the architectural design of the system as it "should be" compared to "as it is". We measured architecture debt in two ways: 1) in terms of system-wide coupling measures, and 2) in terms of the number and severity of architectural flaws. In recent work it was shown that the amount of architectural debt has a huge impact on software maintainability and evolution. Consequently, detecting and reducing the debt is expected to make software more amenable to change. This paper reports on a longitudinal study of a healthcare communications product created by Brightsquid Secure Communications Corp. This start-up company is facing the typical trade-off problem of desiring responsiveness to change requests, but wanting to avoid the ever-increasing effort that the accumulation of quick-and-dirty changes eventually incurs. In the first stage of the study, we analyzed the status of the "before" system, which indicated the impacts of change requests. This initial study motivated a more in-depth analysis of architectural debt. The results of this analysis were used to motivate a comprehensive refactoring of the software system. The third phase of the study was a follow-on architectural debt analysis which quantified the improvements made. Using this quantitative evidence, augmented by qualitative evidence gathered from in-depth interviews with Brightsquid's architects, we present lessons learned about the costs and benefits of paying down architecture debt in practice.Comment: Submitted to ICSE-SEIP 201

    Predicting and Evaluating Software Model Growth in the Automotive Industry

    Full text link
    The size of a software artifact influences the software quality and impacts the development process. In industry, when software size exceeds certain thresholds, memory errors accumulate and development tools might not be able to cope anymore, resulting in a lengthy program start up times, failing builds, or memory problems at unpredictable times. Thus, foreseeing critical growth in software modules meets a high demand in industrial practice. Predicting the time when the size grows to the level where maintenance is needed prevents unexpected efforts and helps to spot problematic artifacts before they become critical. Although the amount of prediction approaches in literature is vast, it is unclear how well they fit with prerequisites and expectations from practice. In this paper, we perform an industrial case study at an automotive manufacturer to explore applicability and usability of prediction approaches in practice. In a first step, we collect the most relevant prediction approaches from literature, including both, approaches using statistics and machine learning. Furthermore, we elicit expectations towards predictions from practitioners using a survey and stakeholder workshops. At the same time, we measure software size of 48 software artifacts by mining four years of revision history, resulting in 4,547 data points. In the last step, we assess the applicability of state-of-the-art prediction approaches using the collected data by systematically analyzing how well they fulfill the practitioners' expectations. Our main contribution is a comparison of commonly used prediction approaches in a real world industrial setting while considering stakeholder expectations. We show that the approaches provide significantly different results regarding prediction accuracy and that the statistical approaches fit our data best

    Software Engineers' Information Seeking Behavior in Change Impact Analysis - An Interview Study

    Get PDF
    Software engineers working in large projects must navigate complex information landscapes. Change Impact Analysis (CIA) is a task that relies on engineers' successful information seeking in databases storing, e.g., source code, requirements, design descriptions, and test case specifications. Several previous approaches to support information seeking are task-specific, thus understanding engineers' seeking behavior in specific tasks is fundamental. We present an industrial case study on how engineers seek information in CIA, with a particular focus on traceability and development artifacts that are not source code. We show that engineers have different information seeking behavior, and that some do not consider traceability particularly useful when conducting CIA. Furthermore, we observe a tendency for engineers to prefer less rigid types of support rather than formal approaches, i.e., engineers value support that allows flexibility in how to practically conduct CIA. Finally, due to diverse information seeking behavior, we argue that future CIA support should embrace individual preferences to identify change impact by empowering several seeking alternatives, including searching, browsing, and tracing.Comment: Accepted for publication in the proceedings of the 25th International Conference on Program Comprehensio

    Web Application for Students and Lecturer of Electrical Engineering Faculty

    Get PDF
    Electrical Engineering Faculty, nowadays needs to develop by their students the “FIE Student MIS Application”. Development and usage of Web applications are popular due to the ubiquity of the browser as a client, commented mostly as a thin client. The ability to update and maintain web applications without distributing and installing software on potentially thousands of client computers is a key reason for their popularity. This paper presents the “FIE Student MIS Application” web platform designed for Electrical Engineering Faculty students and their lecturer to support on time correctly without delays and informing online all students and their lecturer receiving online the student’s status as per the FIE exam schema: attendance/activation at lecturer/seminar presence %, exam %, project/essay %, final grade of the exam; schedule of lecturers/seminars by day/week/month per class, other information coming from secretary part to all students in general or to one student. The process of implementation of the “FIE Student MIS Application” is based on project management of web application as per: definition, planning, management and evaluation

    A pilot empirical study of applying a usability technique in an open source software project

    Full text link
    Context: The growth in the number of non-technical open source software (OSS) application users and the escalating use of these applications have redoubled the need for, and interest in, developing usable OSS. OSS communities are unclear about which techniques to use in each development process activity. Objective: The aim of our research is to adapt a usability technique (visual brainstorming) to an OSS project and evaluate the feasibility of its application. Method: We used the case study research method to investigate technique application and participation in a project. To do this, we participated as volunteers in the HistoryCal project. Results: We identified adverse conditions that were an obstacle to technique application (like it was not easy to recruit OSS users to participate) and modified the technique to make it applicable. Conclusion: We conclude that these changes were helpful for applying the technique using web artifacts like blogsThis research was funded by the SENESCYT , Quevedo State Tech- nical University, TIN2014-52129-R and TIN2014-60490-P projects and the e-Madrid project (S2013/ICE-2715

    An Embedded Multiple-Case Study on OSS Design Quality Assessment across Domains

    Get PDF
    • …
    corecore