5 research outputs found

    Gitana: a SQL-based Git Repository Inspector

    Software development projects are notoriously complex and difficult to deal with. Several support tools such as issue tracking, code review and Source Control Management (SCM) systems have been introduced in the past decades to ease development activities. While such tools efficiently track the evolution of a given aspect of the project (e.g., bug reports), they provide only a partial view of the project and often lack advanced querying mechanisms, limiting themselves to command-line or simple GUI support. This is particularly true for projects that rely on Git, the most popular SCM system today. In this paper, we propose a conceptual schema for Git and an approach that, given a Git repository, exports its data to a relational database in order to (1) promote data integration with other existing SCM tools and (2) enable writing queries on Git data using standard SQL syntax. To ensure efficiency, our approach comes with an incremental propagation mechanism that refreshes the database content with the latest modifications. We have implemented our approach in Gitana, an open-source tool available on GitHub.
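    The idea of querying Git data with standard SQL can be illustrated with a minimal sketch. The two-table schema below (`developer`, `git_commit`) and the sample rows are hypothetical and chosen only for illustration; they are not Gitana's actual schema, and no real repository is read. An in-memory SQLite database stands in for the relational database the abstract mentions.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
# This is NOT Gitana's actual schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE developer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE git_commit (
    sha          TEXT PRIMARY KEY,
    author_id    INTEGER NOT NULL REFERENCES developer(id),
    committed_at TEXT NOT NULL,
    message      TEXT
);
""")

# Made-up sample data standing in for an exported repository.
conn.executemany(
    "INSERT INTO developer VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)
conn.executemany(
    "INSERT INTO git_commit VALUES (?, ?, ?, ?)",
    [
        ("a1", 1, "2017-01-03", "Initial import"),
        ("b2", 2, "2017-01-04", "Fix parser bug"),
        ("c3", 1, "2017-01-05", "Add tests"),
    ],
)


def commits_per_author(connection):
    """Return (author name, commit count) pairs, most active author first."""
    rows = connection.execute("""
        SELECT d.name, COUNT(*) AS n
        FROM git_commit AS c
        JOIN developer AS d ON d.id = c.author_id
        GROUP BY d.name
        ORDER BY n DESC, d.name
    """)
    return rows.fetchall()


print(commits_per_author(conn))
```

    Once the data is in relational form, a question such as "who commits most?" becomes a single join and aggregation rather than a chain of command-line calls, which is the kind of benefit the abstract points to.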

    Mega Software Engineering

    Technical Report of Software Engineering Lab in Osaka Univ. SEL-Sep-22-200

    Synchronous development in open-source projects: A higher-level perspective

    Mailing lists are a major communication channel for supporting developer coordination in open-source software projects. In a recent study, researchers explored temporal relationships (e.g., synchronization) between developer activities on source code and on the mailing list, relying on simple heuristics of developer collaboration (e.g., co-editing files) and developer communication (e.g., sending e-mails to the mailing list). We propose two methods for studying synchronization between collaboration and communication activities from a higher-level perspective, which captures the complex activities and views of developers more precisely than the rather technical perspective of previous work. On the one hand, we explore developer collaboration at the level of features (not files), which are higher-level concepts of the domain and not mere technical artifacts. On the other hand, we lift the view of developer communication from a message-based model, which treats each e-mail individually, to a conversation-based model, which is semantically richer due to grouping e-mails that represent conceptually related discussions. By means of an empirical study, we investigate whether the different abstraction levels affect the observed relationship between commit activity and e-mail communication using state-of-the-art time series analysis. For this purpose, we analyze a combined history of 40 years of data for three highly active and widely deployed open-source projects: QEMU, BusyBox, and OpenSSL. Overall, we found evidence that a higher-level view on the coordination of developers leads to identifying a stronger statistical dependence between the technical activities of developers than a less abstract and rather technical view.

    Evidence-based Software Process Recovery

    Developing a large software system involves many complicated, varied, and inter-dependent tasks, and these tasks are typically implemented using a combination of defined processes, semi-automated tools, and ad hoc practices. Stakeholders in the development process --- including software developers, managers, and customers --- often want to be able to track the actual practices being employed within a project. For example, a customer may wish to be sure that the process is ISO 9000 compliant, a manager may wish to track the amount of testing that has been done in the current iteration, and a developer may wish to determine who has recently been working on a subsystem that has had several major bugs appear in it. However, extracting the software development processes from an existing project is expensive if one must rely upon manual inspection of artifacts and interviews of developers and their managers. Previously, researchers have suggested the live observation and instrumentation of a project to allow for more measurement, but this is costly, invasive, and also requires a live running project. In this work, we propose an approach that we call software process recovery that is based on after-the-fact analysis of various kinds of software development artifacts. We use a variety of supervised and unsupervised techniques from machine learning, topic analysis, natural language processing, and statistics on software repositories such as version control systems, bug trackers, and mailing list archives. We show how we can combine all of these methods to recover process signals that we map back to software development processes such as the Unified Process. The Unified Process has been visualized using a time-line view that shows effort per parallel discipline occurring across time. This visualization is called the Unified Process diagram. 
    We use this diagram as inspiration to produce Recovered Unified Process Views (RUPV) that are a concrete version of this theoretical Unified Process diagram. We then validate these methods using case studies of multiple open source software systems.

    Process-Centric Analytical Processing of Version Control Data

    This paper introduces a novel approach to enabling analytical processing of project data. The approach exploits source code repositories for information about project evolution. Furthermore, this paper proposes a new perspective on analyzing version control data. It takes up a process-centric viewpoint, addresses related analysis problems such as the collaboration of programmers, and proposes metrics for them. The research has yielded an implementation of the approach, which comprises visualizations that assist in examining the evolution of the software process.