    Identifying Bugs in Make and JVM-Oriented Builds

    Incremental and parallel builds are crucial features of modern build systems. Parallelism enables fast builds by running independent tasks simultaneously, while incrementality saves time and computing resources by processing only the build operations that were affected by a particular code change. Writing build definitions that lead to error-free incremental and parallel builds is a challenging task, mainly because developers are often unable to predict the effects of build operations on the file system and how different build operations interact with each other. Faulty build scripts may seriously degrade the reliability of automated builds, as they cause build failures as well as non-deterministic and incorrect build results. To reason about arbitrary build executions, we present buildfs, a generally applicable model that takes into account both the specification (as declared in build scripts) and the actual behavior (low-level file system operations) of build operations. We then formally define different types of faults related to incremental and parallel builds in terms of the conditions under which a file system operation violates the specification of a build operation. Our testing approach, which relies on the proposed model, analyzes the execution of a single full build, translates it into buildfs, and uncovers faults by checking for corresponding violations. We evaluate the effectiveness, efficiency, and applicability of our approach by examining hundreds of Make and Gradle projects. Notably, our method is the first to handle Java-oriented build systems. The results indicate that our approach is (1) able to uncover several important issues (245 issues found in 45 open-source projects have been confirmed and fixed by the upstream developers), and (2) orders of magnitude faster than a state-of-the-art tool for Make builds.
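
The idea of comparing a declared specification against observed file system behavior can be illustrated with a small sketch. This is not the buildfs model or its fault definitions; it is a simplified stand-in, and the names `Task`, `depends_on`, `find_violations`, and the labels `undeclared-input` and `unordered-access` are assumptions made for this example. It flags a task that reads a source file it never declared (a risk for incremental builds) and a task that reads a file produced by another task without declaring a dependency on it (a risk for parallel builds).

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Declared specification of one build task (hypothetical structure)."""
    name: str
    inputs: set = field(default_factory=set)   # files declared as inputs
    outputs: set = field(default_factory=set)  # files declared as outputs
    deps: set = field(default_factory=set)     # tasks declared to run first


def depends_on(tasks, a, b):
    """Return True if task `a` transitively declares a dependency on `b`."""
    seen, stack = set(), [a]
    while stack:
        current = stack.pop()
        for dep in tasks[current].deps:
            if dep == b:
                return True
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return False


def find_violations(tasks, trace):
    """Compare observed file accesses with the declared specification.

    `trace` is a list of (task_name, op, path) tuples, where op is
    either "consume" or "produce".
    """
    # Record which tasks were observed producing each file.
    producers = {}
    for task, op, path in trace:
        if op == "produce":
            producers.setdefault(path, set()).add(task)

    violations = []
    for task, op, path in trace:
        if op != "consume":
            continue
        made_by = producers.get(path, set())
        others = [p for p in made_by if p != task]
        if any(not depends_on(tasks, task, p) for p in others):
            # Reads another task's output with no declared ordering.
            violations.append(("unordered-access", task, path))
        elif not made_by and path not in tasks[task].inputs:
            # Reads a source file that the specification never mentions.
            violations.append(("undeclared-input", task, path))
    return violations


tasks = {
    "gen": Task("gen", inputs={"schema.txt"}, outputs={"gen.h"}),
    "compile": Task("compile", inputs={"main.c"}, outputs={"main.o"}),
}
trace = [
    ("gen", "consume", "schema.txt"),
    ("gen", "produce", "gen.h"),
    ("compile", "consume", "main.c"),
    ("compile", "consume", "util.h"),
    ("compile", "consume", "gen.h"),
    ("compile", "produce", "main.o"),
]
result = find_violations(tasks, trace)
print(result)
```

In this toy trace, `compile` reads `util.h` without declaring it and reads `gen.h` without depending on `gen`, so both accesses are reported. Declaring `deps={"gen"}` and adding `util.h` to the inputs of `compile` makes the report empty.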

    Tracing software build processes to uncover license compliance inconsistencies

    Automated analysis for auto-generated build systems

    Software build systems are crucial for software development as they translate the source code and resources into a deliverable. Prior work has identified that build systems account for 9% of software systems. However, their maintenance imposes a 36% overhead on software development. This overhead stems from the unique and hard-to-comprehend nature of build systems. When executed, the build system is evaluated into a dependency graph that captures how the system’s artifacts relate to each other. The generated graph depends on the selected build configurations. This graph is then traversed to perform the build. Prior work has emphasized the need for analysis support to tackle the challenges of evolving and maintaining build systems. In this thesis, we tackle three challenges associated with the maintenance and evolution of build systems. As the build system evolves, it is not trivial to understand the impact of build code changes on its semantics. To tackle this, we propose a build code differencing technique to identify the semantic changes between two versions of a given build system. This would provide visibility into how the build system is evolving along with the software system. The second challenge we tackle is localizing faults within build systems. Build-time failures occur after the build code has been evaluated, during the traversal of the dependency graph, so it is challenging to trace a failure from the graph back to its root cause in the build system code. To this end, we propose a novel approach to localize faults in build code. For a given build failure, it returns a ranked list of statements in the build code that are suspected of causing the failure. This would aid in reducing the overhead of debugging and root-causing build failures. The third challenge is to extract knowledge from build systems for analysis purposes. We propose an approach to extract the presence conditions of source code files from within the build system. This aims to support configuration-aware analysis of configurable source code influenced by the build system. We then proceed to propose a foundation for developers to create analysis techniques to help them understand, maintain, and migrate their generator-based build system. We illustrate the use of the platform with two approaches: one to help developers better understand their build systems and another to detect build smells to improve the build code quality. To evaluate our work, we implement our proposed approaches against the widely used GNU build suite. Then, we use open-source projects to evaluate each of the approaches.

    Dependency Management 2.0 – A Semantic Web Enabled Approach

    Software development and evolution are highly distributed processes that involve a multitude of supporting tools and resources. Application programming interfaces are commonly used by software developers to reduce development cost and complexity by reusing code developed by third parties or published by the open source community. However, these application programming interfaces have also introduced new challenges to the Software Engineering community (e.g., software vulnerabilities, API incompatibilities, and software license violations) that not only extend beyond the traditional boundaries of individual projects but also involve different software artifacts. As a result, there is a need for a technology-independent representation of software dependency semantics and the ability to seamlessly integrate this representation with knowledge from other software artifacts. The Semantic Web and its supporting technology stack have been widely promoted to model, integrate, and support interoperability among heterogeneous data sources. This dissertation takes advantage of the Semantic Web and its enabling technology stack for knowledge modeling and integration. The thesis introduces five major contributions: (1) We present a formal Software Build System Ontology – SBSON, which captures concepts and properties for software build and dependency management systems. This formal knowledge representation allows us to take advantage of Semantic Web inference services, forming the basis for a more flexible API dependency analysis compared to traditional proprietary analysis approaches. (2) We conduct a user survey involving 53 open source developers to gain insights into how actual developers manage API breaking changes. (3) We introduce a novel approach which integrates our SBSON model with knowledge about source code usage and changes within the Maven ecosystem to support API consumers and producers in managing (assessing and minimizing) the impacts of breaking changes. (4) A Security Vulnerability Analysis Framework (SV-AF) is introduced, which integrates build system, source code, versioning system, and vulnerability ontologies to trace and assess the impact of security vulnerabilities across project boundaries. (5) Finally, we introduce an Ontological Trustworthiness Assessment Model (OntTAM). OntTAM is an integration of our build, source code, vulnerability, and license ontologies which supports a holistic analysis and assessment of quality attributes related to the trustworthiness of libraries and APIs in open source systems. Several case studies are presented to illustrate the applicability and flexibility of our modelling approach, demonstrating that it can seamlessly integrate and reuse knowledge extracted from existing build and dependency management systems with other heterogeneous data sources found in the software engineering domain. As part of our case studies, we also demonstrate how this unified knowledge model can enable new types of project dependency analysis.
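
Tracing the impact of a vulnerability across project boundaries amounts, at its simplest, to following dependency edges in reverse. The sketch below shows that step as a plain breadth-first traversal. It is not SV-AF and uses no ontology or Semantic Web inference; the function `affected_projects` and the project names are hypothetical and chosen only for illustration.

```python
from collections import deque


def affected_projects(depends_on, vulnerable):
    """Return every project that transitively depends on a vulnerable library.

    `depends_on` maps a project name to the set of libraries it uses directly.
    `vulnerable` is an iterable of library names with known vulnerabilities.
    """
    # Invert the edges: library -> projects that use it directly.
    dependents = {}
    for project, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(project)

    affected = set()
    queue = deque(vulnerable)
    while queue:
        library = queue.popleft()
        for project in dependents.get(library, ()):
            if project not in affected:
                affected.add(project)
                queue.append(project)  # its own dependents are affected too
    return affected


# Hypothetical dependency data for illustration only.
graph = {
    "app": {"web-lib"},
    "web-lib": {"parser"},
    "tool": {"logger"},
}
impacted = affected_projects(graph, {"parser"})
print(sorted(impacted))
```

Here `app` never references `parser` directly, yet it is reported because it depends on `web-lib`, which does. An ontology-based approach could additionally reason over versions, licenses, and source code usage, which this traversal ignores.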