
    The International Land Model Benchmarking (ILAMB) System: Design, Theory, and Implementation

    The increasing complexity of Earth system models has inspired efforts to quantitatively assess model fidelity through rigorous comparison with the best available measurements and observational data products. Earth system models exhibit a high degree of spread in predictions of land biogeochemistry, biogeophysics, and hydrology, which are sensitive to forcing from other model components. Based on insights from prior land model evaluation studies and community workshops, the authors developed an open source model benchmarking software package that generates graphical diagnostics and scores model performance in support of the International Land Model Benchmarking (ILAMB) project. Employing a suite of in situ, remote sensing, and reanalysis data sets, the ILAMB package performs comprehensive model assessment across a wide range of land variables and generates a hierarchical set of web pages containing statistical analyses and figures designed to give the user insight into the strengths and weaknesses of multiple models or model versions. Described here are the benchmarking philosophy and mathematical methodology embodied in the most recent implementation of the ILAMB package. Comparison methods unique to a few specific data sets are presented, and guidelines for configuring an ILAMB analysis and interpreting the resulting model performance scores are discussed. ILAMB is being adopted by modeling teams and centers during model development and for model intercomparison projects, and community engagement is sought for extending evaluation metrics and adding new observational data sets to the benchmarking framework.
    Key Point: The ILAMB benchmarking system broadly compares models to observational data sets and provides a synthesis of overall performance.
    Peer Reviewed
    https://deepblue.lib.umich.edu/bitstream/2027.42/146994/1/jame20779_am.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/146994/2/jame20779.pd
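    A minimal Python sketch of the kind of scoring such a benchmark performs, assuming a simple exp(-relative error) mapping from bias and centered RMSE onto a [0, 1] score; this is an illustration of the general idea only, not the ILAMB package's actual API or scoring definitions.

    # Simplified sketch of scoring a modeled variable against observations.
    # The exp(-relative error) mapping and variable names are illustrative
    # assumptions, not the ILAMB package's implementation.
    import numpy as np

    def bias_score(model, obs):
        """Map the absolute bias, relative to the observed mean, onto [0, 1]."""
        bias = abs(float(np.mean(model)) - float(np.mean(obs)))
        relative_error = bias / (abs(float(np.mean(obs))) + 1e-12)
        return float(np.exp(-relative_error))  # 1 = perfect agreement, -> 0 as error grows

    def rmse_score(model, obs):
        """Map the centered RMSE, relative to the observed spread, onto [0, 1]."""
        crmse = np.sqrt(np.mean(((model - np.mean(model)) - (obs - np.mean(obs))) ** 2))
        relative_error = crmse / (np.std(obs) + 1e-12)
        return float(np.exp(-relative_error))

    # Score two hypothetical model versions against the same observations.
    obs = np.random.default_rng(0).normal(2.0, 0.5, size=1000)
    models = {"model_a": obs + 0.1, "model_b": obs * 1.4}
    for name, model in models.items():
        print(name, round(bias_score(model, obs), 3), round(rmse_score(model, obs), 3))

    Scores normalized to [0, 1] in this way can be averaged across variables and data sets to give the kind of overall performance synthesis the abstract describes.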

    Too Trivial To Test? An Inverse View on Defect Prediction to Identify Methods with Low Fault Risk

    Background. Test resources are usually limited, and therefore it is often not possible to completely test an application before a release. To cope with the problem of scarce resources, development teams can apply defect prediction to identify fault-prone code regions. However, defect prediction tends to have low precision in cross-project prediction scenarios. Aims. We take an inverse view on defect prediction and aim to identify methods that can be deferred when testing because they contain hardly any faults due to their code being "trivial". We expect that characteristics of such methods might be project-independent, so that our approach could improve cross-project predictions. Method. We compute code metrics and apply association rule mining to create rules for identifying methods with low fault risk. We conduct an empirical study to assess our approach with six Java open-source projects containing precise fault data at the method level. Results. Our results show that inverse defect prediction can identify approx. 32-44% of the methods of a project as having a low fault risk; on average, they are about six times less likely to contain a fault than other methods. In cross-project predictions with larger, more diversified training sets, identified methods are even eleven times less likely to contain a fault. Conclusions. Inverse defect prediction supports the efficient allocation of test resources by identifying methods that can be treated with less priority in testing activities and is well applicable in cross-project prediction scenarios.
    Comment: Submitted to PeerJ C
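    A small Python sketch of the "inverse" idea described above, assuming hypothetical metrics and thresholds; the paper instead derives its rules via association rule mining on method-level fault data, which is not reproduced here.

    # Illustrative sketch: flag methods whose metrics suggest trivial code as
    # low fault risk. Metrics and thresholds below are assumed for illustration,
    # not the rules mined in the study.
    from dataclasses import dataclass

    @dataclass
    class MethodMetrics:
        name: str
        sloc: int                    # source lines of code in the method body
        cyclomatic: int              # 1 means straight-line code, no branches
        is_getter_or_setter: bool

    def low_fault_risk(m: MethodMetrics) -> bool:
        # Hypothetical rule set: trivially small, branch-free methods are deprioritized.
        return m.is_getter_or_setter or (m.sloc <= 3 and m.cyclomatic == 1)

    methods = [
        MethodMetrics("getId", 1, 1, True),
        MethodMetrics("parseConfig", 42, 9, False),
    ]
    deferred = [m.name for m in methods if low_fault_risk(m)]
    print("Lower testing priority:", deferred)  # -> ['getId']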

    Using a Combination of Measurement Tools to Extract Metrics from Open Source Projects

    Software measurement can play a major role in ensuring the quality and reliability of software products. The measurement activities require appropriate tools to collect relevant metric data. Currently, there are several such tools available for software measurement. The main objective of this paper is to provide some guidelines for using a combination of multiple measurement tools, especially for products built using object-oriented techniques and languages. In this paper, we highlight three tools for collecting metric data, in our case from several Java-based open source projects. Our research is currently based on the work of Card and Glass, who argue that design complexity measures (data complexity and structural complexity) are indicators/predictors of procedural/cyclomatic complexity (decision counts) and errors (discovered from system tests). Their work was centered on structured design, whereas our work is with object-oriented designs; the metrics we use parallel those of Card and Glass, namely Henry and Kafura's Information Flow Metrics, McCabe's Cyclomatic Complexity, and Chidamber and Kemerer's Object-Oriented Metrics.
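    A short Python sketch of combining per-class exports from different measurement tools and checking a Card-and-Glass-style relationship between structural (design) complexity and cyclomatic complexity; file names and column names are assumptions, since each real tool uses its own export format.

    import csv
    from statistics import correlation  # requires Python 3.10+

    def load(path, key_column, value_column):
        # Read one tool's per-class export into {class name: metric value}.
        with open(path, newline="") as f:
            return {row[key_column]: float(row[value_column]) for row in csv.DictReader(f)}

    # Hypothetical exports from two different measurement tools.
    structural = load("information_flow.csv", "class", "structural_complexity")  # Henry & Kafura style
    cyclomatic = load("cyclomatic.csv", "class", "sum_cyclomatic")                # McCabe style

    common = sorted(set(structural) & set(cyclomatic))
    r = correlation([structural[c] for c in common], [cyclomatic[c] for c in common])
    print(f"classes matched: {len(common)}, Pearson r: {r:.2f}")

    Joining the exports on a shared key (the fully qualified class name) is the step that lets metrics from independent tools be analyzed together.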

    Analysis of source code metrics from ns-2 and ns-3 network simulators

    Ns-2 and its successor, ns-3, are discrete-event network simulators that are closely related to each other, as they share a common background, concepts, and similar aims. Ns-3 is still under development, but it offers some interesting characteristics for developers, while ns-2 still has a large user base. While other studies have compared different network simulators by focusing on performance measurements, in this paper we adopted a different approach, focusing on technical characteristics and using software metrics to obtain useful conclusions. We chose ns-2 and ns-3 for our case study because of the popularity of the former in research and the increasing use of the latter. This reflects the current situation, in which ns-3 has emerged as a viable alternative to ns-2 due to its features and design. The paper assesses the current state of both projects and their respective evolution, supported by measurements obtained from a broad set of software metrics. By also considering qualitative characteristics, we obtained a summary of the technical features of both simulators, including architectural design, software dependencies, and documentation policies.
    Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
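    A minimal Python sketch of collecting one such source code metric (physical SLOC per language) from two simulator source trees for side-by-side comparison; the directory names are placeholders, and the study itself relies on dedicated measurement tools and a much broader metric set.

    import os
    from collections import Counter

    EXTENSIONS = {".cc": "C++", ".h": "C/C++ header", ".tcl": "OTcl", ".py": "Python"}

    def sloc_by_language(root):
        # Count non-blank lines per language under one source tree.
        counts = Counter()
        for dirpath, _, files in os.walk(root):
            for name in files:
                lang = EXTENSIONS.get(os.path.splitext(name)[1])
                if lang is None:
                    continue
                with open(os.path.join(dirpath, name), errors="ignore") as f:
                    counts[lang] += sum(1 for line in f if line.strip())
        return counts

    for tree in ("ns-2.35", "ns-3-dev"):  # placeholder checkout directories
        print(tree, dict(sloc_by_language(tree)))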