62,543 research outputs found

    Model evaluation of target product profiles of an infant vaccine against respiratory syncytial virus (RSV) in a developed country setting

    Get PDF
    Respiratory syncytial virus (RSV) is a major cause of lower respiratory tract disease in children worldwide and is a significant cause of hospital admissions in young children in England. No RSV vaccine has been licensed but a number are under development. In this work, we present two structurally distinct mathematical models, parameterized using RSV data from the UK, which have been used to explore the effect of introducing an RSV paediatric vaccine to the National programme. We have explored different vaccine properties, and dosing regimens combined with a range of implementation strategies for RSV control. The results suggest that vaccine properties that confer indirect protection have the greatest effect in reducing the burden of disease in children under 5 years. The findings are reinforced by the concurrence of predictions from the two models with very different epidemiological structure. The approach described has general application in evaluating vaccine target product profiles

    Dropout Model Evaluation in MOOCs

    Full text link
    The field of learning analytics needs to adopt a more rigorous approach for predictive model evaluation that matches the complex practice of model-building. In this work, we present a procedure to statistically test hypotheses about model performance which goes beyond the state-of-the-practice in the community to analyze both algorithms and feature extraction methods from raw data. We apply this method to a series of algorithms and feature sets derived from a large sample of Massive Open Online Courses (MOOCs). While a complete comparison of all potential modeling approaches is beyond the scope of this paper, we show that this approach reveals a large gap in dropout prediction performance between forum-, assignment-, and clickstream-based feature extraction methods, where the latter is significantly better than the former two, which are in turn indistinguishable from one another. This work has methodological implications for evaluating predictive or AI-based models of student success, and practical implications for the design and targeting of at-risk student models and interventions

    VPPA weld model evaluation

    Get PDF
    NASA uses the Variable Polarity Plasma Arc Welding (VPPAW) process extensively for fabrication of Space Shuttle External Tanks. This welding process has been in use at NASA since the late 1970's but the physics of the process have never been satisfactorily modeled and understood. In an attempt to advance the level of understanding of VPPAW, Dr. Arthur C. Nunes, Jr., (NASA) has developed a mathematical model of the process. The work described in this report evaluated and used two versions (level-0 and level-1) of Dr. Nunes' model, and a model derived by the University of Alabama at Huntsville (UAH) from Dr. Nunes' level-1 model. Two series of VPPAW experiments were done, using over 400 different combinations of welding parameters. Observations were made of VPPAW process behavior as a function of specific welding parameter changes. Data from these weld experiments was used to evaluate and suggest improvements to Dr. Nunes' model. Experimental data and correlations with the model were used to develop a multi-variable control algorithm for use with a future VPPAW controller. This algorithm is designed to control weld widths (both on the crown and root of the weld) based upon the weld parameters, base metal properties, and real-time observation of the crown width. The algorithm exhibited accuracy comparable to that of the weld width measurements for both aluminum and mild steel welds

    Causal networks for climate model evaluation and constrained projections

    Get PDF
    Global climate models are central tools for understanding past and future climate change. The assessment of model skill, in turn, can benefit from modern data science approaches. Here we apply causal discovery algorithms to sea level pressure data from a large set of climate model simulations and, as a proxy for observations, meteorological reanalyses. We demonstrate how the resulting causal networks (fingerprints) offer an objective pathway for process-oriented model evaluation. Models with fingerprints closer to observations better reproduce important precipitation patterns over highly populated areas such as the Indian subcontinent, Africa, East Asia, Europe and North America. We further identify expected model interdependencies due to shared development backgrounds. Finally, our network metrics provide stronger relationships for constraining precipitation projections under climate change as compared to traditional evaluation metrics for storm tracks or precipitation itself. Such emergent relationships highlight the potential of causal networks to constrain longstanding uncertainties in climate change projections. Algorithms to assess causal relationships in data sets have seen increasing applications in climate science in recent years. Here, the authors show that these techniques can help to systematically evaluate the performance of climate models and, as a result, to constrain uncertainties in future climate change projections

    Revisiting Precision and Recall Definition for Generative Model Evaluation

    Full text link
    In this article we revisit the definition of Precision-Recall (PR) curves for generative models proposed by Sajjadi et al. (arXiv:1806.00035). Rather than providing a scalar for generative quality, PR curves distinguish mode-collapse (poor recall) and bad quality (poor precision). We first generalize their formulation to arbitrary measures, hence removing any restriction to finite support. We also expose a bridge between PR curves and type I and type II error rates of likelihood ratio classifiers on the task of discriminating between samples of the two distributions. Building upon this new perspective, we propose a novel algorithm to approximate precision-recall curves, that shares some interesting methodological properties with the hypothesis testing technique from Lopez-Paz et al (arXiv:1610.06545). We demonstrate the interest of the proposed formulation over the original approach on controlled multi-modal datasets.Comment: ICML 201

    User's appraisal of yield model evaluation criteria

    Get PDF
    The five major potential USDA users of AgRISTAR crop yield forecast models rated the Yield Model Development (YMD) project Test and Evaluation Criteria by the importance placed on them. These users were agreed that the "TIMELINES" and "RELIABILITY" of the forecast yields would be of major importance in determining if a proposed yield model was worthy of adoption. Although there was considerable difference of opinion as to the relative importance of the other criteria, "COST", "OBJECTIVITY", "ADEQUACY", AND "MEASURES OF ACCURACY" generally were felt to be more important that "SIMPLICITY" and "CONSISTENCY WITH SCIENTIFIC KNOWLEDGE". However, some of the comments which accompanied the ratings did indicate that several of the definitions and descriptions of the criteria were confusing

    Online Model Evaluation in a Large-Scale Computational Advertising Platform

    Full text link
    Online media provides opportunities for marketers through which they can deliver effective brand messages to a wide range of audiences. Advertising technology platforms enable advertisers to reach their target audience by delivering ad impressions to online users in real time. In order to identify the best marketing message for a user and to purchase impressions at the right price, we rely heavily on bid prediction and optimization models. Even though the bid prediction models are well studied in the literature, the equally important subject of model evaluation is usually overlooked. Effective and reliable evaluation of an online bidding model is crucial for making faster model improvements as well as for utilizing the marketing budgets more efficiently. In this paper, we present an experimentation framework for bid prediction models where our focus is on the practical aspects of model evaluation. Specifically, we outline the unique challenges we encounter in our platform due to a variety of factors such as heterogeneous goal definitions, varying budget requirements across different campaigns, high seasonality and the auction-based environment for inventory purchasing. Then, we introduce return on investment (ROI) as a unified model performance (i.e., success) metric and explain its merits over more traditional metrics such as click-through rate (CTR) or conversion rate (CVR). Most importantly, we discuss commonly used evaluation and metric summarization approaches in detail and propose a more accurate method for online evaluation of new experimental models against the baseline. Our meta-analysis-based approach addresses various shortcomings of other methods and yields statistically robust conclusions that allow us to conclude experiments more quickly in a reliable manner. We demonstrate the effectiveness of our evaluation strategy on real campaign data through some experiments.Comment: Accepted to ICDM201
    corecore