
    Multicriteria Analysis of Neural Network Forecasting Models: An Application to German Regional Labour Markets

    This paper develops a flexible multi-dimensional assessment method for the comparison of different statistical-econometric techniques based on learning mechanisms with a view to analysing and forecasting regional labour markets. The aim of this paper is twofold. A first major objective is to explore the use of a standard choice tool, namely Multicriteria Analysis (MCA), in order to cope with the intrinsic methodological uncertainty on the choice of a suitable statistical-econometric learning technique for regional labour market analysis. MCA is applied here to support choices on the performance of various models, based on classes of Neural Network (NN) techniques, that serve to generate employment forecasts in West Germany at a regional/district level. A second objective of the paper is to analyse the methodological potential of a blend of approaches (NN-MCA) in order to extend the analysis framework to other economic research domains, where formal models are not available, but where a variety of statistical data is present. The paper offers a basis for a more balanced judgement of the performance of rival statistical tests

    How to Host a Data Competition: Statistical Advice for Design and Analysis of a Data Competition

    Data competitions rely on real-time leaderboards to rank competitor entries and stimulate algorithm improvement. While such competitions have become quite popular and prevalent, particularly in supervised learning formats, their implementations by the host are highly variable. Without careful planning, a supervised learning competition is vulnerable to overfitting, where the winning solutions are so closely tuned to the particular set of provided data that they cannot generalize to the underlying problem of interest to the host. Based on our experience, this paper outlines some important considerations for strategically designing relevant and informative data sets to maximize the learning outcome from hosting a competition. It also describes a post-competition analysis that enables robust and efficient assessment of the strengths and weaknesses of solutions from different competitors, as well as greater understanding of the regions of the input space that are well-solved. The post-competition analysis, which complements the leaderboard, uses exploratory data analysis and generalized linear models (GLMs). The GLMs not only expand the range of results we can explore but also provide more detailed analysis of individual sub-questions, including similarities and differences between algorithms across different types of scenarios, universally easy or hard regions of the input space, and different learning objectives. When coupled with a strategically planned data generation approach, the methods provide richer and more informative summaries to enhance the interpretation of results beyond just the rankings on the leaderboard. The methods are illustrated with a recently completed competition to evaluate algorithms capable of detecting, identifying, and locating radioactive materials in an urban environment. Comment: 36 pages