    Using simulation studies to evaluate statistical methods

    Simulation studies are computer experiments that involve creating data by pseudorandom sampling. The key strength of simulation studies is the ability to understand the behaviour of statistical methods because some 'truth' (usually some parameter/s of interest) is known from the process of generating the data. This allows us to consider properties of methods, such as bias. While widely used, simulation studies are often poorly designed, analysed and reported. This tutorial outlines the rationale for using simulation studies and offers guidance for design, execution, analysis, reporting and presentation. In particular, this tutorial provides: a structured approach for planning and reporting simulation studies, which involves defining aims, data-generating mechanisms, estimands, methods and performance measures ('ADEMP'); coherent terminology for simulation studies; guidance on coding simulation studies; a critical discussion of key performance measures and their estimation; guidance on structuring tabular and graphical presentation of results; and new graphical presentations. With a view to describing recent practice, we review 100 articles taken from Volume 34 of Statistics in Medicine that included at least one simulation study and identify areas for improvement. Comment: 31 pages, 9 figures (2 in appendix), 8 tables (1 in appendix).
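
As an illustration of the ADEMP structure named in the abstract, the following is a minimal sketch of a simulation study, not code from the tutorial itself. Every concrete choice is an assumption made for the example: the aim (compare two estimators of a population mean), the data-generating mechanism (exponential data with mean 1, 30 observations per repetition), the methods (sample mean and sample median), the function name `run_simulation`, and the seed. The performance measure is bias, reported with its Monte Carlo standard error.

```python
import math
import random
import statistics


def run_simulation(n_sim=2000, n_obs=30, true_mean=1.0, seed=12345):
    """Estimate the bias of two estimators of the mean (illustrative setup only)."""
    rng = random.Random(seed)  # fixed seed so the study is reproducible
    estimates = {"mean": [], "median": []}

    for _ in range(n_sim):
        # Data-generating mechanism: exponential draws with the known true mean.
        sample = [rng.expovariate(1.0 / true_mean) for _ in range(n_obs)]
        # Methods: two candidate estimators of the estimand (the population mean).
        estimates["mean"].append(statistics.fmean(sample))
        estimates["median"].append(statistics.median(sample))

    results = {}
    for method, values in estimates.items():
        # Performance measure: bias = average estimate minus the known truth.
        bias = statistics.fmean(values) - true_mean
        # Monte Carlo standard error of the bias estimate.
        mcse = statistics.stdev(values) / math.sqrt(n_sim)
        results[method] = {"bias": bias, "mcse": mcse}
    return results


if __name__ == "__main__":
    for method, perf in run_simulation().items():
        print(f"{method}: bias={perf['bias']:.4f} (MCSE {perf['mcse']:.4f})")
```

Because the truth is fixed by the data-generating mechanism, the bias of each method can be read off directly. In this setup the sample median should show clearly negative bias, since the median of an exponential distribution lies below its mean.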

    Counterfactual Estimation and Optimization of Click Metrics for Search Engines

    Optimizing an interactive system against a predefined online metric is particularly challenging when the metric is computed from user feedback such as clicks and payments. The key challenge is the counterfactual nature: in the case of Web search, any change to a component of the search engine may result in a different search result page for the same query, but we normally cannot infer reliably from search logs how users would react to the new result page. Consequently, it appears impossible to accurately estimate online metrics that depend on user feedback, unless the new engine is run to serve users and compared with a baseline in an A/B test. This approach, while valid and successful, is unfortunately expensive and time-consuming. In this paper, we propose to address this problem using causal inference techniques, under the contextual-bandit framework. This approach effectively allows one to run (potentially infinitely) many A/B tests offline from search logs, making it possible to estimate and optimize online metrics quickly and inexpensively. Focusing on an important component in a commercial search engine, we show how these ideas can be instantiated and applied, and obtain very promising results that suggest the wide applicability of these techniques.
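
The abstract does not name a specific estimator. One standard technique for offline evaluation in the contextual-bandit setting is inverse propensity scoring (IPS), and the sketch below uses it purely as an illustration. The toy logging policy, the click model, and the names `simulate_logs`, `ips_estimate` and `target_policy` are all hypothetical and are not taken from the paper.

```python
import random


def simulate_logs(n, seed=0):
    """Create toy logs of (context, action, reward, propensity) tuples.

    Assumed setup: binary context, a logging policy that picks one of two
    actions uniformly at random, and a click probability of 0.8 when the
    action matches the context and 0.2 otherwise.
    """
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        context = rng.randint(0, 1)
        action = rng.randint(0, 1)   # uniform logging policy
        propensity = 0.5             # probability the logger chose this action
        click_prob = 0.8 if action == context else 0.2
        reward = 1.0 if rng.random() < click_prob else 0.0
        logs.append((context, action, reward, propensity))
    return logs


def ips_estimate(logs, policy):
    """Estimate the average reward of a deterministic policy from logged data.

    Each logged reward counts only when the new policy would have taken the
    logged action, and is reweighted by 1 / propensity.
    """
    total = 0.0
    for context, action, reward, propensity in logs:
        if policy(context) == action:
            total += reward / propensity
    return total / len(logs)


def target_policy(context):
    """Hypothetical new policy: always choose the action equal to the context."""
    return context


if __name__ == "__main__":
    logs = simulate_logs(20000)
    print(f"IPS estimate of click rate: {ips_estimate(logs, target_policy):.3f}")
```

Under these assumptions the true click rate of `target_policy` is 0.8, so the estimate computed from logs alone can be checked against it without ever serving the new policy to users.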

    New evidence on returns to scale and product mix among U.S. commercial banks

    Numerous studies have found that banks exhaust scale economies at low levels of output, but most are based on the estimation of parametric cost functions which misrepresent bank cost. Here we avoid specification error by using nonparametric kernel regression techniques. We modify measures of scale and product mix economies introduced by Berger et al. (1987) to accommodate the nonparametric estimation approach, and estimate robust confidence intervals to assess the statistical significance of returns to scale. We find that banks experience increasing returns to scale up to approximately $500 million of assets, and essentially constant returns thereafter. We also find that minimum efficient scale has increased since 1985. Keywords: Banks and banking; Banks and banking - Costs; Economies of scale
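
To make "nonparametric kernel regression" concrete, here is a minimal sketch of the Nadaraya-Watson estimator with a Gaussian kernel. The abstract does not say which kernel estimator or bandwidth the authors use, so both are assumptions, and the data in the usage example are made up for illustration rather than drawn from bank cost data.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Estimate E[y | x = x0] as a kernel-weighted average of observed ys.

    No functional form is imposed on the relationship: observations whose
    x is close to x0 (relative to the bandwidth) simply get more weight.
    """
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; try a larger bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


if __name__ == "__main__":
    # Made-up observations of an output level (xs) and a cost measure (ys).
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(nadaraya_watson(3.0, xs, ys, bandwidth=1.0))
```

The bandwidth controls the trade-off between smoothness and fidelity to local data; in practice it would be chosen by a data-driven rule such as cross-validation rather than fixed by hand.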

    Variation in population synchrony in a multi-species seabird community: response to changes in predator abundance

    Ecologically similar sympatric species, subject to typical environmental conditions, may be expected to exhibit synchronous temporal fluctuations in demographic parameters, while populations of dissimilar species might be expected to show less synchrony. Previous studies have tested for synchrony in different populations of single species, and those including data from more than one species have compared fluctuations in only one demographic parameter. We tested for synchrony in inter-annual changes in breeding population abundance and productivity among four tern species on Coquet Island, northeast England. We also examined how manipulation of one independent environmental variable (predator abundance) influenced temporal changes in ecologically similar and dissimilar tern species. Changes in breeding abundance and productivity of ecologically similar species (Arctic Sterna paradisaea, Common S. hirundo and Roseate Terns S. dougallii) were synchronous with one another over time, but not with a species with different foraging and breeding behaviour (Sandwich Terns Thalasseus sandvicensis). With respect to changes in predator abundance, there was no clear pattern. Roseate Tern abundance was negatively correlated with that of large gulls breeding on the island from 1975 to 2013, while Common Tern abundance was positively correlated with number of large gulls, and no significant correlations were found between large gull and Arctic and Sandwich Tern populations. Large gull abundance was negatively correlated with productivity of Arctic and Common Terns two years later, possibly due to predation risk after fledging, while no correlation with Roseate Tern productivity was found. The varying effect of predator abundance is most likely due to specific differences in the behaviour and ecology of even these closely-related species. 
    Examining synchrony in multi-species assemblages improves our understanding of how whole communities react to long-term changes in the environment and suggests that changes in predator abundance may differentially affect populations of sympatric seabird species.

    Measuring Mimicry in Task-Oriented Conversations: The More the Task is Difficult, The More we Mimick our Interlocutors

    The tendency to unconsciously imitate others in conversations is referred to as mimicry, accommodation, interpersonal adaptation, etc. In recent years, the computing community has made significant efforts towards the automatic detection of the phenomenon, but a widely accepted approach is still missing. Given that mimicry is the unconscious tendency to imitate others, this article proposes the adoption of speaker verification methodologies that were originally conceived to spot people trying to forge the voice of others. Preliminary experiments suggest that mimicry can be detected by measuring how much speakers converge or diverge with respect to one another in terms of acoustic evidence. As a validation of the approach, the experiments show that convergence (the speakers become more similar in terms of acoustic properties) tends to appear more frequently when a task is difficult and, therefore, requires more time to be addressed.