    Developing a quality assurance metric: a panoptic view

    This article is a post-print of the published article. Copyright © 2006 Sage Publications. There are a variety of techniques that lecturers can use to get feedback on their teaching - for example, module feedback and coursework results. However, a question arises about how reliable and valid the content that goes into these quality assurance metrics is. The aim of this article is to present a new approach for collecting and analysing qualitative feedback from students that could be used as the first stage in developing more reliable quality assurance metrics. The approach, known as the multi-dimensional crystal view, is based on the belief that individuals hold different views on the benefits that the processes embedded in a system can have on the system's behaviour. The results of this study indicate that, in the context of evaluation and feedback methods, the multi-dimensional approach appears to provide the opportunity for developing more effective student feedback mechanisms.

    Time-Sensitive Bayesian Information Aggregation for Crowdsourcing Systems

    Crowdsourcing systems commonly face the problem of aggregating multiple judgments provided by potentially unreliable workers. In addition, several aspects of the design of efficient crowdsourcing processes, such as defining workers' bonuses, fair prices and time limits of the tasks, involve knowledge of the likely duration of the task at hand. Bringing this together, in this work we introduce a new time-sensitive Bayesian aggregation method that simultaneously estimates a task's duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, builds on the key insight that the time taken by a worker to perform a task is an important indicator of the likely quality of the produced judgment. To capture this, BCCTime uses latent variables to represent the uncertainty about the workers' completion time, the tasks' duration and the workers' accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task is completed within a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., no spammers) are expected to submit their judgments. In contrast, workers with a lower propensity to valid labelling, such as spammers, bots or lazy labellers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) the propensity to valid labelling of each worker, (iii) the unbiased duration of each task and (iv) the true label of each task. Using two real-world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a task's duration compared to state-of-the-art methods.
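
    The key insight, that completion time signals judgment quality, can be illustrated with a simple weighted vote. The sketch below assumes a fixed plausible time window supplied by hand; BCCTime instead infers the window, confusion matrices and propensities jointly via message-passing Bayesian inference, so this is only an illustration of the idea, not the paper's method.

```python
def aggregate_time_sensitive(judgments, window):
    """Weighted vote over (label, seconds) judgments for one task.

    `window` is the plausible completion-time interval for genuine
    attempts; BCCTime infers this latent window per task, while here
    it is a fixed assumption for illustration.
    """
    lo, hi = window
    weights = {}
    for label, seconds in judgments:
        # Genuine-looking timings get full weight; suspiciously fast
        # or slow submissions (spammers, bots, lazy labellers) get less.
        w = 1.0 if lo <= seconds <= hi else 0.2
        weights[label] = weights.get(label, 0.0) + w
    return max(weights, key=weights.get)

# Three implausibly fast workers outvote two plausible ones under a
# plain majority, but the time-sensitive weights recover label "A".
votes = [("A", 30.0), ("A", 35.0), ("B", 1.0), ("B", 2.0), ("B", 3.0)]
print(aggregate_time_sensitive(votes, window=(10.0, 60.0)))  # A
```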

    DELIBERATION, JUDGEMENT AND THE NATURE OF EVIDENCE

    A normative Bayesian theory of deliberation and judgement requires a procedure for merging the evidence of a collection of agents. In order to provide such a procedure, one needs to ask what the evidence is that grounds Bayesian probabilities. After finding fault with several views on the nature of evidence (the views that evidence is knowledge; that evidence is whatever is fully believed; that evidence is observationally set credence; that evidence is information), it is argued that evidence is whatever is rationally taken for granted. This view is shown to have consequences for an account of merging evidence, and it is argued that standard axioms for merging need to be altered somewhat.

    A Bayesian copula model for stochastic claims reserving

    We present a full Bayesian model for assessing the reserve requirement of multiline Non-Life insurance companies. Bayesian models for claims reserving make it possible to account for expert knowledge in the evaluation of Outstanding Loss Liabilities, allowing the use of additional information at a low cost. This paper combines a standard Bayesian approach for the estimation of the marginal distribution of each single Line of Business for a Non-Life insurance company with a Bayesian copula procedure for the estimation of aggregate reserves. The model we present allows one to "mix" own-assessments of dependence between LoBs at a company level and market-wide estimates provided by regulators. We illustrate results for the single lines of business and we compare standard copula aggregation for different copula choices with the Bayesian copula approach.

    Keywords: stochastic claims reserving; Bayesian copulas; solvency capital requirement; loss reserving; Bayesian methods.
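
    As a sketch of the aggregation step, the fragment below blends a company-level and a market-wide correlation estimate with a simple credibility weight and aggregates two lognormal marginal reserve distributions through a Gaussian copula. All numbers are hypothetical, and the linear blend stands in for the paper's fully Bayesian "mixing" of dependence estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs (not the paper's data): lognormal marginal reserve
# distributions for two lines of business, plus a company-internal and a
# market-wide (regulator) dependence estimate blended with weight w.
mu, sigma = np.array([10.0, 9.5]), np.array([0.30, 0.40])
rho_company, rho_market, w = 0.2, 0.5, 0.6
rho = w * rho_company + (1 - w) * rho_market   # simple credibility blend

# Gaussian copula: correlate standard normals, then map them through the
# lognormal marginal of each line of business.
corr = np.array([[1.0, rho], [rho, 1.0]])
z = rng.standard_normal((100_000, 2)) @ np.linalg.cholesky(corr).T
reserves = np.exp(mu + sigma * z)              # per-LoB reserve samples
total = reserves.sum(axis=1)                   # aggregate reserve
var_995 = np.quantile(total, 0.995)            # e.g. a 99.5% solvency quantile
```

    Raising the blended correlation thickens the upper tail of the aggregate, so the 99.5% quantile is where the choice between company and market dependence estimates matters most.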

    When Conciliation Frustrates the Epistemic Priorities of Groups

    Our aim in this chapter is to draw attention to what we see as a disturbing feature of conciliationist views of disagreement. Roughly put, the trouble is that conciliatory responses to in-group disagreement can lead to the frustration of a group's epistemic priorities: that is, the group's favoured trade-off between the "Jamesian goals" of truth-seeking and error-avoidance. We show how this problem can arise within a simple belief aggregation framework, and draw some general lessons about when the problem is most pronounced. We close with a tentative proposal for how to solve the problem raised without rejecting conciliationism.
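
    A toy numerical case shows how the problem can arise in a simple aggregation framework. Suppose a group, prizing error-avoidance, accepts a proposition only when a majority of members hold a credence of at least 0.9; the threshold and credences below are illustrative, not taken from the chapter.

```python
def group_accepts(credences, threshold):
    """The group accepts iff a majority of members individually accept,
    where a member accepts when their credence meets the threshold."""
    return sum(c >= threshold for c in credences) > len(credences) / 2

# An error-avoiding group sets a demanding acceptance threshold of 0.9.
credences = [0.95, 0.95, 0.6]           # two confident members, one doubter
print(group_accepts(credences, 0.9))    # True: a majority accept

# Conciliation: after the disagreement, all move to the average credence.
avg = sum(credences) / len(credences)   # about 0.83
print(group_accepts([avg] * 3, 0.9))    # False: acceptance is lost
```

    Averaging pulls every member below the acceptance threshold, so a proposition the group previously accepted (and which may well be true) is lost, frustrating the truth-seeking side of the group's favoured trade-off.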

    Grading qualifications in the QCF: guidance for awarding organisations


    A composite leading indicator for the Peruvian economy based on the BCRP's monthly business tendency surveys

    This paper documents the construction of a composite leading indicator for the Peruvian economy based on the business tendency surveys (BTS) conducted by the Banco Central de Reserva del Perú (BCRP). We first classify potential composite leading indicators into "semantic" and "sophisticated" types. The former are based on the contents of the underlying indicators, whereas the latter result from statistical analyses relating to pre-determined reference series. We show that the BCRP BTS data provide a suitable basis for the construction of a sophisticated indicator with the Peruvian year-on-year GDP growth rate as a reference series. The indicator selection consists of a number of steps comprising semantic analyses of the questionnaire items, cross-correlation analyses as well as turning point analyses. We argue that, based on these analyses, the choice should fall on five indicators, relating to firm-specific questionnaire items as well as to items relating to the sector or the economy as a whole. The composite leading indicator is computed as the first principal component of the selected variables. In-sample, it shows a lead of four months before the reference series, which amounts to about six months before the first official data release dates. Due to the limited number of observations (the BCRP's BTS now covering about eight years), we did not reserve any data points for out-of-sample analyses of the suggested composite leading indicator. Accordingly, the performance of the indicator still has to stand the test of time and its lead should be carefully monitored.
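
    The final computation, the first principal component of the selected standardised series, can be sketched as follows; the five BCRP questionnaire items are replaced here by an arbitrary numeric matrix, so this illustrates the mechanics rather than reproducing the paper's indicator.

```python
import numpy as np

def composite_indicator(X):
    """First principal component of standardised indicator series.

    X: (n_months, n_indicators) array holding the selected survey
    balances (hypothetical input; the paper uses five BCRP BTS items).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardise each series
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    pc1 = Z @ Vt[0]                            # scores on the first component
    if np.corrcoef(pc1, Z.mean(axis=1))[0, 1] < 0:
        pc1 = -pc1                             # sign convention: co-move with the average
    return pc1
```

    The sign flip is needed because a principal component is only defined up to sign; aligning it with the cross-sectional average of the standardised series makes an upswing in the indicator correspond to improving survey balances.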

    The effect of discrete vs. continuous-valued ratings on reputation and ranking systems

    When users rate objects, a sophisticated algorithm that takes into account ability or reputation may produce a fairer or more accurate aggregation of ratings than the straightforward arithmetic average. Recently a number of authors have proposed different co-determination algorithms in which estimates of user and object reputation are refined iteratively together, permitting accurate measures of both to be derived directly from the rating data. However, the simulations demonstrating these methods' efficacy assumed a continuum of rating values, consistent with typical physical modelling practice, whereas most actual rating systems employ only a limited range of discrete values (such as a 5-star system). We perform a comparative test of several co-determination algorithms with different scales of discrete ratings and show that this seemingly minor modification in fact has a significant impact on the algorithms' performance. Paradoxically, where rating resolution is low, increased noise in users' ratings may even improve the overall performance of the system.
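
    A minimal co-determination loop of the kind compared in such studies can be sketched as follows. This is a generic reputation-weighted scheme, not any specific published algorithm: object quality is the reputation-weighted mean of its ratings, and a user's reputation is the inverse of their mean squared deviation from the current quality estimates. Quantising the entries of R to a 5-star scale is the kind of modification whose effect the paper measures.

```python
import numpy as np

def codetermine(R, n_iter=50, eps=1e-6):
    """Iteratively co-determine object quality and user reputation.

    R: (n_users, n_objects) ratings matrix with NaN where unrated.
    Each pass recomputes qualities from reputation-weighted ratings,
    then reputations from each user's squared error against them.
    """
    rated = ~np.isnan(R)
    Rf = np.where(rated, R, 0.0)
    rep = np.ones(R.shape[0])
    for _ in range(n_iter):
        # Object quality: reputation-weighted mean of its ratings.
        q = (rep[:, None] * Rf).sum(axis=0) / ((rep[:, None] * rated).sum(axis=0) + eps)
        # User error: mean squared deviation from current qualities.
        err = (((Rf - q) ** 2) * rated).sum(axis=1) / (rated.sum(axis=1) + eps)
        rep = 1.0 / (err + eps)                # accurate users gain weight
    return q, rep
```

    With three consistent raters and one who always answers 3, the loop drives the consistent raters' reputations up and recovers their ratings as the quality estimates, which a plain arithmetic average would not.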