4,728 research outputs found

    Tools to integrate organoleptic quality criteria into breeding programs

    This technical booklet provides methodologies and guidance for implementing sensory evaluations to assess organoleptic quality in multi-actor projects for organic agriculture. It presents five detailed tests that can be used in sensory evaluation, methodologies for preparing the samples, and a glossary. The booklet was developed under the Solibam project and updated during the Diversifood project.

    Full Issue Spring 2010 Volume 5, Issue 2


    Evaluation of the NAS-ILAB Matrix for Monitoring International Labor Standards: Project Report

    The Bureau of International Labor Affairs (ILAB) engaged the National Research Council of the National Academy of Sciences (NAS) to recommend a method to monitor and evaluate labor conditions in a given country. The method focuses on five labor standards: freedom of association and collective bargaining, forced or compulsory labor, child labor, discrimination, and acceptable conditions of work.

    Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols

    Background: The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text Retrieval Evaluation Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications.
    Results: We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for queries expected to have potentially many relevant documents per query (high-recall searches). Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Track and observe significant correlations between performance rankings generated by our approach and TREC. Spearman's correlation coefficients in the range of 0.79–0.92 are observed comparing bpref measured with NT Evaluation or with TREC evaluations. For comparison, coefficients in the range 0.86–0.94 can be observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and the data fusion evaluation protocols introduced recently.
    Conclusion: Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine if NT Evaluation or variants of these protocols can fully substitute for human evaluations.
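    The abstract above compares two evaluation protocols by correlating the rankings they produce for the same retrieval methods. A minimal sketch of Spearman's rank correlation in plain Python follows; the function names and the example scores are hypothetical illustrations, not data or code from the paper.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical per-method scores under two protocols (illustration only).
    protocol_a = [0.31, 0.28, 0.40, 0.22, 0.35]
    protocol_b = [0.29, 0.30, 0.41, 0.20, 0.33]
    print(round(spearman(protocol_a, protocol_b), 3))
```

    A coefficient near 1 means the two protocols order the methods almost identically, which is the sense in which the abstract reports agreement between NT Evaluation and TREC.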

    What you think and what I think: Studying intersubjectivity in knowledge artifacts evaluation

    Miscalibration, the failure to accurately evaluate one’s own work relative to others' evaluation, is a common concern in social systems of knowledge creation where participants act as both creators and evaluators. Theories of social norming hold that individuals' self-evaluation miscalibration diminishes over multiple iterations of creator-evaluator interactions and that shared understanding emerges. This paper explores intersubjectivity and the longitudinal dynamics of miscalibration between creators' and evaluators' assessments in IT-enabled social knowledge creation and refinement systems. Using Latent Growth Modeling, we investigated the dynamics of creators' assessments of their own knowledge artifacts compared to those of peer evaluators to determine whether miscalibration attenuates over multiple interactions. Contrary to theory, we found that creators' self-assessment miscalibration does not attenuate over repeated interactions. Moreover, depending on the degree of difference, we found self-assessment miscalibration to amplify over time, with knowledge artifact creators diverging farther from their peers' collective opinion. Deeper analysis found no significant evidence of the influence of bias and controversy on miscalibration. Therefore, relying on social norming to correct miscalibration in knowledge creation environments (e.g., social media interactions) may not function as expected.

    s-AWARE: Measure-based Supervised Merging Algorithms for Crowd Assessors in Information Retrieval

    In this thesis we develop a new approach to exploit crowd assessors' relevance judgements for IR evaluation. We compute evaluation measures based on each assessor's ground truth. These measures are then merged, weighting each assessor by their expertise level, estimated on a training set as the closeness between the assessor's measures and the gold standard measures. The results show that the s-AWARE approach outperforms the majority of the tested approaches.
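    The merging idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the closeness function used here, 1 / (1 + RMSE), and all names and inputs are hypothetical, and the thesis may define closeness and weighting differently.

```python
from __future__ import annotations

from math import sqrt


def closeness(assessor_scores: list[float], gold_scores: list[float]) -> float:
    """Hypothetical closeness: 1 / (1 + RMSE) between assessor and gold measures."""
    if len(assessor_scores) != len(gold_scores) or not gold_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    mse = sum((a - g) ** 2 for a, g in zip(assessor_scores, gold_scores)) / len(gold_scores)
    return 1.0 / (1.0 + sqrt(mse))


def assessor_weights(
    train_scores: dict[str, list[float]], gold_scores: list[float]
) -> dict[str, float]:
    """Normalize closeness values on the training set so the weights sum to 1."""
    raw = {name: closeness(scores, gold_scores) for name, scores in train_scores.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def merge_measures(test_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of the measure computed from each assessor's judgements."""
    return sum(weights[name] * test_scores[name] for name in weights)


if __name__ == "__main__":
    # Hypothetical measure values per assessor on training topics (illustration only).
    gold = [0.50, 0.40, 0.70]
    train = {"assessor_a": [0.52, 0.38, 0.69], "assessor_b": [0.20, 0.80, 0.30]}
    weights = assessor_weights(train, gold)
    print(weights)
    # Merge the same measure computed on an unseen topic from each assessor's labels.
    print(merge_measures({"assessor_a": 0.61, "assessor_b": 0.35}, weights))
```

    In this sketch, assessors whose training-set measures track the gold standard receive larger weights, so their judgements dominate the merged measure.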